Host Llama 2 for free using Cloudflare AI

Aman Kumar
4 min read · Feb 13, 2024

In this guide, we’ll explore how you can deploy and host Llama 2, a powerful language model, for free using Cloudflare Workers.

LLMs (Large Language Models) and AI technologies are advancing rapidly, and Cloudflare’s generous pricing model makes it easy to start developing your own AI applications.

Cloudflare Workers pricing

Follow these steps to set up your application:

1. Create a Cloudflare Account

Start by signing up or logging into your Cloudflare account.

2. Navigate to Workers & Pages

Within your dashboard, find the section for “Workers & Pages” to begin setting up your new application.

3. Create Your Application

Click on the “Create Application” button to initiate the setup process.

4. Create a Worker

Choose “LLM App” from the templates under the Workers tab. This choice serves as the foundation of your application, enabling JavaScript execution on Cloudflare’s servers. Selecting the LLM App template provides a head start with the necessary packages for running such an application.

Workers Tab

5. Deploy Your Worker

After creating your worker, click on “Deploy”. Don’t worry; you can update your worker’s code later as needed.

Deploy LLM App

Hurrah! Our worker is now live. Make a request like the one below to see the magic happen, replacing the URL with your worker’s URL.

curl --location 'https://worker-white-tooth-29e9.2000-aman-sinha.workers.dev/' \
--header 'Content-Type: application/json' \
--data '{
"prompt": "say a joke"
}'

6. Edit Your Worker’s Code

Now it’s time to customize your worker. It is already up and running with the template code, so click on “Edit code” to start changing it.

Edit index.js so that it contains the content below:

The final project looks like this:
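As a rough illustration of what index.js could contain, here is a minimal sketch rather than the exact code from the template. It assumes the worker has a Workers AI binding named `AI` (the binding name is configurable, so yours may differ) and uses the `@cf/meta/llama-2-7b-chat-int8` model ID. Older versions of the template may instead wrap the binding with the `Ai` class from the `@cloudflare/ai` package, so treat the call style here as an assumption.

```javascript
// Minimal sketch of a Worker that forwards a prompt to Llama 2.
// Assumption: a Workers AI binding named `AI` is configured for this worker.
const MODEL = '@cf/meta/llama-2-7b-chat-int8';

const worker = {
  async fetch(request, env) {
    // Only accept POST requests that carry a JSON body.
    if (request.method !== 'POST') {
      return new Response('Send a POST request with a JSON body like {"prompt": "..."}', {
        status: 405,
      });
    }

    let body;
    try {
      body = await request.json();
    } catch {
      return new Response('Request body must be valid JSON', { status: 400 });
    }

    // Reject requests that have no usable prompt.
    const prompt = typeof body?.prompt === 'string' ? body.prompt.trim() : '';
    if (!prompt) {
      return new Response('Missing "prompt" field', { status: 400 });
    }

    // Run the model through the binding and return its output as JSON.
    const result = await env.AI.run(MODEL, { prompt });
    return new Response(JSON.stringify(result), {
      headers: { 'Content-Type': 'application/json' },
    });
  },
};

export default worker;
```

The input checks are optional, but they keep malformed requests from reaching the model.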

7. Click “Save and Deploy”

Our LLM is now up and running and accepting prompts from us.

Let’s try it by making the request below to get a joke. Replace the URL with your worker’s URL for it to work.

curl --location 'https://worker-white-tooth-29e9.2000-aman-sinha.workers.dev/' \
--header 'Content-Type: application/json' \
--data '{
"prompt": "say a joke"
}'
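If you would rather call the worker from JavaScript than from curl, the same request can be sketched with `fetch`. The helper name `askWorker` is hypothetical, and the URL in the usage comment is a placeholder for your own worker URL.

```javascript
// Hypothetical helper: sends a prompt to the worker and returns the parsed JSON reply.
// `fetchImpl` defaults to the global fetch (browsers, Workers, Node 18+).
async function askWorker(workerUrl, prompt, fetchImpl = fetch) {
  const res = await fetchImpl(workerUrl, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ prompt }),
  });

  // Surface HTTP errors instead of trying to parse an error page as JSON.
  if (!res.ok) {
    throw new Error(`Worker responded with HTTP ${res.status}`);
  }
  return res.json();
}

// Usage (replace the placeholder with your worker URL):
// askWorker('https://<your-worker>.workers.dev/', 'say a joke').then(console.log);
```

Passing `fetchImpl` as a parameter is only a convenience that lets you try the helper without making a network call.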

Well done! Our app is live, and we can now make use of Llama 2.

You can check out the various models available on Cloudflare AI in Cloudflare’s model catalog.

Try out various models, see what fits best for your use case, and move forward.
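One way to compare models is to run the same prompt through several of them and look at the outputs side by side. The sketch below assumes it is called from inside a worker with the AI binding passed in as `ai` (for example `env.AI`), and that text-generation models return an object with a `response` field. The helper name `compareModels` and the list of model IDs are illustrative, so swap in whichever models you want to try.

```javascript
// Illustrative list of model IDs to compare; adjust it to the models you want to test.
const CANDIDATE_MODELS = [
  '@cf/meta/llama-2-7b-chat-int8',
  '@cf/mistral/mistral-7b-instruct-v0.1',
];

// Hypothetical helper: runs one prompt against each model and collects the replies.
// `ai` is assumed to be a Workers AI binding that exposes `run(model, inputs)`.
async function compareModels(ai, prompt, models = CANDIDATE_MODELS) {
  const results = {};
  for (const model of models) {
    try {
      const output = await ai.run(model, { prompt });
      results[model] = output.response;
    } catch (err) {
      // Record the failure and keep going so one bad model does not stop the comparison.
      results[model] = `error: ${err.message}`;
    }
  }
  return results;
}
```

Inside a worker’s fetch handler, this could be called as `compareModels(env.AI, prompt)` and the result returned as JSON.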

I’m deeply involved with AI and LLMs. Follow me on Medium for more insights.

Feel free to say hi or connect via Twitter and LinkedIn.
