Deploy Llama 3 for free using Cloudflare AI in 3 minutes

Aman Kumar
4 min read · May 13, 2024
Created using DALL·E

In this guide, we’ll explore how to deploy and host LLaMA 3, a powerful language model, for free using Cloudflare Workers.

Llama 3 takes data and scale to new heights. It was trained on two of Meta’s recently announced custom-built 24K-GPU clusters, on over 15T tokens of data: a training dataset 7x larger than the one used for Llama 2, including 4x more code.

Llama3 Model

LLMs (Large Language Models) and AI technologies are advancing rapidly, and with Cloudflare’s generous free tier, you’re perfectly positioned to start developing your own AI applications.

Cloudflare Workers pricing

Follow these steps to set up your application:

1. Create a Cloudflare Account

Start by signing up or logging into your Cloudflare account.

2. Navigate to Workers & Pages

Find the “Workers & Pages” section in your dashboard to begin setting up your new application.

3. Create Your Application

Click on the “Create Application” button to initiate the setup process.

4. Create a Worker

Choose “LLM App” from the templates under the Workers tab. This choice serves as the foundation of your application, enabling JavaScript execution on Cloudflare’s servers. Selecting the LLM App template provides a head start with the necessary packages for running such an application.

Workers Tab

5. Deploy Your Worker

After creating your worker, click on “Deploy”. Don’t worry; we can update your worker’s code later.

Hurrah! Our worker is now live. Send a request like the one below to see the magic happen, replacing the URL with your worker’s URL.

curl --location 'https://llm-app-damp-sun.2000-aman-sinha.workers.dev/' \
--header 'Content-Type: application/json' \
--data '{
  "prompt": "say a joke"
}'

6. Edit Worker’s Code

Now it’s time to customize your worker. Click “Edit code” to start coding; until you change it, your worker runs the template’s default code.

Edit index.js so that it handles incoming requests and forwards them to the Llama 3 model.
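A minimal index.js for this app might look like the sketch below. The binding name `AI` and the model ID `@cf/meta/llama-3-8b-instruct` follow Cloudflare Workers AI conventions but are assumptions here; confirm both against the template you deployed.

```javascript
// index.js: a minimal sketch of the worker, assuming the Workers AI
// binding is named `AI` and the Llama 3 8B instruct model is available
// on your account (confirm both in your Cloudflare dashboard).
const worker = {
  async fetch(request, env) {
    if (request.method !== "POST") {
      return new Response("Send a POST request with a JSON body.", {
        status: 405,
      });
    }
    const body = await request.json();
    // Accept either a plain prompt or a chat-style messages array.
    const input = body.messages
      ? { messages: body.messages }
      : { prompt: body.prompt };
    const answer = await env.AI.run("@cf/meta/llama-3-8b-instruct", input);
    return new Response(JSON.stringify(answer), {
      headers: { "Content-Type": "application/json" },
    });
  },
};

export default worker;
```

With this in place, the worker accepts both the plain `prompt` body and the chat-style `messages` body used later in this guide.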


7. Click “Save and Deploy”

Click the “Deploy” button at the top right, then “Save and Deploy”.

Our LLM is now up and running and accepting our requests and prompts.

Let’s try it by making the request below to get a joke. Change the URL to your own worker’s URL for it to work.

curl --location 'https://llm-app-damp-sun.2000-aman-sinha.workers.dev/' \
--header 'Content-Type: application/json' \
--data '{
  "prompt": "say a joke"
}'

Well done! Our app is live, and we can now make use of Llama 3.

You can make chat-style API calls, too, by sending requests in the following format:

curl --location 'https://llm-app-damp-sun.2000-aman-sinha.workers.dev/' \
--header 'Content-Type: application/json' \
--data '{
  "messages": [
    { "role": "system", "content": "You are a helpful assistant." },
    { "role": "user", "content": "How many World Cups has France won?" }
  ]
}'
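The same chat call can also be made from JavaScript with fetch. This is a small sketch; the helper name `askLlama` and the placeholder URL are illustrative, so substitute your own worker’s URL.

```javascript
// Calling the deployed worker from JavaScript instead of curl.
// WORKER_URL is a placeholder; replace it with your own worker's URL.
const WORKER_URL = "https://YOUR-WORKER.YOUR-SUBDOMAIN.workers.dev/";

async function askLlama(messages) {
  const res = await fetch(WORKER_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ messages }),
  });
  if (!res.ok) {
    throw new Error(`Request failed with status ${res.status}`);
  }
  return res.json();
}

// Example usage:
// const reply = await askLlama([
//   { role: "system", content: "You are a helpful assistant." },
//   { role: "user", content: "How many World Cups has France won?" },
// ]);
```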

You can explore the various models available on Cloudflare Workers AI in the Workers AI model catalog.

Try multiple models, see what fits best for your use case, and move forward.

If this article was helpful, give it some claps. I’m deeply involved with AI and LLMs. Follow me on Medium for more insights.
Feel free to say hi or connect via Twitter and LinkedIn.
