Run an LLM with a single file

Last month, I came across a video by NetworkChuck on hosting AI locally. In short, he built a server specifically for running LLMs using Ollama and other tools. From his video I learned that running an LLM on your machine is actually not a complicated task.

But then a few weeks later, I discovered Mozilla’s llamafile project, which makes it even easier to run LLMs on your computer. There is nothing to install. All you have to do is run a single file. It even comes with a web chat interface.

One of the big advantages of using an LLM locally is privacy. Everything runs offline and you don’t need to worry about your data being used to train AI models.

To start using llamafile on Linux:

  1. Go to the project’s repo and download the model you want to run. I downloaded TinyLlama-1.1B-Chat-v1.0.F16.llamafile because of its relatively small size, but download whichever model you prefer.

  2. Open a terminal in the folder where you downloaded your llamafile. You will need to make the file executable, so run the command below.

    chmod +x TinyLlama-1.1B-Chat-v1.0.F16.llamafile

    (If you downloaded a different model, remember to change the file name.)

  3. Run the file by entering the following command

    ./TinyLlama-1.1B-Chat-v1.0.F16.llamafile

  4. Your browser should open automatically with the chat interface. If it doesn’t, go to http://localhost:8080/ manually. That is all. See how easy that was! (If you would rather talk to the model from the terminal, there is a curl example below.)

These instructions are adapted from the llamafile quick start guide.
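As a bonus, the llamafile doesn’t just serve the chat page: it also runs a local API server on port 8080, and recent versions document an OpenAI-compatible endpoint. The snippet below is a minimal sketch of querying it with curl while the llamafile is running; the model name and prompt are just illustrative placeholders (only one model is loaded, so the name doesn’t matter much).

    # Ask the running llamafile server a question via its
    # OpenAI-compatible chat endpoint (the llamafile must already be running)
    curl http://localhost:8080/v1/chat/completions \
      -H "Content-Type: application/json" \
      -d '{
        "model": "TinyLlama-1.1B-Chat-v1.0",
        "messages": [
          {"role": "user", "content": "Explain what a llamafile is in one sentence."}
        ]
      }'

The reply comes back as JSON, so you may want to pipe it through something like jq to read it comfortably.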

I use a really old laptop, and it ran the model alright, though my desktop environment did freeze for a few moments now and then. I was actually surprised it worked on my machine at all.

Anyway, I hope this gave you a good idea of how simple it is to run an LLM on your machine. Cheers!

Screenshot of the llamafile web interface: the TinyLlama model running on my computer.

This is day 30 of #100DaystoOffload
