Basically runs as a tiny local server that handles inference for models like Llama. You use Ollama to configure and download the various models you want:
[https://ollama.ai/library](https://ollama.ai/library)
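The basic workflow looks something like this (a sketch; assumes Ollama is installed, and `llama3` is just an example model name from the library above):

```shell
# Pull a model from the Ollama library, then chat with it locally.
ollama pull llama3        # download the model weights
ollama run llama3         # start an interactive session
ollama list               # see which models you've downloaded

# Ollama also serves a local REST API (default port 11434):
curl http://localhost:11434/api/generate \
  -d '{"model": "llama3", "prompt": "Why is the sky blue?"}'
```

The REST API is handy if you want to call the model from your own code instead of the CLI.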
Also, you can probably grab a model directly from HuggingFace, but their downloads are painfully slow.