Ollama basically runs as a tiny local server that can load a model and do inference on it (Llama and others). You can use Ollama to configure / download the various models you want: [https://ollama.ai/library](https://ollama.ai/library). You can also probably grab a model from HuggingFace, but downloads from there tend to be painfully slow.
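
As a quick sketch of talking to that tiny server: Ollama exposes a REST API on port 11434 by default, and you can hit its `/api/generate` endpoint with plain Python. This assumes Ollama is already running locally and you've pulled a model first (e.g. `ollama pull llama2` — the model name below is just an example).

```python
import json
import urllib.request

# Ask the local Ollama server (default port 11434) for a completion.
# Assumes `ollama pull llama2` has already downloaded the model.
payload = {
    "model": "llama2",
    "prompt": "Why is the sky blue?",
    "stream": False,  # one JSON blob back instead of a token stream
}
req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```

Setting `"stream": False` keeps the example simple; by default the server streams tokens back as newline-delimited JSON.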