Introduction
Ollama setup
To get started with Ollama, follow the instructions on their download page. On Linux, you can instead install it with a single command:
curl -fsSL https://ollama.com/install.sh | sh
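If the script finishes without errors, the ollama binary should be on your PATH. You can confirm the install with:
ollama --version # prints the installed version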
Configure server options as you see fit via environment variables, then launch the server:
export OLLAMA_MAX_LOADED_MODELS=2 # max number of models kept loaded in memory at once
export OLLAMA_NUM_PARALLEL=2 # max number of requests each model handles in parallel
ollama serve
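By default the server listens on http://localhost:11434 (adjust accordingly if you have set OLLAMA_HOST). A quick sanity check is a plain HTTP request to the root endpoint:
curl http://localhost:11434 # should respond with "Ollama is running"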
Then, in a separate terminal, download a model using the ollama CLI:
ollama pull llama3
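Once the pull completes, the model should show up in your local model list:
ollama list # lists downloaded models with their size and modification time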
You can optionally verify that everything works as expected by starting a chat session:
ollama run llama3 # starts an interactive chat session in the terminal
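As an alternative to the interactive session, you can exercise the server's REST API directly. Here is a minimal sketch against the documented /api/generate endpoint; the prompt text is just a placeholder, and setting stream to false returns a single JSON object instead of a token stream:
curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Why is the sky blue?",
  "stream": false
}'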