Introduction
Ollama setup
To get started with Ollama, follow the instructions on their download page. On Linux, you can instead install it with a single command:
curl -fsSL https://ollama.com/install.sh | sh
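If the script finishes without errors, the ollama binary should be on your PATH. You can confirm the install with:
ollama --version # prints the installed version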
Configure server options as you see fit via environment variables, then launch the server:
export OLLAMA_MAX_LOADED_MODELS=2 # max number of models kept loaded in memory at once
export OLLAMA_NUM_PARALLEL=2 # max number of requests each model handles in parallel
ollama serve
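By default the server listens on http://localhost:11434 (adjust accordingly if you have set OLLAMA_HOST). A quick sanity check is a plain HTTP request to the root endpoint:
curl http://localhost:11434 # should respond with "Ollama is running"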
Then, in a separate terminal, download a model using the ollama CLI:
ollama pull llama3
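Once the pull completes, the model should show up in your local model list:
ollama list # lists downloaded models with their size and modification time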
You can optionally verify that everything works as expected by starting a chat session:
ollama run llama3 # starts an interactive chat session in the terminal
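As an alternative to the interactive session, you can exercise the server's REST API directly. Here is a minimal sketch against the documented /api/generate endpoint; the prompt text is just a placeholder, and setting stream to false returns a single JSON object instead of a token stream:
curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Why is the sky blue?",
  "stream": false
}'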