The simplest way to run open-source LLMs locally: one command to download and run a model, with a REST API that works like OpenAI's.
Get from zero to running a local LLM in under a minute.
```shell
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Run your first model (downloads it on first use)
ollama run llama3.2

# Or use the API
curl http://localhost:11434/api/chat -d '{
  "model": "llama3.2",
  "messages": [{"role": "user", "content": "Hello!"}]
}'
```
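The same `/api/chat` endpoint is easy to call from a script. A minimal sketch in Python using only the standard library, assuming an Ollama server is listening on the default port 11434 (note `"stream": false` here; without it the server streams the reply):

```python
import json
import urllib.request


def build_chat_request(model, messages, stream=False):
    """Build the JSON body Ollama's /api/chat endpoint expects."""
    return json.dumps(
        {"model": model, "messages": messages, "stream": stream}
    ).encode()


def chat(prompt, model="llama3.2", host="http://localhost:11434"):
    """Send one user message and return the assistant's reply text."""
    body = build_chat_request(model, [{"role": "user", "content": prompt}])
    req = urllib.request.Request(
        f"{host}/api/chat",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # With "stream": false the server returns a single JSON object;
        # the reply text lives under message.content.
        return json.loads(resp.read())["message"]["content"]


# With the server running, a call looks like:
# reply = chat("Hello!")
```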
Ollama auto-detects your GPU (NVIDIA, AMD, or Apple Silicon) and offloads as much of the model to it as memory allows; if you're short on VRAM, pick a smaller variant by tag, e.g. `ollama run llama3.2:1b`.
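By default the chat API streams its reply as newline-delimited JSON objects, each carrying a fragment of the answer in `message.content`, with `"done": true` on the final object. A small helper to reassemble a streamed reply (a sketch; the line source could be any iterator over the raw response):

```python
import json


def join_stream(lines):
    """Reassemble a full reply from Ollama's streamed JSON lines."""
    parts = []
    for line in lines:
        chunk = json.loads(line)
        # Each chunk carries a fragment of the reply text.
        parts.append(chunk.get("message", {}).get("content", ""))
        if chunk.get("done"):  # final object signals end of stream
            break
    return "".join(parts)
```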