Local LLM Usage
LARGESTACK supports local LLMs through two patterns.
Pattern A — Native Ollama provider, chat-only
Use this when you need local/private chat or RAG without tool calls.
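# Start the Ollama server (it runs in the foreground; use a second terminal for the rest).
ollama serve
# Download the model used in the example below.
ollama pull llama3.2
# Base URL of the local Ollama endpoint.
export LARGESTACK_OLLAMA_BASE_URL=http://localhost:11434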
from largestack import Agent
agent = Agent(
    name="local-chat",
    llm="ollama/llama3.2",
    instructions="Reply concisely.",
    cost_budget=0.0,
)
The native OllamaProvider is chat-only in this release.
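Before creating the agent, it can help to confirm that the Ollama server is reachable and that the model has been pulled. The following is a minimal sketch using only the Python standard library and Ollama's `/api/tags` endpoint, which lists locally available models. The helper names (`tags_url`, `model_available`, `check_ollama`) are hypothetical and not part of LARGESTACK, and the sketch assumes `LARGESTACK_OLLAMA_BASE_URL` holds the base URL exported above.

```python
import json
import os
import urllib.request

DEFAULT_BASE_URL = "http://localhost:11434"


def tags_url(base_url: str) -> str:
    """Build the URL of Ollama's model-listing endpoint."""
    return base_url.rstrip("/") + "/api/tags"


def model_available(tags_payload: dict, model: str) -> bool:
    """Return True if `model` appears in a parsed /api/tags response.

    Ollama lists models with an explicit tag (e.g. "llama3.2:latest"),
    so an untagged name is treated as "<name>:latest".
    """
    wanted = model if ":" in model else model + ":latest"
    return any(
        entry.get("name") == wanted
        for entry in tags_payload.get("models", [])
    )


def check_ollama(model: str, timeout: float = 5.0) -> bool:
    """Query the running Ollama server and report whether `model` is pulled."""
    base_url = os.environ.get("LARGESTACK_OLLAMA_BASE_URL", DEFAULT_BASE_URL)
    with urllib.request.urlopen(tags_url(base_url), timeout=timeout) as resp:
        payload = json.load(resp)
    return model_available(payload, model)
```

Calling `check_ollama("llama3.2")` with the server running should return `True` once the pull has completed; a connection error raised by `urlopen` suggests the server is not running or the base URL is wrong.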
Pattern B — LiteLLM/OpenAI-compatible local endpoint
Use this when you need a unified gateway across cloud and local models. Tool calling depends on the local model and proxy support.
pip install "largestack[litellm]"
# Configure LiteLLM/Ollama according to your proxy setup.
from largestack import Agent
agent = Agent(
    name="local-router",
    llm="litellm/ollama/llama3.1",
    instructions="Use the local model.",
)
Production rule
Before relying on local tool automation in production, run an end-to-end test with your exact model, proxy, schema, and tool-calling configuration. Behavior varies between local models and setups.
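One piece of such an end-to-end test is checking that the model's reply actually contains a well-formed tool call. The sketch below assumes your proxy returns an OpenAI-compatible chat completion message, that is, a `tool_calls` list whose entries carry `function.name` and a JSON-encoded `function.arguments` string. The function name `tool_call_problems` and the `get_weather` tool are hypothetical and used only for illustration; neither comes from LARGESTACK.

```python
import json


def tool_call_problems(
    message: dict, expected_name: str, required_args: list[str]
) -> list[str]:
    """Return a list of problems found; an empty list means the call looks valid."""
    calls = message.get("tool_calls") or []
    if not calls:
        return ["model returned no tool call"]

    # Only the first tool call is inspected in this sketch.
    function = calls[0].get("function", {})
    problems = []
    if function.get("name") != expected_name:
        problems.append(
            f"expected tool {expected_name!r}, got {function.get('name')!r}"
        )

    # In the assumed format, arguments arrive as a JSON string, not a dict.
    try:
        args = json.loads(function.get("arguments", ""))
    except (json.JSONDecodeError, TypeError):
        return problems + ["arguments are not valid JSON"]
    if not isinstance(args, dict):
        return problems + ["arguments are not a JSON object"]

    for key in required_args:
        if key not in args:
            problems.append(f"missing required argument {key!r}")
    return problems


# Example with a hand-written message in the assumed format.
sample = {
    "tool_calls": [
        {"function": {"name": "get_weather", "arguments": '{"city": "Oslo"}'}}
    ]
}
assert tool_call_problems(sample, "get_weather", ["city"]) == []
```

In a real test, `message` would come from a live request through your exact model and proxy, and the test would fail whenever the returned list is non-empty.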