Jarvis is a private, on-device AI chat assistant that runs entirely on your machine — no internet required, no API fees, no data leaving your computer.
Jarvis is a local AI chatbot pipeline — a private troubleshooting assistant built on two open-source tools that handle all routing and inference on your own hardware.
Nothing leaves your machine. No cloud calls, no telemetry, no account required. Your conversations stay local.
Ollama runs quantized models directly on your GPU or CPU — typically 3–20 tokens/sec depending on hardware and model size.
Swap the underlying LLM any time. Mistral, LLaMA 3, Gemma, Phi — any model Ollama supports works out of the box.
Every message travels through a five-stage pipeline — from your keyboard to the local model and back. Here is the exact sequence of operations:
localhost:11434. Ollama handles all model inference locally on your hardware.
The routing layer. OpenClaw receives chat input, manages conversation state, and forwards requests to the configured local model endpoint. Lightweight and extensible.
Runs quantized LLMs locally on your machine via a simple REST API. Supports GPU acceleration when available, falls back gracefully to CPU.
Two steps. Start Ollama to spin up your local model server, then run openclaw from the project directory.
Ollama must be running before you launch Jarvis. It starts a local inference server on port 11434.
From the Jarvis project directory, launch OpenClaw. It connects to Ollama and opens the chat interface.
Once both services are up, Jarvis is live. Ask it to debug errors, explain code, or troubleshoot your system. All responses stay on your machine.
Core pipeline is functional. Currently refining conversation memory, adding system prompt customization, and improving multi-turn context handling.