How I'm Rebuilding Cursor with My Own Homelab AI — Zero API Costs

🚫 Why Not Just Pay for Cursor or Copilot?
Because I knew I could do better — for free.
Cursor and GitHub Copilot Chat are incredible tools. They inject context, let you chat about your codebase, and feel like having a pair-programmer built into your editor. But here's the thing:
- They require paid API keys (OpenAI, Claude, etc.)
- They tie your usage and logs to external systems
- They don’t respect your compute — they rent you theirs
I wanted the same dev experience, but entirely self-hosted, powered by my own GPU server at home. No token limits, no vendor lock-in.
And I’m almost there.
🧱 My Stack (aka The Homelab Dev Agent Engine)
| Component | Role |
|---|---|
| Ollama | Local LLM runtime |
| DeepSeek-R1 | Fast, local, open chat model |
| Node.js API | My custom wrapper server |
| Avante.nvim | Neovim AI interface plugin |
| PostgreSQL | Token, prompt, error tracking |
| Tailscale VPN | Internal access across machines |
I wanted to mimic Cursor, with slash commands, local file injection, and contextual AI completions, using nothing but my own infrastructure.
🧩 How It Works (End to End)
- My Node.js API does the heavy lifting:
  - Validates bearer/API keys
  - Accepts `selected_files` from Avante
  - Reads each file and injects its contents into the prompt (like Copilot Chat)
  - Sends the structured request to Ollama, locally
  - Streams the AI response back to Neovim
  - Logs every prompt, token count, and even simulated OpenAI/Claude pricing into PostgreSQL
- The Ollama server, running DeepSeek-R1 on my GPU workstation, handles completions locally.
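The repo holds the real implementation; below is a minimal TypeScript/Express sketch of that flow. The route shape, the `prompt`/`selected_files` field names, and the key store are assumptions based on the description above, not the repo's actual code.

```typescript
import express from "express";
import { readFile } from "node:fs/promises";

const app = express();
app.use(express.json());

const OLLAMA_URL = "http://localhost:11434/api/chat"; // Ollama's default chat endpoint
const VALID_KEYS = new Set(["local-dev-key"]);        // placeholder key store

app.post("/api/chat", async (req, res) => {
  // 1. Validate the bearer/API key
  const key = req.headers.authorization?.replace("Bearer ", "");
  if (!key || !VALID_KEYS.has(key)) return res.status(401).end();

  // 2. Read each selected file and inject its contents into the prompt
  const { prompt, selected_files = [] } = req.body as {
    prompt: string;
    selected_files?: string[];
  };
  const fileContext = await Promise.all(
    selected_files.map(async (f) => `// ${f}\n${await readFile(f, "utf8")}`)
  );

  // 3. Forward the structured request to the local Ollama server
  const upstream = await fetch(OLLAMA_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "deepseek-r1:latest",
      stream: true,
      messages: [
        { role: "system", content: fileContext.join("\n\n") },
        { role: "user", content: prompt },
      ],
    }),
  });

  // 4. Stream Ollama's newline-delimited JSON back to the editor
  for await (const chunk of upstream.body!) {
    res.write(chunk);
  }
  res.end();
});

app.listen(3000);
```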
Avante.nvim in Neovim is configured to call my API:
require("avante").setup({
endpoint = "https://api-ai.cargdev.io/api/chat",
api_key = "local-dev-key",
model = "deepseek-r1:latest"
})
🪄 Cursor-Like Features I Already Have
| Feature | Status | Implementation |
|---|---|---|
| File-aware chat | ✅ | Avante + my wrapper inject file contents |
| Slash commands | ✅ | `extensions.avante.make_slash_commands` |
| Streaming responses | ✅ | Via Ollama stream + SSE |
| Chat UI in Neovim | ✅ | Avante handles buffer formatting |
| Prompt + error logging | ✅ | PostgreSQL backend |
| Token cost simulation | ✅ | GPT-4, Claude, Gemini savings logged |
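The streaming row deserves a note: Ollama emits newline-delimited JSON chunks, so the wrapper reframes each line as an SSE event before it reaches Neovim. A hedged sketch of that translation (the `delta` field name is my own, not necessarily what the repo uses):

```typescript
// Ollama's /api/chat stream emits lines like:
//   {"message":{"role":"assistant","content":"Hel"},"done":false}
// SSE expects events framed as "data: <payload>\n\n".
function ndjsonLineToSSE(line: string): string {
  const parsed = JSON.parse(line);
  const text: string = parsed.message?.content ?? "";
  return `data: ${JSON.stringify({ delta: text })}\n\n`;
}
```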
🧪 Still Missing Full Cursor Behavior — But Getting Close
I don’t yet have true buffer-modification support (e.g., refactor suggestions applied to the code automatically). Cursor and Copilot Chat can modify code intelligently based on full-project context.
But the pieces are almost in place:
- My wrapper supports per-file context
- Prompt injection works for any file selection
- Avante can be extended or patched to support diff/edit mode
And unlike Cursor, my tool:
- Runs 100% local
- Is fully inspectable
- Costs me $0/month
📊 Tracking Everything
Every request logs:
```json
{
  "model": "deepseek-r1:latest",
  "total_tokens": 9992,
  "prompt": "...",
  "selected_files": ["mcphub.lua"],
  "created_at": "2025-05-24"
}
```
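On the PostgreSQL side, here's a hedged sketch using node-postgres; the `request_logs` table and its columns are assumptions that simply mirror the JSON record above:

```typescript
import { Pool } from "pg";

const pool = new Pool(); // connection details come from PG* environment variables

// Persist one request log; created_at is set by the database.
async function logRequest(entry: {
  model: string;
  total_tokens: number;
  prompt: string;
  selected_files: string[];
}): Promise<void> {
  await pool.query(
    `INSERT INTO request_logs (model, total_tokens, prompt, selected_files, created_at)
     VALUES ($1, $2, $3, $4, NOW())`,
    [entry.model, entry.total_tokens, entry.prompt, entry.selected_files]
  );
}
```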
I also record:
- GPT-4 equivalent cost (e.g. $0.10)
- Claude and Gemini pricing
- My actual cost: $0
This makes it easy to say: “This week I saved $7.20 in token fees.”
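The simulation itself is simple arithmetic: multiply the logged token count by each provider's per-token rate. A sketch of the idea; the rates below are illustrative placeholders, not live pricing:

```typescript
// Illustrative per-1K-token rates (placeholders, not current provider pricing).
const RATES_PER_1K: Record<string, number> = {
  "gpt-4": 0.03,
  "claude": 0.015,
  "gemini": 0.007,
};

// What this request would have cost on a paid API. Actual cost here: $0.
function simulatedCost(provider: string, totalTokens: number): number {
  return (totalTokens / 1000) * (RATES_PER_1K[provider] ?? 0);
}

// simulatedCost("gpt-4", 9992) ≈ $0.30 with these placeholder rates
```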
🙌 Want to Use or Contribute?
Check out the repo:
🔗 https://github.com/CarGDev/apiAi
You’ll find:
- Express.js wrapper code
- Ollama integration
- Token tracking logic
- Prompt/error endpoints
- Sample setup for Avante.nvim
Feel free to fork it, log an issue, or submit a pull request. There’s plenty of room to grow: edit suggestions, embedded diff views, buffer streaming, agent queues, you name it.
☕ Final Thoughts
This isn’t just about saving money — it’s about owning your tools.
I’m not building a Cursor clone for the hype.
I’m building a private AI dev agent, fully under my control.
If you’re a Neovim user, have a spare GPU, and want to break free from paid APIs, give this a try.
There’s still a lot to add — but the core is solid.
And yeah, I proudly say it:
“I’m running my own Cursor-style AI stack — from my homelab.”
More posts coming soon. Until then, keep breaking and building. 💻🔥