How I'm Rebuilding Cursor with My Own Homelab AI: Zero API Costs
🚫 Why Not Just Pay for Cursor or Copilot?
Because I knew I could do better, for free.
Cursor and GitHub Copilot Chat are incredible tools. They inject context, let you chat about your codebase, and feel like having a pair-programmer built into your editor. But here's the thing:
- They require paid API keys (OpenAI, Claude, etc.)
- They tie your usage and logs to external systems
- They don't respect your compute; they rent you theirs
I wanted the same dev experience, but entirely self-hosted, powered by my own GPU server at home. No token limits, no vendor lock-in.
And I'm almost there.
🧱 My Stack (aka The Homelab Dev Agent Engine)
| Component | Role |
|---|---|
| Ollama | Local LLM runtime |
| DeepSeek-R1 | Fast, local, open chat model |
| Node.js API | My custom wrapper server |
| Avante.nvim | Neovim AI interface plugin |
| PostgreSQL | Token, prompt, error tracking |
| Tailscale VPN | Internal access across machines |
I wanted to mimic Cursor, with /slash commands, local file injection, and contextual AI completions, using nothing but my own infrastructure.
🧩 How It Works (End to End)
- My Node.js API does the heavy lifting:
  - Validates bearer/API keys
  - Accepts `selected_files` from Avante
  - Reads each file and injects its contents into the prompt (like Copilot Chat)
  - Sends the structured request to Ollama, locally
  - Streams the AI response back to Neovim
  - Logs every prompt, token count, and even simulated OpenAI/Claude pricing into PostgreSQL
- The Ollama server, running DeepSeek-R1 on my GPU workstation, handles completions locally.
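To give a sense of the shape of that pipeline, here's a minimal sketch, assuming Node 18+ (for built-in `fetch`) and Ollama's standard `/api/chat` endpoint. Names like `LOCAL_DEV_KEY` and the exact request shape are placeholders, not the repo's actual code:
```javascript
// Minimal sketch of the wrapper's /api/chat flow -- a simplification, not the repo's exact code.
const express = require("express");
const fs = require("fs/promises");

const app = express();
app.use(express.json());

const OLLAMA_URL = "http://localhost:11434/api/chat"; // Ollama's default local endpoint

app.post("/api/chat", async (req, res) => {
  // 1. Validate the bearer/API key.
  const key = (req.headers.authorization || "").replace("Bearer ", "");
  if (key !== process.env.LOCAL_DEV_KEY) return res.status(401).end();

  // 2. Read each selected file and inject its contents as context.
  const { prompt, selected_files = [] } = req.body;
  const context = await Promise.all(
    selected_files.map(async (f) => `--- ${f} ---\n${await fs.readFile(f, "utf8")}`)
  );

  // 3. Forward a structured request to Ollama with streaming enabled.
  const upstream = await fetch(OLLAMA_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "deepseek-r1:latest",
      stream: true,
      messages: [
        { role: "system", content: context.join("\n\n") },
        { role: "user", content: prompt },
      ],
    }),
  });

  // 4. Relay Ollama's newline-delimited JSON chunks back to the editor
  //    (the real wrapper reshapes these into SSE events and logs tokens to PostgreSQL).
  res.setHeader("Content-Type", "application/x-ndjson");
  for await (const chunk of upstream.body) {
    res.write(chunk);
  }
  res.end();
});

app.listen(3000);
```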
Avante.nvim in Neovim is configured to call my API:
```lua
require("avante").setup({
  endpoint = "https://api-ai.cargdev.io/api/chat", -- my Node.js wrapper, not a paid API
  api_key = "local-dev-key",
  model = "deepseek-r1:latest",
})
```
💪 Cursor-Like Features I Already Have
| Feature | Status | Implementation |
|---|---|---|
| File-aware chat | ✅ | Avante + my wrapper inject file contents |
| Slash commands | ✅ | `extensions.avante.make_slash_commands` |
| Streaming responses | ✅ | Via Ollama stream + SSE |
| Chat UI in Neovim | ✅ | Avante handles buffer formatting |
| Prompt + error logging | ✅ | PostgreSQL backend |
| Token cost simulation | ✅ | GPT-4, Claude, Gemini savings logged |
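The cost simulation is plain arithmetic: multiply token counts by what a paid API would have billed. A sketch with illustrative example rates (vendor pricing changes; these numbers are assumptions, not current prices):
```javascript
// Illustrative per-1K-token rates -- example values only, not current vendor pricing.
const RATES_PER_1K = {
  "gpt-4":  { input: 0.03,  output: 0.06 },
  "claude": { input: 0.015, output: 0.075 },
};

// What a request would have cost on a paid API; my actual cost stays $0.
function simulatedCost(vendor, promptTokens, completionTokens) {
  const rate = RATES_PER_1K[vendor];
  return (promptTokens / 1000) * rate.input + (completionTokens / 1000) * rate.output;
}

// e.g. simulatedCost("gpt-4", 8000, 1992) => 0.35952 (about $0.36 "saved")
```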
🧪 Still Missing Full Cursor Behavior, But Getting Close
I don't yet have true buffer modification support (e.g. code refactor suggestions auto-applied). Cursor and Copilot Chat can modify code intelligently based on full-project context.
But the pieces are almost in place:
- My wrapper supports per-file context
- Prompt injection works for any file selection
- Avante can be extended or patched to support diff/edit mode
And unlike Cursor, my tool:
- Runs 100% locally
- Is fully inspectable
- Costs me $0/month
📊 Tracking Everything
Every request logs:
```json
{
  "model": "deepseek-r1:latest",
  "total_tokens": 9992,
  "prompt": "...",
  "selected_files": ["mcphub.lua"],
  "created_at": "2025-05-24"
}
```
I also record:
- GPT-4 equivalent cost (e.g. $0.10)
- Claude and Gemini pricing
- My actual cost: $0
This makes it easy to say: "This week I saved $7.20 in token fees."
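Since every request lands as a row in PostgreSQL, the weekly tally is one aggregate query. Here's a sketch using the `pg` client; the table and column names (`requests`, `gpt4_cost_usd`) are my illustrative assumptions, not necessarily the repo's actual schema:
```javascript
const { Pool } = require("pg");
const pool = new Pool(); // connection settings come from PG* environment variables

// Log one request, including its simulated paid-API cost.
async function logRequest(entry) {
  await pool.query(
    `INSERT INTO requests (model, total_tokens, prompt, selected_files, gpt4_cost_usd)
     VALUES ($1, $2, $3, $4, $5)`,
    [entry.model, entry.total_tokens, entry.prompt, entry.selected_files, entry.gpt4_cost_usd]
  );
}

// Sum a week's worth of simulated fees: the "savings" number.
async function weeklySavings() {
  const { rows } = await pool.query(
    `SELECT COALESCE(SUM(gpt4_cost_usd), 0) AS saved
     FROM requests
     WHERE created_at >= NOW() - INTERVAL '7 days'`
  );
  return Number(rows[0].saved);
}
```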
🙌 Want to Use or Contribute?
Check out the repo:
👉 https://github.com/CarGDev/apiAi
You'll find:
- Express.js wrapper code
- Ollama integration
- Token tracking logic
- Prompt/error endpoints
- Sample setup for Avante.nvim
Feel free to fork it, log an issue, or submit a pull request. There's plenty of room to grow: edit suggestions, embedded diff views, buffer streaming, agent queues, you name it.
✅ Final Thoughts
This isn't just about saving money; it's about owning your tools.
I'm not building a Cursor clone for the hype.
I'm building a private AI dev agent, fully under my control.
If you're a Neovim user, have a spare GPU, and want to break free from paid APIs, give this a try.
There's still a lot to add, but the core is solid.
And yeah, I proudly say it:
"I'm running my own Cursor-style AI stack, right from my homelab."
More posts coming soon. Until then, keep breaking and building. 💻🔥