# 🧠 CodeRecall

> Context-aware local AI assistant for developers using Ollama, LanceDB, and VS Code's Continue extension.

CodeRecall ingests your entire codebase — including Git history and diffs — into a local vector database (LanceDB), enabling RAG-augmented queries via Ollama models right inside VS Code.

No cloud APIs. No latency. Full control.

---
## 🚀 Features

- 🔍 **Semantic code search** across multiple languages
- 📜 **Git commit + diff embedding** for code evolution awareness
- 🤖 **RAG integration** with local Ollama models (e.g. LLaMA 3)
- 💻 **VS Code Continue extension support**
- ⚙️ Configurable with a simple `config.ini`

---
## 🧩 Project Structure

```
CodeRecall/
├── lancedb_ingest.py             # Ingest codebase + Git into LanceDB
├── lancedb_context_provider.py   # VS Code Continue context provider
├── config.ini.example            # Ollama + LanceDB settings
├── lancedb-data/                 # LanceDB persistence directory
└── config.yaml.example           # Continue extension config
```

---
## 🔧 Setup

### 1. Install dependencies

```bash
pip install lancedb
```

Make sure you also have:

- 🦙 [Ollama](https://ollama.com/) installed and running (see the quick check below)
- ✅ [Continue extension](https://marketplace.visualstudio.com/items?itemName=Continue.continue) for VS Code
- 🐙 A Git repo initialized (optional, but recommended)
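
To confirm Ollama is reachable before continuing, you can hit its standard model-listing endpoint (`/api/tags`). A minimal check (hypothetical helper, not part of the repo; requires `requests`):

```python
# Sanity check: list the models the local Ollama server has available.
import requests

resp = requests.get("http://localhost:11434/api/tags", timeout=5)
resp.raise_for_status()
print([m["name"] for m in resp.json().get("models", [])])
```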
### 2. Configure `config.ini`

```ini
[ollama]
url = http://localhost:11434

[lancedb]
persist_directory = ./lancedb-data

[s3]
enable = True
bucket_name = my-s3-bucket
access_key_id = my-access-key
secret_access_key = my-secret-key
region = us-east-1
# Optional: only needed for third-party S3 providers
endpoint = http://minio:9000

[server]
host = 0.0.0.0
port = 8080
```
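
For reference, these values can be read with Python's standard `configparser`; the keys below mirror the example above (how the repo's scripts actually consume them may differ):

```python
# Minimal sketch: reading config.ini with the standard library.
import configparser

config = configparser.ConfigParser()
config.read("config.ini")

ollama_url = config["ollama"]["url"]                  # http://localhost:11434
persist_dir = config["lancedb"]["persist_directory"]  # ./lancedb-data
use_s3 = config.getboolean("s3", "enable", fallback=False)
host = config.get("server", "host", fallback="0.0.0.0")
port = config.getint("server", "port", fallback=8080)
```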
---

## 📥 Ingest your project

```bash
python lancedb_ingest.py
```

This loads:

- Source code in common languages
- Markdown and text docs
- Git commit messages and full diffs
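
For orientation, here is a stripped-down sketch of what this ingestion flow can look like, assuming Ollama's `/api/embeddings` endpoint and the `lancedb` Python client. The shipped `lancedb_ingest.py` covers more file types and edge cases; the table name `coderecall` is illustrative:

```python
# Hypothetical sketch of the ingestion flow, not the shipped script.
import pathlib
import subprocess

import lancedb
import requests

OLLAMA_URL = "http://localhost:11434"

def embed(text: str) -> list[float]:
    """Embed text with Ollama's nomic-embed-text model."""
    resp = requests.post(
        f"{OLLAMA_URL}/api/embeddings",
        json={"model": "nomic-embed-text", "prompt": text},
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["embedding"]

rows = []

# Source files (extend the glob to the languages you care about).
for path in pathlib.Path(".").rglob("*.py"):
    text = path.read_text(errors="ignore")
    rows.append({"id": str(path), "text": text, "vector": embed(text)})

# Git history: one entry per commit, message plus full diff.
log = subprocess.run(["git", "log", "-p"], capture_output=True, text=True).stdout
for chunk in log.split("\ncommit "):
    if chunk.strip():
        commit_id = chunk.splitlines()[0][:12]
        rows.append({"id": commit_id, "text": chunk, "vector": embed(chunk)})

db = lancedb.connect("./lancedb-data")
db.create_table("coderecall", data=rows, mode="overwrite")
```

Note that very large diffs can exceed the embedding model's context window; chunking for those is on the roadmap below.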
---
## 🧠 Add as a VS Code Context Provider

### `config.yaml` for Continue

```yaml
name: Local Assistant
version: 1.0.0
schema: v1

models:
  - name: Ollama Autodetect
    provider: ollama
    model: AUTODETECT
    apiBase: http://localhost:11434

  - name: Ollama Autocomplete
    provider: ollama
    model: qwen2.5-coder:1.5b-base
    apiBase: http://localhost:11434
    roles:
      - autocomplete

  - name: Nomic Embed Text
    provider: ollama
    model: nomic-embed-text
    apiBase: http://localhost:11434
    roles:
      - embed

context:
  - provider: code
  - provider: docs
  - provider: diff
  - provider: terminal
  - provider: problems
  - provider: folder
  - provider: codebase

  # LanceDB context provider (matches the [server] section in config.ini)
  - provider: http
    params:
      url: http://localhost:8080/retrieve
```
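
On the server side, Continue's `http` provider POSTs a JSON body containing the user's query and expects a list of context items back, each with `name`, `description`, and `content` fields. A minimal `/retrieve` sketch with FastAPI, reusing the table from the ingestion sketch above (the actual `lancedb_context_provider.py` may differ):

```python
# Hypothetical sketch of a /retrieve endpoint for Continue's http provider.
import lancedb
import requests
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
db = lancedb.connect("./lancedb-data")
table = db.open_table("coderecall")  # table name from the ingestion sketch

class Query(BaseModel):
    query: str  # Continue sends more fields; extras are ignored by default

@app.post("/retrieve")
def retrieve(q: Query):
    # Embed the query with the same model used at ingestion time.
    emb = requests.post(
        "http://localhost:11434/api/embeddings",
        json={"model": "nomic-embed-text", "prompt": q.query},
        timeout=60,
    ).json()["embedding"]
    hits = table.search(emb).limit(5).to_list()  # cf. n_results in the real provider
    return [
        {"name": h["id"], "description": h["id"], "content": h["text"]}
        for h in hits
    ]
```

Serve it on the `[server]` host/port from `config.ini`, e.g. `uvicorn lancedb_context_provider:app --host 0.0.0.0 --port 8080`.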
---
## ✨ Usage

1. Launch VS Code.
2. Open the Continue sidebar.
3. Select `@HTTP` as your context provider.
4. Ask your model questions about your codebase, architecture, or commits.

Example prompt:

> _“How does the Git ingestion pipeline work?”_
---
## 📌 Notes

- The default embedding model is `nomic-embed-text` (via Ollama).
- Adjust `n_results` in `lancedb_context_provider.py` for broader or narrower context.
- Works fully offline; no API keys required.
---
## 🧪 Roadmap Ideas

- [ ] Add OpenAPI spec ingestion
- [ ] Enable full-text search fallback
- [ ] Support multi-repo ingestion
- [ ] Optional chunking for large diffs or files
---
## 🛡 License

Licensed under the [Creative Commons Attribution-NonCommercial 4.0 International](https://creativecommons.org/licenses/by-nc/4.0/) license.