sgrep - Semantic Code Search
Use sgrep to search code semantically using natural language queries. sgrep understands code meaning, not just text patterns.
When to Use
- Finding code by concept or functionality ("where do we handle authentication?")
- Discovering related code patterns ("show me retry logic")
- Exploring codebase structure ("how is the database connection managed?")
- Searching for implementation patterns ("where do we validate user input?")
Prerequisites
Ensure sgrep is installed:
curl -fsSL https://raw.githubusercontent.com/rika-labs/sgrep/main/scripts/install.sh | sh
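Before piping the installer into a shell, it can help to check whether sgrep is already on your PATH. The sketch below is one way to do that; `have_sgrep` and `install_sgrep_if_missing` are hypothetical helper names, not part of sgrep.

```shell
# Hypothetical helper: succeeds if an `sgrep` command is already available.
have_sgrep() {
  command -v sgrep >/dev/null 2>&1
}

# Hypothetical helper: run the install script above only when sgrep is missing.
install_sgrep_if_missing() {
  if have_sgrep; then
    echo "sgrep already installed"
  else
    curl -fsSL https://raw.githubusercontent.com/rika-labs/sgrep/main/scripts/install.sh | sh
  fi
}
```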
Basic Usage
Search Command
sgrep search "your natural language query"
Common Patterns
Find functionality:
sgrep search "where do we handle user authentication?"
Search with filters:
sgrep search "error handling" --filters lang=rust
sgrep search "API endpoints" --glob "src/**/*.rs"
Get more results:
sgrep search "database queries" --limit 20
Show full context:
sgrep search "retry logic" --context
Command Options
- --limit <n> or -n <n>: Maximum results (default: 10)
- --context or -c: Show full chunk content instead of snippet
- --path <dir> or -p <dir>: Repository path (default: current directory)
- --glob <pattern>: File pattern filter (repeatable)
- --filters key=value: Metadata filters like lang=rust (repeatable)
- --json: Emit structured JSON output (agent-friendly)
- --threads <n>: Maximum threads for parallel operations
- --cpu-preset <preset>: CPU usage preset (auto|low|medium|high|background)
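Several of these options can be combined in one call. The sketch below wraps such a call in a hypothetical function named `search_rust`. The `tests/**/*.rs` glob assumes a conventional Rust project layout, and how sgrep combines multiple globs with filters is not described here, so treat the result set as something to verify.

```shell
# Hypothetical wrapper: search Rust sources and tests in another checkout.
# $1 = repository path, $2 = natural language query.
search_rust() {
  sgrep search "$2" \
    --path "$1" \
    --filters lang=rust \
    --glob "src/**/*.rs" \
    --glob "tests/**/*.rs" \
    --limit 5 \
    --context
}
```

For example, `search_rust ../other-repo "retry logic"` would search a sibling checkout without changing directories.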
Indexing
If no index exists, sgrep will automatically create one on first search. To manually index:
sgrep index # Index current directory
sgrep index --force # Rebuild from scratch
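If you suspect an index is out of date, one option is to rebuild it right before searching. This is a minimal sketch using only the two commands above; `reindex_and_search` is a hypothetical helper name, and the function runs in a subshell so your working directory is left unchanged.

```shell
# Hypothetical helper: rebuild the index for a repo, then search it.
# $1 = repository directory, $2 = natural language query.
reindex_and_search() (
  cd "$1" || exit 1
  sgrep index --force && sgrep search "$2"
)
```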
Watch Mode
For real-time index updates during development:
sgrep watch # Watch current repo
sgrep watch --debounce-ms 200
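Since the watcher keeps running until it is stopped, it can be convenient to tie its lifetime to another command, such as an editor session or a test run. The sketch below shows one way to do that; `with_sgrep_watch` is a hypothetical helper, and it assumes the watcher can safely be stopped with the default termination signal.

```shell
# Hypothetical helper: keep `sgrep watch` running while another command runs.
# All arguments are executed as the foreground command.
with_sgrep_watch() {
  sgrep watch --debounce-ms 200 &
  watch_pid=$!

  # Run the foreground command and remember its exit status.
  "$@"
  status=$?

  # Stop the watcher and reap it so no background job is left behind.
  kill "$watch_pid" 2>/dev/null
  wait "$watch_pid" 2>/dev/null
  return "$status"
}
```

For example, `with_sgrep_watch cargo test` would keep the index updating for the duration of the test run and return the test command's exit status.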
Configuration
Check or create embedding provider configuration:
sgrep config # Show current configuration
sgrep config --init # Create default config file
sgrep config --show-model-dir # Show model cache directory
sgrep config --verify-model # Check if model files are present
sgrep uses local embeddings by default. Config lives at ~/.sgrep/config.toml.
If HuggingFace is blocked (e.g., in China), set the HTTPS_PROXY environment variable or see the offline installation guide.
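To apply a proxy to a single invocation instead of exporting it for the whole session, the variable can be set just for that command. In this sketch, `sgrep_via_proxy` is a hypothetical helper and the proxy address in the usage note is a placeholder for your own.

```shell
# Hypothetical helper: run one sgrep command through an HTTPS proxy.
# $1 = proxy URL; remaining arguments are passed to sgrep unchanged.
sgrep_via_proxy() {
  proxy_url=$1
  shift
  # The assignment prefix scopes HTTPS_PROXY to this single command.
  HTTPS_PROXY="$proxy_url" sgrep "$@"
}
```

For example, `sgrep_via_proxy http://127.0.0.1:8080 search "retry logic"`, followed by `sgrep config --verify-model` to check that the model files are present.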
Examples
Find authentication code:
sgrep search "how do we authenticate users?"
Find error handling:
sgrep search "error handling patterns" --filters lang=rust
Search specific file types:
sgrep search "API rate limiting" --glob "src/**/*.rs"
Get detailed results:
sgrep search "database connection pooling" --context --limit 5
Agent-friendly JSON output:
sgrep search --json "retry logic"
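In scripts, it is often safer to capture the JSON to a file and only keep it when the search succeeded. The sketch below does that without parsing the output, because the JSON schema is not described here; `sgrep_json_to_file` is a hypothetical helper name.

```shell
# Hypothetical helper: save JSON search results to a file for later processing.
# $1 = natural language query, $2 = output file.
sgrep_json_to_file() {
  tmp_file="$2.tmp"
  # Keep the result only if sgrep succeeded and produced non-empty output.
  if sgrep search --json "$1" > "$tmp_file" && [ -s "$tmp_file" ]; then
    mv "$tmp_file" "$2"
  else
    rm -f "$tmp_file"
    return 1
  fi
}
```

Writing to a temporary file first means a failed search never leaves a partial or empty results file behind.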
Understanding Results
Results show:
- File path and line numbers: Where the code is located
- Score: Relevance score (higher is better)
- Semantic score: How well it matches the query meaning
- Keyword score: Text matching score
- Code snippet: Relevant code excerpt
Best Practices
- Use natural language: Ask questions like you would ask a colleague
- Be specific: "authentication middleware" is better than "auth"
- Combine with filters: Use --filters lang=rust to narrow by language
- Use globs: Use --glob "src/**/*.rs" to search specific directories
- Check context: Use --context when you need full function/class definitions
- Use JSON for automation: Use --json for structured output in scripts