I will be discussing a command that holds a significant place in your toolkit and command-line operations: grep. The first version of this article was published in 2019. Since then, the developer tools ecosystem, particularly AI-assisted development workflows, has undergone significant changes. In this update, I will cover everything from grep fundamentals to modern alternatives like ripgrep, how AI coding agents use these tools, and next-generation semantic search solutions.
grep Fundamentals
grep takes its name from the ed editor command g/re/p (globally search for a regular expression and print). It selects the lines of a text corpus that match a specified pattern and prints them. The pattern is applied to the files under the given path, and the matching lines are listed. It can be used on its own or combined with other commands through pipes (|).
Its most basic usage:
grep '[search-text]' [file-path]
Commonly Used Parameters
| Parameter | Description | Example |
|---|---|---|
| -i | Case-insensitive search | grep -i 'error' log.txt |
| -r | Recursive search in subdirectories | grep -r 'TODO' src/ |
| -n | Show line numbers | grep -n 'function' app.js |
| -v | Show non-matching lines (exclude) | grep -v 'debug' log.txt |
| -l | List only file names | grep -l 'import' *.ts |
| -c | Show match count | grep -c 'error' log.txt |
| -w | Whole word matching | grep -w 'return true' *.php |
| -o | Show only matching part | grep -o 'v[0-9]\+' changelog |
| -A N | N lines after match | grep -A 3 'error' log.txt |
| -B N | N lines before match | grep -B 2 'error' log.txt |
| -C N | N lines around match | grep -C 5 'error' log.txt |
| -E | Extended regex | grep -E 'err(or|eur)' log.txt |
| -F | Fixed string (no regex) | grep -F '$variable' code.sh |
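To see several of these flags side by side, here is a minimal sketch that runs them against a throwaway file. The log contents and variable names are made up for illustration; only the flags themselves come from the table above.

```shell
# Build a throwaway sample log (hypothetical content) to try the flags on.
workdir=$(mktemp -d)
log="$workdir/log.txt"
cat > "$log" <<'EOF'
INFO service started
ERROR disk full
debug cache warmed
error timeout after 30s
INFO request served
EOF

# -c counts matching lines; -i makes the match case-insensitive.
error_count=$(grep -ci 'error' "$log")
echo "error lines: $error_count"

# -n prefixes each matching line with its line number.
numbered=$(grep -in 'error' "$log")
echo "$numbered"

# -v inverts the match: count every line that is NOT a debug line.
without_debug=$(grep -vc 'debug' "$log")
echo "non-debug lines: $without_debug"

# -o prints only the matched text, here the numeric timeout value.
timeout_value=$(grep -o '[0-9][0-9]*s' "$log")
echo "timeout: $timeout_value"

rm -rf "$workdir"
```

With this sample, the case-insensitive count is 2, the numbered matches fall on lines 2 and 4, and -o prints only 30s.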
Pipeline Usage
grep’s power emerges when chained with other commands:
# Search for nginx in running processes
ps aux | grep 'nginx'
# View 500 errors in log file page by page
grep '500' /var/log/access.log | more
# Check file permissions for specific extensions
ls -l /var/www/html/*.jpg | grep rwxrwxrwx
To search for multiple words:
grep -i 'spam\|hashes' access_log.txt # method 1
grep -iE 'spam|hashes' access_log.txt # method 2
grep -i -e 'spam' -e 'hashes' access_log.txt # method 3
The latest version of GNU grep is 3.12 (April 2025), which fixes the issue of being unable to search in directories containing more than 100,000 entries.
Modern Alternatives: Why New Tools Were Needed
grep has served as one of the cornerstones of Unix philosophy for decades. However, today’s massive codebases, multi-core processors, and AI-assisted development workflows have brought new requirements:
- Performance: grep uses a single core. This becomes slow in projects with hundreds of thousands of files.
- Smart defaults: Directories like node_modules, dist, and .git need to be manually excluded.
- Unicode and modern regex: Multi-language support and advanced regex requirements in modern codebases are increasing.
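The manual exclusion mentioned in the list above can be sketched as follows. The directory layout is hypothetical and built in a temporary folder; --exclude-dir is the GNU grep option for skipping directories by name.

```shell
# Throwaway project tree (hypothetical layout) with dependency and build directories.
project=$(mktemp -d)
mkdir -p "$project/src" "$project/node_modules/lib" "$project/dist"
echo '// TODO: handle retries' > "$project/src/app.js"
echo '// TODO: upstream note'  > "$project/node_modules/lib/index.js"
echo '// TODO: minified copy'  > "$project/dist/app.js"

# Plain recursive grep walks everything, including dependencies and build output.
all_hits=$(cd "$project" && grep -rl 'TODO' . | sort)
echo "$all_hits"

# With GNU grep, every noisy directory has to be excluded by hand.
filtered_hits=$(cd "$project" && grep -rl \
  --exclude-dir=node_modules --exclude-dir=dist --exclude-dir=.git \
  'TODO' . | sort)
echo "$filtered_hits"

rm -rf "$project"
```

The first search lists three files, while the filtered one returns only ./src/app.js. Keeping that exclusion list up to date in every command is the chore that the newer tools try to remove.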
ripgrep (rg)
ripgrep is a search tool developed in Rust by Andrew Gallant, positioned as a modern alternative to grep [1]. With over 59,700 stars on GitHub, ripgrep continues its active development with the latest version 15.1.0 (October 2025).
Key features that set ripgrep apart:
- Multi-core parallel search: Search operations are automatically distributed across CPU cores
- Automatic .gitignore support: Reads .gitignore and .ignore files and, by default, skips whatever they list (typically directories like node_modules and build)
- Advanced regex engine: Finite automaton-based, SIMD-optimized Rust regex engine
- Unicode support: Full Unicode character class support
- Jujutsu VCS recognition: Jujutsu version control system repository recognition support since version 15.0.0
Performance Comparison
Benchmark on Linux kernel source code (4,640 directories, 178 .gitignore files) [2]:
| Operation | GNU grep | ripgrep | Difference |
|---|---|---|---|
| Simple pattern search | ~0.67s | ~0.06s | 11x faster |
| Line-numbered search (-n) | 9.48s | 1.66s | 5.7x faster |
Basic Usage
ripgrep’s command-line interface feels familiar to grep users:
# Simple search (recursive and .gitignore-aware by default)
rg 'TODO'
# File type filter
rg --type ts 'interface'
rg --glob '*.tsx' 'useState'
# Fixed string search (no regex, faster)
rg -F '$variable'
# Search with context
rg -C 3 'error'
# File names only
rg -l 'import.*lodash'
# Multiple patterns
rg -e 'TODO' -e 'FIXME' -e 'HACK'
# JSON output format (for programmatic use)
rg --json 'pattern'
Other Alternatives
| Tool | Language | Latest Version | GitHub Stars | Status |
|---|---|---|---|---|
| ripgrep (rg) | Rust | 15.1.0 (October 2025) | 59,700+ | Active |
| ack | Perl | 3.9.0 (May 2025) | 799 | Active |
| ag (Silver Searcher) | C | 2.2.0 (August 2018) | 27,200+ | Unmaintained |
| ugrep | C++ | 7.5 (2025) | 3,000+ | Active |
| GNU grep | C | 3.12 (April 2025) | N/A | Active (slow pace) |
ack, a Perl-based search tool, continues to be actively developed. With version 3.9.0, it offers the Boolean search operators --and, --or, and --not, which are not directly available in grep or ripgrep [3].
ag (The Silver Searcher) was an important stepping stone from grep to ripgrep. However, no new version has been released since 2018, and it is unmaintained.
ugrep stands out as an alternative fully compatible with GNU grep. It offers unique features such as an interactive TUI, searching within compressed files (gz, bz2, xz, zstd) and archives (zip, 7z, tar), and searching in PDF and Word documents [4].
AI Agents and Text Search
The proliferation of AI-powered coding tools has created a new layer in the text search ecosystem. The first step for an AI agent to “understand” a codebase is finding the relevant files and code snippets. Text search tools are vital for answering questions like “Where is this function defined?” or “Which file uses this API key?”
Which Tool Do Agents Use?
AI coding agents largely rely on ripgrep as their internal search engine, but Claude Code took a different path in April 2026 by replacing ripgrep with ugrep and bfs on native builds:
| Agent | Search Tool | Source |
|---|---|---|
| Claude Code | ugrep + bfs (Bash, native) | 2.1.117 release notes [5] |
| GitHub Copilot CLI | ripgrep (included November 2025) | GitHub Blog [6] |
| OpenAI Codex | ripgrep (primary), grep (fallback) | GitHub repository [7] |
| Aider | grep-ast (tree-sitter powered) | GitHub repository [8] |
Claude Code shipped a ripgrep-based Grep tool (and a Glob tool) for a long time. Version 2.1.117, released in April 2026, removed both tools on macOS and Linux native builds; embedded ugrep and bfs binaries are now invoked through Bash instead. Anthropic’s 2.1.118 release notes flagged the change as “four months in the making, now faster than ever and all Bash.” Windows builds and npm-installed versions retain the previous behavior [5].
Note: ugrep positions itself as a drop-in replacement fully compatible with GNU grep (see the Modern Alternatives section). With this switch, Claude Code’s search layer traded ripgrep’s .gitignore-aware performance for ugrep’s regex compatibility and compressed-file support.
Challenges AI Agents Face
1. Noisy Results
# Problematic: Searching the entire project
grep -r 'config' .
# Thousands of irrelevant results in node_modules, dist, .next
This approach pollutes the agent’s context and rapidly consumes token limits. ripgrep’s .gitignore support largely solves this problem, though there are still cases where it falls short.
2. Lack of Context
grep only returns the matching line. Even with surrounding lines via the -C parameter, it may not be sufficient to understand the entire function or class. Aider’s grep-ast tool uses the tree-sitter parser to show the matching line along with the function, class, or method it belongs to [8].
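As a rough illustration of the idea, the sketch below prints each match together with the nearest preceding function header. It is a crude, regex-based approximation in awk, not how grep-ast works (that tool parses the code with tree-sitter), and the sample file and the `^function ` header rule are assumptions chosen for the demo.

```shell
# Hypothetical source file with two functions that share similar lines.
src=$(mktemp)
cat > "$src" <<'EOF'
function loadConfig() {
  const path = resolve('app.yml');
  return parse(path);
}

function saveConfig(data) {
  const path = resolve('app.yml');
  write(path, data);
}
EOF

# Remember the most recent line that looks like a function header and print it
# once before the first match found underneath it.
with_context=$(awk '
  /^function / { header = NR ": " $0; printed = 0 }
  /write\(/    { if (!printed && header != "") { print header; printed = 1 }
                 print NR ": " $0 }
' "$src")
echo "$with_context"

rm -f "$src"
```

For this sample, the output is the saveConfig header from line 6 followed by the matching call on line 8. A line-based heuristic like this breaks on nested or multi-line declarations, which is the gap a real parser closes.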
3. Regex Errors
AI agents can sometimes generate incorrect regex patterns. Inconsistencies are particularly observed with escaping special characters like ., *, (, ). Therefore, the -F (fixed string) parameter should be preferred when exact string matching is needed:
# Fixed string instead of potentially incorrect regex
rg -F 'interface{}' --type go
4. Token Consumption
As noted in discussions in the OpenAI Codex repository, “grep or filename heuristics fall short in multilingual repositories, with renamed identifiers, or when concepts are expressed differently from the query” [7]. This drives the need for semantic search tools.
Solutions and Best Practices
Narrowing the search: Constraining to specific directories and file types instead of searching the entire project:
# Instead of searching the entire project
rg 'handleSubmit' src/components/ --glob '*.tsx'
Fixed string search: Using -F when regex is not required:
rg -F 'process.env.DATABASE_URL'
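The difference is easy to miss because the regex version still finds the right line; it just finds wrong ones too. A small sketch using grep, whose -F flag has the same meaning as in rg (the file contents here are invented for the demo):

```shell
# Hypothetical file: one real property access and one look-alike identifier.
env_file=$(mktemp)
cat > "$env_file" <<'EOF'
const url = process.env.DATABASE_URL;
const legacy = processXenvXDATABASE_URL;
EOF

# As a regex, each "." matches any character, so both lines match.
regex_hits=$(grep -c 'process.env.DATABASE_URL' "$env_file")

# With -F the dots are literal, so only the real property access matches.
fixed_hits=$(grep -cF 'process.env.DATABASE_URL' "$env_file")

echo "regex: $regex_hits, fixed: $fixed_hits"
rm -f "$env_file"
```

This prints regex: 2, fixed: 1. For an agent, the extra hit is one more irrelevant line in the context window.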
Choosing output mode: Proceeding in two stages, first file list, then content search:
# First, which files contain it?
rg -l 'useAuth'
# Then search in detail within those files
rg -C 3 'useAuth' src/hooks/useAuth.ts
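The same two-stage flow can be chained so that the second pass only touches the files found by the first. The sketch below uses plain grep and xargs in a temporary directory; the file names and contents are hypothetical, and it assumes paths without spaces.

```shell
# Throwaway repository layout (hypothetical files).
repo=$(mktemp -d)
mkdir -p "$repo/src/hooks" "$repo/src/pages"
printf 'export function useAuth() {\n  return session;\n}\n' > "$repo/src/hooks/useAuth.ts"
printf 'import { useAuth } from "../hooks/useAuth";\nconst user = useAuth();\n' > "$repo/src/pages/home.ts"
printf 'export const title = "About";\n' > "$repo/src/pages/about.ts"

# Stage 1: cheap pass that returns file names only.
files=$(cd "$repo" && grep -rl 'useAuth' src | sort)
echo "$files"

# Stage 2: detailed pass with line numbers, limited to the files from stage 1.
details=$(cd "$repo" && echo "$files" | xargs grep -n 'useAuth')
echo "$details"

rm -rf "$repo"
```

Stage 1 returns two of the three files, and stage 2 prints three matching lines from just those two. The first pass costs a handful of tokens and tells the agent where it is worth spending more.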
Next-Generation Tools: Beyond Text Search
In the 2025-2026 period, the text search ecosystem is evolving into a three-layer structure:
Layer 1: Exact Text Matching (grep, ripgrep)
Fast, reliable classical text search that produces no false positives. Still the best choice for searching a known string or regex pattern.
Layer 2: Structural Code Search (ast-grep)
ast-grep (sg) performs structural search on the Abstract Syntax Tree instead of text-based search [9]. Using the tree-sitter parser to understand code structure, it can run queries beyond text matching:
# Find console.log calls (only function calls, excluding those in strings)
sg -p 'console.log($$$)' --lang typescript
# Find async function declarations (a starting point for auditing missing try-catch blocks)
sg -p 'async function $NAME($$$) { $$$ }' --lang javascript
ast-grep also provides an MCP (Model Context Protocol) server for AI agent integration. This enables tools like Claude Code or Cursor to perform structural code searches.
Layer 3: Semantic Search (mgrep, grepai)
For a concrete implementation of this layer, see the Content Intelligence System post.
mgrep is a semantic search tool developed by Mixedbread AI that works with AI embeddings [10]. It can search code, text, and even PDF files using natural language queries:
# Natural language search
mgrep "user authentication flow"
# Auto-index git repository
mgrep watch
In benchmarks comparing mgrep with Claude Code integration, mgrep-based workflows reportedly consume roughly half as many tokens as grep-based workflows [11].
grepai is a fully local semantic code search tool that uses vector embeddings. It offers features like natural language queries, conceptual similarity search, and call graph tracing. It provides AI agent integration through its built-in MCP server [12].
ast-grep vs ripgrep: When Each One Wins
This is the comparison people search for, but it frames the two as rivals when they answer different questions. ripgrep matches text; ast-grep matches structure. Reach for ripgrep when you already know roughly what the bytes look like (a string, a regex, a log line) and you want every occurrence across the tree in milliseconds. Reach for ast-grep the moment your regex starts catching matches inside comments, strings, or the wrong syntactic position, because at that point you are fighting ripgrep’s lack of grammar awareness instead of using a tool that has it.
| Question you are asking | Tool |
|---|---|
| Where does this string or regex appear? | ripgrep |
| Every call shaped like foo($X) in real code? | ast-grep |
| Rename a structural pattern across the codebase? | ast-grep |
| Fastest raw scan with no language parsing? | ripgrep |
In practice I keep ripgrep as the default and only switch to ast-grep when a search needs to understand the code, not just read it. They sit on different layers of the stack described above, so the honest answer to “which one” is almost always “both, for different jobs.”
Practical Guide: Which Tool for Which Scenario?
| Scenario | Recommended Tool | Why |
|---|---|---|
| Minimal server, Docker image | grep | No additional installation required |
| Simple pipeline filtering | grep | ps aux | grep nginx |
| Daily development search | rg | Speed, .gitignore support |
| Large codebase | rg | Parallel search, smart filtering |
| AI agent command | rg | Widely bundled or supported by agents |
| Code structure search | ast-grep | AST-based structural queries |
| Refactoring | ast-grep | Structural find-and-replace |
| Conceptual search | mgrep / grepai | Natural language queries |
| Compressed file search | ugrep | zip, gz, PDF support |
| Boolean combinations | ack | --and, --or, --not |
Installation
# ripgrep
brew install ripgrep # macOS
apt install ripgrep # Debian/Ubuntu
choco install ripgrep # Windows
# ast-grep
npm install -g @ast-grep/cli
brew install ast-grep
# mgrep
pip install mgrep
# ugrep
brew install ugrep
Conclusion
grep maintains its value as one of the cornerstones of Unix philosophy. However, in modern development workflows, ripgrep’s speed and smart defaults make it a better choice in nearly every scenario. With the rise of AI coding agents, the efficient use of text search tools has become a critical skill for both humans and AI agents alike.
The text search ecosystem is evolving into a three-layer structure: exact matching (ripgrep), structural search (ast-grep), and semantic search (mgrep). Each of these layers addresses a different need and complements one another.
While researching this article, I took a closer look at the problems AI agents face during search operations: false negatives in built-in grep tools, noisy results that burn through the context window, hallucinated file paths. Each of these issues is frustrating on its own, but combined they create a domino effect that directly impacts the quality of code an agent produces. In the next article, I cover these problems in detail and share a local semantic code search MCP server I built as a solution.
Footnotes
1. ripgrep GitHub Repository
2. ripgrep is faster than grep, ag, git grep, ucg, pt, sift
3. ack: Beyond grep
4. ugrep: Ultra fast grep
5. Grep and Glob removed in Claude Code 2.1.117 (ugrep + bfs via Bash). GitHub Issue
6. GitHub Copilot CLI Changelog, November 2025
7. OpenAI Codex - Semantic Search Proposal
8. Aider grep-ast
9. ast-grep: Structural Code Search
10. mgrep: Semantic grep by Mixedbread AI
11. Boosting Claude: Faster Code Analysis with mgrep
12. grepai: Local Semantic Code Search
- 01 ripgrep runs up to 11x faster than grep in the benchmark above and automatically reads .gitignore files
- 02 The AI coding agent search layer is shifting fast: Copilot CLI and Codex use ripgrep, while Claude Code moved to ugrep and bfs with 2.1.117
- 03 Modern text search is evolving into a three-layer structure: exact matching (ripgrep), structural search (ast-grep), semantic search (mgrep)
- 04 The biggest challenge for AI agents is noisy results consuming the context window
- 05 grep remains indispensable in minimal environments and simple pipeline operations
Why is ripgrep faster than grep?
ripgrep is written in Rust and utilizes multi-core parallel search, SIMD optimizations, and a finite automaton-based regex engine. It also automatically reads .gitignore files to skip unnecessary directories. In benchmarks conducted on the Linux kernel source code, ripgrep delivers results up to 11x faster than GNU grep.
Which text search tool do AI coding agents use?
GitHub Copilot CLI and OpenAI Codex use ripgrep as their internal search engine, while Claude Code removed its ripgrep-based Grep and Glob tools in April 2026 (version 2.1.117) on macOS and Linux native builds and switched to embedded ugrep and bfs. Search now runs through Bash, without a separate tool round-trip.
What is ast-grep and how does it differ from grep?
ast-grep performs structural code search on the Abstract Syntax Tree instead of text-based search. Because it understands code structure, it can run queries like 'find all async functions without error handling blocks'. It uses the tree-sitter parser.
Can semantic search tools replace grep?
No, they are complementary tools. grep and ripgrep are fast and reliable for exact text matching. Semantic search tools (mgrep, grepai) are used for natural language queries and conceptual similarity search. Both layers have a place in modern development workflows.
Is ast-grep a replacement for ripgrep?
No, they operate on different layers. ripgrep does fast text and regex matching over raw bytes, while ast-grep matches and rewrites code structurally on the syntax tree. Use ripgrep as your default search and switch to ast-grep when a query needs to understand code structure, for example finding every call shaped a certain way or running a structural refactor.