# Codebase Summarizer

A command-line tool that extracts a concise summary of a codebase, including symbols and their documentation comments, generating a `codebase_summary.md` file.

## Why?

LLMs charge per-token. When you feed them an entire codebase, you're paying for lines that don't matter. This tool reduces context size by extracting only the essential structure: symbols, their types, and their documentation.

The result is a lean summary that lets you understand any codebase in seconds, without the cost of feeding it everything.

## Features

- **Multi-language support**: Go, Rust, Python, Dart, R, TypeScript/JavaScript, Java, C, C++, C#, Ruby, PHP, Swift, Kotlin, and more
- **Symbol extraction**: Functions, structs, enums, classes, methods, traits, interfaces, type aliases
- **Doc comment preservation**: Captures documentation comments associated with each symbol
- **Markdown output**: Clean `codebase_summary.md` documenting the entire codebase
- **AI agent optimized**: Generates `AGENTS.md` with navigation instructions for AI agents

## Installation

### Build from source

```bash
git clone gogs.dmsc.dev/dmsc/codebase_summarizer.git
cd codebase_summarizer
cargo build --release
sudo cp target/release/codebase_summarizer /usr/local/bin/
```

## Usage

```bash
# Scan current directory
codebase_summarizer

# Scan a specific directory
codebase_summarizer --directory /path/to/codebase

# Output to custom location
codebase_summarizer --output /tmp/summary.md

# Skip AGENTS.md generation
codebase_summarizer --no-agents

# Enable verbose output
codebase_summarizer --verbose

# Include private symbols
codebase_summarizer --include-all
```

## How it works

1. **Scan**: Recursively walks the directory, filtering out `target/`, `node_modules/`, `.git/`, and other non-source directories
2. **Parse**: Extracts symbols from each code file using language-specific parsers
3. **Summarize**: Generates a `codebase_summary.md` with file tree and symbol documentation
4. **Optional**: Creates `AGENTS.md` with navigation protocol for AI agents
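
To get a feel for what the scan step filters out, the walk can be approximated with standard shell tools. This is a rough sketch, not the tool's actual implementation (the tool itself is built with `cargo`). The function name `scan_sources` is hypothetical, and it prunes only the three directories named above; the "other non-source directories" are not listed here, so they are left out.

```bash
# Hypothetical helper: approximate the scan step with find(1).
# Prunes only the directories named in this README; the tool's
# real exclusion list may be longer.
scan_sources() {
  local root="${1:-.}"
  # -prune stops find from descending into the matched directories;
  # every remaining regular file is printed, then sorted for stable output.
  find "$root" \
    -type d \( -name target -o -name node_modules -o -name .git \) -prune \
    -o -type f -print | LC_ALL=C sort
}
```

Calling `scan_sources /path/to/codebase` prints one file path per line, which gives a quick estimate of how many files a full run would have to parse.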