git-to-text is a Go-based tool inspired by the Python project gpt-repository-loader. It converts the contents of a Git repository into a single text file—ideal for loading into an LLM for repository analysis or chat-based interactions with your codebase.
This project is a Go port of the original gpt-repository-loader by mpoon. We appreciate their work and encourage you to check out the original Python implementation.
- Converts an entire Git repository into a single text file with clear file boundaries.
- Uses a detailed default ignore list to automatically skip build artifacts, caches, and dependency folders from nearly every ecosystem.
- Supports custom ignore patterns via a .gptignore file placed in the repository root.
- Offers a --unignore flag so you can override default ignores and include specific directories if needed.
- Accepts a local repository path or a GitHub URL; if a URL is provided, the tool clones the repository (using a shallow clone) into a temporary directory and cleans it up afterward.
- Supports custom preamble files for contextual output.
- Ensures deterministic file ordering and skips binary files using a simple heuristic.
- Go 1.16 or higher
- Clone the repository:
git clone https://github.com/adammpkins/git-to-text.git
cd git-to-text
- Install dependencies:
go get github.com/bmatcuk/doublestar/v4
- Build the project:
go build
This will create an executable named git-to-text (or git-to-text.exe on Windows) in your project directory.
Run the program with the following syntax:
./git-to-text <repository_path_or_github_url> [-p /path/to/preamble.txt] [-o /path/to/output_file.txt] [--unignore dir1,dir2,...]
- <repository_path_or_github_url>: Either the path to a local Git repository or a GitHub URL.
- -p /path/to/preamble.txt: Path to a custom preamble file (optional). If not provided, a default preamble is used.
- -o /path/to/output_file.txt: Path for the output file (optional, defaults to output.txt).
- --unignore dir1,dir2,...: Comma-separated list of default-ignored directories to include in the output (optional).
- Local Repository:
./git-to-text /home/user/projects/my-repo -p /home/user/preamble.txt -o /home/user/my-repo-output.txt
- GitHub URL:
./git-to-text https://github.com/adammpkins/my-repo --unignore node_modules,vendor
The tool will clone the repository into a temporary directory, process it, and then clean up the clone.
By default, git-to-text automatically skips certain directories and files that are typically irrelevant to code analysis (e.g., build artifacts, caches, dependencies). Below is the exhaustive list:
- .git
- .idea
- .vscode
- .vs
- node_modules
- vendor
- bower_components
- dist
- build
- coverage
- tmp
- cache
- .sass-cache
- .next
- target
- .bundle
- log
- bin
- pkg
- zig-out
- .gradle
- out
- _build
- deps
- __pycache__
- .venv
- env
- obj
- .dart_tool
- DerivedData
- CMakeFiles
- cmake-build-debug
- cmake-build-release
- Pods
- Library
- Temp
- Logs
- Binaries
- Intermediate
- Saved
- xcuserdata
- Rproj.user
- bazel-out
- bazel-bin
- bazel-testlogs
- bazel-genfiles
- nimcache
- TestResults
- elm-stuff
- export
- .eggs
- blib
- ebin
Note: If any of these directories are important for your use case, you can include them via the --unignore flag (see above).
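To illustrate how a comma-separated --unignore value could interact with the default list, here is a minimal sketch. It is not the tool's actual source: the function name effectiveIgnores and the shortened defaultIgnores slice are hypothetical, chosen only for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// defaultIgnores is a shortened, illustrative subset of the default ignore
// list above (hypothetical variable name, not taken from the tool's source).
var defaultIgnores = []string{".git", "node_modules", "vendor", "dist", "build"}

// effectiveIgnores returns the default ignores minus any names listed in the
// comma-separated unignore value, for example "node_modules,vendor".
func effectiveIgnores(defaults []string, unignore string) []string {
	keep := make(map[string]bool)
	for _, name := range strings.Split(unignore, ",") {
		name = strings.TrimSpace(name)
		if name != "" {
			keep[name] = true
		}
	}

	var result []string
	for _, dir := range defaults {
		if !keep[dir] {
			result = append(result, dir)
		}
	}
	return result
}

func main() {
	// prints: [.git dist build]
	fmt.Println(effectiveIgnores(defaultIgnores, "node_modules,vendor"))
}
```

An empty --unignore value leaves the default list unchanged in this sketch, since blank entries are skipped.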
Place a .gptignore file in the root of your Git repository to specify files or patterns to ignore. The syntax is similar to .gitignore. Note that if a pattern ends with a slash (e.g., logs/), the tool will automatically append ** so that all files within that directory are excluded.
Example .gptignore:
bootstrap/
storage/
.env
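The trailing-slash rule described above can be sketched as follows. This is an assumption-laden illustration rather than the tool's real parser: parseIgnoreFile and normalizePattern are hypothetical names, and the sketch only trims whitespace and skips blank lines. The resulting patterns are meant to be handed to a glob matcher that understands **, such as the doublestar package.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizePattern appends "**" to patterns ending in a slash, so that
// "logs/" becomes "logs/**" and covers every file inside that directory.
func normalizePattern(p string) string {
	if strings.HasSuffix(p, "/") {
		return p + "**"
	}
	return p
}

// parseIgnoreFile turns the text of a .gptignore file into a list of
// normalized patterns. Blank lines are skipped (an assumption of this sketch).
func parseIgnoreFile(content string) []string {
	var patterns []string
	for _, line := range strings.Split(content, "\n") {
		line = strings.TrimSpace(line)
		if line == "" {
			continue
		}
		patterns = append(patterns, normalizePattern(line))
	}
	return patterns
}

func main() {
	// prints: [bootstrap/** storage/** .env]
	fmt.Println(parseIgnoreFile("bootstrap/\nstorage/\n.env\n"))
}
```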
By default, the tool uses a standard preamble explaining the output file's structure. You can override this by providing your own preamble file using the -p option.
git-to-text uses a simple heuristic to detect binary files: it scans each file for NUL bytes (0x00). If a NUL byte is found, the file is considered binary and is skipped. This keeps non-text content, such as images or compiled artifacts, out of the output. Minified code normally contains no NUL bytes, so this check alone does not filter it; use the ignore list or a .gptignore pattern for that.
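A NUL-byte check of this kind might look like the sketch below. The function name isBinary is hypothetical, and whether the tool inspects the whole file or only a leading chunk is not stated here; the sketch checks whatever byte slice it is given.

```go
package main

import (
	"bytes"
	"fmt"
)

// isBinary reports whether data contains at least one NUL byte (0x00),
// which is the heuristic described above.
func isBinary(data []byte) bool {
	return bytes.IndexByte(data, 0x00) != -1
}

func main() {
	fmt.Println(isBinary([]byte("package main\n")))           // prints: false
	fmt.Println(isBinary([]byte{0x89, 'P', 'N', 'G', 0x00})) // prints: true
}
```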
When you provide a GitHub URL (or any HTTP/HTTPS Git repository URL) instead of a local path, git-to-text performs a shallow clone using git clone --depth 1 into a temporary directory. This minimizes both download size and processing time. After processing the repository, the temporary clone is automatically cleaned up.
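The clone-and-clean-up flow described above could be sketched roughly as follows. The helper names (isRemoteURL, cloneArgs, shallowClone) and the temporary directory prefix are assumptions made for this example, and the sketch requires git on the PATH.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// isRemoteURL reports whether the argument looks like an HTTP/HTTPS URL
// rather than a local path.
func isRemoteURL(s string) bool {
	return strings.HasPrefix(s, "http://") || strings.HasPrefix(s, "https://")
}

// cloneArgs builds the arguments for: git clone --depth 1 <url> <dir>
func cloneArgs(url, dir string) []string {
	return []string{"clone", "--depth", "1", url, dir}
}

// shallowClone clones url into a fresh temporary directory and returns the
// directory plus a cleanup function that removes it.
func shallowClone(url string) (string, func(), error) {
	dir, err := os.MkdirTemp("", "git-to-text-*") // prefix is an assumption
	if err != nil {
		return "", nil, err
	}
	cleanup := func() { os.RemoveAll(dir) }

	cmd := exec.Command("git", cloneArgs(url, dir)...)
	cmd.Stderr = os.Stderr
	if err := cmd.Run(); err != nil {
		cleanup() // do not leave a partial clone behind
		return "", nil, err
	}
	return dir, cleanup, nil
}

func main() {
	if len(os.Args) < 2 || !isRemoteURL(os.Args[1]) {
		fmt.Println("usage: sketch <http(s) git url>")
		return
	}
	dir, cleanup, err := shallowClone(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, "clone failed:", err)
		os.Exit(1)
	}
	defer cleanup()
	fmt.Println("cloned into", dir)
}
```

Returning the cleanup function alongside the directory lets the caller defer it, so the temporary clone is removed on the normal exit path of the processing step.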
Contributions are welcome! Please submit a Pull Request. We encourage leveraging AI assistance in development while maintaining the spirit of the original project.
This project is licensed under the MIT License – see the LICENSE file for details.
- Thanks to mpoon for the original gpt-repository-loader project.
- Thanks to the creators of the doublestar package for providing powerful file pattern matching capabilities.