A powerful tool to scrape all files from a GitHub repository and convert them into JSON or plain text format.
Install the package globally using npm:
npm install -g git-repo-parser
Or add it to your project as a dependency:
npm install git-repo-parser
This package provides two CLI commands:
- git-repo-to-json: Scrapes a GitHub repository and saves the result as a JSON file.
- git-repo-to-text: Scrapes a GitHub repository and saves the result as a plain text file.
git-repo-to-json https://github.com/username/repo-name.git
git-repo-to-text https://github.com/username/repo-name.git
The scraped data will be saved as files.json or files.txt in your current directory.
You can also use the package in your Node.js projects:
import { scrapeRepositoryToJson, scrapeRepositoryToPlainText } from 'git-repo-parser';
// To get JSON output
const jsonResult = await scrapeRepositoryToJson('https://github.com/username/repo-name.git');
// To get plain text output
const textResult = await scrapeRepositoryToPlainText('https://github.com/username/repo-name.git');
scrapeRepositoryToJson: Scrapes the given GitHub repository and returns a promise that resolves to an array of FileData objects.
scrapeRepositoryToPlainText: Scrapes the given GitHub repository and returns a promise that resolves to a string containing the repository contents in a structured plain text format.
The FileData interface represents the structure of files and directories in the JSON output:
interface FileData {
  name: string;
  path: string;
  type: 'file' | 'directory';
  children?: FileData[];
  content?: string;
}
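As an illustration of how the FileData tree might be consumed, the sketch below walks the array recursively and collects the path of every file. The helper name collectFilePaths and the sample data are hypothetical, and the assumption that path holds a forward-slash relative path is not confirmed by the package.

```typescript
// Copied from the interface shown above so the example is self-contained.
interface FileData {
  name: string;
  path: string;
  type: 'file' | 'directory';
  children?: FileData[];
  content?: string;
}

// Hypothetical helper: recursively collect the path of every 'file' entry.
function collectFilePaths(nodes: FileData[]): string[] {
  const paths: string[] = [];
  for (const node of nodes) {
    if (node.type === 'file') {
      paths.push(node.path);
    } else if (node.children) {
      // Directories contribute only the files found beneath them.
      paths.push(...collectFilePaths(node.children));
    }
  }
  return paths;
}

// Made-up sample data; the path format is an assumption.
const sample: FileData[] = [
  { name: 'README.md', path: 'README.md', type: 'file', content: '# Demo' },
  {
    name: 'src',
    path: 'src',
    type: 'directory',
    children: [
      { name: 'index.ts', path: 'src/index.ts', type: 'file', content: 'export {};' },
    ],
  },
];

console.log(collectFilePaths(sample)); // [ 'README.md', 'src/index.ts' ]
```

The same traversal could be applied to the result of scrapeRepositoryToJson, since it resolves to an array of FileData objects.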
- Clones the repository locally (temporary)
- Ignores binary files and common non-source files
- Supports nested directory structures
- Provides both JSON and plain text output formats
- Cleans up cloned repository after scraping
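The temporary clone and cleanup steps listed above can be sketched with a try/finally pattern. This is a minimal illustration using Node.js built-ins, not the package's actual implementation; withTempDir and cloneShallow are hypothetical names, and the shallow clone via the git CLI is an assumption about how cloning could be done.

```typescript
import { execFileSync } from "node:child_process";
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper: run `work` against a fresh temporary directory and
// always delete that directory afterwards, even if `work` throws.
function withTempDir<T>(prefix: string, work: (dir: string) => T): T {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), prefix));
  try {
    return work(dir);
  } finally {
    fs.rmSync(dir, { recursive: true, force: true });
  }
}

// Hypothetical helper: shallow-clone a repository into `dir` with the git CLI.
// Defined for illustration only; it is not called in this example.
function cloneShallow(repoUrl: string, dir: string): void {
  execFileSync("git", ["clone", "--depth", "1", repoUrl, dir], { stdio: "ignore" });
}

// Usage without network access: the directory exists only during the callback.
let seenDir = "";
const existedInside = withTempDir("repo-scrape-", (dir) => {
  seenDir = dir;
  return fs.existsSync(dir);
});
console.log(existedInside, fs.existsSync(seenDir)); // true false
```

To work on a real repository, the callback passed to withTempDir could call cloneShallow(repoUrl, dir) first and then read the files under dir before the directory is removed.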
The following file types and patterns are ignored during scraping:
- package-lock.json
- Binary files (pdf, png, jpg, jpeg, gif, ico, svg, woff, woff2, eot, ttf, otf)
- Media files (mp4, avi, webm, mov, mp3, wav, flac, ogg, webp)
- Debug and error logs (npm-debug, yarn-debug, yarn-error)
- Configuration files (tsconfig, jest.config)
- The .git directory
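The ignore rules above could be expressed as a single predicate over a relative path. The sketch below is one possible reading of the list, not the package's actual matching logic: isIgnored is a hypothetical name, and matching log and config files by file-name prefix is an assumption.

```typescript
// Extensions taken from the binary and media lists above.
const IGNORED_EXTENSIONS = new Set([
  'pdf', 'png', 'jpg', 'jpeg', 'gif', 'ico', 'svg', 'woff', 'woff2', 'eot', 'ttf', 'otf',
  'mp4', 'avi', 'webm', 'mov', 'mp3', 'wav', 'flac', 'ogg', 'webp',
]);

// Assumption: logs and config files are matched by file-name prefix.
const IGNORED_NAME_PREFIXES = ['npm-debug', 'yarn-debug', 'yarn-error', 'tsconfig', 'jest.config'];

// Hypothetical predicate: true if a forward-slash relative path should be skipped.
function isIgnored(relativePath: string): boolean {
  const segments = relativePath.split('/');
  // Anything inside a .git directory is skipped.
  if (segments.indexOf('.git') !== -1) return true;

  const fileName = segments[segments.length - 1];
  if (fileName === 'package-lock.json') return true;
  if (IGNORED_NAME_PREFIXES.some((prefix) => fileName.startsWith(prefix))) return true;

  // Compare the text after the last dot, case-insensitively.
  const dot = fileName.lastIndexOf('.');
  const extension = dot >= 0 ? fileName.slice(dot + 1).toLowerCase() : '';
  return IGNORED_EXTENSIONS.has(extension);
}

console.log(isIgnored('assets/logo.png')); // true
console.log(isIgnored('src/index.ts'));    // false
console.log(isIgnored('.git/config'));     // true
```

Keeping the extensions in a Set makes each lookup constant time, and the prefix list stays short enough that a linear scan is fine.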
This project is licensed under the MIT License.
Author: arnab2001
Contributions, issues, and feature requests are welcome. Check the issues page if you want to contribute, and see the Contribution Guide.
Open Source Community Conduct
We are committed to fostering a welcoming and inclusive open-source community. We expect all contributors to adhere to our Code of Conduct to create a respectful and collaborative environment.
Give a ⭐️ if this project helped you!