git-repo-parser

A tool that scrapes the source files of a GitHub repository and converts them into JSON or plain text. Binary files and other common non-source files are skipped (see Ignored Files).

Installation

Install the package globally using npm:

npm install -g git-repo-parser

Or add it to your project as a dependency:

npm install git-repo-parser

Usage

Command Line Interface (CLI)

This package provides two CLI commands:

git-repo-to-json: Scrapes a GitHub repository and saves the result as a JSON file.
git-repo-to-text: Scrapes a GitHub repository and saves the result as a plain text file.

Example usage:

git-repo-to-json https://github.com/username/repo-name.git
git-repo-to-text https://github.com/username/repo-name.git

The scraped data will be saved as files.json or files.txt in your current directory.

Programmatic Usage

You can also use the package in your Node.js projects:

import { scrapeRepositoryToJson, scrapeRepositoryToPlainText } from 'git-repo-parser';

// To get JSON output
const jsonResult = await scrapeRepositoryToJson('https://github.com/username/repo-name.git');

// To get plain text output
const textResult = await scrapeRepositoryToPlainText('https://github.com/username/repo-name.git');

API

`scrapeRepositoryToJson(repoUrl: string): Promise<FileData[]>`

Scrapes the given GitHub repository and returns a promise that resolves to an array of FileData objects.

`scrapeRepositoryToPlainText(repoUrl: string): Promise<string>`

Scrapes the given GitHub repository and returns a promise that resolves to a string containing the repository contents in a structured plain text format.

FileData Interface

The FileData interface represents the structure of files and directories in the JSON output:

interface FileData {
    name: string;
    path: string;
    type: 'file' | 'directory';
    children?: FileData[];
    content?: string;
}
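
Because `children` nests FileData objects inside each other, consuming the JSON output usually means walking a tree. The sketch below is a minimal example of such a walk. `collectFilePaths` and the `sample` data are hypothetical and not part of git-repo-parser, and it assumes that directories carry `children` while files carry `content`, which the optional fields suggest but the interface does not guarantee.

```typescript
// Mirrors the FileData interface documented above.
interface FileData {
    name: string;
    path: string;
    type: 'file' | 'directory';
    children?: FileData[];
    content?: string;
}

// Hypothetical helper (not part of git-repo-parser): depth-first walk that
// returns the path of every entry whose type is 'file'.
function collectFilePaths(nodes: FileData[]): string[] {
    const paths: string[] = [];
    for (const node of nodes) {
        if (node.type === 'file') {
            paths.push(node.path);
        } else if (node.children) {
            // Assumption: only directories have children.
            paths.push(...collectFilePaths(node.children));
        }
    }
    return paths;
}

// Hand-written sample for illustration; real data would come from scrapeRepositoryToJson.
const sample: FileData[] = [
    { name: 'README.md', path: 'README.md', type: 'file', content: '# demo' },
    {
        name: 'src',
        path: 'src',
        type: 'directory',
        children: [
            { name: 'index.ts', path: 'src/index.ts', type: 'file', content: 'export {};' },
        ],
    },
];

console.log(collectFilePaths(sample));
```

The same recursion can be adapted to sum content lengths or filter by extension; only the branch taken for `'file'` entries changes.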

Features

Clones the repository into a temporary local directory
Ignores binary files and common non-source files
Supports nested directory structures
Provides both JSON and plain text output formats
Cleans up the cloned repository after scraping
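
The temporary clone and the cleanup step in the list above follow a common create, use, remove pattern. The sketch below illustrates that pattern with Node.js built-ins only. `withTempDir` is a hypothetical helper, not the package's actual code, and writing a single file stands in for the real clone step.

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';

// Hypothetical helper: creates a fresh temporary directory, passes it to `work`,
// and removes it afterwards, even if `work` throws.
function withTempDir<T>(prefix: string, work: (dir: string) => T): T {
    const dir = fs.mkdtempSync(path.join(os.tmpdir(), prefix));
    try {
        return work(dir);
    } finally {
        fs.rmSync(dir, { recursive: true, force: true });
    }
}

// Stand-in for the clone step: write one file, then list the directory.
const listing = withTempDir('repo-', (dir) => {
    fs.writeFileSync(path.join(dir, 'README.md'), '# demo');
    return fs.readdirSync(dir);
});

console.log(listing);
```

Putting the removal in a `finally` block means a failed scrape does not leave a stale clone behind.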

Ignored Files

The following file types and patterns are ignored during scraping:

package-lock.json
Binary files (pdf, png, jpg, jpeg, gif, ico, svg, woff, woff2, eot, ttf, otf)
Media files (mp4, avi, webm, mov, mp3, wav, flac, ogg, webp)
Debug and error logs (npm-debug, yarn-debug, yarn-error)
Configuration files (tsconfig, jest.config)
The .git directory
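
To make the list above concrete, here is a rough sketch of how such a filter could be written. `shouldIgnore` and both constants are hypothetical. The extensions and names come from the list, but the matching rules (lower-cased extension lookup, file-name prefix match, any `.git` path segment) are assumptions, and the package's real logic may differ.

```typescript
// Extensions taken from the "Binary files" and "Media files" entries above.
const IGNORED_EXTENSIONS = new Set([
    'pdf', 'png', 'jpg', 'jpeg', 'gif', 'ico', 'svg', 'woff', 'woff2', 'eot', 'ttf', 'otf',
    'mp4', 'avi', 'webm', 'mov', 'mp3', 'wav', 'flac', 'ogg', 'webp',
]);

// Assumed matching rule: the file name starts with one of these prefixes.
const IGNORED_NAME_PREFIXES = [
    'package-lock.json', 'npm-debug', 'yarn-debug', 'yarn-error', 'tsconfig', 'jest.config',
];

// Hypothetical predicate; expects a '/'-separated path relative to the repository root.
function shouldIgnore(filePath: string): boolean {
    const segments = filePath.split('/');
    // Anything inside a .git directory is skipped.
    if (segments.includes('.git')) return true;
    const name = segments[segments.length - 1] ?? '';
    const dot = name.lastIndexOf('.');
    // dot > 0 so that dotfiles such as ".gitignore" have no extension.
    const ext = dot > 0 ? name.slice(dot + 1).toLowerCase() : '';
    return IGNORED_EXTENSIONS.has(ext)
        || IGNORED_NAME_PREFIXES.some((prefix) => name.startsWith(prefix));
}

for (const p of ['src/index.ts', 'assets/logo.png', '.git/config', 'tsconfig.json']) {
    console.log(p, shouldIgnore(p));
}
```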

License

This project is licensed under the MIT License.

Author

arnab2001

Contributing

Contributions, issues, and feature requests are welcome. Check the issues page if you want to contribute, and see the Contribution Guide (CONTRIBUTING.md).

Open Source Community Conduct

We are committed to fostering a welcoming and inclusive open-source community. We expect all contributors to adhere to our Code of Conduct to create a respectful and collaborative environment.

Show your support

Give a ⭐️ if this project helped you!
