-
Notifications
You must be signed in to change notification settings - Fork 1
Home
Balchivist is a Python library for archiving datasets to the Internet Archive. It automates the process of gathering a list of data available for a type of dataset and uploading them to the Internet Archive, using an SQL database as a backend for storing the list.
Balchivist alone only provides the infrastructure for accessing the backend SQL database and the tools for interacting with the Internet Archive, along with other supporting modules such as a processing daemon, etc. The actual data processing is implemented via individual modules (available under the modules
directory), making it easy to extend the library for working on other datasets.
Clone this repository using Git and run the following command in the Balchivist root directory:
pip install -r requirements.txt
Likewise, install the MySQL server and the python-mysqldb library so as to provide the SQL database functionality. Update settings.conf
accordingly, using settings.conf.example
as a template.
Encountered any issues? Feel free to report them on the issues tracker.