A Trie/Graph hybrid memory structure used by the Hyphe crawler to index pages & webentities.

medialab/hyphe-traph

hyphe-traph

The Traph is an on-file index structure designed to store Hyphe's network of pages & webentities.

Under the hood, the Traph is the combination of a ternary search tree of URL stems and linked lists of pages' hyperlinks (hence the portmanteau name).
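To make the combination concrete, here is a minimal in-memory sketch of the idea: a ternary search tree whose nodes each hold one URL stem, where page nodes carry a linked list of outgoing hyperlinks. It is not the Traph's actual implementation (which is stored on file) nor its API. The names `ToyTraph`, `Node`, `Link`, `add_link` and `links_from` are hypothetical, and the stem strings in the usage note are only illustrative. The code is written to run on both Python 2.7 and Python 3.

```python
class Node(object):
    """One ternary search tree node holding a single URL stem."""

    def __init__(self, stem):
        self.stem = stem
        self.left = None      # sibling stems sorting before this one
        self.right = None     # sibling stems sorting after this one
        self.child = None     # subtree for the next stem of the URL
        self.is_page = False  # True if a page ends at this node
        self.path = None      # full tuple of stems, set for pages only
        self.outlinks = None  # head of a singly linked list of Link cells


class Link(object):
    """One cell of a page's linked list of outgoing hyperlinks."""

    def __init__(self, target, next_link=None):
        self.target = target   # Node of the linked page
        self.next = next_link  # next cell in the list, or None


class ToyTraph(object):
    """Hypothetical in-memory illustration, not the on-file structure."""

    def __init__(self):
        self.root = None

    def _walk(self, stems, create):
        """Follow stems through the tree; return the last node or None."""
        node, parent, side, last = self.root, None, None, None
        for stem in stems:
            # Binary search among the sibling stems of the current level.
            while node is not None and node.stem != stem:
                parent = node
                side = 'left' if stem < node.stem else 'right'
                node = getattr(node, side)
            if node is None:
                if not create:
                    return None
                node = Node(stem)
                if parent is None:
                    self.root = node
                else:
                    setattr(parent, side, node)
            last = node
            # Descend to the level holding the next stem.
            parent, side, node = node, 'child', node.child
        return last

    def add_page(self, stems):
        if not stems:
            raise ValueError('a page needs at least one stem')
        node = self._walk(stems, create=True)
        node.is_page = True
        node.path = tuple(stems)
        return node

    def has_page(self, stems):
        node = self._walk(stems, create=False)
        return node is not None and node.is_page

    def add_link(self, source, target):
        """Record a hyperlink by prepending a cell to the source's list."""
        src = self.add_page(source)
        tgt = self.add_page(target)
        src.outlinks = Link(tgt, src.outlinks)

    def links_from(self, stems):
        """Return the stem tuples of pages linked from the given page."""
        node = self._walk(stems, create=False)
        targets = []
        link = node.outlinks if node is not None else None
        while link is not None:
            targets.append(link.target.path)
            link = link.next
        return targets
```

For example, after `traph = ToyTraph()` and `traph.add_link(['s:http', 'h:com', 'h:example', 'p:a'], ['s:http', 'h:com', 'h:example', 'p:b'])`, both pages share the three nodes of their common prefix, and `traph.links_from(...)` on the first page walks its linked list to find the second. Because new cells are prepended, the most recently added link comes first.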

Development

hyphe-traph was written for Python 2.7.

# Install dev dependencies (preferably in a virtual env)
pip install -r requirements.txt

# Run the tests
make test

# Run the linter
make lint
