haystack-rag-showcase

This project showcases the Haystack 2.0 API, with a focus on its pipelines. It consists of three parts:

Scraping the TUM CIT website with either Scrapy (scripts/TUM_RAG.ipynb) or Beautiful Soup (scripts/TUM_RAG_with_beautiful_soup.ipynb)
HTML text processing, chunking, and writing to different document stores
A retrieval-augmented question answering system that handles English and German input differently
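
The second part (HTML text processing and chunking) can be illustrated with a small, library-free sketch. This is not the code from the notebooks, which presumably use Haystack components for these steps; it only shows the general idea of stripping markup and splitting the text into overlapping word windows. The names html_to_text and chunk_words, and the default chunk_size and overlap values, are hypothetical and chosen for illustration.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text and skips the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    """Return the visible text of an HTML string, joined by single spaces."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def chunk_words(text, chunk_size=200, overlap=20):
    """Split text into chunks of at most chunk_size words.

    Consecutive chunks share `overlap` words. Both defaults are
    illustrative assumptions, not values taken from the notebooks.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks
```

For example, chunk_words(html_to_text(page_html), chunk_size=200, overlap=20) would turn one scraped page into a list of text chunks that can then be embedded and written to a store. Haystack 2.0 ships a DocumentSplitter component for this kind of splitting; whether the notebooks use it with these settings is not stated here.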

Local setup with access to disk (more powerful)

Change into the crawler directory with cd tum_crawler

In a terminal, run scrapy crawl tum (this may require installing Scrapy first)

Go back to the project root.

Run docker-compose up to start the Qdrant databases.

Create a .env file at the project root containing your token: OAI_TOKEN="sk-..."

Read & execute scripts/TUM_RAG.ipynb

Read & execute scripts/TUM_RAG_with_beautiful_soup.ipynb
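
Before running the notebooks, it can help to confirm that the .env file from the setup steps is readable and contains OAI_TOKEN. The sketch below is a standard-library-only illustration and an assumption about how the file might be read; the notebooks may well use a dedicated package such as python-dotenv instead. The function name read_env_file is hypothetical.

```python
from pathlib import Path


def read_env_file(path=".env"):
    """Parse simple KEY=VALUE lines from a .env-style file.

    Blank lines, lines starting with '#', and lines without '=' are
    ignored. A matching pair of surrounding quotes is stripped from
    the value. This is a simplified parser, not a full .env implementation.
    """
    values = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        values[key.strip()] = value
    return values
```

Calling read_env_file(".env").get("OAI_TOKEN") from the project root returns the token string, or None if the key is missing, which makes a misconfigured setup easy to spot before the first API call.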
