MinuteMate improves how municipalities communicate with their citizens by simplifying the creation of meeting minutes. Upload your meeting audio and documents, then present a queryable interface to the public to get the most out of government transparency.
- MinuteMate App - The public-facing chat application (in development). It requires integration with the vector database, an embedding model (which must match one of the embedding models used for preprocessing), and at least one generative model. Other possible integrations include a RAG reranking model.
- Preprocessing Pipeline - A set of tools to transform raw audio and text files into vector-indexed chunks. At minimum, it requires integration with an audio transcription model (currently AssemblyAI), an embedding model, and the vector database (currently Weaviate), which serves as the repository. Other possible integrations include a generative model to assist with data cleaning.
- Llama on Modal - Deploys Llama models to be served by Modal. This provides both embedding and generative models for use by the other major components.
- Dev Notebooks - Various notebooks for developing or testing components of the preprocessing pipeline and application.
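
To illustrate why the app's embedding model must match one used during preprocessing, here is a minimal sketch of the chunking step. It is not the project's actual pipeline code: `Chunk`, `chunk_text`, `check_query_model`, the word-window sizes, and the model names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    source: str           # e.g. the transcript or document file name
    embedding_model: str  # name of the model used to embed this chunk


def chunk_text(text, source, embedding_model, size=200, overlap=50):
    """Split text into overlapping word windows (sizes are assumptions)."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        window = words[start:start + size]
        chunks.append(Chunk(" ".join(window), source, embedding_model))
        if start + size >= len(words):
            break  # the last window already reaches the end of the text
    return chunks


def check_query_model(query_model, chunks):
    """Raise if the app's embedding model was not used for preprocessing."""
    indexed = {c.embedding_model for c in chunks}
    if query_model not in indexed:
        raise ValueError(
            f"{query_model!r} is not among the indexed models {sorted(indexed)}"
        )
```
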
- Clone this repository
- Curate a corpus of information you want to present
- Deploy embedding or generative models (optional)
- Set up a RAG database (usually a vector database)
- Preprocess your corpus to populate the RAG database
- Deploy the backend to handle prompt and response logic
- Deploy the frontend to present a prompt/response interface to users
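
The backend's prompt and response logic in the steps above can be sketched roughly as follows. This is an assumption-laden illustration, not the project's implementation: `toy_embed` is a bag-of-words stand-in for a real embedding model, an in-memory list stands in for the vector database, and `retrieve` and `build_prompt` are hypothetical helper names.

```python
import math
from collections import Counter


def toy_embed(text):
    """Stand-in for a real embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question."""
    q = toy_embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, toy_embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question, context_chunks):
    """Assemble the text that would be sent to the generative model."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return (
        "Answer using only the meeting excerpts below.\n"
        f"Excerpts:\n{context}\n"
        f"Question: {question}\n"
    )
```

In a real deployment, the similarity search would be delegated to the vector database and the returned prompt passed to the generative model; both calls are omitted here because they depend on the chosen services.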
Contribution guidelines - Guidelines and instructions for contributing to the project