A privacy-preserving search system based on MIT's Tiptoe paper. This implementation lets you search through data while keeping your queries private: the server never learns what you're searching for. For more details, see our blog.
- Privacy-preserving search using private information retrieval (PIR)
- Local embedding generation
- Clustering-based optimization for faster searches
- FastAPI-based REST API
- Interactive API documentation at /docs
- Documents are converted into embeddings and clustered for efficient searching
- The client downloads cluster centroids (~32 kB for a 1 GB database)
- The client locally compares query vectors to centroids to find relevant clusters
- Using SimplePIR, the client privately retrieves matching documents
- All queries remain private - the server never sees what you're searching for
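The client-side cluster selection in the steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names, the choice of cosine similarity as the metric, and the tiny 3-dimensional centroids are all assumptions made for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_clusters(query_vec, centroids, top_k=1):
    """Return the indices of the top_k centroids most similar to the query.

    This runs entirely on the client, so the query vector is never sent
    to the server.
    """
    ranked = sorted(
        range(len(centroids)),
        key=lambda i: cosine_similarity(query_vec, centroids[i]),
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical centroids; real sentence embeddings have hundreds of dimensions.
centroids = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
query_vec = [0.1, 0.9, 0.2]

best = nearest_clusters(query_vec, centroids, top_k=2)
print(best)
```

Only the resulting cluster index feeds into the PIR step, and that index is itself hidden from the server by the PIR query.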
Architecture overview of private search with homomorphic encryption. The query is encrypted before being sent to the server, which processes it without being able to see the contents. The encrypted results are sent back to the client for decryption.
pip install -r requirements.txt
- Start the server:
python server.py
- Run the client:
python client.py
- Enter natural language queries to search through the documents privately.
Visit http://127.0.0.1:8000/docs for interactive API documentation.
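FastAPI also serves a machine-readable OpenAPI schema, by default at `/openapi.json`, which is what the `/docs` page renders. Assuming this server keeps that default, a small script can list the available routes. The helper names and the sample schema below are hypothetical and only illustrate the shape of the data.

```python
import json
from urllib.request import urlopen

HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options"}


def fetch_openapi(base_url="http://127.0.0.1:8000"):
    """Download the schema from FastAPI's default /openapi.json path.

    Assumes the server is running and has not changed the default URL.
    """
    with urlopen(f"{base_url}/openapi.json", timeout=5) as response:
        return json.load(response)


def list_routes(spec):
    """Return sorted (METHOD, path) pairs from an OpenAPI schema dict."""
    routes = []
    for path, operations in spec.get("paths", {}).items():
        for method in operations:
            # Skip non-operation keys such as "parameters" or "summary".
            if method in HTTP_METHODS:
                routes.append((method.upper(), path))
    return sorted(routes)


# Hypothetical schema fragment; the real route names come from the server.
sample_spec = {
    "paths": {
        "/hypothetical-a": {"get": {}},
        "/hypothetical-b": {"post": {}, "parameters": []},
    }
}
print(list_routes(sample_spec))
```

With the server running, `list_routes(fetch_openapi())` would return the real endpoints instead of the sample ones.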
- Build the Docker image:
docker build -t private-search .
- Run the container:
docker run -p 8000:8000 private-search
The system uses a combination of:
- Sentence transformers for embedding generation
- K-means clustering for search optimization
- SimplePIR for private information retrieval
- FastAPI for the REST API interface
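To make the clustering step concrete, here is a small standard-library sketch of K-means (Lloyd's algorithm) over embedding vectors. It is an illustration under stated assumptions rather than the project's code: the 2-dimensional points stand in for real sentence-transformer embeddings, and a real pipeline would more likely use an optimized library implementation.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def kmeans(vectors, k, iterations=20, seed=0):
    """Lloyd's algorithm. Returns (centroids, assignments)."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct data points.
    centroids = [list(v) for v in rng.sample(vectors, k)]
    assignments = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: attach each vector to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(vectors, assignments) if a == c]
            if members:
                centroids[c] = mean_vector(members)
    return centroids, assignments


# Hypothetical "embeddings": two well-separated groups of points.
embeddings = [
    [0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
    [10.0, 10.0], [10.0, 11.0], [11.0, 10.0],
]
centroids, assignments = kmeans(embeddings, k=2)
print(assignments)
```

The centroids are what the client downloads. The assignments determine which cluster each document is stored under on the server.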
- Queries are never revealed to the server
- Document retrieval patterns remain private
- All sensitive computations happen client-side
- Server only sees encrypted PIR queries
The clustering-based approach provides significant performance improvements:
- Reduces the number of PIR operations needed
- Allows for efficient searching in large document collections
- Maintains privacy while providing fast results
Contributions are welcome! Please feel free to submit a Pull Request.
This project is open source and available under the MIT License.