This demonstration sends data to an Apache Kafka topic. The Iceberg connector for Kafka Connect picks the data up and writes it to an Iceberg table.
- Apache Iceberg
- Apache Spark / PySpark
- MinIO
- Apache Kafka
- Kafka Connect
- Python
- Launch server applications
docker compose up -d
- Create Kafka topic
docker exec -t broker kafka-topics --create --topic completed-pizzas --partitions 6 --bootstrap-server broker:9092
- Launch the Iceberg connector (installed via docker compose)
curl -X PUT http://localhost:8083/connectors/pizzas-on-ice/config -i -H "Content-Type: application/json" -d @pizzas_on_ice.json
- Check status of connector
curl http://localhost:8083/connectors/pizzas-on-ice/status | jq
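The same check can be scripted instead of read by eye. The sketch below assumes the Connect REST API is reachable at `localhost:8083` (as in the curl commands) and that the status document has the usual `connector.state` and `tasks[].state` fields. The helper names `fetch_status` and `is_healthy` are made up for illustration and are not part of this repository.

```python
import json
import urllib.request

# Assumption: same host and port as the curl commands in this walkthrough.
CONNECT_URL = "http://localhost:8083"


def fetch_status(name, base_url=CONNECT_URL):
    """GET /connectors/<name>/status and return the parsed JSON document."""
    url = f"{base_url}/connectors/{name}/status"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def is_healthy(status):
    """True only when the connector and at least one task exist and all report RUNNING."""
    connector_ok = status.get("connector", {}).get("state") == "RUNNING"
    tasks = status.get("tasks", [])
    return connector_ok and bool(tasks) and all(t.get("state") == "RUNNING" for t in tasks)
```

With the stack running, `is_healthy(fetch_status("pizzas-on-ice"))` should return `True` once the connector has started its tasks. A `False` result is a cue to look at the full status output for a failed task.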
- Install the confluent-kafka package
pip install confluent-kafka
- In a separate terminal window, run the pizza loader script
python pizza_loader.py
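The contents of `pizza_loader.py` are not reproduced here. As a rough idea of what such a loader does, the sketch below produces JSON events to the `completed-pizzas` topic with the confluent-kafka `Producer`. Everything specific in it is an assumption: the bootstrap address `localhost:9092` presumes the broker port is published to the host, and every event field except `store_id` (which the PySpark query in this walkthrough groups by) is hypothetical.

```python
import json
import random
import uuid
from datetime import datetime, timezone

TOPIC = "completed-pizzas"    # topic created earlier in this walkthrough
BOOTSTRAP = "localhost:9092"  # assumption: broker port is published to the host


def make_pizza(rng=random):
    """Build one illustrative event; only store_id is known to exist in the real table."""
    return {
        "order_id": str(uuid.uuid4()),                       # hypothetical field
        "store_id": rng.randint(1, 10),
        "sauce": rng.choice(["tomato", "pesto", "bbq"]),     # hypothetical field
        "completed_at": datetime.now(timezone.utc).isoformat(),  # hypothetical field
    }


def run(count=100):
    """Produce `count` events; needs a reachable broker and confluent-kafka installed."""
    # Imported here so make_pizza() can be used without the package installed.
    from confluent_kafka import Producer

    producer = Producer({"bootstrap.servers": BOOTSTRAP})
    for _ in range(count):
        pizza = make_pizza()
        # Key by store so one store's events land in the same partition.
        producer.produce(TOPIC, key=str(pizza["store_id"]), value=json.dumps(pizza))
        producer.poll(0)  # serve any pending delivery callbacks
    producer.flush()      # block until all queued messages are delivered
```

Calling `run()` sends 100 events and returns once they are flushed. The actual script may use a different schema or serialization format, so treat this only as an outline of the produce loop.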
- Launch PySpark
docker exec -it spark-iceberg pyspark
- Explore Iceberg table
df = spark.table("demo.rpc.pizzas")
df.count()
df.show(5)
df.groupBy("store_id").count().sort("count", ascending=False).show(5)
- Clean up resources when you're done.
docker compose down