email: [email protected]
All notebooks can be found in notebooks/.
Some videos are strictly based on Cypher queries, which can be found in cypher/.
Stay tuned to the Neo4j YouTube channel for new episodes coming soon!
The notebooks in this repository are not meant to be stand-alone and thus are not commented. They go with the videos, so you are encouraged to watch the videos first and then consult the notebooks should you wish to look at the actual code in depth.
✨ ✨ Find this video series as its own webpage on the Neo4j webpage!!! ✨ ✨
Part 1: Connect from Jupyter to a Neo4j Sandbox
Part 2: Using the py2neo Python Driver
Part 3: Using the Neo4j Python Driver
Part 4: Basic Cypher Queries (and with Google Colab)
- This video uses a Google Colab notebook, which can be found here
Part 5: Populating the Database from Pandas
- This video refers to a YouTube video on how to create efficient Cypher queries, which is linked in the references below.
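One common way to keep imports from a DataFrame efficient is to send rows in batches to a single parameterized query rather than running one query per row. The sketch below illustrates that batching idea only; it is not necessarily the approach taken in the video. The `chunked` helper, the `MERGE_QUERY` text, and the `Person` label with its `id` and `name` properties are all hypothetical placeholders.

```python
from itertools import islice

# Hypothetical Cypher query: the label and property names are placeholders.
# UNWIND turns the $rows list parameter into one row per list element.
MERGE_QUERY = """
UNWIND $rows AS row
MERGE (p:Person {id: row.id})
SET p.name = row.name
"""


def chunked(rows, size):
    """Yield successive lists of at most `size` items from `rows`."""
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# Stand-in for the list of dicts that df.to_dict("records") would return.
records = [{"id": i, "name": f"person-{i}"} for i in range(5)]
batches = list(chunked(records, 2))
```

With the official Neo4j Python driver, each batch could then be sent with `session.run(MERGE_QUERY, rows=batch)`, assuming an open session.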
Part 6: Populating the Database with LOAD CSV
- This video references this GitHub repo that has the data used in this part.
Part 7: Populating the Database with the neo4j-admin tool
- This video works from the command line using Docker. The shell commands are provided in GitHub gists, which can be found here.
- The data for this part can be found in data/ (the files are got-s1-nodes.csv and got-s1-edges.csv).
Part 8: Populating the Database from a JSON file
- This video references a JSON file I created for my NODES 2021 tutorial, "Creating a Knowledge Graph with Neo4j: A Simple Machine Learning Approach."
- Repository for the workshop: Contains the JSON file
- I have also put this file in the data/ directory of this repository, but the Cypher query I used in the video (cypher_queries/part8.cql) uses the workshop repo.
- Video of the workshop
Part 10: Creating In-Memory Graphs with Cypher Projections
Part 11: Import RDF Data from Wikidata
- To query Wikidata, it is helpful to know how to use SPARQL. The query builder that I showed (which has several great example queries) can be found here. Wikidata also provides a good SPARQL tutorial.
- This video shows the use of Neosemantics for importing the RDF data. See below in the References for docs on how to use it.
- This video also very quickly demonstrates Neo4j Bloom for visualization and queries. For an in-depth look at how to use Bloom, see this video.
Part 12: Creating In-Memory Graphs with Native Projections
- This is the sister video for Part 10, which explored the other method for creating in-memory graphs.
Part 13: Calculating Centrality
Part 14: Community Detection with the Louvain Method
Part 15: Community Detection via Weakly Connected Components
Part 16: Using Strongly Connected Components to Detect Communities
Part 17: Creating FastRP Graph Embeddings
- For more information on how FastRP works, see the following blog posts:
Part 18: Putting Graph Embeddings into a Machine Learning Model
- This video moves quickly! It will be important to read this blog post, particularly for understanding how to get the embeddings into a format for the machine learning model.
Part 19: Starting with a SQL table...
- This video is the start of a series looking at why we might want to go from SQL to a graph database
- It is based off of the graph data that can be found here
- I use PostgreSQL for my demonstrations, but you can use your SQL of choice
- All queries to populate your database are in ./sql_queries/part19
Part 20: ...And compare it to a graph... (2/n)
- This video builds off of Part 19, using the same data imported into Neo4j
- To create the CSV files used for this graph, I exported each of the tables in Part 19 directly from Postgres via pgAdmin
- I made some tweaks of the headers to get them into Neo4j via LOAD CSV easily
- The data files can be found in ./data
Part 21: An example of when querying a graph can be easier than SQL (3/n)
- This video builds off of Parts 19 and 20 of this series
- If you do not already have a Neo4j database populated with this data, follow the instructions in Part 20 or run the script ./cypher_queries/part20.cql to populate the database
Part 22: A side-by-side calculation of degree using SQL and Neo4j (4/n)
- This video builds off of Parts 19-21 of this series
- If you do not already have a SQL database populated with this data, use the queries in ./sql_queries/part19/
- If you do not already have a Neo4j database populated with this data, follow the instructions in Part 20 or run the script ./cypher_queries/part20.cql to populate the database
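As a reference point for the degree calculation compared in Part 22, here is a minimal plain-Python sketch of what "degree" means for an edge list. It is independent of both the SQL and the Cypher versions shown in the video; the `degree_counts` function and the toy edge list are hypothetical and are not the data from Part 19.

```python
from collections import Counter


def degree_counts(edges):
    """Count how many edge endpoints touch each node (undirected degree).

    A self-loop contributes 2 to its node, following the usual convention.
    """
    degree = Counter()
    for source, target in edges:
        degree[source] += 1
        degree[target] += 1
    return degree


# Toy edge list used only for illustration.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
degrees = degree_counts(edges)
```

Here `degrees["c"]` is 3 because node `c` appears in three edges, which is the same quantity both the SQL and the graph query have to arrive at.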
Part 23: PageRank done two ways (5/n)
- This video builds off of Parts 19-22 of this series
- We will be using a very simplistic graph for this demonstration
- The PageRank SQL query was taken from this Stack Overflow post, which was originally written for T-SQL and has been modified in this repo to work in PostgreSQL
- This video builds off of Parts 19-23 of this series
- This is the final video in the mini series-within-a-series for the SQL vs. Neo4j comparisons
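For the PageRank comparison in Part 23, it can help to see the algorithm itself outside of either database. The following is a minimal power-iteration sketch, assuming a directed edge list and a damping factor of 0.85; it is not the SQL query or the Neo4j procedure used in the video, and the `pagerank` function and toy graph are hypothetical. Scores here are normalized to sum to 1, so they may be scaled differently from what a database implementation reports.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Approximate PageRank by repeated rank redistribution over directed edges."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for source, target in edges:
        out_links[source].append(target)

    count = len(nodes)
    rank = {node: 1.0 / count for node in nodes}
    for _ in range(iterations):
        # Every node starts each round with the "random jump" share.
        new_rank = {node: (1.0 - damping) / count for node in nodes}
        for node in nodes:
            targets = out_links[node]
            if not targets:
                # A node with no outgoing edges spreads its rank evenly.
                targets = nodes
            share = damping * rank[node] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Toy directed graph used only for illustration.
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("d", "c")]
ranks = pagerank(edges)
```

In this toy graph node `c` ends up with the highest score, since it receives links from both `b` and `d`, while `d` has no incoming links and keeps only the random-jump share.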
Part 25: Creating a graph for a Kaggle competition
- This video is based off of the H&M Personalized Fashion Recommendations Kaggle competition
- The original data can be found and downloaded from the Kaggle public API via their CLI tool, assuming you have a Kaggle account
- For information on how to use the Kaggle public API, see this article
Part 26: Creating a graph model of the Kaggle competition (2/n)
- This video is based off of Part 25, which uses the H&M Personalized Fashion Recommendations Kaggle competition
- There is no code used in this part
- If you would like to make an image of a graph model for yourself, check out arrows.app
Part 27: Node similarity of Kaggle competition graph (3/n)
- This video is based off of Parts 25 and 26, which use the H&M Personalized Fashion Recommendations Kaggle competition
- If you need a refresher on how to create an in-memory graph projection as is done in this video, please consult Part 12
Part 28: Using KNN to identify similar items of Kaggle competition graph (4/n)
- This video is based off of Parts 25-27
- If you need a refresher on how to create an in-memory graph projection as is done in this video, please consult Part 12
- In this video we will do some very basic feature engineering to explore the K-Nearest Neighbors for each article of clothing to obtain similar articles
- (The next video will also do KNN, but using some much more sophisticated features!)
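To make the K-Nearest Neighbors idea concrete, here is a minimal plain-Python sketch that finds the most similar items given one feature vector per item. It assumes cosine similarity as the metric, which is a choice made for this illustration; the `k_nearest` function, the article names, and the feature values are hypothetical and are not taken from the H&M data or from the video.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def k_nearest(features, k):
    """For each item, return its k most similar other items as (name, score) pairs."""
    neighbors = {}
    for item, vector in features.items():
        scored = [
            (other, cosine_similarity(vector, other_vector))
            for other, other_vector in features.items()
            if other != item
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        neighbors[item] = scored[:k]
    return neighbors


# Made-up feature vectors, one per article of clothing.
features = {
    "article_1": [1.0, 0.0, 1.0],
    "article_2": [1.0, 0.0, 0.9],
    "article_3": [0.0, 1.0, 0.0],
    "article_4": [0.1, 1.0, 0.0],
}
neighbors = k_nearest(features, k=1)
```

This brute-force version compares every pair of items, which is fine for a handful of vectors but is exactly the cost that a dedicated KNN implementation is designed to avoid on a large graph.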
Part 29: Using KNN with more sophisticated feature vectors (5/n)
- This video is based off of Parts 25-28
- This video just scratches the surface of all of the new offerings within GDS 2.0, but focuses on the new GDS Python Client