Exploiting GNNs for Business Recommendation on Yelp Data

Abstract: 

Graph Neural Networks (GNNs) have demonstrated strong results in many domains where the data naturally forms a network. The objective of this workshop is to use the Yelp Dataset to produce business recommendations for users by exploiting the network composed of reviews, users, friends, tips, and businesses. We will start from the downloaded JSON files of the Yelp dataset and convert them into CSV files for import into a Neo4j database. Neo4j will allow us to visualize and better understand the schema and properties of the underlying graph. Once we have a glimpse of our training graph, we will preprocess the data, analyzing each node label and relationship type in order to understand how to transform properties into appropriate input features for the GNN. Another preprocessing step will be dropping the features that end up with too many missing values after the translation from JSON to a graph. After that, we will create an in-memory graph with the DGL library, which will also help us split the data into training, validation, and test sets and create a minibatch sampler for each set. With all of this in place, we will train our graph neural network for the business recommendation task. To complete the cycle, we will need a tool for inferring new business recommendations. We will save our business and user vectors in Qdrant, an open-source vector database that offers good speed, accuracy, and flexibility. In this way, we can perform a fast and accurate similarity search and retrieve business vectors based on their compatibility with the user.
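As an illustration of the first step, the conversion from JSON to CSV could look roughly like the sketch below. It is not the workshop's actual code: it assumes the Yelp files store one JSON object per line, and the field list, the sample records, and the helper name `json_lines_to_csv` are hypothetical choices made for the example. Neo4j's import tooling may also expect extra header annotations that are not shown here.

```python
import csv
import io
import json

# Assumed subset of fields to keep; real Yelp business records carry more.
FIELDS = ["business_id", "name", "city", "stars"]


def json_lines_to_csv(json_lines, out_file, fields=FIELDS):
    """Write one CSV row per JSON object and return the number of rows written."""
    writer = csv.DictWriter(out_file, fieldnames=fields)
    writer.writeheader()
    count = 0
    for line in json_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # Keep only the selected fields; missing keys become empty cells.
        writer.writerow({f: record.get(f, "") for f in fields})
        count += 1
    return count


# Hypothetical sample records standing in for lines of a Yelp JSON file.
sample = [
    '{"business_id": "b1", "name": "Cafe Uno", "city": "Boston", "stars": 4.5, "extra": 1}',
    '{"business_id": "b2", "name": "Bar Due", "stars": 3.0}',
]

buffer = io.StringIO()
rows_written = json_lines_to_csv(sample, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

In practice `json_lines` would be an open file handle on one of the downloaded dataset files and `out_file` a file opened with `newline=""`. The empty cells left by missing keys are the kind of gaps that the later preprocessing step has to deal with.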

Session outline:

download the Yelp dataset JSON files and create the CSV files for Neo4j import and visualization
preprocess and clean the dataset
create the DGL in-memory graph
train a heterogeneous graph neural network for business recommendation
use the trained model together with a vector database for inference
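To make the last step of the outline more concrete, the sketch below ranks business vectors against a user vector in plain Python. It is only a stand-in for what a vector database such as Qdrant does at scale, and it does not use the Qdrant API. Cosine similarity is an assumed metric (the abstract does not say which one the workshop uses), and the vectors, IDs, and function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(user_vector, business_vectors, top_k=2):
    """Return the top_k (business_id, score) pairs, best match first."""
    scored = [
        (business_id, cosine_similarity(user_vector, vector))
        for business_id, vector in business_vectors.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical 2-dimensional embeddings; a trained GNN would produce larger ones.
business_vectors = {
    "b1": [1.0, 0.0],
    "b2": [0.0, 1.0],
    "b3": [0.7, 0.7],
}
user_vector = [0.9, 0.1]

top = recommend(user_vector, business_vectors, top_k=2)
print(top)
```

This brute-force loop compares the user against every business, which is fine for a toy example but slow for a large catalogue. Keeping such lookups fast on many vectors is the reason for handing the search to a vector database.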

Bio: 

Valerio Piccioni is an AI Engineer at LARUS who focuses primarily on Graph Neural Networks but also enjoys working in other deep learning fields such as NLP and Computer Vision. He is also interested in MLOps, since getting machine learning models into production is harder than it seems. He is currently working on a graph-based fraud detection project.

Open Data Science
One Broadway
Cambridge, MA 02142
info@odsc.com
