Abstract: Recommendation is the task of suggesting items tailored to a particular user. As generative AI creates an explosion of digital content, personalization will be more important than ever! Whether the application is sneaker designs, blog posts, or even pre-trained machine learning model weights, most recommendation tasks share a similar underlying structure. We need some way to represent items and users, typically as vectors, as well as a way to index them for fast computation. We also need to design intuitive APIs that expose the recommendation system to application developers. Weaviate is an open-source vector search database with many unique search and database features. On the database side, Weaviate offers replication, backups, horizontal scalability, and more to help developers easily serve their apps to millions of users across the world! On the search side, Weaviate provides approximate nearest neighbor indexing algorithms with support for symbolic filtering, as well as keyword-based inverted indexing, and ties everything together with APIs for building search pipelines. We have recently developed Ref2Vec, a new feature in Weaviate for representing users and building recommendation systems! Ref2Vec presents a graph-structured interface for connecting users to their interactions with miscellaneous products or brands, and those interactions in turn create a representation for the user. For example, the simplest case is to construct a bipartite graph between users and products and represent each user as the average of the vectors of the products they “liked”. This session will also present a hands-on example of personalized search through Diffusion-generated sneakers with the open-source Weaviate engine! Following the example, we will dive further into the details of Collaborative Filtering, HDBSCAN clustering, and Graph Neural Networks!
Attendees of this talk will gain an understanding of how vector search technology is impacting recommendation, along with a practical walkthrough of how to use the technology themselves to build applications.
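The simplest case mentioned in the abstract, where a user is represented as the average of the vectors of the products they liked, can be illustrated with a minimal sketch. This is plain Python and not Weaviate's API: the item names, the 3-dimensional vectors, and the helper functions `centroid`, `cosine`, and `recommend` are all hypothetical, and a brute-force cosine-similarity scan stands in for the approximate nearest neighbor index a real deployment would use.

```python
from math import sqrt

def centroid(vectors):
    """Average a non-empty list of equal-length vectors component-wise."""
    if not vectors:
        raise ValueError("need at least one liked item")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm

# Hypothetical item embeddings, made up for illustration only.
items = {
    "sneaker_a": [1.0, 0.0, 0.0],
    "sneaker_b": [0.0, 1.0, 0.0],
    "sneaker_c": [0.0, 0.0, 1.0],
    "sneaker_d": [0.9, 0.8, 0.1],
}

# Bipartite user-to-item "like" edges (also hypothetical).
likes = {"user_1": ["sneaker_a", "sneaker_b"]}

def recommend(user, k=1):
    """Rank items the user has not liked by similarity to the user's centroid."""
    liked = likes[user]
    user_vec = centroid([items[name] for name in liked])
    candidates = [
        (cosine(user_vec, vec), name)
        for name, vec in items.items()
        if name not in liked
    ]
    # Brute-force ranking; an ANN index would replace this scan at scale.
    return [name for _, name in sorted(candidates, reverse=True)[:k]]

print(recommend("user_1"))
```

Because the user vector is recomputed from the current set of likes, adding or removing a like immediately shifts the recommendations, which is the appeal of the centroid approach over a separately trained user embedding.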
Bio: Connor Shorten is a Research Scientist at Weaviate, an open-source vector search database. Connor has had a role in the development of Ref2Vec, Hybrid Search, Generative Search, Weaviate’s Pipe API, and Re-Ranking. Connor has also hosted 34 episodes of the Weaviate podcast featuring guests from OpenAI, Cohere, You.com, MosaicML, Jina AI, Deepset, Neural Magic, and many others! Connor also co-hosts Weaviate meetups in Boston and New York City! Prior to Weaviate, Connor earned a Ph.D. in Computer Science from Florida Atlantic University. Connor’s Ph.D. focused primarily on Data Augmentation in Deep Learning and Applications of Deep Learning for COVID-19. Connor’s publication “A survey on image data augmentation in deep learning” has received over 5,000 citations.