
UPDATE 5/14: Our SOTA model (>0.6 item recall) is now updated on OtakuRoll.
UPDATE 5/4: Submitted our SOTA results for publication! Please reach out if you want to read the preprint!
UPDATE 4/3: Got valuable feedback from a Reddit roast thread *initiate search for devs*…
UPDATE 3/28: Launched! Sign up for free at https://otakuroll.net
UPDATE 3/24: We received feedback on our final project report. We’ll now be working hard towards publication this year (keeping the conference anonymous for now)!
UPDATE 3/1: DeepNaniNet (our model) is now live! Try it on the site.
UPDATE 2/21: Hacked together a prototype with a well known baseline algorithm. Try it out now!
UPDATE 2/16: We handed in our project proposal, so it’s time to start building models! We’re implementing a former NeurIPS paper.
UPDATE 2/07/21: We finished scraping our dataset, and are converging on a paper we want to emulate.
(This spun out of a CS224N team project.)
Abstract
Guest users are common in real world applications, requiring industrial recommendation systems to handle the “cold start” problem, where no existing interactions between new users and recommendable items can be drawn from to make predictions.
Prior work addresses this problem by learning user-profile representations to bootstrap recommendations. However, this process is often either invasive, requiring new users to provide personal data to construct profiles, or shallow, yielding representations that are not expressive enough to make accurate recommendations.
In this work, we propose a new representation for guest users based on a “content basket.” A set of seed items is submitted by the user to use the service, allowing each user to be represented as a function of a collection of items. Simultaneously, we design a rich representation space where a graph of item nodes is connected by edges that signify joint, written recommendations between items.
We design a graph neural network architecture that inductively learns item (node) and inter-item (edge) representations as a combination of deep language encodings of textual content descriptions and a graph embedding learned via message passing on the graph edges.
The rich graph representation, relating similar items by their content and mutual links, enables effective generalization to items unseen during training. To evaluate our model and demonstrate a novel application, we construct a new dataset for anime recommendations, AnimeULike, that contains anonymized interactions between ~13k users and 10k animes, and present DeepNaniNet, our new anime recommendation engine (demo available at https://otakuroll.net) which can exclusively serve out-of-matrix users.
Our empirical results on both AnimeULike and CiteULike, two very different domains, demonstrate a significant performance improvement over previous cold start solutions that do not learn to dynamically represent new users.
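To make the “content basket” idea concrete, here is a minimal sketch of how a guest user could be represented purely from seed items. It is not the DeepNaniNet implementation: mean pooling, dot-product scoring, the function names, and the toy 2-dimensional vectors are all assumptions for illustration. In the actual model, item representations come from the graph neural network described above.

```python
from typing import Dict, List, Sequence

# Hypothetical toy item vectors. In the real system these would be learned
# item (node) representations, not hand-written numbers.
ITEM_VECTORS: Dict[str, List[float]] = {
    "a": [1.0, 0.0],
    "b": [0.9, 0.1],
    "c": [0.0, 1.0],
    "d": [0.8, 0.2],
}


def basket_embedding(basket: Sequence[str],
                     item_vectors: Dict[str, List[float]]) -> List[float]:
    """Represent a guest user as the mean of their seed-item vectors.

    Mean pooling is an assumption; the abstract only says the user is
    "a function of a collection of items".
    """
    vecs = [item_vectors[item] for item in basket]
    dim = len(vecs[0])
    return [sum(v[d] for v in vecs) / len(vecs) for d in range(dim)]


def recommend(basket: Sequence[str],
              item_vectors: Dict[str, List[float]],
              k: int) -> List[str]:
    """Rank items outside the basket by dot product with the user vector."""
    user = basket_embedding(basket, item_vectors)
    seen = set(basket)
    scores = {
        item: sum(u * x for u, x in zip(user, vec))
        for item, vec in item_vectors.items()
        if item not in seen
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]


if __name__ == "__main__":
    print(recommend(["a"], ITEM_VECTORS, k=2))
```

Because the user vector is built only from the submitted items, no account history or personal profile is needed, which is the point of the content basket for cold-start users.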
Results
TL;DR: We achieve strong results (>60% item recall@100) on our dataset, coined AnimeULike, and equivalent SOTA results on CiteULike (for scientific articles). In other words, our model returns >60% of the animes you will watch within its top 100 results, out of a database of 10,000 animes. Moreover, the recommendations are far more diverse than those of WMF and Top-k CF.

More on this to come as our manuscript gets reviewed.
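For readers unfamiliar with the metric, here is a minimal sketch of recall@k for a single user: the fraction of the items they actually interacted with that appear in the top k recommendations. This is the common textbook definition and is an assumption here. The exact evaluation protocol behind the numbers above (for example, how scores are averaged across users or items) is not spelled out in this post.

```python
from typing import Iterable, Sequence


def recall_at_k(ranked: Sequence[str], relevant: Iterable[str], k: int) -> float:
    """Fraction of relevant items that appear in the top-k of a ranked list."""
    relevant_set = set(relevant)
    if not relevant_set:
        return 0.0  # convention chosen for this sketch: no relevant items -> 0
    hits = len(set(ranked[:k]) & relevant_set)
    return hits / len(relevant_set)


if __name__ == "__main__":
    # Toy example with hypothetical item IDs: one of the three relevant
    # items ("b") is in the top 3, so recall@3 is 1/3.
    print(recall_at_k(["a", "b", "c", "d"], {"b", "d", "z"}, k=3))
```

Under this definition, recall@100 above 0.6 means that more than 60% of a user's held-out animes land in the top 100 of the ranked list.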