- Intro
- How was your day?
- Any plans for this weekend?
- "Coding"
- Brainstorm how to review recipe similarity for learning to rank recommender systems
- What is Learning to Rank/Machine Learning to Rank (MLR)?
- What are we looking to judge for MeaLeon?
- Is the suggestion even good?
- E.g., returning hollandaise for buffalo wings is not a good suggestion
- How different/similar are the suggestions?
- E.g., if I search for tiramisu, it should not return tiramisu back, even if it’s from a different cuisine
- How similar are the ingredients?
- E.g., this recipe looks cool, but I can’t really make it with what I actually have, like chicken thighs vs. chicken breast vs. whole chicken
- MLR Notes
- Since MLR works better with two phases, perhaps the "fast result" can use existing TFIDF ideas/methods, and MLR can swoop in after
- For MLR, we use feature vectors; the features typically fall into 3 groups
- Static (query-independent) features, which depend only on the document
- Dynamic (query-dependent) features which depend on the contents of the document and the query (like TF-IDF score)
- Query-level features (query features), like the number of words in a query
- LETOR uses:
- TF, TF-IDF, BM25, and language modeling scores of document’s zones
- Lengths and IDF sums of document’s zones
- Doc’s PageRank
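A minimal sketch of the two-phase idea and the three feature groups from the MLR notes above. Everything named here is an assumption for illustration: the toy `RECIPES` corpus, the smoothed IDF formula, and the choice of "number of ingredients" as the static feature are hypothetical and not MeaLeon's actual data or pipeline.

```python
import math
from collections import Counter

# Hypothetical toy corpus: recipe name -> ingredient tokens.
RECIPES = {
    "buffalo wings": ["chicken", "wings", "hot", "sauce", "butter"],
    "chicken parmesan": ["chicken", "breast", "tomato", "parmesan", "breadcrumbs"],
    "hollandaise": ["egg", "yolk", "butter", "lemon"],
}

def idf(term):
    """Smoothed inverse document frequency over the toy corpus."""
    df = sum(term in tokens for tokens in RECIPES.values())
    return math.log((1 + len(RECIPES)) / (1 + df)) + 1.0

def tfidf_score(query, doc):
    """Dynamic (query-dependent) feature: summed TF-IDF of the query terms."""
    tf = Counter(doc)
    return sum(tf[t] / len(doc) * idf(t) for t in query)

def feature_vector(query, name):
    """One feature per group; a real MLR model would use many more."""
    doc = RECIPES[name]
    return [
        float(len(doc)),          # static: number of ingredients in the recipe
        tfidf_score(query, doc),  # dynamic: TF-IDF match between query and recipe
        float(len(query)),        # query-level: number of terms in the query
    ]

def first_phase(query, k=2):
    """Fast phase: keep the top-k recipes by TF-IDF score alone."""
    ranked = sorted(
        RECIPES, key=lambda n: tfidf_score(query, RECIPES[n]), reverse=True
    )
    return ranked[:k]

query = ["chicken", "butter"]
candidates = first_phase(query)
# Second phase: these vectors would be fed to a learned ranker (not shown).
features = {name: feature_vector(query, name) for name in candidates}
```

With this toy data, the fast phase also keeps hollandaise for a chicken-and-butter query, which is the kind of result a second, learned phase would be expected to demote.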
- What metric will you use for MeaLeon’s learning to rank?
- First thought was Mean Average Precision
- I was interpreting it as how often the recommended recipes are different enough from the searched one (see the "what are we looking to judge" above)
- This would be good for binary judgements
- DCG (usually normalized as NDCG) is widely used in academia and supports graded judgements
- Can move to expected reciprocal rank (ERR) or Yandex’s pFound
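For comparing the candidate metrics above, here is a rough sketch of average precision (binary judgements), DCG/NDCG, and expected reciprocal rank (graded judgements). The function names and the sample judgement lists are assumptions for illustration. The average precision here divides by the number of relevant items found in the list, which assumes every relevant recipe appears somewhere in the ranking.

```python
import math

def average_precision(rels):
    """AP for one ranked list of binary judgements (1 = good suggestion)."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(rels, start=1):
        if rel:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / hits if hits else 0.0

def mean_average_precision(rel_lists):
    """MAP: mean of AP over several queries."""
    return sum(average_precision(r) for r in rel_lists) / len(rel_lists)

def dcg(gains):
    """Discounted cumulative gain with a log2(rank + 1) discount."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))

def ndcg(gains):
    """DCG divided by the DCG of the ideal (sorted) ordering."""
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal else 0.0

def err(grades, max_grade):
    """Expected reciprocal rank for integer grades in 0..max_grade."""
    p_continue, total = 1.0, 0.0
    for rank, g in enumerate(grades, start=1):
        p_stop = (2 ** g - 1) / 2 ** max_grade  # chance the user is satisfied here
        total += p_continue * p_stop / rank
        p_continue *= 1 - p_stop
    return total

# Hypothetical judgements for a top-5 result list.
binary = [1, 0, 1, 0, 0]   # "different enough" yes/no
graded = [3, 2, 0, 1, 0]   # 0 = bad suggestion ... 3 = great suggestion
ap = average_precision(binary)
quality = ndcg(graded)
```

Binary "different enough or not" judgements fit MAP, while graded judgements (for example, how good the suggestion is on a 0 to 3 scale) fit NDCG or ERR.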
- Other article of interest
- Two tower architecture with two neural nets/multilayer perceptrons
- One tower processes the query, the other processes the candidate
- The query can be a user in a user-to-item scenario (home feed) or an item in an item-to-item scenario (related items recommendation)
- "How is this problem framed as a machine learning task?"
- For candidate generation, it’s a supervised multi-class classification task: For each observation in the dataset, we want to accurately output the correct label among a set of possible ones
- Perhaps we can use a softmax on the cosine similarity to estimate the likelihood of falling into each label, e.g., "too similar (cuisine)"
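A small sketch of the softmax-over-cosine-similarity idea, assuming the two towers have already produced embeddings. The vectors, the recipe names, and the `temperature` value are hypothetical stand-ins; no actual tower network is shown here.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def softmax(scores, temperature=1.0):
    """Numerically stable softmax; a lower temperature sharpens the output."""
    scaled = [s / temperature for s in scores]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    z = sum(exps)
    return [e / z for e in exps]

# Hypothetical outputs of the query tower and the candidate tower.
query_embedding = [1.0, 0.0, 1.0]
candidate_embeddings = {
    "recipe_a": [1.0, 0.0, 1.0],
    "recipe_b": [0.0, 1.0, 0.0],
    "recipe_c": [1.0, 1.0, 0.0],
}

sims = {
    name: cosine(query_embedding, emb)
    for name, emb in candidate_embeddings.items()
}
# Softmax turns the similarities into a probability distribution over candidates.
probs = dict(zip(sims, softmax(list(sims.values()), temperature=0.1)))
```

The same softmax could be applied over per-label scores instead of per-candidate scores if the goal is to classify a suggestion into labels such as "too similar (cuisine)".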
- Should MeaLeon return more than 5 recipes, or more than 1 page?
- Kinda think no
- Presents too many choices for a simple app
- Eventually, when user data and history are stored, showing more than 5 choices would be cluttered and excessive
- Maybe have a way to unlock "slow query" if for some reason the user doesn’t like the top 5
- TODO #MeaLeon show an example recipe for the user’s query