2023.01.13 – Stream Notes

  • Intro
    • How was your day?
    • Any plans for this weekend?
  • "Coding"
    • Brainstorm how to review recipe similarity for learning to rank recommender systems
    • What is Learning to Rank (LTR)/machine-learned ranking (MLR)?
  • What are we looking to judge for MeaLeon?
    • Is the suggestion even good?
      • E.g., returning hollandaise for buffalo wings is not a good suggestion
    • How different/similar are the suggestions?
      • E.g., if I search for tiramisu, it should not return tiramisu back, even if it’s from a different cuisine
    • How similar are the ingredients?
      • E.g., this recipe looks cool, but I can’t really make it with what I actually have (e.g., chicken thighs vs. chicken breast vs. whole chicken)
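
One quick way to put a number on the "how similar are the ingredients?" question is Jaccard similarity over ingredient sets. This is a hypothetical sketch (the `jaccard` helper and the pantry/recipe data are made up for illustration, not MeaLeon code):

```python
# Hypothetical sketch: score ingredient overlap with Jaccard similarity,
# so "can I make this with what I have?" becomes a number in [0, 1].

def jaccard(a: set[str], b: set[str]) -> float:
    """|A ∩ B| / |A ∪ B|; 1.0 means identical sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

# "chicken thighs" vs. "chicken breast" don't match exactly, which is
# exactly the kind of mismatch this metric surfaces
pantry = {"chicken thighs", "butter", "garlic", "lemon"}
recipe = {"chicken breast", "butter", "garlic", "thyme"}

print(round(jaccard(pantry, recipe), 2))  # 2 shared of 6 total -> 0.33
```

The obvious catch is exact string matching: ingredients would need normalization ("chicken thigh" vs. "chicken thighs") before a score like this means much.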
  • MLR Notes
    • Since MLR works better with two phases, perhaps the "fast result" can use existing TF-IDF ideas/methods, and MLR can swoop in after
    • For MLR, we use feature vectors; these features typically fall into three groups
      • Static (query-independent) features which are dependent on the document
      • Dynamic (query-dependent) features which depend on the contents of the document and the query (like TF-IDF score)
      • Query-level features (query features), like the number of words in a query
    • LETOR uses:
      • TF, TF-IDF, BM25, and language modeling scores of the document’s zones
      • Lengths and IDF sums of the document’s zones
      • The document’s PageRank
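
To make the three feature groups concrete, here is a sketch that assembles one feature vector per (query, recipe) pair. Every specific feature and the `feature_vector` helper are assumptions for illustration, not MeaLeon's actual features:

```python
# Illustrative sketch of the three MLR feature groups:
# static (query-independent), dynamic (query-dependent), and query-level.

def feature_vector(query: str, recipe: dict) -> list[float]:
    q_terms = query.lower().split()
    doc_terms = recipe["title"].lower().split() + recipe["ingredients"]

    static = [
        float(len(recipe["ingredients"])),   # depends only on the document
    ]
    dynamic = [
        # depends on both query and document: crude term-overlap
        # stand-in for a TF-IDF score
        sum(t in doc_terms for t in q_terms) / max(len(q_terms), 1),
    ]
    query_level = [
        float(len(q_terms)),                 # depends only on the query
    ]
    return static + dynamic + query_level

vec = feature_vector(
    "buffalo wings",
    {"title": "Buffalo Chicken Wings",
     "ingredients": ["chicken wings", "hot sauce", "butter"]},
)
print(vec)  # [3.0, 1.0, 2.0]
```

A real LETOR-style setup would just add more entries per group (BM25, zone lengths, PageRank, etc.) to the same flat vector.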
    • What metric will you use for MeaLeon’s learning to rank?
      • First thought was Mean Average Precision
        • I was interpreting it as how often the recommended recipes are different enough from the searched one (see "what are we looking to judge" above)
        • This would be good for binary judgements
      • DCG is used in academia
      • Can move to expected reciprocal rank (ERR) or Yandex’s pFound
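
For reference, DCG with graded relevance takes only a few lines. The grades below (0–3) are made up, since there is no judgment data for MeaLeon yet:

```python
import math

# Minimal (N)DCG sketch with graded relevance labels; MAP, by contrast,
# only works with binary relevant/not-relevant judgments.

def dcg(rels: list[int]) -> float:
    # gain 2^rel - 1, discounted by log2(rank + 1), ranks starting at 1
    return sum((2**r - 1) / math.log2(i + 2) for i, r in enumerate(rels))

def ndcg(rels: list[int]) -> float:
    ideal = dcg(sorted(rels, reverse=True))
    return dcg(rels) / ideal if ideal > 0 else 0.0

# five suggestions with hypothetical "how good is this suggestion?" grades
print(round(ndcg([3, 2, 0, 1, 2]), 3))  # -> 0.969
```

NDCG normalizes by the ideal ordering, so a perfectly ordered list scores 1.0 regardless of how many relevant items exist.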
    • Other article of interest
      • Two tower architecture with two neural nets/multilayer perceptrons
        • 1 tower processes the query, the other processes the candidate
        • Query can be a user in a user-2-item scenario (home feed) or an item in an item-2-item scenario (related items recommendation)
      • "How is this problem framed as a machine learning task?"
        • For candidate generation, it’s a supervised multi-class classification task: For each observation in the dataset, we want to accurately output the correct label among a set of possible ones
          • Perhaps we can use a softmax on the cosine similarity to determine the likelihood of falling into the "too similar (cuisine)" etc. label
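
The softmax-over-cosine-similarity idea in the last bullet can be sketched directly. The vectors below are made up; in practice they would come from TF-IDF or the two towers:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def softmax(xs: list[float]) -> list[float]:
    m = max(xs)                      # shift by max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

query = [1.0, 0.2, 0.0]              # e.g. a TF-IDF vector for "tiramisu"
candidates = [
    [1.0, 0.2, 0.0],                 # near-duplicate -> "too similar" bucket
    [0.4, 0.9, 0.1],                 # related but different
    [0.0, 0.1, 1.0],                 # unrelated
]

probs = softmax([cosine(query, c) for c in candidates])
print([round(p, 3) for p in probs])  # near-duplicate gets the most mass
```

Thresholding the near-duplicate probability could be one way to drop the tiramisu-returns-tiramisu case before ranking.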
  • Should MeaLeon return more than 5 recipes? Or more than 1 page?
    • Kinda think no
    • Presents too many choices for a simple app
    • Eventually, when user data and history are stored, showing more than 5 choices is cluttered and excessive
      • Maybe have a way to unlock "slow query" if for some reason the user doesn’t like the top 5
  • TODO #MeaLeon show an example recipe for the user’s query
