2024.10.02 – Stream Notes

  • Stream Notes
    • Doing
      • Work through the Vespa example(s) and see what we can adapt for MeaLeon
        • Text Search Tutorial
          • Goal of this tutorial: build an end-to-end search application that returns relevant documents for a text query
          • Looks like n-grams in Vespa refer to character n-grams rather than word-based ones
            • Vespa docs recommend n-gram matching for languages that are not tokenized into words (e.g. CJK languages)
            • The docs also say n-grams are generally not useful for text search, unless they are needed for increased recall
            • Later in the tutorial, in the part on how queries and documents get processed: sounds like it's better to enforce one language than to allow autodetection, which can struggle with short query strings
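To make the character vs. word n-gram distinction above concrete, here is a minimal sketch in plain Python. It is not Vespa's implementation (Vespa's gram matching has its own handling of casing, whitespace, and gram size); the function names `char_ngrams` and `word_ngrams` are made up for illustration.

```python
def char_ngrams(text: str, n: int) -> list[str]:
    """Return the overlapping character n-grams of `text`."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def word_ngrams(text: str, n: int) -> list[str]:
    """Return the overlapping word n-grams of `text`, splitting on whitespace."""
    words = text.split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


# Character n-grams slide over letters, so no word boundaries are needed.
print(char_ngrams("curry", 3))             # ['cur', 'urr', 'rry']

# Word n-grams need the text to be split into words first.
print(word_ngrams("thai green curry", 2))  # ['thai green', 'green curry']
```

Because character n-grams match on fragments of words, a partial or misspelled term can still share grams with a document, which is where the extra recall comes from. The same property is why they tend to add noise for ordinary text search.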
          • The rank operator does not change retrieval or matching: the number of documents exposed to ranking is the same as before. It can be used to implement a variety of use cases around boosting
            • Can I use this for ingredient suggestions? Or as a way to restrict clustering of similar cuisines?
            • Depends on the ranking algorithm
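A toy model of the rank-operator idea above, applied to the ingredient question. This is plain Python, not Vespa code: `Recipe`, `rank_like`, and the "+1 per boosted ingredient" scoring are all assumptions made up for illustration. In Vespa, how the extra operands affect the score depends on the rank profile.

```python
from dataclasses import dataclass


@dataclass
class Recipe:
    title: str
    ingredients: set[str]


def rank_like(recipes: list[Recipe], must_match: str, boosts: list[str]) -> list[tuple[float, str]]:
    """Toy rank(first, rest...): only `must_match` decides which recipes are
    retrieved; `boosts` can only change the score of recipes already retrieved."""
    hits = []
    for recipe in recipes:
        if must_match not in recipe.ingredients:
            continue  # retrieval is decided by the first operand alone
        # Hypothetical scoring: base score of 1.0, plus 1.0 per boosted ingredient present.
        score = 1.0 + sum(1.0 for b in boosts if b in recipe.ingredients)
        hits.append((score, recipe.title))
    # Highest score first; ties broken alphabetically by title.
    return sorted(hits, key=lambda h: (-h[0], h[1]))


recipes = [
    Recipe("pad thai", {"noodles", "peanut", "lime"}),
    Recipe("lo mein", {"noodles", "soy sauce"}),
    Recipe("guacamole", {"avocado", "lime"}),
]

# "lime" boosts pad thai but does not pull guacamole into the results.
print(rank_like(recipes, "noodles", ["lime"]))  # [(2.0, 'pad thai'), (1.0, 'lo mein')]
```

With or without the boost list, the same two noodle recipes come back; only their order changes. That matches the note that the number of documents exposed to ranking stays the same.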
        • TODO: Look up how to reuse one Docker image to run multiple containers. Having an issue running different tutorials from the same Vespa Docker image
      • Machine Learning subreddit recommended this NLP Newsletter: https://nlp.elvissaravia.com
    • From Chat / Derail
