Skip to content
Home » 2022.10.31 – Stream Notes (Happy Halloween!)

2022.10.31 – Stream Notes (Happy Halloween!)

  • by

Stream notes

Video

Here’s the VOD:

Below is an embed


Summary

  1. Intro
    1. Happy Halloween
    2. Weekend
      1. Got the Miata!
  2. Coding
    1. Try to figure out ideal number of clusters from silhouette analysis
  3. From chat
    1. texoport: what’s new in clustering
      1. DBSCAN vs…
      2. Hypersphere density based approach
        1. Maybe hold off on this one, but density
      3. HDBSCAN?
      4. LGBM or XGBoost?
    2. Wait but Why
  4. Derail
    1. Ming Tsai opened a new restaurant in Big Sky, Montana
    2. Group therapy and advice session
  5. We got raided by TimeEnjoyed
    1. MeaLeon redeem for Taiwanese braised pork
      1. Braised Pork with Orange and Fennel
      2. Chicken curry
      3. Duck Fried Rice
      4. Moroccan Chicken and Lentils
      5. Almost Grandmothers Challah
    2. MeaLeon redeem for fortune cookies (American)
      1. Chocolate Genoise with Chocolate Peppermint Sauce
      2. White Chocolate Raspberry Creme Brulee Tartlets
      3. Double Chocolate Financier Cake
      4. Apricot Almonst Linzertorte
      5. Salted Chocolate Caramel Tart
  6. We got raided by dm_adam2
    1. Shrimp and Veggie Stir Fry
    2. Vegetarian Style Congee Em Xi Fan Em
    3. Grilled Shrimp and Scallions with Southeast Asian Dipping Sauces
    4. Vietnamese Shrimp and Pork Crepes
    5. Thai Style Chicken and Rice Soup
      1. dm_adam2 tried and got this error
      2. Then this one
  7. We got raided by Maweexy
    1. MeaLeon is down…
  8. We got raided by data_dude
    1. MeaLeon…
  9. We raided NovaLiminal

Shoutouts

Streamers who were active in chat

  1. intelijens
  2. TimeEnjoyed
  3. dm_adam2
  4. Maweexy
  5. R4D4R_Live
  6. earend
  7. bedtimebear_808
  8. texoport
  9. SeaFilmz

To Do

Code

  • [ ] Add custom function to look at the term frequency distributions
    • [ ] This is even what the scikit-learn docs do…they should just build it in to the algorithms…
  • [ ] What is a good minimium document frequency for a term and what is a cutoff? 1/number of topics?
  • [ ] Look at the example on sklearn for NMF and topic extraction
  • [ ] add ingredient information to search result box
  • [ ] Make the background tile
  • [ ] take another look at streamlit
  • [ ] Look up math theory behind t-SNE
  • [ ] optuna for automated hyperparameter searching
  • [ ] Should .lock and tool-versions be added to .gitignore? I’ve never committed them
  • [ ] Refactor to use Pola.rs
  • [ ] Write better/more thorough docstrings
  • [ ] Look up examples of applying OOP practices to data science
  • [ ] Make a PR to update the documentation for how **kwargs are used for sklearn Pipeline (their example and docs are seemingly incorrect)
  • [ ] Figure out the ideal process for putting a corpus with collection of documents through a sklearn pipeline considering you previously got the overall counts and then did tfidf on the individual recipes using the CV from overall.

Photo

  • [ ] Try to see if the preview pane in Bridge can get shifted left a little

General Stream/Admin

  • [ ] How to change wordpress endpoint, specifically /admin
  • [ ] Make hand cam a separate scene?
  • [ ] Make subgoal buying a wet fart soundboard (thanks chat) – [ ] CornoZeewo scuffed model
  • [ ] Corno mode
  • [ ] CuwonoZeewo scuffed mode
  • [ ] Need a Klawful knife emote, or a shotgun (thanks R4D4R)
  • [ ] Install minecraft
  • [X] Check and commit from laptop
  • [ ] Update suggested python dev list
    • [ ] Add statquest channel
  • [ ] 2022.10.14 First ever follow bot attack
  • [ ] Expensive redeem: crono tells lore from vtuber chat
  • [ ] Add smort command
  • [ ] Maweexy wants "Data Da Da", maybe that can be a different mode
    • [ ] Intelijens +1
  • [ ] !Keyboard command
  • [X] Really need to set up a timer for breaks/ads
    • [ ] Or just have fewer ads, Crono
  • [X] Share the actual data for MeaLeon (remove it from .gitignore)
    • [X] Files are too big: Free GitHub has a per file limit of 50MB
    • [X] JSON has been shared to Kaggle
  • [ ] Add contributing guidelines to repo
  • [ ] Use silhouette analysis to determine number of clusters
    • [ ] Elbow method said 6, my hypothesis was 12
    • [ ] Not worthwhile on the tSNE transformed data
  • [ ] Use KMeans to predict cluster for the Missing Cuisine recipes
  • [ ] Run clustering on the untransformed dataframes to see what we get out
  • [X] Fix cuisine filter (Vietnamese did not exclude Asian or Chinese, want to check "parent" and "sibling" cuisines)
  • [] Update README
  • [] Check speed limiter on Edamam, it may be too restrictive
    • See if there’s an error code to diagnose
  • For next stream
    • Classify and or cluster the missing labels to see how they can augment existing recipe database
  • 2022.10.31 Raided 4 times!

Socials