Skip to content
Home » 2022.11.21 – Stream Notes

2022.11.21 – Stream Notes

  • by

Stream notes

Video

Here’s the VOD:

Below is an embed


Summary

  1. Intro
    1. How was your day?
      1. Work
      2. Got Pokemon Scarlet
  2. Coding
    1. HDBSCAN – Hierarchical Density-Based Spatial Clustering of Applications with Noise
    2. Get predictions/labels out
    3. Plot a tree visualization
  3. From chat/derail
    1. Pandapoopums says to try hair oil!
    2. Also get a 2nd set of sheets, like an adult
    3. Name a pokemon after Pandapoopums!
      1. A pandalike one or wooper (woopums)
    4. New additional video capture card, less expensive than Elgato
  4. Music
    1. Japanese City Pop on vinyl with Shaqdi
    2. Zag tries to keep up with Habibi Funk’s selection
    3. Funky Drum Machine Soul and Boogie with Zag & Kay Suzuki
  5. We raided AriaADM

Shoutouts

Streamers who were active in chat

  1. intelijens
  2. nickwan_datasci
  3. valkeryias

To Do

Code

  • [ ] Add custom function to look at the term frequency distributions
    • [ ] This is even what the scikit-learn docs do…they should just build it in to the algorithms…
  • [ ] What is a good minimium document frequency for a term and what is a cutoff? 1/number of topics?
  • [ ] Look at the example on sklearn for NMF and topic extraction
  • [ ] add ingredient information to search result box
  • [ ] Make the background tile
  • [ ] take another look at streamlit
  • [ ] Look up math theory behind t-SNE
  • [ ] optuna for automated hyperparameter searching
  • [ ] Should .lock and tool-versions be added to .gitignore? I’ve never committed them
  • [ ] Refactor to use Pola.rs
  • [-] Write better/more thorough docstrings
  • [ ] Look up examples of applying OOP practices to data science
  • [ ] Make a PR to update the documentation for how **kwargs are used for sklearn Pipeline (their example and docs are seemingly incorrect)
  • [ ] Make a PR to update the FeatureAgglomeration docs to have linkage ward also say "only euclidean is accepted"
  • [ ] Make a PR to update FeatureAgglomeration docs: cosine affinity cannot work with sparse matrices
  • [ ] Add contributing guidelines to repo
  • [ ] Use KMeans to predict cluster for the Missing Cuisine recipes
    • [ ] Move to rawer data (see step below)
  • [ ] Run clustering on the untransformed dataframes to see what we get out
  • [] Update README
  • [] Check speed limiter on Edamam, it may be too restrictive
    • See if there’s an error code to diagnose
  • [] Would be cool to have something like this graph
    • Have interactivity to display the closest recipes to your searched recipe
  • [] Link VSCode to the droplet (can use SSH)
  • [] How to (check vod from 11.11)
  • [] Add button to copy links in markdown format in MeaLeon
  • [] Reenable TFIDF instead of the OHE vectors

Photo

  • [ ] Try to see if the preview pane in Bridge can get shifted left a little

General Stream/Admin

  • [ ] How to change wordpress endpoint, specifically /admin

  • [ ] Make hand cam a separate scene?

  • [ ] Make subgoal buying a wet fart soundboard (thanks chat) – [ ] CornoZeewo scuffed model

  • [ ] Corno mode

  • [ ] CuwonoZeewo scuffed mode

  • [ ] Need a Klawful knife emote, or a shotgun (thanks R4D4R)

  • [ ] Install minecraft

  • [ ] Update suggested python dev list

    • [ ] Add statquest channel
  • [ ] Expensive redeem: crono tells lore from vtuber chat (Requested by maweexy)

  • [ ] Add smort command

  • [ ] Maweexy wants "Data Da Da", maybe that can be a different mode

    • [ ] Intelijens +1
  • [X] Really need to set up a timer for breaks/ads

    • [ ] Or just have fewer ads, Crono
  • [ ] Fix me with small screen scene

  • [X] Add a raid command that shouts out raiders and links to an intro/MeaLeon

  • check out deadmau5 one day

    • LTT x deadmau5
  • What were the results of the pun poll?

  • [] Make a command to link directly to dataset uploaded to Kaggle

  • Get Hoppip and Bulbasaur planters

  • For next stream

    • Classify and or cluster the missing labels to see how they can augment existing recipe database
    1. Can do another dimension reduction to make a 2D visualization after attaching HDBSCAN labels to the original data
      1. Mix in the probability to impact "density" of color (like the HDBSCAN plot examples…but in Bokeh)
      2. Refactor existing template for Bokeh

    Or

    1. See how the HDBSCAN labels "map" to cuisine labels
      1. Go back, reattach index labels to data so that you can compare the original recipes with the HDBSCAN labels
    2. Predict HDBSCAN labels for the recipes that are missing cuisines and see where they line up (noise, etc)
      1. But also maybe compared to cuisine labels…depending on how many "noisy" recipes there are, this might not be very productive, but see how this looks
    3. Could this new HDBSCAN label used instead of a hard mapped cuisine label? What would the results look like?
      1. Gets a little tricky if recipes map to noise
      2. This involves a refactor of MeaLeon’s webapp (which is overdue)
    4. Can still do 2D visualization to have something up
      1. Mix in the probability to impact "density" of color (like the HDBSCAN plot examples…but in Bokeh)

Socials