
Helpful Data Science Career Resources

This page serves as a meta compilation of resources that have been helpful for me in my data science career. I’ll update this over time!

In General:

  • Remember that you are driving this process! It is rare that people will push you or give you the time to prep or do any of these things. You have to remember to block out time for yourself and keep to it.
  • If you look at the Refresh Technical Skills section, it’s going to look like we’re expected to know a lot. We kinda are, but how much really depends on the job you’re applying for. I’ve applied to jobs that actually wanted a hardcore statistician and failed those interviews. I’ve also applied to jobs that were looking for a SQL expert and failed those too. Lastly, I’ve applied to jobs where the technical interviews focused heavily on data structures and algorithms, and I definitely failed those. Try not to stress out: it’s extremely difficult to master all of these topics at once if you are not using them frequently.


On LinkedIn:


On Resumes:

  • One page, unless you are applying to academic positions that specifically ask for your whole CV. Tons of people have told me that resumes should be one page.
  • I heard somewhere that people spend 6-15 seconds looking at a resume
  • Your application is very likely going to be digital. To save visual space on my resume, I used icons and embedded hyperlinks where possible

Classes:

Code:

Statistics:

Applied Statistics/Machine Learning:


Working on Code/Problems:

For the record, I don’t think working through algorithm problems is necessarily reflective of what coders, or especially data scientists, do on a day-to-day basis. However, they are considered part of our interview processes.

LeetCode, HackerRank, and Codewars are all somewhat similar, and I find that solving most Easy level and some Medium level problems will probably be an accurate reflection of what you should expect to see in interviews. Even Medium is a stretch.

For SQL, I’ve really liked both StrataScratch and Mode. SQL is an interesting language where, in my opinion, the difficult problems come from more complex databases and understanding syntax…until you get to the point where you need to optimize your queries on huge datasets. Both of these resources start teaching more advanced SQL functions.
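As one illustration of what "more advanced SQL functions" can mean, here is a minimal sketch of a window function (a per-customer running total) run through Python’s built-in sqlite3 module. The `orders` table, its columns, and the sample rows are all hypothetical and invented for this example; they are not taken from StrataScratch or Mode. It also assumes the SQLite bundled with your Python is version 3.25 or newer, since older versions do not support window functions.

```python
import sqlite3

# Hypothetical table and data, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, order_day TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("ana", "2023-01-01", 20.0),
        ("ana", "2023-01-05", 35.0),
        ("ben", "2023-01-02", 50.0),
        ("ben", "2023-01-03", 10.0),
        ("ben", "2023-01-09", 15.0),
    ],
)

# Running total per customer, ordered by date, using a window function.
# PARTITION BY restarts the sum for each customer; ORDER BY makes it cumulative.
query = """
    SELECT customer,
           order_day,
           amount,
           SUM(amount) OVER (
               PARTITION BY customer
               ORDER BY order_day
           ) AS running_total
    FROM orders
    ORDER BY customer, order_day
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The same result can be produced with a self-join or a correlated subquery, which is a decent exercise for comparing how readable each version is.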

Python:

SQL:

Data Sets:

  • Kaggle: https://www.kaggle.com/datasets
    • Kaggle has TONS of datasets. You can compete there like you can on LeetCode, but Kaggle has had scandals with people cheating, and it’s doubtful whether anyone actually checks your Kaggle score. I wouldn’t try to compete; instead, leverage their datasets to start working on projects or problems that are more interesting to you!
  • Microsoft Research Open Data: https://msropendata.com/
  • US Government’s Open Data: https://data.gov/

Puzzles


Compensation:

  • Blind: https://www.teamblind.com/
    • Blind is like an inverse LinkedIn: instead of people praising employers, it has anonymous but verified employees discussing negatives about the companies they work for. You have to list your total compensation when posting or commenting.
  • Levels: https://www.levels.fyi/
    • Levels has employees anonymously share their offers and allows people to compare compensation packages

Content Creators:

You are more than welcome to follow along with my resources, but I believe the following creators have more established libraries of content and maybe you’ll vibe more with them!

Twitch:

YouTube

Non Video Based

  • Chip Huyen (LinkedIn): posts a lot of helpful and interesting discussion on data science, machine learning, and career advice

Podcasts:

  • Software Engineering Daily: https://softwareengineeringdaily.com/
    • I think this is the biggest software engineering podcast and it has a subset discussing machine learning. There are some really cool episodes talking about tech deployment and history.
    • In all honesty, the founder and main host decided to literally go off his meds when the COVID shelter-in-place orders began and revealed more of his true personality, conspiracy theory leanings, and generally entitled and somewhat fragile mindset. I think the best content is from before 2021. Afterwards, he begins pressuring guests to agree with him or provide funding for a startup he founded, and the interviews become low quality. I find myself skipping half the episodes now.
    • Sometime in 2021, the host of Data Skeptic came in and started to host, and the quality of the episodes started going back up! Basically, the less Jeff Meyerson in the episodes, the better…although I wrote this line before I found out he passed away in July 2022.
  • Not So Standard Deviations: https://nssdeviations.com/
    • Roger and Hilary talk about data science in academia and industry.
    • This has actually become one of my favorite tech related podcasts
  • Data Skeptic: https://dataskeptic.com/
    • Data Skeptic and Not So Standard Deviations tend to talk about practical effects of using data science, which is why I find them to be so interesting. How do we integrate with teams to help everyone work better?
  • Linear Digressions: http://lineardigressions.com/
    • This podcast has ended, which is unfortunate, but the episodes should still be available!
  • Partially Derivative: http://partiallyderivative.com/
    • This podcast has ended, which is unfortunate, but the episodes should still be available!
    • It’s been a while since I listened to either Linear Digressions or Partially Derivative, but I remember missing both when they announced their ends
  • Towards Data Science: https://towardsdatascience.com/podcast/home
    • To be frank and maybe a bit harsh, I don’t think a lot of the user-submitted, crowdsourced content on Towards Data Science or Analytics Vidhya is of the highest quality. However, this podcast has been more interesting, as it covers the techniques, advice, and history of data science in different organizations.

Socials