Data Science Resources
Below you will find a compiled list of all my favorite data science resources, broken down into the following subject categories:
 General Guidance
 Process & Skills Breakdowns
 Industry Roles
 Job Search
 Building a Portfolio
 Bootcamps
 Online Courses
 Python
 Asking Questions
 SQL
 Pandas
 Tidy Data
 Scientific Computing
 Inferential Statistics
 Experimental Design
 Machine Learning
 Data Storytelling
Note that these resources are meant to be referenced on an as-needed basis. I do not recommend trying to read through everything at once.
Please consider contributing your own favorites, via the submission form below. This page will be a work in progress, so I’d love your help in expanding it.
Share YOUR Favorite Resources:
AJ’s Favorites
General Guidance
These articles helped set the foundation for my approach to learning Data Science.
 How to (actually) learn data science (DataQuest)
 How to learn data science without a degree (Springboard)
 Raj Bandyopadhyay (Quora)
 5 Things You Should Know Before Getting a Degree in Data Science (Medium)
 To Become a Data Scientist, Focus on Competencies Before Skills
Process & Skills Breakdowns
These articles were the main online sources for the “Data Science Deconstructed” infographic I created above:
 The Data Science Process: What a data scientist actually does day-to-day (Medium)
 The Data Science Process, Rediscovered (KD Nuggets)
 8 Skills You Need to Be a Data Scientist (Udacity)
 10 Must Have Data Science Skills (KD Nuggets)
 A Comprehensive Review of Skills Required for Data Scientist Jobs (Dataversity)
Industry Roles
These are helpful for understanding the current data science job market.
 The Data Science Industry: Who Does What Infographic (DataCamp)
 Data Science Career Paths: Different Roles in the Industry (Springboard)
 The State of Data Science & Machine Learning (Kaggle)
Job Search
These are longer PDFs and case studies that explain how to find DS jobs & interview well.
 Guide to Data Science Jobs (70 page PDF) (Springboard)
 Guide to Data Science Interviews (90 page PDF) (Springboard)
 Lessons from Analyzing Hundreds of Data Science Interviews (Springboard)
Building a Portfolio
Your portfolio is your most important asset for landing a job. Here are some brilliant blog posts from Vik at DataQuest on how to do it right.
 Building a data science portfolio: Storytelling with data (DataQuest)
 The key to building a data science portfolio that will get you a job (DataQuest)
Bootcamps
Bootcamps tend to complement accelerated learning well. Personally, I ended up choosing Springboard’s Data Science Intensive (here’s a coupon code for $100 off any Springboard course), but I also considered Udacity & Metis:
Online Courses
Both of these courses have been absolutely indispensable for me. Some of the best online instructional videos out there, hands down.
Python (Steps 3-6)
This graph from Kaggle’s 2017 industry-wide survey shows why I would recommend you start with learning Python over R.
 Learn Python (Codecademy)
 Intermediate Python for Data Science (DataCamp)
 Python Data Science Handbook (GitHub)
Asking Questions – (Step 1: Frame the Problem)
Asking intelligent questions is a skill you have to develop. Here are some insights on how to ask questions that data can answer.
 How to ask questions data science can solve (Medium)
 Ask a question you can answer with data (Microsoft)
SQL – (Step 2: Collect raw data)
There are many SQL tutorials out there to choose from, but here are my personal top 3.
Pandas – (Step 3: Process the Data)
Pandas is an indispensable skill in the Data Science toolbox. It’s important to become very comfortable with using it for data wrangling & data cleaning.
 10 Minute Pandas Tutorial (Pandas Docs)
 Data analysis in Python with pandas (YouTube)
 Data Wrangling Cheat Sheet (GitHub)
 Reshaping in Pandas – Pivot, PivotTable, Stack, & Unstack Explained with Pictures
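To give a feel for the reshaping operations the last link covers, here is a minimal sketch of pivot, stack, and unstack. The city/year/sales table is made up purely for illustration and does not come from any of the linked resources.

```python
import pandas as pd

# Hypothetical long-format data, invented for this example.
long_df = pd.DataFrame({
    "city": ["Austin", "Austin", "Boston", "Boston"],
    "year": [2016, 2017, 2016, 2017],
    "sales": [10, 12, 7, 9],
})

# pivot: one row per city, one column per year (long -> wide).
wide_df = long_df.pivot(index="city", columns="year", values="sales")

# stack: move the year columns back into the row index (wide -> long).
stacked = wide_df.stack()

# unstack: the inverse of stack, restoring the wide layout.
round_trip = stacked.unstack()

print(wide_df)
print(stacked)
```

The pivoted table is usually easier to read, while the stacked form is usually easier to filter and group, so it helps to be comfortable moving in both directions.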
Tidy Data – (Step 3: Process the Data)
This research paper is a must-read to understand the conventions behind data cleaning:
 Tidy Data research paper by Hadley Wickham
 Here are my annotated Evernote notes for a quicker read
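As a rough illustration of the core idea in the paper (each variable is a column, each observation is a row), here is a small sketch in pandas. The country/year/cases table is hypothetical, and using `melt` is just one common way to tidy a table whose column headers are really values.

```python
import pandas as pd

# Hypothetical "messy" table: the headers "2016" and "2017" are values
# of a year variable, not variable names.
messy = pd.DataFrame({
    "country": ["A", "B"],
    "2016": [100, 200],
    "2017": [110, 210],
})

# melt gathers the year columns into two new columns:
# one holding the old header (year) and one holding the cell value (cases).
tidy = messy.melt(id_vars="country", var_name="year", value_name="cases")

# The old headers were strings, so convert year to an integer column.
tidy["year"] = tidy["year"].astype(int)

print(tidy)
```

In the tidy version every row is a single (country, year) observation, which is the shape that grouping, plotting, and modeling tools generally expect.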
Scientific Computing – (Step 4: Explore the Data)
Everything you need to know about Exploratory Data Analysis (EDA) and more:
 Exploratory Data Analysis Conceptual Handbook (NIST)
 the assumptions, principles, and techniques necessary to gain insight into data via EDA
 Ultimate guide for Data Exploration in Python (Analytics Vidhya)
 a walkthrough of examples with numpy, matplotlib, and pandas
Inferential Statistics – (Step 4: Explore the Data)
 Inferential Statistics by Univ of Amsterdam (Coursera)
 Statistical Aspects of Data Mining – Google TechTalk (YouTube)
Experimental Design – (Step 4: Explore the Data)
Machine Learning – (Step 5: In-depth Analysis)
This section could use some more work. Does anyone have any good resources they’re able to submit above?
 7 Steps to Mastering Machine Learning in Python
 Cheatsheet: Scikit-Learn & Caret Packages for Python & R, respectively
 Feature Engineering (Microsoft)
Data Storytelling – (Step 6: Communicate Results)
Some amazing examples of data storytelling:
 Wealth Inequality in America
 How Mariano Rivera Dominates Hitters
 Hans Rosling TEDTalk – The best stats you’ve ever seen
 Top 10 TEDTalks for Data Scientists
These two lectures are what Tim Ferriss would call the “Minimum Effective Dose” (MED) of data storytelling:
Here are some other inspiring blogs/courses on data storytelling: