Announcing Aware Healthcare

After 10 months of hard work, I’m excited to announce that Aware Healthcare is live and growing!

Last week we launched to our first 200 patients and now we’re hiring for 7 new positions (2 full-time roles, 5 summer interns) to join us on our mission of building the thermometer for the mind.

We’re Hiring:

The positions we’re hiring for include:

  • Principal Data Scientist (Full-Time)
  • Full-Stack Software Engineer (Full-Time)
  • Data Analyst (Intern)
  • Data Engineer (Intern)
  • Software Engineer (Intern)
  • Research Assistant (Intern)
  • Administrative Assistant (Intern)

And here are some of the many benefits we’re offering as part of our internship program (starting in < 2 weeks on June 8th), in addition to cash compensation:

  1. Early-stage startup experience
  2. Portfolio projects
  3. Summer case study
  4. Letter of recommendation
  5. Future job referrals
  6. Professional mentorship
  7. Networking opportunities
  8. Real impact

More details on each of these benefits in the job descriptions linked here:

Our Why:

I started Aware Healthcare because I believe the greatest problem in mental healthcare is not stigma or access, but measurement. Mental health patients often don’t receive the care they need because we lack a universal vocabulary for describing how we’re mentally feeling. And mental health providers often can’t treat those who need it most because we lack an objective measure for prioritizing one condition or approach over another. 

So why do we have a thermometer for the body but not a thermometer for the mind?

We’re starting with addiction because there is simply no more costly, preventable, or unmanaged disease in the United States of America. Addiction costs our nation $700 billion every year and leads to 70 other comorbid medical conditions. 16% of Americans meet criteria for clinical addiction and another 32% classify as risky users. Yet, with nearly half of Americans *directly* affected, we continue to turn our attention away from this chronic, complex brain disease.

So how big of a problem does addiction need to become before we give it the attention it deserves?

Join Us:

Now in a post-COVID world, the need for remote monitoring of mental healthcare has never been greater. This is no longer just an idea. Real patients in recovery from a substance-use disorder are being touched by our work every day. Which is why we’re inviting 7 new people to join us, in paid positions, starting immediately.

Interested? Check out our open positions below, and message me directly if any of the roles are a good fit for you or somebody you know.

We’d love to hear from you.

Our Mission Just Got Real: Losing a Teammate to Drug Overdose While Working to Prevent Addiction Relapse

(5 minute read)

Over the past 4 months, I’ve been starting a new technology company focused on predicting and preventing addiction relapse.

Then yesterday, on Christmas morning, I found out that one of my teammates had overdosed. Overnight late last week, he passed away in his sleep. The next morning, his mother found him lying in his bed, his glasses still on… his body cold. He was 23 years old.

My teammate [on the left] as a child in Michigan with his friends. Out of respect for his family, he will not be named.

On Christmas morning, as I sat for a half-hour on the phone with his mother, listening to her crying hysterically… still mourning the loss of her son… she told me that in the days leading up to his death she had never seen him so happy.

“I’m sorry, I think the phone may have cut out. Did you say ‘happy’?”… “Yes. He told me about the promotion you gave him last week. He was so excited. He said it was his dream job… that he finally felt successful. He told me ‘Mom, I want to make you proud’.”

The Story, As I Understand It

Speaking with his childhood friends thereafter, I learned that, for at least the past decade, he’d struggled with addiction, depression, insomnia, and anxiety. One of his friends from middle school told me he started seeing a therapist for it all when he was 15 or 16. A second friend told me “he took Xanax a lot”. A third told me he “had a thing with opioids” and “kind of had an alcohol addiction”. He never told me any of this.

His mother explained that her husband, his father, has struggled with “these issues” his entire life. I know from my own research that addiction is a complex brain disease originating in the reward circuitry of the brain, and that genetics account for 50-75% of the risk. As one addiction psychiatrist put it, “people may choose to take drugs, but nobody chooses to be an addict”.

According to CASA Columbia’s 2012 report on Addiction Medicine, risky substance use and addiction are the largest preventable and most costly public health and medical problems in the US today. Together they are the leading causes of preventable death, cause or contribute to more than 70 other medical conditions, and result in total costs to the government alone of at least $468 billion each year.

Two days before his death, I promoted him to lead our engineering team. While we had met only 3 months earlier, his skillset was incredibly impressive. He was a full-stack engineer. A product manager. A digital designer. On our last Zoom call, he showed more initiative than he ever had before… volunteering to stand up our technology live in the cloud on Amazon Web Services (AWS), to help with customer discovery, to reach out to every investor he knew. It was just last year that he graduated from the University of Michigan with a degree in Computer Science… but boy did he know how to hustle.

The night he passed, I set him up with a new email on our company’s Google Admin console. Then we texted to coordinate him setting up a new AWS account for the company. I emailed him asking if he could take ownership over finding us the AWS credits we needed in the most cost-effective way possible. He emailed back in less than 3 minutes, saying “Yup, got it covered. Have rough estimates of the cost, will let you know once it’s finalized.”

Then I never heard from him again.

When Facts Become Feeling

The thing I just can’t get over is the irony of it all. He overdosed while building a tool to prevent addiction relapse. On one hand, it makes absolutely no sense. On the other hand, it makes perfect sense… but in a way that I can’t exactly wrap my head around.

It gives me chills to think that I was likely the last person he communicated with before he died… that our work together was one of the last things he thought about. The night he passed, he told his Mom he would be “up late, finishing a project”. Nothing unusual: he was a hard worker who, when he couldn’t sleep, often worked through the night. Nobody thought it would be his last.

Contrary to popular belief, addiction has nothing to do with a lack of willpower or moral weakness. In fact, 48% of all Americans over the age of 12 are directly affected, with 16% meeting criteria for ‘clinical addiction’, and 32% more classifying as ‘risky users’.

Over the past 2 months I’ve been learning everything I can about addiction. I just finished a 2012 report by CASA Columbia which spanned 250 pages. I’ve been listening to people’s struggles first-hand, through attending addiction support groups — like AA, NA, SMART, and Refuge Recovery — several times per week in my area. I’ve even been holding near-daily meetings with addiction psychiatrists, to learn everything they know about treating and managing the disease.

And yet, despite all of this, I couldn’t see it right in front of my own face… on my own team.

Everything I’ve been reading… everything I’ve been learning… did not truly come alive until just yesterday, when I was on the phone with the mother of my teammate, listening to her sobbing over the death of her only son.

Our mission just got real.

What Comes Next

While we’re still an early-stage startup in stealth-mode, this unexpected loss has me feeling the need to share a bit about where we’re headed.

The one-liner is this: my company, Conscious Insights, is a consulting group developing an AI technology to predict and prevent substance-use disorder relapse using passive meta-data from patients’ smartphones.

Our mission is to build the ‘thermometer for the mind’; enabling care providers to ‘check the mental temperature’ of their patients (with explicit permission) — in an objective, continuous, ecological, and passive fashion — at any time. This way, they can determine which patients require heightened attention and intervention, in advance of relapse.

In my teammate’s case, our technology could have potentially let his mother and doctor know he needed help, days or weeks in advance.

Our mission is to build the ‘thermometer for the mind’

And while there is still a ton of work to do, what started as ‘a crazy idea’ just a few months ago is starting to come together.

In March 2020 Conscious Insights will be kicking off a 300+ person clinical study at several Community Health Centers across the state of California. It will be the first study of its kind, and the largest to ever use technology to predict/prevent addiction relapse.

This is why my teammate was up late that night. Because he saw setting up our AWS account as the first step in a larger opportunity to help hundreds more with the same struggle that’s plagued him and his father their entire lives.

I just wish I knew before it was too late. For others in the future, it’s our mission to change that.

Join the Mission

If you were moved by this story and are interested in supporting our mission, here are some ways you can get involved:

  1. RESEARCH WITH US: We are seeking additional research partners (e.g. health centers, treatment clinics) to join our upcoming clinical studies. In exchange for partnering with us, we are offering to provide research staff, obtain IRB approval, grant early access to the eventual product, and fully compensate all parties involved for their time. Not to mention the opportunity to be recognized as a leader in the field of addiction medicine as we learn, together, what early warning signs exist for addiction relapse… and ultimately develop a tool that alerts trained medical professionals to intervene before it’s too late.
  2. BUILD WITH US: We are openly hiring for paid Data Science, Machine Learning, and Backend Engineering positions on our team. We have several part-time roles (10-20 hours/week) starting in January 2020 with our consulting services business, which you can apply for here. Likewise, we have several full-time positions (40 hours/week) starting in August 2020 (involving signal processing analysis with smartphone meta-data) for this new product business.
  3. SPEAK WITH US: We are actively recruiting advisors, with stock-options, primarily across three different areas:
    1. (A) experts in treating substance-use disorders (e.g. addiction psychiatrists),
    2. (B) health professionals in managing addiction (e.g. nurse care managers),
    3. (C) healthcare leaders whose systems financially ‘bear risk’ for addiction relapse (e.g. capitated insurance payers)

If any or all of the above apply to you, please fill out this short Google Form and someone from our team will contact you shortly:

Conscious Insights - Interest Form

Alternatively, if none of the above apply to you, but you are still interested in helping support our mission, please fill out the form with how you’d like to get involved, and we’ll be in touch.

Fixing our country’s broken model of care is going to take a village, so thank you in advance for your interest in getting involved.


Psychedelics & The Future of Mental Health Treatment


How many of you know someone who’s depressed despite taking anti-depressants? Or have people in your life affected by addiction – opioid, alcohol, or otherwise?

Well, there’s hope.

The scientific research that continues to come out around the use of psychedelics to treat depression, anxiety, opioid addiction, alcoholism, and a whole host of other mental disorders… is nothing short of amazing.

In two such examples, cited in the video below:

1) [4:10-7:10] In the largest study to-date examining the effect of psilocybin on depression and anxiety in individuals with a life-threatening cancer diagnosis (people freaking out because they’re going to die), a single high-dose session of so-called magic mushrooms resulted in a sustained reduction of depression and anxiety, from clinically-severe levels (23/25, 26/30) to nearly-symptom-free levels (6/25, 7/30) a full 6 months out.


To put this in perspective: current depression medication (most commonly SSRIs) hasn’t evolved since the 1980s, requires people to swallow a pill every day, comes with a long list of side-effects, and does absolutely nothing for the 1/3 of depressed adults with treatment-resistant depression (TRD).

2) [7:10-9:00] In a pilot study examining the effect of psilocybin in the treatment of tobacco addiction (people who want to but cannot quit smoking), three low-doses of mushrooms led to 80% of participants being biologically-confirmed (e.g. breath samples, urine samples) as smoke-free 6 months out. And 60% were still smoke-free 2.5 years after their target quit-date.


Comparatively, the best FDA-approved medication we currently have in treating tobacco addiction is less than half as effective, averaging 35% abstinence 6 months out.

And the best part?

Imperial College London just announced they’re launching the world’s first Centre for Psychedelic Research (2.5 minute teaser video here) so it appears this is just the beginning.

As someone with a family history of mental illness, who’s had my own fair share of battles with depression, and lost family/friends to the addictions described above… watching this video and reading these studies, I can’t help but feel incredibly hopeful for the future.

The future of mental health treatment is bright.

My Appearance on the KYŌ Conversations Podcast


This week marks one full year since I launched my podcast, which makes it fitting that my first appearance on another show just went live!

I’m incredibly excited to share this conversation I had with Marc Champagne on the KYŌ Conversations Podcast:

Marc asked some very thought-provoking questions, and in just 45 minutes we had the chance to walk through much of my story leading up to today. We covered topics like:

1) how/why I started my first business at 12 years old

2) driving myself into a dark depression through 6 months of 80+ hour weeks on my first tech startup

3) the friends and mentors who helped me through the darkest period of my life

4) starting to meditate every day 1044 days ago, and how the daily streak is still going today

5) what I see at the intersection of data science <-> mindfulness and why I’m obsessed (hint: see episode title)

6) my current step-by-step 3-hour morning routine (including meditation / journaling / exercise / eating / etc), and how it’s iterated over the years

This is just the beginning!


Internal vs. External Problems


When faced with what appears to be an insurmountable or overwhelming challenge, I go through the following mental checklist:

  1. How’d you sleep last night?
  2. When was the last time you ate?
  3. Did you exercise this morning?
  4. Have you meditated yet today?

And in the process I’ve found, time and again, that problems are rarely external.

The situations I think ‘need to change’ are, 80% of the time, an issue of self-care.

So the next time you’re faced with a feeling of sadness, frustration, confusion, or doubt, try asking yourself:

  1. Is this an external or internal problem?
  2. Am I tired, hungry, sedentary, or distracted?
  3. Could this situation be resolved, or more easily navigated, by first attending to these needs?

Before focusing on what’s external, let’s take care of ourselves.

1000+ Bookmarks Later: These Are The 5 Most Influential Articles I’ve Read In the Past 5 Years


Man has the internet taught me a lot.

Today I found myself digging through a time-capsule worth of bookmarks from the past 5 years in search of a master list of design resources requested by a co-worker.

In the process of all this digging, I came across countless articles I’ve since forgotten existed, yet at the time of reading were nothing short of mind-expanding for me.

It’s funny how learning works. When we really learn something, the lesson becomes part of who we are. But somewhere along the way, we tend to forget the source.

Perhaps this is where the myth of a self-made person comes from? In ourselves and others, all we ever see is the end-result; too easily forgetting all the people that’ve helped along the way.

As I look back on the 1000+ articles I’ve read and bookmarked over the past 5 years, I thought I would share the 5 that have been most influential on my thinking.

I’ve chosen these articles because, since reading each of them, I’ve experienced a distinct before/after in how I approach the given topic. And collectively, I would go so far as to say that the lessons I’ve taken away have served as more of an education than school ever could:

1) On making big life decisions:

2) On minimizing regret and living a good life:

3) On asking for, and giving, advice:

4) On the pursuit of mastery:

5) On making things people want:

Here’s to feeling infinitely grateful to all the people we’ve never actually met, and yet thanks to the internet, have permanently changed the course of our lives 🙏

What free article/video has expanded your mind or changed how you see and interact with the world? Share your own favorite(s) in the comments below!

Plot Twist: Why I’m Moving from San Francisco to Portland, Shifting from Conscious Insights to Oak Meditation


Life sure does keep you on your toes.

In what has been quite the unexpected turn of events over the past week, I have some big news to share.

This week I will be packing up my things, moving from San Francisco, California, and putting my business, Conscious Insights, on hold to go work full-time for Kevin Rose at his latest venture, Oak Meditation, as employee #7 / data scientist #1 in Portland, Oregon.

For the past year and a half, I’ve been fully focused on applying data science to mindfulness-technology, and up until now have been convinced that consulting for many companies in this space was the best way to make the most positive impact.

Just 10 days ago, I would have never considered this a possibility.

However, after several face-to-face conversations with Kevin and the Oak team in Portland last week, and an abundance of video chats thereafter, I have been successfully convinced otherwise; given an offer and opportunity that I just cannot refuse.

So instead of working for many companies in this space, I’ll be going all-in on just one.

At Oak Meditation, not only will I be able to continue contributing to the cause I care so much about, but I’ll also have the chance to lead all-things data within this new organization, and build a world-class team around me in the process.

I’ll be growing in tangible and measurable ways each and every day, creating a modern data architecture from the ground up, participating in BoD meetings + VC pitches, weaving data into every aspect of the business, and, most importantly, working alongside just a fantastic bunch of humans.

It’s ironic that, as I write this, I’m on a plane from SFO to NYC for a 2-day Mindfulness in America conference. Since starting to meditate every day 908 days ago (April 27 2016 was when it all began), this practice has found its way to the total center of my work and life.

Climbing to the top of Corona Heights late last night to say goodbye to San Francisco, a friend and I reflected on how much of a turning point that day has been for me. Now, when I think about how deeply the practice has transformed my life, and all the ways it’s allowed me to give back to others’ lives, I feel a sense of appreciation and purpose that’s impossible to put into words.

Working for Oak is the next step in this journey.

The hardest part about this decision was leaving SF: a city I just arrived in 5 weeks ago, and a place where most of my closest friends in the world reside. But alas, after many long-hikes and late-night-hangouts over the past week, I feel — deep in my bones — this is what I’m meant to do.

So Portland, here I come.

Portland, Oregon and Mount Hood from Pittock Mansion

Simple Truths: My Guiding Signals in a World of Noise


One of the best lessons I’ve learned over the past few years is that — when faced with a difficult decision or challenging situation — rarely is more information the answer.

Instead, what I’ve found to be so incredibly helpful is this idea of “simple truths” (what some may call “first principles”): short snippets of wisdom that have been gathered, carefully curated, and repeatedly learned over years of life experience.

I originally learned of the concept from a Swiss-born British philosopher named Alain de Botton, whose ideas on education reform are fascinating, and who is often quoted as saying “we overeducate ourselves out of simple truths”.

As Alain explains, school teaches us that once we know something, that’s it; you know it, and it’s time to move onto the next chapter. But this is dangerous, because it leads us to believe we understand more than we actually do. That is, knowing something in your head is entirely different than feeling it in your bones.

To truly understand requires repetition; repeatedly learning the same lesson over and over again until it translates from conceptual understanding to daily practice. Understanding an idea is on a much lower dimension than acting on it. For example, knowing that daily exercise is good for you is much different than actually going to the gym every day.

So every time I move into a new place (5 times and counting over the past 3 years), I start a new wall of simple truths. And then, over the course of my time there, I gather these ‘simple truths’ from all sources: conversations with friends & mentors, books I read, podcasts I listen to, or even just experiences I have. Then, every time I’m faced with a difficult decision or challenging situation, I return back to the wall as my source of guiding light.

Friends who know me well (and have received my countless text messages sharing new additions to the wall) often give me shit about my seeming obsession with “post-it note wisdom”. But in a way, this is my religion. The difference is, instead of becoming defensive or dogmatic about it, I start over more than once a year. I’m always beginning again, gathering new lessons and repeatedly learning them until I can’t *not* remember.

Truthfully, up until now I’ve been pretty insecure about sharing these simple truths outside a core group of friends… mostly because I realize that 95% of them won’t resonate with others in the same way they make sense to me. But, as I take down my 5th wall in the past 3 years, I feel a need to be more vulnerable than I’m naturally willing, and share.

For the past 11 months, these 25 simple truths (below) have been my guiding signals in a world of noise. There’s nothing complex about them, but they’ve been so very helpful for me in what may have otherwise felt like hopeless situations.

My hope is that, even if 24/25 pass you by, just 1 (4%) sticks with you; enough that you’re able to feel it in your bones, and not just know it in your head whenever you need it most. After all, as Derek Sivers says, “if information was the answer, we’d all be billionaires with perfect abs”.

AJ's Simple Truths 2017-18

Announcing the Data Journeys Podcast


I am thrilled to announce the official launch of my new podcast, Data Journeys:

Data Journeys is a podcast for aspiring Data Scientists where I’ll be interviewing world-class Data Scientists about their learning journeys.

In each episode, the goal is to have them tell their story and equip up-and-comers with the strategies, tactics, and tools that the best in the world have used to get to where they are today.

I’m speaking with guests ranging from the US Military to Silicon Valley, from the top-ranks of academia to down-under in Australia, with a focus on how they’ve bridged the gap between acquiring technical skills and creating real-world impact.

For example, two upcoming guests are Andrew Ng — the co-founder of Coursera — at Stanford University and Fernando Perez — the creator of Jupyter Notebooks — at UC Berkeley.

You can listen or subscribe to the show via links to iTunes, Soundcloud, Google Play Music, and more at:

Enjoy the show!

Deconstructing Data Science: Breaking The Complex Craft Into Its Simplest Parts

This is the SECOND in a series of posts on applying Tim Ferriss’ accelerated learning framework to Data Science. My goal is to become a world-class (top 5%) Data Scientist in < 6 months, while open-sourcing everything I find & learn on the way.

The purpose of this post is to empower others to start accelerating their own learning by:

  1. deconstructing the complex craft of Data Science into its simple micro-skills
  2. identifying the 20% of skills that contribute to 80% of outcomes

And if you stick around until the end, you’re in for a special treat.

Estimated reading time: 15 min (to save you hours of spinning in circles 😉)

The Problem


A simple Google search of “how to learn Data Science” returns thousands of learning plans, degree programs, tutorials, and bootcamps. It’s never been more difficult for a beginner to find signal in the noise.

Everyone seems to have a different opinion, and the only common approach appears to be dumping a long list of courses to take and books to read, all the while providing little to no context into how these concepts fit into the bigger picture.

This post is my attempt to convert all the buzzwords & fluffy terminology into explicitly-learnable skills. To do this, I’ll be walking through my application of the first two steps to Tim Ferriss’ accelerated learning framework: Deconstruction & Selection.

Rather than jump right into a roadmap of my own learning journey (that’ll be next post), I want to empower you to begin your own. And if you haven’t read my first post, I’d highly recommend starting there:

Deconstruction: The Data Science Process

“The whole is greater than the sum of its parts.” – Aristotle

DS Deconstructed
I’ll be walking through this infographic step-by-step below

It’s true: Data Science is not a single discipline, but a craft at the intersection of many. So in order to appreciate how the seemingly disparate puzzle pieces fit together, I present to you a story. It’s called “The Data Science Process”, and it has six parts:

  1. Frame the problem: who are you helping? what do they need?
  2. Collect raw data: what data is available? which parts are useful?
  3. Process the data: what do the variables actually mean? what cleaning is required?
  4. Explore the data: what patterns exist? are they significant?
  5. Perform in-depth analysis: how can the past inform the future? to what degree?
  6. Communicate results: why do the numbers matter? what should be done differently?

But before we begin, a couple quick caveats:

1) In large organizations, “The Data Science Process” is often carried out by an entire team, not a single individual. An individual can specialize in any one of the six steps, but for simplicity, we’ll be assuming a one-person team.

2) The insights that follow are a compilation of various expert interpretations; not my original ideas. I am not (yet) an expert Data Scientist, but over the past 6 weeks I’ve learned from many. Thus, I’m simply serving as the filter between hundreds of hours of research and the actionable insights you’ll find below.

In particular, I’ll be pulling from favorite online articles (linked throughout) and conversations with the following 10 experts:

  1. Chris Brooks — Director of Learning Analytics at the University of Michigan
  2. Andrew Cassidy — Freelance Data Scientist & Online Educator
  3. Jim Guszcza — US Chief Data Scientist at Deloitte Consulting
  4. Kirk Borne — Principal Data Scientist at Booz Allen Hamilton
  5. Michael Moliterno — Data Scientist + Design Lead at IDEO
  6. Chris Teplovs — Research Investigator at the University of Michigan
  7. Jonathan Stroud — Co-Founder of the Michigan Data Science Team (MDST)
  8. Josh Gardner — Data Science Research Associate, Team Leader on MDST
  9. Jared Webb — PhD Candidate in Applied Math, Data Manager at MDST
  10. Alex Chojnacki — Data Application Manager for Flint-Water-Crisis project

And to bring each step of the process to life, I’ll be using my work at Calm.com, Inc. in San Francisco this summer as a real-world case study.

While there, I leveraged analytics insights from Calm’s database of 11 million users to develop & launch Calm College — the first US platform geared toward using mindfulness to improve college student mental health.

Alright, let’s get started!

Step One: Frame The Problem


The first step of The Data Science process involves asking a lot of questions.

The exact manner in which you do this will depend on the context in which you’re working, but whether you’re in the private sector, public sector, or academia, the key idea is the same: before you can start to solve a problem, you have to deeply understand it.

Your goal here is to get into the client’s head to understand their view of the problem and desired solution. In the case of a corporation, this will first involve speaking with managers & supervisors to identify the business priorities and strategy decisions that’ll influence your work.

It’s not uncommon for the first request that a Data Scientist receives to be entirely ambiguous (e.g. “we want to increase sales”). But it’ll be your job to translate the task into a concrete, well-defined data problem (e.g. “predict conversion rate & return-on-investment across customer segments”).

This is where domain knowledge and product intuition are crucial. Speaking with subject-matter-experts to cut through confusing acronyms & dense terminology can be incredibly helpful here. And familiarizing yourself with the product/service will be essential to understanding the intuition behind metrics.

For example…

With Calm College, the ambiguous request we started with was to establish partnerships with universities to offer the Calm app as a student wellness resource.

To better understand our specific domain, we started by spending two weeks speaking on the phone with as many college administrators as possible.

We asked questions like:

  • How would you describe the mental health climate on your campus?
  • How high of a priority is improving student mental health?
  • What main resources do you currently offer students?
  • What have been the greatest challenges?
  • Is there precedent for offering 3rd party services?

By the time we got to the final question, nearly every administrator had described their campus’ mental health climate as nothing short of “toxic”, and expressed improving it as their #1 priority.

They explained that the greatest challenge to students seeking help has been overcoming logistical issues (i.e. wait-time, transportation, & money) with the counseling services they currently offer.

Finally, here’s where our ambiguous request became a data problem…

Administrators told us that, before a 3rd party service can be adopted, precedent requires evidence supporting its use. In other words, showing that students on campus are already using the Calm app would be crucial to getting a deal done.

Step Two: Collect Raw Data


The second step of the Data Science Process is typically the most straightforward: collect raw data.

This is where your first technical skill — querying structured databases with SQL — comes into play. But fret not; it’s not as complicated as it may sound.

Here’s an awesome tutorial by Mode Analytics that’ll get you started with SQL in just a couple hours.

More important than the querying itself, however, is your ability to identify all the relevant data sources available to you (e.g. web, internal/external databases) and extract that data into a useable format (e.g. .csv, .json, .xml).

Oftentimes, an analysis requires more than one dataset, so you’ll likely need to speak with backend-engineers in your organization who are more familiar with what data is being collected and where it currently resides. Communication is key.

For example…

With Calm College, this required me sitting down with Calm’s lead engineer and exploring ways to pull usage data for specific college campuses.

Ultimately, I found out that we could simply query user activity by email address and school location. So for the University of Michigan, for example, I simply searched the database for emails ending in “” or locations listed as “Ann Arbor, MI”.

This approach wasn’t foolproof (turns out not all students were using their school email) but it did the job by giving us a representative sample of ~1000 users per college to compare different campuses’ activity head-to-head.
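
To make that concrete, here’s a minimal sketch of this kind of lookup using Python’s built-in sqlite3 module. Everything specific in it is an assumption for illustration: the users table, its columns, the sample rows, and the example.edu domain are made up and are not Calm’s actual schema or data.

```python
import sqlite3

# Hypothetical schema and rows, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, email TEXT, location TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [
        (1, "ana@example.edu", "Ann Arbor, MI"),
        (2, "ben@gmail.com", "Ann Arbor, MI"),
        (3, "cat@example.edu", "Chicago, IL"),
        (4, "dan@gmail.com", "Boston, MA"),
    ],
)


def campus_users(conn, email_suffix, location):
    """Return ids of users whose email ends with the suffix OR whose location matches."""
    query = """
        SELECT id FROM users
        WHERE email LIKE ? OR location = ?
        ORDER BY id
    """
    # The leading % lets LIKE match any text before the suffix.
    rows = conn.execute(query, ("%" + email_suffix, location)).fetchall()
    return [row[0] for row in rows]


matched = campus_users(conn, "@example.edu", "Ann Arbor, MI")
print(matched)
```

Passing the suffix and location as parameters (the ? placeholders) rather than pasting them into the query string keeps the same function reusable for every campus you want to compare.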

Step Three: Process The Data


The third step of the Data Science Process is the most underrated: process the data.

This is where a scripting language like Python or R comes into play, and a data wrangling tool like Python’s Pandas is absolutely indispensable.

To get started, here’s a breakdown of Python vs. R, intro to Python on Codecademy, 10-minute tutorial to Pandas, and colorful data wrangling cheat-sheet.

Data cleaning is typically the most time-intensive part of data wrangling. In fact, in expert surveys it’s been estimated that up to 80% of a Data Scientist’s time is spent here: cleaning & preparing the data for analysis (more on this below).

The reason this can be so time-consuming is that, before you can analyze data, you have to go column-by-column, developing an understanding of the meaning of every variable and then checking for bad values accordingly.

The tricky part is that a bad value can be defined as many things: input errors, missing values, corrupt records, etc. And once you’ve identified a “bad value”, you have to decide whether it’s most appropriate (given the situation) to throw it away or replace it.
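
Here’s a small pandas sketch of that throw-away-or-replace decision. The column names and records are made up for the example, and the choices it makes (drop rows missing the key measurement, label missing categories as “unknown”) are just one reasonable option, not a rule.

```python
import pandas as pd

# Made-up records containing a missing value, an input error, and a duplicate.
raw = pd.DataFrame(
    {
        "user_id": [1, 2, 3, 3, 4],
        "sessions": [12, None, -5, -5, 7],
        "plan": ["free", "paid", "free", "free", None],
    }
)

# Drop exact duplicate rows (one form of corrupt record).
clean = raw.drop_duplicates().copy()

# Treat impossible values as missing: a session count can't be negative.
clean.loc[clean["sessions"] < 0, "sessions"] = float("nan")

# Throw away rows where the key measurement is missing...
clean = clean.dropna(subset=["sessions"])

# ...but replace a missing category with an explicit label instead.
clean["plan"] = clean["plan"].fillna("unknown")

print(clean)
```

Which branch you take depends on the variable: dropping is safer when a value can’t be guessed, while replacing keeps the row available for analyses that don’t depend on that column.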

For example…

With Calm College, I faced two significant roadblocks here:

  1. There was little to no company documentation on database variables
  2. I didn’t know Python’s Pandas and felt too intimidated to try and learn

Each of these presented its own challenge:

  1. It took me several days to figure out how to define an “active user” (i.e. should ‘active’ mean opening the app, starting a session, or completing a session?)
  2. I had to use an analytics tool called Amplitude rather than coding in a script file.

After talking with Calm’s Product Manager, I was able to define an active user as someone who “starts a meditation session” and identify the right variables. Then I had to clean the data by filtering out students who hadn’t been active in the last 365 days.

The thought process here was that administrators (i.e. our client) would primarily be interested in student activity from the past academic year, and non-active students (i.e. “null” values) were outliers that, if included, would only skew the results.
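
For anyone working in a script rather than an analytics tool, a filter like this might look roughly as follows in pandas. This is only a sketch under assumptions: the session log, its column names, and the dates are invented, and the reference date is fixed so the result is reproducible.

```python
import pandas as pd

# Hypothetical session log; column names and dates are invented.
sessions = pd.DataFrame(
    {
        "user_id": [1, 1, 2, 3],
        "session_started_at": pd.to_datetime(
            ["2017-03-01", "2016-01-15", "2015-11-30", "2016-09-10"]
        ),
    }
)

# Fixed reference date so the example is reproducible.
as_of = pd.Timestamp("2017-06-01")
cutoff = as_of - pd.Timedelta(days=365)

# "Active" = started at least one meditation session within the last 365 days.
last_start = sessions.groupby("user_id")["session_started_at"].max()
active_ids = last_start[last_start >= cutoff].index.tolist()

print(active_ids)
```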

Noticing a theme here? It’s about your clients’ interests, not your own.

Step Four: Explore The Data 


The fourth step of the Data Science Process is where you explore the data, and the real adventure begins.

This is where the core competency of scientific computing (i.e. Python’s numpy, matplotlib, scipy, & pandas libraries) comes into play.

To begin, here’s an awesome breakdown of the “SciPy ecosystem” (a collection of libraries in Python), extensive guide to data exploration, and a conceptual handbook of assumptions/principles/techniques.

Using these libraries, you’ll split, segment, & plot the data, in search of patterns. Thus, the key is becoming really comfortable with producing quick & simple bar graphs, box plots, histograms, etc. that’ll let you catch trends early on.

Remember that analysts who produce beautiful externally-facing visualizations often have to iterate through hundreds of internally-facing ones first. So playing around with possibilities in this way is more of a guess-and-check art than a hard-and-fast science.

Finally, once you’ve identified some patterns, you’ll want to test them for statistical significance to determine which are worth including in a model. This is where a strong grounding in inferential statistics (e.g. hypothesis testing, confidence intervals) and experimental design (e.g. A/B tests, controlled trials) is essential.
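
As a rough illustration of hypothesis testing, here’s a minimal sketch of a two-sided permutation test for a difference in means. To stay dependency-free it uses only Python’s standard library rather than SciPy, and the two groups of numbers are invented for the example.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        # Shuffle the labels: under the null hypothesis they don't matter.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Invented sessions-per-month figures for two groups of users.
campus = [14, 16, 15, 17, 13, 18, 16, 15]
baseline = [9, 11, 10, 8, 12, 10, 9, 11]

p_value = permutation_p_value(campus, baseline)
print(f"p-value: {p_value:.4f}")
```

The idea is simple: if the group labels were meaningless, shuffling them should produce a gap as large as the observed one fairly often. A small p-value says it rarely does.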

For example…

With Calm College, I started by exploring factors that would influence a potential partnership: monthly engagement, week-by-week retention, and subscription rate.

My hypothesis going in was that elite schools known for student stress (e.g. Cornell, Harvard, MIT) would have significantly higher numbers across the three statistics. Or, in other words, I suspected that stressed-out kids need more calm.

To test this, I began by segmenting universities into their regional groups and then splitting areas into specific college towns. From there, I was able to compare the statistical significance of schools’ activity across local, regional, and national averages.

After several iterations of my experimental design (and hundreds of internally-facing visualizations), I found what I was looking for: a list of outlier schools that we would ultimately call “Calm’s Most Popular Colleges”.

Step Five: In-Depth Analysis


The fifth step of the Data Science process is where you create a model to explain or predict your findings.

This is where most people lose the forest for the trees, as they enter into the land of shiny algorithms and fancy mathematics. Creating models is by far the most over-glorified part of Data Science, which is why most degree programs solely focus on this single step.

But before jumping into a particular solution, it’s important to pause and return to the bigger picture by asking yourself: “what am I really trying to do and why does it matter?”

From here, you’ll:

  1. apply your knowledge of algorithms’ contextual pros/cons to choose one approach best-suited for the situation
  2. carry forward statistically significant variables (from the exploratory phase) using what Data Scientists call “feature engineering”
  3. use a machine learning library like scikit-learn for implementation.

The overall goal is to use training data to build a model that generalizes to new (unseen) test data. So while building, it’s important that you’re keenly aware of (and capable of recognizing) overfitting and underfitting.
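
One way to see overfitting for yourself is to compare error on training data against error on held-out test data. The sketch below does this with NumPy polynomial fits instead of scikit-learn, purely to keep it short. The data is synthetic (a straight line plus random noise), and the degrees 1 and 9 are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: a straight-line trend plus noise (not real measurements).
x = np.linspace(0, 1, 30)
y = 2.0 * x + rng.normal(scale=0.3, size=x.size)

# Hold out every third point as unseen test data.
test_mask = np.arange(x.size) % 3 == 0
x_train, y_train = x[~test_mask], y[~test_mask]
x_test, y_test = x[test_mask], y[test_mask]


def mse(coeffs, xs, ys):
    """Mean squared error of a polynomial with the given coefficients."""
    return float(np.mean((np.polyval(coeffs, xs) - ys) ** 2))


# Maps polynomial degree -> (training error, test error).
results = {}
for degree in (1, 9):
    coeffs = np.polyfit(x_train, y_train, degree)
    results[degree] = (mse(coeffs, x_train, y_train), mse(coeffs, x_test, y_test))
    print(degree, results[degree])
```

The more flexible model will always fit the training points at least as well as the simple one, so training error alone tells you very little. The number to watch is the gap between training error and test error: when it widens as the model gets more complex, that’s the signature of overfitting.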

Here are some amazing free videos from Andrew Ng’s Machine Learning course and Harvard’s CS109 “Intro to Data Science” class that will teach you how to do this for different algorithm types. A great place to practice is through Kaggle tutorials.

NOTE: I’d recommend starting by watching just one or two videos on a simple model type like logistic regression or decision trees, and then immediately applying what you’ve learned on a dataset you care about.

For example…

With Calm College, the model I was building was more “explanatory” than “predictive”.

That is, I was simply trying to identify the universities most suitable for a partnership and understand what factors about a school were contributing to that.

So what I ultimately built was a simple linear regression model (in Excel, no less) that used features like active user count, student enrollment, & university endowment to explain a university’s user activity over time.
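
For readers who’d rather do this in code than in Excel, here’s a minimal sketch of fitting a linear regression by ordinary least squares with NumPy. It is not the actual Calm model: the “schools” are synthetic, only two of the features are used, and the target is generated from a known formula so you can check that the fit recovers it.

```python
import numpy as np

# Synthetic "schools": values are invented, not Calm data.
enrollment = np.array([7000.0, 21000.0, 11500.0, 46000.0, 15000.0])
endowment_bn = np.array([37.0, 9.6, 16.5, 11.9, 7.2])

# Generate the target from a known formula so the fit can be checked.
true_coeffs = np.array([50.0, 0.02, 3.0])  # intercept, enrollment, endowment
X = np.column_stack([np.ones_like(enrollment), enrollment, endowment_bn])
monthly_sessions = X @ true_coeffs

# Ordinary least squares: find coefficients minimizing squared error.
fitted, *_ = np.linalg.lstsq(X, monthly_sessions, rcond=None)

for name, value in zip(["intercept", "enrollment", "endowment_bn"], fitted):
    print(f"{name}: {value:.4f}")
```

In an explanatory model like this, the coefficients are the point: each one says how much the outcome moves when that feature changes and the others are held fixed.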

Sure, building a predictive model would’ve been the “cool” thing to do, but the goal wasn’t to predict sales leads for the future; it was to establish partnerships with universities NOW.

Lesson learned: the job of a Data Scientist is NOT to build a fancy model; it’s to do whatever it takes to solve a real-world human problem. 

Step Six: Communicate Results


The sixth step of the Data Science Process is where you bring it all together and communicate results.

This is where you practice the most underrated skill in the Data Science toolbox; the X-factor that separates the good Data Scientists from the great ones: data storytelling.

Speaking with experts, I heard it time and time again: your worth as a Data Scientist will be ultimately determined by your ability to convert insights into a clear and actionable story.

In other words, the ability to create and present simple, effective data visualizations to a non-technical audience is the most sought after skill in business today.

For a perfect example of how to do it right, here’s the most well-put-together data story I’ve ever seen on “Wealth Inequality in America”.

And here’s a lecture by Harvard’s CS109 that’s a brilliant encapsulation of the art of data storytelling. The professor covers everything from understanding your audience to providing memorable examples. If you don’t have time to watch the lecture, you can check out my Evernote notes that sum it all up.

Finally, to create beautiful data visualizations, I’d recommend going beyond Python’s basic matplotlib library and checking out seaborn (statistical) and bokeh (interactive).

For example…

With Calm College, we had to weave our findings on student activity into an actionable story for campus administrators.

First, I used our list of “Calm’s Most Popular Colleges” to generate sales leads, by reaching out to 50 schools that the model identified as most suitable for a partnership.

Then, for each of the 50 schools, I crafted a personalized story about their students’ activity on the Calm app.

For example, with Harvard, we reached out to the head of campus wellness to let her know that Harvard’s campus was a top 5 most popular college for the Calm app. Then we included 4 graphs depicting the following insights:

  1. 6% of the Cambridge, Massachusetts population (17,000+ people) are Calm users.
  2. More than 82% of Harvard users are active on a monthly basis, with an average of 15 (fifteen!) sessions/month!
  3. Week-by-week retention amongst Harvard users is 3x that of the average Calm user.
  4. Yet, despite all of this, Harvard students’ subscription rate is still well below average.

The first 3 graphs told a story of extraordinary interest in the Calm app on Harvard’s campus. But what really drove home our program was the last point:

“despite all this amazing interest, it’s clear that your students cannot afford Calm’s $60/year subscription. That’s why you need Calm College: to make the Calm app a FREE wellness resource for your students.”

Rather than sell our product, we were selling their students’ past and present use of our product. And it worked like a charm.

Repeating this approach for other colleges, we were able to successfully get our foot-in-the-door at many of the most elite institutions in the country.

And eventually, thanks to this application of The Data Science Process, we were able to launch the program at 8 schools this Fall:

the 8 schools Calm College launched at this Fall

Selection: The Core 20%

“You are not flailing through a rainforest of information with a machete; you are a sniper with a single bull’s-eye in the cross-hairs.” — Tim Ferriss, The Four Hour Chef

The greatest mistake you can make in accelerated learning is trying to master everything. This is not Pokémon. You are not going to catch ’em all.

Instead, the key is being relentlessly focused with the micro-skills you choose to develop. Through rigorous application of the 80/20 rule, it’s possible to cut down a long list of possibilities to the highest frequency material. Then, once you’ve cleared your plate, it’s depth over breadth all the way.

In his book, “The Four Hour Chef”, Tim Ferriss discusses this selection process by introducing the idea of a “Minimum Effective Dose” (MED). Simply put, an MED is the smallest dose that will produce a desired outcome.

Here, I’ve broken down the MED for all 6 steps of The Data Science Process:

The Core 20
the 20% of Data Science skills that result in 80% of outcomes

In conversations with experts, these 8 skills continuously came up as the most essential.

In particular, Data Wrangling (i.e. Python’s Pandas) was said to be the #1 skill (in terms of time spent doing) by every Data Scientist I spoke with. Data cleaning is not sexy, but it makes up as much as 80% of the job.

You may be wondering where big data tools like Hadoop & Spark, or modeling techniques like neural networks & deep learning fall into all this. The answer: surely outside the core 20%.

To my surprise, many Data Scientists I spoke with emphasized that only a small percentage of companies have data that even requires something as complex as a neural network!

Instead, an overwhelming majority of employers need simpler services like data cleaning, exploratory analysis, and logistic regression models (as recently reflected in an industry-wide survey by Kaggle).

When choosing what to learn, remember: you can always revisit the heavier topics later, but don’t weigh yourself down at the start. The goal is to accelerate learning. So wait until your house of expertise has a strong foundation before adding the shiny stuff.

If you’re looking to master the fundamentals of Data Science in 6 months or less, you’ll want to simply focus on the core 20%.

Next Steps

“Live as if you were to die tomorrow. Learn as if you were to live forever.” — Mahatma Gandhi

I do not believe knowledge is useful for its own sake; it’s only useful if you use what you’ve learned to improve your life, or the lives of others. So I would encourage you to pause, reflect, & ask yourself: “what’s the smallest possible action I can take right now with what I’ve learned?”

For instance, a great place to start would be picking one of the six steps you’re most interested in and exploring the skills/resources associated with it. Then find a dataset that’s of interest to you and start learning by doing through a mini-side-project.

The key is trusting yourself by following the path that you’re instinctually most drawn to… because that’s where you’ll find the most short-term motivation & long-term fulfillment.

Personally, after deconstructing data science and identifying the core 20%, I decided to enroll in Springboard’s Data Science Intensive online bootcamp (recently renamed to “Intermediate Data Science”). I chose this program because it was the only curriculum I could find that covered all 6 steps of the data science process while focusing in on all 8 skills of the core 20%.

For more information on the program, I’d recommend checking out Raj Bandyopadhyay’s brilliant Quora answers (here and here) on the methodology behind Springboard’s approach to Data Science education. And here’s a discount code for $100 off any Springboard course.

Whatever you choose to do with this information, the important thing is that you do something. Getting started is always the hardest part, so I challenge you to turn intention into action.

Final Thoughts

Over the past few weeks, the power of the internet has sure become apparent. In just the first 7 days, my first post — Learning Without Limits — had 3000+ views from 66 countries around the world. Never did I expect it to spread so far and wide, but I guess I have all of you to thank for that.

So as long as you all continue to pay it forward, I’ll continue to be an open book. As promised, I’ve compiled and will continue to open-source all my favorite resources, insights, and findings via this new page:

All I ask of you is that you share this with people you think would benefit. That’s my call-to-action. Share. Why? Because we’re all in this together and true happiness comes from other people.


To follow along this journey, feel free to drop your email in the sign-up bar below. By signing up, you’ll receive one (just one) email when I’ve posted a new update.

And don’t hesitate to leave any questions, thoughts, or feedback you have in the comments box below. I’d love to hear from you.