#22: Mike Tamir: Identifying Fake News with the Head of Data Science at Uber ATG

“[FakerFact] is not about telling people this is right and this is wrong or this is true and this is false. That’s not even what the algorithm is detecting. It’s about creating room to help people make their own opinion.” – Mike Tamir

Welcome back to season 2 of the Data Journeys podcast!

Mike Tamir is the Head of Data Science at Uber ATG. He is a leader in data science, specializing in deep learning and distributed scalable machine learning, and he’s also a faculty member at UC Berkeley.

Mike has led several teams of Data Scientists in the San Francisco Bay Area as Chief Data Scientist for InterTrust and Formation, Director of Data Sciences for MetaScale, and Chief Science Officer for Galvanize, where he oversaw all data science product development. He also created an MS degree program in Data Science in partnership with UNH.

Mike began his career in academia serving as a mathematics teaching fellow for Columbia University and graduate student at the University of Pittsburgh. His early research focused on developing the epsilon-anchor methodology for resolving both an inconsistency he highlighted in the dynamics of Einstein’s general relativity theory and the convergence of “large N” Monte Carlo simulations in Statistical Mechanics’ universality models of criticality phenomena.

Today’s conversation focused on FakerFact, his AI project for detecting fake news.

Listen or Subscribe

You can listen and subscribe to the show on Apple Podcasts, Google Play, Overcast, Spotify, and Stitcher, or by searching “Data Journeys” on virtually any podcasting platform.

Show Notes:

  • 0:00 First, a life update from AJ. Read about his new opportunity in Portland on his blog.
  • 5:28 What is the evolutionary explanation for why a human’s capacity for careful, rational thought often takes a back seat to emotion? Explained in a comic on the project website.
  • 6:17 Emotions often win over rational thought, and as a result it can be difficult to think clearly about issues we’re passionate about.
  • 7:05 Why people should be aware of their emotional biases, even though it’s not our fault that we have them.
  • 7:50 Why Facebook deleted over a billion fake accounts recently, and why fake accounts, clickbait, blatantly false content, and other forms of fake news are everywhere on social media.
  • 9:10 What mechanisms can we put in place to counterbalance the parts of our nature that compel us to create and engage with content on an emotional level?
  • 9:51 Since a majority of our information is second-hand, how do we distinguish what’s really true?
  • 11:44 How did Mike become motivated to pursue this problem, on top of his full-time job at Uber ATG?
  • 12:45 How can we tackle “fake news” without censorship?
  • 16:40 In the post-Walter Cronkite era, how do we create a sense of credibility and neutrality in our information?
  • 21:00 Why would it be a mistake if the algorithm learned to classify only right- or left-wing content as fake news?
  • 22:19 The algorithm only looks at the title and words on a page, not the URL.
  • 23:15 How Walt (the FakerFact AI) classifies different types of content: satire, journalism, etc.
  • 26:46 How do you strike the balance of entertainment and informativeness in content?
  • 31:10 What features and characteristics define each category of content that Walt identifies?
  • 36:16 What is Walt’s ideal use case?
  • 36:55 You can use the FakerFact Chrome extension to view the “nutrition facts” of the page you’re reading.
  • 37:42 How does research on run-on sentences and other grammatical choices help Walt understand and score an article?
  • 40:34 What techniques were used to train the Walt AI?
  • 42:41 A discussion on the use of wisdom of the crowds in algorithms.
  • 45:30 What makes it difficult to use the wisdom of the crowds when answers are too closely correlated (because of political affiliations or the news cycle)?
  • 46:47 Visit Humanetech.com for tips on regulating your daily notifications and escaping the “24-hour news cycle” to prevent media from controlling your emotions.
  • 50:15 Rapid fire questions!
  • 52:27 Mike’s advice to his 20-year-old self.
  • 52:40 What was his best investment in himself?
  • 53:18 The Deep Learning Book as a starting point for basic literacy in data science.
  • 53:20 Mike, like lots of guests on this show, makes a distinction between things he believes but couldn’t prove right now, and believing things for no good reason.

Show Notes: https://ajgoldstein.com/podcast/ep22

AJ’s Twitter: https://twitter.com/ajgoldstein393/

Mike’s LinkedIn: https://www.linkedin.com/in/miketamir/

Mike’s Twitter: https://twitter.com/MikeTamir

Support the Show

If you enjoyed this episode of Data Journeys, the best way to support the show is by leaving a review on iTunes and sharing on social media using the hashtag #datajourneys.

Questions or Suggestions?

Got a question, suggestion, or do you just want to say hi? Let AJ know what you think by sending him a note on Twitter at @ajgoldstein393

And don’t forget to join the mailing list (below) to be notified about new episodes, blog posts, giveaways, etc.

Enjoy!