Intro to Python for DS
2 birds. One stone.
WIFI: MakeOffices 5Ghz
Password: Internet!23
http://bit.ly/thinkful-dc-python

About Us
TJ Stalcup
• Lead DC Mentor @Thinkful
• API Evangelist @540
• Pokemon Master

About you
• What's your name?
• What do you do?
• Why are you interested in data science or Python?

About Thinkful
Online bootcamp since 2012. We have worked with over 6000 students around the world, paired with over 400 mentors.
We get you ready for a career and guarantee your first job.
92% success rate
Local DC Crew

TONIGHT: Learn Python by Doing
• Learn why DS is a thing
• What is Python?
• How do we use it with a real-world project?
• How do I learn more?

What is a Data Scientist?

Example: LinkedIn 2006
“[LinkedIn] was like arriving at a conference reception and realizing you don’t know anyone. So you just stand in the corner sipping your drink, and you probably leave early.”
-LinkedIn Manager, June 2006

Enter: Jonathan Goldman
• Data Scientist
• Joined LinkedIn in 2006, when it had only 8M users (450M in 2016)
• Started experiments to predict people’s networks
• Engineers were dismissive: “you can already import your address book”

DS Process
• Frame the question
• Collect the raw data
• Process the data
• Explore the data
• Communicate results

Frame the Question
What questions do we want to answer?
• What connections (type and number) lead to higher user engagement?
• Which connections do people want to make but are currently limited from making?
• How might we predict these types of connections with limited data from the user?

Collect the Data
What data do we need to answer these questions?
• Connection data (who is connected to whom?)
• Demographic data (what is the profile of the connection?)
• Engagement data (how do they use the site?)

Process the Data
How is the data “dirty” and how can we clean it?
• User input
• Redundancies
• Feature changes
• Data model changes

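As a rough illustration of the first two kinds of "dirt" (inconsistent user input and redundancies), here is a minimal sketch in plain Python. The records, field names, and the "unknown" marker are all made up for this example; a real pipeline would work on the site's actual data, and likely with a library such as pandas.

```python
# Hypothetical raw connection records. The field names and values are
# invented for illustration and do not come from any real dataset.
raw_records = [
    {"user": " Alice ", "connection": "BOB", "city": "washington, dc"},
    {"user": "alice", "connection": "bob", "city": "Washington, DC"},
    {"user": "Carol", "connection": "Dave", "city": None},
]


def clean_record(record):
    """Normalize free-form user input: trim whitespace and unify case."""
    return {
        "user": record["user"].strip().lower(),
        "connection": record["connection"].strip().lower(),
        # A missing city becomes an explicit "unknown" marker.
        "city": (record["city"] or "unknown").strip().lower(),
    }


def deduplicate(records):
    """Drop redundant rows, keeping the first occurrence of each."""
    seen = set()
    unique = []
    for record in records:
        key = (record["user"], record["connection"], record["city"])
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


cleaned = deduplicate([clean_record(r) for r in raw_records])
for row in cleaned:
    print(row)
```

After normalizing, the first two records turn out to be the same connection, so only two rows remain.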
Explore the Data
What are the meaningful patterns in the data?
• Triangle closing
• Time overlaps
• Geographic overlaps

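Triangle closing means that if two people share a connection but are not connected to each other, they are likely candidates for a new connection. The sketch below counts mutual connections on a tiny made-up graph. The names, the graph, and the `suggest_connections` helper are hypothetical; this is not how LinkedIn implemented it.

```python
from itertools import combinations

# Hypothetical undirected connection graph: each person maps to the set
# of people they are already connected to.
connections = {
    "ann": {"bob", "cat"},
    "bob": {"ann", "cat", "dan"},
    "cat": {"ann", "bob", "dan"},
    "dan": {"bob", "cat"},
    "eve": set(),
}


def suggest_connections(graph):
    """Return (person_a, person_b, mutual_count) for unconnected pairs
    that share at least one connection, strongest suggestions first."""
    suggestions = []
    for a, b in combinations(sorted(graph), 2):
        if b in graph[a]:
            continue  # already connected, no triangle left to close
        mutual = graph[a] & graph[b]
        if mutual:
            suggestions.append((a, b, len(mutual)))
    return sorted(suggestions, key=lambda s: s[2], reverse=True)


print(suggest_connections(connections))
```

Here ann and dan are not connected but share two connections (bob and cat), so they are the only suggested pair; eve has no connections and gets no suggestions.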
Communicate Findings
How do we communicate this? To whom?
• Marketing: this will enable us to sell X more ad space, resulting in X more impressions per day
• Product: this will allow us to build X more features
• Development: this will allow us to grow our team by X
• Sales: this will attract X more premium accounts
• C-Level: this will result in $$$ more revenue

The Result
8M to 450M users in 10 years
Career Whack-A-Mole

Why DS now?
Big Data: datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze

Big Data
• Trend "started" in 2005
• Web 2.0: the majority of content is created by users
• Mobile accelerates this: data per person skyrockets

The Data Problem
We are generating more data every year than existed before...

The Solution
There goes my hero... watch 'em as they code...

Just need to do everything...
• Knowledge of statistics, algorithms, & software
• Comfort with languages & tools (Python, SQL, Tableau)
• Inquisitiveness and intellectual curiosity
• Strong communication skills
It’s all teachable!

Coming Soon...
• Intro to SQL
• Intro to Tableau
• Intro to Statistics
http://meetup.com/Thinkful-DC

Let's Learn Python Tonight
Python for Programming
• Great for Data Science
• Robotics
• Web Development (Python/Django)
• Automation

Let's Learn Python Tonight

    firstName = 'TJ'
    lastName = "Stalcup"
    age = 34  # wow, much old

    print(firstName)                     # TJ
    print(firstName + lastName)          # TJStalcup
    print(firstName + ' ' + lastName)    # TJ Stalcup
    print(lastName + ', ' + firstName)   # Stalcup, TJ
    print(age * 2)                       # 68, hopefully retired

    def greet(name):
        print('Hello,', name)

    greet('Jack')                        # Hello, Jack
    greet('Jill')                        # Hello, Jill
    greet('Bob')                         # Hello, Bob
    greet(firstName)                     # Hello, TJ
    greet(firstName + ' ' + lastName)    # Hello, TJ Stalcup

The Model
Our model is going to be a Decision Tree.
Decision trees predict the most likely outcome based on input.
You can think of it like a computer building a version of 20 questions.

Decision Trees - Golf?

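To make the "20 questions" idea concrete, here is a minimal hand-written decision tree for the golf question. This is only a sketch: the questions, the 70% humidity threshold, and the `should_play_golf` name are assumptions made up for illustration. A real decision tree model learns its questions from data rather than having them typed in.

```python
def should_play_golf(outlook, humidity, windy):
    """Walk a tiny, hand-written decision tree and return True to play.

    outlook:  "sunny", "overcast", or "rainy"
    humidity: relative humidity in percent
    windy:    True or False
    """
    # Question 1: what is the outlook?
    if outlook == "overcast":
        return True  # leaf: overcast days are always "play" in this toy tree
    if outlook == "sunny":
        # Question 2 (sunny branch): is it too humid?
        return humidity <= 70
    if outlook == "rainy":
        # Question 2 (rainy branch): is it windy?
        return not windy
    raise ValueError("unknown outlook: " + repr(outlook))


print(should_play_golf("sunny", 85, False))     # False
print(should_play_golf("overcast", 90, True))   # True
print(should_play_golf("rainy", 60, False))     # True
```

Each `if` is one question and each `return` is a leaf of the tree, so every prediction is a short path of yes/no answers.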
The Notebook
We're going to use a Google-hosted Python notebook to build this model.
The app is called Colaboratory (Collaboration + Laboratory).
http://colab.research.google.com
New Notebook > New Python3 Notebook

Shortcomings
Our model has a few weaknesses:
• Limited inputs
• Assumptions

Data Science @ Thinkful
• Flexible, project-based curriculum to help you become the data scientist you want to be
• You don’t just learn skills, you get to make things
• Mentor support from experts in the industry
• Also, there's a job guarantee

Thinkful Graduates
92% Job Placement Rate
Link for the third-party audit jobs report: https://www.thinkful.com/bootcamp-jobs-stats

Unprecedented Support
• Learning Mentor
• Career Mentor
• Program Manager
• Local Community
• You

Thinkful Two Week Trial
http://bit.ly/dc-ds-trial
• Initial 2-week trial course
• Start with Python and Statistics
• Unlimited Q&A Sessions
• Option to continue with full bootcamp
• Financing & scholarships available
• Offer valid for tonight only
Aaron Lamphere, Trial Program Manager
