Welcome to Data Science!


Discovering Data Science
Course Information

Online Section: Discovering Data Science

Instructors: Jonas Reger & Tamun Hanjra
Class Times: 9am - 1pm CDT M-R
Location: No in-person locations. Lectures and Labs will be on Zoom.
Office Hours: Tuesdays 4-6pm CDT & Thursdays 2-4pm CDT

Announcements

July 26
Lecture
  1. Logistic Regression, K-Means Clustering, and Hypothesis Testing

Don't forget:

  1. Make a copy of, download or print the lecture notes!

July 26
Homework
  1. Week 4 & 5 homework set that covers topics from lectures #12-17.

This is optional work but a great way to practice solving data science-related problems. Posted in the "Data Science Homework" Google Drive Folder.

July 22
Lecture
  1. Linear Regression: Scatter Plots, Correlation, and Residuals

Don't forget:

  1. Make a copy of, download or print the lecture notes!

July 21
Lecture
  1. Confidence Intervals, Data Types, and Regression

Don't forget:

  1. Make a copy of, download or print the lecture notes!

July 20
Lecture
  1. Normal Approximation, Central Limit Theorem, and Sampling

Don't forget:

  1. Make a copy of, download or print the lecture notes!

July 19
Lecture
  1. Discrete Random Variables, Distributions and Normal Approximation

Don't forget:

  1. Make a copy of, download or print the lecture notes!

July 15
Lecture
  1. Object-oriented Programming, Random Variables and Distributions

Don't forget:

  1. Make a copy of, download or print the lecture notes!

July 14
Lecture
  1. Simulation Analysis, Errors and Ethics in Data Science

Don't forget:

  1. Make a copy of, download or print the lecture notes!

July 13
Lecture
  1. Functions in Python, Conditional Probability and Bayes' Rule

Don't forget:

  1. Make a copy of, download or print the lecture notes!

July 12
Lecture
  1. Probability, Control Flow and Simulations

Don't forget:

  1. Make a copy of, download or print the lecture notes!

July 12
Homework
  1. Week 3 homework set that covers topics from lectures #8-11.

This is optional work but a great way to practice solving data science-related problems. Posted in the "Data Science Homework" Google Drive Folder.

July 8
Lecture
  1. Algorithms and Probability

How would you explain to someone in detail how you put on your shirt? An algorithm is a step-by-step, detailed set of instructions to solve a problem, which can be expressed as English sentences (usually as a numbered list) and is a great way to begin solving complex problems.
Probability is the likelihood or chance of an event occurring. This begins a multi-week journey discovering probability and how to simulate probabilistic events.
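
As a minimal sketch of both ideas, the numbered steps in the comments below form an algorithm that estimates a probability by simulation. The die-rolling example and the function name `estimate_probability_of_six` are illustrative assumptions and are not taken from the lecture notes.

```python
import random

def estimate_probability_of_six(num_rolls, seed=None):
    """Estimate P(rolling a 6) on a fair six-sided die by simulation."""
    rng = random.Random(seed)  # seeded generator so a run can be repeated
    # Step 1: start a counter at zero.
    sixes = 0
    # Step 2: roll the die num_rolls times.
    for _ in range(num_rolls):
        roll = rng.randint(1, 6)
        # Step 3: count the roll if it is a 6.
        if roll == 6:
            sixes += 1
    # Step 4: the estimate is the fraction of rolls that were a 6.
    return sixes / num_rolls

estimate = estimate_probability_of_six(100_000, seed=0)
print(estimate)  # should land close to 1/6
```

More rolls generally bring the estimate closer to the true probability of 1/6.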

Don't forget:

  1. Make a copy of, download or print the lecture notes!

July 7
Lecture
  1. Exploratory Data Analysis: Plots

Large tables of numbers can be difficult to interpret, no matter how organized they are. Sometimes it is much easier to interpret graphs than numbers. Histograms and box plots are used as a way to visually represent numerical data.
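
The sketch below shows the numbers behind both kinds of plot, using only the Python standard library. The quiz scores are made up for illustration, and the 10-point bin width is an arbitrary assumption. A histogram is drawn from the bin counts, and a box plot is drawn from the five-number summary.

```python
import statistics
from collections import Counter

# Made-up quiz scores, for illustration only.
scores = [52, 61, 67, 68, 70, 73, 74, 75, 78, 81, 84, 85, 88, 93, 97]

# Histogram: count how many scores fall into each 10-point bin.
bin_width = 10
bin_counts = Counter((s // bin_width) * bin_width for s in scores)
for start in sorted(bin_counts):
    print(f"{start}-{start + bin_width - 1}: {'#' * bin_counts[start]}")

# Box plot: the box and whiskers come from the five-number summary
# (minimum, first quartile, median, third quartile, maximum).
q1, median, q3 = statistics.quantiles(scores, n=4)
five_number_summary = (min(scores), q1, median, q3, max(scores))
print(five_number_summary)
```

Different software may use slightly different quartile conventions, so the first and third quartiles can vary a little between tools.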

Don't forget:

  1. Make a copy of, download or print the lecture notes!

July 6
Homework
  1. Week 2 homework set that covers topics from lectures #5-7.

This is optional work but a great way to practice solving data science-related problems. Posted in the "Data Science Homework" Google Drive Folder.

July 6
Lecture
  1. Exploratory Data Analysis: Summary Statistics and Grouping Data

A groupby operation involves some combination of splitting the object, applying a function, and combining the results. This can be used to group large amounts of data and compute operations on these groups.
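
To make the three steps concrete, here is a minimal sketch of split, apply, and combine written in plain Python. The survey rows and the names `year` and `hours_of_sleep` are made up for illustration.

```python
from collections import defaultdict
from statistics import mean

# Made-up survey rows (group, value), for illustration only.
rows = [
    ("freshman", 6.5),
    ("senior", 7.0),
    ("freshman", 7.5),
    ("senior", 6.0),
    ("junior", 8.0),
]

# Split: collect the values that belong to each group.
groups = defaultdict(list)
for year, hours_of_sleep in rows:
    groups[year].append(hours_of_sleep)

# Apply: compute one summary (here, the mean) per group.
# Combine: gather the per-group results into a single dictionary.
mean_by_group = {year: mean(values) for year, values in groups.items()}
print(mean_by_group)
```

In pandas, the same three steps happen in a call such as `df.groupby("year")["hours_of_sleep"].mean()`, assuming a DataFrame `df` with those hypothetical column names.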

Don't forget:

  1. Make a copy of, download or print the lecture notes!
  2. Here's a cleaned dataset generated from our "Hello" Survey!

July 1
Lecture
  1. Measures of Center and Spread
  2. Boolean Logic and Conditionals

Parameters are numerical facts about the population. In this lecture, we will look at parameters such as the average (µ) and standard deviation (σ) of a list of numbers. Later, we will start talking about statistics. Statistics are estimates of parameters computed from a sample.
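
The difference between a parameter and a statistic can be sketched with Python's standard library. The list below is made up and is treated as the whole population purely for illustration. The sample size of 4 and the seed are arbitrary assumptions.

```python
import random
import statistics

# Treat this made-up list as the entire population.
population = [2, 4, 4, 4, 5, 5, 7, 9]

# Parameters: computed from every member of the population.
mu = statistics.mean(population)       # population average
sigma = statistics.pstdev(population)  # population standard deviation

# Statistics: estimates computed from a random sample of the population.
rng = random.Random(1)
sample = rng.sample(population, 4)
x_bar = statistics.mean(sample)  # sample average, an estimate of mu
s = statistics.stdev(sample)     # sample standard deviation, an estimate of sigma

print(mu, sigma)
print(x_bar, s)
```

For this list, µ is 5 and σ is 2. The sample values depend on which four numbers happen to be drawn, which is why statistics are only estimates.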

Don't forget:

  1. Make a copy of, download or print the lecture notes!
  2. We will begin working on our capstone projects today! :)
  3. Last chance to complete the Hello Survey!

June 30
Homework
  1. Week 1 homework set that covers topics from lectures #1-4.

This is optional work but a great way to practice solving data science-related problems. Posted in the "Data Science Homework" Google Drive Folder.

June 30
Lecture
  1. Experimental Design, Blocking and pandas
  2. Observational Studies, Confounders, and Stratification

Observational studies have shown that people who carry lighters are more likely to get lung cancer. However, this does not mean that carrying a lighter causes cancer. Smoking is an obvious confounder! If we weren’t sure about this, how could we determine whether it’s the lighters or the confounders (or some combination of both) that is causing the lung cancer?

Stratification is often called the "blocking of observational studies" and allows us to further explore observational studies by handling potential confounders.
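
A minimal sketch of stratification on the lighter example follows. All counts are invented for illustration and do not come from any real study. They are chosen so that the overall comparison and the stratified comparison disagree.

```python
# Made-up counts, for illustration only:
# (smoker, carries_lighter) -> (people, cancer_cases)
counts = {
    (True, True): (80, 16),
    (True, False): (20, 4),
    (False, True): (50, 1),
    (False, False): (450, 9),
}

def cancer_rate(cells):
    """Cancer cases divided by people, pooled over the given cells."""
    people = sum(counts[cell][0] for cell in cells)
    cases = sum(counts[cell][1] for cell in cells)
    return cases / people

# Without stratifying: compare lighter carriers to non-carriers overall.
overall_carriers = cancer_rate([(True, True), (False, True)])
overall_non_carriers = cancer_rate([(True, False), (False, False)])
print(f"overall: carriers {overall_carriers:.3f}, non-carriers {overall_non_carriers:.3f}")

# Stratified by the confounder: repeat the comparison within each smoking group.
for smoker in (True, False):
    carriers = cancer_rate([(smoker, True)])
    non_carriers = cancer_rate([(smoker, False)])
    print(f"smoker={smoker}: carriers {carriers:.3f}, non-carriers {non_carriers:.3f}")
```

In these invented numbers, carriers look far worse overall, yet within each smoking group the two rates match. The overall gap comes entirely from smokers being more likely to carry lighters.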

Don't forget:

  1. Make a copy of, download or print the lecture notes!
  2. Complete the Hello Survey

June 29
Lecture
  1. Data Structures & Data Science Tools
  2. Experimental Design, Blocking and pandas

Is the death penalty an effective deterrent against crime? Is chocolate healthy? What is the cause of breast cancer? All of these questions attempt to assign a cause to an effect. Careful examination of data can help us answer questions like these.

In studies, random sampling and random assignment of participants to control and treatment groups help average out differences when there are enough subjects. What do you do when you don't have very many participants? Blocking first, then randomizing, ensures that the differences are averaged out with regard to the variables blocked on. We can use conditionals in pandas to help us do this!
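
Blocking followed by randomizing can be sketched in plain Python as shown here. The lecture does this with pandas, so treat this version as an assumption-laden illustration. The participants, the age-group blocking variable, and the function name `block_then_randomize` are all made up.

```python
import random
from collections import defaultdict

# Made-up participants (name, blocking variable), for illustration only.
participants = [
    ("A", "under 30"), ("B", "under 30"), ("C", "under 30"), ("D", "under 30"),
    ("E", "30 and over"), ("F", "30 and over"), ("G", "30 and over"), ("H", "30 and over"),
]

def block_then_randomize(people, seed=None):
    """Randomly assign half of each block to treatment and the rest to control."""
    rng = random.Random(seed)
    # Block: group the participants by the blocking variable.
    blocks = defaultdict(list)
    for name, age_group in people:
        blocks[age_group].append(name)
    # Randomize: shuffle within each block, then split the block in half.
    assignment = {}
    for names in blocks.values():
        rng.shuffle(names)
        half = len(names) // 2
        for name in names[:half]:
            assignment[name] = "treatment"
        for name in names[half:]:
            assignment[name] = "control"
    return assignment

assignment = block_then_randomize(participants, seed=42)
print(assignment)
```

Because the split happens inside each block, both groups end up with the same number of participants from each age group, even with only eight subjects.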

Don't forget:

  1. Make a copy of, download or print the lecture notes!
  2. Complete the Hello Survey

June 28
Lecture
  1. Welcome to Digital Scholars Data Science!
  2. Introductions & Syllabus

Data Science is a BIG thing at Illinois and it starts here at Discovering Data Science!

Today we introduced ourselves and covered the syllabus information for this class, which can be found on the syllabus page. We will address additional questions regarding Jupyter Notebooks tomorrow during class.

Don't forget:

  1. Make a copy of, download or print the lecture notes!
  2. Complete the Hello Survey

June 16
Welcome!
  1. Welcome to Discovering Data Science!

Our first lecture is Monday, June 28th at 12pm CDT after the First Day orientation and activities. We will meet via Zoom at the link above. See you there! :)