Don't forget:
This is optional work but a great way to practice solving data science-related problems. Posted in the "Data Science Homework" Google Drive Folder.
Don't forget:
Don't forget:
Don't forget:
Don't forget:
Don't forget:
Don't forget:
Don't forget:
Don't forget:
This is optional work but a great way to practice solving data science-related problems. Posted in the "Data Science Homework" Google Drive Folder.
How would you explain to someone in detail how you put on your shirt? An algorithm is a step-by-step, detailed set of instructions to solve a problem, which can be expressed as English sentences (usually as a numbered list) and is a great way to begin solving complex problems.
Probability is the likelihood or chance of an event occurring. This begins a multi-week journey discovering probability and how to simulate probabilistic events.
Don't forget:
Large tables of numbers can be difficult to interpret, no matter how organized they are. Sometimes it is much easier to interpret graphs than numbers. Histograms and box plots are used as a way to visually represent numerical data.
Don't forget:
This is optional work but a great way to practice solving data science-related problems. Posted in the "Data Science Homework" Google Drive Folder.
A groupby operation involves some combination of splitting the object, applying a function, and combining the results. This can be used to group large amounts of data and compute operations on these groups.
Don't forget:
Parameters are numerical facts about the population. In this lecture, we will look at parameters such as the average (µ) and standard deviation (σ) of a list of numbers. Later, we will start talking about statistics. Statistics are estimates of parameters computed from a sample.
Don't forget:
This is optional work but a great way to practice solving data science-related problems. Posted in the "Data Science Homework" Google Drive Folder.
Observational studies have shown that people who carry lighters are more likely to get lung cancer. However, this does not mean that carrying lighters causes you to get cancer. Smoking is an obvious confounder! If we weren’t sure about this, how can we determine whether it’s the lighters or the confounders (or some combination of both) that is causing the lung cancer?
Stratification is often called the "blocking of observational studies" and allows us to further explore observational studies by handling potential confounders.
Don't forget:
Is the death penalty and effective deterrence against crime? Is chocolate healthy? What is the cause of breast cancer? All of these questions attempt to assign a cause to an effect. Careful examination of data can help us answer questions like these.
In studies, random samples and random assignment of participants to control and treatment groups helps average out differences when there are enough subjects. What do you do when you don't have very many participants? Blocking first, then randomizing ensures that the differences are averaged out with regard to the variables blocked on. We can use conditionals in pandas to help us do this!
Don't forget:
Data Science is a BIG thing at Illinois and it starts here at Discovering Data Science!
Today we introduced ourselves and covered the syllabus information for this class, which can be found on the syllabus page. We will address additional questions regarding Jupyter Notebooks tomorrow during class.
Don't forget:
Our first lecture is Monday, June 28th at 12pm CDT after the First Day orientation and activities. We will meet via Zoom at the link above. See you there!:)