Stats Learning Week 3

Linxichen
3 min read · Apr 1, 2021

In week 3, we learned about confounding variables and study design. A confounding variable is a variable that affects both the explanatory and the response variable. We have to watch for confounders when designing a study; otherwise, we are very likely to draw an incorrect conclusion.

For example, higher temperature is a confounder that affects both ice cream sales and shark attacks, leading to a spurious association between the two.
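To make the ice cream and shark attack example concrete, here is a small simulation sketch. All the numbers (temperature range, slopes, noise levels) are made up for illustration and are not real data. Neither variable is allowed to influence the other; both depend only on temperature. The helper names `pearson` and `residuals` are my own.

```python
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def residuals(y, x):
    """What is left of y after removing a straight-line fit on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return [b - (my + slope * (a - mx)) for a, b in zip(x, y)]

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 2000

# Hypothetical daily data: temperature drives both variables,
# and there is no direct link between sales and attacks.
temperature = [rng.uniform(10, 35) for _ in range(n)]
ice_cream = [20 * t + rng.gauss(0, 60) for t in temperature]
shark = [0.3 * t + rng.gauss(0, 1.5) for t in temperature]

# Raw association, ignoring the confounder.
raw_corr = pearson(ice_cream, shark)

# Association after removing the part explained by temperature.
adj_corr = pearson(residuals(ice_cream, temperature),
                   residuals(shark, temperature))

print(f"raw correlation:      {raw_corr:.2f}")
print(f"adjusted correlation: {adj_corr:.2f}")
```

In this toy setup the raw correlation should come out clearly positive, while the correlation after adjusting for temperature should be close to zero, which is what a spurious association looks like.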

Study designs can also be classified into two classes:

  1. observational study, where the assignment mechanism is unknown
  2. experiment, where the assignment mechanism is known

In an observational study, we investigate the association between variables; in an experiment, we investigate the causality between variables.
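The difference between the two designs can be sketched with a simulation. This is only an illustration under assumptions I chose myself: a hidden trait that raises the outcome, a true treatment effect of 2.0, and made-up probabilities of taking the treatment. In the observational version, units with a high trait are more likely to end up treated; in the experiment, a coin flip decides.

```python
import random

def mean(values):
    return sum(values) / len(values)

def diff_in_means(treated, outcome):
    """Average outcome of the treated minus average outcome of the controls."""
    t = [y for d, y in zip(treated, outcome) if d]
    c = [y for d, y in zip(treated, outcome) if not d]
    return mean(t) - mean(c)

rng = random.Random(1)  # fixed seed so the run is repeatable
n = 20000
TRUE_EFFECT = 2.0  # assumed causal effect of the treatment

# Hidden trait that also raises the outcome (a confounder).
trait = [rng.gauss(0, 1) for _ in range(n)]

# Observational study: assignment depends on the hidden trait.
obs_treated = [rng.random() < (0.8 if u > 0 else 0.2) for u in trait]

# Experiment: assignment is a fair coin flip, unrelated to the trait.
exp_treated = [rng.random() < 0.5 for _ in range(n)]

def simulate_outcome(treated):
    """Outcome = treatment effect + effect of the trait + noise."""
    return [TRUE_EFFECT * d + 3.0 * u + rng.gauss(0, 1)
            for d, u in zip(treated, trait)]

obs_est = diff_in_means(obs_treated, simulate_outcome(obs_treated))
exp_est = diff_in_means(exp_treated, simulate_outcome(exp_treated))

print(f"true effect:            {TRUE_EFFECT:.2f}")
print(f"observational estimate: {obs_est:.2f}")
print(f"experimental estimate:  {exp_est:.2f}")
```

With random assignment, the estimate should land near the true effect. In the observational version it should be noticeably too large, because the treated group also has a higher trait on average.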

In general, several types of observational studies and experiments are as follows:

In this week’s reading, the article “Science Isn’t Broken” by Christie Aschwanden (https://fivethirtyeight.com/features/science-isnt-broken/) showed me the power of scientific research and the difficulties that scientists encounter when carrying out experiments.

From the first to the third year of learning statistics, the cases we analyze have become more complex and closer to the real world, which means we need to consider more of the factors and uncertainties that are likely to appear.

One point that really impressed me is “Human fallibilities send the scientific process hurtling in fits, starts and misdirections instead of in a straight line from question to truth.” (from “Science Isn’t Broken” by Christie Aschwanden). Strictly speaking, every result we get from an experiment is a temporary truth, which is likely to depend on many factors, such as the experimental method, personal biases, sample size and so on. More rigorous results and better decisions come out of every failure. As for p-hacking, I originally thought of it as a misuse of data analysis that is likely to introduce many biases; however, this article gave me another way of understanding it, as something that can also be a way to explore the boundaries of knowledge.
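To see why p-hacking can mislead, here is a rough simulation sketch of one common form of it: measuring many outcomes and reporting only the best one. It is not taken from the article. The study count, number of outcomes, group size, and the normal-approximation test in `two_sample_p` are all my own assumptions, and every simulated outcome has no real effect at all.

```python
import random
from statistics import NormalDist, mean, stdev

def two_sample_p(a, b):
    """Two-sided p-value for a difference in means (normal approximation)."""
    se = (stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b)) ** 0.5
    z = (mean(a) - mean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

rng = random.Random(2)  # fixed seed so the run is repeatable
N_STUDIES = 500    # simulated studies
N_OUTCOMES = 20    # outcomes measured in each study
GROUP_SIZE = 50    # people per group
ALPHA = 0.05

honest_hits = 0  # significant result on the one pre-chosen outcome
hacked_hits = 0  # significant result on the best of all outcomes

for _ in range(N_STUDIES):
    p_values = []
    for _ in range(N_OUTCOMES):
        # Both groups come from the same distribution: no true effect.
        group_a = [rng.gauss(0, 1) for _ in range(GROUP_SIZE)]
        group_b = [rng.gauss(0, 1) for _ in range(GROUP_SIZE)]
        p_values.append(two_sample_p(group_a, group_b))
    honest_hits += p_values[0] < ALPHA
    hacked_hits += min(p_values) < ALPHA

honest_rate = honest_hits / N_STUDIES
hacked_rate = hacked_hits / N_STUDIES

print(f"false positive rate, one planned outcome: {honest_rate:.2f}")
print(f"false positive rate, best of {N_OUTCOMES}:        {hacked_rate:.2f}")
```

Testing a single planned outcome should give a false positive rate near the chosen 0.05 level, while picking the smallest of 20 p-values should produce a "significant" finding in well over half of the studies, even though nothing real is there.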

In the example exploring the association between soccer referees and red cards, the project reveals that choosing different statistical methods leads to different results, so replication is crucially important for making comparisons and reducing biases.

Overall, this article shows us that there is a great deal of variability to consider when carrying out statistical analysis. Every failed experiment is therefore not meaningless but provides a more complete picture for the next step, which is the power of science.
