1-3


Sampling

Census

Example

1936 presidential election. Alfred Landon (Republican) vs. the incumbent Democrat, Franklin D. Roosevelt.

The Literary Digest had correctly predicted the winner of the previous 5 elections.

They mailed out 10 million surveys. About 2.3 million ballots were returned, a 23% response rate.

(Most modern polls use samples of between a few hundred and a few thousand people.)


Who got the survey?

automobile owners
subscribers
social clubs like the Elks, Moose, etc.
telephone books

(Note: the total population of the US was around 130 million at the time, and not all of them could vote.)
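
A small simulation can show why a mailing list like this is a problem. The sketch below is not a reconstruction of the 1936 data: the population size, the share of people who appear on the lists, and the support rates in each group are all made-up numbers. They are chosen only to illustrate that a huge sample drawn from a skewed list can miss badly, while a small random sample drawn from everyone tends to land close.

```python
import random

random.seed(1936)  # fixed seed so the run is repeatable

# Hypothetical population (every number here is an assumption, not 1936 data).
POP_SIZE = 200_000
ON_LIST_SHARE = 0.25      # share of voters who show up on the mailing lists
SUPPORT_ON_LIST = 0.45    # support for candidate A among people on the lists
SUPPORT_OFF_LIST = 0.66   # support for candidate A among everyone else

population = []
for _ in range(POP_SIZE):
    on_list = random.random() < ON_LIST_SHARE
    p = SUPPORT_ON_LIST if on_list else SUPPORT_OFF_LIST
    votes_a = random.random() < p
    population.append((on_list, votes_a))

def share_for_a(people):
    """Fraction of the given people who vote for candidate A."""
    return sum(votes_a for _, votes_a in people) / len(people)

true_share = share_for_a(population)

# Huge sample, but drawn only from the lists (a biased sampling frame).
frame = [person for person in population if person[0]]
big_biased_share = share_for_a(random.sample(frame, 40_000))

# Small sample, but drawn at random from the whole population.
small_random_share = share_for_a(random.sample(population, 1_000))

print(f"true share:          {true_share:.3f}")
print(f"big biased sample:   {big_biased_share:.3f}")
print(f"small random sample: {small_random_share:.3f}")
```

Making the biased sample even larger would not help: it would only estimate the support among people on the lists more precisely.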


George Gallup took a sample of 10,000 people.

Results:

FDR's share of the popular vote:

Actual:                                                  60.8%
The Literary Digest’s prediction:                        43%
Gallup’s prediction:                                     56%
Gallup’s prediction of The Literary Digest’s prediction: 44%
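
The size of each miss follows directly from the numbers above. The snippet below is only that arithmetic; the variable names are illustrative.

```python
# Percentages of the popular vote for FDR, taken from the results above.
actual = 60.8
digest_prediction = 43.0
gallup_prediction = 56.0
gallup_guess_of_digest = 44.0

# How far each prediction was from the actual result, in percentage points.
digest_error = actual - digest_prediction
gallup_error = actual - gallup_prediction

# How far Gallup's guess of the Digest's number was from what the Digest published.
guess_error = gallup_guess_of_digest - digest_prediction

print(f"Digest missed the result by {digest_error:.1f} points")
print(f"Gallup missed the result by {gallup_error:.1f} points")
print(f"Gallup missed the Digest's own number by {guess_error:.1f} point")
```

With 10,000 people Gallup came much closer than the Digest did with 2.3 million ballots, and he also came within a point of the number the Digest would publish.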


quota sample

convenience sample

Simple random sample

Sample size vs. population size

stratified sample

Cluster Sampling

Multistage
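
Three of the random methods listed above can be sketched with Python's standard random module. The roster of 12 students in 3 rooms is a made-up sampling frame, and the function names are illustrative, not taken from any library.

```python
import random

# Hypothetical sampling frame: 12 students spread evenly over 3 rooms.
roster = [(f"student{i:02d}", f"room{i % 3}") for i in range(12)]

def group_by_room(frame):
    """Split the frame into groups keyed by room."""
    groups = {}
    for unit in frame:
        groups.setdefault(unit[1], []).append(unit)
    return groups

def simple_random_sample(frame, n, rng):
    """Every set of n units has the same chance of being chosen."""
    return rng.sample(frame, n)

def stratified_sample(frame, n_per_stratum, rng):
    """Take a simple random sample from within every stratum (room)."""
    chosen = []
    for members in group_by_room(frame).values():
        chosen.extend(rng.sample(members, n_per_stratum))
    return chosen

def cluster_sample(frame, n_clusters, rng):
    """Pick whole clusters (rooms) at random and keep everyone in them."""
    groups = group_by_room(frame)
    picked = rng.sample(sorted(groups), n_clusters)
    return [unit for room in picked for unit in groups[room]]

rng = random.Random(0)  # fixed seed so the run is repeatable
srs = simple_random_sample(roster, 4, rng)
strat = stratified_sample(roster, 2, rng)
clus = cluster_sample(roster, 1, rng)
print("simple random:", srs)
print("stratified:   ", strat)
print("cluster:      ", clus)
```

A multistage sample would combine these steps, for example picking rooms at random first and then taking a simple random sample inside each picked room.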




1-4

Experimental Design


Observational study vs. experimental study

Observational advantages and disadvantages.

Example:  coffee drinking and health.  https://www.mayoclinic.org/healthy-lifestyle/nutrition-and-healthy-eating/expert-answers/coffee-and-health/faq-20058339


Confounding variables.

Answer:  a controlled experiment.  The independent variable is manipulated, and the dependent variable is observed.  Typically we have treatment and control groups.

All extraneous variables are controlled, and the proper way to do that is randomization: subjects are assigned to the groups at random.  This is a randomized controlled experiment.
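
Random assignment is simple to carry out. The sketch below shuffles a made-up list of 1,000 subjects and cuts it in half. The ages are invented and stand in for any extraneous variable. Because chance alone decides who lands in which group, such variables tend to balance out between the two groups.

```python
import random

rng = random.Random(42)  # fixed seed so the split is repeatable

# Hypothetical subjects: an ID plus an age (the extraneous variable).
subjects = [{"id": i, "age": rng.randint(20, 80)} for i in range(1000)]

# Random assignment: shuffle a copy of the list, then cut it in half.
shuffled = subjects[:]
rng.shuffle(shuffled)
half = len(shuffled) // 2
treatment = shuffled[:half]
control = shuffled[half:]

def mean_age(group):
    """Average age of the people in a group."""
    return sum(person["age"] for person in group) / len(group)

print(f"treatment: n={len(treatment)}, mean age={mean_age(treatment):.1f}")
print(f"control:   n={len(control)}, mean age={mean_age(control):.1f}")
```

The two mean ages should come out close to each other, even though age was never used to form the groups.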


Placebo and placebo effect.  Can't let people know whether they are in the treatment or control group (blind).  Can't let the scientists know either (double-blind)!

The gold standard:  randomized controlled double-blind experiment

Without it, you'll never be sure about causality. 


For example, to decide whether vitamin X really is good at preventing heart attacks, the following
(unethical) experiment could be run.

A statistician picks two simple random samples from the population. (Ideally people from around the
world. If it were people only from the United States it could very well be biased.) An “advanced
vitamin X substitute” that is completely indistinguishable from the regular vitamin X is given by
medical doctors to one of the samples, while vitamin X is administered by doctors to the other sample.
The statistician makes sure the medical doctors don’t know whether they are giving the substitute or the
real vitamin X. The doctors then evaluate the people and report their findings to the statistician. The
statistician then performs the exact statistical procedures that were decided upon before the experiment
even began.
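
One way to picture the blinding in this example: the statistician labels every bottle with a neutral code and keeps the key private, so the doctors and the subjects see only codes. The sketch below illustrates that bookkeeping under this assumption. The subject names, the code format, and the group labels are all hypothetical.

```python
import random

rng = random.Random(7)  # fixed seed so the example is repeatable

subjects = [f"subject{i:02d}" for i in range(10)]  # hypothetical subject IDs

# Random assignment: half get vitamin X, half get the look-alike substitute.
order = subjects[:]
rng.shuffle(order)
half = len(order) // 2
assignment = {name: "vitamin X" for name in order[:half]}
assignment.update({name: "substitute" for name in order[half:]})

# Neutral bottle codes, handed out in random order so a code reveals nothing.
codes = [f"BOTTLE-{n:03d}" for n in range(len(subjects))]
rng.shuffle(codes)

key = {}           # statistician's private key: bottle code -> real contents
doctor_sheet = {}  # what the doctors receive: subject -> bottle code only
for name, code in zip(subjects, codes):
    key[code] = assignment[name]
    doctor_sheet[name] = code

# The doctors' view contains no group information.
for name in subjects[:3]:
    print(name, doctor_sheet[name])
```

Once the doctors report their findings by subject, the statistician can use the private key to unblind the groups and run the analysis that was fixed in advance.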