Introduction to Statistical Literacy and Data Analysis

162 Terms

1

What is statistical literacy?

The ability to interpret and critically evaluate statistical information and data-based arguments, and to discuss opinions regarding such information.

2

What are the three main tasks involved in statistics?

Collect data, describe data (summarise and visualise), and make inferences about a population based on a smaller sample.

3

What is the significance of data generation in 2024?

As of 2024, roughly 149 zettabytes of data are generated per year, an overwhelming amount.

4

How are cases and variables organized in a dataset?

Cases generally make up rows, while each variable makes a column, with variables varying between cases.

5

What is a constant in the context of data?

A constant is something that doesn't vary between cases.

6

What is the difference between categorical and quantitative variables?

A categorical variable divides cases into groups, while a quantitative variable measures a numerical quantity for each case.

7

Can numbers be used to code categorical variables?

Yes, but using numbers to code categories does not make the variable quantitative.
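
As a rough illustration of the cards above on cases, variables, and numeric codes, here is a minimal Python sketch. The dataset, the variable names (`region_code`, `height_cm`), and the values are all hypothetical and chosen only for illustration.

```python
from statistics import mean

# Hypothetical dataset: each dict is one case (a row),
# and each key is a variable (a column).
cases = [
    {"name": "A", "region_code": 1, "height_cm": 172.0},
    {"name": "B", "region_code": 3, "height_cm": 165.5},
    {"name": "C", "region_code": 3, "height_cm": 180.2},
]

# height_cm is quantitative, so averaging it is meaningful.
mean_height = mean(case["height_cm"] for case in cases)

# region_code is categorical even though it is stored as numbers:
# the codes are only labels, so we count cases per group instead of averaging.
region_counts = {}
for case in cases:
    code = case["region_code"]
    region_counts[code] = region_counts.get(code, 0) + 1

print(round(mean_height, 1))
print(region_counts)
```

Averaging `region_code` would run without error, but the result would not describe anything real, which is why the codes are counted rather than averaged.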

8

What is an example of an ordinal data variable?

Family size or distance from Christchurch, where there is a natural ordering.

9

Why is it important to verify the source of a dataset?

To ensure the reliability of the data, for example, checking against sources like Stats NZ.

10

What does the website 'Data Never Sleeps' do?

It determines how much data is produced every minute of every day.

11

What is the issue with causation in studies?

Correlation may be mistaken for causation; for example, louder music may be associated with drinking more beer without necessarily causing it.

12

What are the explanatory and response variables?

The explanatory variable helps to understand or predict values of another variable, while the response variable is the outcome being measured.

13

Where do explanatory and response variables appear on a graph?

The explanatory variable goes on the X-axis (horizontal), and the response variable goes on the Y-axis (vertical).

14

What is a population in statistical terms?

A population includes all cases, individuals, or objects of interest.

15

What is a sample?

A sample is a subset of the population from which data has been collected.

16

Why are samples often preferred over censuses?

Samples are more practical due to barriers such as time, accessibility, or cost, especially in populations that change rapidly.

17

What is a potential bias in study participant selection?

If participants self-select (volunteer), the sample may be biased; for example, volunteers for a weight-loss study may predominantly be individuals who already want to lose weight.

18

What is the challenge of using one variable to predict another?

It requires careful consideration of which variable is explanatory and which is the response.

19

What is the importance of random selection in studies?

Random selection helps to avoid bias and ensures a more representative sample of the population.

20

What is the role of visualisation in data description?

Visualisation helps to summarise and present data in an understandable format.

21

How can categorical variables be coded for analysis?

It is usually best to code categorical variables by letters instead of numbers for clarity.

22

What is the implication of having more data generated than in the past?

We are inundated with data, making statistical literacy increasingly important for interpretation.

23

What is Statistical Inference?

The process of using data from a sample to gain information about the population.

24

Why is the sample of STAT101 students not a good representation of UC students?

It is not a representative sample, leading to potential errors in statistical inference.

25

Can the sample data of 10 followers' tweets be generalized to all Twitter accounts? Why or why not?

Yes, but it's a terrible generalization due to the small and unrepresentative sample.

26

What went wrong with the telephone poll predicting the 1948 US presidential election?

The sample was biased as it only included wealthy individuals who owned telephones, likely favoring Dewey over Truman.

27

What is sampling bias?

A situation where the method of selecting a sample causes it to differ from the population in a relevant way.

28

What should be done to avoid sampling bias?

Take a random sample.

29

What is a Simple Random Sample (SRS)?

A sampling method where each unit of the population has the same chance of being selected, regardless of other units.

30

How can a random sample be obtained?

Using formal random sampling methods such as technology or drawing names out of a hat.
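
One way to draw a simple random sample with technology is sketched below in Python. The population of 100 ID numbers, the sample size of 10, and the seed are assumptions made purely for illustration.

```python
import random

# Hypothetical population: 100 student ID numbers.
population = list(range(1, 101))

# A fixed seed is used only so that this sketch is repeatable.
rng = random.Random(2024)

# random.Random.sample draws without replacement, giving every unit
# the same chance of being selected.
sample = rng.sample(population, k=10)

print(sorted(sample))
```

This is the software equivalent of drawing names out of a hat: no unit can be chosen twice, and no unit is favoured over another.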

31

What is haphazard sampling?

A non-systematic method of sampling that can lead to biased results, such as airport surveys.

32

What is convenience sampling?

A method of obtaining a sample based on ease of accessibility, which may not be representative of the population.

33

What is response bias?

A systematic favoring of certain outcomes that occurs when individuals do not respond truthfully.

34

What is non-response bias?

A systematic favoring of certain outcomes that occurs when individuals who choose to participate differ from those who do not.

35

What was the bias in the Federal Office of Road Safety study on alcohol and marijuana?

The study had sampling bias because it favored individuals who listen to rock radio stations and were willing to take drugs.

36

What does it mean for two variables to be associated?

It means that the values of one variable tend to be related to the values of the other variable.

37

What is causation in statistics?

Two variables are causally associated if changing the value of one variable influences the value of the other.

38

Can association imply causation?

No, association does not necessarily imply causation.

39

What is a confounding variable?

A variable that influences both the independent and dependent variables, potentially leading to a false association.

40

What example illustrates that association does not imply causation?

Families with many cars tend to own many TVs, but this is due to wealth, which is the confounding variable.

41

What is the average of the numbers 18, 9, 3, 15, and 1?

The average is 9.2.
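
The arithmetic behind this card can be checked with a short Python sketch: the average (mean) is the sum of the values divided by how many values there are.

```python
values = [18, 9, 3, 15, 1]

# Mean = sum of the values divided by the number of values: 46 / 5.
average = sum(values) / len(values)

print(average)  # 9.2
```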

42

What did Abraham Wald argue regarding the military's bullet hole data?

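He argued that more armor should be added to the areas where returning planes showed few bullet holes, not to the wings and tail where the holes were concentrated, because planes hit in those other areas tended not to return and so were missing from the data.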

43

Why is it important to use random sampling methods?

To ensure that the sample is representative of the population and to avoid biases in the results.

44

What is the difference between random and haphazard sampling?

Random sampling uses a formal chance mechanism that gives each unit an equal chance of selection, while haphazard sampling follows no formal method and can lead to bias.

45

What is snowball sampling?

A method where existing study subjects recruit future subjects from among their acquaintances, often leading to a sample of people with similar experiences.

46

How can a confounding variable affect the interpretation of data?

It can offer a plausible explanation for an association between the explanatory and response variable.

47

What is the difference between causation and association?

Causation implies a direct effect, while association indicates a correlation that may not be due to direct influence.

48

What is an observational study?

A study in which the researcher doesn't actively control the value of any variable, but simply measures and records the values as they naturally exist.

49

What is an experiment in the context of research?

A study in which the researcher actively controls the level of one or more of the explanatory variables.

50

What can be concluded from the study where men rated women on different background colors?

Causation can probably be concluded as it's an experiment, and possible confounding variables have been reduced.

51

Why can observational studies rarely establish causation?

Because there are almost always confounding variables present.

52

What is a randomized experiment?

An experiment where the value of the explanatory variable for each unit is determined randomly before measuring the response variable.

53

What is the purpose of a control group in an experiment?

To provide a comparison group to determine whether a treatment is effective.

54

What is a placebo?

A fake treatment that resembles the active treatment as much as possible.

55

Why is blinding important in experiments?

To prevent participants and researchers from knowing which treatment is being administered, reducing bias.

56

What is double-blinding?

An experimental design where neither the participants nor the researchers know which treatment each participant is receiving.

57

What confounding variable might explain the link between ice cream sales and drowning deaths?

Temperature.

58

What confounding variable could explain the relationship between beef and pork consumption?

Increase in worldwide wealth/GDP.

59

What confounding variable is associated with yacht owners buying sports cars?

Wealth.

60

What confounding variable might explain higher air pollution in paved areas?

More cars.

61

What confounding variable is linked to cancer rates near high-voltage power lines?

Income: housing under power lines tends to be lower-income, and lower-income residents are less likely to receive cancer treatment.

62

What is randomization in the context of experiments?

The process of randomly assigning values of the explanatory variable to avoid confounding variables.

63

How does random assignment help in experiments?

It ensures that treatment groups look similar, minimizing the association with other variables.
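
Random assignment could be carried out along the lines of the following Python sketch. The 20 participant labels, the two equal-sized groups, and the seed are assumptions for illustration only.

```python
import random

# Hypothetical participants, labelled P01 to P20.
participants = [f"P{i:02d}" for i in range(1, 21)]

# A fixed seed is used only so that this sketch is repeatable.
rng = random.Random(7)

# Shuffle a copy, then split it in half: chance alone decides
# which participants receive the treatment.
shuffled = participants[:]
rng.shuffle(shuffled)
half = len(shuffled) // 2
treatment_group = shuffled[:half]
control_group = shuffled[half:]

print("Treatment:", sorted(treatment_group))
print("Control:  ", sorted(control_group))
```

Because group membership depends only on the shuffle, it cannot be systematically related to any other characteristic of the participants.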

64

What is the significance of well-designed experiments regarding confounding variables?

Confounding variables are eliminated if the experiment is well designed.

65

What is the role of randomization in treatment assignment?

To ensure that treatment groups are comparable and that the explanatory variable is not associated with other variables.

66

What is the importance of having two different treatments in a randomized experiment?

To compare the effects of different treatments and establish effectiveness.

67

What was the focus of the study conducted on 60 men with PIN lesions?

The study aimed to investigate the relationship between green tea and prostate cancer.

68

What was the daily dosage of green tea extract given to half of the participants in the study?

600 mg

69

What type of study design was used in the green tea cancer study?

Double-blind study

70

What was the outcome of the green tea extract study after one year?

Only 1 person taking green tea developed cancer, compared to 9 in the placebo group.
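
To compare the two groups, the counts can be turned into proportions, as in this Python sketch. It assumes the 60 men were split into two equal groups of 30, which follows from the cards saying half of the participants received the extract; the variable names are illustrative.

```python
# Assumption: 60 participants split evenly, so 30 per group.
group_size = 60 // 2

# Proportion of each group that developed cancer within the year.
p_green_tea = 1 / group_size
p_placebo = 9 / group_size

# Difference in proportions (placebo minus green tea).
difference = p_placebo - p_green_tea

print(f"green tea: {p_green_tea:.3f}")
print(f"placebo:   {p_placebo:.3f}")
print(f"difference: {difference:.3f}")
```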

71

What must approve an experiment for it to take place?

Human ethics committee

72

What are the two types of randomised experiments mentioned?

Randomised Comparative Experiment and Matched Pairs Experiment.

73

How does a randomised comparative experiment work?

Cases are randomly assigned to different treatment groups and results are compared on the response variable(s).

74

What is the key feature of a matched pairs experiment?

Each case receives both treatments in random order or cases are paired in some obvious way.

75

What is the ideal sampling method for a randomised experiment?

A random sample.

76

What is the relationship between association and causation in observational studies?

Association does not imply causation; confounding variables often exist.

77

What is a necessary component of randomised experiments to infer causality?

A control or comparison group.

78

What effect must be considered in randomised experiments?

The placebo effect.

79

What is the purpose of descriptive statistics?

To summarise and visualise data.

80

What does a frequency table show?

The number of cases that fall in each category.

81

What does a relative frequency table show?

The proportion of cases that fall in each category, summing to 1.
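
A frequency table and a relative frequency table can be built from raw categorical data roughly as follows. This is a Python sketch and the survey responses are hypothetical.

```python
from collections import Counter

# Hypothetical categorical responses, one per case.
responses = ["yes", "no", "yes", "unsure", "yes", "no", "yes", "no"]

# Frequency table: number of cases in each category.
frequency = Counter(responses)

# Relative frequency table: proportion of cases in each category.
total = sum(frequency.values())
relative_frequency = {category: count / total for category, count in frequency.items()}

for category, count in frequency.items():
    print(f"{category:<7} {count:>2}  {relative_frequency[category]:.3f}")
```

The proportions in `relative_frequency` add up to 1, as the card above states.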

82

What is the main feature that distinguishes a bar chart from a histogram?

A bar chart has gaps between the bars, while a histogram does not.

83

Why are pie charts considered misleading?

They force viewers to interpret area in relation to other categories, which can distort perception.

84

What is the formula to find the difference in proportions of students in a relationship who are female and single students who are female?

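pR - pS, where pR is the proportion of students in a relationship who are female and pS is the proportion of single students who are female.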
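
A difference in proportions of this kind can be computed from a two-way table, as in the Python sketch below. The counts are hypothetical and are not taken from the course data; `p_r` and `p_s` stand for the two proportions on the card.

```python
# Hypothetical two-way table: relationship status by sex (counts of students).
counts = {
    "relationship": {"female": 30, "male": 20},
    "single": {"female": 40, "male": 60},
}


def proportion_female(status):
    """Proportion of students with the given status who are female."""
    row = counts[status]
    return row["female"] / sum(row.values())


p_r = proportion_female("relationship")  # 30 / 50
p_s = proportion_female("single")        # 40 / 100
difference = p_r - p_s

print(f"{p_r:.2f} - {p_s:.2f} = {difference:.2f}")
```

Each proportion is taken within its own row, so the two denominators differ.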

85

What does a segmented bar chart display?

It shows the proportions of different categories stacked on top of each other.

86

What is the characteristic of a 100% stacked bar chart?

The bars are forced to total 100%.

87

What are the two forms of vitamin D injections used for kidney dialysis patients?

Calcitriol and paricalcitol.

88

How many dialysis patients were examined in the study regarding vitamin D injections?

67,000 patients.

89

What is the significance of randomisation in experiments?

It helps to prevent confounding variables, allowing for causal inferences.

90

What is the importance of blinding in experiments?

It reduces bias by ensuring that participants do not know which treatment they are receiving.

91

What percentage of patients survived after three years when treated with paricalcitol?

58.7%

92

What percentage of patients survived after three years when treated with calcitriol?

51.5%

93

What is the total number of patients in the study?

67,000

94

What is the total number of patients who survived?

36,918

95

What is the total number of patients who died?

30,082
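
The totals on these cards can be cross-checked with a few lines of Python. The sketch uses only the three totals given above; the number of patients on each drug is not stated on the cards, so the individual cells of the two-way table are not reconstructed here.

```python
total_patients = 67_000
survived = 36_918
died = 30_082

# The two outcome totals should account for every patient.
totals_agree = (survived + died == total_patients)

# Overall three-year survival proportion across both treatments.
overall_survival = survived / total_patients

print(totals_agree)
print(f"{overall_survival:.1%}")  # 55.1%
```

The overall rate falls between the two treatment-specific rates on the earlier cards (51.5% and 58.7%), as it must.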

96

What is a two-way table used for?

To display the relationship between two categorical variables.

97

What type of chart is used to visualize one categorical variable?

Bar chart or pie chart.

98

What are the three aspects of the distribution of a quantitative variable?

Shape, center, and variability.

99

What is a dotplot?

A visualization where each dot represents one or more cases of a quantitative variable.

100

How does a histogram differ from a bar chart?

A histogram is for quantitative data with a numeric x-axis, while a bar chart is for categorical data.
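
The numeric x-axis of a histogram comes from grouping values into adjacent intervals (bins). A minimal text-only Python sketch of that binning step follows; the ages and the bin width of 10 are assumptions for illustration.

```python
from collections import Counter

# Hypothetical quantitative variable: ages of ten cases.
ages = [19, 21, 22, 25, 31, 34, 38, 41, 47, 52]
bin_width = 10

# Map each value to the lower edge of its bin, then count values per bin.
bin_counts = Counter((age // bin_width) * bin_width for age in ages)

# Print one row of '#' marks per bin, in numeric order with no gaps between bins.
for start in sorted(bin_counts):
    print(f"{start}-{start + bin_width - 1}: {'#' * bin_counts[start]}")
```

Unlike the categories of a bar chart, the bins have a fixed numeric order and sit directly next to each other.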