What is statistical literacy?
The ability to interpret and critically evaluate statistical information and data-based arguments, and to discuss opinions regarding such information.
What are the three main tasks involved in statistics?
Collect data, describe data (summarise and visualise), and make inferences about a population based on a smaller sample.
What is the significance of data generation in 2024?
As of 2024, about 149 zettabytes of data are generated per year, an overwhelming amount.
How are cases and variables organized in a dataset?
Cases generally make up rows, while each variable makes a column, with variables varying between cases.
What is a constant in the context of data?
A constant is something that doesn't vary between cases.
What is the difference between categorical and quantitative variables?
A categorical variable divides cases into groups, while a quantitative variable measures a numerical quantity for each case.
Can numbers be used to code categorical variables?
Yes, but using numbers to code categories does not make the variable quantitative.
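A minimal Python sketch of the points above about cases, variables, and numeric coding. The dataset, the variable names (`region_code`, `height_cm`), and all values are made up for illustration and are not course data. Each dict is one case (row) and each key is one variable (column).

```python
from collections import Counter

# Hypothetical dataset: one dict per case (row), one key per variable (column).
cases = [
    {"name": "A", "region_code": 1, "height_cm": 170},
    {"name": "B", "region_code": 3, "height_cm": 182},
    {"name": "C", "region_code": 1, "height_cm": 165},
]

# region_code is categorical even though it is coded with numbers:
# the sensible summary is a count per category, not an average.
region_counts = Counter(case["region_code"] for case in cases)

# height_cm is quantitative, so arithmetic such as a mean is meaningful.
mean_height = sum(case["height_cm"] for case in cases) / len(cases)

print(region_counts)
print(mean_height)
```

Averaging `region_code` would run without error, but the result would not describe anything real, which is why letter codes are often the safer choice.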
What is an example of an ordinal data variable?
Family size or distance from Christchurch, where there is a natural ordering.
Why is it important to verify the source of a dataset?
To ensure the reliability of the data, for example, checking against sources like Stats NZ.
What does the website 'Data Never Sleeps' do?
It determines how much data is produced every minute of every day.
What is the issue with causation in studies?
Causation may be confused with correlation; for example, louder music may be associated with drinking more beer, but that does not necessarily mean the music causes the drinking.
What are the explanatory and response variables?
The explanatory variable helps to understand or predict values of another variable, while the response variable is the outcome being measured.
Where do explanatory and response variables appear on a graph?
The explanatory variable goes on the X-axis (horizontal), and the response variable goes on the Y-axis (vertical).
What is a population in statistical terms?
A population includes all cases, individuals, or objects of interest.
What is a sample?
A sample is a subset of the population from which data has been collected.
Why are samples often preferred over censuses?
Samples are more practical due to barriers such as time, accessibility, or cost, especially in populations that change rapidly.
What is a potential bias in study participant selection?
If participants self-select (choose to take part), they may predominantly be individuals who want to lose weight, which builds bias into the sample.
What is the challenge of using one variable to predict another?
It requires careful consideration of which variable is explanatory and which is the response.
What is the importance of random selection in studies?
Random selection helps to avoid bias and ensures a more representative sample of the population.
What is the role of visualisation in data description?
Visualisation helps to summarise and present data in an understandable format.
How can categorical variables be coded for analysis?
It is usually best to code categorical variables by letters instead of numbers for clarity.
What is the implication of having more data generated than in the past?
We are inundated with data, making statistical literacy increasingly important for interpretation.
What is Statistical Inference?
The process of using data from a sample to gain information about the population.
Why is the sample of STAT101 students not a good representation of UC students?
It is not a representative sample, leading to potential errors in statistical inference.
Can the sample data of 10 followers' tweets be generalized to all Twitter accounts? Why or why not?
Technically yes, but the generalization would be very poor because the sample is small and unrepresentative.
What went wrong with the telephone poll predicting the 1948 US presidential election?
The sample was biased as it only included wealthy individuals who owned telephones, likely favoring Dewey over Truman.
What is sampling bias?
A situation where the method of selecting a sample causes it to differ from the population in a relevant way.
What should be done to avoid sampling bias?
Take a random sample.
What is a Simple Random Sample (SRS)?
A sampling method where each unit of the population has the same chance of being selected, regardless of other units.
How can a random sample be obtained?
Using formal random sampling methods such as technology or drawing names out of a hat.
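One way "technology" can draw a simple random sample, sketched with Python's standard `random` module. The population of 100 numbered units and the seed are assumptions made only so the example is repeatable.

```python
import random

# Hypothetical population: 100 units identified by the numbers 1..100.
population = list(range(1, 101))

# A fixed seed is used only so the sketch gives the same result each run.
rng = random.Random(2024)

# sample() draws without replacement, giving every unit the same chance.
sample = rng.sample(population, k=10)

print(sample)
```

This is the software equivalent of drawing ten names out of a hat containing all 100 names.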
What is haphazard sampling?
A non-systematic method of sampling that can lead to biased results, such as airport surveys.
What is convenience sampling?
A method of obtaining a sample based on ease of accessibility, which may not be representative of the population.
What is response bias?
A systematic favoring of certain outcomes that occurs when individuals do not respond truthfully.
What is non-response bias?
A systematic favoring of certain outcomes that occurs when individuals who choose to participate differ from those who do not.
What was the bias in the Federal Office of Road Safety study on alcohol and marijuana?
The study had sampling bias because it favored individuals who listened to rock radio stations and were willing to take drugs.
What does it mean for two variables to be associated?
It means that the values of one variable tend to be related to the values of the other variable.
What is causation in statistics?
Two variables are causally associated if changing the value of one variable influences the value of the other.
Can association imply causation?
No, association does not necessarily imply causation.
What is a confounding variable?
A variable that influences both the independent and dependent variables, potentially leading to a false association.
What example illustrates that association does not imply causation?
Families with many cars tend to own many TVs, but this is due to wealth, which is the confounding variable.
What is the average of the numbers 18, 9, 3, 15, and 1?
The average is 9.2.
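The arithmetic behind that answer, as a short Python check: add the values and divide by how many there are.

```python
values = [18, 9, 3, 15, 1]

total = sum(values)            # 18 + 9 + 3 + 15 + 1 = 46
average = total / len(values)  # 46 / 5

print(average)  # 9.2
```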
What did Abraham Wald argue regarding the military's bullet hole data?
He argued that more armor should be added to the center of the plane, not the wings and tail, because the data came only from planes that returned; planes hit in the unmarked areas did not make it back.
Why is it important to use random sampling methods?
To ensure that the sample is representative of the population and to avoid biases in the results.
What is the difference between random and haphazard sampling?
Random sampling is systematic and ensures equal chance of selection, while haphazard sampling is non-systematic and can lead to bias.
What is snowball sampling?
A method where existing study subjects recruit future subjects from among their acquaintances, which often produces a sample of people with similar experiences.
How can a confounding variable affect the interpretation of data?
It can offer a plausible explanation for an association between the explanatory and response variable.
What is the difference between causation and association?
Causation implies a direct effect, while association indicates a correlation that may not be due to direct influence.
What is an observational study?
A study in which the researcher doesn't actively control the value of any variable, but simply measures and records the values as they naturally exist.
What is an experiment in the context of research?
A study in which the researcher actively controls the level of one or more of the explanatory variables.
What can be concluded from the study where men rated women on different background colors?
Causation can probably be concluded as it's an experiment, and possible confounding variables have been reduced.
Why can observational studies rarely establish causation?
Because there are almost always confounding variables present.
What is a randomized experiment?
An experiment where the value of the explanatory variable for each unit is determined randomly before measuring the response variable.
What is the purpose of a control group in an experiment?
To provide a comparison group to determine whether a treatment is effective.
What is a placebo?
A fake treatment that resembles the active treatment as much as possible.
Why is blinding important in experiments?
To prevent participants and researchers from knowing which treatment is being administered, reducing bias.
What is double-blinding?
An experimental design where neither the participants nor the researchers know which treatment the patients are receiving.
What confounding variable might explain the link between ice cream sales and drowning deaths?
Temperature.
What confounding variable could explain the relationship between beef and pork consumption?
Increase in worldwide wealth/GDP.
What confounding variable is associated with yacht owners buying sports cars?
Wealth.
What confounding variable might explain higher air pollution in paved areas?
More cars.
What confounding variable is linked to cancer rates near high-voltage power lines?
Income: houses under power lines tend to belong to lower-income households, which are less likely to receive cancer treatment.
What is randomization in the context of experiments?
The process of randomly assigning values of the explanatory variable to avoid confounding variables.
How does random assignment help in experiments?
It ensures that treatment groups look similar, minimizing the association with other variables.
What is the significance of well-designed experiments regarding confounding variables?
Confounding variables are eliminated if the experiment is well designed.
What is the role of randomization in treatment assignment?
To ensure that treatment groups are comparable and that the explanatory variable is not associated with other variables.
What is the importance of having two different treatments in a randomized experiment?
To compare the effects of different treatments and establish effectiveness.
What was the focus of the study conducted on 60 men with PIN lesions?
The study aimed to investigate the relationship between green tea and prostate cancer.
What was the daily dosage of green tea extract given to half of the participants in the study?
600 mg
What type of study design was used in the green tea cancer study?
Double-blind study
What was the outcome of the green tea extract study after one year?
Only 1 person taking green tea developed cancer, compared to 9 in the placebo group.
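A rough Python sketch of how those counts compare as proportions. It assumes the 60 men were split evenly into two groups of 30, which follows from "half of the participants" but is not stated as a group size in these cards.

```python
# Assumption: 60 men split evenly, so 30 per group.
n_per_group = 60 // 2

p_green_tea = 1 / n_per_group  # proportion who developed cancer on green tea
p_placebo = 9 / n_per_group    # proportion who developed cancer on placebo

difference = p_placebo - p_green_tea

print(round(p_green_tea, 3), round(p_placebo, 3), round(difference, 3))
```

Because treatments were assigned in a double-blind experiment, a difference in proportions like this can be read as evidence about the treatment rather than about a confounding variable.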
What must approve an experiment for it to take place?
Human ethics committee
What are the two types of randomised experiments mentioned?
Randomised Comparative Experiment and Matched Pairs Experiment.
How does a randomised comparative experiment work?
Cases are randomly assigned to different treatment groups and results are compared on the response variable(s).
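Random assignment to two groups can be sketched as follows. The 20 case labels, the group names, and the seed are hypothetical and chosen only for illustration.

```python
import random

# Hypothetical cases, labelled case_1 .. case_20.
cases = [f"case_{i}" for i in range(1, 21)]

# A fixed seed is used only so the sketch is repeatable.
rng = random.Random(1)

# Shuffle a copy, then split it in half: chance alone decides the groups.
shuffled = cases[:]
rng.shuffle(shuffled)

half = len(shuffled) // 2
treatment_group = shuffled[:half]
control_group = shuffled[half:]

print(treatment_group)
print(control_group)
```

Since the split depends only on the shuffle, the explanatory variable (group membership) should not be associated with any other characteristic of the cases.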
What is the key feature of a matched pairs experiment?
Each case receives both treatments in random order or cases are paired in some obvious way.
What is the ideal sampling method for a randomised experiment?
A random sample.
What is the relationship between association and causation in observational studies?
Association does not imply causation; confounding variables often exist.
What is a necessary component of randomised experiments to infer causality?
A control or comparison group.
What effect must be considered in randomised experiments?
The placebo effect.
What is the purpose of descriptive statistics?
To summarise and visualise data.
What does a frequency table show?
The number of cases that fall in each category.
What does a relative frequency table show?
The proportion of cases that fall in each category, summing to 1.
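A small Python sketch of both tables, built from a made-up list of five responses (the categories and counts are illustrative assumptions).

```python
from collections import Counter

# Hypothetical categorical data: one response per case.
responses = ["single", "relationship", "single", "single", "relationship"]

# Frequency table: number of cases in each category.
frequency = Counter(responses)

# Relative frequency table: proportion of cases in each category.
n = len(responses)
relative_frequency = {category: count / n for category, count in frequency.items()}

print(frequency)
print(relative_frequency)
```

The proportions in `relative_frequency` add up to 1 because every case falls in exactly one category.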
What is the main feature that distinguishes a bar chart from a histogram?
A bar chart has gaps between the bars, while a histogram does not.
Why are pie charts considered misleading?
They force viewers to interpret area in relation to other categories, which can distort perception.
What is the formula to find the difference in proportions of students in a relationship who are female and single students who are female?
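pR - pS, where pR is the proportion of students in a relationship who are female and pS is the proportion of single students who are female.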
What does a segmented bar chart display?
It shows the proportions of different categories stacked on top of each other.
What is the characteristic of a 100% stacked bar chart?
The bars are forced to total 100%.
What are the two forms of vitamin D injections used for kidney dialysis patients?
Calcitriol and paricalcitol.
How many dialysis patients were examined in the study regarding vitamin D injections?
67,000 patients.
What is the significance of randomisation in experiments?
It helps to prevent confounding variables, allowing for causal inferences.
What is the importance of blinding in experiments?
It reduces bias by ensuring that participants do not know which treatment they are receiving.
What percentage of patients survived after three years when treated with paricalcitol?
58.7%
What percentage of patients survived after three years when treated with calcitriol?
51.5%
What is the total number of patients in the study?
67,000
What is the total number of patients who survived?
36,918
What is the total number of patients who died?
30,082
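A quick Python check of the figures quoted in these cards. Only the totals and the two survival percentages are given here, so the per-treatment counts are not reconstructed; the variable names are illustrative.

```python
total_patients = 67_000
survived = 36_918
died = 30_082

# The two outcome totals should account for every patient.
totals_consistent = (survived + died == total_patients)

# Overall proportion surviving, ignoring treatment.
overall_survival = survived / total_patients

# Difference in three-year survival, in percentage points
# (paricalcitol minus calcitriol).
difference_points = 58.7 - 51.5

print(totals_consistent)
print(round(overall_survival, 3))
print(round(difference_points, 1))
```

The overall survival proportion (about 0.551) sits between the two treatment percentages, as it should for a combined group.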
What is a two-way table used for?
To display the relationship between two categorical variables.
What type of chart is used to visualize one categorical variable?
Bar chart or pie chart.
What are the three aspects of the distribution of a quantitative variable?
Shape, center, and variability.
What is a dotplot?
A visualization where each dot represents one or more cases of a quantitative variable.
How does a histogram differ from a bar chart?
A histogram is for quantitative data with a numeric x-axis, while a bar chart is for categorical data.
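To make the histogram idea concrete, here is a text-only Python sketch that groups a quantitative variable into equal-width bins along a numeric axis. The values and the bin width of 10 are assumptions chosen for illustration.

```python
from collections import Counter

# Hypothetical quantitative data.
values = [12, 15, 21, 24, 27, 33, 38, 39, 41]
bin_width = 10

# Map each value to the lower edge of its bin, then count cases per bin.
bin_counts = Counter((value // bin_width) * bin_width for value in values)

# Print bins in numeric order; adjacent bins touch, so there are no gaps.
for lower_edge in sorted(bin_counts):
    upper_edge = lower_edge + bin_width
    print(f"[{lower_edge}, {upper_edge}): {'*' * bin_counts[lower_edge]}")
```

A bar chart would instead count cases per category label, and the order of the bars would carry no numeric meaning.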