Ch. 10-12 Terms

  • Observational study

    • A study based on data in which no factors are manipulated

  • Retrospective study

    • Observational study where subjects are selected and their previous conditions/behaviors are determined

    • Usually not based on random samples; focuses on estimating differences between groups or associations between variables

  • Prospective study

    • Observational study where subjects are followed to observe future outcomes

    • No treatments are deliberately applied, so not an experiment

    • Focus on estimating differences among groups that might appear as the groups are followed during the course of the study

  • Experiment

    • Manipulates factor levels to create treatments, RANDOMLY ASSIGNS subjects to these treatment levels, then compares responses of subject groups across treatment levels

  • Random assignment

    • To be valid, an experiment must assign experimental units to treatment groups at random

  • Factor

    • A variable whose values are manipulated by the experimenter

    • Experiments attempt to discover the effects of different factor levels on experimental units

  • Response variable

    • A variable whose values are compared across different treatments

    • In a randomized experiment, large response differences can be attributed to the effect of differences in treatment level

  • Experimental units

    • Individuals on whom an experiment is performed

    • Usually SUBJECTS or PARTICIPANTS

  • Level

    • The specific values the experimenter chooses for factors

  • Treatment

    • The process/intervention/other controlled circumstance applied to randomly assigned experimental units

    • Treatments are different levels of a single factor or combinations of levels of two or more factors

  • Principles of experimental design

    • CONTROL aspects of the experiment that we know may have an effect on the response, but that are not the factors being studied

    • RANDOMIZE subjects to treatments to even out effects we cannot control

    • REPLICATE over as many subjects as possible. Try to replicate with different parts of a population

    • BLOCK to reduce the effects of identifiable attributes of the subjects that may affect their responses but cannot be controlled

  • Completely randomized design

    • All experimental units should have an equal chance of receiving any treatment
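
A completely randomized design can be sketched in a few lines of Python. This is only an illustration: the function name `completely_randomized_design`, the subject labels, and the three treatment names are made up for the example, and equal-sized groups are an assumption rather than a requirement.

```python
import random

def completely_randomized_design(units, treatments, seed=None):
    """Shuffle all units, then deal them out to treatments in turn,
    so every unit has the same chance of receiving any treatment."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    groups = {t: [] for t in treatments}
    for i, unit in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(unit)
    return groups

# Hypothetical subjects and treatments, for illustration only
subjects = [f"subject_{i}" for i in range(1, 13)]
assignment = completely_randomized_design(
    subjects, ["placebo", "low dose", "high dose"], seed=1
)
for treatment, group in assignment.items():
    print(treatment, group)
```

Shuffling first and dealing afterward keeps the group sizes balanced while still leaving the assignment to chance.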

  • Statistically significant

    • When an observed difference is too large for us to think it could be caused by chance, it is statistically significant
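
One way to judge whether a difference is "too large to be caused by chance" is a randomization test: reshuffle the group labels many times and see how often chance alone produces a difference at least as large as the observed one. The sketch below is a minimal illustration; the function name `randomization_test` and the response values are invented for the example, not data from any study.

```python
import random
from statistics import mean

def randomization_test(group_a, group_b, reps=10_000, seed=None):
    """Estimate how often random relabeling gives a difference in means
    at least as extreme as the one actually observed."""
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(reps):
        rng.shuffle(pooled)  # pretend the treatment labels were arbitrary
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            at_least_as_extreme += 1
    return observed, at_least_as_extreme / reps

# Made-up responses for a treatment group and a control group
treatment = [24, 27, 30, 29, 31, 26]
control = [20, 22, 25, 21, 23, 24]
observed, proportion = randomization_test(treatment, control, seed=7)
print(f"observed difference: {observed:.2f}")
print(f"proportion of shuffles at least as extreme: {proportion:.4f}")
```

A small proportion suggests that chance alone rarely produces a difference this large, which is what "statistically significant" means.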

  • Control group

    • The experimental units assigned to a baseline treatment level (either the default treatment or no treatment)

    • Their responses provide a basis for comparison

  • Blinding

    • An individual associated with an experiment who is not aware of how subjects have been allocated to treatment groups is said to be blinded

  • Single blind, double blind

    • Two classes of individuals can affect the outcome of an experiment: those who can influence the results (subjects, technicians) and those who evaluate the results (judges, physicians)

    • When everyone in both classes is blinded, it's double blind

    • When everyone in only one of the classes is blinded, it's single blind

  • Placebo

    • Treatment known to have no effect, administered to one group so all groups experience same conditions

    • Only by comparing with a placebo can we be sure the observed effect of a treatment is not due simply to placebo effect

  • Blocking

    • When subgroups of experimental units differ in ways that may affect their responses to treatments, we isolate them by blocking

    • We can isolate the variability attributable to differences between blocks so we can see differences caused by treatment more clearly

  • Randomized block design

    • Subjects are randomly assigned to treatments only within blocks
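
A randomized block design can be sketched as "group by block, then randomize inside each block". The example below is a hedged illustration: the function name `randomized_block_design`, the plant labels, and the sunny/shady blocking variable are all assumptions made up for the sketch.

```python
import random
from collections import defaultdict

def randomized_block_design(units, block_of, treatments, seed=None):
    """Group units into blocks, then randomly assign treatments
    separately within each block."""
    rng = random.Random(seed)
    blocks = defaultdict(list)
    for unit in units:
        blocks[block_of(unit)].append(unit)
    assignment = {}
    for block, members in blocks.items():
        rng.shuffle(members)  # randomization happens only inside the block
        for i, unit in enumerate(members):
            assignment[unit] = (block, treatments[i % len(treatments)])
    return assignment

# Hypothetical plants, blocked by growing location
plants = [(f"p{i}", "sunny" if i <= 6 else "shady") for i in range(1, 13)]
result = randomized_block_design(
    plants, block_of=lambda p: p[1], treatments=["fertilizer", "none"], seed=2
)
for plant, (block, treatment) in sorted(result.items()):
    print(plant[0], block, treatment)
```

Because every block receives every treatment, differences between blocks cannot be mistaken for differences between treatments.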

  • Matching

    • In a retrospective or prospective study, subjects who are similar in ways not under study may be matched and compared on variables of interest

    • Matching (like blocking) reduces unwanted variation

  • Confounding

    • When the levels of one factor are associated with the levels of another factor in such a way that their effects cannot be separated

  • Population

    • Entire group of individuals about whom we wish to learn

  • Sample

    • Representative subset of a population, examined in hopes of learning about the population

  • Sample survey

    • Study that asks questions of a sample drawn from some population in hopes of learning something about whole population

    • Polls taken to assess voter preference are common sample surveys

  • Bias

    • Any systematic failure of a sampling method to represent its population

    • Biased sampling methods over/underestimate parameters

    • Near impossible to recover from bias

    • Common examples

      • Relying on voluntary response

      • Undercoverage of population

      • Nonresponse bias

      • Response bias

  • Randomization

    • Each individual is given a fair/random chance of selection

    • Best defense against bias

  • Sample size

    • Number of individuals in a sample

    • The sample size, not the fraction of the population sampled, determines how well the sample represents the population

  • Census

    • Sample that is the whole population

  • Population parameter

    • Numerically valued attribute of a model for a population

    • Never really know true value of it, but we can estimate

  • Statistic/sample statistic

    • Statistics are values calculated for sampled data

  • Representative

    • Sample is representative if the statistics computed from it accurately reflect the corresponding population parameters

  • Simple random sample (SRS)

    • Simple random sample of sample size n is a sample in which each set of n elements in the population has an equal chance of being selected
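
In Python, `random.sample` draws without replacement so that every set of n elements is equally likely, which matches the SRS definition. The sampling frame below (200 made-up student IDs) is an assumption for illustration.

```python
import random

# Hypothetical sampling frame of 200 students
frame = [f"student_{i:03d}" for i in range(1, 201)]

rng = random.Random(42)         # seeded only so the example is repeatable
srs = rng.sample(frame, k=10)   # every set of 10 students is equally likely
print(srs)
```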

  • Sampling frame

    • A list of individuals from whom the sample is drawn

  • Sampling variability

    • The natural tendency of randomly drawn samples to differ from one another

    • Sometimes called sampling error, but not really an error

  • Stratified random sample

    • Sampling design where population is divided into subpopulations (strata). Individuals are drawn from each stratum (usually in representative proportion) to reduce variability
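
A stratified random sample with proportional allocation might look like the sketch below. The function name `stratified_sample` and the class-year strata are hypothetical, and rounding the stratum sizes is a simplifying assumption: for other numbers the rounded sizes may not add up to exactly n.

```python
import random

def stratified_sample(strata, n, seed=None):
    """Take an SRS from each stratum, sized in proportion to the stratum."""
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        k = round(n * len(members) / total)  # proportional allocation
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical school of 1000 students, stratified by class year
strata = {
    "freshmen": [f"fr_{i}" for i in range(400)],
    "sophomores": [f"so_{i}" for i in range(300)],
    "juniors": [f"jr_{i}" for i in range(200)],
    "seniors": [f"sr_{i}" for i in range(100)],
}
sample = stratified_sample(strata, n=50, seed=3)
for name, members in sample.items():
    print(name, len(members))
```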

  • Cluster sample

    • Groups (clusters) are chosen at random to be sampled

    • Done as a matter of convenience, practicality, or cost
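
For contrast with stratified sampling, a cluster sample randomly chooses whole groups and then includes everyone in the chosen groups. The homerooms and student labels below are invented for the sketch.

```python
import random

# Hypothetical clusters: 20 homerooms of 25 students each
clusters = {
    f"homeroom_{c}": [f"hr{c}_student_{i}" for i in range(1, 26)]
    for c in range(1, 21)
}

rng = random.Random(4)
chosen = rng.sample(sorted(clusters), k=3)          # pick 3 clusters at random
sample = [s for c in chosen for s in clusters[c]]   # take everyone in them
print(chosen, len(sample))
```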

  • Multistage sample

    • Sampling designs that combine several sampling methods

  • Systematic sample

    • Sample drawn by selecting individuals systematically from a sampling frame

    • If no relationship between order of sampling frame and variables of interest, can be representative
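
A systematic sample is commonly drawn by picking a random starting point and then taking every k-th individual from the frame. The sketch below assumes that approach; the function name `systematic_sample` and the customer list are hypothetical.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Pick a random start, then take every k-th individual from the frame."""
    rng = random.Random(seed)
    k = len(frame) // n        # sampling interval
    start = rng.randrange(k)   # random start within the first interval
    return frame[start::k][:n]

# Hypothetical frame of 200 customers, listed in order of arrival
frame = [f"customer_{i}" for i in range(200)]
picked = systematic_sample(frame, n=10, seed=6)
print(picked)
```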

  • Pilot survey

    • Small trial run of survey to check if questions are good

    • Reduces error caused by ambiguous questions

  • Voluntary response bias

    • Bias introduced to sample when individuals can choose on their own whether to participate in the sample

    • Samples based on voluntary response are always invalid

  • Convenience sample

    • Sample that consists of individuals who are conveniently available

    • Not representative of population

  • Undercoverage

    • Sampling design that biases sample because it gives some part of the population less representation than it actually has in the population

  • Nonresponse bias

    • Bias introduced when large fraction of those sampled don't respond

    • Those who do respond then are not likely to represent full population

    • Voluntary response bias is a form of this

  • Response bias

    • Anything in a survey design that influences responses

    • Typically from wording of questions

  • Random

    • Outcome is random if we know the possible values it can have, but not which particular value it takes. A random outcome is FREE of human influence

  • Generating random numbers

    • Random numbers are hard to generate, but several internet sites offer an unlimited supply of equally likely random values

  • Simulation

    • A simulation models a real-world situation by using random-digit outcomes to mimic the uncertainty of a response variable of interest

  • Trial

    • The sequence of several components representing events that we are pretending will take place

  • Simulation component

    • A component uses equally likely random digits to model simple random occurrences whose outcomes may not be equally likely

  • Response variable (in a simulation)

    • Values of the response variable record the results of each trial with respect to what we were interested in
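
The simulation terms above (component, trial, response variable) can be tied together in a short sketch. The scenario is an assumption chosen for illustration: each cereal box holds one of three cards, A (20%), B (30%), or C (50%), and we ask how many boxes it takes to collect all three.

```python
import random

def open_box(rng):
    """Component: one equally likely digit 0-9 models opening one box,
    even though the three cards are not equally likely."""
    digit = rng.randrange(10)
    if digit <= 1:
        return "A"   # digits 0-1: 20%
    if digit <= 4:
        return "B"   # digits 2-4: 30%
    return "C"       # digits 5-9: 50%

def run_trial(rng):
    """Trial: keep opening boxes until all three cards are collected.
    Response variable: the number of boxes opened."""
    seen = set()
    boxes = 0
    while len(seen) < 3:
        seen.add(open_box(rng))
        boxes += 1
    return boxes

rng = random.Random(5)
results = [run_trial(rng) for _ in range(1000)]
print("average boxes needed:", sum(results) / len(results))
```

Running many trials and summarizing the response variable gives an estimate of what to expect in the real situation.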
