unit 2 exam

Contingency Table
A table that shows how the individuals are distributed along each variable.

Marginal Distribution
The distribution of one variable alone, read from the row totals or column totals of a contingency table.

Conditional Distribution
Shows the distribution of one variable for just those cases that satisfy a condition on another variable. Example: Event B given that Event A has occurred.
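A minimal sketch of how marginal and conditional distributions come out of a contingency table. The table, its categories, and its counts are hypothetical and chosen only for illustration.

```python
# Hypothetical counts: rows are class year, columns are favorite drink.
table = {
    "Freshman": {"Coffee": 30, "Tea": 20},
    "Senior": {"Coffee": 40, "Tea": 10},
}
columns = ["Coffee", "Tea"]

grand_total = sum(sum(cells.values()) for cells in table.values())

# Marginal distribution of the row variable: row totals / grand total.
row_marginal = {r: sum(cells.values()) / grand_total for r, cells in table.items()}

# Marginal distribution of the column variable: column totals / grand total.
col_marginal = {c: sum(table[r][c] for r in table) / grand_total for c in columns}

# Conditional distribution of drink, given that class year is "Senior":
# divide each cell in that row by the row total, not the grand total.
senior_total = sum(table["Senior"].values())
drink_given_senior = {c: table["Senior"][c] / senior_total for c in columns}

print(row_marginal)
print(col_marginal)
print(drink_given_senior)
```

The marginal distributions use the grand total as the denominator, while the conditional distribution uses only the total of the row that satisfies the condition.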

Chapter 8 - Producing Data: Sampling

Definitions to know

Population
The entire group of individuals or instances about whom we hope to learn.

Sample
A (representative) subset of a population, examined in the hope of learning about the population.

Sample Survey
A study that asks questions of a sample drawn from some population in the hope of learning something about the entire population.

Randomization
The best defense against bias is randomization, in which each individual is given a fair, random chance of selection.

Census
A sample that consists of the entire population.

Population Parameter
A numerically valued attribute of a model for a population. Example: mean income of all employed people in the USA

This version: October 21, 2023, by Anita Kursell. May not include everything that could possibly be tested. To be used as an additional reference alongside studying Chapters 6, 8, 9, 12 - 14.

Sample statistic
Statistics or sample statistics are values that are calculated for sample data. Example: mean income of employed people in a representative sample

Sampling Frame
A list of individuals from whom the sample is drawn. Individuals who may be in the population of interest, but who are not in the sampling frame cannot be included in any sample.

Simple Random Sample (SRS)
An SRS of size n is a sample in which each set of n elements in the population has an equal chance of selection.

Stratified Random Sampling
A sampling design in which the population is divided into several subpopulations (strata) and random samples are then drawn from each stratum. Try to make strata as homogeneous as possible.

Cluster Sampling
Entire groups, or clusters, are chosen at random. Clusters are heterogeneous.

Multistage Sampling
Sampling schemes that combine several sampling methods.

Systematic sample
A sample drawn by selecting individuals systematically from a sampling frame.
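The sampling designs above can be sketched with Python's standard `random` module. The sampling frame, the stratum labels, the sample sizes, and the cluster size below are all hypothetical; this is an illustration of the definitions, not a prescribed procedure.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical sampling frame: 100 students, 60 freshmen and 40 seniors.
frame = [{"id": i, "year": "Freshman" if i % 5 < 3 else "Senior"} for i in range(100)]

# Simple random sample: every set of 10 individuals is equally likely.
srs = rng.sample(frame, 10)

# Stratified random sample: draw an SRS separately within each stratum.
strata = {}
for person in frame:
    strata.setdefault(person["year"], []).append(person)
stratified = [p for group in strata.values() for p in rng.sample(group, 5)]

# Systematic sample: random starting point, then every k-th individual.
k = 10
start = rng.randrange(k)
systematic = frame[start::k]

# Cluster sample: split the frame into groups and choose entire groups at random.
clusters = [frame[i:i + 20] for i in range(0, len(frame), 20)]
cluster_sample = [p for cluster in rng.sample(clusters, 2) for p in cluster]

print(len(srs), len(stratified), len(systematic), len(cluster_sample))
```

Combining two of these steps, for example choosing clusters at random and then taking an SRS within each chosen cluster, would be one form of multistage sampling.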

Voluntary response bias
Bias introduced to a sample when individuals can choose on their own whether to participate in the sample.

Undercoverage bias
Biases the sample in a way that gives a part of the population less representation in the sample than it has in the population.

Nonresponse bias
Bias introduced when a large fraction of those sampled fails to respond.

Response bias
Anything in a survey design that influences responses.

Chapter 9 - Producing Data: Experiments

Studies

(a) Observational Study
Study based on data in which no manipulation of factors has been employed.

(b) Retrospective Study
Observational study in which subjects are selected and then their previous conditions or behaviors are determined. Based on historical data and memories.

(c) Prospective Study
Observational study in which subjects are followed to observe future outcomes. Because no treatments are deliberately applied, it is not an experiment.

Matching in Studies
In a retrospective or prospective study, participants who are similar in ways not under study may be matched and then compared with each other on the variables of interest.

Experiments

(a) Factor
Variable whose levels are manipulated by the experimenter.

(b) Response Variable
Variable whose values are compared across different treatments.

(c) Experiment
Manipulates factor levels to create treatments, randomly assigns subjects to these treatment levels, and then compares the responses of the subject groups across treatment levels. Tries to assess effects of treatments.

(d) Levels
Specific values that the experimenter chooses for a factor.

(e) Treatment
Process, intervention, or other controlled circumstance applied to randomly assigned experimental units.

(f) Block
When groups of experimental units are similar in a way that is not a factor under study, it is often a good idea to gather them together into blocks and then randomize the assignment of treatments within each block.

Randomization through Random Assignment
An experiment must assign experimental units (individuals) to treatment groups using some form of randomization.

Principles of Experimental Design

(a) Control
Control aspects of the experiment that we know may have an effect on the response, but that are not the factors being studied.

(b) Randomize
Randomize subjects to treatments to even out effects that we cannot control.

(c) Replicate
Replicate over as many subjects as possible.

(d) Block
Reduce the effects of identifiable attributes of the subjects that cannot be controlled.

Statistically Significant
When an observed difference is too large for us to believe that it is likely to have occurred naturally, we consider the difference to be statistically significant.

Types of Experiments

(a) Completely Randomized Design (CRD)
All experimental units have an equal chance of receiving any treatment.

(b) Randomized Block Design (RBD)
Participants are randomly assigned to treatments within each block.

(c) Matched Pair Design
Participants are paired with similar subjects (often the same subject), one of the pair is given the treatment, and the differences in the response variable are compared.
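A small sketch of random assignment under a completely randomized design and under a randomized block design. The units, the blocking attribute ("young"/"old"), the treatment names, and the helper `assign` are hypothetical and exist only for this illustration.

```python
import random

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical experimental units, each with a blocking attribute.
units = [{"id": i, "block": "young" if i < 6 else "old"} for i in range(12)]
treatments = ["drug", "placebo"]

def assign(group):
    """Shuffle a group, then deal the treatments out evenly."""
    shuffled = group[:]
    rng.shuffle(shuffled)
    return {u["id"]: treatments[i % len(treatments)] for i, u in enumerate(shuffled)}

# Completely randomized design: randomize over all units at once.
crd = assign(units)

# Randomized block design: randomize separately within each block.
rbd = {}
for block in ("young", "old"):
    rbd.update(assign([u for u in units if u["block"] == block]))

print(crd)
print(rbd)
```

In the blocked version each block is guaranteed an even split between treatments, whereas the completely randomized version only guarantees an even split overall.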

Control Treatment
Baseline treatment.

Control Group
Experimental units assigned to a baseline treatment level, typically either the default treatment or a placebo treatment. Their responses provide a basis for comparison.

Blinding
An individual associated with an experiment is blinded when they are not aware of how subjects have been allocated to treatment groups.

Single/Double Blind
Two classes of individuals can affect the outcome of an experiment:

• Those who could influence the results.
• Those who evaluate the results.

Single Blind: when everyone in either one of the two classes above is blinded. Double Blind: when everyone in both classes is blinded.

Placebo
A treatment known to have no effect.

Placebo Effect
The tendency of human subjects to show a response even when administered a placebo.

Potential Problems

(a) Confounding
When the levels of one factor are associated with the levels of another factor in such a way that their effects cannot be separated, we say that these two factors are confounded.

(b) Lurking Variable
A variable associated with both y and x that makes it appear that x may be causing y.

In summary, the best experiments are usually 1) Randomized, 2) Comparative, 3) Double-blind, and 4) Placebo-controlled.

Chapter 12 - Introducing Probability and 13 - General Rules of Probability

Definitions to know

Random Phenomenon
A phenomenon is random if we know what outcomes could happen, but not which particular values will happen.

Trial
A single attempt or realization of a random phenomenon.

Outcome
The value measured, observed, or reported for an individual instance of a trial.

Event
A collection of outcomes. Usually, we identify events so that we can attach probabilities to them. Denote events with bold capital letters like A, B, etc.

Sample Space
The collection of all possible outcome values. The collection of values in the sample space has a probability of 1. Denote by S or Ω.

Law of Large Numbers (LLN)
This law states that the long-run relative frequency of an event’s occurrence in repeated independent trials gets closer and closer to the true probability of the event as the number of trials increases.
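The Law of Large Numbers can be seen in a quick simulation. This is only a sketch: the success probability 0.5 (a fair coin), the trial counts, and the helper `relative_frequency` are assumptions made for illustration.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

def relative_frequency(n_trials, p=0.5):
    """Fraction of successes in n_trials independent trials with P(success) = p."""
    successes = sum(rng.random() < p for _ in range(n_trials))
    return successes / n_trials

# The relative frequency tends to settle near p as the number of trials grows.
for n in (10, 100, 1_000, 100_000):
    print(n, relative_frequency(n))
```

With few trials the relative frequency can be far from 0.5; with many trials it is typically very close, which is the behavior the law describes.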

Independence (informal definition)
2 events are independent if learning that one event occurs does not change the probability that the other event occurs.

Probability
A number between 0 and 1 that reports the likelihood of that event’s occurrence. Write P(A) for the probability of event A.

Empirical Probability
When the probability comes from the long-run relative frequency of the event’s occurrence.

Theoretical Probability
When the probability comes from a model (such as equally likely outcomes).
$P(A) = \dfrac{\#\text{ outcomes in } A}{\#\text{ all possible outcomes}}$

Personal (or subjective) Probability
When the probability is subjective and represents your personal degree of belief.

Legitimate Assignment of Probabilities
An assignment of probabilities to outcomes is legitimate if

• each probability is greater than or equal to 0 and less than or equal to 1
• the sum of the probabilities = 1
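The two conditions for a legitimate assignment can be checked mechanically. A minimal sketch follows; the function name `is_legitimate` and both example models are hypothetical.

```python
import math

def is_legitimate(probabilities):
    """Check both conditions for a probability assignment over outcomes."""
    in_range = all(0 <= p <= 1 for p in probabilities.values())
    # isclose guards against floating-point rounding in the sum.
    sums_to_one = math.isclose(sum(probabilities.values()), 1.0)
    return in_range and sums_to_one

fair_die = {face: 1 / 6 for face in range(1, 7)}
bad_model = {"win": 0.7, "lose": 0.4}  # probabilities add up to 1.1

print(is_legitimate(fair_die))   # True
print(is_legitimate(bad_model))  # False
```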

4.2 Chapter 13 - Rules on Probability

1. For all events A, $0 \le P(A) \le 1$.

2. Probability Assignment Rule

• $P(S) = 1$
• The set of all possible outcomes of a trial must have probability = 1.

3. Complement Rule

• The set of outcomes that are not in the event A is the complement $A^C$.
• $P(A^C) = 1 - P(A)$, where $A^C$ is the complement of A.
• The probability of an event not occurring is 1 minus the probability that it occurs.

4. Addition Rule

• For 2 disjoint events A and B, the probability that one or the other occurs is the sum of the probabilities of the two events.
• $P(A \text{ or } B) = P(A) + P(B)$, where A and B are disjoint.
• Disjoint means mutually exclusive; there are no outcomes in common.

5. Multiplication Rule

• For two independent events A and B, the probability that both A and B occur is the product of the probabilities of the two events.
• $P(A \text{ and } B) = P(A)P(B)$, where A and B are independent.

6. General Addition Rule
For any two events A and B, the probability of A or B is
$P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B)$.
This rule does NOT require disjoint events.

7. Conditional Probability
The conditional probability of the event B given the event A has occurred is
$P(B \mid A) = \dfrac{P(A \text{ and } B)}{P(A)}$.

8. General Multiplication Rule
For any two events A and B, the probability of A and B is
$P(A \text{ and } B) = P(A)P(B \mid A)$.
This rule does NOT require independence.

9. Independent
Events A and B are independent when $P(B \mid A) = P(B)$. Note: independent is not the same as disjoint.
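The rules above can be checked on a small sample space. The sketch below assumes one roll of a fair die with equally likely outcomes; the events A and B and the helper `P` are hypothetical choices made for illustration.

```python
from fractions import Fraction

# Sample space for one roll of a fair die (equally likely outcomes).
S = {1, 2, 3, 4, 5, 6}

def P(event):
    """Theoretical probability: outcomes in the event / all possible outcomes."""
    return Fraction(len(event & S), len(S))

A = {2, 4, 6}     # the roll is even
B = {1, 2, 3, 4}  # the roll is at most 4

p_not_a = 1 - P(A)                 # Complement Rule
p_a_or_b = P(A) + P(B) - P(A & B)  # General Addition Rule
p_b_given_a = P(A & B) / P(A)      # Conditional probability
independent = p_b_given_a == P(B)  # definition of independence

print(p_not_a, p_a_or_b, p_b_given_a, independent)
```

Here A and B share the outcomes 2 and 4, so they are not disjoint, yet P(B | A) equals P(B). The pair is a concrete case of events that are independent without being disjoint.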

10. Tree Diagram
A display of conditional events or probabilities that is helpful in thinking through conditioning.

11. Bayes Rule
$P(B \mid A) = \dfrac{P(A \mid B)P(B)}{P(A \mid B)P(B) + P(A \mid B^C)P(B^C)}$.

Since $P(A \mid B)P(B) + P(A \mid B^C)P(B^C) = P(A)$, this may be simplified to read $P(B \mid A)P(A) = P(A \mid B)P(B)$.

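4.3 Tree Diagram Example and Interpretations of Every Node

Example Probabilities are Given

[Tree diagram: the first split is A (0.6) or Not A (0.4). From A, the branches are B (0.8) and Not B (0.2). From Not A, the branches are B (0.2) and Not B (0.8).]

Multiplying along each path of the tree gives the probability at the end of that path:

P(A and B) = (0.6)(0.8) = 0.48
P(A and Not B) = (0.6)(0.2) = 0.12
P(Not A and B) = (0.4)(0.2) = 0.08
P(Not A and Not B) = (0.4)(0.8) = 0.32

Here are the mathematical interpretations of the numbers in the tree diagram:

P(A) = 0.6, P(Not A) = 0.4
P(B | A) = 0.8, P(Not B | A) = 0.2
P(B | Not A) = 0.2, P(Not B | Not A) = 0.8
P(A and B) = 0.48, P(A and Not B) = 0.12
P(Not A and B) = 0.08, P(Not A and Not B) = 0.32

Calculate things like P(A | B) using Bayes Rule:

$P(A \mid B) = \dfrac{P(B \mid A)P(A)}{P(B \mid A)P(A) + P(B \mid A^C)P(A^C)} = \dfrac{P(B \mid A)P(A)}{P(B \mid A)P(A) + P(B \mid \text{Not } A)P(\text{Not } A)} = \dfrac{(0.8)(0.6)}{(0.8)(0.6) + (0.2)(0.4)} = \dfrac{0.48}{0.56} = 0.8571.$

Calculate things like P(B) using the Multiplication Rule but rearranging it:

$P(B \text{ and } A) = P(B)P(A \mid B) \implies \dfrac{P(B \text{ and } A)}{P(A \mid B)} = P(B).$

Now,

$P(B) = \dfrac{P(B \text{ and } A)}{P(A \mid B)} = \dfrac{0.48}{0.8571} \approx 0.56,$

which agrees with adding the two paths that end in B: P(A and B) + P(Not A and B) = 0.48 + 0.08 = 0.56.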
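The tree diagram arithmetic can be reproduced in a few lines. This sketch uses only the probabilities given in the example; the variable names are chosen here for readability.

```python
# Probabilities read off the tree diagram.
p_a = 0.6
p_b_given_a = 0.8
p_b_given_not_a = 0.2

# Multiply along each branch (General Multiplication Rule).
p_a_and_b = p_a * p_b_given_a                # 0.48
p_not_a_and_b = (1 - p_a) * p_b_given_not_a  # 0.08

# Add the two paths that end in B to get P(B).
p_b = p_a_and_b + p_not_a_and_b

# Bayes Rule: reverse the direction of the conditioning.
p_a_given_b = p_a_and_b / p_b

print(round(p_b, 4), round(p_a_given_b, 4))  # 0.56 0.8571
```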

5 Chapter 14 - Binomial Distribution

5.1 Definitions to know

1. Binomial Distribution
A sequence of trials has a binomial distribution if

• There are exactly 2 possible outcomes (success and failure)
• Probability of success, p, is constant
• Trials are independent
• There are a fixed number of trials, n.

2. Success/Failure Condition
A Binomial Model is approximately Normal if we expect at least 10 successes and 10 failures, i.e. $np \ge 10$ and $n(1 - p) \ge 10$.

5.2 Binomial Model:

• $P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k}$

• $\mu = np$

• $\sigma = \sqrt{np(1 - p)}$

where

• $\binom{n}{k} = \dfrac{n!}{k!(n - k)!}$

• $n! = n(n - 1)(n - 2) \cdots (1)$
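A short sketch of the Binomial Model formulas and the Success/Failure Condition, using Python's standard `math` module. The function names and the example values (10 flips of a fair coin) are assumptions made for illustration.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k): n choose k, times p^k, times (1 - p)^(n - k)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def binomial_mean_sd(n, p):
    """Mean np and standard deviation sqrt(np(1 - p))."""
    return n * p, math.sqrt(n * p * (1 - p))

def normal_approx_ok(n, p):
    """Success/Failure Condition: expect at least 10 successes and 10 failures."""
    return n * p >= 10 and n * (1 - p) >= 10

n, p = 10, 0.5  # hypothetical example: 10 flips of a fair coin
print(binomial_pmf(3, n, p))        # 0.1171875
print(binomial_mean_sd(n, p))
print(normal_approx_ok(n, p))       # False, only 5 expected successes
print(normal_approx_ok(100, 0.5))   # True
```

Summing `binomial_pmf(k, n, p)` over k = 0, ..., n should give 1, which is a quick sanity check that the model is a legitimate probability assignment.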
