PPT 1: Descriptive Statistics




Intro to Biostats

  • Why do we use statistics in biology?

    • Most things are probabilistic (data based on probabilities) rather than deterministic (data based on known facts)


Definitions

  • What is the difference between observational and experimental populations?

    • Observational- A finite population, but one that is difficult to count

    • Experimental- A population with an infinite number of members

  • What is a sample survey (or an observational study)?

    • A study of the individuals actually present in a population (under what the investigator can control)

  • What is the difference between an experiment and an observation?

    • Observations are descriptions of patterns

    • Experiments are designed to collect observations according to a plan

  • What is a sample?

    • The portion of a population actually measured

  • What is a sample unit?

    • An individual thing drawn from the population (e.g. an organism or a measurement)

  • What is inference?

    • A generalization from the observations in a sample to the whole population (as samples do not always contain all of the population)

  • What is random sampling?

    • Truly random sampling, in which all members of a population have an equal chance of being chosen

  • What is simple random sampling?

    • A random sample within the whole population

  • What is a stratified random sample?

    • Random samples drawn separately within each group, or stratum (e.g. males vs. females, age 1 vs. age 2)


Biological Variables

  • What is continuous measurement of variables?

    • Any value between two extremes can be selected

  • What are discrete measurements of variables?

    • Only certain fixed values between the extremes are possible (e.g. whole numbers)

  • What are rank variables?

    • Indicate more or less of variables based on their rank (e.g. smallest to greatest)

  • What are qualitative variables?

    • Categorical variables (e.g. male/female, living/dead)

  • What is a rate?

    • The quantity per unit (e.g. per unit of time or mass, such as births per year)

  • What are indices (or index)?

    • Complex derived variables (e.g. condition index: condition of body)

  • What is the difference between accuracy and precision?

    • Accuracy- How close a value is to the true value

    • Precision- How close repeated measurements are to each other

  • What is bias?

    • Systematic departure from the true value


Frequency Distribution

  • What can a frequency distribution show?

    • Location, dispersion, and symmetry

  • What are examples of frequency distributions?

    • Symmetrical unimodal

    • Asymmetrical bimodal

    • Symmetrical uniform

    • Asymmetrical (skewed) unimodal

    • Symmetrical bimodal

    • Extremely skewed

  • What is the difference between absolute and relative frequencies?

    • Absolute- The vertical axis represents the real number of observations

    • Relative- The vertical axis represents the percentage of observations
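
The two kinds of frequency can be tabulated from the same data. A minimal sketch, assuming a made-up sample of egg counts per nest (the values and the variable names are illustrative only, not from the slides):

```python
from collections import Counter

# Hypothetical sample: number of eggs per nest (illustrative values only).
eggs = [2, 3, 3, 4, 4, 4, 5, 5, 6]

# Absolute frequency: how many observations take each value.
absolute = Counter(eggs)

# Relative frequency: the same counts expressed as a percentage of all observations.
n = len(eggs)
relative = {value: count / n * 100 for value, count in absolute.items()}

for value in sorted(absolute):
    print(value, absolute[value], round(relative[value], 1))
```

The shape of the histogram is the same either way; only the scale of the vertical axis changes.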

  • What does ∑ represent?

    • The sum of all of the values (observations) of a variable

  • What is a statistic of location (e.g. the mean)?

    • The position of a sample along a given dimension representing a variable

  • What is the difference between an arithmetic mean and a weighted mean?

    • Arithmetic- The balance point of a distribution. All numbers are treated equally and have equal weight. Most commonly used

    • Weighted- Each value is multiplied by a weight before averaging, so some values count more than others (weights may be based on prevalence or the overall percentage of some variables)
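
A short sketch of the difference, assuming hypothetical site means weighted by the number of animals measured at each site (all names and numbers are made up for illustration):

```python
def arithmetic_mean(values):
    """Every value has equal weight."""
    return sum(values) / len(values)


def weighted_mean(values, weights):
    """Each value counts in proportion to its weight."""
    return sum(w * y for y, w in zip(values, weights)) / sum(weights)


# Hypothetical mean body mass (g) at three sites, and animals measured per site.
site_means = [10.0, 20.0, 30.0]
site_counts = [1, 1, 2]

plain = arithmetic_mean(site_means)                 # 20.0
weighted = weighted_mean(site_means, site_counts)   # 22.5

print(plain, weighted)
```

The weighted mean is pulled toward the third site because twice as many animals were measured there.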

  • Why do we transform our results?

    • We want to reshape the data so that they more closely follow a normal distribution

  • What is the geometric mean (GM)?

    • The back-transformed mean of a log transformed variable (Y becomes logY, and then is changed back to Y)

  • What is a harmonic mean?

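    • The back-transformed mean of a reciprocal-transformed variable (Y becomes 1/Y, the 1/Y values are averaged, and the reciprocal of that average is taken)

    • Used when we take a (highly) skewed distribution and make it closer to normal.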
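
Both means follow the same transform, average, back-transform pattern. A minimal sketch with made-up values (the function names are illustrative; the natural log is assumed, although any log base gives the same geometric mean):

```python
import math


def geometric_mean(values):
    """Average the logs, then back-transform with exp."""
    logs = [math.log(y) for y in values]
    return math.exp(sum(logs) / len(logs))


def harmonic_mean(values):
    """Average the reciprocals, then take the reciprocal of that average."""
    reciprocals = [1 / y for y in values]
    return 1 / (sum(reciprocals) / len(reciprocals))


values = [1.0, 2.0, 4.0]          # hypothetical, all positive

am = sum(values) / len(values)    # arithmetic mean, about 2.33
gm = geometric_mean(values)       # about 2.0
hm = harmonic_mean(values)        # about 1.71

print(am, gm, hm)
```

For positive values that are not all equal, the harmonic mean is smaller than the geometric mean, which is smaller than the arithmetic mean.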

  • What is the median?

    • M or Y

    • It is the middle value of the distribution

  • What is mode?

    • The most frequent value. Can be bimodal or multimodal.

  • Where are the mode, median, and means in an asymmetric distribution?

    • Mode- Farthest from the tail

    • Median- in between

    • Mean- Closest to the tail
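
The ordering can be checked on a small right-skewed sample. A sketch using Python's standard `statistics` module; the data are made up, with a long tail toward large values:

```python
import statistics

# Hypothetical right-skewed sample: most values are small, a few are large.
values = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]

mode = statistics.mode(values)      # most frequent value
median = statistics.median(values)  # middle value
mean = statistics.mean(values)      # balance point, pulled toward the tail

print(mode, median, mean)
```

Here the tail is on the right, so the mode is the smallest of the three and the mean the largest.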

  • What is the mean deviation?

    • A measure of the average absolute deviation of the observations from the mean

  • What is standard deviation?

    • A measure of the amount of variation of the observations around the mean

    • The square root of the variance

    • σ

  • What is variance?

    • The average squared deviation of the observations from their mean
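
The three measures of dispersion can be compared on one data set. A sketch using the standard `statistics` module with made-up lengths; it shows both the population form (divide by n) and the sample form (divide by n - 1), since the notes do not say which one the slides use:

```python
import statistics

lengths = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical measurements

mean = statistics.fmean(lengths)

# Mean deviation: average absolute distance from the mean.
mean_deviation = sum(abs(y - mean) for y in lengths) / len(lengths)

# Variance: average squared distance from the mean.
pop_variance = statistics.pvariance(lengths)     # divides by n
sample_variance = statistics.variance(lengths)   # divides by n - 1

# Standard deviation: square root of the variance.
pop_sd = statistics.pstdev(lengths)
sample_sd = statistics.stdev(lengths)

print(mean_deviation, pop_variance, pop_sd, sample_variance, sample_sd)
```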

  • What is a parameter?

    • The true numerical value describing a population (this number is the goal of an estimate)

  • What is a sample statistic?

    • The estimate of a parameter based on a sample

  • What is μ?

    • The population mean or expected value

  • What is ȳ?

    • The unbiased estimate of μ (the mean)

  • What is σ?

    • The population standard deviation

  • If the means and standard deviations are the same, how do we measure interdependence (how variables are related)?

    • We use covariance. It describes the strength and direction of the relationship between two numerical variables. It is computed from the products of the paired deviations of each variable from its own mean.

  • What happens if there is a positive covariance?

    • Large Y1, and large Y2

    • Small Y1, and small Y2

    • (left in figure)

  • What happens if there is a negative covariance?

    • Large Y1, small Y2

    • Small Y1, large Y2

    • (right in figure)

  • What happens if the covariance is 0?

    • (middle of figure)

    • This means that knowing one variable tells you nothing about the linear trend in the other

  • What is standardized covariance?

    • The correlation coefficient

    • It scales the covariance to lie between -1 and 1
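
A small sketch of covariance and its standardized form, assuming the sample (n - 1) version of the formula and made-up paired data (the function and variable names are illustrative):

```python
import math


def covariance(y1, y2):
    """Sample covariance: sum of products of paired deviations, divided by n - 1."""
    n = len(y1)
    mean1, mean2 = sum(y1) / n, sum(y2) / n
    return sum((a - mean1) * (b - mean2) for a, b in zip(y1, y2)) / (n - 1)


def correlation(y1, y2):
    """Covariance divided by the product of the two standard deviations."""
    return covariance(y1, y2) / math.sqrt(covariance(y1, y1) * covariance(y2, y2))


y1 = [1, 2, 3, 4, 5]
rises_with_y1 = [2, 4, 6, 8, 10]    # large Y1 goes with large Y2
falls_with_y1 = [10, 8, 6, 4, 2]    # large Y1 goes with small Y2

cov_pos = covariance(y1, rises_with_y1)    # positive
cov_neg = covariance(y1, falls_with_y1)    # negative
r_pos = correlation(y1, rises_with_y1)     # close to +1
r_neg = correlation(y1, falls_with_y1)     # close to -1

print(cov_pos, cov_neg, r_pos, r_neg)
```

The covariance of a variable with itself is its variance, which is why `correlation` can reuse `covariance` for the denominator.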


PPT 2: Probability and Distributions



  • What is n?

    • Equally likely outcomes

    • The number of possible events

  • What is s?

    • Successful outcomes

  • What is the probability of success?

    • s/n

  • What is a simple event?

    • Any 1 element in a sample space

    • Only one can occur in a trial

  • What is an event?

    • A set of one or more simple events (which ones depends on what the desired outcome is)

  • What is an intersection?

    • An “and” relationship among common elements (e.g. (A,B) = A⋂B)

  • What is a union?

    • An “either/or” relationship among all elements in both sets (e.g. A⋃B)
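
Intersection and union map directly onto set operations, and the probability of an event is s/n when all outcomes are equally likely. A sketch assuming one roll of a fair six-sided die (the events chosen are illustrative):

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}   # n = 6 equally likely simple events
even = {2, 4, 6}                    # event A
greater_than_3 = {4, 5, 6}          # event B


def probability(event):
    """s / n: successful outcomes over equally likely outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))


p_even = probability(even)
p_greater = probability(greater_than_3)
p_both = probability(even & greater_than_3)     # intersection, "and"
p_either = probability(even | greater_than_3)   # union, "either/or"

print(p_even, p_greater, p_both, p_either)
```

The union can also be obtained as P(A) + P(B) - P(A and B), which avoids counting the shared elements twice.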

  • What is independence?

    • Two events are independent when the occurrence of one does not affect the probability of the other

  • What is replacement? 

    • P(D)² (the probability of drawing D on both of two draws)

    • When a drawn individual is returned to the population before the next draw, so it can be drawn again (the sample space is unchanged)

  • What happens to the chance of an event (2 draws) happening as the population size increases?

    • Your chance of the event occurring increases

    • A smaller population results in a lower chance of the event

  • Can we multiply the probabilities of two events to get the probability of both events occurring?

    • Only if the events are independent; if they are not (e.g. two draws without replacement), no
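
The effect of replacement and of population size can be worked through numerically. This sketch assumes the event is "draw type D on both of two draws" and that half of the population is type D; both assumptions are illustrative and not stated in the notes:

```python
from fractions import Fraction


def both_draws_with_replacement(d, total):
    """P(D)^2: the sample space is unchanged, so the draws are independent."""
    p = Fraction(d, total)
    return p * p


def both_draws_without_replacement(d, total):
    """The second draw comes from a population with one fewer D and one fewer member."""
    return Fraction(d, total) * Fraction(d - 1, total - 1)


sizes = [4, 10, 100, 1000]   # hypothetical population sizes, half of each is type D
with_repl = [both_draws_with_replacement(n // 2, n) for n in sizes]
without_repl = [both_draws_without_replacement(n // 2, n) for n in sizes]

for n, a, b in zip(sizes, with_repl, without_repl):
    print(n, float(a), float(b))
```

With replacement the answer stays at 1/4. Without replacement it starts lower and rises toward 1/4 as the population grows, because removing one individual matters less in a large population.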

  • What is π?

    • The probability of a success

  • What is 1-π?

    • The probability of a failure

  • What are n and r in C(n,r)?

    • The number of combinations of n things taken r at a time (r successes)
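
C(n,r), π and 1-π combine into the probability of exactly r successes in n trials. A sketch assuming the standard binomial formula C(n,r) × π^r × (1-π)^(n-r), which the notes do not write out; the function name is illustrative:

```python
import math


def binomial_probability(n, r, pi):
    """Probability of exactly r successes in n independent trials."""
    return math.comb(n, r) * pi**r * (1 - pi) ** (n - r)


n, pi = 5, 0.5   # hypothetical: 5 trials, success probability 0.5

ways = math.comb(5, 2)   # number of ways to choose 2 successes out of 5 trials
distribution = [binomial_probability(n, r, pi) for r in range(n + 1)]

print(ways)
print(distribution)
```

The probabilities for r = 0 to n add up to 1, since some number of successes must occur.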

  • What is the equation for the mean of the distribution?

    • Mean = ∑(wi × ri) / ∑wi

    • wi = frequencies (from the graph)

    • ri = number of successes

  • What is the equation for the expected mean (or the absolute frequency of success)?

    • μ = nπ, where μ is the mean, n is the number of trials, and π is the probability of success

  • What is the equation for the standard deviation (σ) of the distribution?

    • σ = √(nπ(1 − π)) for a binomial distribution

  • What happens to σ with changes in n?

    • σ increases, with an increase of n

    • σ decreases, with a decrease of n
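
The link between n and σ can be checked numerically. This sketch assumes a binomial distribution, with μ = nπ and the standard binomial result σ = √(nπ(1 − π)); the function names and the chosen values of n are illustrative:

```python
import math


def binomial_mean(n, pi):
    """Expected number of successes in n trials."""
    return n * pi


def binomial_sd(n, pi):
    """Standard deviation of the number of successes in n trials."""
    return math.sqrt(n * pi * (1 - pi))


pi = 0.5
summary = {n: (binomial_mean(n, pi), binomial_sd(n, pi)) for n in (4, 16, 100)}

for n, (mu, sigma) in summary.items():
    print(n, mu, sigma)
```

Both μ and σ grow with n, but σ grows only with the square root of n, so the spread relative to the mean shrinks.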

  • What is the distribution expected if π (probability of success) is 0.5?

    • A symmetrical distribution centered on n/2

  • What is a repulsed distribution?

    • s (observed SD) < σ (expected SD)

    • When there is excess in the center and too few at the tails

  • What is a clumped distribution?

    • s (observed SD) > σ (expected SD)

    • When there is an excess in tails

  • What is skewness?

    • When data has a tail to one side (left or right)

    • Can be measured with g1

  • What is kurtosis?

    • The shape and height of a distribution

    • E.g. leptokurtic (high peaked)

    • E.g. platykurtic (rounded)

  • What happens to the standard error of the mean with changes of n?

    • As n increases, the standard error of the mean decreases (SE = s/√n)
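
A quick numerical check, assuming the usual formula SE = s/√n and a made-up standard deviation of 10:

```python
import math


def standard_error(sd, n):
    """Standard error of the mean for a sample of size n."""
    return sd / math.sqrt(n)


sd = 10.0   # hypothetical sample standard deviation
errors = {n: standard_error(sd, n) for n in (4, 25, 100)}

for n, se in errors.items():
    print(n, se)
```

Going from n = 4 to n = 100 cuts the standard error by a factor of 5, not 25, because of the square root.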

  • What is a confidence interval?

    • A range of values, calculated from a sample, that is expected to include μ (the population mean) a stated percentage of the time (e.g. 95%)

  • What is the central limit theorem?

    • As the sample size n increases, the distribution of the sample mean (ȳ) approaches a normal distribution, even if the original population is not normally distributed
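
A simulation makes this visible. The sketch below assumes an exponential population (strongly right-skewed, mean 1) as the non-normal starting point, and uses the gap between the mean and the median of the sample means as a rough indicator of skew; these choices and all names are illustrative:

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable


def sample_means(n, replicates=2000):
    """Means of many samples of size n drawn from a right-skewed population."""
    return [
        statistics.fmean([random.expovariate(1.0) for _ in range(n)])
        for _ in range(replicates)
    ]


means_small = sample_means(5)
means_large = sample_means(50)

# Spread of the sample means (an empirical standard error).
spread_small = statistics.pstdev(means_small)
spread_large = statistics.pstdev(means_large)

# Mean minus median: near zero for a symmetric (e.g. normal) distribution.
gap_small = statistics.fmean(means_small) - statistics.median(means_small)
gap_large = statistics.fmean(means_large) - statistics.median(means_large)

print(spread_small, spread_large)
print(gap_small, gap_large)
```

With the larger sample size the sample means are both less spread out and less skewed, even though every individual observation still comes from the same skewed population.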

  • What is Pr?

    • Pr stands for probability, e.g. Pr[ȳ − 1.96 × SE ≤ μ ≤ ȳ + 1.96 × SE] = 0.95

    • This statement defines the 95% confidence interval


  • 95% probability the true mean is between sample mean ± 1.96 × standard error

  • 95% of intervals calculated from samples will contain the population mean (μ)

  • Changing the 1.96 multiplier changes the confidence level (%) of the interval

    • 1.96 is the z value for 95% confidence

  • There is no 100% confidence interval
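
The role of the z value can be sketched with the standard library's `statistics.NormalDist`. The sample mean and standard error below are made up, and the function names are illustrative:

```python
from statistics import NormalDist


def z_value(confidence):
    """Two-sided z multiplier for a confidence level such as 0.95."""
    tail = (1 - confidence) / 2          # probability left in each tail
    return NormalDist().inv_cdf(1 - tail)


def confidence_interval(sample_mean, standard_error, confidence=0.95):
    """Sample mean plus or minus z times the standard error."""
    z = z_value(confidence)
    return sample_mean - z * standard_error, sample_mean + z * standard_error


z90, z95, z99 = z_value(0.90), z_value(0.95), z_value(0.99)

# Hypothetical sample: mean 50.0 with a standard error of 2.0.
low, high = confidence_interval(sample_mean=50.0, standard_error=2.0)

print(z90, z95, z99)
print(low, high)
```

A higher confidence level needs a larger z value and so a wider interval. The z value keeps growing as the level approaches 100%, which is why a 100% interval cannot be written down.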
