Course Title: MVLS: Life Sciences Level 2
Topic: Contemporary Issues in Biology 2 Block 2
Lecturers: Dr. Stevie A Bain (she/her), Dr. Paul Capewell (he/him)
Difference Between Samples and Populations: Understanding the distinction between individual samples and entire populations.
Sampling Error: Definition and explanation of how sampling error can occur.
Data Classification: Classify data as continuous or categorical and present each type appropriately.
Statistical Hypotheses: Formulate null hypotheses (H0) and alternative hypotheses (H1) for various experiments.
Chi-squared Test: Overview of the chi-squared test, including calculations (observed counts, expected counts, degrees of freedom, p-values). Note: you do not need to memorize the R code.
Visualization: Histogram for MMI1$Height with various heights marked.
hist() Command Modifiers (R is case-sensitive, so the function name is lowercase):
Number of bins: breaks
X-axis label: xlab
Y-axis label: ylab
Bar color: col
Main title: main
X-axis limit: xlim
Y-axis limit: ylim
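As a rough illustration of how these modifiers combine in a single hist() call, here is a minimal sketch. The MMI1 data set is not reproduced in these notes, so the heights vector is made-up stand-in data, and the labels, colour, and axis limits are arbitrary choices. When breaks is a single number, R treats it as a suggestion and may pick a nearby "pretty" number of bins.

```r
# Made-up stand-in for MMI1$Height (the real data set is not included here)
set.seed(1)
heights <- rnorm(200, mean = 170, sd = 10)

# One call using each modifier from the list above
h <- hist(heights,
          breaks = 20,                      # suggested number of bins
          xlab   = "Height (cm)",           # x-axis label
          ylab   = "Number of students",    # y-axis label
          col    = "steelblue",             # bar colour
          main   = "Distribution of height",# main title
          xlim   = c(130, 210),             # x-axis limits
          ylim   = c(0, 60))                # y-axis limits

# hist() also returns the bin boundaries and counts it used
print(h$breaks)
print(h$counts)
```

With the course data loaded, replacing heights with MMI1$Height should give the equivalent plot.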
Definition of Sample: A sample is a subset of a total population, usually much smaller than the population itself.
Population: All members constituting a defined group.
Representativeness: Questions of how well a sample represents the population and potential sampling biases.
Random Subsets: Conduct random sampling to perform statistical analyses (mean, median, range).
Impact on Results: Explore whether random sampling influences outcomes.
Definition of Sampling Error: Random variation in data arising from sampling only part of the population.
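Sampling error can be seen by drawing several random subsets from the same population and comparing their summary statistics. The sketch below assumes a hypothetical population of 10,000 simulated heights and a sample size of 30. Both numbers, and the helper name summarise_sample, are illustrative choices and not part of the course material.

```r
# Hypothetical population of heights in cm (simulated, not course data)
set.seed(42)
population <- rnorm(10000, mean = 170, sd = 10)

# Helper (hypothetical name): mean, median and range (max - min) of a vector
summarise_sample <- function(x) {
  c(mean = mean(x), median = median(x), range = diff(range(x)))
}

# Draw five random samples of 30 individuals and summarise each one
samples <- replicate(5, summarise_sample(sample(population, size = 30)))

# Each column is one sample; the values differ only because of sampling error
print(round(samples, 1))

# Summary of the full population, for comparison
print(round(summarise_sample(population), 1))
```

Increasing the sample size in the sample() call should make the sample means cluster more tightly around the population mean.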
Various influences on height include:
Genetic factors (inheritance, parental height)
Environmental conditions (nutrition, lifestyle)
Societal factors (socioeconomic status, beauty standards)
Biological factors (hormone levels, medical conditions)
Geographical location and climate influences
Importance of collecting data on biological sex alongside height data.
Participation in data collection is voluntary and anonymous.
Clean data using Excel.
Use RStudio for visualization and statistical analysis.
Acknowledge differences between continuous and categorical data in presentation and analysis.
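To make the distinction concrete, the following sketch stores a continuous variable as a numeric vector and a categorical variable as a factor, then summarises and plots each in the way that suits its type. The six values are invented for illustration and the variable names are assumptions.

```r
# Invented example data: height is continuous, sex is categorical
height <- c(158.2, 171.5, 165.0, 180.3, 162.7, 175.1)
sex    <- factor(c("Female", "Male", "Female", "Male", "Female", "Male"))

# Continuous data: numeric summaries and a histogram
print(summary(height))
hist(height, xlab = "Height (cm)", main = "Continuous: histogram")

# Categorical data: counts per category and a bar plot
sex_counts <- table(sex)
print(sex_counts)
barplot(sex_counts, xlab = "Sex", ylab = "Count", main = "Categorical: bar plot")
```

A mean or histogram of sex would be meaningless, just as a table of counts for every distinct height value would rarely be informative.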
Discuss inferences regarding expected sex ratios based on genetic determination.
Comparison of expected versus observed graphs will be assessed.
Mere observation of data is insufficient; statistical testing validates findings.
Consider if deviations from expectations result from sampling error or other factors.
Statistical Hypothesis: An assumption about population characteristics or variable relationships subject to testing.
Null Hypothesis (H0):
Default assumption that there’s no relationship or effect.
Categorical outcomes assumed equally likely.
Alternative Hypothesis (H1):
Suggests a difference exists; categorical outcomes are not equally likely.
Design hypotheses to evaluate sex ratios using:
H0: Male to female ratio = 1:1
H1: Male to female ratio ≠ 1:1
Further variations can be hypothesized (more females or males).
Application: Used for categorical data to compare observed versus expected frequencies under H0.
Assumptions:
Categories are mutually exclusive.
Observations are independent.
Expected counts are mostly greater than 5.
Formula:
\( \chi^2 = \sum \frac{d^2}{e} \), where \( d \) = observed - expected and \( e \) = expected count, summed over all categories.
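The formula can be worked through by hand for the sex-ratio hypotheses. The sketch below uses hypothetical observed counts (58 females and 42 males, not real class data) and the 1:1 ratio from H0 to obtain the expected counts.

```r
# Hypothetical observed counts (not real class data)
observed <- c(Female = 58, Male = 42)

# Expected counts under H0 (1:1 ratio): half of the total in each category
expected <- sum(observed) * c(Female = 0.5, Male = 0.5)

# d = observed - expected for each category
d <- observed - expected

# Chi-squared statistic: sum of d^2 / expected over all categories
chi_sq <- sum(d^2 / expected)
print(chi_sq)  # 2.56
```

Here each category contributes 8^2 / 50 = 1.28, giving a statistic of 2.56.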
Construct a contingency table to display observed counts.
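Calculate degrees of freedom using: (rows - 1) × (columns - 1) for a two-way contingency table. For a single set of categories tested against expected proportions (such as the sex ratio), use (number of categories - 1).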
Example p-value chart provided for significance levels based on the chi-squared statistic and degrees of freedom.
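Instead of reading a printed chart, the same lookup can be sketched in R with pchisq() and qchisq(). The statistic of 2.56 is a hypothetical value for a two-category sex-ratio test. For a single row of k categories tested against expected proportions the degrees of freedom are k - 1, while the (rows - 1) x (columns - 1) rule applies to two-way tables.

```r
# Hypothetical chi-squared statistic from a two-category (female/male) test
chi_sq <- 2.56

# One set of categories compared with expected proportions: df = categories - 1
n_categories <- 2
df <- n_categories - 1

# p-value: probability of a statistic at least this large if H0 is true
p_value <- pchisq(chi_sq, df = df, lower.tail = FALSE)
print(round(p_value, 3))

# Critical value for a 5% significance level, as found in a chi-squared chart
critical_value <- qchisq(0.95, df = df)
print(round(critical_value, 3))
```

With these numbers the p-value is about 0.11 and the 5% critical value is about 3.84, so the statistic falls short of the threshold.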
R speeds up the statistical analysis and computes p-values directly.
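A minimal sketch of the whole test in R, using the built-in chisq.test() function on hypothetical counts (58 females and 42 males, not real class data). The p argument carries the proportions expected under H0, here a 1:1 ratio.

```r
# Hypothetical observed counts (not real class data)
observed <- c(Female = 58, Male = 42)

# Goodness-of-fit test against the 1:1 ratio stated in H0
result <- chisq.test(observed, p = c(0.5, 0.5))
print(result)

# Individual pieces of the result
print(result$expected)   # expected counts under H0
print(result$statistic)  # chi-squared statistic
print(result$parameter)  # degrees of freedom
print(result$p.value)    # p-value
```

The expected counts stored in result$expected are also a convenient way to check the assumption about expected values before trusting the p-value.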
Tomorrow's Lecture: Focus on interpreting p-values and assessing the role of genetic sex as a modifier of human height.