MMI-L2-Moodle
Lecture Overview
Course Title: MVLS: Life Sciences Level 2
Topic: Contemporary Issues in Biology 2 Block 2
Lecturers: Dr. Stevie A Bain (she/her), Dr. Paul Capewell (he/him)
Intended Learning Outcomes
Difference Between Samples and Populations: Understanding the distinction between individual samples and entire populations.
Sampling Error: Definition and explanation of how sampling error can occur.
Data Classification: Identify data as continuous or categorical, and present each type appropriately.
Statistical Hypotheses: Formulate null hypotheses (H0) and alternative hypotheses (H1) for various experiments.
Chi-squared Test: Overview of the chi-squared test, including its calculations (observed counts, expected counts, degrees of freedom, p-values). Note: the R code does not need to be memorized.
Data Presentation
Class Height Data
Visualization: Histogram for MMI1$Height with various heights marked.
Histograms
hist() Command Arguments (R is case-sensitive, so the function is hist(), not Hist()):
Number of bins: breaks
X-axis label: xlab
Y-axis label: ylab
Bar color: col
Main title: main
X-axis limits: xlim
Y-axis limits: ylim
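The arguments listed above can be combined in a single hist() call. The following is a minimal sketch, not the lecture's own code: the MMI1 data frame here is simulated as a stand-in for the class data, and the units (cm), colour, bin count and axis limits are illustrative assumptions.

```r
# Simulated stand-in for the class data (assumption: heights in cm).
# The real MMI1 data frame is not reproduced here.
set.seed(1)
MMI1 <- data.frame(Height = rnorm(200, mean = 170, sd = 10))

h <- hist(MMI1$Height,
          breaks = 15,                    # suggested number of bins
          xlab   = "Height (cm)",         # x-axis label
          ylab   = "Number of students",  # y-axis label
          col    = "steelblue",           # bar colour
          main   = "Class height data",   # main title
          xlim   = c(130, 210),           # x-axis limits
          ylim   = c(0, 60))              # y-axis limits
```

When breaks is a single number, R treats it as a suggestion and may choose a nearby number of bins that gives tidy break points.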
Samples and Populations
Definition of Sample: A sample is a subset of a population, usually much smaller than the population itself.
Population: All members constituting a defined group.
Representativeness: How well a sample represents the population, and whether sampling bias may be present.
Investigation Approach
Random Subsets: Draw random subsets of the data and calculate summary statistics (mean, median, range) for each.
Impact on Results: Compare the statistics across subsets to see whether random sampling changes the outcome.
Sampling Error
Definition: Random variation in data arising from sampling only part of the population.
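The random-subset investigation and the resulting sampling error can be sketched in R. This is an illustration under stated assumptions: the "population" is simulated (1000 heights in cm), and the sample size of 20 and the five repeats are arbitrary choices, not values from the lecture.

```r
set.seed(42)

# Hypothetical population of 1000 heights in cm (simulated, not the class data)
population <- rnorm(1000, mean = 170, sd = 10)

# Draw 5 random samples of 20 and record summary statistics for each
sample_stats <- t(replicate(5, {
  s <- sample(population, size = 20)
  c(mean = mean(s), median = median(s), range = diff(range(s)))
}))

print(round(sample_stats, 1))

# The population mean that every sample mean is estimating
mean(population)
```

Each row is one random sample. The means, medians and ranges differ from row to row even though every sample comes from the same population; that variation is sampling error.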
Factors Influencing Human Height
Various influences include:
Genetic factors (inheritance, parental height)
Environmental conditions (nutrition, lifestyle)
Societal factors (socioeconomic status, beauty standards)
Biological factors (hormone levels, medical conditions)
Geographical location and climate influences
Biological Sex as a Modifier of Height
Importance of collecting data on biological sex alongside height data.
Participation in data collection is voluntary and anonymous.
Data Analysis Steps
Clean data using Excel.
Use RStudio for visualization and statistical analysis.
Acknowledge differences between continuous and categorical data in presentation and analysis.
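The difference in presentation between the two data types might look like this in R. The class_data data frame and its Sex column are hypothetical and simulated for illustration; only the idea of a Height column comes from the notes.

```r
set.seed(7)

# Hypothetical class data: Height is continuous (cm), Sex is categorical
class_data <- data.frame(
  Height = rnorm(100, mean = 170, sd = 10),
  Sex    = factor(sample(c("Female", "Male"), 100, replace = TRUE))
)

# Continuous variable: summarise numerically and plot as a histogram
print(summary(class_data$Height))
hist(class_data$Height, xlab = "Height (cm)", main = "Continuous: histogram")

# Categorical variable: count each category and plot as a bar chart
sex_counts <- table(class_data$Sex)
print(sex_counts)
barplot(sex_counts, ylab = "Count", main = "Categorical: bar chart")
```

A mean or median is meaningful for Height but not for Sex, which is summarised by counts per category instead.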
Expected Outcomes
Discuss inferences regarding expected sex ratios based on genetic determination.
Comparison of expected versus observed graphs will be assessed.
Importance of Statistical Testing
Mere observation of data is insufficient; statistical testing validates findings.
Consider if deviations from expectations result from sampling error or other factors.
Hypotheses in Statistics
Statistical Hypothesis: An assumption about population characteristics or variable relationships subject to testing.
Null Hypothesis (H0):
Default assumption that there’s no relationship or effect.
For categorical data such as the sex ratio example, the categories are assumed to be equally likely.
Alternative Hypothesis (H1):
Suggests a difference exists; categorical outcomes are not equally likely.
Sex Ratio Contextual Hypotheses
Design hypotheses to evaluate sex ratios using:
H0: Male to female ratio = 1:1
H1: Male to female ratio ≠ 1:1
Directional alternatives can also be hypothesized (more females than males, or more males than females).
Chi-squared Test
Application: Used for categorical data to compare observed versus expected frequencies under H0.
Assumptions:
Categories are mutually exclusive.
Observations are independent.
Most expected counts are greater than 5.
Formula:
\( \chi^2 = \sum \frac{d^2}{e} \), where \( d \) = observed - expected and \( e \) = expected, summed over all categories.
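As a worked example of the formula, the statistic can be computed by hand in R for the 1:1 sex ratio hypothesis. The observed counts (60 females, 40 males) are hypothetical values chosen to make the arithmetic easy; they are not the class data.

```r
# Hypothetical observed counts (illustrative, not the class data)
observed <- c(Female = 60, Male = 40)

# Expected counts under H0 (1:1 ratio): half of the total in each category
expected <- sum(observed) * c(0.5, 0.5)   # 50 and 50

# d = observed - expected for each category
d <- observed - expected                  # 10 and -10

# Chi-squared statistic: sum of d^2 / e over the categories
chi_sq <- sum(d^2 / expected)             # 100/50 + 100/50
print(chi_sq)                             # 4
```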
Contingency Table and Degrees of Freedom
Construct a contingency table to display observed counts.
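Calculate degrees of freedom using: (rows - 1) × (columns - 1). For a single row of categories, as in the sex ratio example, degrees of freedom = number of categories - 1.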
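The degrees-of-freedom calculation can be checked in R for both layouts: a single row of categories, where the degrees of freedom are the number of categories minus 1, and a two-way table, where they are (rows - 1) × (columns - 1). All counts, and the second variable "Group", are hypothetical.

```r
# Single row of categories (hypothetical counts): df = categories - 1
observed <- c(Female = 60, Male = 40)
df_one_way <- length(observed) - 1
print(df_one_way)   # 1

# Hypothetical two-way contingency table: sex by a second categorical variable
two_way <- matrix(c(30, 30, 25, 15), nrow = 2,
                  dimnames = list(Sex = c("Female", "Male"),
                                  Group = c("A", "B")))
print(two_way)

# df = (rows - 1) x (columns - 1)
df_two_way <- (nrow(two_way) - 1) * (ncol(two_way) - 1)
print(df_two_way)   # 1
```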
Calculating p-values
An example chart is provided for looking up the p-value (significance level) from the chi-squared statistic and the degrees of freedom.
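A printed chart lists only selected significance levels, while R can return the p-value for any statistic. This sketch uses a hypothetical chi-squared statistic of 4 with 1 degree of freedom; pchisq() and qchisq() are base R functions.

```r
chi_sq <- 4   # hypothetical chi-squared statistic
df     <- 1   # degrees of freedom

# p-value: probability of a statistic at least this large if H0 is true
p_value <- pchisq(chi_sq, df = df, lower.tail = FALSE)
print(round(p_value, 4))    # 0.0455

# Critical value a chart would list for the 0.05 significance level
critical <- qchisq(0.95, df = df)
print(round(critical, 2))   # 3.84
```

Here the statistic (4) exceeds the 0.05 critical value (3.84), which matches the p-value falling just below 0.05.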
Using R for Analysis
R carries out the chi-squared calculation and returns the p-value directly, which is faster than working by hand.
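As a sketch of how this might look, the base R function chisq.test() runs the whole test in one call. The observed counts are hypothetical, and the p argument states the proportions expected under H0 (1:1).

```r
# Hypothetical observed counts (illustrative, not the class data)
observed <- c(Female = 60, Male = 40)

# Chi-squared test against the 1:1 ratio expected under H0
result <- chisq.test(observed, p = c(0.5, 0.5))

print(result$expected)    # expected counts: 50 and 50
print(result$statistic)   # chi-squared statistic: 4
print(result$parameter)   # degrees of freedom: 1
print(result$p.value)     # p-value, about 0.0455
```

Printing result on its own gives the statistic, degrees of freedom and p-value in one summary.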
Upcoming Topics
Tomorrow's Lecture: Interpreting p-values and assessing the role of genetic sex as a modifier of human height.