Define hypothesis testing
A statistical procedure that uses sample data to decide whether an assumption (hypothesis) about a population is supported.
Understand the types of hypotheses
Null (H₀): No effect/difference.
Alternative (H₁): An effect or difference exists.
Outline the steps in hypothesis testing
State H₀ and H₁
Choose significance level (α)
Select test statistic
Determine critical region
Collect data and compute statistic
Compare and conclude (reject or fail to reject H₀)
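The six steps above can be sketched with a two-sided one-sample z-test. This is only an illustration under stated assumptions: the measurements, the hypothesized mean (5.0), and the known population standard deviation (0.5) are all hypothetical, and a t-test would normally be used when the standard deviation is estimated from the sample.

```python
from math import sqrt
from statistics import NormalDist, mean

def one_sample_z_test(sample, mu0, sigma, alpha=0.05):
    """Two-sided z-test of H0: population mean == mu0, with sigma assumed known."""
    # Steps 1-2: H0 (mean == mu0), H1 (mean != mu0) and alpha are fixed by the caller.
    # Step 3: the test statistic is z = (sample mean - mu0) / (sigma / sqrt(n)).
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / sqrt(n))
    # Step 4: the critical region is |z| > z_crit.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Step 6: compare the statistic with the critical value and conclude.
    reject = abs(z) > z_crit
    return z, z_crit, reject

# Step 5: collect data (hypothetical measurements, for illustration only).
sample = [5.3, 5.1, 5.6, 5.4, 5.2, 5.5, 5.7, 5.3, 5.4, 5.5]
z, z_crit, reject = one_sample_z_test(sample, mu0=5.0, sigma=0.5)
print(f"z = {z:.2f}, critical value = {z_crit:.2f}")
print("Reject H0" if reject else "Fail to reject H0")
```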
Formulate a null hypothesis
A default assumption (e.g., "The medicine has no effect").
Formulate an alternative hypothesis
A statement that contradicts H₀ (e.g., "The medicine improves health").
Select a significance level
Common α levels: 0.05, 0.01
α is the probability of a Type I error (false positive): rejecting H₀ when it is actually true.
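One way to see what α means is to simulate many experiments in which H₀ is true and count how often a test rejects it anyway. The sketch below assumes a standard normal population and a two-sided z-test; the sample size, trial count, and seed are arbitrary choices for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist

def type_i_error_rate(alpha=0.05, n=30, trials=5000, seed=0):
    """Estimate how often a two-sided z-test rejects H0 when H0 is true."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # Draw from a population with mean 0 and sd 1, so H0 (mean == 0) holds.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = (sum(sample) / n) / (1.0 / sqrt(n))
        if abs(z) > z_crit:
            rejections += 1  # a false positive
    return rejections / trials

rate = type_i_error_rate()
print(f"Observed false-positive rate: {rate:.3f}")
```

The observed rate should land close to the chosen α, which is why a smaller α (such as 0.01) means fewer false positives.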
Choose an appropriate test statistic
Depends on the data type and the question (e.g., t-test for comparing means, chi-square test for categorical counts)
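As an example for categorical data, the chi-square goodness-of-fit statistic sums (observed - expected)² / expected over the categories. The sketch below uses hypothetical die-roll counts and assumes a fair die as H₀; the critical value 11.070 is the standard chi-square table entry for 5 degrees of freedom at α = 0.05.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts for faces 1-6 from 120 rolls of a die.
observed = [18, 22, 20, 25, 15, 20]
# Under H0 (fair die) every face is expected equally often.
expected = [sum(observed) / len(observed)] * len(observed)

chi2 = chi_square_statistic(observed, expected)
# Table value for df = 6 - 1 = 5 at alpha = 0.05.
CRITICAL_VALUE_DF5_ALPHA_05 = 11.070
reject = chi2 > CRITICAL_VALUE_DF5_ALPHA_05
print(f"chi-square = {chi2:.2f}")
print("Reject H0" if reject else "Fail to reject H0")
```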
Define the problem
State clearly what is being tested or investigated.
Formulate a hypothesis
Create testable H₀ and H₁ based on the problem.
Design an experiment
Plan the method: variables, controls, and procedure.
Conduct the experiment
Follow the procedure; control bias and maintain consistency.
Analyze the data
Use statistical tools to interpret results.
Draw conclusions
Decide whether to reject or fail to reject H₀ based on data.
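The analysis and conclusion steps can be sketched for a simple treatment-versus-control experiment. The example below assumes a permutation test on the difference in group means, which is one possible choice rather than the only method; the measurements, trial count, and seed are hypothetical.

```python
import random

def permutation_test(treatment, control, trials=10000, seed=0):
    """Two-sided permutation test on the difference in group means."""
    rng = random.Random(seed)
    k = len(treatment)
    observed = sum(treatment) / k - sum(control) / len(control)
    pooled = list(treatment) + list(control)
    extreme = 0
    for _ in range(trials):
        # Under H0 the group labels are interchangeable, so shuffle them.
        rng.shuffle(pooled)
        diff = sum(pooled[:k]) / k - sum(pooled[k:]) / (len(pooled) - k)
        # Small tolerance guards against floating-point rounding.
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1
    return observed, extreme / trials

# Hypothetical measurements from the two groups.
treatment = [12.1, 13.4, 11.8, 14.2, 13.0, 12.7, 13.9, 12.5]
control = [10.2, 11.1, 10.8, 9.9, 11.5, 10.4, 10.9, 11.0]

alpha = 0.05
observed, p_value = permutation_test(treatment, control)
print(f"Observed difference = {observed:.2f}, p-value = {p_value:.4f}")
print("Reject H0" if p_value < alpha else "Fail to reject H0")
```

A p-value below α leads to rejecting H₀; otherwise the conclusion is "fail to reject H₀", which is not the same as proving H₀ true.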
Define data analysis and its importance
Organizing and interpreting data to extract meaning and support conclusions.
Identify data sources and types
Sources: surveys, sensors, experiments
Types: quantitative (numbers), qualitative (descriptions)
Learn data collection methods
Surveys, observations, experiments
Key concerns: reliability (consistent results) and validity (measuring what is intended)
Apply data preprocessing techniques
Cleaning, normalization, handling missing data
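Two of these techniques can be sketched in a few lines. The example assumes mean imputation for missing values and min-max scaling for normalization; both are common choices but not the only ones, and the data is hypothetical.

```python
from statistics import mean

def impute_missing(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]

def min_max_normalize(values):
    """Rescale values linearly so the smallest is 0.0 and the largest is 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical readings with two missing entries.
raw = [10.0, None, 20.0, 30.0, None, 40.0]
filled = impute_missing(raw)
scaled = min_max_normalize(filled)
print(filled)
print(scaled)
```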
Perform exploratory data analysis (EDA)
Use graphs and summary statistics to understand patterns and trends
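A numeric summary is often the first EDA step. The sketch below assumes a small hypothetical data set and uses Python's statistics module; quartiles are computed with its default method, so other tools may report slightly different values.

```python
from statistics import mean, median, quantiles, stdev

def summarize(data):
    """Return basic summary statistics for a list of numbers."""
    q1, _, q3 = quantiles(data, n=4)  # quartile cut points
    return {
        "count": len(data),
        "min": min(data),
        "q1": q1,
        "median": median(data),
        "q3": q3,
        "max": max(data),
        "mean": mean(data),
        "stdev": stdev(data),  # sample standard deviation
    }

# Hypothetical observations.
data = [2, 4, 4, 5, 7, 9, 10, 12, 15, 21]
summary = summarize(data)
for name, value in summary.items():
    print(f"{name}: {value}")
```

A mean noticeably above the median, as happens here, hints at a right-skewed distribution worth checking with a plot.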
Implement statistical analysis
Inferential methods: t-tests, correlation, regression
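Correlation and simple linear regression can be computed directly from their definitions. The sketch below assumes paired hypothetical measurements and computes the Pearson correlation coefficient and the least-squares line; it does not compute significance tests for either.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def least_squares_line(xs, ys):
    """Slope and intercept of the line minimizing squared vertical errors."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

# Hypothetical paired measurements.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(xs, ys)
slope, intercept = least_squares_line(xs, ys)
print(f"r = {r:.3f}")
print(f"y = {slope:.2f} * x + {intercept:.2f}")
```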
Use visualization tools for data interpretation
Graphs, charts (histograms, scatterplots, boxplots)
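A histogram is built by counting how many values fall into equal-width bins. The sketch below prints a text histogram using only the standard library, as a stand-in for a plotting library; the data and bin count are hypothetical.

```python
def histogram_counts(data, bins):
    """Count values in equal-width bins spanning min(data) to max(data)."""
    lo, hi = min(data), max(data)
    if hi == lo:
        # No spread: everything falls into the first bin.
        return [len(data)] + [0] * (bins - 1)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in data:
        # The maximum value belongs to the last bin rather than a new one.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts

# Hypothetical observations.
data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]
counts = histogram_counts(data, bins=4)
lo, width = min(data), (max(data) - min(data)) / 4
for i, c in enumerate(counts):
    start = lo + i * width
    print(f"{start:4.1f} to {start + width:4.1f} | {'#' * c}")
```

The isolated bar in the last bin shows how a histogram makes a possible outlier visible at a glance.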
Develop data-driven decision-making strategies
Use insights to support logical and strategic decisions
Address data quality issues
Check for bias, outliers, duplicates, and missing entries
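Some of these checks are easy to automate. The sketch below assumes rows stored as tuples, None as the missing-value marker, and the common 1.5 × IQR rule for flagging outliers; all data is hypothetical, and checking for bias still requires looking at how the data was collected.

```python
from statistics import quantiles

def find_duplicates(rows):
    """Return rows that appear more than once, in order of first repeat."""
    seen, duplicates = set(), []
    for row in rows:
        if row in seen and row not in duplicates:
            duplicates.append(row)
        seen.add(row)
    return duplicates

def count_missing(rows):
    """Count rows that contain at least one None field."""
    return sum(1 for row in rows if any(field is None for field in row))

def iqr_outliers(values, k=1.5):
    """Return values farther than k * IQR below Q1 or above Q3."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical records: (id, value).
rows = [("a", 1), ("b", 2), ("a", 1), ("c", None)]
values = [10, 12, 11, 13, 12, 11, 95, 12, 10, 13]

print("Duplicates:", find_duplicates(rows))
print("Rows with missing fields:", count_missing(rows))
print("Outliers:", iqr_outliers(values))
```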
Understand data security and privacy
Protect sensitive information; follow legal/ethical guidelines
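One common protective step is to replace direct identifiers with tokens before analysis. The sketch below assumes keyed hashing (HMAC-SHA256) from Python's standard library; the key, field names, and records are hypothetical. This is pseudonymization rather than full anonymization, and it does not by itself satisfy any particular legal requirement.

```python
import hashlib
import hmac

def pseudonymize(identifier, secret_key):
    """Return a keyed SHA-256 token that stands in for the raw identifier."""
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()

# Hypothetical key for illustration only; a real key would be kept out of the code.
SECRET_KEY = b"example-key-do-not-reuse"

# Hypothetical records containing a sensitive field.
records = [
    {"email": "alice@example.com", "score": 82},
    {"email": "bob@example.com", "score": 75},
]

# Keep the analysis fields, replace the identifier with its token.
safe_records = [
    {"user_token": pseudonymize(r["email"], SECRET_KEY), "score": r["score"]}
    for r in records
]
for r in safe_records:
    print(r["user_token"][:12], r["score"])
```

The same identifier always maps to the same token, so records can still be linked across tables without exposing the original value.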