2024 General Maths Summary Notes

Page 1: Chapter One – Investigating Data Distributions

Key Definitions

Mode/Modal: The most frequently occurring value or category in a dataset.
Mean: The average of all data values, represented as 𝑥̄.
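Median: The middle value of an ordered dataset, found at position (n + 1)/2, where n is the number of data values. When n is even, the median is the mean of the two middle values.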
Range: The difference between the maximum and minimum data values, calculated as Largest Data Value – Smallest Data Value.
IQR (Interquartile Range): The range of the middle 50% of data values, calculated as IQR = Q3 – Q1.
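As a rough illustration of the definitions above, the following Python sketch computes each statistic for a small made-up dataset. The data values and the helper name `quartiles` are assumptions for illustration. The quartile rule used (median of each half, leaving out the middle value when n is odd) is one common convention and may differ from what a particular CAS reports.

```python
from statistics import mean, median, mode

def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumed convention: when n is odd, the middle value is left out of
    both halves. Needs at least two values.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]      # values below the median position
    upper = ordered[-half:]     # values above the median position
    return median(lower), median(upper)

data = [2, 4, 4, 5, 7, 9, 10]   # hypothetical data, n = 7

data_mean = mean(data)               # sum of values / n
data_median = median(data)           # value at position (7 + 1) / 2 = 4
data_mode = mode(data)               # most frequent value
data_range = max(data) - min(data)   # largest - smallest
q1, q3 = quartiles(data)
iqr = q3 - q1
```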

Univariate Data Distributions of Categorical Data

Frequency Table: An example can be the classification of climate types in 23 countries as ‘cold’, ‘mild’, or ‘hot’. It can be reported as:
- 60.9% mild
- 26.1% hot
- 13.0% cold
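The percentages above can be reproduced from a frequency table. In this sketch the counts (14, 6, 3) are an assumption: they are the only whole-number counts for 23 countries that round to the reported percentages, but the notes do not state them.

```python
# Assumed counts, chosen to be consistent with the reported percentages.
counts = {"mild": 14, "hot": 6, "cold": 3}

total = sum(counts.values())

# Percentage frequency = count / total * 100, rounded to one decimal place.
percentages = {
    category: round(100 * count / total, 1)
    for category, count in counts.items()
}
```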

Numerical Data Distributions

Grouped Frequency Table: Use 3 intervals to summarize the data distribution.
Report: Summarize context and describe a histogram in terms of shape, center, spread, and outliers.

Displaying Numerical Data

Dot Plot: Represents frequency of data points.
Stem Plot: A graphical representation of numerical data, showing frequency and intervals.
Histogram: Displays data in bars without gaps.
Bar Chart: Can be segmented to show distributions of categorical data with percentages.

Significant Figures Rules

First non-zero digit is significant.
All non-zero digits are significant.
Zeroes between significant digits are significant.
Zeroes after a decimal to the right of non-zero digits are significant.
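One way to apply these rules is to round a number to a chosen count of significant figures. The helper `round_sig` below is a hypothetical name, and this is a minimal sketch rather than a complete treatment (it does not show trailing zeroes in the printed result).

```python
import math

def round_sig(x, sig):
    """Round x to `sig` significant figures."""
    if x == 0:
        return 0.0
    # Position of the first non-zero digit, as a power of ten.
    exponent = math.floor(math.log10(abs(x)))
    return round(x, sig - 1 - exponent)

small = round_sig(0.004567, 2)   # leading zeroes are not significant
large = round_sig(123456, 3)
middle = round_sig(40.07, 3)     # the zeroes between 4 and 7 are significant
```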

Histograms

Use CAS for plotting histograms through Lists and Spreadsheets.
- Adding data and setting up variables.

Page 2: Key Features of Data Distribution

Shape, Center, Spread, and Outliers

Shape: Distribution shape influences analysis and interpretation.
Center: Represents the middle value of the data (mean, median, mode).
Spread: Indicates how much variation exists within the data.
Outliers: Data values that are significantly different from others.

Logarithmic Scale

Base 10 Logarithms: Useful for large quantities and can simplify multiplication into addition.
- Logarithmic values calculated for various numbers (Log10 of values like 0.01 to 1,000,000).
Properties of Logarithms:
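- The log of a number greater than 1 is positive, the log of 1 is zero, the log of a number between 0 and 1 is negative, and the log of 0 (or of a negative number) is undefined.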
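These properties can be checked numerically. The sketch below uses Python's standard `math.log10`; the particular values chosen are only examples.

```python
import math

values = [0.01, 0.1, 1, 10, 100, 1_000_000]
logs = {v: math.log10(v) for v in values}   # e.g. log10(100) is 2

# Multiplication becomes addition: log(a * b) = log(a) + log(b).
a, b = 1000, 100
log_of_product = math.log10(a * b)
sum_of_logs = math.log10(a) + math.log10(b)

# The log of 0 is undefined; Python raises ValueError for it.
try:
    math.log10(0)
    log_zero_defined = True
except ValueError:
    log_zero_defined = False
```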

Measures of Center and Spread

IQR: A more reliable measure of spread than the range, because it is not affected by outliers.
Mean: Best used with symmetric data without outliers, median preferred for skewed data.
Standard Deviation: Measures the amount of variation or dispersion in a set of values.
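A minimal sketch of the sample standard deviation, assuming the usual formula s = sqrt(sum of (x - mean)^2 / (n - 1)). The data are made up, and the manual result is compared with `statistics.stdev` from the standard library.

```python
import math
from statistics import mean, stdev

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical data

x_bar = mean(data)
squared_deviations = [(x - x_bar) ** 2 for x in data]

# Sample standard deviation: divide by n - 1, then take the square root.
s_manual = math.sqrt(sum(squared_deviations) / (len(data) - 1))
s_library = stdev(data)
```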

Distribution Types

Bimodal Distribution: Two peaks indicate potential data from two populations.
Skewed Distributions: Positively skewed (tail on the right) and negatively skewed (tail on the left).

Page 3: Five Number Summary and Box Plots

Five-number summary includes:

Minimum: Smallest value in data.
Quartile 1 (Q1): Value below which 25% of data fall.
Median (M): Middle value of sorted data.
Quartile 3 (Q3): Value below which 75% of data fall.
Maximum: Largest value in data.
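The five values can be computed as follows. This is a sketch on made-up data; the function name `five_number_summary` is hypothetical, and the quartiles are taken as the medians of the lower and upper halves, which is an assumed convention.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    ordered = sorted(values)
    half = len(ordered) // 2
    q1 = median(ordered[:half])    # median of the lower half
    q3 = median(ordered[-half:])   # median of the upper half
    return ordered[0], q1, median(ordered), q3, ordered[-1]

data = [1, 3, 4, 6, 7, 8, 10, 12]   # hypothetical data, n = 8
summary = five_number_summary(data)
```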

Standard (z) Score

A z-score measures how many standard deviations a value lies from the mean: z = (x – x̄)/s.
- Positive: Above the mean.
- Zero: Equal to the mean.
- Negative: Below the mean.
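A short sketch of the standard score, z = (value - mean) / standard deviation. The mean of 70 and standard deviation of 8 are made-up figures for illustration.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# Hypothetical test results with mean 70 and standard deviation 8.
above = z_score(82, 70, 8)     # positive: above the mean
at_mean = z_score(70, 70, 8)   # zero: equal to the mean
below = z_score(62, 70, 8)     # negative: below the mean
```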
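Upper and Lower Fences: Used to identify outliers. Lower fence = Q1 – 1.5 × IQR and upper fence = Q3 + 1.5 × IQR; any value outside the fences is an outlier.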

Box Plots

Box Plot Construction: Depicts the five-number summary. The whiskers extend to the smallest and largest values that are not outliers, and outliers are plotted as individual points.
Use CAS to create box plots and analyze data.
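Outliers on a box plot are commonly identified with the 1.5 × IQR rule: values below Q1 - 1.5 × IQR or above Q3 + 1.5 × IQR. The sketch below assumes Q1 = 20 and Q3 = 30 are already known; those quartiles and the data values are made up.

```python
def fences(q1, q3):
    """Return (lower fence, upper fence) using the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def find_outliers(values, q1, q3):
    """Values lying outside the fences."""
    lower, upper = fences(q1, q3)
    return [v for v in values if v < lower or v > upper]

# Hypothetical data with assumed quartiles Q1 = 20 and Q3 = 30.
values = [4, 18, 20, 25, 30, 33, 50]
lower_fence, upper_fence = fences(20, 30)
outliers = find_outliers(values, 20, 30)
```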

Page 4: Investigating Associations Between Two Variables

Variable Types

Explanatory Variable (EV): Variable believed to predict or explain the response variable.
Response Variable (RV): Variable that responds to changes in the explanatory variable.

Associations Between Categorical Variables

Two-way Frequency Table: Display associations.
Examples showing percentage differences can indicate associations (e.g., gender and intention to attend university).
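To compare groups of different sizes, a two-way table is usually converted to column percentages. The counts below are invented for illustration and are not taken from the notes.

```python
# Hypothetical counts: intention to attend university, by gender.
table = {
    "Male":   {"Yes": 30, "No": 20},
    "Female": {"Yes": 45, "No": 5},
}

# Percentage of each response within each group (column percentages).
column_percentages = {
    group: {
        response: round(100 * count / sum(counts.values()), 1)
        for response, count in counts.items()
    }
    for group, counts in table.items()
}

# A clear difference between groups suggests the variables are associated.
difference = column_percentages["Female"]["Yes"] - column_percentages["Male"]["Yes"]
```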

Associations Between Numerical and Categorical Variables

Parallel Box Plots & Dot Plots: Used for comparison of groups.
Report by comparing medians, IQRs, and identifying outliers.

Associations Between Numerical Variables

Scatterplots: Visual representation showing relationships.
Analyze the direction and form of the relationship.
- Assess strength and non-linearity.

Page 5: Correlation Coefficient

Pearson’s Correlation Coefficient (r)

Measures strength and direction of linear relationships between two continuous variables.
Valid under conditions: both variables are numerical, association is linear, no outliers present.

Coefficient of Determination (r²)

Indicates the proportion of the variance in the response variable that can be explained by the explanatory variable.
Calculation steps involve squaring the correlation coefficient.
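A minimal sketch of both calculations, written out by hand rather than with a CAS. The function name `pearson_r` and the data are assumptions for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson's correlation coefficient from sums of deviations."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

xs = [1, 2, 3, 4, 5]   # hypothetical explanatory variable
ys = [2, 4, 5, 4, 5]   # hypothetical response variable

r = pearson_r(xs, ys)
r_squared = r ** 2     # coefficient of determination
percent_explained = round(100 * r_squared, 1)
```

Here about 60% of the variation in the response variable would be explained by the variation in the explanatory variable.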

Interpreting Correlation Coefficient

r = 0: No linear association; r = +1: Perfect positive linear association; r = –1: Perfect negative linear association.

Page 6: Fitting a Least Squares Regression Line

Regression Analysis Process

Construct a scatterplot to visualize data.
Calculate the correlation coefficient to determine strength of association.
Determine the regression line using the formula y = a + bx where a is the y-intercept and b is the slope.
Interpret the regression line.
Use the coefficient of determination to assess prediction power.
Make predictions based on the regression line.
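The fitting and prediction steps above can be sketched as follows, assuming the standard least squares formulas: slope b = (sum of products of deviations) / (sum of squared x deviations), which equals r × s_y / s_x, and intercept a = ȳ - b × x̄. The data and function names are hypothetical.

```python
def least_squares(xs, ys):
    """Return (a, b) for the least squares line y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    a = y_bar - b * x_bar   # the line passes through (x_bar, y_bar)
    return a, b

def predict(a, b, x):
    """Predicted response for a given explanatory value."""
    return a + b * x

xs = [1, 2, 3, 4, 5]   # hypothetical data
ys = [2, 4, 5, 4, 5]

a, b = least_squares(xs, ys)
y_hat = predict(a, b, 3.5)   # within the data range, so an interpolation
```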

Residuals and Linearity

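Residuals: Differences between the observed values and the values predicted by the model (residual = actual value – predicted value); check for constant variance.
Conduct residual analysis to check the linearity assumption: a residual plot with no clear pattern supports a linear model, while a curved pattern suggests the relationship is non-linear.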
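A small sketch of a residual calculation, where residual = actual value - predicted value. The line y = 2.2 + 0.6x and the data are assumed for illustration.

```python
# Assumed regression line y = a + b*x for the hypothetical data below.
a, b = 2.2, 0.6
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

predicted = [a + b * x for x in xs]
residuals = [y - y_hat for y, y_hat in zip(ys, predicted)]

# For a least squares line the residuals sum to (approximately) zero.
residual_sum = sum(residuals)
```

Plotting `residuals` against `xs` gives the residual plot; a random scatter about zero is consistent with a linear association.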

Page 7: Transformations

Transformation Types

Squared Transformation: y = a + bx²; useful for quadratic relationships.
Log Transformation: y = a + b log10(x) or log10(y) = a + bx; helps to linearise curved relationships and to compress right-skewed data.
Reciprocal Transformation: y = a + (b/x); used for hyperbolic relationships.
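To show how a transformation linearises data, the sketch below applies the squared transformation to made-up data generated from y = 3 + 2x². A straight line is then fitted to y against x². The helper `least_squares` is a hypothetical name.

```python
def least_squares(xs, ys):
    """Return (a, b) for the least squares line y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    return y_bar - b * x_bar, b

xs = [1, 2, 3, 4]
ys = [5, 11, 21, 35]            # hypothetical data following y = 3 + 2x^2

x_squared = [x ** 2 for x in xs]     # the new, transformed variable
a, b = least_squares(x_squared, ys)  # fit y = a + b * x^2
```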

Implementing Transformations in CAS

Naming new variables for transformations.
Using the relevant formulas to create new datasets for analysis.

Page 8: Determining the Best Transformation

Assessing Transformations

Evaluate which transformation yields the best linear model by checking residual plots for linearity.
Coefficient of determination (r²) indicates how well the model fits the data.
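One way to compare candidate transformations is to compute r² for each transformed variable. The sketch below uses made-up data that follow y = 5 + 3 log10(x), so the log transformation should come out best. In practice the residual plots should be checked as well, since r² alone does not confirm linearity.

```python
import math

def r_squared(xs, ys):
    """Coefficient of determination for a straight-line fit of ys on xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy ** 2 / (s_xx * s_yy)

xs = [1, 10, 100, 1000]   # hypothetical data
ys = [5, 8, 11, 14]

transformed = {
    "x^2": [x ** 2 for x in xs],
    "log10(x)": [math.log10(x) for x in xs],
    "1/x": [1 / x for x in xs],
}

scores = {name: r_squared(t, ys) for name, t in transformed.items()}
best = max(scores, key=scores.get)
```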

Page 9: Time Series Data

Trend Analysis

Trend: General movement in data over time—can be increasing, decreasing, or constant.
Cyclic Variation: Periodic movements in the data with a period longer than a year.
Seasonality: Patterns that repeat over calendar periods of a year or less, such as quarters, months, or days of the week.
Structural Change: A sudden change in the established pattern of the time series.
Irregular Fluctuations: Random variations that do not fit any systematic pattern.

Smoothing Techniques

Calculate smoothed values using moving means or moving medians to reduce irregular fluctuations and reveal the underlying trend.
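A minimal sketch of moving mean smoothing, assuming a three-point window and made-up data. Each smoothed value is the mean of a data point and its neighbours, so the first and last points have no smoothed value. An even window (for example four-point) would need an extra centring step that is not shown here.

```python
def moving_mean(values, window=3):
    """Moving means for an odd window; result is centred on the middle point."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]

data = [2, 5, 8, 5, 2]          # hypothetical time series
smoothed = moving_mean(data)    # smoothed values for the 2nd, 3rd and 4th points
```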

Page 10: Seasonal Indices and Deseasonalising Data

Seasonal Indices Calculation

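Seasonal indices compare each season with the average season and are written as decimals or percentages; they average to 1 (100%) across the seasons. Seasonal index = value for season / seasonal average.
Deseasonalised Data: Actual figure divided by its seasonal index to remove seasonal effects for analysis.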
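A sketch of both calculations for one year of quarterly data, assuming seasonal index = value for season / seasonal average and deseasonalised value = actual value / seasonal index. The sales figures are invented.

```python
quarterly_sales = [60, 120, 140, 80]   # hypothetical data for Q1 to Q4

seasonal_average = sum(quarterly_sales) / len(quarterly_sales)

# Seasonal index: each quarter relative to the average quarter.
seasonal_indices = [value / seasonal_average for value in quarterly_sales]

# Deseasonalise: divide each actual value by its seasonal index.
deseasonalised = [
    value / index for value, index in zip(quarterly_sales, seasonal_indices)
]

# Reseasonalise a forecast made on the deseasonalised scale (Q3 shown).
forecast_q3 = 110 * seasonal_indices[2]
```

With more than one year of data, the index for each season is usually taken as the average of that season's indices across the years.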

Fitting Trend Lines

Use least squares regression on deseasonalised data to identify the trend; reseasonalise any forecast by multiplying it by the relevant seasonal index.

Page 11: Chapters on Finance

Key Financial Concepts

Basic Terminology: Principal (V₀), future value (Vₙ), interest rate (r), and payment (D).
Simple interest calculations and methods for linear growth and decay, including flat rate depreciation.
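Linear growth and decay can be sketched with the rule Vₙ = V₀ ± n × D, where D = (r / 100) × V₀ is the fixed amount added or subtracted each period. The function names and dollar amounts below are illustrative assumptions.

```python
def simple_interest_value(v0, r, n):
    """Value after n periods of simple interest at r% per period."""
    d = r / 100 * v0        # the same interest is added every period
    return v0 + n * d

def flat_rate_value(v0, r, n):
    """Value after n periods of flat rate depreciation at r% per period."""
    d = r / 100 * v0        # the same amount is subtracted every period
    return v0 - n * d

investment = simple_interest_value(1000, 5, 3)   # $1000 at 5% p.a. for 3 years
car_value = flat_rate_value(20000, 10, 4)        # $20,000 at 10% p.a. for 4 years
```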

Compound Interest and Amortization

Compound interest for geometric growth, understanding effective rates, and repayment plans for loans.
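A sketch of geometric growth under compound interest, assuming Vₙ = V₀ × (1 + r/100)ⁿ with r taken per compounding period, and the usual effective rate formula r_eff = ((1 + r/(100n))ⁿ - 1) × 100 for n compounding periods per year. The amounts and function names are made up.

```python
def compound_value(v0, annual_rate, periods_per_year, periods):
    """Value after `periods` compounding periods."""
    rate_per_period = annual_rate / periods_per_year
    return v0 * (1 + rate_per_period / 100) ** periods

def effective_rate(annual_rate, periods_per_year):
    """Effective annual interest rate, as a percentage."""
    growth = (1 + annual_rate / (100 * periods_per_year)) ** periods_per_year
    return (growth - 1) * 100

# $1000 at 12% p.a. compounding monthly, for one year.
value = compound_value(1000, 12, 12, 12)
r_eff = effective_rate(12, 12)
```

The effective rate exceeds the nominal 12% because interest is earned on earlier interest.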

Page 12: Amortization Tables

Understanding Loan Repayment

With regular payments on a reducing balance loan, the interest portion of each payment decreases over time while the principal reduction increases.
Create amortization tables that track interest, principal reduction, and outstanding balance.

Calculating Interest and Principal

Use the interest rate per compounding period (for example, the monthly rate) to calculate the interest owed each period; the rest of each payment reduces the principal.
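An amortisation table can be built one row at a time: interest = balance × rate per period, principal reduction = payment - interest, new balance = balance - principal reduction. The loan below ($1000 at 1% per month, repaid with three payments of $340.02) is a made-up example, and no rounding to cents is applied.

```python
def amortisation_table(principal, rate_per_period, payment, periods):
    """Return rows of (period, interest, principal reduction, balance)."""
    rows = []
    balance = principal
    for period in range(1, periods + 1):
        interest = balance * rate_per_period / 100
        principal_reduction = payment - interest
        balance = balance - principal_reduction
        rows.append((period, interest, principal_reduction, balance))
    return rows

table = amortisation_table(1000, 1, 340.02, 3)

interest_paid = [row[1] for row in table]
reductions = [row[2] for row in table]
final_balance = table[-1][3]   # close to zero; a final payment is often adjusted
```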

Page 13: Finance Solver in CAS

Using Finance Solver

Inputting financial variables to calculate present and future values based on investment parameters.
Analyzing outcomes for common loan structures like interest-only loans and annuities.

Page 14: Matrices Summary

Types of Matrices

Definitions: Row, column, square, zero, identity, diagonal, triangular, symmetric, binary, and permutation matrices.
Operations involving matrix addition, subtraction, and scalar multiplication.

Matrix Properties

Explore identity and diagonal matrices, along with the concepts of equal and symmetric matrices.

Page 15: Matrix Multiplication and Special Rules

Matrix Operations

Two matrices can be multiplied only if the number of columns in the first equals the number of rows in the second.
An m × n matrix multiplied by an n × p matrix gives an m × p matrix. Order matters: in general, AB ≠ BA.
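A short sketch of the dimension rule, using plain Python lists so that each step is visible. The function name `mat_mul` and the matrices are illustrative only.

```python
def mat_mul(a, b):
    """Multiply matrix a (m x n) by matrix b (n x p), giving an m x p matrix."""
    if len(a[0]) != len(b):
        raise ValueError("columns of the first matrix must equal rows of the second")
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

A = [[1, 2, 3],
     [4, 5, 6]]        # 2 x 3
B = [[7, 8],
     [9, 10],
     [11, 12]]         # 3 x 2

AB = mat_mul(A, B)     # (2 x 3)(3 x 2) gives 2 x 2
BA = mat_mul(B, A)     # (3 x 2)(2 x 3) gives 3 x 3, so AB and BA differ
```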

Page 16: Dominance and Transition Matrices

Transition Matrices

Record the proportion of a population moving between states from one time step to the next; used to model how behaviour changes over time.
Compute dominance scores by adding one-step and two-step dominances (the row sums of D + D²).
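Dominance scores can be sketched as the row sums of D + D², where D is the one-step dominance matrix and D² counts two-step dominances. The three-player result below (A beats B and C, B beats C) is invented for illustration.

```python
def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

players = ["A", "B", "C"]

# D[i][j] = 1 means player i defeated player j (hypothetical results).
D = [[0, 1, 1],
     [0, 0, 1],
     [0, 0, 0]]

D2 = mat_mul(D, D)   # two-step dominances

total = [[D[i][j] + D2[i][j] for j in range(3)] for i in range(3)]
scores = {players[i]: sum(total[i]) for i in range(3)}
```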

Steady State Solutions

For a regular transition matrix, the state matrix eventually settles to a steady state. Every column of a transition matrix must sum to 1.
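A steady state can be approximated by applying the rule Sₙ₊₁ = T × Sₙ repeatedly until the state stops changing. The transition matrix and starting state below are made-up values.

```python
# Hypothetical transition matrix: each column sums to 1.
T = [[0.9, 0.2],
     [0.1, 0.8]]

def next_state(t, state):
    """Apply one transition: S(n+1) = T x S(n)."""
    return [sum(t[i][k] * state[k] for k in range(len(state))) for i in range(len(t))]

column_sums = [sum(T[i][j] for i in range(2)) for j in range(2)]

state = [300, 0]              # initial state S0
for _ in range(100):          # enough steps for this matrix to settle
    state = next_state(T, state)

steady_state = state
```

The total of 300 is preserved at every step, and the state settles at 200 and 100.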

Page 17: Leslie Matrices

Application of Leslie Matrices

Models variations in population structures using age-group specific parameters.
Capture birth and survival rates to forecast changes in demographic sectors over time.
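A minimal sketch of one step of a Leslie matrix model, Sₙ₊₁ = L × Sₙ. The first row of L holds birth rates and the entries below the diagonal hold survival rates. All figures are invented for illustration.

```python
# Hypothetical Leslie matrix for three age groups.
L = [[0.0, 2.0, 1.0],    # birth rates for each age group
     [0.5, 0.0, 0.0],    # 50% of group 1 survive into group 2
     [0.0, 0.8, 0.0]]    # 80% of group 2 survive into group 3

S0 = [100, 50, 20]       # initial population in each age group

def next_population(leslie, state):
    """Population in each age group one time period later."""
    return [sum(leslie[i][k] * state[k] for k in range(len(state)))
            for i in range(len(leslie))]

S1 = next_population(L, S0)
S2 = next_population(L, S1)
```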

Page 18: Graphs and Networks

Graph Terminology

Different types of graphs including simple, connected, and complete graphs.
Key components such as edges, vertices, and graph properties such as planar forms.

Page 19: Eulerian Trails and Circuits

Key Paths in Graph Theory

A connected graph has an Eulerian circuit if every vertex has even degree, and an Eulerian trail (but no circuit) if exactly two vertices have odd degree.
Employ Dijkstra’s algorithm for shortest path calculations on weighted graphs.
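Both ideas can be sketched on a small weighted graph. The degree test assumes the graph is connected: no odd-degree vertices means an Eulerian circuit exists, and exactly two means an Eulerian trail exists. The Dijkstra function is a standard priority-queue version. The edge list and function names are hypothetical.

```python
import heapq
from collections import Counter

def euler_type(edges):
    """Classify a connected graph by its number of odd-degree vertices."""
    degree = Counter()
    for u, v, _ in edges:
        degree[u] += 1
        degree[v] += 1
    odd = sum(1 for d in degree.values() if d % 2 == 1)
    if odd == 0:
        return "Eulerian circuit"
    if odd == 2:
        return "Eulerian trail"
    return "neither"

def dijkstra(edges, start):
    """Shortest distance from start to every reachable vertex."""
    graph = {}
    for u, v, w in edges:               # undirected: store both directions
        graph.setdefault(u, []).append((v, w))
        graph.setdefault(v, []).append((u, w))
    dist = {start: 0}
    queue = [(0, start)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist.get(node, float("inf")):
            continue                    # a shorter route was already found
        for neighbour, weight in graph[node]:
            new_d = d + weight
            if new_d < dist.get(neighbour, float("inf")):
                dist[neighbour] = new_d
                heapq.heappush(queue, (new_d, neighbour))
    return dist

# Hypothetical weighted graph: (vertex, vertex, weight).
edges = [("A", "B", 4), ("A", "C", 1), ("C", "B", 2), ("B", "D", 5), ("C", "D", 8)]

classification = euler_type(edges)
distances = dijkstra(edges, "A")
```

In this graph the shortest path from A to D is A, C, B, D, with total weight 8.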

Page 20: Maximum Flow and Cuts

Flow Network Concepts

Identify the maximum flow from source to sink, which equals the capacity of the minimum cut. Along a single path, the flow is limited by the edge with the smallest capacity.
Definitions of bipartite graphs, capacities, and methodologies used in analysis.
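For a small network, every cut can be listed and its capacity found by adding the capacities of edges that run from the source side to the sink side. By the max-flow min-cut theorem, the smallest cut capacity equals the maximum flow. The network below is made up, and listing every cut is only practical for small networks.

```python
from itertools import combinations

# Hypothetical directed network: (from, to) -> capacity. S is source, T is sink.
capacities = {
    ("S", "A"): 5,
    ("S", "B"): 4,
    ("A", "T"): 3,
    ("B", "T"): 6,
    ("A", "B"): 2,
}
inner_vertices = ["A", "B"]

def cut_capacity(source_side):
    """Sum of capacities of edges going from the source side to the sink side."""
    return sum(
        capacity
        for (u, v), capacity in capacities.items()
        if u in source_side and v not in source_side
    )

# Every cut keeps S on the source side and T on the sink side.
cuts = {}
for size in range(len(inner_vertices) + 1):
    for chosen in combinations(inner_vertices, size):
        side = frozenset({"S", *chosen})
        cuts[side] = cut_capacity(side)

minimum_cut = min(cuts.values())   # equals the maximum flow
```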

Page 21: Precedence Tables and Scheduling

Project Management Tools

Use precedence tables to record each activity's duration and immediate predecessors; dummy activities may be needed when drawing the activity network.
Apply critical path analysis to find the minimum completion time.

Page 22: Total Project Duration

Calculating Minimum Completion Time

Use forward scanning to find each activity's earliest start time (EST) and backward scanning to find its latest start time (LST); float = LST – EST.
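Forward and backward scanning can be sketched from a precedence table. Here EST is the earliest start time, LST is the latest start time, and float = LST - EST. The activities and durations are invented, and the table is assumed to list every activity after its predecessors.

```python
# Hypothetical precedence table: activity -> (duration, immediate predecessors).
activities = {
    "A": (3, []),
    "B": (2, []),
    "C": (4, ["A"]),
    "D": (1, ["B"]),
    "E": (2, ["C", "D"]),
}

# Forward scan: EST is the latest finish among the predecessors.
est = {}
for name, (duration, preds) in activities.items():
    est[name] = max((est[p] + activities[p][0] for p in preds), default=0)

minimum_completion_time = max(est[n] + activities[n][0] for n in activities)

# Backward scan: LST is the earliest LST among successors, minus the duration.
lst = {}
for name in reversed(list(activities)):
    duration = activities[name][0]
    successors = [s for s, (_, preds) in activities.items() if name in preds]
    latest_finish = min((lst[s] for s in successors), default=minimum_completion_time)
    lst[name] = latest_finish - duration

floats = {n: lst[n] - est[n] for n in activities}
critical_path = [n for n in activities if floats[n] == 0]
```

Activities with zero float lie on the critical path, so any delay to them delays the whole project.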

Page 23: Critical Path Analysis

Project Optimization Techniques

The critical path is the longest path through the activity network, and its activities have zero float. Identifying it shows which activities to shorten (crash) in order to reduce the completion time cost-effectively.