Basics of Statistics and Mathematics Unit 1/4

135 Terms

Descriptive Statistics

A branch of statistics that summarises and describes the main features of a dataset using measures such as central tendency, dispersion and skewness.

Central Tendency

A numerical value around which most data points cluster; represented by measures like mean, median and mode.

Variation / Dispersion

The extent to which data values scatter around the central value; described by range, variance and standard deviation.

Skewness

A statistic that measures the asymmetry of a distribution, indicating whether data are stretched to the left (negative) or right (positive).

Population

The complete set of individuals, items or measurements under investigation.

Sample

A subset of the population selected for analysis, ideally mirroring the population’s characteristics.

Population Parameter (μ)

A descriptive measure calculated from an entire population, such as the population mean denoted by the Greek letter mu (μ).

Sample Statistic (x̄)

A descriptive measure calculated from sample data, such as the sample mean denoted by x-bar (x̄).

Ungrouped Data

Raw, unorganised observations presented individually, suitable for small datasets.

Grouped Data

Data organised into class intervals with corresponding frequencies to simplify analysis of large datasets.

Class Interval

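A range of values used to group data, bounded by a lower and an upper limit; for continuous data it is commonly treated as a half-open interval [a, b) so that adjacent classes do not overlap.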

Class Width (h)

The difference between the upper and lower boundaries of a class interval.

Mid-Value (mi)

The midpoint of a class interval, calculated as (lower limit + upper limit) ÷ 2.

Arithmetic Mean (AM)

The sum of all data values divided by the number of observations; the most common measure of central tendency.

Population Mean Formula

μ = (Σ Xi) / N, where Σ Xi is the sum of all population values and N is the population size.

Sample Mean Formula

x̄ = (Σ Xi) / n, where Σ Xi is the sum of sample values and n is the sample size.
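
As a minimal sketch of the formula above, the mean of ungrouped data can be computed directly in Python. The function name `arithmetic_mean` and the sample values are hypothetical, chosen only for illustration.

```python
def arithmetic_mean(values):
    """Sum of the values divided by how many there are."""
    if len(values) == 0:
        raise ValueError("the mean of an empty dataset is undefined")
    return sum(values) / len(values)


# Hypothetical sample of n = 6 observations.
sample = [4, 8, 15, 16, 23, 42]
x_bar = arithmetic_mean(sample)  # 108 / 6
print(x_bar)
```

The same function gives the population mean μ when it is handed every value in the population rather than a sample.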

Direct Method (Mean)

Computation of the mean by summing all observed values (or fi xi for grouped data) and dividing by the total number of observations.

Indirect / Shortcut Method (Mean)

Mean calculation using an assumed mean A and deviations di = xi − A: x̄ = A + (Σ fi di) / n, where n = Σ fi.

Step Deviation Method

A shortcut mean formula for grouped data with equal class widths: x̄ = A + [(Σ fi di) / n] × h, where A is the assumed mean, h is the class width and di = (xi − A) / h is the step deviation.
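
The direct, shortcut and step-deviation methods should all give the same mean for the same grouped data. The sketch below checks this on a hypothetical frequency table (classes 0-10 to 40-50), taking di = xi − A for the shortcut method and di = (xi − A) / h for the step-deviation method.

```python
# Hypothetical grouped data: classes 0-10, 10-20, ..., 40-50 (width h = 10).
lower_limits = [0, 10, 20, 30, 40]
frequencies = [5, 8, 15, 16, 6]
h = 10

midpoints = [lo + h / 2 for lo in lower_limits]  # xi = 5, 15, 25, 35, 45
n = sum(frequencies)                             # n = 50

# Direct method: mean = sum(fi * xi) / n
mean_direct = sum(f * x for f, x in zip(frequencies, midpoints)) / n

# Shortcut method: di = xi - A, mean = A + sum(fi * di) / n
A = 25  # assumed mean; any convenient value works, usually a central midpoint
mean_shortcut = A + sum(f * (x - A) for f, x in zip(frequencies, midpoints)) / n

# Step-deviation method: di = (xi - A) / h, mean = A + (sum(fi * di) / n) * h
mean_step = A + sum(f * (x - A) / h for f, x in zip(frequencies, midpoints)) / n * h

print(mean_direct, mean_shortcut, mean_step)
```

The choice of A changes the intermediate sums but not the result, which is why the shortcut methods are only a convenience for hand calculation.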

Weighted Arithmetic Mean

Mean that multiplies each value by a weight reflecting its importance: x̄w = Σ wi xi / Σ wi.
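
A short sketch of the weighted mean formula follows. The scores and weights are made-up values, and `weighted_mean` is a hypothetical helper name.

```python
def weighted_mean(values, weights):
    """Return sum(wi * xi) / sum(wi)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("the weights must not sum to zero")
    return sum(w * x for w, x in zip(weights, values)) / total_weight


# Hypothetical example: three scores weighted 2 : 3 : 5.
scores = [80, 90, 70]
weights = [2, 3, 5]
x_w = weighted_mean(scores, weights)  # (160 + 270 + 350) / 10
print(x_w)
```

With equal weights the result reduces to the ordinary arithmetic mean.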

Median

The middle value that divides an ordered dataset into two equal halves; the 50th percentile (P50) and second quartile (Q2).

Median Class

In grouped data, the class interval containing the (n / 2)th observation after cumulative frequencies are calculated.

Median Formula (Grouped Data)

Median = l + [(n / 2 − cf) / f] × h, where l is the lower limit of the median class, cf is the cumulative frequency of the class before it, f is the frequency of the median class and h is the class width.
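
The following sketch locates the median class from the cumulative frequencies and then applies the formula. The frequency table is hypothetical, and equal class widths are assumed.

```python
from itertools import accumulate


def grouped_median(lower_limits, frequencies, h):
    """Median = l + ((n/2 - cf) / f) * h for equal-width classes."""
    n = sum(frequencies)
    if n == 0:
        raise ValueError("no observations")
    cumulative = list(accumulate(frequencies))
    # Median class: first class whose cumulative frequency reaches n / 2.
    k = next(i for i, c in enumerate(cumulative) if c >= n / 2)
    l = lower_limits[k]
    cf = cumulative[k - 1] if k > 0 else 0  # cumulative frequency before the class
    f = frequencies[k]
    return l + (n / 2 - cf) / f * h


# Hypothetical classes 0-10, 10-20, ..., 40-50.
median = grouped_median([0, 10, 20, 30, 40], [5, 8, 15, 16, 6], 10)
print(median)  # 20 + (25 - 13) / 15 * 10
```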

Mode

The value or class interval with the highest frequency in a dataset.

Modal Class

For grouped data, the class interval with the greatest frequency.

Mode Formula (Grouped Data)

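Mode = l + [(fₘ − fₘ₋₁) / (2 fₘ − fₘ₋₁ − fₘ₊₁)] × h, where l is the lower limit of the modal class, fₘ is the modal class frequency, fₘ₋₁ and fₘ₊₁ are the frequencies of the classes immediately before and after it, and h is the class width.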
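
A possible implementation of the grouped-data mode formula is sketched below. Two choices here are assumptions, not part of the card: a missing neighbour (when the modal class is the first or last class) is treated as having frequency 0, and ties for the highest frequency are resolved by taking the first such class.

```python
def grouped_mode(lower_limits, frequencies, h):
    """Mode = l + (fm - f_prev) / (2*fm - f_prev - f_next) * h."""
    # Index of the modal class (first class with the greatest frequency).
    m = max(range(len(frequencies)), key=frequencies.__getitem__)
    fm = frequencies[m]
    # Assumption: a neighbour that does not exist counts as frequency 0.
    f_prev = frequencies[m - 1] if m > 0 else 0
    f_next = frequencies[m + 1] if m < len(frequencies) - 1 else 0
    denominator = 2 * fm - f_prev - f_next
    if denominator == 0:
        raise ValueError("formula undefined: modal class equals both neighbours")
    return lower_limits[m] + (fm - f_prev) / denominator * h


# Hypothetical classes 0-10, 10-20, ..., 40-50; modal class is 30-40.
mode = grouped_mode([0, 10, 20, 30, 40], [5, 8, 15, 16, 6], 10)
print(mode)  # 30 + (16 - 15) / (32 - 15 - 6) * 10
```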

Unimodal Distribution

A frequency distribution with one mode (single peak).

Bimodal Distribution

A distribution possessing two values of equal highest frequency, resulting in two peaks.

Partition Values

Statistical measures (quartiles, deciles, percentiles) that divide ordered data into equal-sized parts.

Quartiles (Q1, Q2, Q3)

Values that split an ordered dataset into four equal parts, marking 25 %, 50 % and 75 % positions.

Quartile Formula (Grouped Data)

Qi = l + [(i × n / 4 − cf) / f] × h, where i = 1, 2, 3 and l, cf, f and h refer to the class containing the (i × n / 4)th observation.

Deciles (D1–D9)

Nine values dividing ordered data into ten equal parts, each representing 10 % of the observations.

Decile Formula (Grouped Data)

Di = l + [(i × n / 10 − cf) / f] × h, where i = 1 … 9 and l, cf, f and h refer to the class containing the (i × n / 10)th observation.

Percentiles (P1–P99)

Ninety-nine values dividing ordered data into one hundred equal parts; Pk is the value below which k % of the observations fall.

Percentile Formula (Grouped Data)

Pi = l + [(i × n / 100 − cf) / f] × h, where i = 1 … 99 and l, cf, f and h refer to the class containing the (i × n / 100)th observation.
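
The quartile, decile and percentile formulas differ only in the divisor (4, 10 or 100), so one function can cover all three. The sketch below assumes equal class widths; `partition_value` and its `parts` parameter are hypothetical names, and the frequency table is made up.

```python
from itertools import accumulate


def partition_value(lower_limits, frequencies, h, i, parts):
    """i-th partition value: l + ((i*n/parts - cf) / f) * h.

    parts = 4 gives quartiles, 10 deciles, 100 percentiles.
    """
    if not 0 < i < parts:
        raise ValueError("i must satisfy 0 < i < parts")
    n = sum(frequencies)
    if n == 0:
        raise ValueError("no observations")
    position = i * n / parts
    cumulative = list(accumulate(frequencies))
    # Class containing the required observation.
    k = next(idx for idx, c in enumerate(cumulative) if c >= position)
    cf = cumulative[k - 1] if k > 0 else 0
    return lower_limits[k] + (position - cf) / frequencies[k] * h


# Hypothetical classes 0-10, 10-20, ..., 40-50.
lows, freqs, width = [0, 10, 20, 30, 40], [5, 8, 15, 16, 6], 10
q1 = partition_value(lows, freqs, width, 1, 4)
q3 = partition_value(lows, freqs, width, 3, 4)
d9 = partition_value(lows, freqs, width, 9, 10)
p50 = partition_value(lows, freqs, width, 50, 100)
print(q1, q3, d9, p50)
```

Here `p50` coincides with the grouped median, since P50, D5 and Q2 all mark the same position n / 2.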

Cumulative Frequency

The running total of frequencies up to and including a given class.

Ogive

A cumulative frequency curve used to estimate median, quartiles, deciles and percentiles graphically.
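
As a small illustration, the running totals and the points an ogive would pass through can be built with `itertools.accumulate`. The table is hypothetical, and plotting against the upper class limits is assumed as the convention.

```python
from itertools import accumulate

# Hypothetical classes 0-10, 10-20, ..., 40-50.
upper_limits = [10, 20, 30, 40, 50]
frequencies = [5, 8, 15, 16, 6]

# Running total of the frequencies, class by class.
cumulative = list(accumulate(frequencies))

# (upper class limit, cumulative frequency) pairs to plot as the ogive.
ogive_points = list(zip(upper_limits, cumulative))
print(ogive_points)
```

The last cumulative frequency always equals the total number of observations, which is a quick check on the arithmetic.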

Symmetrical Distribution

A distribution whose values are evenly spread around the centre; when it has a single peak, mean = median = mode.

Positively Skewed Distribution

A distribution with a long right tail, in which typically Mean > Median > Mode.

Negatively Skewed Distribution

A distribution with a long left tail, in which typically Mean < Median < Mode.
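
The ordering of mean, median and mode under skew can be checked on a small example. Both datasets below are invented for illustration; the second is simply the mirror image of the first.

```python
from statistics import mean, median, mode

# Hypothetical data with a long right tail (two large values).
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 6, 12, 15]

# Mirror image of the same data, giving a long left tail.
left_skewed = [16 - x for x in right_skewed]

for name, data in [("right", right_skewed), ("left", left_skewed)]:
    print(name, mean(data), median(data), mode(data))
```

For real data the ordering is a tendency rather than a rule, so it is worth computing all three measures instead of inferring one from another.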

Outlier

An observation markedly distant from other values in the dataset, potentially distorting the mean.

Ideal Measure of Central Tendency

A measure that is rigidly defined, easy to compute, based on all observations, algebraically tractable, minimally variable across samples and resistant to extreme values.

Statistics

Branch of mathematics concerned with collecting, analysing, interpreting, and presenting data.

Descriptive Statistics

Methods that summarise and organise data using measures such as mean, median, mode, and graphs.

Inferential Statistics

Techniques that draw conclusions about a population based on data from a sample, e.g., hypothesis testing.

Functions of Statistics

Sequential activities of data collection, organisation, analysis, and interpretation to support decision-making.

Data Collection

Process of gathering relevant information to meet a study’s objectives.

Direct Data Collection

Primary data gathered firsthand via surveys, interviews, observation, or experiments.

Indirect Data Collection

Secondary data obtained from existing sources such as reports, databases, or historical records.

Tabulation

Systematic arrangement of data in rows and columns for easy comparison and analysis.

Class Interval

Numerical range that groups data values, defined by upper and lower limits.

Frequency (Absolute Frequency)

Number of times a particular observation occurs in a data set; denoted by f.

Cumulative Frequency

Running total of frequencies for all classes up to a specified point in an ordered data set.

Frequency Distribution

Table that shows the number of observations falling into each class interval.
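
One way to build such a table from raw values is sketched below. It assumes equal-width, half-open classes [a, b), so a value on a boundary falls into the higher class; the function name and the data are hypothetical.

```python
from collections import Counter


def frequency_distribution(values, start, h, num_classes):
    """Count values in classes [start + k*h, start + (k+1)*h), k = 0..num_classes-1."""
    counts = Counter()
    for v in values:
        k = int((v - start) // h)  # index of the class containing v
        if not 0 <= k < num_classes:
            raise ValueError(f"{v} lies outside the class intervals")
        counts[k] += 1
    return [((start + k * h, start + (k + 1) * h), counts[k])
            for k in range(num_classes)]


# Hypothetical raw (ungrouped) observations.
raw = [3, 7, 12, 18, 21, 25, 25, 33, 38, 47]
table = frequency_distribution(raw, start=0, h=10, num_classes=5)
for (low, high), f in table:
    print(f"{low}-{high}: {f}")
```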

Contingency Table

Cross-tabulation displaying the frequency distribution of two or more categorical variables.
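
A two-way contingency table can be sketched by counting pairs of categories. The records below (area and a yes/no response) are invented purely for illustration.

```python
from collections import Counter

# Hypothetical survey records: (area, response).
records = [
    ("urban", "yes"), ("urban", "no"), ("rural", "yes"),
    ("urban", "yes"), ("rural", "no"), ("rural", "no"),
]

cell_counts = Counter(records)               # frequency of each (row, column) pair
rows = sorted({area for area, _ in records})
cols = sorted({answer for _, answer in records})

# Nested dict: contingency[row][column] -> frequency.
contingency = {r: {c: cell_counts[(r, c)] for c in cols} for r in rows}
print(contingency)
```

Summing a row or a column of the table gives the marginal frequency of that category.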

Exploratory Data Analysis (EDA)

Initial investigation of data to uncover patterns, spot anomalies, and test assumptions through visualisation.

Measures of Central Tendency

Single values (mean, median, mode) that describe the centre of a data set.

Scatter Plot

Graph plotting paired numerical data to reveal relationships or correlations between two variables.

Bar Chart

Graphical display of categorical data where bar heights represent frequencies or proportions.

Histogram

Two-dimensional graph of continuous data showing frequencies within adjoining class intervals.

Pie Chart

Circular graph divided into sectors representing proportional parts of a whole.

Ogive (Cumulative Frequency Curve)

Graph plotting cumulative frequency against upper class limits to show data accumulation.

Box Plot

Box-and-whisker diagram summarising the median, quartiles, overall spread and any outliers of numerical data.
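
The quantities behind a box plot can be computed with the standard library. The sketch below flags outliers with the 1.5 × IQR fence rule, which is a common convention assumed here rather than something stated on the card; quartile values also depend on the interpolation method (this uses the default of `statistics.quantiles`). The data are hypothetical.

```python
from statistics import quantiles

# Hypothetical observations with one extreme value.
data = sorted([2, 4, 4, 5, 6, 7, 8, 9, 40])

q1, q2, q3 = quantiles(data, n=4)  # quartiles; q2 is the median
iqr = q3 - q1                      # interquartile range (length of the box)

# Assumed convention: points beyond 1.5 * IQR from the box are outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(q1, q2, q3, outliers)
```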

Sampling

Technique of selecting a subset of a population to estimate characteristics of the whole.

Population

Entire set of individuals or items about which information is sought.

Sample

Subset of a population selected for study, ideally reflecting population characteristics.

Sampling Frame

Complete list or set of criteria that defines all elements eligible for sampling.

Probability Sampling

Sampling approach where every population member has a known, non-zero chance of selection.

Non-Probability Sampling

Sampling where some population members may have unknown or zero chance of selection.

Simple Random Sampling

Method giving each population element an equal chance of selection, minimising bias.

Systematic Sampling

Selecting every kth element from an ordered population after a random start.
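
A minimal sketch of systematic sampling: pick a random start among the first k elements, then take every kth element from there. The function name, the `rng` parameter and the population of 100 numbered units are all hypothetical.

```python
import random


def systematic_sample(population, k, rng=random):
    """Return every k-th element of `population` after a random start."""
    if k < 1:
        raise ValueError("k must be at least 1")
    start = rng.randrange(k)     # random start among the first k positions
    return population[start::k]  # then every k-th element


# Hypothetical ordered population of 100 units, sampling interval k = 10.
population = list(range(100))
sample = systematic_sample(population, 10, random.Random(0))
print(sample)
```

Passing a seeded `random.Random` instance makes the draw repeatable, which is convenient for checking work.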

Stratified Sampling

Dividing the population into homogeneous strata and drawing a random sample from each stratum, often in proportion to its size.
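
The sketch below shows stratified sampling with proportional allocation: the same sampling fraction is applied within every stratum. The helper name, the `stratum_of` callback and the two strata "A" and "B" are assumptions made for the example, and rounding the stratum sample size is one of several possible allocation choices.

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, fraction, rng=random):
    """Draw a simple random sample of the given fraction from each stratum."""
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)  # group units by stratum
    sample = []
    for members in strata.values():
        size = round(len(members) * fraction)  # proportional allocation
        sample.extend(rng.sample(members, size))
    return sample


# Hypothetical population: 60 units in stratum "A" and 40 in stratum "B".
units = [("A", i) for i in range(60)] + [("B", i) for i in range(40)]
chosen = stratified_sample(units, lambda u: u[0], 0.1, random.Random(0))
print(sorted(chosen))
```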

Cluster Sampling

Dividing population into clusters, randomly selecting clusters, then sampling all or some units within them.
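
As an illustration of the one-stage variant, the sketch below selects whole clusters at random and keeps every unit inside them. The cluster names and members are hypothetical, and sampling only some units within each chosen cluster would need a further step that is not shown.

```python
import random


def cluster_sample(clusters, num_clusters, rng=random):
    """Randomly choose `num_clusters` clusters and keep all of their units."""
    chosen = rng.sample(sorted(clusters), num_clusters)  # pick cluster names
    return {name: clusters[name] for name in chosen}


# Hypothetical population grouped into four clusters (e.g. schools).
clusters = {
    "school_1": ["a", "b", "c"],
    "school_2": ["d", "e"],
    "school_3": ["f", "g", "h", "i"],
    "school_4": ["j"],
}
picked = cluster_sample(clusters, 2, random.Random(0))
print(picked)
```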

Sample Size

Number of observational units included in a sample.

Sampling Bias

Systematic error caused by non-representative sampling, leading to incorrect conclusions.

Numerical Data

Data expressed as numbers, suitable for arithmetic operations; includes discrete and continuous types.

Categorical Data

Data consisting of labels or categories, analysed by counting frequency of occurrence.

Qualitative Data

Non-numeric data describing qualities, attributes, or opinions.

Quantitative Data

Numeric data representing measured quantities, enabling statistical calculations.

Discrete Data

Numerical data that take only specific, separate values (e.g., number of students).

Continuous Data

Numerical data that can take any value within a range (e.g., height, weight).

Primary Data

Information collected firsthand specifically for the current research purpose.

Secondary Data

Information previously collected for another purpose and reused in new research.

Structured Data

Organised data in predefined formats such as tables or spreadsheets.

Unstructured Data

Data lacking a predefined format, e.g., text, images, videos.

Static Data

Data that remain unchanged over time, typically historical or reference data.

Dynamic Data

Data that change frequently and may be updated in real time.

Sensitive Data

Information requiring special protection due to confidentiality, e.g., medical or financial records.

Non-Sensitive Data

Information that can be shared freely without compromising privacy or security.

Theoretical Distribution

A probability-based mathematical model that predicts how values are expected to behave under ideal conditions (e.g., normal, binomial, Poisson, exponential).

Empirical Distribution

A distribution derived from observed data rather than theoretical probability rules.

Random Experiment

A process of measurement or observation whose outcome is uncertain but whose possible results are well defined.

Outcome

A single possible result of a random experiment or trial.

Sample Space (S)

The set of all possible outcomes of a random experiment.
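
For small experiments the sample space can be listed explicitly. The sketch below enumerates it for two hypothetical experiments, tossing a coin twice and rolling two dice, using `itertools.product`.

```python
from itertools import product

# Experiment 1: toss a coin twice; each outcome is an ordered pair.
coin = ["H", "T"]
two_tosses = list(product(coin, repeat=2))

# Experiment 2: roll two six-sided dice.
die = range(1, 7)
two_dice = list(product(die, repeat=2))

print(two_tosses)
print(len(two_dice))
```

Each element of the resulting list is one outcome, and the list as a whole is the sample space S.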