Continuous Data
Quantitative measurements on a continuous scale
Regression Equation
The line of best fit is described by a regression equation in the form of y = mx + c
We use this to predict values of y for a particular value of x (but only within the range of x values in our data set, as we cannot be sure the relationship holds beyond the limits of our data)
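A minimal sketch of fitting and using a regression equation (not part of the original card; assumes Python with numpy, and the data points are made up):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # hypothetical data
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

m, c = np.polyfit(x, y, deg=1)             # least-squares line of best fit
x_new = 3.5                                # within min(x)..max(x), so a valid prediction
print(f"predicted y = {m * x_new + c:.2f}")
```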
Nominal Data
Data in the form of categories with names. Data is non-quantitative (it is often counted to produce a discrete value)
Ordinal Data
Data that is ranked / on a rating scale. Data is not quantitative because we do not know the size of the difference between categories
Descriptive Characteristics
Measures calculated from a data set which summarise some characteristic of the data (quantifying patterns in the findings)
Median
The middle number in a sample when the values are placed in order. If there are two numbers in the middle, the median is the average of the two
Measures of Central Tendency
The mean, median and mode
Poisson distribution
Common distribution for discrete data; the shape depends on the mean.
- When the mean is near zero, the distribution is heavily skewed
- When the mean is large, the distribution looks approximately normal
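A small illustration of how the Poisson's shape follows its mean (assumes Python with scipy; the means 0.5 and 20 are arbitrary):

```python
from scipy.stats import poisson

for mu in (0.5, 20):
    pmf = [poisson.pmf(k, mu) for k in range(31)]   # P(count = k)
    peak = pmf.index(max(pmf))
    # Small mean: mass piled at zero (skewed). Large mean: peak near the
    # mean, roughly bell-shaped.
    print(f"mean={mu}: P(0)={pmf[0]:.3f}, most likely count={peak}")
```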
One tailed test
Specific; the null hypothesis states "there is no positive (or no negative) relationship"
- We are interested in only positive or only negative deviations of the test statistic
- The P value is HALVED
Pseudo-replication
Use of non-independent data points as if they were actually independent
Trend
A relationship between two variables, positive or negative
Correlation
A trend / relationship between two variables whose changes coincide, yet where causality is not implied
The variables covary
Spearman's rank correlation coefficient (rho)
Non-parametric statistic used to test the significance of correlations between variables
Can be used when normality / linearity violated
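A quick sketch (assumes Python with scipy; the made-up data are non-linear but monotonic, which rho handles because it works on ranks):

```python
from scipy.stats import spearmanr

x = [1, 2, 3, 4, 5, 6]
y = [1, 4, 9, 16, 25, 36]        # non-linear but perfectly monotonic
rho, p = spearmanr(x, y)
print(rho, p)                    # rho = 1.0: perfect rank correlation
```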
MS among
The average size of the difference between the group means and the grand mean
Covariate
The word used to describe a continuous independent variable in situations where there is a mixture of continuous and categorical independent variables, such as in ANCOVA
Induction
The derivation of general ideas from specific observations.
Hypothetico-deductive reasoning
An alternative to inductive reasoning; it argues that there is no way of proving a hypothesis to be true.
Hypothesis
An idea that is tentatively put forward to explain an observation. It may be generated by or contribute to a more general theory.
Theory
A set of general ideas or rules which are used to explain a group of observations.
Paradigm
A whole way of thinking / viewing the world
Paradigm Shift
A dramatic change in the way in which we think about a subject in science, when the evidence has accumulated in favour of rejecting a previous set of hypotheses or theories
Null Hypothesis
The form of hypothesis that we test formally, that predicts that nothing will happen / no effect will be observed / there is no difference or relationship between the two variables.
Statistics
The branch of mathematics that scientists use to provide a more objective assessment of patterns in data collected from experiments or observations
Sample Size
Number of individuals sampled (n)
Frequency
The number of times something occurs, or a count of the number of items in a particular category
Mean
The average of a sample of numbers - x̅
Mode
Most common number in a sample
Frequency Histogram
A graph showing the frequency of quantitative observations in each of a series of ordered numerical categories.
Discrete - Categories represent each possible total count made
Continuous - Categories are arbitrary
Distribution
The shape of a data set as seen on a frequency histogram.
Hypothetical distributions with mathematical equations include the normal, Poisson and binomial
Deviate
The distance between a particular data point / observation and the mean (also known as a residual in some contexts)
Sum of Squares
(SS)
Total of all the squared deviates for a particular data set. It quantifies the magnitude of the total variability in a data set, but ignores the direction of that variability
Variance
The average size of the squared deviates in a sample; a measure of variability in a data set.
The sample variance (s²) is an estimate of the population variance (σ²)
Standard Deviation
The average size of the deviates in a data set (s)
By taking the square root of the variance, we get a measure of the variation that is not affected by sample size and is in units we understand.
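A worked sketch tying the deviate, sum of squares, variance and standard deviation cards together (plain Python; the sample is made up):

```python
import math

data = [4.0, 7.0, 6.0, 5.0, 8.0]
mean = sum(data) / len(data)                  # x-bar = 6.0
deviates = [x - mean for x in data]           # distance of each point from the mean
ss = sum(d ** 2 for d in deviates)            # sum of squares = 10.0
variance = ss / (len(data) - 1)               # sample variance s² = 2.5
sd = math.sqrt(variance)                      # standard deviation s ≈ 1.58
print(ss, variance, sd)
```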
Population
All the individuals in a particular group
Sample
A subset of the population, chosen to represent the population
Normal Distribution
(Bell curve / Gaussian distribution)
A population of continuous data can have a "normal distribution", which has certain mathematical characteristics:
- It is bell shaped / symmetrical
- About 68% of all points lie within one S.D. of the mean
Standard Error of the Mean
(SEM)
A measure of the confidence we have in our sample mean as an estimate of the real population mean (μ)
It is defined as the standard deviation of the population of sample means
(SEM = s / √n)
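A tiny check of the formula against scipy's built-in (assumes scipy; same made-up sample as in the sketch above):

```python
import math
from scipy.stats import sem

data = [4.0, 7.0, 6.0, 5.0, 8.0]
n = len(data)
mean = sum(data) / n
s = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))   # sample SD
print(s / math.sqrt(n), sem(data))   # both ≈ 0.707 (scipy also uses n-1)
```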
Skew
The asymmetry of a sample's distribution
Skew to the right: a long tail on the right of the distribution
Skew to the left: a long tail on the left of the distribution
A skewed distribution is not symmetrical (so not normal)
Statistical (parametric) tests
Tests which make several key assumptions about the distribution of the data from which they are calculated.
Non-Parametric Tests
Tests which make fewer assumptions about data (such as normal distribution)
Often deal with ranked data
Binomial Distribution
A good description of discrete data, but only in situations where the maximum possible count is close to the mean
Bar chart
Graph used for visualising differences between samples
Scatter graph
Type of graph normally used for visualising trends between variables
Experiment
Manipulation of a variable of interest in order to observe the effects on other variables
Control
A default where the manipulation of the variable being tested is not performed, used for comparison against the results of the experiment
Observational Experiment
A scientific study where data are collected but no manipulation is performed
Measurement Precision
A measurement is imprecise if there is unbiased measurement error; the key is that imprecision is random (you are just as likely to overestimate as to underestimate)
Measurement Accuracy
A measurement is accurate if it is free from bias - which occurs when there is systematic error in your measurements resulting in a consistent over / underestimation
Confounding Variables
A variable that influences your results in a way that may be confused with the variable you are actually interested in. Confounds are caused by a lack of independence in data points, and are avoided by measuring such variables to account for them or by using appropriate control measurements
Can be confused with a real effect
Caused by systematic, non-random variation
Noise
Caused by random variation
Can make it tricky to spot a real effect
Order effect
The order of presenting the treatments affects the dependent variable
Replication
Repetition of an experimental manipulation or observation in identical circumstances. It allows you to gauge how much background or environmental variability there is in your data, regardless of the variable you are interested in. It increases the statistical power
Effect size
The size of the difference between two means, or the steepness of the slope of a trend (a large effect = a big difference / a steep slope)
Statistical power
The degree of ability to detect the signal of an effect that you're interested in. More replication, a larger effect and lower background variability result in higher statistical power
Floor and ceiling effects
When an effect cannot rise above a certain threshold (ceiling) or fall below a certain threshold (floor). Above / below these thresholds, the signal cannot become any greater / smaller
Cause and effect
A manipulative experiment that is conducted to show that changes in A CAUSE changes in B
Otherwise, in a significant relationship, we do not know which way around cause and effect are
Reverse Causation
When causation is in the opposite direction to the hypothesis
A/D Observation
- Easier
- Cheaper / quicker
- Realistic
Tells us less about cause and effect:
- More confounding variables
- Possible reverse causation
A/D Experiment
- Difficult / time consuming
- Expensive
- Artificial (floor / ceiling effects)
Tells us more about cause and effect:
- Fewer confounding variables
- No reverse causation
Statistical Test
A test performed on your data to assess the validity of your null hypothesis
P value
The probability that the observed differences / trends could have arisen by chance, if the null hypothesis were true
Test statistic
Summarises the difference between samples
Treatment
Manipulation performed in an experiment
-Manipulated and control
Statistical Significance
When we conduct a statistical test, we compare our obtained probability (P value) to an arbitrary threshold value.
If the probability is lower than this value, we say the effect is statistically significant and we can reject our Null Hypothesis.
Threshold values (significance level)
Threshold value is dependent on the particular scientific situation, and is set before data is collected so the decision is not influenced by subjective impressions of the data
Independent samples t-test
A parametric statistical test used to test for a difference between the means of two independent samples of continuous data - are the samples from the same population with a single mean
Independent samples t test - t test statistic
The test statistic 't' tells us about the size of the difference between the two samples.
t is big when variance is small and difference between means is big
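A minimal sketch (assumes Python with scipy; both samples are invented):

```python
from scipy.stats import ttest_ind

sample_a = [5.1, 4.9, 5.6, 5.2, 4.8, 5.4]
sample_b = [6.0, 6.3, 5.8, 6.1, 6.4, 5.9]
t, p = ttest_ind(sample_a, sample_b)   # equal variances assumed by default
print(t, p)                            # big |t| and small p: reject H0
```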
Degrees of Freedom
A modified form of the sample size n, it represents the power of the statistical test
- df = n1 + n2 - 2 for the independent samples t-test
The 2 is the number of parameters being estimated in the test (the two sample means)
Two tailed test
General; "there is / is not a relationship"
- We are interested in both positive and negative deviations of the test statistic
Type I error
Rejection of the Null hypothesis when it is in fact true.
With a threshold of 0.05, there is a 5% chance we will make a Type I error
Type II error
The failure to reject the Null Hypothesis when it is in fact false.
The chance is influenced by experimental design, sample size, the chosen test and our threshold value
Independence
Data points are independent if they have nothing special in common except for the treatment or variable of interest.
Non-independence
Arises from repeated measures or non-random sampling
Causes confounded results; we cannot tell if observed differences are the result of the treatments or of other confounding variables
Repeated Measures
Repeated observations made on the same subjects in an experiment.
Paired Design
An experimental design for collection of non-independent samples.
Before and after experiment
A paired design
Data are collected from the same group of individuals, before and after an experimental treatment. In this situation, the two data points are non-independent of each other, and the animals themselves act as the control.
Examines average change in variable
Time could still be confounding
Welch two-sample t-test
Used when the variances of the samples are significantly different, but the data are still normal.
There is a small tweak to the degrees of freedom
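The same scipy call covers Welch's version; passing equal_var=False applies the degrees-of-freedom tweak (a sketch with invented data):

```python
from scipy.stats import ttest_ind

low_var = [5.0, 5.1, 4.9, 5.0, 5.1]
high_var = [3.0, 8.0, 4.5, 7.5, 5.5]
print(ttest_ind(low_var, high_var, equal_var=False))   # Welch's t-test
```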
Paired samples Wilcoxon test
Non-parametric equivalent of the paired t-test.
It assumes the samples are paired, rather than independent
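A sketch using scipy's paired signed-rank test on invented before/after data:

```python
from scipy.stats import wilcoxon

before = [10, 12, 14, 16, 18, 20, 22, 24]
after = [11, 14, 17, 20, 23, 26, 29, 32]   # every subject increased
stat, p = wilcoxon(before, after)          # ranks the paired differences
print(stat, p)                             # small p: a consistent change
```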
Homogeneity of variance
When the variances in each sample in a statistical test are assumed to be the same (homogenous)
Transformation
We often assume data are normally distributed - if not, we can try to transform the data in order to obtain a normal distribution
- Square root / logarithm of data
Arcsine transformation
Proportion data is rarely normally distributed; taking the arcsine of the square root of the proportions transforms the distribution of the collected data towards normality
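A one-line version of the transform (assumes numpy; the proportions are made up):

```python
import numpy as np

proportions = np.array([0.02, 0.10, 0.25, 0.50, 0.75, 0.90, 0.98])
transformed = np.arcsin(np.sqrt(proportions))   # stretches the squashed tails
print(transformed)
```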
Levene's test
A test for the homogeneity of variance of samples
H0 is that the variances ARE THE SAME
Shapiro-Wilk test
A test for normality of sample distribution
H0 is that data ARE NORMALLY DISTRIBUTED
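A sketch of both assumption checks above (assumes scipy; the samples are invented). Large P values mean we cannot reject either H0:

```python
from scipy.stats import levene, shapiro

a = [5.1, 4.9, 5.6, 5.2, 4.8, 5.4, 5.0, 5.3]
b = [6.0, 6.3, 5.8, 6.1, 6.4, 5.9, 6.2, 6.0]

print(levene(a, b))    # H0: the variances are the same
print(shapiro(a))      # H0: the data are normally distributed
```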
Two-sample Wilcoxon test
Non-parametric equivalent of the independent samples t-test
Examines the difference between two samples of ranked data
H0 is that two samples come from a single population with a single mean rank
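A sketch (assumes scipy, which exposes this as the Wilcoxon rank-sum test; it is equivalent to the Mann-Whitney U test):

```python
from scipy.stats import ranksums

a = [3, 5, 4, 6, 2, 5]
b = [8, 9, 7, 10, 8, 9]
print(ranksums(a, b))   # small p: the two mean ranks differ
```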
Chi-Squared test (χ²)
A test used to examine differences between observed and expected counts / frequencies - we are asking if the frequencies of individual observations made in two or more categories are significantly different from the frequencies we would expect to find if H0 were true
df = (a - 1), where a is the number of categories
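A sketch (assumes scipy; the counts are invented) testing observed counts in three categories against an equal-frequency expectation:

```python
from scipy.stats import chisquare

observed = [50, 30, 20]        # counts in a = 3 categories
result = chisquare(observed)   # expected defaults to equal frequencies
print(result)                  # df = a - 1 = 2
```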
Contingency table
A table of observed counts or frequencies in a number of categories
Causal relationship
A trend / relationship between two variables where one variable causes changes in the other variable
Pearson's Correlation Coefficient
Parametric statistic used to test the significance of correlations between two variables
Both variables must be normally distributed and have a linear relationship
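A sketch (assumes scipy; the invented data are roughly linear):

```python
from scipy.stats import pearsonr

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.2, 4.1, 5.9, 8.3, 9.8, 12.1]
r, p = pearsonr(x, y)
print(r, p)   # r near +1: a strong positive linear correlation
```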
Data dredging
Use of certain statistics to test large numbers of possible relationships between variables in the absence of specific hypotheses formulated in advance
- useful for spotting patterns and generating new hypotheses
ANOVA (analysis of variance)
A parametric statistical test for differences between any number of groups or samples; it can analyse differences in samples caused by more than one variable.
Factor (ANOVA)
An independent variable affecting a sample in analysis of variance
Level (ANOVA)
Each different value that a factor in ANOVA can take
Multi-way ANOVA
ANOVA that tests more than one null hypothesis simultaneously
F ratio
Statistic used to test the null hypothesis in ANOVA, calculated from the among- and within-group SS (as mean squares: F = MS among / MS within)
It allows us to compare the relative amounts of variation among and within groups
A large F shows a large variation among groups compared to within groups
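A sketch of a one-way ANOVA via scipy (three invented groups):

```python
from scipy.stats import f_oneway

g1 = [4.8, 5.1, 5.3, 4.9, 5.0]
g2 = [5.9, 6.2, 6.0, 6.3, 6.1]
g3 = [7.1, 6.8, 7.0, 7.2, 6.9]
F, p = f_oneway(g1, g2, g3)
print(F, p)   # large F: variation among groups >> variation within groups
```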
Grand Mean
X bar bar
Mean of all the data points in all the groups / samples in ANOVA
Group Mean
X bar
Mean of the data points in an individual group / sample in ANOVA
SS among
The total amount of variation among (between) groups - adding up the squared differences between each group mean and the grand mean
SS within
Total amount of variation within groups - adding up the squared differences between each data point and the relevant group mean.
MS within
The average size of the difference between the data points and the relevant group mean
ANOVA table
Results of ANOVA presented in a table, showing among / within SS, MS, df, F and P