What are large datasets?
Large datasets in science are expansive collections of information that exceed the capacity of traditional tools for storage, management, and analysis.
What are common features of large datasets?
They contain many data points and multiple variables, reveal patterns and trends, and often require digital tools for analysis.
How are large datasets collected?
Large datasets can be collected through surveys, experiments, sensors, satellites, databases, or online tracking systems.
What are large datasets used for?
They are used to identify patterns, make predictions, support research, and solve real-world problems.
Give an example of a large dataset application
Weather forecasting and climate modelling use large datasets to predict weather and climate patterns.
How do you develop a question using a dataset?
Create a question that can be answered by analysing variables within the dataset.
Example: “Is there a relationship between study time and test scores?”
What are descriptive statistics?
Descriptive statistics are methods used to summarise and describe data.
What is the mean?
The mean is the average value found by adding all values and dividing by the total number of values.
What is the median?
The median is the middle value when the data are arranged in order. With an even number of values, it is the mean of the two middle values.
What is the mode?
The mode is the value that appears most frequently in a dataset.
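The three measures of centre defined above can be checked on a small example. This is a minimal sketch using Python's standard `statistics` module; the `scores` list is invented purely for illustration and does not come from any real dataset.

```python
import statistics

# Hypothetical test scores, invented for illustration only
scores = [62, 70, 70, 75, 81, 88, 93]

# Mean: add all values, then divide by the number of values
mean_score = statistics.mean(scores)

# Median: the middle value once the data are sorted
median_score = statistics.median(scores)

# Mode: the value that appears most often
mode_score = statistics.mode(scores)

print("mean:", mean_score)
print("median:", median_score)
print("mode:", mode_score)
```

Here the mean (77) sits above the median (75) because the few high scores pull the average up, which is one reason to report more than one measure of centre.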
What is the range?
The range is the difference between the highest and lowest values in a dataset.
What is the IQR?
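The interquartile range (IQR) measures the spread of the middle 50% of the data. It is calculated as the third quartile minus the first quartile (Q3 - Q1).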
What is standard deviation?
Standard deviation measures how spread out data values are from the mean.
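The three measures of spread (range, IQR, and standard deviation) can be sketched as follows. The `values` list is invented for illustration. Note two assumptions: quartiles can be calculated by several slightly different methods, so the IQR from a calculator or spreadsheet may differ a little from the one `statistics.quantiles` gives by default, and the sketch shows both the population and the sample form of the standard deviation because the cards do not say which one is intended.

```python
import statistics

# Hypothetical measurements, invented for illustration only
values = [2, 4, 4, 4, 5, 5, 7, 9]

# Range: highest value minus lowest value
data_range = max(values) - min(values)

# Quartiles: statistics.quantiles with n=4 returns [Q1, Q2, Q3]
# (default "exclusive" method; other methods can give slightly different cut points)
q1, q2, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1

# Standard deviation: typical distance of values from the mean
population_sd = statistics.pstdev(values)  # treats the data as the whole population
sample_sd = statistics.stdev(values)       # treats the data as a sample (divides by n - 1)

print("range:", data_range)
print("IQR:", iqr)
print("population SD:", population_sd)
print("sample SD:", round(sample_sd, 3))
```

The sample standard deviation is always a little larger than the population version for the same data, and the gap shrinks as the dataset gets larger.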
What are the benefits of descriptive statistics?
They simplify large datasets, identify patterns, and make data easier to communicate.
What are weaknesses of descriptive statistics?
They may hide important details, ignore causes, and sometimes oversimplify data.
What is univariate analysis?
Univariate analysis examines one variable at a time.
What is included in a univariate analysis?
Measures of centre, spread, graphs, and identifying patterns or anomalies.
What is a histogram?
A histogram displays the frequency distribution of continuous numerical data.
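A histogram is built by grouping values into equal-width intervals (bins) and counting how many values fall in each. The sketch below does that counting step and prints a rough text version of the bars. The `heights` data, the bin width of 10, and the starting value of 150 are all assumptions chosen for illustration.

```python
from collections import Counter

# Hypothetical heights in cm, invented for illustration only
heights = [150, 152, 155, 158, 160, 161, 163, 165, 168, 172, 175, 181]

bin_width = 10   # assumed width of each interval
bin_start = 150  # assumed lower edge of the first interval

# Map each value to the lower edge of the bin it falls into, then count per bin
bin_counts = Counter(
    bin_start + ((h - bin_start) // bin_width) * bin_width for h in heights
)

# Print one row per bin, with one '#' per value counted in that bin
for lower in sorted(bin_counts):
    upper = lower + bin_width
    print(f"{lower}-{upper}: {'#' * bin_counts[lower]}")
```

Changing `bin_width` changes the shape of the histogram, so it is worth trying a few widths before describing the distribution.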
What is a box plot?
A box plot shows the median, quartiles, spread, and possible outliers in data.
What is bivariate analysis?
Bivariate analysis examines the relationship between two variables.
What is a scatter plot?
A scatter plot graphs pairs of data values to show relationships between variables.
What is correlation?
Correlation describes the strength and direction of a relationship between two variables.
What does a positive correlation mean?
As one variable increases, the other also increases.
What does a negative correlation mean?
As one variable increases, the other decreases.
What does zero correlation mean?
There is no linear relationship between the variables.
What is a correlation coefficient?
A correlation coefficient is a number between -1 and +1 that measures the strength and direction of the linear relationship between two variables.
How do you interpret a correlation coefficient?
Close to +1 = strong positive correlation
Close to -1 = strong negative correlation
Close to 0 = weak or no correlation
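To make the interpretation above concrete, here is a minimal sketch that calculates a correlation coefficient by hand. It assumes the Pearson coefficient, which is the most commonly taught version; the cards do not name a specific one. The function name `pearson_r` and the study-time data are hypothetical, invented for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # How much the two variables vary together
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable varies on its own
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    # Undefined (division by zero) if either variable never changes
    return co_variation / math.sqrt(spread_x * spread_y)

# Hypothetical data: hours studied and test score for five students
study_hours = [1, 2, 3, 4, 5]
test_scores = [52, 60, 61, 70, 77]

r_study = pearson_r(study_hours, test_scores)
print("r =", round(r_study, 3))
```

A value of r close to +1 for this invented data would be read as a strong positive correlation: students who studied longer tended to score higher. On its own it says nothing about why.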
Does correlation prove causation?
No. Correlation only shows association, not cause and effect.
What conditions are needed to establish causation?
There must be a clear relationship, controlled variables, repeated evidence, and a logical scientific explanation.
Why are large datasets important in science?
They improve accuracy and reliability, and they allow scientists to identify trends and patterns.
What is an outlier/anomaly in a dataset?
An outlier, or anomaly, is a data point that does not fit the overall pattern.
Why is identifying outliers important?
Outliers may indicate errors, unusual events, or important discoveries.
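One common way to flag possible outliers is the 1.5 x IQR rule: any value more than 1.5 x IQR below Q1 or above Q3 is marked for a closer look. This rule is a widely used convention rather than something stated in these cards, and the `readings` data below are invented for illustration. A flagged value still needs to be investigated before deciding whether it is an error, an unusual event, or a real finding.

```python
import statistics

# Hypothetical sensor readings, invented for illustration only
readings = [12, 13, 14, 15, 15, 16, 17, 18, 45]

# Quartiles from the standard library (default "exclusive" method)
q1, _, q3 = statistics.quantiles(readings, n=4)
iqr = q3 - q1

# Fences: values outside these limits are flagged as possible outliers
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = [r for r in readings if r < lower_fence or r > upper_fence]

print("fences:", lower_fence, "to", upper_fence)
print("possible outliers:", outliers)
```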
How should findings from data analysis be communicated?
Using clear scientific language, graphs, tables, statistics, and conclusions.
Why are graphs useful in data analysis?
Graphs help visualise patterns, trends, and relationships clearly.
What are evidence-based decisions?
Decisions made using reliable data and scientific evidence.
Why should implications of decisions be assessed?
To understand possible effects, risks, benefits, and consequences of decisions.
Give an example of using data for decision making
Using pollution data to decide whether stricter environmental laws are needed.
Why is statistical analysis important in science?
Statistical analysis helps scientists interpret data accurately and determine whether results are meaningful.