In-Depth Notes on Scientific Experiment and Data Analysis
Objective of the Session:
Discussion of the scientific experiment relevant to assessment item two.
No use of R Commander during this session; focus is on the experimental process.
Review of central tendency, graphical summary, and explanation of data variability.
Key Concepts in the Discussion:
GPS and Variability Map:
GPS technology helps determine where variability occurs within a paddock.
RTK GPS provides centimeter-level accuracy.
The resulting variability map marks low-yield (red) and high-yield (green) regions in paddocks.
Scientific Method:
Systematic investigation of phenomena to generate new knowledge.
Six steps: Ask a question, conduct background research, form a hypothesis, conduct experiments, analyze data, and draw conclusions.
Hypothesis Formation:
Initial hypothesis formed that soil acidity contributes to yield variability in paddocks of the same soil type.
Because data were limited, several candidate causes of variability were considered (soil acidity, nitrogen deficiency, diseases, etc.).
Hypothesis tested through experiments, particularly in a controlled glasshouse.
Experimental Design and Procedures:
Three soil treatments used:
Good soil, poor soil, and poor soil with lime.
Lime applied to increase soil pH (neutralize acidity).
Students divided into groups, each responsible for different treatments.
Randomize the placement of soil treatments to minimize systematic error.
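The randomization step can be sketched in Python as follows. This is a minimal sketch, assuming each treatment is replicated in several pots that are placed in numbered bench positions. The treatment names follow the notes; the number of replicates, the seed, and the function name `randomize_layout` are illustrative assumptions, not details from the session.

```python
import random

def randomize_layout(treatments, replicates, seed=None):
    """Return a shuffled list of treatments, one entry per pot position."""
    rng = random.Random(seed)  # a fixed seed makes the layout reproducible
    # One entry per pot: each treatment repeated `replicates` times.
    pots = [t for t in treatments for _ in range(replicates)]
    rng.shuffle(pots)  # random order removes any systematic placement pattern
    return pots

# Treatment names from the notes; 4 replicates is an assumed value.
treatments = ["good soil", "poor soil", "poor soil + lime"]
layout = randomize_layout(treatments, replicates=4, seed=1)

for position, treatment in enumerate(layout, start=1):
    print(f"position {position}: {treatment}")
```

Shuffling the full list of pots, rather than placing one treatment at a time, keeps the number of pots per treatment equal while leaving their positions to chance.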
Data Collection and Analysis:
Measurement of plant height and biomass after a six-week growth period.
Use of paper bags for biomass measurement; plants dried in an oven to obtain dry biomass data.
Data collection in preparation for assessment item two, which includes analysis of plant height and dry biomass.
Introduction of ANOVA:
ANOVA (Analysis of Variance) to be used for data analysis.
Generating graphs and analyzing datasets in R Commander to be covered in subsequent sessions.
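To make the idea behind ANOVA concrete, the sketch below computes a one-way ANOVA F statistic by hand in Python. It is only an illustration of the calculation, not the R Commander workflow used in the course. The function name `one_way_anova_f` and the plant heights are made up for this example; they are not data from the experiment.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of numbers."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual values around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Made-up plant heights (cm), for illustration only.
heights = {
    "good soil": [30.0, 32.0, 31.0],
    "poor soil": [20.0, 22.0, 21.0],
    "poor soil + lime": [27.0, 29.0, 28.0],
}
f_stat, df_b, df_w = one_way_anova_f(list(heights.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F means the differences between treatment means are large relative to the variation within each treatment. Turning F into a p-value requires the F distribution, which this sketch does not compute.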
Understanding Central Tendency:
Definitions:
Mean: Average calculated by summing data points and dividing by the number of observations.
Mode: Most frequently occurring data value.
Median: Middle value when data is arranged in ascending order (for an even number of observations, the average of the two middle values).
Considerations for choosing mean vs. median based on data distribution shape and presence of outliers.
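The three measures defined above can be computed with Python's standard `statistics` module. The heights below are made-up values for illustration, not measurements from the experiment.

```python
import statistics

# Made-up plant heights (cm), for illustration only.
heights_cm = [12, 14, 14, 15, 16, 18, 19]

mean = statistics.mean(heights_cm)      # sum of values / number of values
median = statistics.median(heights_cm)  # middle value of the sorted data
mode = statistics.mode(heights_cm)      # most frequently occurring value

print(f"mean = {mean:.2f}, median = {median}, mode = {mode}")
```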
Variation in Data:
Importance of understanding variation, both within individual pots in the experiment and between pots in a dataset.
Frequency histograms as a method to visualize how often each value occurs within a dataset.
Grouping of data points (binning) for larger datasets to simplify representation and analysis.
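Binning and a frequency histogram can be sketched in plain Python as below. The bin width of 5, the function name `bin_counts`, and the heights are assumptions chosen for illustration; the course itself produces these graphs in R Commander.

```python
from collections import Counter

def bin_counts(values, bin_width):
    """Count how many values fall in each bin [start, start + bin_width)."""
    # Floor division maps each value to the lower edge of its bin.
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))

# Made-up plant heights (cm), for illustration only.
heights_cm = [12, 14, 14, 15, 16, 18, 21, 23, 24, 29]
bin_width = 5
hist = bin_counts(heights_cm, bin_width)

# Text histogram: one '#' per observation in the bin.
for start, count in hist.items():
    print(f"[{start}, {start + bin_width}) {'#' * count}")
```

A wider bin gives a smoother but coarser picture; a narrower bin shows more detail but can leave many bins nearly empty.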
Effect of Outliers:
Discussion of how skewness in a data distribution affects the mean more strongly than the median.
Left-skewed data has a long tail of low values (outliers on the left), which pulls the mean below the median.
Right-skewed data has a long tail of high values (outliers on the right), which pulls the mean above the median.
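The effect of skew on the mean and median can be checked with a short Python sketch. Both datasets are made up for illustration; each has a single extreme value forming the tail.

```python
import statistics

# Made-up values, for illustration only.
right_skewed = [2, 3, 3, 4, 4, 5, 20]      # long tail of high values
left_skewed = [1, 16, 17, 17, 18, 18, 19]  # long tail of low values

for name, data in [("right-skewed", right_skewed), ("left-skewed", left_skewed)]:
    mean = statistics.mean(data)
    median = statistics.median(data)
    # The mean moves toward the tail; the median stays near the bulk of the data.
    relation = "above" if mean > median else "below"
    print(f"{name}: mean = {mean:.2f}, median = {median}, mean is {relation} the median")
```

Because the median depends only on the order of the values, it is the more robust summary when outliers are present.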
Final Notes and Recommendations:
Importance of summarizing data accurately and applying statistical analysis techniques effectively.
Instructions provided on which sections of the report can be written now, based on the work completed so far in the course.
Reminder for students to prepare for quizzes based on the content covered.
Emphasis on comprehension of experimental design and application of statistical tools as critical for future assessment tasks.