During this session, we focused on understanding and analyzing police crime data through Excel. The analysis aims to evaluate trends, attendance variations, and offers insights into crime statistics across different regions and time periods.
Purpose of Districts: There are 12 police districts which facilitate effective law enforcement by ensuring that each area has designated officers responsible for policing.
List of Districts: The districts include Northland, Auckland City, Wairarapa, Counties Manukau, Waikato, Bay of Plenty, Eastern, Central, Wellington, Tasman, Canterbury, and Southland. Understanding these boundaries is critical for analyzing geographic crime patterns.
NSOC Division: The Australia and New Zealand Standard Offense Classification (NSOC) was designed to categorize offenses systematically to simplify data analysis. Each offense is represented by a unique four-digit code, where:
The first two digits indicate the main category (e.g., homicide, acts intending to cause injury, etc.).
The last two digits specify the type of offense further.
This standardization aids in understanding various criminal activities systematically.
Components of the Dataset: The dataset consists of multiple columns, each serving as a variable that holds specific data.
Variables include: police district, year of offense, category of offense, type of legal action, and count of offenses.
For example, the police district variable encapsulates all 12 districts, indicating the geographical context of each offense.
Importance of Variables: Understanding what each variable represents is crucial before delving into data analysis, as it ensures accurate interpretation of the information contained within.
Students are encouraged to familiarize themselves with the dataset before processing the information. This includes:
Checking variable names and understanding their meanings.
Knowing the overall structure of the dataset to ensure data integrity during analysis.
It was advised to always verify the context of the dataset, including the time period covered and the methods used to collect the data, before starting analysis.
Students will explore the data, analyze each variable, and answer specific questions related to categorical and numerical variables. The focus is on the following:
Identify all police districts and ANSOC divisions present in the dataset.
Calculate the mean, median, and standard deviation for different offense categories based on filters applied to the dataset.
We highlighted the importance of saving work regularly, identifying relevant buttons (such as 'Save' or 'Save As'), and ensuring the Excel workbook maintains its original formatting.
Filtering is a vital tool in Excel for isolating relevant data based on defined criteria. Steps included:
Applying filters to reveal specific records (e.g., focusing on Auckland City in 2020 for illicit drug offenses).
Copying filtered data as needed while ensuring no hidden cells are included, promoting accurate analysis outcomes.
Finding Statistical Measures: The following functions were emphasized:
Mean: To calculate the mean of a filtered dataset, use the SUBTOTAL
function to ignore hidden rows: =SUBTOTAL(1, range)
.
Standard Deviation: Similarly, to obtain standard deviation, the command =SUBTOTAL(7, range)
was advised.
Median: Utilize the AGGREGATE
function, specifying to ignore hidden rows which helps ensure accurate calculations: =AGGREGATE(12, 5, range)
.
Importance of Values Only
pasting was discussed to avoid replicating formulas unintentionally while pasting calculated results.
Students were encouraged to engage actively with the data set and utilize the provided tools to complete their workshop tasks. They were reminded to submit their work before the deadline, emphasizing the value of clarifying uncertainties during the session for a better understanding of data analytics.