Focus mostly on Week 7 & 8
Week 7:
Data is stored in spreadsheets.
SQL (Structured Query Language) - language used to interact with the database.
Database benefits: data integrity is ensured, and data can be accessed quickly and shared easily.
SQL Script: a SQL script is a set of SQL commands saved as a file in SQL Scripts.
CREATE TABLE: Creates a new table in a database
DROP TABLE: Removes a table from a database
SELECT: Allows you to read data and display it. This is called a query.
LIMIT: Caps the number of rows returned. Always the last part of a query
ORDER BY: Allows us to sort our results using the data. (Comes after SELECT and FROM statements and before LIMIT).
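DESC: Comes after the ORDER BY column and sorts in descending order (largest first)
WHERE: Filters rows using comparison symbols (=, <, >, <=, >=). Goes after FROM but before ORDER BY
LIKE: Matches a text pattern, e.g. LIKE '%me%' pulls rows with 'me' anywhere in the text.
NOT: Used with LIKE (NOT LIKE) to exclude rows that match the pattern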
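A minimal sketch of how the commands above fit together in one query, run through Python's built-in sqlite3 module. The table name, column names, and rows are made up for illustration. Note that SQLite's LIKE ignores case for ASCII letters, which other databases may not do.

```python
import sqlite3

# In-memory database so the sketch leaves no file behind.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# CREATE TABLE: hypothetical table and columns, for illustration only.
cur.execute("CREATE TABLE customers (name TEXT, city TEXT, spend REAL)")
cur.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        ("Amelia", "Toronto", 120.0),
        ("James", "Ottawa", 300.0),
        ("Carmen", "Toronto", 80.0),
        ("Omar", "Halifax", 210.0),
    ],
)

# Clause order: SELECT, FROM, WHERE, ORDER BY ... DESC, LIMIT.
query = """
    SELECT name, spend
    FROM customers
    WHERE spend > 100 AND name LIKE '%me%'
    ORDER BY spend DESC
    LIMIT 2
"""
top_rows = cur.execute(query).fetchall()

# NOT LIKE keeps only the rows that do not match the pattern.
not_me_rows = cur.execute(
    "SELECT name FROM customers WHERE name NOT LIKE '%me%' ORDER BY name"
).fetchall()

# DROP TABLE removes the table again.
cur.execute("DROP TABLE customers")
conn.close()

print(top_rows)
print(not_me_rows)
```

Moving LIMIT before ORDER BY, or WHERE after ORDER BY, makes the query fail with a syntax error, which is why the clause order is worth memorizing.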
Week 9:
Visualization: lets you see through the complexity of the data
Data interpreter: Data files are not always ready and might require extensive prep work.
NULLs: NULL means that some cells in your data are empty (select the Null field, Ctrl-click, and select Exclude)
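The same "exclude the nulls" idea can be sketched in code. This is only an illustration, not how the visualization tool does it internally: the rows and field names are hypothetical, and Python's None stands in for NULL.

```python
# Hypothetical rows; None plays the role of NULL (an empty cell).
rows = [
    {"region": "East", "sales": 100},
    {"region": "West", "sales": None},
    {"region": None, "sales": 250},
    {"region": "North", "sales": 75},
]


def exclude_nulls(records, field):
    """Return only the records whose `field` is not None."""
    return [r for r in records if r[field] is not None]


clean_sales = exclude_nulls(rows, "sales")
print(len(clean_sales))
```

Excluding on "sales" keeps the row with a missing region, so the field you filter on matters.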
Week 10:
Geocoding:
Process of converting textual addresses into geographic coordinates (latitude and longitude).
Geocoding is important for location intelligence, personalization, efficiency, and research & policy.
Method & Tools of Geocoding:
Geocoding services are specialized platforms that take textual addresses and convert them into geographic coordinates (essential for mapping and spatial analysis) ... access and geocoding requests.
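To make the address-to-coordinates idea concrete, here is a toy sketch. It is not a real geocoding service: the lookup table (GAZETTEER) and the geocode function are hypothetical, and the coordinates are rough city-centre values. A real service would also handle street-level addresses, misspellings, and ambiguity.

```python
# Toy gazetteer: approximate city-centre coordinates, for illustration only.
GAZETTEER = {
    "toronto, on": (43.65, -79.38),
    "ottawa, on": (45.42, -75.70),
}


def geocode(address):
    """Return (latitude, longitude) for a known address, else None."""
    # Normalize case and whitespace so small typing differences still match.
    key = " ".join(address.lower().split())
    return GAZETTEER.get(key)


print(geocode("  Toronto,  ON "))
print(geocode("Nowhere"))
```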
Geographic hierarchies: a structured system of levels or layers (e.g. country, province/state, city)
Week 11:
The fundamental role of temporal aspects in datasets is to capture and represent the timing or sequencing of events, observations, or measurements over time.
Conditional Formatting for Time Analysis
Conditional formatting enhances data visualization by applying formatting rules based on data values.
The DATEDIF function finds the amount of time between two dates in Excel.
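For comparison, a rough Python sketch of the same calculation. The datedif function below is a hypothetical helper, not Excel's implementation: it only covers the "D", "M", and "Y" units and may differ from Excel on end-of-month edge cases.

```python
from datetime import date


def datedif(start, end, unit):
    """Whole units between two dates; unit is "D", "M", or "Y"."""
    if end < start:
        raise ValueError("end must not be before start")
    if unit == "D":
        return (end - start).days
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1  # the last month is not yet complete
    if unit == "M":
        return months
    if unit == "Y":
        return months // 12
    raise ValueError("unit must be 'D', 'M', or 'Y'")


start, end = date(2023, 1, 15), date(2024, 3, 10)
print(datedif(start, end, "D"), datedif(start, end, "M"), datedif(start, end, "Y"))
```

Between 15 Jan 2023 and 10 Mar 2024 this gives 420 days, 13 complete months, and 1 complete year.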
Week 12:
Survival Analysis:
Can be used to model the expected time-to-event.
Is a powerful set of analytic techniques for understanding customers.
Estimates how long it takes for a particular event to happen.
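One common way to estimate time-to-event is the Kaplan-Meier estimator. The notes do not name a specific technique, so treat this as one possible sketch. The customer data is made up: durations are months until a customer churned or until observation stopped, and a False flag means the customer had not churned yet (censored).

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each observed event time.

    durations: time until the event or until observation stopped.
    observed:  True if the event happened, False if censored.
    """
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        # Events at time t, and everyone (event or censored) leaving at t.
        events = sum(1 for d, o in zip(durations, observed) if d == t and o)
        leaving = sum(1 for d in durations if d == t)
        if events:
            survival *= 1 - events / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical customers: months observed, and whether they churned.
months = [2, 3, 3, 5, 8]
churned = [True, True, False, True, False]
curve = kaplan_meier(months, churned)
print(curve)
```

Here the estimated chance of a customer still being active is about 0.8 after month 2, 0.6 after month 3, and 0.3 after month 5.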
RFM (Recency, Frequency, Monetary): a powerful method to segment customers based on their transaction behaviour.
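A minimal sketch of computing the three RFM values per customer. The transactions, customer names, and the rfm helper are hypothetical. Real RFM work usually goes one step further and turns these raw values into scores (e.g. 1 to 5) before segmenting.

```python
from datetime import date

# Hypothetical transactions: (customer, purchase date, amount).
transactions = [
    ("ann", date(2024, 1, 5), 50.0),
    ("ann", date(2024, 3, 1), 30.0),
    ("bob", date(2023, 11, 20), 200.0),
]


def rfm(records, today):
    """Return {customer: (recency_days, frequency, monetary)}."""
    summary = {}
    for customer, day, amount in records:
        last, count, total = summary.get(customer, (day, 0, 0.0))
        summary[customer] = (max(last, day), count + 1, total + amount)
    return {
        c: ((today - last).days, count, total)
        for c, (last, count, total) in summary.items()
    }


scores = rfm(transactions, today=date(2024, 3, 31))
print(scores)
```

Ann bought recently and often but spent little; Bob spent more but has not bought for over four months, so the two would land in different segments.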
Understanding Customer Behaviour
Behavioural Segmentation
Sentiment Analysis
Channel Preference
20% will be from Week 1 to 6
Week 1:
Big Data: refers to large datasets
7 V’s of Big Data
- Volume, variety, veracity, velocity, variability, visualization and value.
Variety: increasing diversity of data generation
Variability: rapid change
Velocity: high speed
Veracity: data reliability and accuracy
Visualization: visual representation of data
Value: useful insight gained from data analysis
Data science: extracting information and knowledge from data
Business Intelligence: systems and applications for collecting and analyzing data
Business Analytics: techniques, technologies that are used to analyze business data
Data analysis: process of inspecting data
Big data analysis: advanced analytic techniques
Categories of Data Analytics
Descriptive, Diagnostic, Predictive, Prescriptive
Week 2:
Pattern recognition:
A pattern is a design or a model
Patterns can be
Temporal
Spatial
Functional
Data Processing Chain
Data, Database, Data Warehouse, Data Mining, Data Visualization
Data Mining Techniques
Supervised Learning: works with labelled data
Unsupervised Learning: works with unlabelled data
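A small sketch of the difference, using two deliberately simple techniques chosen for illustration (they are not named in the notes): nearest-neighbour prediction for the supervised case and a one-dimensional two-group k-means for the unsupervised case. All data values and labels are made up.

```python
def nearest_neighbour(labelled, x):
    """Supervised: predict the label of the closest labelled example."""
    value, label = min(labelled, key=lambda pair: abs(pair[0] - x))
    return label


def two_means(values, rounds=10):
    """Unsupervised: split unlabelled numbers into two groups (1-D k-means).

    Assumes `values` contains at least two distinct numbers.
    """
    low, high = min(values), max(values)
    for _ in range(rounds):
        a = [v for v in values if abs(v - low) <= abs(v - high)]
        b = [v for v in values if abs(v - low) > abs(v - high)]
        low, high = sum(a) / len(a), sum(b) / len(b)
    return a, b


# Supervised: every training value comes with a label.
labelled = [(1.0, "small"), (2.0, "small"), (9.0, "large")]
prediction = nearest_neighbour(labelled, 8.0)

# Unsupervised: no labels, the groups are discovered from the data.
groups = two_means([1, 2, 3, 10, 11, 12])
print(prediction, groups)
```

The supervised function needs labels to learn from, while the unsupervised one finds the two groups with no labels at all.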
What is a query
A question, especially one addressed to an official or organization.
API: Application Programming Interface