Comprehensive Notes – Intro to Data Analytics & Excel Workshop
Predictability & Accountability of Data-Driven Decisions
- Data analytics eliminates “guesswork/bluffs” by grounding every forecast in historical data.
- Weather, stock prices, demand forecasting given as examples.
- Predictability increases organisational accountability:
- Decisions can be defended because they are traceable to data.
- Models are continuously improved by feeding new data, so recommendations evolve.
Four Types of Analytics
- Descriptive – “What happened?” (keyword: summarise / describe)
- Totals, averages, counts.
- Tools: Excel, Tableau, basic SQL.
- Diagnostic – “Why/how did it happen?” (keyword: why/how)
- Root-cause, drill-down, correlation analyses.
- Tools: SQL, Python, R.
- Predictive – “What might happen?” (keyword: forecast)
- Regression, time-series, ML models.
- Tools: SAS, Python, R.
- Prescriptive – “What should we do?” (keyword: recommend/ prescribe)
- Optimisation, simulation, decision trees.
- Tools overlap with predictive plus optimisation packages.
Interactive Classification Exercise (Quiz Highlights)
- Recommending a price based on demand → Prescriptive (giving a recommendation).
- Forecasting weather next week → Predictive (future-oriented).
- Identifying why sales fell in a month → Diagnostic (answers why).
- Summarising total store sales → Descriptive (reporting what happened).
- Estimating tomorrow’s website traffic → Predictive (future estimate).
- Emphasised usage of keywords to classify analytics quickly.
Data Visualization Principles
- Visuals (charts/graphs/maps/dashboards) turn raw data into patterns & stories.
- Pre-attentive attributes (colour, size, orientation, flicker) processed in ≈250ms enabling instant comprehension.
- Chart selection guidance:
- Categorical → bar/pie.
- Continuous trend → line/area.
- Distribution → histogram/box plot.
- Message clarity > “fanciness”. Wrong chart defeats purpose.
- Video (Kurzgesagt/ColumnFive) slogan: “Your message is only as good as your ability to share it.”
Microsoft Excel Primer (UI Tour)
- Menu bar → Ribbon (dynamic) → Formula bar → Worksheet grid (rows numbers, columns letters).
- Workbook = collection of worksheets.
- Cell reference = column + row (e.g., C6).
- Resize columns: double-click border for auto-fit; prevents ##### display for numbers.
- Dates stored as serial numbers (days since 1-Jan-1900).
- Format via Format Cells → Category: Date (e.g., dd mm yyyy).
- Currency formatting (₹ chosen for Indian data).
- Keep raw numeric format during analysis; apply currency/percent at presentation stage to avoid clutter.
Removing Duplicates (Data → Remove Duplicates)
- Decide key fields (e.g., Name+Department+Region) before deletion.
- Using only Name risks false deletion of distinct people.
- Steps:
- Select range.
- Data ▶ Remove Duplicates.
- Tick columns that form unique-key.
- Review count of removed vs remaining records.
Essential Shortcuts
- Select to edge: Ctrl + Shift + Arrow
- Select full table: Ctrl + A inside block.
- Auto-fill: drag tiny square or double-click to fill downward.
Sales & Statistical Analyses in Excel
- Total sales per employee:
Total=Q1+Q2+Q3+Q4 (or SUM(G2:J2)). - Convert formulas to constants: Copy ▶ Paste Special → Values.
- Average sales per quarter:
xˉQ1=n∑Q1 via AVERAGE(). - Highest/Lowest totals: MAX() and MIN() functions.
Lookup & Reference Functions
- VLOOKUP(lookupvalue, tablearray, col_index, FALSE)
- Example to fetch Performance Rating:
VLOOKUP(A2,$C$1:$T$1012,12,0)
- Absolute reference (\$ signs) locks table range during auto-fill.
- Flash Fill recognises patterns: type EMP0001, EMP0002… auto-completes IDs.
Pivot Tables & Sorting
- Insert ▶ PivotTable (new or existing sheet).
- Drag Department → Rows, Total Sales → Values gives departmental totals.
- Sort within pivot: Right-click field → More Sort Options → Descending by Sum of Sales.
- To find top performer per department: add Employee ID under Rows, then sort descending inside each group.
Charting & Visualization in Excel
- Create charts from pivot outputs or any 2-column table:
- Bar/Column for loan amount by department.
- Line for quarterly sales trend.
- Pie/Donut for sales proportion.
- Customise via Chart Elements (➕), Styles (🎨), and Format pane.
Conditional Logic with IF
- New column “Feedback”:
IF(Rating≥4,"Excellent","NeedsImprovement") - Populates row-wise, then paste as values.
Bonus & Salary Adjustments
- Updated bonus % rule:
New Bonus%=5Old Bonus%×Rating - Bonus Amount:
Bonus=100Salary×Bonus% - Updated Salary:
Salary<em>new=Salary</em>old+Bonus
Descriptive Statistics in Excel
- Mean (average): AVERAGE(range)
- Mode (most frequent): MODE.SNGL(range)
- Median (middle value): MEDIAN(range)
- Standard Deviation: STDEV.S(range)
- Low σ ⇒ tightly clustered, few outliers; High σ ⇒ high variability.
Ethical, Philosophical & Practical Implications
- Using data responsibly builds trust; poor visual choice or careless duplicate removal can mislead stakeholders.
- Continuous data refresh emphasises an iterative mindset: analytics is not one-off.
- Medical analogy (diagnostic) reminds that data analysts, like doctors, must trace symptoms (metrics) back to causes before prescribing action.
Real-World Connections & Next Steps
- Skills covered translate directly to internships/entry-level analyst roles: reporting, dashboarding, KPI tracking.
- Assignment portal demo: upload deliverables under “Introduction to Data” before specified deadline.
- Attendance tracking via shared Google Sheet underscores accountability even in virtual learning.