Comprehensive Notes – Intro to Data Analytics & Excel Workshop

Predictability & Accountability of Data-Driven Decisions

  • Data analytics eliminates “guesswork/bluffs” by grounding every forecast in historical data.
    • Weather, stock prices, demand forecasting given as examples.
  • Predictability increases organisational accountability:
    • Decisions can be defended because they are traceable to data.
    • Models are continuously improved by feeding new data, so recommendations evolve.

Four Types of Analytics

  • Descriptive – “What happened?” (keyword: summarise / describe)
    • Totals, averages, counts.
    • Tools: Excel, Tableau, basic SQL.
  • Diagnostic – “Why/how did it happen?” (keyword: why/how)
    • Root-cause, drill-down, correlation analyses.
    • Tools: SQL, Python, R.
  • Predictive – “What might happen?” (keyword: forecast)
    • Regression, time-series, ML models.
    • Tools: SAS, Python, R.
  • Prescriptive – “What should we do?” (keyword: recommend/ prescribe)
    • Optimisation, simulation, decision trees.
    • Tools overlap with predictive plus optimisation packages.

Interactive Classification Exercise (Quiz Highlights)

  • Recommending a price based on demand → Prescriptive (giving a recommendation).
  • Forecasting weather next week → Predictive (future-oriented).
  • Identifying why sales fell in a month → Diagnostic (answers why).
  • Summarising total store sales → Descriptive (reporting what happened).
  • Estimating tomorrow’s website traffic → Predictive (future estimate).
    • Emphasised usage of keywords to classify analytics quickly.

Data Visualization Principles

  • Visuals (charts/graphs/maps/dashboards) turn raw data into patterns & stories.
  • Pre-attentive attributes (colour, size, orientation, flicker) processed in 250ms\approx 250\,\text{ms} enabling instant comprehension.
  • Chart selection guidance:
    • Categorical → bar/pie.
    • Continuous trend → line/area.
    • Distribution → histogram/box plot.
  • Message clarity > “fanciness”. Wrong chart defeats purpose.
  • Video (Kurzgesagt/ColumnFive) slogan: “Your message is only as good as your ability to share it.”

Microsoft Excel Primer (UI Tour)

  • Menu bar → Ribbon (dynamic) → Formula bar → Worksheet grid (rows numbers, columns letters).
  • Workbook = collection of worksheets.
  • Cell reference = column + row (e.g., C6C6).

Data Cleaning & Formatting in Excel

  • Resize columns: double-click border for auto-fit; prevents ##### display for numbers.
  • Dates stored as serial numbers (days since 1-Jan-1900).
    • Format via Format Cells → Category: Date (e.g., dd mm yyyydd\ mm\ yyyy).
  • Currency formatting (₹ chosen for Indian data).
  • Keep raw numeric format during analysis; apply currency/percent at presentation stage to avoid clutter.

Removing Duplicates (Data → Remove Duplicates)

  • Decide key fields (e.g., Name+Department+Region) before deletion.
    • Using only Name risks false deletion of distinct people.
  • Steps:
    1. Select range.
    2. Data ▶ Remove Duplicates.
    3. Tick columns that form unique-key.
    4. Review count of removed vs remaining records.

Essential Shortcuts

  • Select to edge: Ctrl + Shift + Arrow\text{Ctrl + Shift + Arrow}
  • Select full table: Ctrl + A\text{Ctrl + A} inside block.
  • Auto-fill: drag tiny square or double-click to fill downward.

Sales & Statistical Analyses in Excel

  • Total sales per employee:
    Total=Q1+Q2+Q3+Q4\text{Total} = Q1 + Q2 + Q3 + Q4 (or SUM(G2:J2)\text{SUM}(G2:J2)).
  • Convert formulas to constants: Copy ▶ Paste Special → Values.
  • Average sales per quarter:
    xˉQ1=Q1n\bar{x}_{Q1}=\dfrac{\sum\text{Q1}}{n} via AVERAGE()\text{AVERAGE}().
  • Highest/Lowest totals: MAX()\text{MAX}() and MIN()\text{MIN}() functions.

Lookup & Reference Functions

  • VLOOKUP(lookupvalue, tablearray, col_index, FALSE)
    • Example to fetch Performance Rating:
      VLOOKUP(A2,$C$1:$T$1012,12,0)\text{VLOOKUP}(A2,\$C\$1{:}\$T\$1012,12,0)
  • Absolute reference (\$ signs) locks table range during auto-fill.
  • Flash Fill recognises patterns: type EMP0001, EMP0002… auto-completes IDs.

Pivot Tables & Sorting

  • Insert ▶ PivotTable (new or existing sheet).
  • Drag Department → Rows, Total Sales → Values gives departmental totals.
  • Sort within pivot: Right-click field → More Sort Options → Descending by Sum of Sales.
  • To find top performer per department: add Employee ID under Rows, then sort descending inside each group.

Charting & Visualization in Excel

  • Create charts from pivot outputs or any 2-column table:
    • Bar/Column for loan amount by department.
    • Line for quarterly sales trend.
    • Pie/Donut for sales proportion.
  • Customise via Chart Elements (➕), Styles (🎨), and Format pane.

Conditional Logic with IF

  • New column “Feedback”:
    IF(Rating4,"Excellent","NeedsImprovement")\text{IF}(\text{Rating}\ge 4,\,"Excellent",\,"Needs Improvement")
  • Populates row-wise, then paste as values.

Bonus & Salary Adjustments

  • Updated bonus % rule:
    New Bonus%=Old Bonus%×Rating5\text{New Bonus\%}=\frac{\text{Old Bonus\%}\times\text{Rating}}{5}
  • Bonus Amount:
    Bonus=Salary×Bonus%100\text{Bonus}=\frac{\text{Salary}\times \text{Bonus\%}}{100}
  • Updated Salary:
    Salary<em>new=Salary</em>old+Bonus\text{Salary}<em>{\text{new}} = \text{Salary}</em>{\text{old}} + \text{Bonus}

Descriptive Statistics in Excel

  • Mean (average): AVERAGE(range)\text{AVERAGE}(\text{range})
  • Mode (most frequent): MODE.SNGL(range)\text{MODE.SNGL}(\text{range})
  • Median (middle value): MEDIAN(range)\text{MEDIAN}(\text{range})
  • Standard Deviation: STDEV.S(range)\text{STDEV.S}(\text{range})
    • Low σ\sigma ⇒ tightly clustered, few outliers; High σ\sigma ⇒ high variability.

Ethical, Philosophical & Practical Implications

  • Using data responsibly builds trust; poor visual choice or careless duplicate removal can mislead stakeholders.
  • Continuous data refresh emphasises an iterative mindset: analytics is not one-off.
  • Medical analogy (diagnostic) reminds that data analysts, like doctors, must trace symptoms (metrics) back to causes before prescribing action.

Real-World Connections & Next Steps

  • Skills covered translate directly to internships/entry-level analyst roles: reporting, dashboarding, KPI tracking.
  • Assignment portal demo: upload deliverables under “Introduction to Data” before specified deadline.
  • Attendance tracking via shared Google Sheet underscores accountability even in virtual learning.