Data Analytics and Visualization - Lecture Notes

Description and Tags

A set of vocabulary-style flashcards covering key concepts from the lecture notes on data, analytics, quality, databases, time series, visualization, and SQL.

67 Terms

1

Data

Raw facts and figures that serve as the input to processing and, once processed, support decision making.

2

Information

Processed data that informs decisions and actions.

3

Big Data

Extremely large and diverse data sets that require special methods to analyze.

4

6 Vs of Big Data

The six characteristics: Volume, Variety, Velocity, Veracity, Value, and Variability.

5

Volume

The amount of data available.

6

Variety

Different data types, including structured, semi-structured and unstructured data.

7

Velocity

The speed at which data is generated and processed.

8

Veracity

The degree of trustworthiness or reliability of data.

9

Value

Business value derived from data.

10

Variability

The different ways data can be used, and how its interpretation can vary.

11

Data analytics

The science of analyzing raw data and turning it into useful information.

12

Data analysis

A subset of data analytics focused on evaluating data to gain insights.

13

Descriptive analytics

Analyses that describe what happened in the data.

14

Diagnostic analytics

Analyses that explain why something happened.

15

Predictive analytics

Analyses that forecast what will happen in the future.

16

Prescriptive analytics

Analyses that recommend actions to bring about a desired outcome.

17

Data culture

An organizational approach where decisions are data-informed, data-driven, and data-inspired.

18

Data-informed

Decisions guided by data insights.

19

Data-driven

Decisions primarily guided by data evidence.

20

Data-inspired

Decisions influenced by data but not solely determined by it.

21

Cognitive bias

Systematic errors in thinking; using data helps reduce or counteract them.

22

Competitive advantage

Gaining an edge over competitors through effective use of data.

23

Data literacy gap

A shortfall in the ability to understand and use data; the lecture notes state that 96% of executives discount data they don't understand.

24

Problem Statement

Framing a business problem in terms of data goals, in five steps: understand the situation, define goals, translate them into data goals, frame the problem, and build metrics.

25

Data Wrangling

Cleaning, structuring, and enriching data for analysis.

26

Data quality

How good data is for use, including dimensions like accuracy, completeness, validity, consistency, integrity, and uniqueness.

27

Uniqueness

No duplicates; each value is distinct where required.

28

Accuracy

Data correctly reflects the real world or source.

29

Completeness

All required data is present.

30

Validity

Data conforms to defined formats and business rules.

31

Consistency

Data remains uniform across datasets and systems.

32

Integrity

Data relationships and structure are maintained accurately.
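
Several of the data quality dimensions can be checked mechanically. The sketch below tests uniqueness, completeness, and validity on a few invented customer records; the field names and the deliberately simple email rule are assumptions made for illustration, not rules taken from the lecture notes.

```python
import re

# Hypothetical customer records, invented for illustration.
records = [
    {"id": 1, "email": "ana@example.com", "age": 34},
    {"id": 2, "email": "not-an-email", "age": 29},
    {"id": 2, "email": "bo@example.com", "age": None},
]

# Uniqueness: ids that appear more than once.
ids = [r["id"] for r in records]
duplicate_ids = sorted({i for i in ids if ids.count(i) > 1})

# Completeness: ids of records with a missing required field.
incomplete = [r["id"] for r in records if any(v is None for v in r.values())]

# Validity: emails that fail a deliberately simple format rule.
EMAIL_RULE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
invalid_emails = [r["email"] for r in records if not EMAIL_RULE.match(r["email"])]

print(duplicate_ids, incomplete, invalid_emails)
```

Accuracy, consistency, and integrity usually need a second source to compare against (the real world, another system, or a related table), so they are not covered by this sketch.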

33

Data discovery

Understanding what the data says and implies.

34

Data structuring

Transforming data into a standard tabular format with variables (columns) and observations (rows).

35

Dataset

A collection of values organized for analysis.

36

Variable

A column representing a measured attribute.

37

Observation

A row representing measurements on a unit across attributes.
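
To make data structuring, variables, and observations concrete, the sketch below flattens a nested input into a table with one column per variable and one row per observation. The city temperature values and the column names are invented for the example.

```python
# Hypothetical nested input: one entry per city, with readings keyed by year.
raw = {
    "Oslo": {2022: 5.1, 2023: 5.6},
    "Lima": {2022: 19.2, 2023: 19.8},
}

# Variables (columns) of the structured dataset.
columns = ("city", "year", "avg_temp_c")

# Observations (rows): one tuple per city-year measurement.
rows = [
    (city, year, temp)
    for city, by_year in raw.items()
    for year, temp in by_year.items()
]

for row in rows:
    print(dict(zip(columns, row)))
```

Each tuple in `rows` has exactly one value per variable, which is what makes the result a dataset in the tabular sense used here.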

38

Relational database

A database organized into related tables with predefined relationships.

39

Database schema

Blueprint defining how data is organized and related in a relational database.
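
A small example may help tie the relational database and schema cards together. The sketch below uses Python's built-in `sqlite3` module to define two related tables, join them, and show the database rejecting a row that breaks the predefined relationship. The `customers` and `orders` tables are hypothetical, and SQLite is just one convenient engine for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# The schema: two tables and the relationship between them.
conn.executescript("""
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
    total       REAL NOT NULL
);
""")
conn.execute("INSERT INTO customers VALUES (1, 'Ana')")
conn.execute("INSERT INTO orders VALUES (10, 1, 25.5)")

# Related tables are combined through the shared key.
rows = conn.execute("""
    SELECT c.name, o.total
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
""").fetchall()

# An order for a customer that does not exist violates the relationship.
try:
    conn.execute("INSERT INTO orders VALUES (11, 99, 5.0)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(rows, rejected)
```

The rejected insert is also an instance of the integrity dimension of data quality: the schema keeps the relationship between orders and customers intact.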

40

Time series data

Data points indexed in time order (time-stamped data).

41

Trend

General direction in data over time.

42

Cyclic variations

Recurring rises and falls in data that are not tied to a fixed seasonal period, often spanning more than a year (e.g., business cycles).

43

Seasonal variations

Regular patterns repeating within a fixed period (e.g., yearly).

44

Random movements

Unpredictable, irregular fluctuations in data.
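
The time series components can be separated with a simple additive decomposition. The sketch below estimates the trend with a centered moving average and the seasonal pattern by averaging what is left over for each quarter. The quarterly series is invented, and the additive model (value = trend + seasonal + random) is an assumption of this example rather than something the lecture notes prescribe.

```python
# Hypothetical quarterly series: an upward trend plus a repeating 4-quarter pattern.
series = [12, 8, 10, 14, 16, 12, 14, 18, 20, 16, 18, 22]
period = 4
half = period // 2


def centered_moving_average(values, period):
    """Centered moving average for an even period (a 2 x period average)."""
    half = period // 2
    out = []
    for i in range(half, len(values) - half):
        window = values[i - half : i + half + 1]
        # The two end points get half weight so the window is centered on i.
        total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        out.append(total / period)
    return out


# Trend: defined for every point except the first and last `half` points.
trend = centered_moving_average(series, period)

# Seasonal: average the detrended values separately for each quarter position.
by_quarter = {}
for offset, t in enumerate(trend):
    i = offset + half
    by_quarter.setdefault(i % period, []).append(series[i] - t)
seasonal = {q: sum(v) / len(v) for q, v in sorted(by_quarter.items())}

# Random movements: whatever remains after removing trend and seasonal parts.
residual = [series[o + half] - t - seasonal[(o + half) % period]
            for o, t in enumerate(trend)]

print(trend)
print(seasonal)
print(residual)
```

Because the invented series contains no noise, the residuals come out as zero; with real data they would hold the random movements, and any cyclic variation would remain mixed into the trend estimate.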

45

Excel data types

Common data types in Excel including text, numbers, dates, times, boolean, and errors.

46

Text

Alphanumeric strings stored in cells.

47

Number

Numeric values (integers or decimals).

48

Date

Calendar date values.

49

Time

Time values (clock time).

50

Boolean

TRUE (1) or FALSE (0) values.

51

Error

Spreadsheet error values such as #DIV/0! and #N/A.

52

SQL data types

Common SQL data types such as TEXT, CHAR, VARCHAR, NUMBER, FLOAT, INTEGER, BOOLEAN, DATE, TIME, DATETIME, TIMESTAMP, and INTERVAL, plus the special NULL marker for missing values. Availability and exact names vary by SQL dialect.

53

TEXT

Text data type in SQL for large strings.

54

CHAR

Fixed-length character data type in SQL.

55

VARCHAR

Variable-length character data type in SQL.

56

NUMBER

Numeric data type in some SQL dialects (e.g., Oracle).

57

FLOAT

Floating-point numeric data type.

58

INTEGER

Whole-number numeric data type.

59

BOOLEAN

Data type holding TRUE or FALSE; some dialects store these values as 1 and 0.

60

DATE

Date value without time.

61

TIME

Time value without date.

62

DATETIME

Date and time value.

63

TIMESTAMP

Date and time value with possible timezone awareness.

64

INTERVAL

A duration of time in SQL.

65

NULL

A special value representing missing or unknown data.
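
The SQL data type cards, and NULL in particular, are easier to remember with a short demonstration. The sketch below uses Python's built-in `sqlite3` module with a hypothetical `readings` table. SQLite is an assumption of the example: it accepts declared types such as VARCHAR, FLOAT, and DATE but treats them as loose type affinities, so other database engines enforce types more strictly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE readings (
        sensor   VARCHAR(20),
        value    FLOAT,
        taken_on DATE
    )
""")
# Python's None is stored as SQL NULL.
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [("a", 1.5, "2024-01-01"), ("b", None, "2024-01-02")],
)

# Comparing with = NULL never evaluates to true, so no rows match.
equals_null = conn.execute(
    "SELECT COUNT(*) FROM readings WHERE value = NULL"
).fetchone()[0]

# IS NULL is the test that finds missing values.
is_null = conn.execute(
    "SELECT COUNT(*) FROM readings WHERE value IS NULL"
).fetchone()[0]

# Aggregates such as AVG skip NULLs instead of treating them as zero.
average = conn.execute("SELECT AVG(value) FROM readings").fetchone()[0]

print(equals_null, is_null, average)
```

The difference between the two counts is the practical consequence of NULL meaning "unknown": an unknown value cannot be said to equal anything, including another NULL.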

66

Data Visualization

The representation of data through graphics like charts and infographics to communicate insights.

67

McCandless method

A framework by David McCandless for good data visualization focusing on data, story, goal, and visual form.