Unit 07: Analyzing Data and Visualizations

27 Terms

1

Data Analysis Process

A systematic approach to analyzing data in four steps: collect or choose the data, clean and filter it, visualize it to find patterns, and generate new information.

2

Step 1: Collect or Choose Data

Gather the data needed for analysis.

3

Step 2: Clean/Filter

Remove errors and inconsistencies from the data, and keep only the data relevant to the analysis.

4

Step 3: Visualize and Find Patterns

Use graphs and charts to look for patterns in the data.

5

Step 4: Generate New Information

Draw conclusions or produce new results from the patterns observed.
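The four steps can be sketched end to end in a few lines. This is only an illustration: the temperature readings, the plausible-range limits, and the helper names `clean` and `text_chart` are all made up for the example.

```python
from statistics import mean

# Step 1: collect or choose data (made-up readings; None marks a missing value)
raw_temps_c = [21.5, 22.0, None, 19.5, 480.0, 20.0]

def clean(readings, low=-50.0, high=60.0):
    """Step 2: drop missing values and readings outside a plausible range."""
    return [r for r in readings if r is not None and low <= r <= high]

def text_chart(readings):
    """Step 3: a rough text 'chart' with one '#' per whole degree."""
    return [f"{r:5.1f} " + "#" * int(r) for r in readings]

cleaned = clean(raw_temps_c)
for line in text_chart(cleaned):
    print(line)

# Step 4: generate new information from the cleaned data
average_temp = mean(cleaned)
print(f"Average temperature: {average_temp:.2f} C")
```

The missing value and the 480.0 reading are removed in step 2, so they do not distort the average computed in step 4.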

6

Data

Information collected for analysis.

7

Metadata

Data about data, such as when and where the data was collected, what type of data it is, how it was collected, and who collected it.
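One way to picture metadata is as a record stored alongside the data itself. The sketch below uses a plain dictionary; every value and key name is hypothetical and chosen only to mirror the five kinds of metadata listed on the card.

```python
# A dataset bundled with its metadata (all values are made up for illustration)
dataset = {
    "data": [21.5, 22.0, 19.5],
    "metadata": {
        "collected_at": "2024-03-01T09:00:00",           # time of collection
        "data_type": "temperature in degrees Celsius",   # type of data
        "location": "school weather station",            # where it was collected
        "method": "digital thermometer",                 # how it was collected
        "collector": "Period 3 class",                   # who collected it
    },
}

# The metadata describes the readings without being one of them
for key, value in dataset["metadata"].items():
    print(f"{key}: {value}")
```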

8

Bar Charts

Visualizations, drawn vertically or horizontally, that show how often each value occurs; taller or longer bars indicate more frequent values.

9

Insights from Bar Charts

Identify the most and least common values, the range of values, and whether a particular value is present.
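The counting behind a bar chart can be sketched with Python's `collections.Counter`. The list of pets is invented sample data, and the text bars stand in for a real chart.

```python
from collections import Counter

# Made-up survey answers: one entry per person
pets = ["dog", "cat", "dog", "fish", "cat", "dog"]

# Count how often each value occurs (the height of each bar)
counts = Counter(pets)

# Horizontal text bars: a longer bar means a more frequent value
for value, count in counts.most_common():
    print(f"{value:<5} {'#' * count} ({count})")

# Insights a reader would take from the chart
most_common_value = counts.most_common(1)[0][0]
least_common_value = min(counts, key=counts.get)
has_bird = "bird" in counts  # is a particular value present at all?
```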

10

Pie Charts

Visualizations that show what percentage of a dataset each unique value accounts for.

11

Insights from Pie Charts

Identify the highest and lowest percentages and compare the shares of different values.
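Each slice of a pie chart is a value's count divided by the total. A minimal sketch of that arithmetic follows; the function name `percentages` and the vote data are assumptions made for the example.

```python
from collections import Counter

def percentages(values):
    """Return the share of the dataset (in percent) held by each unique value."""
    counts = Counter(values)
    total = len(values)
    return {value: 100 * count / total for value, count in counts.items()}

# Made-up poll responses
votes = ["yes", "no", "yes", "yes", "abstain", "no", "yes", "yes"]
shares = percentages(votes)

for value, share in shares.items():
    print(f"{value}: {share:.1f}%")

# The slices of a pie chart always add up to the whole
total_share = sum(shares.values())
```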

12

Histograms

Visualizations that display the frequency of values within ranges (bins); they are read similarly to bar charts.

13

Insights from Histograms

Identify most and least common ranges.
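The difference from a bar chart is that values are first grouped into ranges. A small sketch of that grouping, assuming whole-number scores and a hypothetical helper `bin_counts` with a fixed bin width:

```python
from collections import Counter

def bin_counts(values, bin_width):
    """Count how many values fall into each range of size bin_width."""
    # Integer division maps each value to the start of its range, e.g. 74 -> 70
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))

# Made-up test scores
scores = [55, 62, 68, 71, 74, 78, 79, 83, 88, 95]
histogram = bin_counts(scores, 10)

for start, count in histogram.items():
    print(f"{start}-{start + 10 - 1}: {'#' * count}")

# The most common range is the one with the tallest bar
most_common_range_start = max(histogram, key=histogram.get)
```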

14

Scatterplots

Visualizations that compare two data columns to find a relationship, which can be direct (both increase together), inverse (one increases as the other decreases), or absent.

15

Insights from Scatterplots

Identify relationships and trends; make predictions.
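One common way to turn a scatterplot trend into a prediction is to fit a straight line through the points. The card does not name a method, so the least-squares fit below is an assumption, and the data points and the `fit_line` helper are made up for illustration.

```python
def fit_line(xs, ys):
    """Least-squares line through the points; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up data: weeks of practice vs. laps completed
weeks = [1, 2, 3, 4]
laps = [3, 5, 7, 9]

slope, intercept = fit_line(weeks, laps)

# A positive slope suggests a direct relationship; use the line to predict week 5
predicted_week_5 = slope * 5 + intercept
print(f"Predicted laps in week 5: {predicted_week_5:.1f}")
```

A prediction like this only extends the observed trend; it does not show that practice causes the improvement.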

16

Correlation

A relationship or apparent pattern between two sets of data, in which their values tend to change together.
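Correlation is often measured with a single number. The sketch below uses the Pearson correlation coefficient, which the card does not mention and is brought in here only as one standard measure; the study data is invented.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation: near +1 is direct, near -1 is inverse, near 0 is none."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)

# Made-up data for five students
hours_studied = [1, 2, 3, 4, 5]
test_scores = [52, 58, 65, 70, 78]   # rises with hours: direct relationship
errors_made = [9, 7, 6, 4, 2]        # falls with hours: inverse relationship

print(f"hours vs scores: {pearson(hours_studied, test_scores):.2f}")
print(f"hours vs errors: {pearson(hours_studied, errors_made):.2f}")
```

A strong coefficient describes how the two columns move together; on its own it says nothing about why.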

17

Causation

A relationship in which one event directly causes another.

18

Important Note on Correlation and Causation

Correlation does not equal causation: two things can change together without one causing the other.

19

Big Data

Very large datasets, often gathered through data mining and web scraping, used to address problems such as business efficiency and disease identification.

20

Open Data

Freely available data with minimal restrictions, sourced from open data repositories.

21

Crowdsourced Data

Data contributed by many members of the public and used to inform decision-making.

22

Machine Learning

A field in which algorithms analyze data and adapt their behavior based on it; used in everyday applications and in AI.

23

Limitations and Bias in Machine Learning

Machine learning algorithms can reproduce human biases when the data they learn from is not diverse.

24

Example of Bias

Twitter's cropping algorithm favored certain demographics due to biased training data.

25

Ways to Mitigate Bias

Diversify training data by including underrepresented groups.
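A first step toward diversifying training data is checking how well each group is represented. The sketch below is one simple way to do that; the group labels, the 20% threshold, and the function name `underrepresented` are all assumptions made for illustration.

```python
from collections import Counter

def underrepresented(group_labels, min_share=0.2):
    """Return the groups whose share of the training data is below min_share."""
    counts = Counter(group_labels)
    total = len(group_labels)
    return sorted(group for group, count in counts.items()
                  if count / total < min_share)

# Made-up group labels for ten training examples
training_groups = ["A"] * 8 + ["B"] * 1 + ["C"] * 1

# Groups flagged here are candidates for collecting more examples
flagged = underrepresented(training_groups)
print("Underrepresented groups:", flagged)
```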

26

Simulation

A model of a real-world situation or event, useful for testing hypotheses when real experimentation is impractical or risky.

27

Usage of Simulations

Simulations abstract away the details of complex processes and provide insights that would be difficult to obtain in real life.
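A small simulation shows the idea: rather than rolling real dice thousands of times, a program can model the rolls and test the hypothesis that two dice sum to 7 about one time in six. The function name, trial count, and fixed seed below are choices made for this sketch.

```python
import random

def estimate_probability_of_seven(trials, seed=0):
    """Simulate rolling two dice `trials` times; return the fraction that sum to 7."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(
        1 for _ in range(trials)
        if rng.randint(1, 6) + rng.randint(1, 6) == 7
    )
    return hits / trials

estimate = estimate_probability_of_seven(100_000)
print(f"Simulated: {estimate:.3f}  Expected: {1 / 6:.3f}")
```

The model leaves out details such as how the dice are thrown and keeps only what matters for the question, which is the abstraction the card describes.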