Data Science

Last updated 4:00 PM on 4/11/26

22 Terms

1

What are the steps of data analysis?

Problem formulation -> Getting the data -> Exploring the data -> Analysing the data -> Communicating the results

2

What is the first step in data analysis?

Problem formulation, where questions and patterns to investigate are identified.

3

What common file format is used for data sets in Python?

.csv (comma-separated values)
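
As a minimal sketch of loading a .csv data set, the example below reads CSV text with pandas. The column names and values are made up for illustration, and the text is held in memory so the example runs on its own; with a real file you would pass its path to `read_csv` instead.

```python
import io

import pandas as pd

# Hypothetical CSV content (assumption: not from the flashcards).
csv_text = """name,age,height_cm
Ana,14,158.0
Ben,15,171.5
Cy,14,165.2
"""

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# Show the first rows to check the data loaded as expected.
print(df.head())
```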

4

What library is commonly used in Python to handle data frames?

pandas, which provides the DataFrame structure.

5

What issues might you find in a data set?

Missing data values or outliers.

6

What is the purpose of pre-processing data?

To find and fix issues such as incorrect or missing data.
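
One possible way to find and fix such issues with pandas is sketched below. The data is invented, and the 1.5 × IQR rule used to flag outliers is an assumption chosen for illustration; the flashcards do not name a specific method.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value and one implausible height.
df = pd.DataFrame({"height_cm": [158.0, 171.5, np.nan, 165.2, 1650.0, 162.0]})

# Count missing values in each column.
missing_counts = df.isna().sum()

# Flag outliers using the 1.5 * IQR rule (an assumed, common convention).
q1 = df["height_cm"].quantile(0.25)
q3 = df["height_cm"].quantile(0.75)
iqr = q3 - q1
is_outlier = (df["height_cm"] < q1 - 1.5 * iqr) | (df["height_cm"] > q3 + 1.5 * iqr)

# Keep only rows that are neither outliers nor missing.
clean = df[~is_outlier].dropna(subset=["height_cm"])

print(missing_counts)
print(clean)
```

Dropping rows is only one option; depending on the question being asked, correcting or filling in values may be more appropriate.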

7

How can you explore data in Python?

Using the describe() method, printing values or calculations from the data, and plotting graphs.
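
A short sketch of this kind of exploration, using an invented data frame: `describe()` gives summary statistics for the numeric columns, and individual calculations can be printed alongside it.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({
    "age": [14, 15, 14, 16],
    "height_cm": [158.0, 171.5, 165.2, 169.3],
})

# Summary statistics (count, mean, std, min, quartiles, max) per numeric column.
summary = df.describe()
print(summary)

# Individual values or calculations can be printed directly.
print("Mean height:", df["height_cm"].mean())
print("Oldest age:", df["age"].max())
```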

8

What is the benefit of changing data types in a data set?

It helps visualization libraries like Seaborn interpret fields correctly.
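
To illustrate, the sketch below converts columns with `astype`. The column names are hypothetical: a year group stored as a number is better treated as a category, and scores read in as text need to be numeric before they can be plotted or averaged.

```python
import pandas as pd

# Hypothetical data: year_group is numeric, score was read in as text.
df = pd.DataFrame({
    "year_group": [7, 8, 7, 9],
    "score": ["12", "15", "9", "14"],
})

# Treat year_group as a label rather than a quantity.
df["year_group"] = df["year_group"].astype("category")

# Convert score from text to integers so calculations work.
df["score"] = df["score"].astype(int)

print(df.dtypes)
```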

9

What are the two main plot families in Seaborn for 1-dimensional charts?

catplot (comparing categories) and displot (comparing distributions). Both can be grouped by additional categories.

10

What does catplot create?

Charts of a single variable, such as box plots, strip plots, and violin plots.
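
A minimal sketch of a box plot made with `catplot` follows, using invented scores for two year groups. The `kind` argument selects the chart type; `"strip"` and `"violin"` are the other types named above. The `Agg` backend is set only so the script runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no window is needed

import pandas as pd
import seaborn as sns

# Hypothetical data for illustration.
df = pd.DataFrame({
    "year_group": ["Y7", "Y7", "Y7", "Y8", "Y8", "Y8"],
    "score": [12, 15, 9, 14, 18, 11],
})

# One box per category on the x axis; try kind="strip" or kind="violin" too.
grid = sns.catplot(data=df, x="year_group", y="score", kind="box")
```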

11

What does displot create?

Charts that show the distribution of a single variable, such as histograms (equal-width intervals on the x-axis, frequency on the y-axis).

12

How can comparisons in datasets be facilitated?

By superimposing the datasets on the same chart.

13

What does the term 'association' mean in data analysis?

It means the values of one variable are linked to the values of another.
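
One common way to put a number on a linear association is the Pearson correlation coefficient, sketched below with invented data. Using correlation here is an assumption for illustration; the flashcards do not name a specific measure. A value near 1 or -1 suggests a strong linear link, and a value near 0 suggests little or none.

```python
import pandas as pd

# Hypothetical data: scores tend to rise with hours studied.
df = pd.DataFrame({
    "hours_studied": [1, 2, 3, 4, 5],
    "score": [52, 58, 61, 70, 74],
})

# Pearson correlation between the two columns (pandas' default method).
r = df["hours_studied"].corr(df["score"])
print("Correlation:", round(r, 3))
```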

14

Does association imply causation?

No, association does not imply causation.

15

How can you discover answers to questions about data?

By combining describe() with scatter diagrams.

16

What function can be used to compare different features in a data set?

groupby
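
A small sketch of `groupby` in pandas, with made-up data: rows are grouped by one feature and a calculation is applied to another feature within each group.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({
    "year_group": ["Y7", "Y7", "Y8", "Y8"],
    "score": [12, 16, 14, 18],
})

# Mean score within each year group.
mean_by_year = df.groupby("year_group")["score"].mean()
print(mean_by_year)
```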

17

What function can be used to create box plots and compare charts for different features?

catplot

18

What function can be used to create and compare scatterplots based on two features to identify associations?

relplot
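
A minimal sketch of a scatterplot made with `relplot`, using invented data. Each point is one row, positioned by two features, so any association between them shows up as a trend in the points.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no window is needed

import pandas as pd
import seaborn as sns

# Hypothetical data for illustration.
df = pd.DataFrame({
    "hours_studied": [1, 2, 3, 4, 5, 6],
    "score": [52, 58, 61, 70, 74, 79],
})

# Scatterplot of score against hours studied.
grid = sns.relplot(data=df, x="hours_studied", y="score", kind="scatter")
```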

19

What can facilitate data visualization?

Using colours to enhance visual clarity and comparing smaller slices of the data.
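
In Seaborn, both ideas can be sketched with the `hue` and `col` arguments: `hue` colours the points by a category, and `col` splits the data into one small chart per category. The data and column names below are hypothetical.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no window is needed

import pandas as pd
import seaborn as sns

# Hypothetical data for illustration.
df = pd.DataFrame({
    "hours_studied": [1, 2, 3, 4, 1, 2, 3, 4],
    "score": [50, 57, 63, 70, 54, 60, 68, 75],
    "year_group": ["Y7", "Y8", "Y7", "Y8", "Y7", "Y8", "Y7", "Y8"],
    "term": ["autumn"] * 4 + ["spring"] * 4,
})

# Colour points by year group and draw one panel per term.
grid = sns.relplot(
    data=df, x="hours_studied", y="score", hue="year_group", col="term"
)
```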

20

What is essential after investigating data?

Communicating the results clearly to the target audience.

21

What is the role of charts and calculated values in data analysis?

They help answer the initial problems or questions posed.

22

What can proper use of the data cycle allow?

It can yield patterns or insights from the data.