data science cycle

11 Terms

1

problem formulation

Looking at a data set to find questions or patterns to investigate.

2

getting the data

Comma-separated values (CSV) files can be imported into pandas DataFrames in Python. Cleaning the data means finding and fixing missing values and outliers.
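
A minimal sketch of this step. The data, the column names `name` and `height_cm`, and the plausible range of 50 to 250 are all invented for illustration, and an in-memory string stands in for a .csv file on disk. Filling with the median and dropping out-of-range rows are only one possible way to clean.

```python
import io

import pandas as pd

# Hypothetical CSV content; a real file path could be passed to read_csv instead.
csv_text = "name,height_cm\nAva,160\nBen,\nCal,172\nDee,900\n"

# Import the CSV data into a pandas DataFrame.
df = pd.read_csv(io.StringIO(csv_text))

# Find missing values: number of empty (NaN) entries in each column.
missing_counts = df.isna().sum()

# Fix missing values by filling them with the column median (an assumed choice).
median_height = df["height_cm"].median()
df["height_cm"] = df["height_cm"].fillna(median_height)

# Treat values outside an assumed plausible range as outliers and drop those rows.
cleaned = df[df["height_cm"].between(50, 250)]
```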

3

exploring data

Using describe() to summarise the data; exploring through printed output and graphs such as box plots; grouping data to calculate averages; and changing data types to help interpret fields, e.g. from numerical to categorical.
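
The exploration steps on this card might look roughly like the sketch below. The DataFrame and its column names `group` and `score` are hypothetical, and the box plot is left out so the example needs only pandas.

```python
import pandas as pd

# Hypothetical data set, invented for illustration.
df = pd.DataFrame({
    "group": [1, 1, 2, 2],
    "score": [10.0, 20.0, 30.0, 50.0],
})

# Summary statistics (count, mean, std, min, quartiles, max) for numeric columns.
summary = df.describe()
print(summary)

# Change the numeric group codes to a categorical type so they are
# treated as labels rather than quantities.
df["group"] = df["group"].astype("category")

# Group by the categorical field and calculate the average score per group.
mean_by_group = df.groupby("group", observed=True)["score"].mean()
print(mean_by_group)
```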

4

analysing the data

Association means the values of one variable are linked in some way to the values of another. Combining describe() with scatter diagrams helps answer questions about the data.
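
As a numeric complement to a scatter diagram, association between two numeric variables can be summarised with a correlation coefficient. This is an addition not mentioned on the card; the sketch below assumes Pearson correlation (the pandas default) and uses invented data with hypothetical column names.

```python
import pandas as pd

# Hypothetical data, invented for illustration.
df = pd.DataFrame({
    "hours_studied": [1, 2, 3, 4, 5],
    "test_score": [52, 58, 61, 70, 74],
})

# Pearson correlation: values near +1 or -1 suggest a strong linear
# association, values near 0 suggest little linear association.
r = df["hours_studied"].corr(df["test_score"])
print(round(r, 2))
```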

5

catplot

compares categories

6

displot

compares distribution

7

csv file

Comma-separated values: a plain text file in which the values in each row are separated by commas and each row ends with a line break.
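
To make the format concrete, here is a small sketch that parses CSV text with Python's standard `csv` module. The content (`name`, `age` and the two rows) is invented for illustration.

```python
import csv
import io

# Hypothetical CSV text: commas separate values, line breaks end rows.
csv_text = "name,age\nAva,31\nBen,27\n"

# csv.reader splits each line into a list of string values.
rows = list(csv.reader(io.StringIO(csv_text)))

# The first row is usually a header naming the fields.
header, records = rows[0], rows[1:]
print(header)
print(records)
```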

8

relplot

Creates scatter plots of two features so they can be compared to identify association.
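
A minimal sketch, assuming `relplot` here refers to the seaborn function of that name. The data and column names are hypothetical, and the Agg backend is selected only so the code runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; not needed in a notebook

import pandas as pd
import seaborn as sns

# Hypothetical data, invented for illustration.
df = pd.DataFrame({
    "hours_studied": [1, 2, 3, 4, 5],
    "test_score": [52, 58, 61, 70, 74],
})

# relplot draws a scatter plot by default (kind="scatter").
grid = sns.relplot(data=df, x="hours_studied", y="test_score")
```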

9

groupby

Groups rows by the values of one feature so that other features can be summarised and compared across groups.

10

catplot charts

box plot, strip plot, violin plot
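
The three chart types can be selected with the `kind` argument of seaborn's `catplot`. This is a sketch on invented data with hypothetical column names; the Agg backend is chosen only so it runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; not needed in a notebook

import pandas as pd
import seaborn as sns

# Hypothetical data, invented for illustration.
df = pd.DataFrame({
    "class_name": ["A", "A", "A", "B", "B", "B"],
    "score": [55, 60, 72, 48, 66, 80],
})

# Draw the same categorical comparison as a box, strip and violin plot.
grids = {
    kind: sns.catplot(data=df, x="class_name", y="score", kind=kind)
    for kind in ("box", "strip", "violin")
}
```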

11

displot

histograms