problem formulation
Looking at the data set to find questions or patterns to investigate.
getting the data
Comma-separated values (CSV) files can be imported into pandas DataFrames in Python. Cleaning the data means finding and fixing missing or outlier values.
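A minimal sketch of importing and cleaning, assuming a small made-up data set. The column names, the median fill, and the 50 to 250 cm plausibility range are illustrative assumptions, not part of the original notes.

```python
import io
import pandas as pd

# Hypothetical CSV content; normally a file path would be passed to read_csv.
csv_text = """name,age,height_cm
Ana,23,165
Ben,,180
Cara,31,172
Dev,29,999
"""

df = pd.read_csv(io.StringIO(csv_text))

# Missing data: count the gaps per column, then fill the missing age
# with the median age (one possible fix, chosen here as an assumption).
missing_counts = df.isna().sum()
df["age"] = df["age"].fillna(df["age"].median())

# Outliers: keep only heights inside an assumed plausible range.
cleaned = df[df["height_cm"].between(50, 250)].reset_index(drop=True)
print(cleaned)
```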
exploring data
Using describe() to summarise the data, and exploring it through printed output and graphs such as box plots, by grouping data to calculate averages, and by changing data types to help interpret fields, e.g. from numerical to categorical.
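The exploration steps could look roughly like this. The data and the column names (year_group, score) are invented for illustration.

```python
import pandas as pd

# Hypothetical data: a numeric-looking label and a measurement.
df = pd.DataFrame({
    "year_group": [7, 7, 8, 8, 9],
    "score": [55.0, 65.0, 70.0, 80.0, 90.0],
})

# describe() gives count, mean, std, min, quartiles and max.
summary = df["score"].describe()

# year_group is stored as a number but really labels a category.
df["year_group"] = df["year_group"].astype("category")

# Group the rows by year group and calculate the average score in each.
mean_by_year = df.groupby("year_group", observed=True)["score"].mean()

print(summary)
print(mean_by_year)
```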
analysing the data
Association means the values of one variable are linked in some way to the values of another. Combining describe() with scatter diagrams helps answer questions about the data.
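A small sketch of pairing describe() with a scatter diagram, using pandas' built-in plotting (which relies on matplotlib). The hours_studied and test_score columns are made-up examples.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import pandas as pd

# Hypothetical data in which one variable tends to rise with the other.
df = pd.DataFrame({
    "hours_studied": [1, 2, 3, 4, 5],
    "test_score": [52, 58, 65, 71, 80],
})

# Summary statistics for both variables.
stats = df.describe()
print(stats)

# Scatter diagram: an upward pattern of points suggests an association.
ax = df.plot.scatter(x="hours_studied", y="test_score")
```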
catplot
compares categories
displot
compares distribution
csv file
A plain text file of comma-separated values: the values in each row are separated by commas, and each row ends with a line break.
relplot
Creates scatter plots of two features so they can be compared to identify association.
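A hedged example of seaborn's relplot, which draws a scatter plot by default. The data and the optional hue column are assumptions added for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import pandas as pd
import seaborn as sns

# Hypothetical data: two numeric features plus a grouping label.
df = pd.DataFrame({
    "hours_studied": [1, 2, 3, 4, 5, 6],
    "test_score": [52, 58, 65, 71, 80, 84],
    "group": ["A", "B", "A", "B", "A", "B"],
})

# relplot defaults to kind="scatter"; hue colours the points by group.
grid = sns.relplot(data=df, x="hours_studied", y="test_score", hue="group")
```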
groupby
Groups rows by the values of a feature so that summaries such as averages can be compared between groups.
catplot charts
box plot, strip plot, violin plot
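The three chart types can be selected through catplot's kind parameter. A sketch with invented data:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import pandas as pd
import seaborn as sns

# Hypothetical data: one categorical feature and one numeric feature.
df = pd.DataFrame({
    "group": ["A", "A", "A", "A", "B", "B", "B", "B"],
    "score": [55, 60, 65, 70, 62, 68, 75, 81],
})

# Draw the same comparison as a box plot, a strip plot and a violin plot.
grids = {
    kind: sns.catplot(data=df, x="group", y="score", kind=kind)
    for kind in ("box", "strip", "violin")
}
```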
displot charts
histograms