Exploring data cards


4 Terms

1

Name 4 ways to do EDA (Exploratory Data Analysis)

  • Compare columns (feature variables) against each other

  • Compare variables against your target (what we want to predict)

  • Understand the data dictionary

  • Talk with SMEs (subject matter experts) so that you become an SME on this data
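The first two comparisons above can be sketched in pandas. This is a minimal illustration on made-up rows, not the card author's code: the column names `age`, `sex`, `chol`, and `target` and all values are assumptions chosen for the example.

```python
import pandas as pd

# Hypothetical toy data; column names and values are assumptions for illustration.
toy = pd.DataFrame({
    "age": [63, 37, 41, 56, 57, 44],
    "sex": [1, 1, 0, 1, 0, 1],
    "chol": [233, 250, 204, 236, 354, 263],
    "target": [1, 1, 1, 0, 0, 0],
})

# Compare feature columns against each other: pairwise correlations.
feature_corr = toy[["age", "chol"]].corr()

# Compare a feature against the target: count each sex value per target class.
sex_vs_target = pd.crosstab(toy["target"], toy["sex"])

print(feature_corr)
print(sex_vs_target)
```

`pd.crosstab` suits a categorical feature; for a numeric feature, grouping by the target and comparing means is a common alternative.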

2

EDA (Exploratory Data Analysis): 4 things you need to know about the data

  • What kind of data is this (categorical, numeric, etc.)? Which tools and models make the most sense to use?

  • Are there outliers?

  • How will missing data be dealt with?

  • Which features (variables) should be used (kept or discarded)?
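The first three questions above can be checked quickly in pandas. The sketch below uses a hypothetical toy table (the column names `age`, `chol`, `thal` and all values are assumptions), and the 1.5 × IQR rule is only one common outlier heuristic, not the only option.

```python
import pandas as pd

# Hypothetical toy data with one missing value per column in "chol" and "thal".
toy = pd.DataFrame({
    "age": [63, 37, 41, 56, 57, 44],
    "chol": [233.0, 250.0, None, 236.0, 564.0, 263.0],
    "thal": ["normal", "fixed", "normal", None, "reversible", "normal"],
})

# What kind of data is this? Inspect the type of each column.
column_types = toy.dtypes

# How much data is missing? Count missing values per column.
missing_counts = toy.isna().sum()

# Are there outliers? Flag "chol" values outside 1.5 * IQR of the quartiles.
q1 = toy["chol"].quantile(0.25)
q3 = toy["chol"].quantile(0.75)
iqr = q3 - q1
outliers = toy[(toy["chol"] < q1 - 1.5 * iqr) | (toy["chol"] > q3 + 1.5 * iqr)]

print(column_types)
print(missing_counts)
print(outliers)
```

Which features to keep is a judgment call that these checks inform but do not settle.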

3

How to use code to explore the target variable during EDA

dataframe['target'].value_counts()

# Of the 303 patients (each is a row of data):

# 165 have heart disease vs. 138 do not have heart disease

# Roughly 50/50 (about 54% vs. 46%), so the two target groups are close in size: the data is balanced
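To see the split as proportions rather than raw counts, `value_counts` accepts `normalize=True`. The sketch below rebuilds only the class counts stated on this card instead of loading the real file; encoding heart disease as 1 and no heart disease as 0 is an assumption.

```python
import pandas as pd

# Stand-in for dataframe['target'], built from the counts on this card.
# Assumption: 1 = heart disease, 0 = no heart disease.
target = pd.Series([1] * 165 + [0] * 138, name="target")

counts = target.value_counts()                      # raw counts per class
proportions = target.value_counts(normalize=True)   # fraction of rows per class

print(counts)
print(proportions.round(3))
```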

4

Use matplotlib.pyplot as plt to visualize the data from the previous card, given dataframe = pd.read_csv("data/heart-disease.csv") and %matplotlib inline

dataframe['target'].value_counts()

# Of the 303 patients (each is a row of data):

# 165 have heart disease vs. 138 do not have heart disease

# Roughly 50/50 (about 54% vs. 46%), so the two target groups are close in size: the data is balanced
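One way to turn these counts into a picture is a bar chart. This is a sketch rather than the card's own answer: it rebuilds the target column from the counts above instead of reading the CSV, the 1/0 encoding is an assumption, and the axis labels and title are illustrative.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Stand-in for dataframe['target'], built from the counts on this card.
# Assumption: 1 = heart disease, 0 = no heart disease.
target = pd.Series([1] * 165 + [0] * 138, name="target")
counts = target.value_counts()

# Draw one bar per target class on an explicit Axes object.
fig, ax = plt.subplots()
counts.plot(kind="bar", ax=ax)
ax.set_xlabel("target")
ax.set_ylabel("Number of patients")
ax.set_title("Heart disease class counts")
```

In a notebook with %matplotlib inline the figure renders after the cell runs; in a script, call plt.show() at the end.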
