7 - Data Cleaning and Preparation

20 Terms

dropna

Filters out missing data from a Series or DataFrame, removing rows or columns with NA values based on specified thresholds.
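
A minimal sketch of the three common ways to call `dropna`; the small DataFrame and its column names are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data: row 0 has one NA, row 1 has two, row 2 has none.
df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": [np.nan, np.nan, 6.0],
    "c": [7.0, 8.0, 9.0],
})

rows_complete = df.dropna()                 # keep rows with no NA at all
rows_thresh = df.dropna(thresh=2)           # keep rows with at least 2 non-NA values
cols_complete = df.dropna(axis="columns")   # keep columns with no NA

print(list(rows_complete.index))    # [2]
print(list(rows_thresh.index))      # [0, 2]
print(list(cols_complete.columns))  # ['c']
```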

fillna

Fills missing values with a specified value, or propagates the nearest valid value forward ('ffill') or backward ('bfill'). Recent pandas releases expose the propagation options as the separate `ffill` and `bfill` methods.
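
As a rough illustration, the sketch below fills the same made-up Series three ways: with a constant, by forward fill, and by backward fill. It uses the `ffill` and `bfill` methods rather than a `method=` argument to `fillna`.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, np.nan, 4.0])

filled_const = s.fillna(0)  # replace every NA with 0
filled_fwd = s.ffill()      # carry the last valid value forward
filled_bwd = s.bfill()      # pull the next valid value backward

print(filled_const.tolist())  # [1.0, 0.0, 0.0, 4.0]
print(filled_fwd.tolist())    # [1.0, 1.0, 1.0, 4.0]
print(filled_bwd.tolist())    # [1.0, 4.0, 4.0, 4.0]
```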

isnull

Returns a boolean array indicating which values are missing/NA in a Series or DataFrame.

notnull

Returns the negation of `isnull`, indicating which values are not missing/NA.
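
A short sketch showing `isnull` and `notnull` side by side on an illustrative Series; the boolean result can be used directly as a filter.

```python
import pandas as pd

s = pd.Series(["a", None, "c"])

missing = s.isnull()   # True where the value is missing
present = s.notnull()  # the element-wise negation of isnull

print(missing.tolist())     # [False, True, False]
print(s[present].tolist())  # ['a', 'c']
```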

drop_duplicates

Returns a DataFrame with duplicate rows removed, keeping the first occurrence of each row by default.

duplicated

Returns a boolean Series indicating whether each row is a duplicate of a previous row.
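
The sketch below applies `duplicated` and `drop_duplicates` to the same small frame. The column names `k1` and `k2` are placeholders, and the `subset`/`keep` call is one option among several.

```python
import pandas as pd

df = pd.DataFrame({
    "k1": ["one", "two", "one", "two"],
    "k2": [1, 1, 1, 2],
})

dup_mask = df.duplicated()       # row 2 repeats row 0
deduped = df.drop_duplicates()   # drops row 2, keeps first occurrences
# Judge duplicates on k1 only and keep the last occurrence of each value.
by_k1 = df.drop_duplicates(subset=["k1"], keep="last")

print(dup_mask.tolist())    # [False, False, True, False]
print(list(deduped.index))  # [0, 1, 3]
print(list(by_k1.index))    # [2, 3]
```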

replace

Substitutes occurrences of one value (or a list or mapping of values) with another, for example swapping sentinel values for NA.
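
A minimal sketch, assuming `-999` and `-1000` are sentinel values standing in for missing data (an invented convention for this example).

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, -999.0, 2.0, -1000.0])

# Replace several sentinels with one value.
cleaned = s.replace([-999.0, -1000.0], np.nan)
# Use a mapping to give each sentinel its own replacement.
mapped = s.replace({-999.0: np.nan, -1000.0: 0.0})

print(cleaned.isna().tolist())  # [False, True, False, True]
print(mapped.isna().tolist())   # [False, True, False, False]
```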

rename

Renames axis labels (index or columns) in a DataFrame, either in-place or returning a new DataFrame.
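
As a sketch, `rename` accepts either a function or a mapping for each axis and returns a new object by default; the labels below are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"one": [1, 2], "two": [3, 4]}, index=["ohio", "utah"])

# Title-case the index labels and rename a single column via a mapping.
renamed = df.rename(index=str.title, columns={"one": "ONE"})

print(list(renamed.index))    # ['Ohio', 'Utah']
print(list(renamed.columns))  # ['ONE', 'two']
print(list(df.columns))       # ['one', 'two'] (original is unchanged)
```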

pd.cut

Bins continuous data into intervals defined by explicit bin edges or by a requested number of equal-width bins.
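
A small sketch of binning with explicit edges; the ages, the edges, and the label names are all made up. By default each interval is closed on the right, so 25 falls in the first bin.

```python
import pandas as pd

ages = [20, 22, 25, 27, 21, 23, 37, 31, 61, 45, 41, 32]
bins = [18, 25, 35, 60, 100]  # intervals (18, 25], (25, 35], (35, 60], (60, 100]
labels = ["youth", "young_adult", "middle_aged", "senior"]

cats = pd.cut(ages, bins, labels=labels)

print(list(cats)[:3])               # ['youth', 'youth', 'youth']
print(list(cats).count("youth"))    # 5
print(list(cats).count("senior"))   # 1
```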

pd.qcut

Bins data into buckets holding roughly equal numbers of observations, with bin edges taken from sample quantiles.
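
For comparison with edge-based binning, this sketch splits the integers 1 to 100 into quartiles; the labels `q1` to `q4` are arbitrary names chosen for the example.

```python
import numpy as np
import pandas as pd

data = np.arange(1, 101)

# Four buckets whose edges are the sample quartiles.
quartiles = pd.qcut(data, 4, labels=["q1", "q2", "q3", "q4"])
counts = pd.Series(quartiles).value_counts()

print(sorted(counts.tolist()))  # [25, 25, 25, 25]
```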

detect outliers

Use boolean indexing with a condition such as `np.abs(data) > 3` (appropriate when the data are roughly standardized), or a statistical rule such as a standard-deviation cutoff.
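
One way to sketch this, assuming roughly standard-normal data so that an absolute value above 3 is a reasonable cutoff. The random data and the planted value of 8.0 are illustrative only.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
data = pd.DataFrame(rng.standard_normal((1000, 4)))
data.iloc[5, 2] = 8.0  # plant one obvious outlier for the demonstration

# Select every row in which any column exceeds 3 in absolute value.
outlier_rows = data[(np.abs(data) > 3).any(axis="columns")]

# One possible treatment: cap values to the interval [-3, 3].
capped = data.clip(lower=-3, upper=3)

print(5 in outlier_rows.index)  # True
```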

get_dummies

Converts categorical variables into dummy/indicator variables (one-hot encoding).
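
A minimal sketch of one-hot encoding a single column. The `prefix` and `dtype=int` arguments are choices made here for readable output, not requirements.

```python
import pandas as pd

df = pd.DataFrame({"key": ["b", "b", "a", "c"], "value": range(4)})

# One indicator column per distinct value of "key".
dummies = pd.get_dummies(df["key"], prefix="key", dtype=int)

print(list(dummies.columns))      # ['key_a', 'key_b', 'key_c']
print(dummies["key_b"].tolist())  # [1, 1, 0, 0]
```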

split

Breaks a string into a list of pieces on a delimiter; often combined with `strip` to trim whitespace from each piece.
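
A short sketch with plain Python strings; the sample value is invented.

```python
val = "a,b,  guido"

raw_pieces = val.split(",")                 # whitespace is kept
pieces = [x.strip() for x in raw_pieces]    # trim each piece

print(raw_pieces)  # ['a', 'b', '  guido']
print(pieces)      # ['a', 'b', 'guido']
```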

str.contains

Checks if each string in a Series contains a specified pattern or substring, returning a boolean Series.
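
As an illustration, the sketch below checks made-up email addresses for a substring. Passing `na=False` is one way to treat missing entries as non-matches so the result can be used as a filter.

```python
import pandas as pd

emails = pd.Series(["dave@google.com", "steve@gmail.com", None])

has_gmail = emails.str.contains("gmail", na=False)

print(has_gmail.tolist())          # [False, True, False]
print(emails[has_gmail].tolist())  # ['steve@gmail.com']
```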

str.extract

Extracts the capture groups of a regex pattern from each string, returning one column per group; the related `str.findall` returns all matches of the pattern as lists.
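
A sketch using named capture groups, which become the column names of the result. The simplified email pattern is an assumption for illustration and is not a full address validator.

```python
import pandas as pd

emails = pd.Series(["dave@google.com", "steve@gmail.com"])

# Named groups: user, domain, suffix.
pattern = r"(?P<user>[\w.]+)@(?P<domain>[\w.]+)\.(?P<suffix>\w+)"
parts = emails.str.extract(pattern)

print(list(parts.columns))   # ['user', 'domain', 'suffix']
print(parts.loc[0].tolist()) # ['dave', 'google', 'com']
```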

str.replace

Replaces occurrences of a pattern or substring in each string of a Series.
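
A minimal sketch contrasting a regex replacement with a literal one. Setting `regex` explicitly avoids relying on the default, which has differed between pandas versions.

```python
import pandas as pd

s = pd.Series(["a  b", "c   d"])

collapsed = s.str.replace(r"\s+", " ", regex=True)  # collapse runs of whitespace
literal = s.str.replace("a", "A", regex=False)      # plain substring replacement

print(collapsed.tolist())  # ['a b', 'c d']
print(literal.tolist())    # ['A  b', 'c   d']
```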

str.cat

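Concatenates strings, either joining all elements of a Series into one string or joining element-wise with another sequence, with an optional delimiter.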
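
The two modes can be sketched as follows; the values and separators are arbitrary.

```python
import pandas as pd

s = pd.Series(["a", "b", "c"])

joined = s.str.cat(sep="-")                   # collapse the Series to one string
paired = s.str.cat(["x", "y", "z"], sep="_")  # element-wise concatenation

print(joined)           # a-b-c
print(paired.tolist())  # ['a_x', 'b_y', 'c_z']
```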

str.upper

Converts all characters in each string of a Series to uppercase.

str.startswith

Returns a boolean Series indicating whether each string starts with a given prefix.

str.len

Returns the length of each string in a Series.
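
The last three cards (`str.upper`, `str.startswith`, `str.len`) can be sketched together on one illustrative Series. Missing entries stay missing unless a fill is requested, as with `na=False` here.

```python
import pandas as pd

names = pd.Series(["alice", "Bob", None])

upper = names.str.upper()                       # uppercase each string
starts_a = names.str.startswith("a", na=False)  # prefix test, NA treated as False
lengths = names.str.len()                       # NA stays NA in the result

print(upper.tolist()[:2])      # ['ALICE', 'BOB']
print(starts_a.tolist())       # [True, False, False]
print(lengths.tolist()[:2])    # [5.0, 3.0]
```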
