Distributions and Histograms!

Last updated 8:29 PM on 1/24/24

24 Terms

1
New cards

Scatter plot

numerical vs numerical

2
New cards

line plot

sequential numerical (time) vs numerical

3
New cards

bar chart

categorical vs numerical

4
New cards

histogram

distribution of numerical

5
New cards

What is the distribution of a variable?

“How often does a variable take on a certain value?”

Both categorical and numerical variables have this.

6
New cards

Categorical variables

Bar charts show the distribution of a categorical variable.
Use .plot(kind="barh", y="Distance"); y names the numerical column whose values set the bar lengths.

Other arguments can be added: .plot(kind=, y=, legend=False, xlabel="Count", title="Distribution of Exoplanet Types")

When you don't specify x, the index supplies the category labels.
legend=False hides the legend.
xlabel= labels the x-axis, which in a horizontal bar chart is the numerical axis (here, the counts).
title= sets the title.

figsize=(3, 10) takes a sequence of two values: the first is the width and the second is the height (in inches).

Keep in mind that a horizontal bar chart draws the first row at the bottom, so sorting with ascending=False gives bars that read in ascending order from top to bottom!
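
A minimal sketch of a horizontal bar chart, assuming plain pandas (not necessarily the library the course uses) and a made-up table of exoplanet type counts. The table, its column name "Count", and the variable names are hypothetical. The x-axis label is set on the returned axes so the sketch does not depend on how a given pandas version handles xlabel= for horizontal bars.

```python
import matplotlib
matplotlib.use("Agg")  # draw off-screen so the sketch runs without a display
import pandas as pd

# Hypothetical data: how many exoplanets of each type were found.
type_counts = pd.DataFrame(
    {"Count": [5, 12, 30]},
    index=["Terrestrial", "Super Earth", "Gas Giant"],
)

# Sorting descending puts the largest row first, but barh draws the first row
# at the bottom, so the bars read smallest-to-largest from the top down.
ordered = type_counts.sort_values(by="Count", ascending=False)

ax = ordered.plot(
    kind="barh",
    y="Count",                     # numerical column that sets the bar lengths
    legend=False,                  # hide the legend
    title="Distribution of Exoplanet Types",
    figsize=(5, 3),                # (width, height) in inches
)
ax.set_xlabel("Count")             # the horizontal axis is the numerical one

# Bar lengths in drawing order (bottom bar first).
bar_lengths = [patch.get_width() for patch in ax.patches]
```

No x is given, so the category labels come from the index.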

7
New cards

What does .describe() do? (Series method)

Returns a Series of summary statistics for a numerical Series: count, mean, std, min, the quartiles, and max.
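
A small sketch of .describe() on a numerical Series, assuming plain pandas. The radii are made-up values used only for illustration.

```python
import pandas as pd

# Hypothetical planet radii (in Earth radii).
radii = pd.Series([1.0, 2.0, 3.0, 4.0, 10.0], name="Radius")

# describe() returns a Series indexed by the name of each statistic.
summary = radii.describe()
print(summary)

# Individual statistics can be read off by label.
median_radius = summary["50%"]
```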

8
New cards

Can you represent the radius of exoplanets in a bar chart?

NO. Radius is numerical, so the horizontal axis should be numerical, not categorical. There should be more space between certain bars than others.

ex. a bar chart spaces its bars evenly, so two planets whose radii differ by 80% would sit just as close together as two planets of nearly equal radius.

Instead, use density histograms.

9
New cards

What is a density histogram (e.g. for radius)?

Looks like a bar chart, but the x-axis is a number line (rather than a set of categories!)

Also, the y-axis is labelled "Frequency", but a height of 2.8 does not mean there are 2.8 planets in that range; the height is not a COUNT of the values in the bin.

10
New cards

What is binning?

Binning groups nearby values into one bin. A bin [a, b) includes a but not b.

This is the convention for binning: a value belongs to a bin if it is greater than or equal to the left endpoint and less than the right endpoint.

Values within a bin are not distinguished from one another; they are just counted together.
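
The [a, b) convention can be checked with NumPy's histogram function, which follows the same rule. This is only a sketch; the values and bin edges are made up.

```python
import numpy as np

# Hypothetical values; 1.0 and 2.0 sit exactly on bin edges.
values = np.array([0.0, 0.5, 1.0, 1.5, 1.9, 2.0])
edges = [0, 1, 2, 3]  # bins [0, 1), [1, 2), [2, 3]

counts, _ = np.histogram(values, bins=edges)

# 1.0 is counted in [1, 2), not [0, 1), because the left endpoint is included
# and the right endpoint is not. Likewise 2.0 lands in the bin starting at 2.
print(counts)
```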

11
New cards

Plotting density histograms

df.plot(
    kind="hist",
    y=column_name,
    density=True,
)

ec="w" gives the bars white EDGE COLORS, so neighboring bars are easy to tell apart.

Requires ONLY ONE numerical column (y).

By default there are 10 bins of equal width, some of which may be empty.
You can ask for a different number of bins with the argument bins=#.
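
A minimal sketch of the call, assuming plain pandas with matplotlib. The DataFrame and its "Radius" column are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # draw off-screen so the sketch runs without a display
import pandas as pd

# Hypothetical planet radii.
planets = pd.DataFrame(
    {"Radius": [1.0, 1.2, 1.5, 2.0, 2.1, 3.5, 4.0, 9.0, 11.0, 12.0]}
)

# Default: 10 equal-width bins spanning the smallest to the largest value.
ax = planets.plot(kind="hist", y="Radius", density=True, ec="w")
default_bar_count = len(ax.patches)

# Same data with 20 equal-width bins.
ax20 = planets.plot(kind="hist", y="Radius", density=True, bins=20, ec="w")
finer_bar_count = len(ax20.patches)
```

Empty bins are still drawn, as bars of height zero.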

12
New cards

What does the bins=20 argument do?

creates 20 bins of equal width for your histogram

13
New cards

You can specify the exact starting and ending points of the bins. How?

Set bins= to a sequence, such as a list, of all the endpoints you want to use.

bins=[]

14
New cards

What are the y-axis values for density histograms?

The proportion of values in that bin divided by the bar's WIDTH (proportion per unit of the x-axis).

15
New cards

Histogram bins are normally [a, b), but what about the last bin?

[a, b]: the last bin includes both endpoints.

16
New cards

Do bins cut off values?

Yes. If your bins don't cover the full range of the data, the values outside them are left out of the histogram.

17
New cards

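What does bins=np.arange(4) do?

This works! np.arange(4) is the sequence 0, 1, 2, 3, so it creates the bins [0, 1), [1, 2), [2, 3].
The endpoint 4 is NOT included, so any value greater than 3 is CUT OFF!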
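
The cut-off can be seen by counting with NumPy's histogram function, which uses the same bin convention. The values below are made up for illustration.

```python
import numpy as np

# Hypothetical values, some of them larger than 3.
values = np.array([0.5, 1.5, 2.5, 3.0, 3.5, 4.0])

edges = np.arange(4)  # array([0, 1, 2, 3]): bins [0, 1), [1, 2), [2, 3]
counts, _ = np.histogram(values, bins=edges)

# 3.0 is kept because the last bin includes its right endpoint;
# 3.5 and 4.0 fall outside every bin and are silently dropped.
dropped = len(values) - counts.sum()

# Extending the edges past the largest value keeps everything.
counts_all, _ = np.histogram(values, bins=np.arange(6))  # edges 0, 1, ..., 5
```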

18
New cards

Also, what is the total area of a histogram?

The total area of all the bars is 1, which explains the odd-looking y-axis values. This holds because it is a DENSITY histogram!
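
A short numerical sketch of why the areas add up to 1, using NumPy's histogram with density=True. The values and the unequal bin edges are made up; each height is the bin's proportion divided by its width.

```python
import numpy as np

# Hypothetical values and deliberately unequal bins.
values = np.array([1, 1, 2, 3, 5, 6, 7, 9])
edges = [0, 2, 4, 10]  # bin widths 2, 2 and 6

# density=True returns heights: (proportion of values in bin) / (bin width).
heights, _ = np.histogram(values, bins=edges, density=True)
widths = np.diff(edges)

# Area of each bar = width * height = proportion of values in that bin.
areas = heights * widths
total_area = areas.sum()
```

Here the bins hold 2, 2 and 4 of the 8 values, so the areas are 0.25, 0.25 and 0.5, even though the widest bar is the shortest.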

19
New cards

Proportion vs percentage

proportion: between 0 and 1; percentage: between 0% and 100%

20
New cards

How do you find the area of a bar?

Measure the width, then multiply it by the height; the area is the proportion of values in that bin.

The result won't be an exact match b/c you're reading the height off the plot by eye.

21
New cards

The y-axis always says "Frequency", but that's wrong for a density histogram. How do you fix it?

You can use ylabel to fix it, but usually you don't bother: you recognize that it's a density histogram and know how to interpret it.
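
One way to relabel the axis, sketched with plain pandas and matplotlib. The label is set on the returned axes here, which is an assumption about what works across pandas versions rather than the only option; the data is made up.

```python
import matplotlib
matplotlib.use("Agg")  # draw off-screen so the sketch runs without a display
import pandas as pd

# Hypothetical planet radii.
planets = pd.DataFrame({"Radius": [1.0, 1.5, 2.0, 3.5, 9.0, 11.0]})

ax = planets.plot(kind="hist", y="Radius", density=True, ec="w")
default_label = ax.get_ylabel()  # the label pandas chose for the y-axis

# Replace the default label with one that matches what the heights mean.
ax.set_ylabel("Density")
```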

22
New cards

How do you make multiple plots on the same axes?

You can .get([]) multiple columns and plot them at the same time!

Alternatively, if you omit y, ALL the columns are plotted at the same time (those that can be, i.e. the numerical ones!)
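
A sketch of overlaying two histograms, assuming plain pandas, where planets[["Radius", "Mass"]] plays the role of .get([...]). The table and its column names are hypothetical, and alpha is an extra argument added here so overlapping bars stay visible.

```python
import matplotlib
matplotlib.use("Agg")  # draw off-screen so the sketch runs without a display
import pandas as pd

# Hypothetical table with one text column and two numerical columns.
planets = pd.DataFrame({
    "Name": ["a", "b", "c", "d", "e", "f"],
    "Radius": [1.0, 1.5, 2.0, 3.5, 9.0, 11.0],
    "Mass": [0.8, 2.0, 4.5, 6.0, 10.0, 12.0],
})

# Select the numerical columns to compare, then omit y so both are plotted.
both = planets[["Radius", "Mass"]]
ax = both.plot(kind="hist", density=True, bins=5, alpha=0.6, ec="w")

# One legend entry per plotted column.
legend_labels = [text.get_text() for text in ax.get_legend().get_texts()]
```

Both columns share the same bin edges, so the two sets of bars line up on the x-axis.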

23
New cards
24
New cards
