How do you load CSV data in NumPy?
import numpy as np
np.genfromtxt(filename, delimiter=None)
E.g.: data = np.genfromtxt('data.csv', delimiter=',')
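A minimal sketch of the call above, assuming a small header-less CSV. The CSV text is made up for illustration, and io.StringIO stands in for a file on disk so the example runs on its own:

```python
import io
import numpy as np

# Hypothetical CSV content with no header row.
csv_text = "1,2,3\n4,5,6\n"

# genfromtxt accepts a file name or a file-like object.
data = np.genfromtxt(io.StringIO(csv_text), delimiter=",")

print(data.shape)  # (2, 3)
print(data.dtype)  # float64
```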
What is NaN in NumPy?
NaN stands for Not a Number and indicates that the underlying value cannot be represented as a number. It's similar to Python's None constant, and it is often used to represent missing values in datasets.
In our case, the NaN values appear because the first row of our CSV file contains column names, which NumPy can't convert to float64 values. The solution is simple enough: we need to remove that problematic row!
To remove the header row from our ndarray, we can use a slice, just like with a list of lists:
taxi = taxi[1:]
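To see both the problem and the fix in one place, here is a small sketch. The column names and values are hypothetical stand-ins for the real taxi file, and io.StringIO replaces the file on disk:

```python
import io
import numpy as np

# Hypothetical CSV with a header row of column names.
csv_text = "pickup_year,fare_amount\n2016,52.0\n2016,45.0\n"
taxi = np.genfromtxt(io.StringIO(csv_text), delimiter=",")

# The header row could not be converted to float64, so it is all NaN.
header_is_nan = np.isnan(taxi[0]).all()
print(header_is_nan)

# Drop the first row with a slice, as with a list of lists.
taxi = taxi[1:]
print(taxi.shape)
```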
What is an alternative way to remove the header row while loading data from a CSV file?
Alternatively, we can avoid getting NaN values by skipping the header row(s) when loading the data. We do this by passing an additional argument, skip_header=1, to our call to the numpy.genfromtxt() function. The skip_header argument accepts an integer: the number of rows from the start of the file to skip. Remember that this integer measures the total number of rows to skip and doesn't use index values. To skip the first row, use a value of 1, not 0.
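A short sketch of skip_header in use, again with made-up CSV text and io.StringIO in place of a real file:

```python
import io
import numpy as np

# Hypothetical CSV with one header row.
csv_text = "pickup_year,fare_amount\n2016,52.0\n2016,45.0\n"

# skip_header=1 skips one row from the top, so no NaN row is produced.
taxi = np.genfromtxt(io.StringIO(csv_text), delimiter=",", skip_header=1)

print(taxi.shape)  # (2, 2)
```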
What is a Boolean array?
A Boolean array is an array filled with Boolean values. We might also call them Boolean vectors or Boolean masks.
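One common way to get a Boolean array is to compare an ndarray with a value. The sketch below uses made-up numbers:

```python
import numpy as np

values = np.array([2, 4, 6, 8])

# The comparison is applied element by element and returns a Boolean array.
mask = values > 4

print(mask)        # [False False  True  True]
print(mask.dtype)  # bool
```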
How do you use a Boolean index on a 1-D array? (input shown in the image on the left)
Answer in the left image
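The card's image is not reproduced here. As a separate sketch with made-up values (not the ones from the image), Boolean indexing on a 1-D array keeps the elements where the mask is True:

```python
import numpy as np

c = np.array([80.0, 103.4, 96.9, 200.3])

# Build a Boolean mask, then pass it inside the square brackets.
c_bool = c > 100
result = c[c_bool]

print(result)  # only the values greater than 100 remain
```

The mask must have the same length as the array it indexes.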
How do you use a Boolean index on a 2-D array? (input shown in the image on the left)
Answer in the left image
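The card's image is not reproduced here. As a separate sketch with made-up values (not the ones from the image), a 1-D Boolean mask can select rows or columns of a 2-D array, as long as its length matches the dimension it indexes:

```python
import numpy as np

arr = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]])

row_mask = np.array([True, False, True])  # one value per row
col_mask = np.array([False, True, True])  # one value per column

rows = arr[row_mask]     # keeps rows 0 and 2
cols = arr[:, col_mask]  # keeps columns 1 and 2

print(rows)
print(cols)
```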