What are vectorized operations?
Vectorized operations in Python perform a computation on every element of an array, Series, or DataFrame at once, instead of looping over the elements one by one. This usually makes code faster and more readable than an explicit loop.
series_a + series_b
- Addition
series_a - series_b
- Subtraction
series_a * series_b
- Multiplication (element-wise; this is not the matrix multiplication used in linear algebra).
series_a / series_b
- Division
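The four operations above can be sketched as follows. The values in series_a and series_b are made up for illustration; they are not taken from the original cards.

```python
import pandas as pd

# Hypothetical sample data, chosen only for illustration.
series_a = pd.Series([10, 20, 30])
series_b = pd.Series([1, 2, 5])

# Each operation is applied element by element, with no explicit loop.
total = series_a + series_b       # 11, 22, 35
difference = series_a - series_b  # 9, 18, 25
product = series_a * series_b     # 10, 40, 150
quotient = series_a / series_b    # 10.0, 10.0, 6.0
```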
How to find the maximum value in a pandas Series?
Series.max()
How to find the minimum value in a pandas Series?
Series.min()
How to find the average of a pandas Series?
Series.mean()
How to find the median of a pandas Series?
Series.median()
The median is the middle value when all values are sorted in order.
How to find the mode of a pandas Series?
Series.mode()
The mode of a set of values is the value that appears most often. Series.mode() returns a Series, because several values can tie for most frequent.
How to get the sum of all the values in a pandas Series?
Series.sum()
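A small sketch of these summary methods on one Series. The name scores and its values are hypothetical, picked so the results are easy to check by hand.

```python
import pandas as pd

# Hypothetical values, chosen only for illustration.
scores = pd.Series([4, 8, 8, 10, 20])

highest = scores.max()       # 20
lowest = scores.min()        # 4
average = scores.mean()      # 10.0
middle = scores.median()     # 8.0 (middle of 4, 8, 8, 10, 20)
most_common = scores.mode()  # a Series holding 8; ties would give several values
total = scores.sum()         # 50
```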
What is the output of Series.describe()?
For a numeric Series: count, mean, std, min, 25%, 50%, 75%, and max. When printed, the result also shows its Name (i.e., the column name) and dtype.
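As a rough illustration, describe() on a small numeric Series might look like this. The Series revenues and its values are assumptions made for the example.

```python
import pandas as pd

# Hypothetical numeric Series; the name shows up when the summary is printed.
revenues = pd.Series([100, 200, 300, 400], name="revenues")

summary = revenues.describe()

# The summary is itself a Series, indexed by the statistic names.
print(summary)
print(summary["mean"])  # 250.0
```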
What is method chaining?
Method chaining is a way to combine multiple methods in a single line, where each method is called on the result of the previous one.
E.g.: countries_counts = f500["country"].value_counts()
How to extract the total number of China records in the country Series?
print(f500["country"].value_counts().loc["China"])
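To show what the chain is doing, here is the same lookup written step by step and then as one chain. The small f500 dataframe below is a hypothetical stand-in for the real dataset.

```python
import pandas as pd

# Hypothetical stand-in for the f500 dataframe.
f500 = pd.DataFrame(
    {"country": ["China", "USA", "China", "Japan", "USA", "China"]}
)

# Step by step, with intermediate variables.
countries = f500["country"]
countries_counts = countries.value_counts()
china_count = countries_counts.loc["China"]

# The same result as a single chain.
china_count_chained = f500["country"].value_counts().loc["China"]
print(china_count_chained)  # 3
```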
Which parameter tells a dataframe method the direction in which to perform a calculation?
axis
What is an example of performing a calculation on particular columns?
f500[["revenues", "profits"]].median(axis=0)
The above example gives the median value of the revenues and profits columns of the f500 dataframe.
With axis=0 or axis="index", the calculation runs down the rows and returns one result per column. With axis=1 or axis="columns", it runs across the columns and returns one result per row.
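A minimal sketch of both directions, assuming a tiny made-up f500 dataframe (the row labels and numbers are illustrative only):

```python
import pandas as pd

# Hypothetical stand-in for the f500 dataframe.
f500 = pd.DataFrame(
    {"revenues": [100, 200, 600], "profits": [10, 30, 20]},
    index=["A", "B", "C"],
)

# axis=0 (or "index"): work down the rows, one result per column.
column_medians = f500[["revenues", "profits"]].median(axis=0)
# revenues 200.0, profits 20.0

# axis=1 (or "columns"): work across the columns, one result per row.
row_sums = f500[["revenues", "profits"]].sum(axis=1)
# A 110, B 230, C 620
```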
What is the syntax to get the maximum value of every numeric column in a dataframe?
print(df.max(numeric_only=True))
For what type of columns does the describe method give statistics by default?
Numeric columns.
What is the syntax to make the describe method return statistics for non-numeric (object) columns?
df.describe(include=["O"])
The "O" is the capital letter O (for the object dtype), not the digit zero.
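A short sketch comparing the two calls. The dataframe df is hypothetical, and the country column is given the object dtype explicitly so that the example does not depend on how a particular pandas version infers text columns.

```python
import pandas as pd

# Hypothetical data; dtype=object is set explicitly for the text column.
df = pd.DataFrame(
    {
        "country": pd.Series(["China", "USA", "China"], dtype=object),
        "revenues": [100, 200, 300],
    }
)

# Default: statistics for the numeric column only.
numeric_summary = df.describe()

# include=["O"]: statistics for object columns (count, unique, top, freq).
object_summary = df.describe(include=["O"])
print(object_summary)
```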
How to assign a value to all the rows in a column of a dataframe?
df[column name] = value
top5_rank_revenue["revenues"] = 0
How to assign a value to a specific row in a column of a dataframe?
df.loc[row label, column label] = value
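Both kinds of assignment can be sketched on a small dataframe. The contents of top5_rank_revenue and the row labels ("Company A" and so on) are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for top5_rank_revenue, indexed by company name.
top5_rank_revenue = pd.DataFrame(
    {"rank": [1, 2, 3], "revenues": [500, 400, 300]},
    index=["Company A", "Company B", "Company C"],
)

# Assign one value to every row of the column.
top5_rank_revenue["revenues"] = 0

# Assign a value to a single row of that column, by row and column label.
top5_rank_revenue.loc["Company B", "revenues"] = 999

print(top5_rank_revenue)
```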
What is boolean indexing? Give an example.
Boolean indexing filters data by selecting the subset of a pandas DataFrame or Series for which a boolean condition is True.
In the example below, motor_bool is a Series of True and False values: True where the industry is "Motor Vehicles and Parts", and False otherwise. The motor_countries variable then contains the country values of only those rows whose industry is "Motor Vehicles and Parts".
motor_bool = f500.loc[:,"industry"] == "Motor Vehicles and Parts"
motor_countries = f500.loc[motor_bool,"country"]
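To see the boolean mask and the filtered result side by side, the two lines above can be run on a small made-up f500 dataframe (the rows below are assumptions for illustration):

```python
import pandas as pd

# Hypothetical stand-in for the f500 dataframe.
f500 = pd.DataFrame(
    {
        "industry": ["Motor Vehicles and Parts", "Banking", "Motor Vehicles and Parts"],
        "country": ["Japan", "China", "Germany"],
    }
)

# Boolean Series: True where the condition holds.
motor_bool = f500.loc[:, "industry"] == "Motor Vehicles and Parts"
print(motor_bool.tolist())  # [True, False, True]

# Keep only the rows where the mask is True, and only the country column.
motor_countries = f500.loc[motor_bool, "country"]
print(motor_countries.tolist())  # ['Japan', 'Germany']
```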
How to add a new column to a dataframe?
df[new column name] = value
top5_rank_revenue["year_founded"] = 0
How to find the two most common countries?
top_2_countries = f500["country"].value_counts().head(2)