A comprehensive set of vocabulary flashcards covering essential concepts related to Python programming, machine learning, and data analysis.
Data Types
Categories of values in Python, including integer (int), floating-point number (float), string (str), and Boolean (bool).
Floor Division
An operation, written '//', that divides two numbers and rounds the result down toward negative infinity, so 7 // 2 is 3 and -7 // 2 is -4.
Modulo Operator
An operator that returns the remainder of a division operation, represented by '%'.
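A small sketch of how '//' and '%' behave together, including with a negative operand. The variable names are illustrative only.

```python
# Floor division rounds toward negative infinity, not toward zero.
q_pos = 7 // 2    # 3
q_neg = -7 // 2   # -4

# In Python the remainder takes the sign of the divisor.
r_pos = 7 % 2     # 1
r_neg = -7 % 2    # 1

# The two operators are linked: (a // b) * b + (a % b) == a
a, b = -7, 2
identity_holds = (a // b) * b + (a % b) == a
```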
IDE
Integrated Development Environment, a software application that bundles the tools programmers need in one place, typically a code editor, tools to run or build code, and a debugger.
Case Sensitivity
In Python, identifiers are case-sensitive; 'Variable' and 'variable' refer to different entities.
Not Equal Operator
An operator, written '!=', that compares two values and returns True if they are not equal.
F-string
A string formatting method in Python that allows for the inclusion of variables and expressions within string literals.
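For illustration, a minimal f-string that embeds a variable, a format specifier, and an expression. The names and values are made up for the example.

```python
name = "Ada"
score = 0.91234

# {score:.2f} formats to two decimals; {score * 100:.0f} evaluates an expression first.
message = f"{name} scored {score:.2f} ({score * 100:.0f}%)"
print(message)
```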
Variable Naming Rules
Rules for valid variable names in Python: a name must start with a letter or underscore, may contain only letters, digits, and underscores, cannot be a reserved keyword, and is case-sensitive.
List Properties
Characteristics of lists in Python, including their ordered nature and ability to use negative indices for indexing.
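A quick sketch of ordered indexing with positive and negative indices, using a made-up list.

```python
colors = ["red", "green", "blue"]

first = colors[0]         # indices start at 0
last = colors[-1]         # -1 is the last element
second_last = colors[-2]  # -2 is the one before it

# Lists keep insertion order, so appending adds to the end.
colors.append("yellow")
new_last = colors[-1]
```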
Dictionary Identification
The ability to distinguish dictionary data structures in Python, which store key-value pairs.
Function Parameters vs Arguments
Parameters are variables listed in a function's definition, while arguments are the actual values passed to the function.
Area Function Example
A function that calculates the area using parameters and demonstrates the difference from arguments.
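The card does not say which shape the area function covers, so the sketch below assumes a rectangle. The names `area`, `length`, and `width` are hypothetical.

```python
def area(length, width):
    """Return the area of a rectangle.

    `length` and `width` are parameters: names in the definition.
    """
    return length * width


# 3 and 4 are arguments: the actual values passed in the call.
result = area(3, 4)

# Arguments can also be passed by keyword, matching the parameter names.
same_result = area(width=4, length=3)
```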
DataFrame Creation
The process of constructing DataFrames in Pandas, a powerful data manipulation library.
df.head()
A method in Pandas that returns the first five rows of a DataFrame by default; df.head(n) returns the first n rows.
Column Averages
The process of calculating the average of values in a column of a DataFrame.
Sorting Values
Rearranging data in a DataFrame based on specified criteria.
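The four Pandas cards above can be sketched together as follows. The column names and values are invented for the example, and it assumes pandas is installed.

```python
import pandas as pd

# DataFrame creation from a dictionary of columns (made-up data).
df = pd.DataFrame({
    "name": ["Ann", "Ben", "Cara", "Dev", "Eli", "Fay"],
    "score": [82, 91, 77, 68, 95, 88],
})

first_rows = df.head()    # first five rows by default
first_two = df.head(2)    # first two rows

# Column average.
mean_score = df["score"].mean()

# Sorting values: highest score first. Returns a new DataFrame.
ranked = df.sort_values("score", ascending=False)
```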
Linear Regression
A statistical method that models the relationship between a dependent variable and one or more independent variables by fitting a straight line (or plane) to the data.
R² (R-squared)
A statistical measure of the proportion of variance in the dependent variable that is explained by the independent variable or variables in a regression model.
RMSE (Root Mean Square Error)
A metric for the typical size of prediction errors: the square root of the mean of the squared differences between predicted and observed values.
Mean Error
The average of the signed differences between predicted values and actual values; positive and negative errors cancel, so it indicates bias rather than overall error size.
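As a rough sketch, the three regression metrics can be computed by hand in plain Python. The `actual` and `predicted` lists are made-up numbers, and R² uses the common definition 1 - SS_res / SS_tot.

```python
import math

actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.5, 5.5, 7.0, 10.0]

# Signed errors: predicted minus actual.
errors = [p - a for p, a in zip(predicted, actual)]

# Mean error: signed errors can cancel each other out.
mean_error = sum(errors) / len(errors)

# RMSE: square the errors, average them, then take the square root.
rmse = math.sqrt(sum(e ** 2 for e in errors) / len(errors))

# R-squared: 1 minus residual sum of squares over total sum of squares.
mean_actual = sum(actual) / len(actual)
ss_res = sum(e ** 2 for e in errors)
ss_tot = sum((a - mean_actual) ** 2 for a in actual)
r_squared = 1 - ss_res / ss_tot
```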
Train/Test Split Best Practices
Strategies for dividing a dataset into subsets for training and testing machine learning models.
Binary Classification
A type of classification where there are only two possible classes or outcomes.
Precision
A metric for how many positive predictions are correct: true positives divided by all predicted positives, TP / (TP + FP).
Recall
A metric for how many actual positives the model finds: true positives divided by all actual positives, TP / (TP + FN).
F1-score
The harmonic mean of precision and recall, 2 × precision × recall / (precision + recall), used as a single metric to evaluate model performance.
Confusion Matrix
A table used to evaluate the performance of a classification model by presenting true vs predicted values.
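The classification cards above can be tied together with a small hand computation. The labels are invented, class 1 is assumed to be the positive class, and the matrix layout (rows are actual, columns are predicted) is one common convention.

```python
actual    = [1, 0, 1, 1, 0, 1, 0, 0]
predicted = [1, 0, 0, 1, 1, 1, 0, 0]

pairs = list(zip(actual, predicted))
tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # true positives
tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # true negatives
fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # false positives
fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # false negatives

# Confusion matrix: rows are actual classes (0, 1), columns are predicted classes (0, 1).
confusion_matrix = [[tn, fp],
                    [fn, tp]]

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
```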
K-Means Purpose
A clustering algorithm aimed at dividing a dataset into K distinct clusters.
Elbow Method
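A technique for choosing the number of clusters: plot the within-cluster sum of squares (or, equivalently, the explained variance) against the number of clusters and pick the "elbow" where adding more clusters stops giving a large improvement.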
Centroids
The center points of clusters in K-means clustering, representing the mean of all points in each cluster.
Euclidean Distance
A method of calculating the straight-line distance between two points in Euclidean space.
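A minimal sketch of the two building blocks of K-means: computing a centroid as the mean of a cluster's points, and using Euclidean distance to find the nearest centroid. The points and centroids are made up, and `math.dist` requires Python 3.8 or later.

```python
import math

# Euclidean distance between two points.
distance = math.dist((0, 0), (3, 4))

# Centroid: the coordinate-wise mean of all points in a cluster.
cluster = [(1.0, 2.0), (3.0, 4.0), (5.0, 0.0)]
centroid = tuple(sum(coords) / len(cluster) for coords in zip(*cluster))

# Assignment step: a point belongs to the cluster with the nearest centroid.
centroids = [(0.0, 0.0), (10.0, 10.0)]
point = (2.0, 3.0)
nearest = min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))
```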
Time Series Analysis
A statistical technique used to analyze time-ordered data points for forecasting.
Additive Model
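A time series model where the components (trend, seasonality, and residual) are added together; it fits data whose seasonal swings stay roughly constant in size.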
Multiplicative Model
A time series model where the components are multiplied together; it fits data whose seasonal swings grow or shrink in proportion to the level of the trend.
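To illustrate the difference between the additive and multiplicative models, the sketch below combines a made-up trend with a seasonal component both ways. The residual component is left out for simplicity, and all numbers are hypothetical.

```python
trend = [100, 200, 300]

# Additive: the seasonal effect is a fixed amount, whatever the level.
seasonal_amounts = [10, -10, 10]
additive = [t + s for t, s in zip(trend, seasonal_amounts)]

# Multiplicative: the seasonal effect is a factor, so it scales with the level.
seasonal_factors = [1.1, 0.9, 1.1]
multiplicative = [t * s for t, s in zip(trend, seasonal_factors)]

# Size of the seasonal swing at the first and last time step under each model.
additive_swings = (additive[0] - trend[0], additive[2] - trend[2])
multiplicative_swings = (multiplicative[0] - trend[0], multiplicative[2] - trend[2])
```

In the additive series the swing is the same at both ends, while in the multiplicative series it roughly triples as the trend triples.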
Autocorrelation
The correlation of a signal with a delayed copy of itself, used in time series data.
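A hedged sketch of lag-k autocorrelation in plain Python, using one common estimator (lagged covariance divided by the overall variance of the series). The function name `autocorrelation` and the sample series are hypothetical.

```python
def autocorrelation(series, lag):
    """Correlation between a series and a copy of itself shifted by `lag` steps."""
    n = len(series)
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series)
    covariance = sum(
        (series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag)
    )
    return covariance / variance


# A series that repeats every 2 steps: adjacent values move in opposite
# directions, values 2 steps apart move together.
series = [1, 3, 1, 3, 1, 3, 1, 3]
lag_1 = autocorrelation(series, 1)  # negative
lag_2 = autocorrelation(series, 2)  # positive
```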
Four V's of Big Data
The key characteristics of big data: Volume, Velocity, Variety, and Veracity.
Moore's Law
The observation that the number of transistors on a microchip doubles approximately every two years.
General AI vs Narrow AI
General AI refers to machines with the ability to perform any cognitive task like a human, while Narrow AI refers to machines designed for specific tasks.