As engineers we design, build, and operate
Machines
Steps For a Machine
Take in input, Use a model, Produce Action
What is a model
A perceived state of the world that is improved over time.
What can we learn from the data
Structures, Parameters, Associations, Similarity
Explain Physics/Mathematical model
Mathematical/physics-based models are rigorously developed based on empirical evidence and understanding of causal relations.
Machine Learning
Derive the model from the DATA!
Unsupervised Learning
The machine learns the parameters, structure, and relationships for the model directly from the data, without labeled training data.
Supervised Learning
The machine is given training sets: "labeled" data with expected outputs, used to train the model.
Examples of Supervised Learning
Email Spam Detection, House Price Prediction, Handwritten Digit Recognition
Examples of Unsupervised Learning
Grouping Music by Genre, Organizing Photos on Your Phone, News Article Grouping
Numeric Data
Quantitative, measurable; values are numbers, e.g. 0, 42, 3.1415, 1.602x10^-19
Categorical
Qualitative, recognizable; values are restricted to the possible values in a category and can be represented by a text value or a number.
Types Of Numeric Data
Discrete (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
Continuous (1, 1.1, 1.12234, 2.23434, ..., 9.5, 9.6, 10)
Types Of Categorical Data
Ordinal: categories with a natural order (Monday, Friday)
Nominal: categories with no order (Fiat 500, Victor)
First rule of Machine learning
Do not alter the original data (not in the df).
Dataframe structure
Instance or observation, index attribute, column attribute, datum, feature (LEARN HOW TO PLACE THEM).
Feature
column attribute + column data
Feature set or Dataset
set of features covering all attributes
Dimensionality
number of attributes
Missing data options
Remove entire feature, Remove an instance/observation
Fill missing datum with some value
Types of Filling technique for missing data
previous reading, zero, min(), max(), mean(), median(). MORE(regression, KNN)
The work of filling missing data
Imputation
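A minimal sketch of a few of the filling techniques above, using plain Python lists so it runs without extra libraries. The readings are made up, and `None` is assumed to mark a missing datum. Each helper returns a new list, so the original data is left untouched.

```python
from statistics import mean, median

# Hypothetical sensor readings; None marks a missing datum.
readings = [4.0, None, 6.0, 8.0, None, 10.0]

def fill_constant(values, fill):
    """Return a new list with every missing datum replaced by `fill`."""
    return [fill if v is None else v for v in values]

def fill_previous(values):
    """Return a new list where each missing datum takes the previous reading."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)  # stays None if the series starts with a gap
    return filled

# Statistics are computed on the known values only.
known = [v for v in readings if v is not None]

zero_filled = fill_constant(readings, 0.0)
mean_filled = fill_constant(readings, mean(known))
median_filled = fill_constant(readings, median(known))
previous_filled = fill_previous(readings)
```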
What is an outlier
An observation that "lies an abnormal distance from other values in a random sample from a population".
What to do with outliers
Outlier is not part of the population: remove.
Outlier is part of the population: keep.
For categorical attributes, to operate on the values in our machine learning models.
We convert them to numeric values.
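One possible way to do the conversion, sketched in plain Python. The sample values and the weekday order are assumptions for illustration: ordinal categories are mapped to their rank, and nominal categories are one-hot encoded so that no order is implied.

```python
# Hypothetical categorical columns.
days = ["Monday", "Friday", "Wednesday", "Monday"]
cars = ["Fiat 500", "Victor", "Fiat 500"]

# Ordinal data has a natural order, so each category maps to its rank.
day_order = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"]
day_rank = {day: rank for rank, day in enumerate(day_order)}
days_encoded = [day_rank[d] for d in days]

# Nominal data has no order, so one-hot encoding gives one 0/1 column
# per category instead of a single number.
car_categories = sorted(set(cars))
cars_one_hot = [[1 if c == cat else 0 for cat in car_categories] for c in cars]
```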
Descriptive statistics
Understanding the characteristics of our data for greater insights.
What does our model predict based on
Statistical characteristics of the existing data
Inferential statistics
Making predictions based on statistical data
Population
Set of all data in an area of interest.
Sample
Subset of a population.
Measures of frequency
Count: number of data entries.
Proportion: ratio of the number of observations of an event to the total number of observations.
Occurrence percent: proportion * 100
Measures of Central Tendency
Arithmetic Mean for Sample/Population, Geometric Mean, Harmonic Mean, Median, Mode. (KNOW ALL FORMULAS)
When to use Arithmetic Mean
An average of individual data
When to use Geometric Mean
Averaging data from exponential processes: population growth, disease infection.
When to use harmonic Mean
Averaging of flows : pipeline, volumetric flow, average resistance
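A small sketch comparing the three means with Python's standard `statistics` module. All numbers are made up for illustration, and the formulas in the comments are the usual textbook definitions.

```python
from statistics import mean, geometric_mean, harmonic_mean

# Arithmetic mean: sum(x) / n, for individual data such as scores.
scores = [2, 4, 9]
average_score = mean(scores)

# Geometric mean: (x1 * x2 * ... * xn) ** (1 / n), for exponential
# processes such as yearly growth factors (+10%, +20%, +30%).
growth_factors = [1.10, 1.20, 1.30]
average_growth = geometric_mean(growth_factors)

# Harmonic mean: n / sum(1 / x), for rates such as flows when the same
# volume is moved at each rate (here 40 and 60 L/min).
flow_rates = [40, 60]
average_flow = harmonic_mean(flow_rates)
```

Applying `average_growth` three times gives the same total growth as the three original factors, which is why the geometric mean suits compounding processes.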
Measures of Dispersion
Range: difference between max and min. Variance: spread of the data around the mean. Standard deviation: sqrt(variance). (KNOW ALL FORMULAS, MIGHT BE CALCULATIONS)
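A quick check of these measures with the standard `statistics` module, on made-up data. Note the assumption about which formula applies: the population versions divide by n, the sample versions divide by n - 1.

```python
from statistics import pstdev, pvariance, stdev, variance

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical measurements, mean = 5

# Range: largest value minus smallest value.
data_range = max(data) - min(data)

# Population variance: sum((x - mean) ** 2) / n
population_variance = pvariance(data)
# Population standard deviation: sqrt(population variance)
population_std = pstdev(data)

# Sample variance: sum((x - mean) ** 2) / (n - 1)
sample_variance = variance(data)
# Sample standard deviation: sqrt(sample variance)
sample_std = stdev(data)
```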
Measures of Position
kth-percentile Rank, Quartile Rank
Data Scaling
Normalization And Standardization. Should be done in a new df such as dfscaled.
What is normalization in data processing?
Values are shifted and rescaled so that they end up ranging between 0 and 1.(KNOW THE FORMULA)
What is another name for normalization?
Min-Max scaling.
What is Standardization in data scaling ?
The values are centered around the mean with a unit standard deviation.(KNOW THE FORMULA)
What happens to the mean during Standardization ?
It becomes zero !
When is Normalization useful ?
Normalization can be useful where distribution of the data is unknown and in algorithms that do not make assumptions of distribution of the data
When is Standardization useful ?
Standardization is well suited to data that is characterized by a Normal(aka Gaussian) distribution.
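Both scalings can be sketched in a few lines of plain Python. The heights are made up, and the standardization here assumes the population standard deviation (dividing by n). Both helpers return new lists, in the spirit of keeping scaled values in a separate df.

```python
from statistics import mean, pstdev

heights = [150.0, 160.0, 170.0, 180.0, 190.0]  # hypothetical feature values

def normalize(values):
    """Min-max scaling: (x - min) / (max - min), result lies in [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Z-score scaling: (x - mean) / std, result has mean 0 and std 1."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

heights_normalized = normalize(heights)
heights_standardized = standardize(heights)
```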
What is Univariate statistics ?
One variable: mean, median, mode, variance, std deviation. ex: Variance
What is Multivariate statistics?
More than 1 variable. Focuses on the relationships between variables ex: Covariance
What is Covariance (bivariate) ?
Measure of the relation between the variation of two variables(KNOW THE FORMULA)
How do we interpret Covariance ?
cov(X,Y) > 0 positively correlated
cov(X,Y) < 0 inversely correlated
cov(X,Y) = 0 no linear relationship (X and Y are uncorrelated, but not necessarily independent). Covariance ranges from -infinity to +infinity.
What is Pearson Correlation (bivariate) ?
Measures both the strength and direction of a linear relationship (stays between -1 and 1 )
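A sketch of both measures written out by hand, assuming the sample formulas (dividing by n - 1) and made-up data. The last example shows a case where Y is fully determined by X and the covariance is still zero, because the relationship is not linear.

```python
from math import sqrt
from statistics import mean

def covariance(xs, ys):
    """Sample covariance: sum((x - mean_x) * (y - mean_y)) / (n - 1)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def pearson(xs, ys):
    """Pearson correlation: cov(X, Y) / (std_X * std_Y), always in [-1, 1]."""
    return covariance(xs, ys) / sqrt(covariance(xs, xs) * covariance(ys, ys))

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_up = [2.0, 4.0, 6.0, 8.0, 10.0]    # rises with x
y_down = [10.0, 8.0, 6.0, 4.0, 2.0]  # falls as x rises

cov_up, r_up = covariance(x, y_up), pearson(x, y_up)
cov_down, r_down = covariance(x, y_down), pearson(x, y_down)

# y = x ** 2 depends entirely on x, yet the covariance is zero.
x_sym = [-2.0, -1.0, 0.0, 1.0, 2.0]
y_sq = [v ** 2 for v in x_sym]
cov_sq = covariance(x_sym, y_sq)
```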
Is Variance Bivariate or Univariate ?
Univariate
What's a good tool to visualize correlation ?
Seaborn pair-plots
What type of learning is Clustering ?
Unsupervised Learning
What is k-Means Clustering ?
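An unsupervised algorithm that partitions the data into k clusters, each represented by the mean of its assigned points. (Go watch a YouTube video for a walkthrough.)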
In K-Means Clustering what do we alternate between ?
Assign data instances to closest mean and Reassign each mean to the average of its newly assigned points
When does K- Means clustering stop ?
When no points' assignments change.
What is K in k means clustering ?
The estimated number of clusters. Each cluster is represented by a mean placed at an estimated cluster center.
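The alternating steps can be sketched in plain Python for one-dimensional data. The points and starting means are made up; real use would typically rely on a library and several random starts. Here k is simply the number of starting means.

```python
def k_means_1d(points, initial_means, max_iter=100):
    """Alternate assignment and update steps until no assignment changes."""
    means = list(initial_means)  # copy so the caller's list is not modified
    assignments = None
    for _ in range(max_iter):
        # Assignment step: index of the closest mean for every point.
        new_assignments = [
            min(range(len(means)), key=lambda j: abs(p - means[j]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # no point changed cluster, so we stop
        assignments = new_assignments
        # Update step: move each mean to the average of its assigned points.
        for j in range(len(means)):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:  # an empty cluster keeps its previous mean
                means[j] = sum(members) / len(members)
    return means, assignments

points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centers, labels = k_means_1d(points, initial_means=[0.0, 5.0])  # k = 2
```

With these starting means the point 3.0 is first assigned to the second cluster, then moves to the first one after the means are updated, and the loop stops once the assignments repeat.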
What are other clustering algorithms ?
(KNOW THE TABLE )
What is the elbow method ?
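The Elbow Method is a technique used in unsupervised learning, especially in K-Means clustering, to determine the optimal number of clusters (K). Plot the within-cluster sum of squared distances against K and pick the K where the decrease levels off (the "elbow").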
PROBABILITY
DID NOT DO CARDS NEED DEEP UNDERSTANDING
What is the main goal of linear regression?
Minimize the estimation error, in order to predict new values that follow the previously found trend.
What is Extrapolation / Interpolation ?
Interpolation:
Estimating values within the range of known data points.
Extrapolation:
Estimating values outside the range of known data points.
What is Dependent variable ?
The variable you measure or try to predict.
What is Independent Variable ?
The variable you control, manipulate, or use to make predictions.
What are the causes of error in regression ?
Hidden features (y depends on more than x and the y-intercept, but our model does not include those features), observational error, statistical variation, physical noise.
What is parametric Machine Learning ?
Using training data to learn a fixed set of model parameters, as in linear regression.
How to do Linear regression on paper ?
(LEARN AN EXAMPLE)
What do we change when we fit for minimal error in linear regression?
We find the coefficients that minimize the sum of the squares of the residuals.
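For a single feature, the minimizing coefficients have a closed form that can be worked out by hand. The sketch below follows those steps in plain Python on made-up data; the helper name `fit_line` is just for illustration.

```python
def fit_line(xs, ys):
    """Least-squares line y = slope * x + intercept for one feature."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # Sxy: sum of products of deviations; Sxx: sum of squared x deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.2, 4.1, 6.2, 7.9, 10.1]  # hypothetical noisy measurements

slope, intercept = fit_line(xs, ys)

# Predict a new value that follows the fitted trend (extrapolation at x = 6).
prediction = slope * 6.0 + intercept
```

On paper the same data gives mean_x = 3, mean_y = 6.1, Sxy = 19.6 and Sxx = 10, so the slope is 1.96 and the intercept is 0.22.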
What if linear regression is not ideal for our data ?
Use a non linear model or a higher degree linear model.
Use a piecewise-linear fit(make bins and apply regression on them).
What leads to measurement or observational error ?
Limited accuracy in instruments,
Faulty sensors,
Recording errors,
Noise & stochastic variation
What is a higher degree linear model ?
A polynomial with higher degree.
What is Mean squared Error ?
A metric for regression error. (KNOW THE FORMULA)
Definition of Residual ?
Deviation of the observed value from the predicted value of the measured quantity.
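A short sketch of both definitions, assuming the usual formula MSE = (1/n) * sum((y - y_hat)^2) and made-up observed and predicted values.

```python
def residuals(observed, predicted):
    """Residual for each instance: observed value minus predicted value."""
    return [y - y_hat for y, y_hat in zip(observed, predicted)]

def mean_squared_error(observed, predicted):
    """Average of the squared residuals."""
    res = residuals(observed, predicted)
    return sum(r ** 2 for r in res) / len(res)

observed = [3.0, 5.0, 7.5, 9.0]
predicted = [2.5, 5.0, 7.0, 10.0]

res = residuals(observed, predicted)
mse = mean_squared_error(observed, predicted)
```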
What kind of Model Errors can we get ?
Regression errors: residuals on the training data, Prediction errors: residuals on test data.
What is Generalization ?
How well a model predicts on data it has not been trained on.
Can you explain Bias Variance Tradeoff ?
Increasing model complexity makes the model more sensitive to small changes in the dataset, which lead to large changes in the parameters: variance increases and bias decreases.
Define Underfitting?
Bias is too high: the model does not correctly approximate the data. We need to increase the complexity, which also increases the variance.
Define Overfitting ?
Variance is too high: the model is very sensitive to small changes in the dataset, which causes major errors in the fit and in predictions.
Describe the bias variance tradeoff graph ?
(PUT THIS IN CHEAT SHEET)
What is regularization for ?
Prevent overfitting.
What is regularization using Ridge regression ?
Penalizes large weights (of features) by adding to the cost function (the MSE) a fraction of the square of each weight. (Impacts all weights.)
What is regularization using Lasso regression ?
Drives the least important weights (of features) to zero. (It can make them exactly 0.)
When to use Ridge regression ?
When all features are expected to matter.
When to use Lasso regression ?
When only some features matter.
What hyperparameter can we change for Lasso and Ridge ?
Alpha hyperparameter controls how aggressively the cost function is modified by the regularization penalty
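To see the effect of alpha, here is a sketch for the simplest possible case: one centered feature and no intercept, where both penalties have a closed-form solution. The cost functions in the docstrings are assumptions chosen for this illustration (libraries may scale the terms differently), and the data is made up.

```python
def deviation_sums(xs, ys):
    """Return Sxy = sum(x * y) and Sxx = sum(x * x) for centered data."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy, sxx

def ols_weight(xs, ys):
    """No penalty: minimizes sum((y - w * x) ** 2)."""
    sxy, sxx = deviation_sums(xs, ys)
    return sxy / sxx

def ridge_weight(xs, ys, alpha):
    """Ridge: minimizes sum((y - w * x) ** 2) + alpha * w ** 2."""
    sxy, sxx = deviation_sums(xs, ys)
    return sxy / (sxx + alpha)  # shrinks the weight but never reaches 0

def lasso_weight(xs, ys, alpha):
    """Lasso: minimizes 0.5 * sum((y - w * x) ** 2) + alpha * abs(w)."""
    sxy, sxx = deviation_sums(xs, ys)
    if abs(sxy) <= alpha:
        return 0.0  # the penalty outweighs the feature: weight is exactly 0
    shrunk = sxy - alpha if sxy > 0 else sxy + alpha
    return shrunk / sxx

x = [-2.0, -1.0, 0.0, 1.0, 2.0]   # already centered (mean 0)
y = [-4.1, -1.9, 0.0, 2.1, 3.9]   # roughly y = 2 * x

w_ols = ols_weight(x, y)
w_ridge = ridge_weight(x, y, alpha=10.0)
w_lasso_small = lasso_weight(x, y, alpha=5.0)
w_lasso_large = lasso_weight(x, y, alpha=25.0)
```

Ridge shrinks the weight smoothly as alpha grows, while Lasso subtracts a fixed amount and clips the weight to zero once alpha is large enough.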
What is classification ?
Given features X, predict label (class) y
Explain K-nearest neighbors classification ?
Assign class based on the majority vote of the k-closest neighbors.(WATCH YOUTUBE VIDEO)
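A minimal sketch of the majority vote in plain Python, assuming Euclidean distance and made-up two-dimensional points. Note that there is no training step: the classifier just stores the data and does all the work at prediction time.

```python
from collections import Counter
from math import dist

def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points closest to query."""
    # Sort every training point by its Euclidean distance to the query.
    by_distance = sorted(
        zip(train_points, train_labels),
        key=lambda pair: dist(pair[0], query),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

train_points = [(1, 1), (1, 2), (2, 1), (5, 5), (5, 6), (6, 5)]
train_labels = ["red", "red", "red", "green", "green", "green"]

near_reds = knn_predict(train_points, train_labels, query=(1.5, 1.5), k=3)
near_greens = knn_predict(train_points, train_labels, query=(5.0, 4.5), k=3)
```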
KNN is a non-parametric classifier ?
True
What does KNN classify from ?
Classification from similarity in features geometry.
What are the tradeoffs when choosing K ?
Small k uses only the most relevant neighbors but is sensitive to noise; large k gives smoother decision functions.
When to use KNN ?
Not too many dimensions , lots of training data.
What are the advantages and disadvantages of KNN ?
Advantages:
Very fast at training
Learn complex functions
Disadvantages:
Slow at predicting on new data
Irrelevant features can confuse the classifier
What's the problem with Accuracy for KNN ?
Accuracy is not well suited to imbalanced classes. If we have many more reds than greens and predict all red, we will get high accuracy regardless.
Understand True/False/Positive/Negative ?
-True Positive: actual class = Positive; predicted class = Positive
-True Negative: actual class = Negative; predicted class = Negative
-False Positive: actual class = Negative; predicted class = Positive
-False Negative: actual class = Positive; predicted class = Negative
What is Precision ?
TP / (TP + FP) (Know by heart)
What is Recall ?
TP/ (TP + FN) (Know by heart)
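Both formulas can be checked on a small made-up example, with 1 as the positive class. Returning 0.0 when a denominator is zero is a convention assumed here, not part of the definitions.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count TP, FP, FN, TN for a binary problem."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1
        elif p == positive:
            fp += 1  # predicted positive, actually negative
        elif a == positive:
            fn += 1  # predicted negative, actually positive
        else:
            tn += 1
    return tp, fp, fn, tn

actual =    [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

tp, fp, fn, tn = confusion_counts(actual, predicted)

precision = tp / (tp + fp) if (tp + fp) else 0.0  # TP / (TP + FP)
recall = tp / (tp + fn) if (tp + fn) else 0.0     # TP / (TP + FN)
accuracy = (tp + tn) / len(actual)
```

Here accuracy is 0.8 even though a third of the positives were missed, which is why precision and recall are worth reporting alongside it.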
Situations when we want high recall ?
Cancer Detection, Credit Card Fraud Detection
-Better off making false positives than false negatives.
Situations when we want high Precision ?
Fake News Detection, Spam Detection
-You would rather miss some positives than flag some for no reason.
Know the Structure of the Confusion Matrix ?
(MAYBE GO INTO SHEET)
What is feature selection ?
Decide which features to use in training.
What can provide feedback on feature importance ?
Random Forest