Pre-processing
Handling data: coercing variables to the correct types, filtering noise, feature selection, and normalization.
Cross-Validation
Evaluates the model on data held out from training, to check that it generalizes well to unseen data.
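A minimal sketch of how k-fold cross-validation could split the data, assuming plain Python; the helper name `k_fold_indices` is hypothetical, and fitting and scoring a model on each fold is left out.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_samples))
    # Spread samples as evenly as possible: the first n_samples % k folds get one extra.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

# Each sample lands in the test set exactly once across the 5 folds.
folds = list(k_fold_indices(10, 5))
```

In practice the indices are usually shuffled first; that step is omitted here to keep the split easy to read.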
input X, input Y
f(x) → y that generalizes to new data
Classification
Categorizing objects or ideas.
A new observation is assigned a category using patterns learned from labeled data.
Binary classification
Two possible categories (e.g., spam vs. not spam)
Multi-Class Classification
More than two categories, but one label per instance
Multi-Label Classification
One sample can belong to multiple classes (e.g., movie genres)
Image Segmentation
Pixel-based classification that assigns a label to every pixel in an image, allowing for the identification of objects and boundaries.
Sequential Data Classification
Includes spatial/temporal data (e.g., speech recognition)
Linear Classification
Linear decision boundaries
Non-linear Classification
Requires complex decision boundaries.
Kernel trick
Maps data that is not linearly separable into a higher-dimensional space, where a linear decision boundary may separate it.
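A small sketch of the idea, assuming a degree-2 polynomial kernel on 2-D inputs (the functions `poly_kernel` and `feature_map` are illustrative names, not from any library). It shows that the kernel, computed in the original space, gives the same value as an inner product in the higher-dimensional space, so the mapping never has to be carried out explicitly.

```python
import math

def poly_kernel(x, z):
    """Degree-2 polynomial kernel k(x, z) = (x . z)^2 for 2-D inputs."""
    return (x[0] * z[0] + x[1] * z[1]) ** 2

def feature_map(x):
    """Explicit 3-D feature map whose inner product equals poly_kernel."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

x, z = (1.0, 2.0), (3.0, 4.0)
kernel_value = poly_kernel(x, z)                      # computed in the original 2-D space
explicit_value = dot(feature_map(x), feature_map(z))  # computed in the mapped 3-D space
```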
Overfitting vs. Underfitting
Overfitting: Model memorizes the training data but fails to generalize to new data.
Underfitting: Model is too simple and fails to capture the patterns in the data.
Clustering
The process of grouping similar data points together into clusters, based on their characteristics.
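One way to make this concrete is k-means, chosen here only for illustration (the notes do not name a specific clustering method). The sketch below assumes 1-D points and hand-picked starting centroids; `k_means_1d` is a hypothetical helper.

```python
def k_means_1d(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and re-averaging."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        centroids = [sum(c) / len(c) if c else old for c, old in zip(clusters, centroids)]
    return centroids, clusters

centroids, clusters = k_means_1d([1.0, 1.5, 2.0, 9.0, 10.0, 11.0], [0.0, 5.0])
```

The low values and the high values end up in separate clusters, with each centroid at the mean of its group.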
What is a Tensor
An array that can have more than two axes; for example, a three-axis tensor needs three indices to identify an element.
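A quick illustration with nested Python lists, assuming a 2 x 3 x 4 shape picked for the example; array libraries store tensors more efficiently, but the indexing idea is the same.

```python
# A 2 x 3 x 4 tensor as nested lists: 2 blocks, each with 3 rows of 4 values.
# Each value encodes its own position as 100*i + 10*j + k.
tensor = [[[100 * i + 10 * j + k for k in range(4)] for j in range(3)] for i in range(2)]

element = tensor[1][2][3]  # three indices pick out a single element
shape = (len(tensor), len(tensor[0]), len(tensor[0][0]))
```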
Features
An individual measurable property of a phenomenon being observed.
Nominal vs. Ordinal
Nominal = Has two or more categories with no intrinsic order
Ordinal = Has two or more categories that can be ordered or ranked
Lazy learning
Does not build a model in advance; computation is deferred until a test example is given.
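Nearest-neighbor classification is a common example of lazy learning. The sketch below is a 1-nearest-neighbor classifier under that assumption; the function name and the toy data are made up for illustration.

```python
import math

def nearest_neighbor_predict(train_points, train_labels, query):
    """Classify query by the label of its closest training point (1-NN)."""
    # Nothing is fitted beforehand: all distance work happens at query time.
    distances = [math.dist(p, query) for p in train_points]
    return train_labels[distances.index(min(distances))]

train_points = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
train_labels = ["a", "a", "b", "b"]
prediction = nearest_neighbor_predict(train_points, train_labels, (4.0, 4.5))
```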
Voronoi Diagram
Partitions space into regions, each containing the locations that are nearest to one point of a given set of data points.
Standardization
Rescaling the data so the mean is zero and the standard deviation is one.
The resulting range is not fixed; the upper limit varies.
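A minimal sketch of standardization (z-scores), assuming the population standard deviation is used and that the values are not all identical; `standardize` is a hypothetical helper.

```python
import statistics

def standardize(values):
    """Rescale values to mean 0 and standard deviation 1."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation (an assumption)
    return [(v - mean) / std for v in values]

# This sample has mean 5 and population standard deviation 2.
z = standardize([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
```

The largest z-score here is 2.0, but a more extreme input would produce a larger one, which is why the range has no fixed upper limit.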
Proximity refers to a similarity or dissimilarity.
Min-Max Scaling
Rescales the data to a fixed range, typically between 0 and 1.
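A short sketch of min-max scaling to the range 0 to 1, assuming the values are not all equal (otherwise the denominator would be zero); `min_max_scale` is a hypothetical helper.

```python
def min_max_scale(values):
    """Map the smallest value to 0, the largest to 1, and the rest linearly in between."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

scaled = min_max_scale([10.0, 15.0, 20.0, 30.0])
```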
Confusion matrix
A table of predicted versus actual classes that shows the performance of a classification algorithm.
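A small sketch of building a confusion matrix by counting, with rows as actual classes and columns as predicted classes (that orientation is an assumption; some texts use the transpose). The labels and data are made up for illustration.

```python
def confusion_matrix(actual, predicted, labels):
    """counts[i][j] = number of samples with actual labels[i] predicted as labels[j]."""
    index = {label: i for i, label in enumerate(labels)}
    counts = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        counts[index[a]][index[p]] += 1
    return counts

actual = ["spam", "spam", "ham", "ham", "ham", "spam"]
predicted = ["spam", "ham", "ham", "spam", "ham", "spam"]
matrix = confusion_matrix(actual, predicted, ["spam", "ham"])

# Correct predictions sit on the diagonal.
accuracy = (matrix[0][0] + matrix[1][1]) / len(actual)
```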
Bootstrapping
Resampling with replacement; used here to amplify the minority-class samples so that the classes are equally distributed.
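A sketch of balancing classes by resampling the smaller class with replacement, assuming a fixed random seed for repeatability; `oversample_minority` and the toy labels are hypothetical.

```python
import random

def oversample_minority(samples, labels, seed=0):
    """Resample smaller classes with replacement until every class matches the largest."""
    rng = random.Random(seed)
    by_class = {}
    for s, y in zip(samples, labels):
        by_class.setdefault(y, []).append(s)
    target = max(len(group) for group in by_class.values())
    out_samples, out_labels = [], []
    for y, group in by_class.items():
        # Extra draws come only from the class's own original samples.
        extra = rng.choices(group, k=target - len(group))
        for s in group + extra:
            out_samples.append(s)
            out_labels.append(y)
    return out_samples, out_labels

samples = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
labels = ["major"] * 8 + ["minor"] * 2
balanced_samples, balanced_labels = oversample_minority(samples, labels)
```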
Correlation
The linear association between two variables
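A minimal sketch of the Pearson correlation coefficient, one common measure of linear association; it assumes neither variable is constant. The helper name `pearson_r` is illustrative.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

xs = [1.0, 2.0, 3.0, 4.0]
r_up = pearson_r(xs, [3.0, 5.0, 7.0, 9.0])        # y = 2x + 1, perfectly increasing
r_down = pearson_r(xs, [-1.0, -2.0, -3.0, -4.0])  # y = -x, perfectly decreasing
```

Values near 1 or -1 indicate a strong linear relationship; values near 0 indicate little linear relationship.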
What is the Happiness formula
Level of gratitude + definition of happiness + contribute + your personal success. /6
Ridge Regression
Shrinks the coefficients by adding an L2 penalty, making the model less sensitive to noise in the training data.
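A sketch of the shrinking effect in the simplest case, assuming a single feature and no intercept, where the ridge estimate has a closed form. The penalty strength `lam` and the helper `ridge_slope` are names chosen for this example.

```python
def ridge_slope(xs, ys, lam):
    """Closed-form ridge estimate for y ~ w * x (one feature, no intercept)."""
    # Minimizes sum((y - w*x)^2) + lam * w^2; lam = 0 gives ordinary least squares.
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

xs, ys = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]
w_ols = ridge_slope(xs, ys, lam=0.0)     # 28 / 14
w_ridge = ridge_slope(xs, ys, lam=14.0)  # 28 / 28
```

A larger penalty pulls the coefficient closer to zero.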
Lagrange Multiplier
A strategy for finding the local maxima or minima of a function subject to equality constraints.
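A short worked example, with the function and constraint chosen only for illustration: maximize the product of two numbers whose sum is fixed at 10.

```latex
\begin{aligned}
&\text{maximize } f(x, y) = xy \quad \text{subject to } g(x, y) = x + y - 10 = 0 \\
&\mathcal{L}(x, y, \lambda) = xy - \lambda (x + y - 10) \\
&\frac{\partial \mathcal{L}}{\partial x} = y - \lambda = 0, \qquad
 \frac{\partial \mathcal{L}}{\partial y} = x - \lambda = 0, \qquad
 \frac{\partial \mathcal{L}}{\partial \lambda} = -(x + y - 10) = 0 \\
&\Rightarrow x = y = \lambda = 5, \qquad f(5, 5) = 25
\end{aligned}
```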
Global optimization
Refers to finding the optimal value of a given function among all possible solutions
Local optimization
Finds the optimal value within a neighboring set of candidate solutions
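A sketch of the difference, assuming gradient descent as the local optimizer and a hand-picked function with two minima. The function, learning rate, and starting points are illustrative choices.

```python
def f(x):
    """A function with two minima: a deeper one on the left, a shallower one on the right."""
    return x ** 4 - 4 * x ** 2 + x

def f_prime(x):
    return 4 * x ** 3 - 8 * x + 1

def gradient_descent(x, learning_rate=0.01, steps=1000):
    """Repeatedly step downhill from the starting point x."""
    for _ in range(steps):
        x -= learning_rate * f_prime(x)
    return x

x_right = gradient_descent(2.0)   # settles in the nearby (local) minimum
x_left = gradient_descent(-2.0)   # settles in the deeper (global) minimum
```

Both runs stop where the slope is zero, but only the run started on the left reaches the lowest value of the function.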
Gradient
The derivative of a function: the slope of the tangent line at a given point
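A small sketch of estimating that slope numerically, assuming a central-difference approximation with a small step `h`; both helper names are illustrative.

```python
def numerical_derivative(f, x, h=1e-5):
    """Approximate the slope of f at x from two nearby points."""
    return (f(x + h) - f(x - h)) / (2 * h)

def square(x):
    return x * x

# The exact derivative of x^2 is 2x, so the slope at x = 3 should be close to 6.
slope_at_3 = numerical_derivative(square, 3.0)
```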