Generalization
the model’s ability to adapt to previously unseen data
label
-target variable; attribute you are trying to represent
-represented by y
CODE: { y = df['colName'] }
feature
-input for that prediction
-often grouped together as a vector; represented by X
CODE: { X = df[feature_list] } { X = df.drop(columns='y_colName') }
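A minimal sketch of splitting a table into label and features, assuming pandas is available. The DataFrame and its column names (`sqft`, `bedrooms`, `price`) are made up for illustration and are not from the original notes.

```python
import pandas as pd

# Hypothetical toy data; column names are assumptions for illustration only.
df = pd.DataFrame({
    "sqft": [850, 1200, 1500],
    "bedrooms": [2, 3, 4],
    "price": [200000, 280000, 350000],
})

# Label: the single column we want to predict.
y = df["price"]

# Features, option 1: select the input columns by name.
feature_list = ["sqft", "bedrooms"]
X = df[feature_list]

# Features, option 2: drop the label column and keep everything else.
X_alt = df.drop(columns="price")
```

Both options give the same feature table here; dropping the label is convenient when there are many feature columns.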
supervised learning
-learns from labeled examples, i.e. prior knowledge of the correct answers
-attempts to discover the relationship between features and an associated label for the purpose of future prediction.
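One way to picture supervised learning is fitting a straight line to labeled examples and then predicting the label for a new input. The sketch below uses plain Python and ordinary least squares; the data and the helper names `fit_line` and `predict` are assumptions chosen for illustration.

```python
# Labeled training examples: every feature value x has a known label y.
# Made-up data that happens to follow y = 2x + 1.
x_train = [1.0, 2.0, 3.0, 4.0]
y_train = [3.0, 5.0, 7.0, 9.0]

def fit_line(xs, ys):
    """Learn a slope and intercept with ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    slope = numerator / denominator
    intercept = mean_y - slope * mean_x
    return slope, intercept

# "Training": discover the relationship between feature and label.
slope, intercept = fit_line(x_train, y_train)

def predict(x):
    """Apply the learned relationship to an input the model has not seen."""
    return slope * x + intercept

prediction = predict(10.0)
```

Because the label here is a real number, this is also a small instance of regression.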
unsupervised learning
discovering patterns in data without the use of training data containing labeled examples; aka on its own!
first step in ML pipeline
regression
the label is any real-valued number
classification
the label is a categorical variable
clustering
unsupervised learning technique
groups subsets of data that are similar to each other based on their feature values
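Clustering can be sketched with a tiny k-means on one-dimensional data. This is a simplified illustration in plain Python, not a production implementation; the function name `kmeans_1d`, the data, and the starting centers are all assumptions.

```python
def kmeans_1d(values, centers, n_iter=10):
    """Tiny k-means for 1-D data.

    Repeats two steps: assign each value to its nearest center,
    then move each center to the mean of its assigned values.
    """
    for _ in range(n_iter):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Keep the old center if a group ends up empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

# Unlabeled data: no y column, only feature values.
values = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]
centers, groups = kmeans_1d(values, centers=[0.0, 5.0])
```

No labels are used anywhere: the two groups emerge only from how close the feature values are to each other.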
The ML Process
Business Understanding
defining the business objectives
what does the business need?
example: ____
data understanding and preparation
transforming raw data into a form suitable for modeling
what data do we have? is the data clean?
how do we prepare the data for our model?
example: ______
modeling
what techniques should we apply to the model?
evaluation
training a model, confirming its performance on unseen data, and performing additional analysis
is our model best suited for our problem?
example:
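Checking performance on unseen data can be sketched by holding out part of the labeled data and scoring predictions on it. The sketch below uses plain Python with a 1-nearest-neighbor rule; the data, the 6/2 split, and the `predict` helper are assumptions for illustration only.

```python
# Made-up labeled data: (hours_studied, outcome) pairs.
data = [(1.0, "fail"), (2.0, "fail"), (3.0, "fail"), (7.0, "pass"),
        (8.0, "pass"), (9.0, "pass"), (2.4, "fail"), (7.6, "pass")]

# Hold out the last two examples; the model never sees them while training.
train, test = data[:6], data[6:]

def predict(x, training_examples):
    """1-nearest-neighbor: return the label of the closest training point."""
    _, nearest_label = min(training_examples, key=lambda ex: abs(ex[0] - x))
    return nearest_label

# Accuracy on the held-out set estimates how well the model generalizes.
correct = sum(predict(x, train) == label for x, label in test)
accuracy = correct / len(test)
```

In practice the split is usually randomized and larger; a fixed split is used here only to keep the sketch deterministic.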
deployment
preparing to move a model into production after evaluating it
how do we make our model available to stakeholders and other users?
example: