L20: ML Operations

Description and Tags

Flashcards for reviewing ML operations concepts from lecture 20.


9 Terms

1

What is Concept Drift?

When changes in the underlying data cause your model to perform worse over time. Analogous to a ship's course drifting off due to unseen currents: the model, once accurate, degrades as the data it relies on changes. This drift can be sudden or gradual, requiring continuous monitoring and adaptation.
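A minimal sketch of how drift might be caught in practice: compare the model's rolling accuracy on labeled production data against its baseline validation accuracy. The baseline value, window size, and alert threshold below are illustrative assumptions, not values from the lecture.

```python
from collections import deque

# Illustrative drift-monitoring parameters (assumptions, not prescribed values).
BASELINE_ACCURACY = 0.92   # accuracy measured at validation time
WINDOW = 500               # number of recent labeled predictions to track
ALERT_DROP = 0.05          # alert if rolling accuracy falls this far below baseline

recent_hits = deque(maxlen=WINDOW)

def record_outcome(prediction, true_label):
    """Record whether a production prediction was correct and check for drift."""
    recent_hits.append(prediction == true_label)
    if len(recent_hits) == WINDOW:
        rolling_accuracy = sum(recent_hits) / WINDOW
        if rolling_accuracy < BASELINE_ACCURACY - ALERT_DROP:
            print(f"Possible concept drift: rolling accuracy {rolling_accuracy:.3f} "
                  f"vs baseline {BASELINE_ACCURACY:.3f}")
```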

2

What are the two approaches to model training?

The two primary approaches to model training are Static (offline) training and Dynamic (online) training. Static training involves training a model once on a fixed dataset, deploying it, and using it without further updates for a certain period. This approach is suitable when the data distribution is stable and concept drift is minimal. Dynamic (online) training involves continuously retraining models as new data becomes available, and serving the most recent model. This approach is beneficial when the data distribution is dynamic and concept drift is significant.

3

Describe Static (offline) training.

Train a single model once, deploy it, and keep using it for a while. Like launching a satellite with predefined programming, it operates until its data becomes obsolete. Models can be thoroughly tested and deployment only needs to happen once, but the model doesn't adapt to new information.
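A minimal sketch of the static approach using scikit-learn (the dataset, model choice, and artifact filename are placeholders): train once on a fixed snapshot, serialize the model, and serve that artifact unchanged until it is explicitly replaced.

```python
import joblib
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Train once on a fixed snapshot of data.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)
print("Held-out accuracy:", model.score(X_test, y_test))

# Serialize the artifact; it is served as-is and not updated until explicitly replaced.
joblib.dump(model, "model_v1.joblib")
```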

4

Describe Dynamic (online) training.

(Re-)train models continuously as new data comes in, and serve the most recent one. Similar to a self-adjusting thermostat, it constantly calibrates to current conditions. The model is always up to date with shifts and fluctuations in the data, ensuring relevance but demanding more resources.
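A minimal sketch of the dynamic approach using incremental learning (the batch source is simulated random data, purely for illustration): each incoming batch updates the model, and the updated model is immediately the one being served.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

# Online-training sketch: the model is updated incrementally as new batches arrive.
model = SGDClassifier()
classes = np.array([0, 1])  # all classes must be declared for the first partial_fit

def on_new_batch(X_batch, y_batch):
    """Called whenever a fresh batch of labeled data arrives; the updated
    model is immediately the one being served."""
    model.partial_fit(X_batch, y_batch, classes=classes)

# Simulated stream of incoming batches (random data, for illustration only).
rng = np.random.default_rng(0)
for _ in range(3):
    X_batch = rng.normal(size=(32, 4))
    y_batch = rng.integers(0, 2, size=32)
    on_new_batch(X_batch, y_batch)
```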

5

What is MLOps?

Machine Learning Operations: it adapts Continuous Integration (CI) and Continuous Delivery (CD) from DevOps to include data processing, and adds Continuous Training (CT). Like bridging construction and city planning, it integrates coding, testing, and deployment into streamlined ML workflows.

6

What is level 0 of MLOps?

All processes are done manually: data preparation, model training, validation, etc. are performed by hand. Like crafting a sculpture by hand, each step is deliberate but time-consuming and not easily repeatable.

7

What is level 1 of MLOps?

Pipeline automation: set up automated processes for data preparation and model training. Models are automatically retrained when new data is available, and standardized performance tests are run. Think of an automated assembly line, but for creating and managing models.
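A minimal sketch of what such an automated retraining trigger could look like (the function names and the accuracy threshold are illustrative, not a real framework API): new data triggers the same repeatable sequence of preparation, training, and standardized validation.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

MIN_ACCURACY = 0.85  # standardized performance gate (illustrative threshold)

def run_pipeline(X, y):
    """Automated pipeline: prepare data, train, and validate in one repeatable step."""
    X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)
    scaler = StandardScaler().fit(X_train)                                    # data preparation
    model = LogisticRegression(max_iter=1000).fit(scaler.transform(X_train), y_train)
    accuracy = accuracy_score(y_val, model.predict(scaler.transform(X_val)))  # standardized test
    if accuracy < MIN_ACCURACY:
        raise RuntimeError(f"Candidate model failed validation: {accuracy:.3f}")
    return scaler, model

def on_new_data_available(X, y):
    """Trigger: whenever new data lands, the whole pipeline reruns automatically."""
    return run_pipeline(X, y)
```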

8

What is level 2 of MLOps?

Add CI/CD pipeline automation: allows new model developments to be integrated quickly. Enhances level 1 with automatic model delivery and version control. Requires more supporting infrastructure: integration testing, additional model metrics, a model metadata store, and advanced pipeline orchestration.
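One concrete piece of that support structure is an automated integration check that gates model promotion. A minimal sketch, assuming a serialized candidate artifact and a production accuracy figure retrieved from the metadata store (the path and value below are illustrative assumptions):

```python
import joblib
import numpy as np

# Integration-check sketch for a CI pipeline: the candidate artifact must load,
# produce predictions of the right shape, and beat the current production model.
CANDIDATE_PATH = "artifacts/candidate.joblib"   # illustrative artifact path
PRODUCTION_ACCURACY = 0.90                      # assumed figure from the metadata store

def check_candidate_model(holdout_X, holdout_y):
    model = joblib.load(CANDIDATE_PATH)
    predictions = model.predict(holdout_X)
    assert predictions.shape == holdout_y.shape, "prediction shape mismatch"
    accuracy = float(np.mean(predictions == holdout_y))
    assert accuracy >= PRODUCTION_ACCURACY, (
        f"candidate ({accuracy:.3f}) does not beat production ({PRODUCTION_ACCURACY:.3f})"
    )
```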

9

What are some performance measures/Metrics to evaluate in ML engineering?

When evaluating machine learning engineering performance, several key metrics should be considered (a brief measurement sketch follows the list):

  1. Quality: This refers to the accuracy and reliability of the model's predictions. Metrics include precision, recall, F1-score, AUC-ROC, and error rates, which measure how well the model performs on different types of data and tasks.

  2. Latency and Throughput: Latency measures the time it takes for the model to generate a prediction, while throughput measures the number of predictions the model can make in a given time period. Low latency and high throughput are essential for real-time applications and high-volume data processing.

  3. Development and Maintenance Time: This encompasses the time and resources required to develop, deploy, and maintain the model. Efficient development processes, code quality, and documentation can reduce development and maintenance time.

  4. Usage Cost: This includes the costs associated with infrastructure, compute resources, data storage, and model deployment. Optimizing resource utilization and leveraging cost-effective cloud services can help minimize usage costs.

  5. Compliance: Ensure the model and its development process adhere to relevant regulations, industry standards, and ethical guidelines. This includes data privacy, security, and fairness considerations.
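A minimal sketch of measuring the first two categories (quality metrics and latency/throughput) for a trained model; the model and dataset below are placeholders used only to make the snippet runnable.

```python
import time
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score, f1_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Placeholder model and data, purely for illustration.
X, y = make_classification(n_samples=5000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# 1. Quality: precision, recall, F1, and AUC-ROC on held-out data.
y_pred = model.predict(X_test)
y_score = model.predict_proba(X_test)[:, 1]
print("precision:", precision_score(y_test, y_pred))
print("recall:   ", recall_score(y_test, y_pred))
print("F1:       ", f1_score(y_test, y_pred))
print("AUC-ROC:  ", roc_auc_score(y_test, y_score))

# 2. Latency and throughput: time a single prediction and a batched prediction.
start = time.perf_counter()
model.predict(X_test[:1])
print("single-prediction latency (s):", time.perf_counter() - start)

start = time.perf_counter()
model.predict(X_test)
elapsed = time.perf_counter() - start
print("throughput (predictions/s):", len(X_test) / elapsed)
```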