Overview of the ML Lifecycle and Deployment
0. Learning Objectives:
Identify the key components of the ML Lifecycle.
Define “concept drift” as it relates to ML projects.
Differentiate between shadow, canary, and blue-green deployment scenarios in the context of varying degrees of automation.
Compare and contrast the ML modeling iterative cycle with the cycle for deployment of ML products.
List the typical metrics you might track to monitor concept drift.
1. ML project lifecycle

1.1 Case Study: Speech recognition
Scoping:
Decide to work on speech recognition for voice search
Decide on key metrics
Accuracy, latency, throughput
Estimate resources and timeline
Data
Is the data labeled consistently?
How much silence before/after each clip?
How to perform volume normalization? (see the sketch after this outline)
Modeling
Code (algorithm/model)
Hyperparameters
Data -> improving data quality is often the lowest-hanging fruit for improving model performance
Deployment
Deploy in production
Monitor
Challenge: Concept/Data drift
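The outline above leaves volume normalization as an open data question; below is a minimal sketch of one common answer, peak normalization, assuming the clip is a float NumPy array of samples. The 0.9 target peak level is an illustrative assumption, not a course recommendation.

```python
# Minimal sketch of peak volume normalization for an audio clip.
# Assumes the clip is a float NumPy array of samples; the 0.9 target peak
# is an illustrative choice.
import numpy as np

def peak_normalize(clip: np.ndarray, target_peak: float = 0.9) -> np.ndarray:
    """Scale the clip so its loudest sample sits at target_peak."""
    peak = np.max(np.abs(clip))
    if peak == 0:              # all-silence clip: nothing to scale
        return clip
    return clip * (target_peak / peak)

# Example: a quiet synthetic clip scaled up to the target peak.
quiet_clip = 0.1 * np.sin(np.linspace(0, 2 * np.pi, 16000))
loud_clip = peak_normalize(quiet_clip)
```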
2. Deployment
2.1 Key challenges:
Concept drift and Data drift -> Monitoring
Concept drift: the x -> y mapping changes, i.e., the relationship between the inputs X and the target y is different from what the model learned
Data drift: the distribution of x changes, i.e., the inputs seen in production differ from the training data (a minimal drift-check sketch follows the checklist below)
Software engineering issues -> Deploy in production
Checklist of questions for engineering
Realtime or batch
Cloud vs Edge/Browser
Compute resources (CPU/GPU/memory)
Latency, throughput (QPS)
Logging
Security and privacy
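To make drift monitoring concrete, here is a minimal sketch that compares the serving-time distribution of one input feature against a reference sample from the training set using a two-sample Kolmogorov-Smirnov test. The feature (input audio volume), the sample sizes, and the p-value threshold are illustrative assumptions.

```python
# Minimal data-drift check: compare the live distribution of one feature
# against a reference sample drawn from the training set.
# Feature choice, sample sizes, and p-value threshold are illustrative.
import numpy as np
from scipy import stats

def detect_data_drift(train_sample: np.ndarray,
                      live_window: np.ndarray,
                      p_threshold: float = 0.01) -> bool:
    """Return True if the live window looks drawn from a different
    distribution than the training sample (two-sample KS test)."""
    _, p_value = stats.ks_2samp(train_sample, live_window)
    return p_value < p_threshold

# Example with synthetic data standing in for real traffic.
rng = np.random.default_rng(0)
train_volume = rng.normal(loc=0.0, scale=1.0, size=5000)  # training-time input volume
live_volume = rng.normal(loc=0.4, scale=1.0, size=1000)   # serving-time input volume (shifted)

if detect_data_drift(train_volume, live_volume):
    print("Alert: possible data drift in input volume")
```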
2.2 Deployment patterns
Common deployment cases
New product/capability
Automate/assist with manual task
Replace previous ML system
Key ideas:
Gradual ramp up with monitoring
Rollback
2.2.1 Deployment modes
Shadow mode
ML system shadows the human and runs in parallel
ML system's output not used for any decision during this phase
Canary deployment
Roll out to a small fraction (e.g., 5%) of traffic initially (see the routing sketch after this list)
Monitor system and ramp up traffic gradually
Blue green deployment
The blue version is the old version and the green version is the new version
To deploy, switch the router so traffic goes from the blue version to the green version
Easy way to enable rollback
Degrees of automation
Human only -> Shadow mode -> AI assistance -> Partial automation -> Full automation
AI assistance and partial automation are forms of human-in-the-loop deployment
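A minimal sketch of canary routing, assuming a simple in-process router in front of the old and new models; the model callables, the 5% starting fraction, and the ramp_up helper are illustrative assumptions. Real deployments usually do this at the load-balancer or feature-flag layer rather than in application code.

```python
# Sketch of canary routing: send a small, configurable fraction of traffic
# to the new model while the rest keeps hitting the old one.
# Model callables and the 5% starting fraction are illustrative.
import random

class CanaryRouter:
    def __init__(self, old_model, new_model, canary_fraction: float = 0.05):
        self.old_model = old_model
        self.new_model = new_model
        self.canary_fraction = canary_fraction

    def predict(self, request):
        if random.random() < self.canary_fraction:
            return self.new_model(request)   # canary traffic
        return self.old_model(request)       # main traffic

    def ramp_up(self, new_fraction: float) -> None:
        """Increase canary traffic gradually; 1.0 means full cutover."""
        self.canary_fraction = min(1.0, new_fraction)

# Example with trivial stand-in models.
router = CanaryRouter(old_model=lambda x: f"old:{x}", new_model=lambda x: f"new:{x}")
print(router.predict("hello"))   # mostly routed to the old model at 5%
router.ramp_up(0.5)              # ramp up once monitoring looks healthy
```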
2.3 Monitoring
Monitoring dashboard
Brainstorm the things that could go wrong
Brainstorm a few statistics/metrics that will detect the problem
It is fine to start with many metrics and gradually remove the ones that turn out not to be useful
Set thresholds for alarms
Adapt metrics and thresholds over time
Examples of metrics to track
Software metrics:
Memory
Compute
Latency
Throughput
Server load
Input metrics:
Avg input length
Avg input volume
Num missing values
Avg image brightness
Output metrics:
# times the system returns "" (null output)
# times the user redoes the search
# times the user switches to typing
Click-through rate (CTR)
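To make the dashboard and threshold ideas concrete, here is a minimal sketch that computes a few of the input and output metrics above over a window of logged requests and flags any that cross a threshold. The field names, metric choices, and threshold values are illustrative assumptions for the voice-search example.

```python
# Sketch of window-based monitoring: compute a few input/output metrics
# over a batch of logged requests and flag any that cross a threshold.
# Field names, metrics, and threshold values are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class LoggedRequest:
    input_length_sec: float   # length of the audio clip
    output_text: str          # transcription returned to the user
    user_retyped: bool        # did the user fall back to typing?

def compute_metrics(window: list[LoggedRequest]) -> dict[str, float]:
    n = len(window)
    return {
        "avg_input_length": sum(r.input_length_sec for r in window) / n,
        "null_output_rate": sum(r.output_text == "" for r in window) / n,
        "retype_rate": sum(r.user_retyped for r in window) / n,
    }

# Thresholds are a starting point; adapt them as you learn which alerts matter.
THRESHOLDS = {"null_output_rate": 0.05, "retype_rate": 0.10}

def check_alarms(metrics: dict[str, float]) -> list[str]:
    return [f"ALERT: {name}={metrics[name]:.3f} exceeds {limit}"
            for name, limit in THRESHOLDS.items()
            if metrics[name] > limit]

# Example usage over a tiny synthetic window.
window = [LoggedRequest(3.2, "play jazz", False),
          LoggedRequest(2.8, "", True),
          LoggedRequest(4.1, "weather today", False)]
print(check_alarms(compute_metrics(window)))
```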
2.3.1 Deployment is an iterative process
Choosing the right set of metrics to monitor is itself an iterative process

2.4 Model maintenance
Manual retraining
Automatic retraining
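A minimal sketch of the manual-versus-automatic retraining choice, wired to monitoring signals like those above; retrain_model() and notify_team() are hypothetical placeholders, and whether retraining runs automatically or is only proposed for human review is a per-project decision.

```python
# Sketch of a retraining trigger: either launch retraining automatically or
# only notify a human, depending on how much automation the project allows.
# retrain_model() and notify_team() are hypothetical placeholders.

def retrain_model() -> None:
    """Placeholder: launch the training pipeline on freshly labeled data."""
    print("Launching retraining job...")

def notify_team(message: str) -> None:
    """Placeholder: page or email the on-call ML engineer."""
    print(message)

def maintenance_step(drift_detected: bool, metrics: dict,
                     auto_retrain: bool = False) -> None:
    degraded = metrics.get("retype_rate", 0.0) > 0.10   # illustrative threshold
    if not (drift_detected or degraded):
        return                                          # nothing to do this cycle
    if auto_retrain:
        retrain_model()                                 # automatic retraining
    else:
        notify_team("Drift or degradation detected: schedule manual retraining")

# Example: manual-retraining mode, triggered by a drifted input distribution.
maintenance_step(drift_detected=True, metrics={"retype_rate": 0.04})
```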
2.5 Metrics to monitor
Monitor
Software metrics
Input metrics
Output metrics
How quickly do they change?
User data generally drifts slowly
Enterprise data (B2B applications) can shift quickly