Overview of the ML Lifecycle and Deployment

0. Learning Objectives:

  • Identify the key components of the ML Lifecycle.

  • Define “concept drift” as it relates to ML projects.

  • Differentiate between shadow, canary, and blue-green deployment scenarios in the context of varying degrees of automation.

  • Compare and contrast the ML modeling iterative cycle with the cycle for deployment of ML products.

  • List the typical metrics you might track to monitor concept drift.

1. ML project lifecycle

1.1 Case Study: Speech recognition

  • Scoping:

    • Decide to work on speech recognition for voice search

    • Decide on key metrics

      • Accuracy, latency, throughput

    • Estimate resources and timeline

  • Data

    • Is the data labeled consistently?

    • How much silence before/after each clip?

    • How to perform volume normalization? (see the preprocessing sketch after this list)

  • Modeling

    • Code (algorithm/model)

    • Hyperparameters

    • Data -> improving data quality is often the lowest-hanging fruit for improving model performance

  • Deployment

    • Deploy in production

    • Monitor

      • Challenge: Concept/Data drift
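
The preprocessing sketch referenced above: a minimal illustration of trimming leading/trailing silence and peak-normalizing an audio clip with NumPy. The 0.01 silence threshold and 0.9 target peak are assumed values for illustration, not ones from the course.

```python
# Minimal sketch: trim leading/trailing silence and peak-normalize a clip.
# The threshold and target peak below are illustrative assumptions.
import numpy as np


def trim_silence(audio: np.ndarray, threshold: float = 0.01) -> np.ndarray:
    """Drop leading/trailing samples whose amplitude stays below threshold."""
    voiced = np.where(np.abs(audio) > threshold)[0]
    if voiced.size == 0:
        return audio  # all-silence clip; keep as-is
    return audio[voiced[0]:voiced[-1] + 1]


def normalize_volume(audio: np.ndarray, target_peak: float = 0.9) -> np.ndarray:
    """Scale the waveform so its maximum absolute amplitude equals target_peak."""
    peak = np.max(np.abs(audio))
    if peak == 0:
        return audio
    return audio * (target_peak / peak)


if __name__ == "__main__":
    # Synthetic clip: 0.2 s of silence, a 440 Hz tone, 0.2 s of silence (16 kHz).
    sr = 16_000
    tone = 0.3 * np.sin(2 * np.pi * 440 * np.arange(sr // 2) / sr)
    clip = np.concatenate([np.zeros(sr // 5), tone, np.zeros(sr // 5)])
    clean = normalize_volume(trim_silence(clip))
    print(len(clip), "->", len(clean), "samples; peak =", float(np.max(np.abs(clean))))
```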

2. Deployment

2.1 Key challenges:

  • Concept drift and data drift -> Monitoring (see the drift-detection sketch after this list)

    • Concept drift: the mapping x -> y changes, i.e., the desired output for the same input is different from what it was at training time

    • Data drift: the distribution of the inputs x changes, i.e., production inputs no longer look like the training data

  • Software engineering issues -> Deploy in production

    • Checklist of questions for engineering (see the deployment-config sketch after this list)

      • Realtime or batch

      • Cloud vs Edge/Browser

      • Compute resources (CPU/GPU/memory)

      • Latency, throughput (QPS)

      • Logging

      • Security and privacy
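
The drift-detection sketch referenced above: one simple way to monitor data drift is to compare the distribution of a logged input statistic against a reference sample drawn from the training data, for example with a two-sample Kolmogorov-Smirnov test. This is a minimal sketch assuming SciPy is available; the "clip length" feature and the 0.05 significance level are illustrative choices.

```python
# Minimal data-drift check: compare live inputs to a training-set reference
# sample with a two-sample Kolmogorov-Smirnov test. The feature (clip length)
# and the 0.05 cut-off are illustrative assumptions.
import numpy as np
from scipy.stats import ks_2samp


def detect_data_drift(reference: np.ndarray, live: np.ndarray, alpha: float = 0.05) -> bool:
    """Return True if the live distribution looks different from the reference."""
    statistic, p_value = ks_2samp(reference, live)
    print(f"KS statistic={statistic:.3f}, p-value={p_value:.3g}")
    return p_value < alpha


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    train_clip_lengths = rng.normal(loc=4.0, scale=1.0, size=5_000)  # seconds, at training time
    live_clip_lengths = rng.normal(loc=4.8, scale=1.0, size=1_000)   # shifted in production
    print("data drift detected:", detect_data_drift(train_clip_lengths, live_clip_lengths))
```

Detecting concept drift (the x -> y mapping changing) generally also needs ground-truth labels, which often arrive with a delay; the proxy output metrics in section 2.3 help in the meantime.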
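
The deployment-config sketch referenced above: writing the checklist answers down as an explicit configuration keeps the engineering decisions reviewable. The field names and example values (300 ms latency budget, 1,000 QPS, etc.) are assumptions for illustration, not a prescribed schema.

```python
# Minimal sketch: record the answers to the engineering checklist as a config.
# Field names and the example values are illustrative assumptions.
from dataclasses import dataclass


@dataclass
class DeploymentConfig:
    prediction_mode: str          # "realtime" or "batch"
    target: str                   # "cloud", "edge", or "browser"
    cpu_cores: int
    gpu_count: int
    memory_gb: int
    max_latency_ms: int           # latency budget per request
    target_qps: int               # expected throughput (queries per second)
    log_inputs_and_outputs: bool  # needed for monitoring and retraining data
    pii_in_inputs: bool           # drives the security/privacy review


# Example: a realtime, cloud-hosted speech recognition service for voice search.
voice_search = DeploymentConfig(
    prediction_mode="realtime",
    target="cloud",
    cpu_cores=8,
    gpu_count=1,
    memory_gb=16,
    max_latency_ms=300,
    target_qps=1_000,
    log_inputs_and_outputs=True,
    pii_in_inputs=True,
)

if __name__ == "__main__":
    print(voice_search)
```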

2.2 Deployment patterns

  • Common deployment cases

    • New product/capability

    • Automate/assist with manual task

    • Replace previous ML system

  • Key ideas:

    • Gradual ramp-up with monitoring

    • Rollback

2.2.1 Deployment modes

  • Shadow mode

    • ML system shadows the human and runs in parallel

    • ML system's output not used for any decision during this phase

  • Canary deployment

    • Roll out to a small fraction (e.g., 5%) of traffic initially

    • Monitor system and ramp up traffic gradually

  • Blue-green deployment

    • The blue version is the old (current) version; the green version is the new version

    • Cutting over means switching the traffic connection from the blue version to the green version (see the router sketch after this list)

    • Makes rollback easy: just switch traffic back to the blue version

  • Degrees of automation

    • Human only -> Shadow mode -> AI assistance -> Partial automation -> Full automation

    • AI assistance and partial automation are both forms of human-in-the-loop deployment
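
The router sketch referenced above: a minimal illustration of how shadow mode, a canary rollout, and a blue-green cut-over (with rollback) could be wired into a serving layer. The DeploymentRouter class and the stand-in models are assumptions, not part of any specific serving framework.

```python
# Minimal sketch of shadow, canary, and blue-green serving, assuming two
# callable models (old/"blue" and new/"green"). Names are illustrative.
import random


class DeploymentRouter:
    def __init__(self, old_model, new_model, mode="shadow", canary_fraction=0.05):
        self.old_model = old_model          # blue: current production model
        self.new_model = new_model          # green: candidate model
        self.mode = mode                    # "shadow", "canary", "green", or "blue"
        self.canary_fraction = canary_fraction

    def predict(self, x):
        if self.mode == "shadow":
            shadow_pred = self.new_model(x)              # runs in parallel, logged only
            print(f"[shadow] new model would return: {shadow_pred}")
            return self.old_model(x)                     # decision still uses the old model
        if self.mode == "canary":
            if random.random() < self.canary_fraction:   # small fraction of traffic
                return self.new_model(x)
            return self.old_model(x)
        if self.mode == "green":
            return self.new_model(x)                     # blue -> green cut-over complete
        return self.old_model(x)                         # "blue": all traffic on the old model

    def rollback(self):
        """Blue-green rollback: point all traffic back at the old (blue) model."""
        self.mode = "blue"


if __name__ == "__main__":
    def old(x): return f"old({x})"
    def new(x): return f"new({x})"

    router = DeploymentRouter(old, new, mode="shadow")
    print(router.predict("clip-1"))   # old model's answer; new model only logged
    router.mode = "canary"
    print(router.predict("clip-2"))   # ~5% chance of hitting the new model
    router.mode = "green"
    print(router.predict("clip-3"))   # new model serves all traffic
    router.rollback()
    print(router.predict("clip-4"))   # rolled back to the old model
```

Ramping up the canary fraction gradually while watching the monitoring dashboard (section 2.3) is what keeps the rollout reversible at every step.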

2.3 Monitoring

  • Monitoring dashboard

    • Brainstorm the things that could go wrong

    • Brainstorm a few statistics/metrics that will detect the problem

    • It's OK to start with many metrics and gradually remove the ones that turn out not to be useful

    • Set thresholds for alarms (see the monitoring sketch after this list)

    • Adapt metrics and thresholds over time

    • Examples of metrics to track

      • Software metrics: memory, compute, latency, throughput, server load

      • Input metrics: average input length, average input volume, number of missing values, average image brightness

      • Output metrics: # times the system returns "" (null output), # times the user redoes the search, # times the user switches to typing, click-through rate (CTR)
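
The monitoring sketch referenced above: compute a few of the input and output metrics over a window of logged requests and flag any metric that crosses its alarm threshold. The logged-request schema, the metric choices, and the thresholds are illustrative assumptions.

```python
# Minimal sketch of threshold-based monitoring for a speech system. The
# per-request log fields, metrics, and thresholds are illustrative assumptions.
from statistics import mean


def compute_metrics(requests):
    """Aggregate input/output metrics over a window of logged requests."""
    return {
        "avg_input_length_s": mean(r["input_length_s"] for r in requests),
        "avg_latency_ms": mean(r["latency_ms"] for r in requests),
        "null_output_rate": mean(1.0 if r["transcript"] == "" else 0.0 for r in requests),
        "redo_search_rate": mean(1.0 if r["user_redid_search"] else 0.0 for r in requests),
    }


THRESHOLDS = {                  # an alarm fires when a metric exceeds its threshold
    "avg_latency_ms": 500,
    "null_output_rate": 0.05,
    "redo_search_rate": 0.20,
}


def check_alarms(metrics):
    return [name for name, limit in THRESHOLDS.items() if metrics[name] > limit]


if __name__ == "__main__":
    window = [
        {"input_length_s": 3.2, "latency_ms": 240, "transcript": "weather today", "user_redid_search": False},
        {"input_length_s": 4.1, "latency_ms": 820, "transcript": "", "user_redid_search": True},
        {"input_length_s": 2.7, "latency_ms": 310, "transcript": "nearby cafes", "user_redid_search": False},
    ]
    metrics = compute_metrics(window)
    print(metrics)
    print("alarms:", check_alarms(metrics))  # here null_output_rate and redo_search_rate fire
```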

2.3.1 Deployment is an iterative process

Choosing the right set of metrics to monitor is itself an iterative process: deploy, watch the dashboard, and refine the metrics and thresholds based on what you learn

2.4 Model maintenance

  • Manual retraining: an engineer reviews the monitored performance and decides when to retrain

  • Automatic retraining: retraining is triggered automatically, e.g., when monitored metrics degrade (see the sketch below)
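
The sketch referenced above contrasts the two maintenance strategies: when a monitored metric degrades past a threshold, either notify the team so a retrain can be approved manually, or launch the retraining job automatically. The 0.03 accuracy-drop threshold and the retrain/notify functions are illustrative placeholders.

```python
# Minimal sketch of manual vs. automatic retraining triggers. The threshold
# and the retrain/notify functions are illustrative placeholders.
def retrain_model():
    print("retraining job launched")


def notify_team(message):
    print("notification:", message)


def maintain(live_accuracy: float, baseline_accuracy: float,
             automatic: bool, max_drop: float = 0.03):
    """Trigger maintenance when accuracy drops more than max_drop below baseline."""
    degraded = (baseline_accuracy - live_accuracy) > max_drop
    if not degraded:
        return
    if automatic:
        retrain_model()                                                 # automatic retraining
    else:
        notify_team("accuracy degraded; review and approve a retrain")  # manual retraining


if __name__ == "__main__":
    maintain(live_accuracy=0.88, baseline_accuracy=0.93, automatic=False)
    maintain(live_accuracy=0.88, baseline_accuracy=0.93, automatic=True)
```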

2.5 Metrics to monitor

  • Monitor

    • Software metrics

    • Input metrics

    • Output metrics

  • How quickly do they change?

    • User data generally has slower drift

    • Enterprise data (B2B applications) can shift quickly
