Preface

Part I. Find the Correct ML Approach

1. From Product Goal to ML Framing
    Estimate What Is Possible
        Models
        Data
    Framing the ML Editor
        Trying to Do It All with ML: An End-to-End Framework
        The Simplest Approach: Being the Algorithm
        Middle Ground: Learning from Our Experience
    Monica Rogati: How to Choose and Prioritize ML Projects
    Conclusion

2. Create a Plan
    Measuring Success
        Business Performance
        Model Performance
        Freshness and Distribution Shift
        Speed
    Estimate Scope and Challenges
        Leverage Domain Expertise
        Stand on the Shoulders of Giants
    ML Editor Planning
        Initial Plan for an Editor
        Always Start with a Simple Model
    To Make Regular Progress: Start Simple
        Start with a Simple Pipeline
        Pipeline for the ML Editor
    Conclusion

Part II. Build a Working Pipeline

3. Build Your First End-to-End Pipeline
    The Simplest Scaffolding
    Prototype of an ML Editor
        Parse and Clean Data
        Tokenizing Text
        Generating Features
    Test Your Workflow
        User Experience
        Modeling Results
    ML Editor Prototype Evaluation
        Model
        User Experience
    Conclusion

4. Acquire an Initial Dataset
    Iterate on Datasets: Do Data Science
    Explore Your First Dataset
        Be Efficient, Start Small
        Insights Versus Products
        A Data Quality Rubric
    Label to Find Data Trends
        Summary Statistics
        Explore and Label Efficiently
        Be the Algorithm
        Data Trends
    Let Data Inform Features and Models
        Build Features Out of Patterns
        ML Editor Features
    Robert Munro: How Do You Find, Label, and Leverage Data?
    Conclusion

Part III. Iterate on Models

5. Train and Evaluate Your Model
    The Simplest Appropriate Model
        Simple Models
        From Patterns to Models
    Split Your Dataset
        ML Editor Data Split
        Judge Performance
    Evaluate Your Model: Look Beyond Accuracy
        Contrast Data and Predictions
        Confusion Matrix
        ROC Curve
        Calibration Curve
        Dimensionality Reduction for Errors
        The Top-k Method
        Other Models
    Evaluate Feature Importance
        Directly from a Classifier
        Black-Box Explainers
    Conclusion

6. Debug Your ML Problems
    Software Best Practices
        ML-Specific Best Practices
    Debug Wiring: Visualizing and Testing
        Start with One Example
        Test Your ML Code
    Debug Training: Make Your Model Learn
        Task Difficulty
        Optimization Problems
    Debug Generalization: Make Your Model Useful
        Data Leakage
        Overfitting
        Consider the Task at Hand
    Conclusion

7. Using Classifiers for Writing Recommendations
    Extracting Recommendations from Models
        What Can We Achieve Without a Model?
        Extracting Global Feature Importance
        Using a Model's Score
        Extracting Local Feature Importance
    Comparing Models
        Version 1: The Report Card
        Version 2: More Powerful, More Unclear
        Version 3: Understandable Recommendations
    Generating Editing Recommendations
    Conclusion

Part IV. Deploy and Monitor

8. Considerations When Deploying Models
    Data Concerns
        Data Ownership
        Data Bias
        Systemic Bias
    Modeling Concerns
        Feedback Loops
        Inclusive Model Performance
        Considering Context
        Adversaries
        Abuse Concerns and Dual-Use
    Chris Harland: Shipping Experiments
    Conclusion

9. Choose Your Deployment Option
    Server-Side Deployment
        Streaming Application or API
        Batch Predictions
    Client-Side Deployment
        On Device
        Browser Side
    Federated Learning: A Hybrid Approach
    Conclusion

10. Build Safeguards for Models
    Engineer Around Failures
        Input and Output Checks
        Model Failure Fallbacks