Best Practices

Reproducible Experiments

Seeds
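
A minimal sketch of fixing the usual sources of randomness, assuming a Python stack with NumPy and PyTorch (drop the torch lines otherwise):

```python
import os
import random

import numpy as np
import torch


def set_seed(seed: int = 42) -> None:
    """Seed every common source of randomness so runs are repeatable."""
    random.seed(seed)                         # Python's built-in RNG
    np.random.seed(seed)                      # NumPy RNG
    torch.manual_seed(seed)                   # PyTorch CPU RNG
    torch.cuda.manual_seed_all(seed)          # PyTorch GPU RNGs (no-op without CUDA)
    os.environ["PYTHONHASHSEED"] = str(seed)  # only fully effective if set before startup
```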

Git Env
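
One way to snapshot the exact code version with each run, assuming the project is a git checkout; `get_git_state` is an illustrative helper, not a standard API:

```python
import subprocess


def get_git_state() -> dict:
    """Record the commit an experiment ran at, plus whether the tree was dirty."""
    commit = subprocess.check_output(["git", "rev-parse", "HEAD"], text=True).strip()
    # Any output from --porcelain means uncommitted changes, so the commit
    # hash alone does not fully describe the code that actually ran.
    dirty = bool(subprocess.check_output(["git", "status", "--porcelain"], text=True).strip())
    return {"commit": commit, "dirty": dirty}
```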

Docker

Pinned Requirements
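
Pin exact versions (`pip freeze > requirements.txt`) rather than loose ranges, so the environment can be rebuilt identically later. Entries end up looking like this (versions here are only examples):

```
numpy==1.26.4
torch==2.2.1
```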

Logging
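
A minimal setup with Python's standard `logging` module, writing to both the console and a per-run file (`run.log` is an arbitrary name):

```python
import logging

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
    handlers=[
        logging.StreamHandler(),         # console output
        logging.FileHandler("run.log"),  # persistent record of the run
    ],
)

logger = logging.getLogger(__name__)
logger.info("starting run with lr=%s", 1e-3)
```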

Experiment Tracking
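
Dedicated trackers (MLflow, Weights & Biases, etc.) are worth using; the core idea reduces to appending each run's parameters and metrics to an immutable record, as in this dependency-free sketch (`runs.jsonl` is an arbitrary path):

```python
import json
import time


def log_run(params: dict, metrics: dict, path: str = "runs.jsonl") -> None:
    """Append one run's config and results to an append-only log."""
    record = {"timestamp": time.time(), "params": params, "metrics": metrics}
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")


log_run({"lr": 1e-3, "batch_size": 32}, {"val_accuracy": 0.91})  # example values
```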

Baseline

Evaluation First

Change One Thing at a Time

Avoiding Drift Between Development and Production

Immutable Data

Testing

Human Baseline / Data Annotation

Build a hand-annotated, held-out test set and evaluate your model against it. Your annotators' agreement on that set gives you a human baseline to compare against, as in the sketch below.
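
A sketch of the comparison, assuming a second annotator independently labeled the same held-out examples (all labels below are made up):

```python
gold = ["cat", "dog", "cat", "bird"]              # your hand-annotated labels
second_annotator = ["cat", "dog", "dog", "bird"]  # independent re-annotation
model = ["cat", "cat", "cat", "bird"]             # model predictions


def agreement(a: list, b: list) -> float:
    """Fraction of examples on which two label lists agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


print("human baseline:", agreement(gold, second_annotator))  # 0.75
print("model accuracy:", agreement(gold, model))             # 0.75
```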

Error Analysis

Overfitting a Batch
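
Before training at scale, verify the model can drive the loss to near zero on one fixed batch; if it cannot, there is a bug in the model, loss, or optimizer. A PyTorch sketch with toy shapes:

```python
import torch
from torch import nn

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

# One fixed batch, reused at every step.
x = torch.randn(8, 16)
y = torch.randint(0, 2, (8,))

for step in range(500):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()

print(f"final loss: {loss.item():.4f}")  # should be close to 0; if not, debug first
```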

Starting with a Smaller Dataset

Starting with the Simplest Model

Hyperparameter Tuning
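
Random search over a small space is a reasonable default; in this sketch `train_and_evaluate` is a stand-in for your real training loop:

```python
import random

search_space = {
    "lr": [1e-4, 3e-4, 1e-3, 3e-3],
    "batch_size": [16, 32, 64],
    "dropout": [0.0, 0.1, 0.3],
}


def train_and_evaluate(config: dict) -> float:
    """Stand-in: train with `config` and return a validation score."""
    return random.random()  # replace with a real training run


best_score, best_config = float("-inf"), None
for _ in range(20):  # try 20 random configurations
    config = {key: random.choice(values) for key, values in search_space.items()}
    score = train_and_evaluate(config)
    if score > best_score:
        best_score, best_config = score, config

print(best_score, best_config)
```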

Iteration Speed

Usable by Others

Documentation

Website

Common Commands

Fine-Tuning

Caching Model Outputs
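
When the same inputs get scored repeatedly (error analysis, prompt iteration, paid APIs), cache outputs keyed by a hash of the input and the model version. A sketch with an on-disk cache (`model_cache/` is an arbitrary location):

```python
import hashlib
import json
import os

CACHE_DIR = "model_cache"


def cached_predict(model_version: str, text: str, predict) -> str:
    """Return a cached output when this (model, input) pair was seen before."""
    key = hashlib.sha256(f"{model_version}:{text}".encode()).hexdigest()
    path = os.path.join(CACHE_DIR, key + ".json")
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)["output"]
    output = predict(text)  # the expensive call
    os.makedirs(CACHE_DIR, exist_ok=True)
    with open(path, "w") as f:
        json.dump({"input": text, "output": output}, f)
    return output
```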

Training on GPU
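
In PyTorch, the model and every batch must live on the same device; falling back to CPU keeps the script runnable anywhere:

```python
import torch
from torch import nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(16, 2).to(device)  # move parameters once
x = torch.randn(8, 16).to(device)    # move each batch as it is loaded
logits = model(x)
print(logits.device)
```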

Data Management

Versioning Datasets
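
Tools like DVC handle this end to end; the core idea is a content hash per file so any change to the data is detectable. A minimal sketch, assuming the dataset lives under `data/`:

```python
import hashlib
import json
from pathlib import Path


def dataset_manifest(data_dir: str) -> dict:
    """Map each data file to a content hash; commit the manifest with the code."""
    manifest = {}
    for path in sorted(Path(data_dir).rglob("*")):
        if path.is_file():
            manifest[str(path)] = hashlib.sha256(path.read_bytes()).hexdigest()
    return manifest


Path("data.manifest.json").write_text(json.dumps(dataset_manifest("data"), indent=2))
```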

Tracking Changes

Production

Monitoring

Shadow Deployments
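
The pattern: the candidate model sees a copy of live traffic, its outputs are logged but never served, and the logs are later compared with production. A sketch with an assumed `predict` interface on both models:

```python
import logging

logger = logging.getLogger("shadow")


def handle_request(request, production_model, shadow_model):
    """Serve from production; log the shadow model's answer for offline comparison."""
    response = production_model.predict(request)
    try:
        shadow_response = shadow_model.predict(request)
        logger.info("request=%r prod=%r shadow=%r", request, response, shadow_response)
    except Exception:
        logger.exception("shadow model failed")  # must never affect the user
    return response  # only the production output is ever served
```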

Feedback

Load Testing

Batching
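
Accelerator throughput comes from batching: group individual requests and run them through the model together. A simplified sketch (real servers buffer asynchronously with a timeout); `model` is an assumed callable that takes a list of inputs:

```python
def predict_batched(requests: list, model, batch_size: int = 32) -> list:
    """Run requests through the model in batches instead of one at a time."""
    outputs = []
    for i in range(0, len(requests), batch_size):
        batch = requests[i : i + batch_size]
        outputs.extend(model(batch))  # one model call per batch, not per request
    return outputs
```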

Continuous Integration

Automate Training

Other Resources