Key steps in ML setup
-
Problem setup
- what is the existing solution?
- user interaction: batch or online mode?
- inputs, outputs, and historical data
-
Mode - Scale and latency requirements
- online: latency matters
- offline: freshness matters
-
Metrics - Telemetry
- offline metrics
- AUC, log loss, precision, recall, F1, NDCG, etc.
- online metrics
- end-to-end metrics and component metrics
- user behavior indicators
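A few of the offline metrics listed above can be sketched in plain Python. This is a minimal illustration for binary labels, not a replacement for a metrics library; the function names `precision_recall_f1` and `log_loss` and the `eps` clipping value are assumptions made for this example.

```python
import math

def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and F1 for binary labels (1 = positive, 0 = negative)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # Guard against division by zero when there are no positives.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

def log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for t, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)
```

Precision, recall, and F1 take hard predictions, while log loss takes predicted probabilities, so the two functions are usually fed different model outputs.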
-
Architecture
- Components
-
Training data
- Corpus-centric - manual labels
- Closed-loop - historical user interactions
- May come from a heuristic-based first version - a first step with no ML
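The closed-loop idea above, turning historical user interactions into labels, can be sketched as follows. The log schema `(user_id, item_id, event_type)` and the event names `"impression"` and `"click"` are hypothetical, chosen only for illustration; real interaction logs will differ.

```python
def build_labels(events):
    """Derive binary labels from an interaction log.

    events: iterable of (user_id, item_id, event_type) tuples, where
    event_type is "impression" or "click" (an assumed schema).
    Returns {(user_id, item_id): label}, label 1 if the pair was ever clicked.
    """
    labels = {}
    for user_id, item_id, event_type in events:
        key = (user_id, item_id)
        if event_type == "click":
            labels[key] = 1            # a click always wins over an impression
        elif event_type == "impression":
            labels.setdefault(key, 0)  # shown but (so far) not clicked
    return labels
```

Items shown but never clicked become negatives, which is the usual source of label noise in closed-loop data: a user may simply not have noticed the item.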
-
Feature Engineering
- Make the crucial signal in the data pop
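One way to read "make the signal pop" is to transform raw counts into features a model can use more directly. The sketch below is an assumed example, not part of the original notes: the raw fields `impressions` and `clicks` and the add-one smoothing are hypothetical choices.

```python
import math

def engineer_features(raw):
    """Turn raw counts into model-friendly features (illustrative only)."""
    impressions = raw["impressions"]
    clicks = raw["clicks"]
    return {
        # log1p compresses heavy-tailed counts so large values do not dominate
        "log_impressions": math.log1p(impressions),
        # smoothed click-through rate; the +1/+2 keeps low-traffic items stable
        "smoothed_ctr": (clicks + 1) / (impressions + 2),
    }
```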
-
Model Training
- Select an appropriate model structure for the problem
- Tune hyperparameters against the offline metrics
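Tuning against an offline metric can be sketched as an exhaustive grid search. This is a minimal sketch under stated assumptions: `grid_search`, `param_grid`, and `evaluate` are hypothetical names, and `evaluate` stands in for "train a model with these parameters and return its offline metric on held-out data, higher is better".

```python
from itertools import product

def grid_search(param_grid, evaluate):
    """Try every parameter combination and keep the best offline score.

    param_grid: dict mapping parameter name -> list of candidate values.
    evaluate:   callable taking a params dict and returning a score
                (higher is better), e.g. validation AUC.
    """
    best_params, best_score = None, float("-inf")
    names = sorted(param_grid)
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score
```

The score should come from a validation split that is separate from the final test set; otherwise the reported offline metric is optimistically biased.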
-
Piloting
- Direct a small % of users to the new model, collect telemetry, and decide whether to launch
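Routing a small share of users to the new model is often done with deterministic hashing, so that a given user always sees the same variant. The sketch below assumes this approach; `assign_variant`, `treatment_percent`, the `salt` value, and the variant names are all hypothetical.

```python
import hashlib

def assign_variant(user_id, treatment_percent=5, salt="pilot-v1"):
    """Deterministically map a user to "new_model" or "current_model".

    The salt keeps bucket assignments independent across experiments;
    changing it reshuffles which users land in the pilot.
    """
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in [0, 99]
    return "new_model" if bucket < treatment_percent else "current_model"
```

Because assignment depends only on the user ID and the salt, telemetry collected from the two groups can be compared without storing a per-user assignment table.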
-
Iterative Model Improvement