
cthorrez t1_jc5s8ag wrote

Basically, I would just make sure the metrics being compared are computed the same way: same numerator and denominator (e.g. summing vs. averaging) and same aggregation window (over the batch vs. over the epoch). If the datasets are the same and the type of metric you are computing is the same, the numbers are comparable.
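To make the numerator/denominator point concrete, here is a rough sketch in plain Python. The loss values and function names are made up for illustration (they are not from any particular framework); the only assumption is that the last batch is smaller than the others, which is where the two averaging schemes start to disagree.

```python
# Hypothetical per-example losses, grouped by batch (illustrative values only).
batches = [
    [1.0, 1.0, 1.0, 1.0],  # full batch of 4 examples
    [3.0, 3.0],            # smaller final batch of 2 examples
]

def mean_of_batch_means(batches):
    """Average within each batch, then average those batch means."""
    batch_means = [sum(b) / len(b) for b in batches]
    return sum(batch_means) / len(batch_means)

def epoch_mean(batches):
    """Sum every example's loss over the epoch, divide by the example count."""
    total = sum(sum(b) for b in batches)
    count = sum(len(b) for b in batches)
    return total / count

def epoch_sum(batches):
    """Sum every example's loss over the epoch, with no normalization."""
    return sum(sum(b) for b in batches)

print(mean_of_batch_means(batches))  # weights each batch equally
print(epoch_mean(batches))           # weights each example equally
print(epoch_sum(batches))            # scales with dataset size
```

Same data, same underlying loss, three different numbers. If one run reports the first and another reports the second, the gap between them says nothing about the models.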

The implementation details just become part of the comparison.
