37 - On Statistical Significance, Training Variance, and Why Reporting Score Distributions Matters
13 minutes
Description
8 years ago
In this episode we talk about a couple of recent papers that get at
the issue of training variance, and why we should not just take the
max from a training distribution when reporting results. Sadly, our
current focus on performance in leaderboards only exacerbates these
issues, and (in my opinion) encourages bad science. Papers:
https://www.semanticscholar.org/paper/Reporting-Score-Distributions-Makes-a-Difference-P-Reimers-Gurevych/0eae432f7edacb262f3434ecdb2af707b5b06481
https://www.semanticscholar.org/paper/Deep-Reinforcement-Learning-that-Matters-Henderson-Islam/90dad036ab47d683080c6be63b00415492b48506