114 - Behavioral Testing of NLP Models, with Marco Tulio Ribeiro
44 minutes
Description
5 years ago
We invited Marco Tulio Ribeiro, a Senior Researcher at Microsoft,
to talk about evaluating NLP models using behavioral testing, a
framework borrowed from software engineering. Marco describes three
kinds of black-box tests that check whether NLP models satisfy
certain necessary conditions. While it breaks the standard IID
assumption, this framework presents a way to evaluate whether NLP
systems are ready for real-world use. We also discuss which
capabilities can be tested using this framework, how one can come
up with good tests, and the need for an evolving set of behavioral
tests for NLP systems.
Marco's homepage:
https://homes.cs.washington.edu/~marcotcr/
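
For context, here is a minimal sketch (not from the episode) of the
general idea behind one such black-box test: an invariance test, in
which perturbing a label-irrelevant part of the input, such as a
person's name, should not change the model's prediction. Both
predict_sentiment and invariance_test below are hypothetical
stand-ins, not Marco's actual code or any real library's API.

def predict_sentiment(text: str) -> str:
    """Hypothetical black-box model; returns 'pos' or 'neg'."""
    return "pos" if "great" in text.lower() else "neg"

def invariance_test(template: str, fillers: list[str]) -> list[str]:
    """Fill a label-irrelevant slot with different values and report
    any fillers that flip the prediction from the baseline."""
    baseline = predict_sentiment(template.format(fillers[0]))
    return [f for f in fillers[1:]
            if predict_sentiment(template.format(f)) != baseline]

failures = invariance_test("{} said the movie was great.",
                           ["Mary", "Ahmed", "Keiko", "Luis"])
print("fillers that changed the prediction:", failures)

Because the test only queries the model's inputs and outputs, it
works for any NLP system regardless of architecture, which is what
makes this kind of behavioral testing black-box.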