111 - Typologically diverse, multi-lingual, information-seeking questions, with Jon Clark
38 minutes
Description
5 years ago
For this episode, we invited Jon Clark from Google to talk about
TyDi QA, a new question answering dataset. The dataset contains
information-seeking questions in 11 languages that are
typologically diverse, i.e., they differ from each other in terms
of key structural and functional features. The questions in TyDi QA
are information-seeking, like those in Natural Questions, which we
discussed in the previous episode. In addition, the TyDi QA
questions were collected in multiple languages using independent
crowdsourcing pipelines, unlike some other multilingual QA datasets
such as XQuAD and MLQA, where English data is translated into other
languages. The dataset and the leaderboard can be accessed at
https://ai.google.com/research/tydiqa.