Episodes
29.06.2023
30 minutes
In this special episode of NLP Highlights, we discussed building
and open sourcing language models. What is the usual recipe for
building large language models? What does it mean to open source
them? What new research questions can we answer by open sourcing
them? We particularly focused on the ongoing Open Language Model
(OLMo) project at AI2, and invited Iz Beltagy and Dirk Groeneveld,
the research and engineering leads of the OLMo project, to chat.
Blog post announcing OLMo: https://blog.allenai.org/announcing-ai2-olmo-an-open-language-model-made-by-scientists-for-scientists-ab761e4e9b76
Organizations interested in partnership can express their interest here: https://share.hsforms.com/1blFWEWJ2SsysSXFUEJsxuA3ioxm
You can find Iz at twitter.com/i_beltagy and Dirk at twitter.com/mechanicaldirk
06.06.2023
51 minutes
In this special episode, we chatted with Chris Callison-Burch about
his testimony in the recent U.S. Congress Hearing on the
Interoperability of AI and Copyright Law. We started by asking
Chris about the purpose and the structure of this hearing. Then we
talked about the ongoing discussion on how copyright law applies to
content generated by AI systems, the potential risks generative AI
poses to artists, and Chris’ take on all of this. We ended the
episode with a recording of Chris’ opening statement at the hearing.
24.03.2023
45 minutes
How can we generate coherent long stories from language models?
Ensuring that the generated story has long-range consistency and
that it conforms to a high-level plan is typically challenging. In
this episode, Kevin Yang describes their system, which prompts
language models to first generate an outline and then iteratively
generate the story following that outline, reranking and editing the
outputs for coherence. We also discussed the challenges involved in
evaluating long generated texts. Kevin Yang is a PhD student at UC Berkeley.
Kevin's webpage: https://people.eecs.berkeley.edu/~yangk/
Papers discussed in this episode:
1. Re3: Generating Longer Stories With Recursive Reprompting and Revision (https://www.semanticscholar.org/paper/Re3%3A-Generating-Longer-Stories-With-Recursive-and-Yang-Peng/2aab6ca1a8dae3f3db6d248231ac3fa4e222b30a)
2. DOC: Improving Long Story Coherence With Detailed Outline Control (https://www.semanticscholar.org/paper/DOC%3A-Improving-Long-Story-Coherence-With-Detailed-Yang-Klein/ef6c768f23f86c4aa59f7e859ca6ffc1392966ca)
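As a rough illustration of the plan-then-write loop described above, here is a minimal Python sketch. It is not the actual Re3 or DOC implementation; `call_lm` and `rerank` are hypothetical placeholders for an LM client and a coherence reranker.

```python
# Toy sketch of the outline-then-generate loop described in the episode.
# Not the actual Re3/DOC code; `call_lm` is a hypothetical wrapper around
# any text-completion API, and the prompts are purely illustrative.

def call_lm(prompt: str, n: int = 1) -> list[str]:
    """Hypothetical LM call returning n candidate completions."""
    raise NotImplementedError  # plug in an actual LM client here

def rerank(candidates: list[str], story_so_far: str, outline_item: str) -> str:
    """Pick the candidate judged most coherent with the story and outline.
    A real system would use a learned reranker; here we just take the first."""
    return candidates[0]

def write_story(premise: str) -> str:
    # 1. Prompt the model for a high-level outline.
    outline = call_lm(f"Write a numbered outline for a story about: {premise}")[0]
    story = ""
    # 2. Expand each outline item in turn, reranking candidate continuations.
    for item in outline.splitlines():
        if not item.strip():
            continue
        candidates = call_lm(
            f"Premise: {premise}\nOutline item: {item}\n"
            f"Story so far: {story}\nContinue the story:",
            n=4,
        )
        best = rerank(candidates, story, item)
        # 3. Optionally edit the continuation for consistency with earlier text.
        best = call_lm(f"Edit for consistency with: {story}\n\nText: {best}")[0]
        story += "\n" + best
    return story
```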
20.01.2023
48 minutes
Compositional generalization refers to the capability of models to
generalize to out-of-distribution instances by composing
information obtained from the training data. In this episode, we
chatted with Najoung Kim about how to explicitly evaluate specific
kinds of compositional generalization in neural network models of
language. Najoung described COGS, a dataset she built for this,
some recent results in the space, and why we should be careful
about interpreting those results given the current practice of
pretraining models on lots of unlabeled text.
Najoung's webpage: https://najoungkim.github.io/
Papers we discussed:
1. COGS: A Compositional Generalization Challenge Based on Semantic Interpretation (Kim et al., 2020): https://www.semanticscholar.org/paper/b20ddcbd239f3fa9acc603736ac2e4416302d074
2. Compositional Generalization Requires Compositional Parsers (Weissenhorn et al., 2022): https://www.semanticscholar.org/paper/557ebd17b7c7ac4e09bd167d7b8909b8d74d1153
3. Uncontrolled Lexical Exposure Leads to Overestimation of Compositional Generalization in Pretrained Models (Kim et al., 2022): https://www.semanticscholar.org/paper/8969ea3d254e149aebcfd1ffc8f46910d7cb160e
Note that we referred to the final paper by an earlier name in the discussion.
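As a toy illustration of the kind of train/test split such evaluations rely on (invented for illustration, not actual COGS data): a noun observed only in one grammatical role during training must be interpreted in a new role at test time.

```python
# Toy illustration (not actual COGS data) of a compositional generalization split:
# "Emma" is seen only as a subject in training, but appears as an object at test time.
train = [
    ("Emma saw the dog.", "see(agent=Emma, theme=dog)"),
    ("A cat saw Liam.",   "see(agent=cat, theme=Liam)"),
]
test = [
    # A compositional learner should still produce the right interpretation here,
    # even though these (noun, role) combinations never occurred in training.
    ("Liam saw Emma.",    "see(agent=Liam, theme=Emma)"),
]
```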
13.01.2023
36 minutes
We invited Urvashi Khandelwal, a research scientist at Google Brain,
to talk about nearest neighbor language and machine translation
models. These models interpolate a parametric (conditional) language
model with a non-parametric distribution over the nearest neighbors
retrieved from a datastore built from relevant data. Not only have
these models been shown to outperform the usual parametric language
models, they also have important implications for memorization and
generalization in language models.
Urvashi's webpage: https://urvashik.github.io
Papers discussed:
1) Generalization through Memorization: Nearest Neighbor Language Models (https://www.semanticscholar.org/paper/7be8c119dbe065c52125ee7716601751f3116844)
2) Nearest Neighbor Machine Translation (https://www.semanticscholar.org/paper/20d51f8e449b59c7e140f7a7eec9ab4d4d6f80ea)
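As a rough sketch of the interpolation idea described above, here is a toy NumPy version. It is not the paper's implementation; the datastore of context vectors and next-token ids, and the brute-force retrieval, are assumed for illustration.

```python
# Toy sketch of kNN-LM interpolation: mix the parametric LM's next-token
# distribution with a non-parametric one built from retrieved neighbors.
import numpy as np

def knn_lm_probs(
    context_vec: np.ndarray,  # LM hidden state for the current context, shape (d,)
    p_lm: np.ndarray,         # parametric LM next-token distribution, shape (V,)
    keys: np.ndarray,         # datastore keys: stored context vectors, shape (N, d)
    values: np.ndarray,       # datastore values: next-token ids, shape (N,)
    k: int = 8,
    lam: float = 0.25,        # interpolation weight lambda (assumed default)
    temperature: float = 1.0,
) -> np.ndarray:
    vocab_size = p_lm.shape[0]
    # 1. Retrieve the k nearest datastore entries by L2 distance (brute force).
    dists = np.linalg.norm(keys - context_vec, axis=1)
    nn = np.argsort(dists)[:k]
    # 2. Turn negative distances into a distribution over the retrieved tokens.
    weights = np.exp(-dists[nn] / temperature)
    weights /= weights.sum()
    p_knn = np.zeros(vocab_size)
    np.add.at(p_knn, values[nn], weights)  # aggregate weight per token id
    # 3. Interpolate the non-parametric and parametric distributions.
    return lam * p_knn + (1.0 - lam) * p_lm
```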
About this podcast
**The podcast is currently on hiatus. For more active NLP content,
check out the Holistic Intelligence Podcast linked below.** Welcome
to the NLP Highlights podcast, where we invite researchers to talk
about their work in various areas of natural language processing.
All views expressed belong to the hosts/guests, and do not
represent their employers.