NLP Highlights


Episodes

141 - Building an open source LM, with Iz Beltagy and Dirk Groeneveld
29.06.2023
30 minutes
In this special episode of NLP Highlights, we discussed building and open sourcing language models. What is the usual recipe for building large language models? What does it mean to open source them? What new research questions can we answer by open sourcing them? We focused in particular on the ongoing Open Language Model (OLMo) project at AI2, and invited Iz Beltagy and Dirk Groeneveld, the research and engineering leads of the OLMo project, to chat.
Blog post announcing OLMo: https://blog.allenai.org/announcing-ai2-olmo-an-open-language-model-made-by-scientists-for-scientists-ab761e4e9b76
Organizations interested in partnership can express their interest here: https://share.hsforms.com/1blFWEWJ2SsysSXFUEJsxuA3ioxm
You can find Iz at twitter.com/i_beltagy and Dirk at twitter.com/mechanicaldirk
140 - Generative AI and Copyright, with Chris Callison-Burch
06.06.2023
51 minutes
In this special episode, we chatted with Chris Callison-Burch about his testimony at the recent U.S. Congress hearing on the Interoperability of AI and Copyright Law. We started by asking Chris about the purpose and structure of the hearing. Then we talked about the ongoing discussion of how copyright law applies to content generated by AI systems, the potential risks generative AI poses to artists, and Chris’ take on all of this. We ended the episode with a recording of Chris’ opening statement at the hearing.
139 - Coherent Long Story Generation, with Kevin Yang
24.03.2023
45 minutes
How can we generate coherent long stories from language models? Ensuring that the generated story has long-range consistency and conforms to a high-level plan is typically challenging. In this episode, Kevin Yang describes their system, which prompts language models to first generate an outline and then iteratively generate the story following that outline, reranking and editing the outputs for coherence. We also discussed the challenges involved in evaluating long generated texts. Kevin Yang is a PhD student at UC Berkeley.
Kevin's webpage: https://people.eecs.berkeley.edu/~yangk/
Papers discussed in this episode:
1. Re3: Generating Longer Stories With Recursive Reprompting and Revision (https://www.semanticscholar.org/paper/Re3%3A-Generating-Longer-Stories-With-Recursive-and-Yang-Peng/2aab6ca1a8dae3f3db6d248231ac3fa4e222b30a)
2. DOC: Improving Long Story Coherence With Detailed Outline Control (https://www.semanticscholar.org/paper/DOC%3A-Improving-Long-Story-Coherence-With-Detailed-Yang-Klein/ef6c768f23f86c4aa59f7e859ca6ffc1392966ca)
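As a rough illustration of the outline-then-generate-and-rerank loop described in this episode, here is a minimal Python sketch. The `sample_continuations` and `coherence_score` functions are hypothetical placeholders for a language model sampler and a reranking model; this is not the actual Re3 or DOC implementation.

```python
# A minimal sketch of outline-guided story generation with reranking, in the
# spirit of the approach discussed in this episode. `sample_continuations` and
# `coherence_score` are hypothetical placeholders, not the authors' code.

def sample_continuations(prompt: str, n: int) -> list[str]:
    """Placeholder: sample n continuations of `prompt` from a language model."""
    raise NotImplementedError

def coherence_score(story_so_far: str, candidate: str) -> float:
    """Placeholder: score how coherently `candidate` continues `story_so_far`."""
    raise NotImplementedError

def write_story(premise: str, num_points: int = 5, num_samples: int = 4) -> str:
    # Step 1: ask the model for a high-level outline of the story.
    outline_prompt = f"Write a {num_points}-point outline for a story about: {premise}"
    outline = sample_continuations(outline_prompt, n=1)[0]
    points = [line.strip() for line in outline.splitlines() if line.strip()][:num_points]

    # Step 2: expand the outline point by point, reranking candidates each time.
    story = ""
    for point in points:
        prompt = (
            f"Outline point: {point}\n"
            f"Story so far:\n{story}\n"
            "Continue the story:"
        )
        candidates = sample_continuations(prompt, n=num_samples)
        best = max(candidates, key=lambda c: coherence_score(story, c))
        story += best + "\n\n"
    return story
```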
138 - Compositional Generalization in Neural Networks, with Najoung Kim
20.01.2023
48 minutes
Compositional generalization refers to the capability of models to generalize to out-of-distribution instances by composing information obtained from the training data (for example, correctly interpreting a noun in object position when it was only ever seen as a subject during training). In this episode we chatted with Najoung Kim about how to explicitly evaluate specific kinds of compositional generalization in neural network models of language. Najoung described COGS, a dataset she built for this, some recent results in the space, and why we should be careful about interpreting the results given the current practice of pretraining models on lots of unlabeled text.
Najoung's webpage: https://najoungkim.github.io/
Papers we discussed:
1. COGS: A Compositional Generalization Challenge Based on Semantic Interpretation (Kim et al., 2020): https://www.semanticscholar.org/paper/b20ddcbd239f3fa9acc603736ac2e4416302d074
2. Compositional Generalization Requires Compositional Parsers (Weissenhorn et al., 2022): https://www.semanticscholar.org/paper/557ebd17b7c7ac4e09bd167d7b8909b8d74d1153
3. Uncontrolled Lexical Exposure Leads to Overestimation of Compositional Generalization in Pretrained Models (Kim et al., 2022): https://www.semanticscholar.org/paper/8969ea3d254e149aebcfd1ffc8f46910d7cb160e
Note that we referred to the final paper by an earlier name in the discussion.
137 - Nearest Neighbor Language Modeling and Machine Translation, with Urvashi Khandelwal
13.01.2023
36 minutes
We invited Urvashi Khandelwal, a research scientist at Google Brain, to talk about nearest neighbor language and machine translation models (kNN-LM and kNN-MT). These models interpolate parametric (conditional) language models with non-parametric distributions over the closest values in data stores built from relevant data. Not only have these models been shown to outperform the usual parametric language models, they also have important implications for memorization and generalization in language models.
Urvashi's webpage: https://urvashik.github.io
Papers discussed:
1. Generalization through Memorization: Nearest Neighbor Language Models (https://www.semanticscholar.org/paper/7be8c119dbe065c52125ee7716601751f3116844)
2. Nearest Neighbor Machine Translation (https://www.semanticscholar.org/paper/20d51f8e449b59c7e140f7a7eec9ab4d4d6f80ea)
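As a rough sketch of the interpolation these models use, here is a small Python/NumPy example. It assumes you already have a context embedding, the parametric LM's next-token distribution, and a datastore of (context vector, next-token id) pairs built from some corpus; names and hyperparameter values are illustrative, not taken from the papers' released code.

```python
# A minimal sketch of kNN-LM-style interpolation: retrieve nearest neighbors
# from a datastore, turn them into a distribution over next tokens, and mix
# that with the parametric LM's distribution.
import numpy as np

def knn_lm_probs(context_vec, p_lm, keys, values, vocab_size, k=8, lam=0.25, temp=1.0):
    """Interpolate a parametric LM distribution with a kNN distribution.

    context_vec: (d,)   embedding of the current context
    p_lm:        (V,)   next-token probabilities from the parametric LM
    keys:        (N, d) datastore context embeddings
    values:      (N,)   integer next-token ids stored alongside each key
    """
    # Retrieve the k nearest datastore entries by L2 distance to the context.
    dists = np.linalg.norm(keys - context_vec, axis=1)
    nn = np.argsort(dists)[:k]

    # Convert negative distances into a distribution over the retrieved tokens.
    weights = np.exp(-dists[nn] / temp)
    weights /= weights.sum()
    p_knn = np.zeros(vocab_size)
    np.add.at(p_knn, values[nn], weights)  # aggregate weight per token id

    # Interpolate: p(y | x) = lam * p_kNN(y | x) + (1 - lam) * p_LM(y | x)
    return lam * p_knn + (1 - lam) * p_lm
```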

About this podcast

**The podcast is currently on hiatus. For more active NLP content, check out the Holistic Intelligence Podcast linked below.** Welcome to the NLP Highlights podcast, where we invite researchers to talk about their work in various areas of natural language processing. All views expressed belong to the hosts/guests and do not represent their employers.
