76 - Increasing In-Class Similarity by Retrofitting Embeddings with Demographics, with Dirk Hovy
30 minutes
7 years ago
EMNLP 2018 paper by Dirk Hovy and Tommaso Fornaciari.
https://www.semanticscholar.org/paper/Improving-Author-Attribute-Prediction-by-Linguistic-Hovy-Fornaciari/71aad8919c864f73108aafd8e926d44e9df51615
In this episode, Dirk Hovy talks about natural language as a social
phenomenon that can provide insights about the people who produce it.
For example, this paper uses retrofitted embeddings to improve on
two tasks: predicting the gender and the age group of a person from
their online reviews. In this approach, author embeddings are
first generated using Doc2Vec, then retrofitted so that authors
with similar attributes lie closer together in the vector space. To
estimate the retrofitted vectors for authors with unknown
attributes, a linear transformation is learned that maps Doc2Vec
vectors to the retrofitted vectors. Dirk also used a similar
approach to encode geographic information and model regional
linguistic variation, in another EMNLP 2018 paper with Christoph
Purschke titled “Capturing Regional Variation with Distributed
Place Representations and Geographic Retrofitting” [link:
https://www.semanticscholar.org/paper/Capturing-Regional-Variation-with-Distributed-Place-Hovy-Purschke/6d9babd835d0cdaaf175f098bb4fd61fd75b1be0].
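
The two steps described above (retrofit the vectors of authors with known attributes, then learn a linear map for everyone else) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the paper's implementation: random vectors stand in for Doc2Vec output, the update rule is a generic retrofitting scheme that averages each vector with the mean of its same-label neighbors, the map is fitted by ordinary least squares without a bias term, and all names (`retrofit`, `fit_linear_map`, `within_class_spread`, `alpha`, `beta`) are hypothetical.

```python
import numpy as np


def retrofit(vectors, labels, alpha=1.0, beta=1.0, iterations=10):
    """Pull each vector toward the other vectors that share its label.

    alpha weights the original vector, beta weights the neighbor mean.
    Both weights are illustrative assumptions, not values from the paper.
    """
    original = np.asarray(vectors, dtype=float)
    labels = np.asarray(labels)
    retro = original.copy()
    for _ in range(iterations):
        for label in np.unique(labels):
            idx = np.flatnonzero(labels == label)
            if len(idx) < 2:
                continue  # a lone author has no neighbors to move toward
            # Mean of all *other* group members, computed for every member at once.
            neighbor_mean = (retro[idx].sum(axis=0) - retro[idx]) / (len(idx) - 1)
            retro[idx] = (alpha * original[idx] + beta * neighbor_mean) / (alpha + beta)
    return retro


def fit_linear_map(source, target):
    """Least-squares matrix W such that source @ W approximates target."""
    W, *_ = np.linalg.lstsq(source, target, rcond=None)
    return W


def within_class_spread(vectors, labels):
    """Average distance of each vector to the centroid of its own class."""
    vectors = np.asarray(vectors, dtype=float)
    labels = np.asarray(labels)
    total = 0.0
    for label in np.unique(labels):
        group = vectors[labels == label]
        total += np.linalg.norm(group - group.mean(axis=0), axis=1).sum()
    return total / len(vectors)


# Stand-in data: 200 "authors" with 16-dimensional vectors and a binary attribute.
rng = np.random.default_rng(0)
doc_vecs = rng.normal(size=(200, 16))
labels = rng.integers(0, 2, size=200)

retrofitted = retrofit(doc_vecs, labels)
W = fit_linear_map(doc_vecs, retrofitted)

# Authors with unknown attributes are mapped into the retrofitted space.
unlabeled = rng.normal(size=(5, 16))
mapped = unlabeled @ W

print(within_class_spread(doc_vecs, labels), within_class_spread(retrofitted, labels))
```

The printed pair shows the within-class spread before and after retrofitting; the second number is smaller because every vector has moved toward the centroid of its own class, which is the "in-class similarity" the episode title refers to.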