A curated list of awesome embedding model tutorials, projects, and communities. Please feel free to open a pull request to add links.
Word2vec, GloVe, FastText
- Efficient Estimation of Word Representations in Vector Space (2013), T. Mikolov et al. [pdf]
- Distributed Representations of Words and Phrases and their Compositionality (2013), T. Mikolov et al. [pdf]
- word2vec Parameter Learning Explained (2014), Xin Rong [pdf]
- word2vec Explained: deriving Mikolov et al.'s negative-sampling word-embedding method (2014), Yoav Goldberg, Omer Levy [pdf]
- GloVe: Global Vectors for Word Representation (2014), J. Pennington et al. [pdf]
- Improving Word Representations via Global Context and Multiple Word Prototypes (2012), E. H. Huang et al. [pdf]
- Enriching Word Vectors with Subword Information (2016), P. Bojanowski et al. [pdf]
- Bag of Tricks for Efficient Text Classification (2016), A. Joulin et al. [pdf]
Language Model
- Semi-supervised sequence tagging with bidirectional language models (2017), Peters, Matthew E., et al. [pdf]
- Deep contextualized word representations (2018), Peters, Matthew E., et al. [pdf]
- Contextual String Embeddings for Sequence Labeling (2018), Akbik, Alan, Duncan Blythe, and Roland Vollgraf. [pdf]
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (2018), J. Devlin et al. [pdf]
Embedding Enhancement
- Sentence Embedding: Learning Semantic Sentence Embeddings using Pair-wise Discriminator (2018), Patro et al. [Project Page] [Paper]
- Retrofitting Word Vectors to Semantic Lexicons (2014), M. Faruqui et al. [pdf]
- Better Word Representations with Recursive Neural Networks for Morphology (2013), T. Luong et al. [pdf]
- Dependency-Based Word Embeddings (2014), Omer Levy, Yoav Goldberg [pdf]
- Not All Neural Embeddings are Born Equal (2014), F. Hill et al. [pdf]
- Two/Too Simple Adaptations of Word2Vec for Syntax Problems (2015), W. Ling [pdf]
Comparing count-based vs predict-based methods
- Linguistic Regularities in Sparse and Explicit Word Representations (2014), Omer Levy, Yoav Goldberg [pdf]
- Don’t count, predict! A systematic comparison of context-counting vs. context-predicting semantic vectors (2014), M. Baroni [pdf]
- Improving Distributional Similarity with Lessons Learned from Word Embeddings (2015), Omer Levy [pdf]
Evaluation, Analysis
- Evaluation methods for unsupervised word embeddings (2015), T. Schnabel [pdf]
- Intrinsic Evaluation of Word Vectors Fails to Predict Extrinsic Performance (2016), B. Chiu [pdf]
- Problems With Evaluation of Word Embeddings Using Word Similarity Tasks (2016), M. Faruqui [pdf]
- Improving Reliability of Word Similarity Evaluation by Redesigning Annotation Task and Performance Measure (2016), Oded Avraham, Yoav Goldberg [pdf]
- Evaluating Word Embeddings Using a Representative Suite of Practical Tasks (2016), N. Nayak [pdf]
Sentence
- Skip-Thought Vectors
- A Simple but Tough-to-Beat Baseline for Sentence Embeddings
- An efficient framework for learning sentence representations
- Learning General Purpose Distributed Sentence Representations via Large Scale Multi-task Learning
- Universal Sentence Encoder
Document
- SENSEMBED: Learning Sense Embeddings for Word and Relational Similarity
- Multi-Prototype Vector-Space Models of Word Meaning
- Recurrent neural network based language model
- A Neural Probabilistic Language Model
- Linguistic Regularities in Continuous Space Word Representations
- SemEval-2012 Task 2
- WordSimilarity-353
- Stanford's Contextual Word Similarities (SCWS)
- Stanford Rare Word (RW) Similarity Dataset
Below are pre-trained ELMo models. According to its authors, adding ELMo to existing NLP systems significantly improved the state of the art on every task they considered.
Below are pre-trained sent2vec models.
Convenient downloader for pre-trained word vectors:
Links for pre-trained word vectors:
- Word2vec pretrained vectors (English only)
- Word2vec pretrained vectors for 30+ languages
- FastText pretrained vectors for 157 languages
- FastText pretrained vectors for Japanese with NEologd
- Word vectors trained with GloVe
- Dependency-Based Word Embeddings
- Meta-Embeddings
- Lex-Vec
- Huang et al. (2012)'s embeddings (HSMN+csmRNN)
- Collobert et al. (2011)'s embeddings (CW+csmRNN)
- BPEmb: subword embeddings for 275 languages
- Wikipedia2Vec: pretrained word and entity embeddings for 12 languages
- word2vec-slim
- BioWordVec: fastText pretrained vectors for biomedical text
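
Several of the vector sets linked above are distributed as plain text, with one word per line followed by its vector components; the word2vec text format and fastText `.vec` files also start with a `count dimension` header line, while GloVe `.txt` files do not. The following is a minimal sketch of reading such a file and comparing two words by cosine similarity. The function names and the tiny in-memory sample are hypothetical and made up for illustration; binary formats and words containing spaces are not handled, and a dedicated library is likely a better choice for real files.

```python
import io
import math


def load_text_vectors(handle):
    """Parse 'word v1 v2 ...' lines from a text-format vector file.

    A first line made of exactly two integers is treated as the
    optional 'count dimension' header and skipped.
    """
    vectors = {}
    for line_no, line in enumerate(handle):
        parts = line.rstrip().split(" ")
        if line_no == 0 and len(parts) == 2 and all(p.isdigit() for p in parts):
            continue  # header line, not a word vector
        if len(parts) < 2:
            continue  # blank or malformed line
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Made-up sample data standing in for a downloaded file.
sample = io.StringIO("3 2\nking 1.0 0.0\nqueen 1.0 0.0\napple 0.0 1.0\n")
vectors = load_text_vectors(sample)
similarity = cosine(vectors["king"], vectors["queen"])
```

For a real download, replace the `io.StringIO` sample with `open(path, encoding="utf-8")`, where `path` points at the extracted text file.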