The task in Named Entity Recognition (NER) is to identify the words in a text that refer to entities and to assign each one an entity type, such as a location, a time or a name.
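To make the task concrete, here is a minimal sketch of what NER output looks like. It assumes a tiny hand-written lookup table (`GAZETTEER`, a hypothetical name) rather than a trained model, and the label names are made up for illustration; real systems learn these decisions from annotated data.

```python
# Toy lookup table: an assumption for illustration, not a trained model.
GAZETTEER = {
    "london": "LOCATION",
    "paris": "LOCATION",
    "monday": "TIME",
    "alice": "NAME",
}

def tag_tokens(sentence):
    """Return (token, entity_type) pairs; "O" marks tokens outside any entity."""
    pairs = []
    for token in sentence.split():
        # Normalize only for the lookup; the original token is kept in the output.
        key = token.strip(".,!?").lower()
        pairs.append((token, GAZETTEER.get(key, "O")))
    return pairs

for token, entity_type in tag_tokens("Alice flew to London on Monday."):
    print(token, entity_type)
```

A lookup like this cannot handle unseen names or ambiguous words, which is why the methods listed further down use context.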
Intros
- Introduction to named entity recognition using Python at depends-on-the-definition.com
- A Review of Named Entity Recognition (NER) on Towards Data Science
- Named entity extraction on GitHub: sample code for 3 different methods
- Papers with Code on Named Entity Recognition
Methods (from simple to advanced)
- Conditional random field (CRF)
- LSTM
- Bidirectional LSTM-CRF Models for Sequence Tagging: paper on arXiv and article here
- LSTMs With Character Embeddings
- Residual LSTM network together with ELMo embeddings (paper here)
  - ELMo word representations are functions of the entire input sentence
- NER with BERT (highest score of the methods listed here)
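The sequence-tagging methods above typically predict one tag per token, and those tags then have to be merged into entities. The sketch below assumes the common BIO scheme (`B-` begins an entity, `I-` continues it, `O` is outside), which the list above does not specify. The function name `bio_to_spans` and the example tags are hypothetical.

```python
def bio_to_spans(tokens, tags):
    """Collapse per-token BIO tags into (entity_text, entity_type) spans."""
    spans = []
    current_tokens, current_type = [], None
    for token, tag in zip(tokens, tags):
        # An I- tag of the same type extends the open span.
        if tag.startswith("I-") and current_type == tag[2:]:
            current_tokens.append(token)
            continue
        # Any other tag closes the open span, if there is one.
        if current_tokens:
            spans.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [], None
        if tag.startswith(("B-", "I-")):
            # A stray I- tag without a matching open span is treated as a start.
            current_tokens, current_type = [token], tag[2:]
    if current_tokens:
        spans.append((" ".join(current_tokens), current_type))
    return spans

tokens = ["Angela", "Merkel", "visited", "New", "York", "."]
tags = ["B-PER", "I-PER", "O", "B-LOC", "I-LOC", "O"]
print(bio_to_spans(tokens, tags))
```

The same decoding step applies whichever model produced the tags (CRF, LSTM or BERT), as long as it emits BIO-style labels.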
Service offerings
- Azure Entity identification and entity linking
- Google Cloud API demo, which also detects phone numbers etc. You can also train custom models.
Client libraries
Good introduction: O’Reilly, Deep Learning for Natural Language Processing
- Libraries:
- Open source: NLTK. Introductory exercise in the O’Reilly book here
- spaCy: Industrial Strength NLP
- Article that shows custom NER on top of spaCy
- CoreNLP / Stanford Python library
- Gensim
- TextBlob
- Flair: A very simple framework for state-of-the-art NLP
Data sets
- Training data set: entity-annotated corpus
- CoNLL 2003
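For working with CoNLL-style files, the sketch below reads token and NER-tag pairs. It assumes the commonly distributed layout: one token per line with whitespace-separated columns, the NER tag in the last column, blank lines between sentences, and `-DOCSTART-` lines marking document boundaries. The function name `read_conll` is hypothetical and the sample lines are made up for illustration; check your copy of the data, since tagging schemes differ between distributions.

```python
def read_conll(lines):
    """Return a list of sentences, each a list of (token, ner_tag) pairs."""
    sentences, current = [], []
    for line in lines:
        line = line.strip()
        # Blank lines and document markers end the current sentence.
        if not line or line.startswith("-DOCSTART-"):
            if current:
                sentences.append(current)
                current = []
            continue
        columns = line.split()
        # First column is the token, last column is the NER tag.
        current.append((columns[0], columns[-1]))
    if current:
        sentences.append(current)
    return sentences

# Illustrative sample in the assumed four-column layout.
sample = """-DOCSTART- -X- -X- O

EU NNP B-NP B-ORG
rejects VBZ B-VP O
German JJ B-NP B-MISC
call NN I-NP O

Peter NNP B-NP B-PER
Blackburn NNP I-NP I-PER
""".splitlines()

for sentence in read_conll(sample):
    print(sentence)
```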
My own prototype
- Implementing spaCy together with Flask (Python) and deploying to Azure. Live website