NLTK is a leading platform for building Python programs to work with human language data. It provides easy-to-use interfaces to over 50 corpora and lexical resources such as WordNet, along with a suite of text processing libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning, wrappers for industrial-strength NLP libraries, and an active discussion forum.
- Includes a basic introduction to NLP with NLTK (Natural Language Toolkit), including documentation and Python code in a Jupyter notebook.
- NLP with NLTK includes the following content:
- Installing NLTK
- Concordance with NLTK
- Similar with NLTK
- common_contexts with NLTK
- Dispersion plots for words in text
- Counting Vocabulary
- Find length of text
- Find distinct words in text
- Calculate a measure of the lexical richness of the text
- word count in text
- Text as list of words
- Lists and basic operation with list
- Indexing lists
- slicing lists
- String and Basic Operations
- Multiplication with strings
- Addition with strings
- Join list to string
- Split string to list
- Frequency Distribution of Words From Text
- Hapaxes
- Fine-Grained Selection of Words
- Collocations and Bigrams
- Making Decisions and Taking Control
- conditionals
- Numerical comparison operators
- word comparison operators
- Text Preprocessing using NLTK
- Tokenization
- Word Tokenization
- Sentence Tokenization
- Lower Casing
- Stop words removal
- Stemming
- Errors in stemming, i.e., over-stemming and under-stemming
- Lemmatization
- Difference between stemming and lemmatization
- Removal of symbols and numbers
- Named Entity Recognition
- Tokenization
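
Several of the counting topics listed above (text length, distinct words, lexical richness, word counts, frequency distribution, hapaxes) can be sketched in plain Python. This is a minimal illustration rather than the notebook's own code: the sample sentence and the helper names are hypothetical, the text is split on whitespace instead of using an NLTK tokenizer, and `collections.Counter` stands in for `nltk.FreqDist`, which is a `Counter` subclass.

```python
from collections import Counter

# Hypothetical sample text; the notebook works with NLTK's own texts instead.
sample = "the quick brown fox jumps over the lazy dog and the fox sleeps"

# Text as a list of words (plain whitespace split, not an NLTK tokenizer).
tokens = sample.split()


def lexical_richness(words):
    """Ratio of distinct words to total words (0.0 for an empty list)."""
    return len(set(words)) / len(words) if words else 0.0


def hapaxes(words):
    """Words that occur exactly once, in order of first appearance."""
    counts = Counter(words)
    return [w for w in counts if counts[w] == 1]


# Frequency distribution: maps each word to its number of occurrences.
freq = Counter(tokens)

print("length:", len(tokens))
print("distinct words:", len(set(tokens)))
print("lexical richness:", round(lexical_richness(tokens), 3))
print("count of 'the':", freq["the"])
print("most common:", freq.most_common(2))
print("hapaxes:", hapaxes(tokens))
```

With NLTK installed, `nltk.FreqDist(tokens)` could replace `Counter(tokens)` here, and its `hapaxes()` method plays the same role as the hypothetical helper above.
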
- Clone the project
- Install NLTK
- pip install nltk
- Install the NLTK book module
- import nltk
- nltk.download()
- Browse and select the NLTK book module
- Click Download
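
The setup steps above can also be scripted without opening the downloader window. The sketch below is one possible way to do it, not the repository's own setup code: the helper names are hypothetical, and it assumes that passing the `"book"` identifier to `nltk.download` fetches the data used by `nltk.book`.

```python
import importlib.util


def nltk_installed() -> bool:
    """Return True if the nltk package can be imported in this environment."""
    return importlib.util.find_spec("nltk") is not None


def download_book_data() -> bool:
    """Download the 'book' collection.

    Returns False if NLTK is missing or the download does not succeed.
    """
    if not nltk_installed():
        print("NLTK is not installed; run: pip install nltk")
        return False
    import nltk

    # Passing an identifier skips the interactive downloader window.
    return bool(nltk.download("book"))
```

Calling `download_book_data()` once needs network access; afterwards, `from nltk.book import *` should load the sample texts used in the notebook.
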