Text Analytics is a Microsoft AI library that covers the following tasks:
- Tokenization: Breaking text into individual words or phrases (tokens).
- Stop Word Removal: Removing common words (e.g., "the," "a," "is") that don't carry much meaning.
- Stemming/Lemmatization: Reducing words to their root form (e.g., "running" to "run"). Stemming is a simpler, rule-based approach, while lemmatization uses dictionaries and is more accurate.
- Part-of-Speech Tagging: Identifying the grammatical role of each word (e.g., noun, verb, adjective).
- Named Entity Recognition (NER): Identifying named entities like people, organizations, and locations.
- Sentiment Analysis: Determining the emotional tone of text (positive, negative, neutral).
- Topic Modeling: Discovering underlying topics in a collection of documents.
- Text Classification: Assigning categories or labels to text.
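
To make the first three tasks concrete, here is a minimal sketch in plain Python. It does not call Text Analytics or any other library; the stop word list, the suffix rules, and the function names (`tokenize`, `remove_stop_words`, `naive_stem`, `preprocess`) are illustrative assumptions, and the stemmer is deliberately crude compared with a real one.

```python
import re

# Tiny illustrative stop word list (an assumption, not a standard list).
STOP_WORDS = {"the", "a", "an", "is", "are", "in", "on", "and", "of", "to"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def naive_stem(token):
    """Strip a few common English suffixes (crude, rule-based stemming)."""
    if token.endswith("ss"):
        # Leave words like "pass" alone instead of stripping the final "s".
        return token
    for suffix in ("ing", "ed", "s"):
        # Only strip when at least three characters would remain.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            stem = token[: -len(suffix)]
            # Collapse a doubled final consonant, e.g. "runn" -> "run",
            # except for l, s, and z (so "pass" from "passing" is kept).
            if stem[-1] == stem[-2] and stem[-1] not in "aeioulsz":
                stem = stem[:-1]
            return stem
    return token


def preprocess(text):
    """Tokenize, remove stop words, then stem each remaining token."""
    return [naive_stem(t) for t in remove_stop_words(tokenize(text))]


print(preprocess("The dog is running in the park"))
```

Because the stemmer only applies suffix rules, it cannot map irregular forms such as "ran" to "run"; that is the kind of case where dictionary-based lemmatization does better.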