Monday, February 17, 2025

Text Analytics

Text Analytics is a Microsoft AI library that supports the following tasks:

  • Tokenization: Breaking text into individual words or phrases (tokens).
  • Stop Word Removal: Removing common words (e.g., "the," "a," "is") that don't carry much meaning.
  • Stemming/Lemmatization: Reducing words to their root form (e.g., "running" to "run"). Stemming is a simpler, rule-based approach, while lemmatization uses dictionaries and is more accurate.
  • Part-of-Speech Tagging: Identifying the grammatical role of each word (e.g., noun, verb, adjective).
  • Named Entity Recognition (NER): Identifying named entities like people, organizations, and locations.
  • Sentiment Analysis: Determining the emotional tone of text (positive, negative, neutral).
  • Topic Modeling: Discovering underlying topics in a collection of documents.
  • Text Classification: Assigning categories or labels to text.
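
The first three tasks in the list can be sketched in a few lines of plain Python. This is only a minimal illustration of the ideas, not the Text Analytics API. The stop-word list, the suffix rules, and the function names (tokenize, remove_stop_words, naive_stem, preprocess) are all assumptions made up for this example. A real project would more likely use an established NLP library.

```python
import re

# Tiny illustrative stop-word list (an assumption, not a standard list).
STOP_WORDS = {"the", "a", "an", "is", "are", "and", "of", "in", "to"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def naive_stem(token):
    """Crude rule-based stemmer: strip one common suffix if enough remains."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            stem = token[: -len(suffix)]
            # For "ing"/"ed", collapse a doubled final consonant: "runn" -> "run".
            if (
                suffix in ("ing", "ed")
                and stem[-1] == stem[-2]
                and stem[-1] not in "aeiou"
            ):
                stem = stem[:-1]
            return stem
    return token


def preprocess(text):
    """Tokenize, remove stop words, then stem each remaining token."""
    return [naive_stem(t) for t in remove_stop_words(tokenize(text))]


print(preprocess("The dog is running in the park"))
```

Because the stemmer only applies fixed suffix rules, it will produce odd results on many words (for example, it turns "class" into "clas"). That is the trade-off mentioned above: a dictionary-based lemmatizer is more accurate but needs more resources.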

