Author: Dr. Gohar F. Khan
Chapter: 7
Upon concluding this chapter, readers will have gained the knowledge and skills to:
Comprehend basic social text analytics concepts and tools.
Understand uses of text analytics by different sectors:
Business
Government
Academia
Financial institutes
Recognize objectives of social media text analytics for business intelligence, including:
Sentiment analysis
Concept mining
Trends mining
Topic mining
Comprehend the text analytics cycle and steps required to extract business insights from text.
Understand various text analytics terms, methods, and algorithms.
Extract and analyze social media text.
List common limitations and issues in text analytics.
Fact: An estimated 80% of data is unstructured.
Examples include:
Database notes
Call center transcripts
Emails
Open-ended survey responses
Web pages
News groups
Reviews, tweets, comments
Multimedia (photos, videos, infographics)
Consequence: Decision-makers who analyze only structured data rely on just 20% of the available data.
Text Definition: Text is a fundamental element of social media platforms; it includes comments, tweets, blog posts, product reviews, and status updates.
Social Media Text Analytics (Text Mining): Technique to extract, analyze, and interpret hidden business insights from textual data.
Categories of Uses:
By Business
By Government
By Academia
By Financial Institutes
Dynamic Text Definition: Real-time, user-generated content that expresses opinions on various topics.
Characteristics:
Short (e.g., a couple of sentences)
Frequently updated or deleted
Examples:
Tweets
Comments
Discussions
Reviews
Static Text Definition: Text that is revised infrequently.
Characteristics: Longer format (e.g., several paragraphs).
Examples:
Wiki content
Blogs
Word documents
Corporate reports
News transcripts
IBM: A leading provider of enterprise-level text analytics solutions built on NLP tools.
Google: Offers Cloud Natural Language API for understanding customer conversations and documents.
Microsoft: Azure Text Analytics provides services for sentiment analysis, key phrase extraction, etc.
SAS: Advanced analytics platform for discovering patterns in text data.
Smaller Specialists: Aylien and TextRazor focus on tailored NLP services.
On-premises Model: Expensive but offers greater security and control.
Cloud-based Model: Cost-effective, scalable, and lower risk; attractive for small and medium-sized businesses.
Common Applications:
Document management
Corporate history
Scientific publications
Thematic understanding of websites
Survey data
Email comprehension
Call center data
Sentiment Analysis: Classifies social media text (mostly dynamic) as positive, negative, or neutral.
Intention Mining: Discovers user intentions (e.g., buying, selling, recommending) from social media text.
Trends Mining: Uses predictive analytics on large volumes of social media data to foresee future events.
Concept Mining: Extracts ideas and concepts from static social media text for classification and clustering.
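Illustration: The positive/negative/neutral classification described above can be sketched with a simple word-list approach. This is a minimal sketch, not a production method: the word lists, the function name classify_sentiment, and the sample posts are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical mini-lexicons for illustration; real lexicons are far larger.
POSITIVE = {"love", "great", "good", "excellent", "happy"}
NEGATIVE = {"hate", "bad", "terrible", "poor", "broken"}


def classify_sentiment(text: str) -> str:
    """Label a short post by counting positive and negative lexicon hits."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Hypothetical sample posts (dynamic text such as tweets or comments).
posts = [
    "I love this phone, the camera is great",
    "Terrible battery and a broken screen",
    "The package arrived on Tuesday",
]
labels = [classify_sentiment(p) for p in posts]
print(list(zip(posts, labels)))
```

A counting approach like this ignores negation, sarcasm, and context, which is one reason sentiment is harder to measure in practice than the three labels suggest.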
Identification and Searching: Locate the text source for analysis, acknowledging diversity and noise in social media content.
Text Parsing and Filtering: Clean, filter, and prepare text using NLP techniques to remove irrelevant elements.
Text Transformation: Convert cleaned text into a machine-readable numerical representation (e.g., bag-of-words term counts).
Text Mining: Apply various algorithms (clustering, association, classification, prediction) to extract business insights.
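Illustration: The four steps above can be sketched end to end in a few lines. This is a simplified sketch under stated assumptions: the sample posts, the stopword list, and the function names (parse_and_filter, transform, mine_top_terms) are hypothetical, the transformation step is shown as word-count vectors (one possible representation), and the mining step is reduced to finding the most frequent terms rather than clustering or classification.

```python
import re
from collections import Counter

# Step 1 (identification): hypothetical posts standing in for a located text source.
posts = [
    "Loving the new #phone! Battery life is great http://example.com",
    "@shop the phone battery died after one day :(",
    "Great camera, great screen, great phone",
]

STOPWORDS = {"the", "is", "a", "after", "one", "new"}  # tiny illustrative list


def parse_and_filter(text):
    """Step 2: strip URLs and mentions, lowercase, tokenize, drop stopwords."""
    text = re.sub(r"http\S+|@\w+", " ", text.lower())
    tokens = re.findall(r"[a-z]+", text)
    return [t for t in tokens if t not in STOPWORDS]


def transform(token_lists):
    """Step 3: build a shared vocabulary and one count vector per post."""
    vocab = sorted({t for tokens in token_lists for t in tokens})
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def mine_top_terms(vocab, vectors, k=3):
    """Step 4: a very simple mining step, the k most frequent terms overall."""
    totals = [sum(column) for column in zip(*vectors)]
    ranked = sorted(zip(vocab, totals), key=lambda pair: (-pair[1], pair[0]))
    return ranked[:k]


cleaned = [parse_and_filter(p) for p in posts]
vocab, vectors = transform(cleaned)
top_terms = mine_top_terms(vocab, vectors)
print(top_terms)
```

The count vectors produced in step 3 are the input that clustering, association, classification, or prediction algorithms would consume in a fuller version of step 4.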
Natural Language Processing (NLP)
Information Retrieval (IR)
Named Entity Recognition (NER)
Corpus
Bag of Words (BoW)
Latent Semantic Analysis (LSA)
Latent Dirichlet Allocation (LDA)
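Illustration: Three of the terms above (corpus, bag of words, and information retrieval) can be shown together in a short sketch. The corpus, the query, and the function names (bag_of_words, cosine, retrieve) are hypothetical, and cosine similarity is one common way to compare bag-of-words vectors, chosen here as an assumption. LSA and LDA are not shown because they need a numerical or topic modeling library such as Gensim.

```python
import math
import re
from collections import Counter

# Hypothetical three-document corpus, for illustration only.
corpus = [
    "battery life on this phone is short",
    "the hotel room was clean and quiet",
    "phone camera and battery are excellent",
]


def bag_of_words(text):
    """Represent text as word counts, ignoring word order."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents):
    """Information retrieval: rank document indices by similarity to the query."""
    q = bag_of_words(query)
    scores = [(cosine(q, bag_of_words(d)), i) for i, d in enumerate(documents)]
    return [i for _, i in sorted(scores, key=lambda s: (-s[0], s[1]))]


ranking = retrieve("phone battery", corpus)
print(ranking)  # document indices, most similar first
```

Because a bag of words discards word order, "battery life is short" and "short is battery life" get identical vectors; that loss of context is the trade-off for a representation that is simple to compute and compare.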
Lack of a solid business case
Resource intensity
Complexity of sentiments
Contextual nature of data
Issues with multilingual text
Cultural/regional differences
NLTK: Python library for NLP tasks.
spaCy: Production-ready Python NLP library with pre-trained models.
Gensim: Focused on topic modeling and document similarity.
TextBlob: Simple library for various NLP tasks including sentiment analysis.
Lexalytics: Tool for semantic analysis and sentiment extraction.
Discovertext: Platform for collecting and analyzing text streams.
Tweet Archivist: Tool for archiving and analyzing tweets.
Twitonomy: Detailed analytics of Twitter engagements.
Discuss the usefulness of text analytics and differentiate between static and dynamic text.
Explain the four primary purposes of social media text analytics.
Distinguish between supervised and unsupervised machine learning techniques.
Describe the social media text value creation cycle.
Identify issues related to text analytics.