Overview of Tweet Analysis
A random sample of tweets from the @Auckland Uni Twitter account was analyzed to explore how characteristics of a tweet relate to engagement.
Data Variables in the Dataset
tweet_id: Unique identifier for each tweet.
year_tweeted: Year when the tweet was published (2020, 2021).
number_hashtags: The count of hashtags used in the tweet.
first_words: Starting words of the tweet.
tweet_length: The length of the tweet in characters.
True/False Statements with Variables
number_hashtags: Numeric variable identified by iNZight Lite. TRUE.
first_words: Categorical variable identified by iNZight Lite. FALSE.
tweet_length: Categorical variable identified by iNZight Lite. FALSE.
Dataset Shape
The dataset is rectangular. TRUE.
Engagement Analysis
Percentage of Tweets with Engagement: 29% of tweets received at least one retweet. FALSE.
Most tweets contained no links. TRUE.
Among tweets with links, 70% received at least one retweet. TRUE.
Tweets with links significantly predict engagement (retweeting). TRUE.
Summary Statistics Questions
Proportion of Tweets Using Hashtags: Calculate using iNZight Lite.
Tweets Posted on Sunday: Calculate using iNZight Lite.
Hashtag Use on Monday: Identify the proportion using iNZight Lite.
Hashtag-Free Tweets on Thursday: Determine proportion using iNZight Lite.
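iNZight Lite reports these proportions directly, but the same arithmetic can be sketched in Python as a cross-check. The rows below are made-up placeholders, not the real sample, and the day-of-week field is an assumption, since it is not among the variables listed earlier.

```python
# Hypothetical rows: (day the tweet was posted, number_hashtags).
# These values are placeholders for illustration, not the real dataset.
tweets = [
    ("Monday", 2),
    ("Monday", 0),
    ("Thursday", 0),
    ("Thursday", 1),
    ("Sunday", 0),
    ("Sunday", 3),
    ("Monday", 1),
    ("Thursday", 0),
]


def proportion(rows, condition):
    """Return the share of rows that satisfy condition (0.0 if rows is empty)."""
    rows = list(rows)
    if not rows:
        return 0.0
    return sum(1 for row in rows if condition(row)) / len(rows)


def uses_hashtags(row):
    """A tweet uses hashtags when its hashtag count is above zero."""
    return row[1] > 0


# Proportion of all tweets that use at least one hashtag.
overall_hashtag = proportion(tweets, uses_hashtags)

# Proportion of all tweets posted on Sunday.
sunday_share = proportion(tweets, lambda row: row[0] == "Sunday")

# Conditional proportions: restrict to one day first, then apply the condition.
monday = [row for row in tweets if row[0] == "Monday"]
thursday = [row for row in tweets if row[0] == "Thursday"]
monday_hashtag = proportion(monday, uses_hashtags)
thursday_no_hashtag = proportion(thursday, lambda row: not uses_hashtags(row))

print(f"Tweets using hashtags: {overall_hashtag:.1%}")
print(f"Tweets posted on Sunday: {sunday_share:.1%}")
print(f"Monday tweets using hashtags: {monday_hashtag:.1%}")
print(f"Thursday tweets without hashtags: {thursday_no_hashtag:.1%}")
```

The key distinction is the denominator: the first two proportions divide by all tweets, while the Monday and Thursday proportions divide only by the tweets posted on that day.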
Classification of Bots in Tweets
Confusion Matrix Findings
A model that classifies whether a tweet was written by a bot was analyzed.
Predicted / Actual Classification:
Actual Twitter bot: 3 predicted as bots, 4 predicted as not bots (total 7).
Actual not bot: 2 predicted as bots, 9 predicted as not bots (total 11).
Percentage Calculations
Overall Accuracy of the Model: Number of correct predictions divided by the total number of tweets.
Percentage of Actual Bots: Percentage of all tweets that were actually written by bots.
Predicted Bots Accuracy: Percentage of tweets predicted as bots that were actually bots.
Non-bot Predictions: Percentage of tweets not predicted as bots that were actually bots.
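The four percentages above follow directly from the confusion matrix counts. A minimal Python sketch, assuming the "not bot" row holds 11 - 2 = 9 tweets predicted as not bots (implied by the row total); the variable names are illustrative.

```python
# Counts taken from the confusion matrix above.
bot_pred_bot = 3          # actual bots predicted as bots
bot_pred_not = 4          # actual bots predicted as not bots
notbot_total = 11         # all tweets actually not written by bots
notbot_pred_bot = 2       # actual non-bots predicted as bots

# Assumption: the remaining non-bot tweets were predicted as not bots.
notbot_pred_not = notbot_total - notbot_pred_bot  # 9

total = bot_pred_bot + bot_pred_not + notbot_total  # 18 tweets

# Overall accuracy: correct predictions over all tweets.
accuracy = (bot_pred_bot + notbot_pred_not) / total

# Percentage of tweets actually written by bots.
actual_bot_share = (bot_pred_bot + bot_pred_not) / total

# Of the tweets predicted as bots, the share that really were bots.
predicted_bot_correct = bot_pred_bot / (bot_pred_bot + notbot_pred_bot)

# Of the tweets predicted as not bots, the share that really were bots.
predicted_not_but_bot = bot_pred_not / (bot_pred_not + notbot_pred_not)

print(f"Overall accuracy: {accuracy:.1%}")                          # 66.7%
print(f"Actual bots: {actual_bot_share:.1%}")                       # 38.9%
print(f"Predicted bots that are bots: {predicted_bot_correct:.1%}") # 60.0%
print(f"Predicted not bots that are bots: {predicted_not_but_bot:.1%}")  # 30.8%
```

Each percentage differs only in its denominator: all 18 tweets for the first two, the 5 tweets predicted as bots for the third, and the 13 tweets predicted as not bots for the fourth.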
Visual Representation of Model Results
A visual shows the results of the tweet classification model for 34 tweets. Color coding indicates each tweet's actual authorship and the model's prediction. The complete confusion matrix is to be filled in from the visual.
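Filling in a confusion matrix amounts to counting (actual, predicted) pairs, one per tweet in the visual. A small Python sketch of that tally; the six pairs below are invented placeholders, not the 34 tweets from the visual, and the function name is hypothetical.

```python
from collections import Counter

LABELS = ("bot", "not bot")


def confusion_matrix(pairs):
    """Tally (actual, predicted) pairs into matrix[actual][predicted]."""
    counts = Counter(pairs)
    # Counter returns 0 for pairs that never occur, so every cell is filled.
    return {
        actual: {predicted: counts[(actual, predicted)] for predicted in LABELS}
        for actual in LABELS
    }


# Placeholder pairs for illustration only (not read from the visual).
example_pairs = [
    ("bot", "bot"),
    ("bot", "not bot"),
    ("not bot", "not bot"),
    ("not bot", "bot"),
    ("not bot", "not bot"),
    ("bot", "bot"),
]

matrix = confusion_matrix(example_pairs)
for actual in LABELS:
    print(f"Actual {actual}: {matrix[actual]}")
```

With the real visual, each colored tweet would contribute one pair, and the four cell counts should sum to 34.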