Overview of Tweet Analysis

A random sample of tweets from the @Auckland Uni Twitter account was analyzed to explore how characteristics of a tweet (such as hashtags, links, and length) relate to engagement.

Data Variables in the Dataset

  • tweet_id: Unique identifier for each tweet.

  • year_tweeted: Year when the tweet was published (2020, 2021).

  • number_hashtags: The count of hashtags used in the tweet.

  • first_words: Starting words of the tweet.

  • tweet_length: The length of the tweet in characters.

True/False Statements about Variables

  • number_hashtags: iNZight Lite identifies this as a numeric variable. TRUE.

  • first_words: iNZight Lite identifies this as a categorical variable. FALSE.

  • tweet_length: iNZight Lite identifies this as a categorical variable. FALSE.

Dataset Shape

  • The dataset is rectangular. TRUE.

Engagement Analysis

  • Percentage of Tweets with Engagement: 29% of tweets received at least one retweet. FALSE.

  • Most tweets contained no links. TRUE.

  • Among tweets with links, 70% received at least one retweet. TRUE.

  • Whether a tweet contains a link is a significant predictor of engagement (being retweeted). TRUE.

Summary Statistics Questions

  • Proportion of Tweets Using Hashtags: Calculate using iNZight Lite.

  • Tweets Posted on Sunday: Calculate using iNZight Lite.

  • Hashtag Use on Monday: Identify the proportion using iNZight Lite.

  • Hashtag-Free Tweets on Thursday: Determine proportion using iNZight Lite.
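The questions above are meant to be answered in iNZight Lite. As a cross-check, the same proportions can be sketched in plain Python. The rows below are made up for illustration and are not the real dataset, and the `day_posted` column name is an assumption (the variable list does not name a day-of-week column). The Monday and Thursday questions are read here as proportions within that day's tweets, which is also an assumption.

```python
# Made-up rows for illustration only; NOT the real @Auckland Uni sample.
# "day_posted" is a hypothetical column name.
tweets = [
    {"day_posted": "Monday", "number_hashtags": 2},
    {"day_posted": "Monday", "number_hashtags": 0},
    {"day_posted": "Thursday", "number_hashtags": 0},
    {"day_posted": "Sunday", "number_hashtags": 1},
]


def proportion(rows, condition):
    """Return the share of rows satisfying condition, or None if rows is empty."""
    rows = list(rows)
    if not rows:
        return None
    return sum(1 for row in rows if condition(row)) / len(rows)


def uses_hashtags(row):
    return row["number_hashtags"] > 0


def on_day(day):
    return [row for row in tweets if row["day_posted"] == day]


# Proportion of all tweets that use at least one hashtag.
overall_hashtag = proportion(tweets, uses_hashtags)
# Proportion of all tweets posted on Sunday.
sunday_share = proportion(tweets, lambda row: row["day_posted"] == "Sunday")
# Proportion of Monday tweets that use at least one hashtag.
monday_hashtag = proportion(on_day("Monday"), uses_hashtags)
# Proportion of Thursday tweets that use no hashtags.
thursday_no_hashtag = proportion(on_day("Thursday"), lambda row: not uses_hashtags(row))

print(overall_hashtag, sunday_share, monday_hashtag, thursday_no_hashtag)
```

With the real data, the list of rows would be replaced by the exported dataset; the toy values carry no meaning beyond showing the calculation.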

Classification of Bots in Tweets

Confusion Matrix Findings

A model that classifies tweets as written by a bot or not was analyzed. Its confusion matrix (actual class against predicted class) is summarized below.

  • Actual Twitter bot: 3 predicted as bots, 4 predicted as not bots (total 7).

  • Actual not bot: 2 predicted as bots, 9 predicted as not bots (total 11).

Percentage Calculations

  • Overall Accuracy of the Model: Number of correct predictions divided by the total number of tweets.

  • Percentage of Actual Bots: Number of tweets actually written by bots divided by the total number of tweets.

  • Predicted Bots Accuracy: Percentage of tweets predicted as bots that were actually written by bots.

  • Non-bot Predictions: Percentage of tweets not predicted as bots that were actually written by bots.
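The four percentages can be worked out directly from the confusion matrix counts. The sketch below uses the counts given under Confusion Matrix Findings. The count of non-bot tweets predicted as not bots is not stated outright, so it is derived as 11 - 2 = 9. The function and key names are illustrative only.

```python
# Counts from the confusion matrix findings.
bot_pred_bot = 3                  # actual bot, predicted bot
bot_pred_not = 4                  # actual bot, predicted not bot
not_pred_bot = 2                  # actual not bot, predicted bot
not_pred_not = 11 - not_pred_bot  # actual not bot, predicted not bot (derived: 9)


def confusion_percentages(tp, fn, fp, tn):
    """Return the four requested percentages from confusion matrix counts."""
    total = tp + fn + fp + tn
    return {
        # Correct predictions over all tweets.
        "overall_accuracy": 100 * (tp + tn) / total,
        # Tweets actually written by bots over all tweets.
        "actual_bots": 100 * (tp + fn) / total,
        # Of the tweets predicted as bots, the share that really were bots.
        "predicted_bots_correct": 100 * tp / (tp + fp),
        # Of the tweets not predicted as bots, the share that really were bots.
        "not_predicted_but_bots": 100 * fn / (fn + tn),
    }


results = confusion_percentages(bot_pred_bot, bot_pred_not, not_pred_bot, not_pred_not)
for name, value in results.items():
    print(f"{name}: {value:.1f}%")
```

With these counts the total is 18 tweets, so overall accuracy is 12/18 (about 66.7%), actual bots are 7/18 (about 38.9%), predicted bots that were bots are 3/5 (60.0%), and non-bot predictions that were actually bots are 4/13 (about 30.8%).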

Visual Representation of Model Results

A visual shows the results of the tweet classification model for 34 tweets. Color coding indicates each tweet's actual authorship and the model's prediction. The complete confusion matrix is to be filled in from the visual.
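Filling in a confusion matrix from a visual amounts to counting each (actual, predicted) combination. A minimal sketch of that tally follows. The pairs are hypothetical and are not the 34 tweets from the visual, which is not reproduced here. The label strings are also assumptions.

```python
from collections import Counter

# Hypothetical (actual, predicted) pairs for illustration only;
# these are NOT read from the 34-tweet visual.
pairs = [
    ("bot", "bot"),
    ("bot", "not bot"),
    ("not bot", "not bot"),
    ("not bot", "bot"),
    ("not bot", "not bot"),
]


def build_confusion_matrix(pairs, labels=("bot", "not bot")):
    """Count tweets per (actual, predicted) cell; rows are actual classes."""
    counts = Counter(pairs)
    return {
        actual: {predicted: counts[(actual, predicted)] for predicted in labels}
        for actual in labels
    }


matrix = build_confusion_matrix(pairs)
for actual, row in matrix.items():
    print(f"actual {actual}: {row}")
```

Reading one pair per tweet off the visual and passing the full list to the same function would give the four cells of the matrix; the cell counts should sum to 34.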