Big Data and Advanced Analytics in Modern Marketing

The Relationship Between Big Data and Secondary Data

  • Conceptual Link: Big data is fundamentally a form of secondary data because it consists of information already existing within an organization. It is “lying all around” rather than being gathered for a specific, immediate research purpose.
  • Distinction from Traditional Secondary Data: While it falls under the same category, big data is characterized as being significantly more powerful and complex than the traditional secondary data sources previously discussed.
  • Complexity Level: Big data is not a singular set of data but refers to a whole series of diverse data sets that are extremely large and complex.

Defining Characteristics and Technical Requirements

  • Data Forms: Big data encompasses various formats, including:
     * Structured data (organized).
     * Unstructured data (loose or disorganized).
     * Textual data (treated as data by computers, even though humans read it as articles).
  • Analysis Limitations: Due to its “enormity,” big data cannot be analyzed using standard analytical techniques. It requires:
     * Transformation: Converting data into a state where statistical techniques can be applied.
     * Machine Learning: Utilizing highly sophisticated computer-driven analysis, including supervised and unsupervised learning, to gain deep insights.
     * Pattern Recognition: The focus is primarily on identifying patterns, rather than on the tests of statistical significance or group differences used with smaller samples.
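
The transformation step above can be illustrated with a small Python sketch. It is only a minimal example under stated assumptions: the customer comments are invented, and the helper names `tokenize` and `bag_of_words` are hypothetical. It turns free text into a table of word counts (a "bag of words"), which is one common way to put unstructured text into a state where statistical or machine learning techniques can be applied.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def bag_of_words(documents):
    """Turn free-text documents into a (vocabulary, count matrix) pair.

    Each row of the matrix is one document; each column is one word.
    """
    counts = [Counter(tokenize(doc)) for doc in documents]
    vocabulary = sorted(set().union(*counts))
    matrix = [[c[word] for word in vocabulary] for c in counts]
    return vocabulary, matrix


# Hypothetical customer comments, invented for illustration only.
reviews = [
    "Great coffee, great service",
    "Slow service",
    "Coffee was cold",
]

vocabulary, matrix = bag_of_words(reviews)

print(vocabulary)
for row in matrix:
    print(row)
```

Once the text is in this numeric form, each row can be fed to a supervised or unsupervised learning method. Real projects would typically rely on an established library rather than hand-written helpers like these.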

Physical Infrastructure and Financial Scale

  • Storage Reality: Big data cannot be stored on laptops or standard servers, and it far exceeds typical consumer cloud storage plans. It requires “hyperscale” data centers.
  • Scale Comparisons: The data is stored in massive buildings that can reach the size of a couple of rugby grounds.
  • Investment Case Studies:
     * Microsoft (2022): Invested approximately 300,000,000,000 to build data centers globally to house interactions from individuals, companies, and governments.
     * Datagrid (Invercargill, New Zealand): A company recently approved to build a data center in Invercargill valued at 3,500,000,000.
     * Regional Scope: The Datagrid center will house data for the Southern Hemisphere, including Australia and Pacific countries.
     * Physical Footprint: The space allocated for this project is roughly 48 hectares.
     * Growth Trends: The 3.5 billion investment for a regional center is nearly ten times the scale of major investments from just two years prior, highlighting the explosive growth of the sector.

The 5 Vs of Big Data

Automatic/Computer-Generated Qualities
  1. Volume: Massive, mega-scale quantity of data that exceeds human comprehension.
  2. Velocity: The extreme speed of data generation. Data is generated in real-time as people talk, move with mobile phones, or record digital interactions.
  3. Variety: A vast range of data types including expressions, movies, videos, conversations, and audio, all stored as computer bytes.
Human-Assigned/Marketer Qualities
  4. Value: Data has no inherent meaning until researchers and marketers assign value to it. They do this by applying assumptions, theories, and domain knowledge (health, medicine, etc.) to identify trends and build predictive models.
  5. Veracity: Refers to the confidence and trust that can be placed in the data. This addresses issues of “fake news” or artificially generated content.
     * Policies: Veracity is managed through ethical frameworks and government regulations.
     * Capital Analogy: Big data should be viewed as a general-purpose utility or “capital,” similar to electricity and water supply, requiring safeguarding and rules for consumption.

Strategic Importance in Business and Marketing

  • Efficiency: Enables better and faster decisions, driving productivity and growth.
  • Real-Time Analysis: Traditional survey or secondary research takes two months to a year to produce a report. Big data allows for near-instantaneous analysis.
  • Precise Targeting: Connected devices (fridges, washing machines, laptops) collect real-time data about user habits, allowing for much sharper segmentation and targeting.
  • New Product Development (NPD):
     * Traditionally, more products fail than succeed on the shelf, leading to bankruptcy or resource burnout.
     * Big data reduces this risk by capturing consumption behavior directly from the household.
     * Example: Food in the Bag tracks tastes and preferences over 6 months to push specific products to individual customers.
  • Cost and Error Reduction: While human error still exists in data judgment, big data allows companies to “rebound” and correct errors within 24 hours.

Data Categorization Types

  • Structured Data (20%): Neatly organized in spreadsheets such as Excel or in databases such as Oracle systems. This is primarily used by accountants and financial professionals for sales and expenditure analysis.
  • Unstructured Data (80%): Makes up the vast majority of global data. It currently remains mostly untouched because many professionals do not know how to analyze it. This presents a massive opportunity for marketers to take control of organizational decision-making.

Analytical Methods and Technical Skills

  • Data Mining: Digging into “data mountains” to find patterns or “nuggets” of information that change consumption behavior.
  • Coding Skills: Marketers must move beyond traditional skills and learn to code in languages and tools such as R, Python, and SAS.
  • Neural Networks: A form of AI loosely modeled on the human brain, built from interconnected "nodes" and the relationships between them. It relies on a foundation of existing knowledge to “hook” new information onto.
  • Sentiment Analysis: Using Natural Language Processing (NLP) to understand the “mood” of customers. This can be more accurate than traditional surveys, which are often tainted by “social desirability bias.”
  • Market Basket Analysis: Organizing customer purchases (similar to a checkout clerk organizing groceries into bags) to understand buying patterns.
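
Market basket analysis, as described in the list above, can be sketched in a few lines of Python. This is a minimal illustration, not a production method: the baskets are invented, and the function names `pair_support` and `confidence` are hypothetical. It computes two standard measures. Support is the share of all baskets that contain a given pair of items. Confidence is the share of baskets containing one item that also contain another.

```python
from collections import Counter
from itertools import combinations


def pair_support(baskets):
    """Return the fraction of baskets that contain each pair of items."""
    pair_counts = Counter()
    for basket in baskets:
        # Sort so that each pair is always stored in the same order.
        for pair in combinations(sorted(set(basket)), 2):
            pair_counts[pair] += 1
    return {pair: n / len(baskets) for pair, n in pair_counts.items()}


def confidence(baskets, antecedent, consequent):
    """Estimate how often `consequent` appears in baskets with `antecedent`."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    both = sum(1 for b in with_antecedent if consequent in b)
    return both / len(with_antecedent)


# Hypothetical checkout data, invented for illustration only.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "cereal"},
]

support = pair_support(baskets)
print(support[("bread", "butter")])             # share of baskets with both
print(confidence(baskets, "butter", "bread"))   # butter buyers who buy bread
print(confidence(baskets, "bread", "butter"))   # bread buyers who buy butter
```

In this toy data every butter buyer also buys bread, but only two of the three bread buyers buy butter, so the relationship is not symmetric. That asymmetry is the kind of pattern a marketer would use when deciding which product to recommend alongside another.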

AI Training, Prediction, and Risks

  • Training Process: AI is trained by feeding it vast quantities of data to recognize concepts.
     * Concept Example: To teach "Big," a computer is shown elephants, trucks, and whales. For "Small," it is shown chihuahuas, beetles, and monkeys.
  • Predictive Nature: Tools built on Large Language Models (LLMs), such as OpenAI's ChatGPT and Microsoft Copilot, do not have brains; they are predictive models that choose the next most likely word in a sentence based on patterns in their training data.
  • Hallucination: A known issue where AI generates plausible-sounding but incorrect information.
  • Google Bombing and Fake News:
     * Historical Example: A prank where users loaded Google with images of a distorted monkey and labeled them "Michelle Obama," causing the search engine to return those images for her name.
     * Implication: This demonstrates how the public can train a computer to believe and spread incorrect information, forming the basis of “fake news.”
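
Two of the points above, next-word prediction and the way misleading input can skew a model, can be illustrated with a toy Python sketch. It is a deliberately simplified stand-in rather than how real LLMs work: it only counts which word follows which (a "bigram" model), the training sentences are invented, and the names `train_bigrams` and `predict_next` are hypothetical.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count, for every word, which words follow it in the training text."""
    following = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            following[current][nxt] += 1
    return following


def predict_next(following, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    candidates = following.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


# Hypothetical training sentences, invented for illustration only.
corpus = [
    "the elephant is big",
    "the whale is big",
    "the beetle is small",
]
model = train_bigrams(corpus)
clean_guess = predict_next(model, "is")

# Flooding the training data with a misleading sentence shifts the pattern.
poisoned_corpus = corpus + ["the whale is small"] * 5
poisoned_model = train_bigrams(poisoned_corpus)
poisoned_guess = predict_next(poisoned_model, "is")

print(clean_guess)     # big
print(poisoned_guess)  # small
```

The model has no notion of truth. It repeats whatever pattern is most frequent in its training data, which is why a coordinated flood of mislabeled content can change what it returns.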

Educational Outlook and the Skills Gap

  • AI Illiteracy: There is a noted lack of digital readiness among students and businesses in New Zealand, often characterized by a lack of trust in digital agents.
  • Skill Obsolescence: Traditional survey research skills are becoming outdated. Students must aggressively learn analytical skills over the next 2-3 years to remain competitive.
  • Future Employment: While AI may consume routine marketing assistant jobs, those who can handle unstructured data and sophisticated analytics will see increased opportunities.