Marketing Research Chapter 1 (Marketing, Marketing Strategy, and Marketing Research)
Marketing is defined as the activity, set of institutions, and processes for creating, communicating, delivering, and exchanging offerings that have value for customers.
The marketing process includes:
Situation Analysis
Strategy
Tactics
Objectives
Situation Analysis comprises the 3 C’s: Company, Customers, and Competitors, and sometimes Collaborators and Context.
Tactics comprise the 4 P’s: Product, Price, Place, and Promotion.
Objectives include Margins, Return on Investment (ROI), and Customer Lifetime Value (CLV).
Marketing Strategy
Marketing Strategy is defined as a roadmap for achieving marketing goals by:
Identifying the target audience
Differentiating the brand through a unique value proposition
Utilizing the marketing mix (4 P’s: product, price, place, and promotion) to effectively position offerings in the market.
Strategy: STP
Segmentation
Targeting
Positioning
Marketing Research
Marketing Research is the process of designing, gathering, analyzing, and reporting information that may be used to solve a specific marketing problem.
Reasons to Learn Marketing Research (Individual Level):
Career Growth Opportunities
Competitive Edge
Effective Communication
Reasons to Learn Marketing Research (Firm Level):
Improved Decision-Making
Enhanced Customer Understanding
Optimized Marketing Campaigns
Increased ROI
Competitive Advantage
Uses of Marketing Research are:
Identify marketing opportunities and problems
Generate, refine, and evaluate potential marketing actions
Monitor marketing performance
Chapter 2 (Marketing Research Process)
There are 3 preliminary steps during the marketing research process.
Research Purpose
Research objective
Estimating Value of research information
There is a five-step marketing research process.
Define Research Purpose and Objectives
Research Design
Data Collection
Data Analysis and Insight Development
Communicate the insight
Defining the research purpose is the first critical step during the marketing research process. It is crucial that the purpose or problem is correctly defined; if it is not, all the steps afterwards will be seen as a complete waste of time and resources. After the problem has been identified, the next step is to formulate the research objectives for that problem. A research objective specifies the information needed in order to solve the problem at hand.
A research design is the framework or approach used to meet research objectives. It determines how the data is collected, analyzed, and interpreted. There are three main types of research design:
Exploratory Design
Descriptive Design
Causal Design
Data Collection involves two types of information: Primary Information and Secondary Information. Primary Information is collected specifically for the problem at hand; examples are surveys, focus groups, and observations. Secondary Information has already been collected for another purpose and is available for use; examples are government reports and industry publications.
Communicating the Insight is the final step of the marketing research process. It involves presenting the findings to stakeholders in a clear and concise manner.
A problem is the gap between what was supposed to happen and what actually happened. To solve a problem, managers have to get to its root and find a strategic solution.
A marketing opportunity is a favorable situation where a company can achieve growth or competitive advantage. An example is discovering new uses for a product, such as baking soda.
Discovering the problem at hand and identifying marketing opportunities are the foundation for effective marketing research.
Research Purpose, Research Objectives, and Hypothesis
The research purpose is the study’s main goal, which guides its objectives and focus. A research objective is a clear statement or question that specifies the information needed to solve a problem or seize an opportunity. A hypothesis is a testable statement that predicts the relationship between variables or outcomes based on assumptions or observations. Hypothesis testing is used in marketing research to focus research efforts, provide clarity, and support decision-making. Methods for hypothesis testing include experiments and focus groups.
A construct is an abstract concept that is not directly measurable but can be represented through measurable variables or attributes. Examples are Brand Loyalty, Customer Satisfaction, and Intention to Purchase.
Key characteristics of Constructs
Abstract in Nature
Multi-Dimensional
Validity and Reliability
Operationalization
Marketing Research is NOT the best solution when the information is already available, the timing is wrong, or the cost outweighs the value. On the value of research: if the cost of the research outweighs the value of the information it would provide, marketing research is not needed.
Chapter 3 (Research Design)
Research design specifies the methods used to collect and analyze information for a research project. A research design addresses ethical issues early in the process and provides a structured framework for addressing problems systematically, which saves time and cost through preplanning.
Research Design is classified into three categories: Exploratory, Descriptive, and Causal. The choice of design depends largely on the objectives of the research.
The purpose of Exploratory Design is to gain background information and develop hypotheses. When to use it, and its key characteristics:
When little is known about the problem or topic
To explore ideas or generate insights for further research
Informal
Flexible
Unstructured
Examples of exploratory design are Interviews, case studies, secondary data, and focus groups
Uses of Exploratory Research:
Gain Background Information
Define Terms (Satisfaction with service quality)
Clarify problems and hypotheses
Establish research Priorities
The purpose of Descriptive Research Design is to describe the state of a phenomenon or variable. When to use it, and its key characteristics:
When you need to quantify or describe specific characteristics of a population.
Structured
Pre-planned
Quantitative
Examples of descriptive research design are surveys, observations, and panel studies.
Uses of Descriptive Research:
To describe the characteristics of certain groups. For example, how do the characteristics of online buyers differ from those of offline buyers?
To estimate the proportion of people who behave a certain way. For example, what percentage of customers are responsive to a marketing promotion?
The purpose of Causal Research Design is to test cause-and-effect relationships between variables. When to use it, and its key characteristics:
When you want to determine how one variable influences another
Highly controlled
Experimental in nature
Experimental Design is a procedure for creating an experimental setting in which a change in a dependent variable can be attributed to a change in an independent variable.
Examples of causal relationships tested with an experimental design:
If advertising increases (x), then sales will also increase (y).
If price increases (x), then customer purchases will decrease (y).
Types of Experimental Designs:
Before-After Testing
A/B Testing
A/B testing simultaneously tests two or more versions of an independent variable to see which performs better. An example would be comparing two advertisements to see which one generates more clicks or engagement.
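The advertisement comparison above could be analyzed along the following lines. This is a minimal sketch in R, not a prescribed method: the impression and click counts are made-up numbers, and prop.test() from base R is only one of several reasonable ways to compare two click rates.

```r
# Hypothetical results for two ad versions (made-up numbers)
impressions <- c(A = 1000, B = 1000)
clicks      <- c(A = 50,   B = 80)

# Click-through rate for each version
ctr <- clicks / impressions
print(ctr)

# Two-sample test of equal proportions (base R 'stats' package)
ab_test <- prop.test(x = clicks, n = impressions)
print(ab_test$p.value)

# Version with the higher observed click-through rate
better <- names(ctr)[which.max(ctr)]
print(better)
```

A small p-value suggests the difference in click rates is unlikely to be due to chance alone, which is what gives the comparison its causal interpretation when ads are assigned at random.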
An experiment is valid if it has both internal validity and external validity.
Internal Validity is the extent to which the observed change in the dependent variable is actually caused by the independent variable. Without internal validity, the results cannot be trusted.
External Validity is the extent to which the relationship observed in the experiment generalizes to the real world.
Test Marketing reduces risk by assessing product performance before a full-scale launch. The Detective Funnel is the combination of all three research designs (exploratory, descriptive, and causal).
Chapter 4 (Secondary Sources of Marketing Data)
There are two types of data, primary and secondary data.
Primary data is collected directly by the researcher for a specific project and is used to address the research problem at hand. Examples are experimental research, surveys, and focus groups.
Secondary data is collected by someone other than the user, for a purpose other than solving the problem at hand. Examples are the Census, transaction records, and government reports.
There are two types of Primary data: Observational data and Questionnaire data.
Types of secondary data:
Internal Secondary data
External Secondary data
Internal secondary data is collected within an organization. Examples are sales records, invoices, and customer complaints.
External secondary data is collected outside of an organization. Examples are government reports and market research studies.
Key sources of External Secondary data are:
Published sources
Official Statistics
Data Aggregators
Advantages of Secondary Data
Cost-Effective (Cheaper than Primary Data)
Timesaving (Requires less time compared to Primary Data)
Wide Availability
Disadvantages of Secondary Data
Incompatible Reporting Units (Data may not match)
Mismatched Measurement Units (Differences in how data is measured)
Outdated Data
Lack of Credibility
Packaged Information is a type of External Secondary Data in which the collection process and/or the data itself is prepackaged for all users. Packaged information has two broad classes: Syndicated Data and Packaged Services.
Syndicated data is a data set that is shared with multiple marketing companies.
Packaged services are predefined research processes applied to produce information for an individual client.
Advantages of Syndicated data:
High quality data
Cost-Effective
Quick Access
Disadvantages of Syndicated Data:
Lack of Customization (data is standardized)
Long-term contracts (minimum 3 years contract)
Easy Access to Information (competitors can buy the same data)
Digital Tracking Data is another form of External Secondary Data. It is generated from users’ online activities such as browsing websites, using apps, or interacting with ads.
Chapter 5 (Qualitative Research Techniques)
There are two types of research methods:
Qualitative Research – Explores emotions, opinions, and motivations through non-numerical data.
Quantitative Research – Focuses on numerical data and statistical analysis.
Mixed methods Research combines both Qualitative and Quantitative research.
Characteristics of Quantitative Research:
Uses surveys, polls, or experiments
Data is measurable and testable
Characteristics of Qualitative Research:
Open-ended questions and observations
Responses are unique and can be categorized as being negative, positive, or neutral
Qualitative Research: Observational Techniques
Observational Techniques are one form of qualitative research. They involve watching and recording people, objects, or activities in an organized way. There are three ways to classify observation:
Direct vs. Indirect
Covert vs. Overt
Structured vs. Unstructured
Direct vs. Indirect Observation
Direct Observation observes behavior as it happens in real time. An example would be watching how shoppers select fresh produce.
Indirect Observation observes the effects or outcomes of past behavior. An example would be Waste Management conducting a trash study to learn about recycling habits that could impact the environment.
Covert vs. Overt Observation
In Covert Observation, customers are unaware that they are being observed. An example would be hidden cameras recording shoppers’ behavior. An advantage is that it captures authentic, unaltered behavior.
In Overt Observation, customers are aware that they are being observed. An example would be tracking TV viewership.
Structured vs. Unstructured Observations
Structured Observation identifies predefined behaviors and records them using forms or checklists. An example would be monitoring how customers interact with products on display. An advantage is that it is efficient and focuses on specific behaviors.
Unstructured Observation has no restrictions; all behavior within the environment is recorded. An example would be watching children play with Legos. An advantage is that it is more flexible.
Advantages of Observational Techniques:
Natural Behavior (Subjects are unaware that they are being watched)
No Recall Errors (Real-Time observations)
Cost-effective
Unique Insights
Complementary Methods
Disadvantages of Observational Techniques
Small Sample Size
Subjective Interpretation (Observer Bias can affect conclusions)
Lacks Depth (Cannot uncover motives, attitudes, and intentions)
Restricted Scope (Only suitable for behavior that occurs in public)
Observer Effect (Awareness of observation may alter behavior in overt studies)
Focus Groups
Key Components of Focus Groups are
Moderator skills (Excellent observation and communication skills)
Focus Group Reports (Summarize participants’ insights into categories and themes)
Use of results (Exploring needs, attitudes, and preferences)
Advantages of focus groups
Encourages open discussions
Provides rich qualitative data that reveals customers’ needs
Disadvantages of focus groups
Small sample sizes
Discussions can be influenced by dominant participants
Requires skilled moderators
Focus Groups should be used when the objective is to explore or describe, not to predict. They are also useful for generating ideas and for gaining insight into how consumers react to a specific product.
Focus groups SHOULD NOT be used to predict outcomes, because the samples are small, or for quantitative analysis, because focus groups do not provide standardized, measurable data.
Ethnographic Research
Ethnographic Research is a qualitative research method that provides a descriptive study of a group’s behavior and characteristics. Its purpose is to gain a deep understanding of consumer behavior in a natural setting over a period of time.
Methods of ethnographic research:
Immersion (Researchers embed themselves in participants’ environments)
Participant Observation (Observing and interacting with participants in a natural setting)
In-Depth Interviews
Shopalongs (Accompanying shoppers to observe their behavior during shopping trips)
Advantages of Ethnographic Research
Provides rich contextual insights into consumer behavior
Captures real life interactions with products and services
Helps identify unmet needs
Disadvantages of ethnographic Research
Time intensive and requires prolonged observation
Relies on researcher’s interpretation, which can be biased
Mobile ethnography may miss unconscious behaviors
Marketing Research Online Communities (MROC)
A Marketing Research Online Community (MROC) is a group of respondents who come together online to interact, provide opinions, and complete tasks for research purposes. Participants can interact by sharing photos, posts, and videos.
Key Benefits of MROC:
Cost effective and Flexible
Rich Data Collection (Captures multimedia responses like videos and posts)
Convenience for Participants
Targeted Insights
Global Reach
Challenges of MROC:
Engagement drop off (Participants may lose interest over time)
Representation Issues
Data Overload
Limited Moderator Control
There are additional qualitative techniques which are:
Neuromarketing
Projective Techniques/indirect approach
In-Depth Interviews (IDI)
An In-Depth Interview (IDI) is a one-on-one interview that explores the respondent’s thoughts and behaviors. An advantage is that it generates rich, detailed responses. A challenge is that it requires skilled interviewers.
Projective Techniques (the indirect approach) are used when an individual prefers not to open up in a formal discussion. Examples are third-person techniques, word association, and completion tests.
Neuromarketing is the study of how the brain and body react to marketing stimuli such as ads or products. Key neuromarketing techniques:
Neuroimaging, such as MRI
Eye tracking
Facial Coding
Chapter 6 (Survey Data Collection Methods)
A survey is a method used to collect information from people by asking them structured verbal or written questions. Surveys normally have fixed, standardized questions, meaning everyone answers the same set of questions. Surveys are used primarily for descriptive research.
Advantages of Surveys
It is Standardized
Efficiency (Quickly gather large data sets from broad samples)
Easy Administration (It is simple for interviewers and respondents)
Motivation Insights
Quick Analysis
Subgroup Insights
Disadvantages of surveys
Validity loss (Fixed-response formats may fail to capture true beliefs)
Limited Depth (May not capture complex opinions or feelings)
Response Bias (Respondents may give socially desirable answers rather than honest ones)
Sampling Issues
Question Misinterpretation
Sources of Error in Survey Methods: There are four types of error when conducting a survey:
Sampling Errors
Measurement Errors
Measurement Instrument Error
Processing Error
Sampling Errors happen when the sample used does not fully represent the population we want to study. There are three types of sampling error: Frame Error, Population Specification Error, and Selection Error.
Frame Error occurs when the sampling frame is incomplete or inaccurate.
Population Specification Error happens when the wrong group is chosen for a survey. An example would be including high school students in a survey about a liquor store.
Selection Error happens when the sampling process isn’t carried out correctly.
Measurement Error occurs when the data collected doesn’t match what we need. There are two types of measurement error: Surrogate Information Error and Interviewer Error.
Surrogate Information Error occurs when researchers collect information that doesn’t solve the problem.
Interviewer Error occurs when the interviewer influences the respondents’ answers, consciously or unconsciously. This can happen because of the interviewer’s age, gender, or facial expressions.
Measurement Instrument Error results from a poorly designed questionnaire. An example would be questions that are unclear or easy to misinterpret.
Processing Error happens when survey data is transferred incorrectly to a computer. An example would be scanning a document incorrectly.
Data Collection Methods: Interviewer and Computer Technology
When there is an interviewer but no computer, the method is called a Person-Administered Survey. When there is both a computer and an interviewer, the method is called a Computer-Assisted (Person-Administered) Survey.
When there is no computer and no interviewer, the method is called a Self-Administered Survey. When there is a computer but no interviewer, the method is called a Computer-Administered Survey.
A Person-Administered Survey takes place in person or over the phone with no computer present. This method can take place at home (In-Home Interviews), at the mall (Mall-Intercept Interviews), in the office (In-Office Interviews), and over the phone (Telephone Interviews). Advantages of Person-Administered Surveys are feedback, quality control, and adaptability. Disadvantages are human error, slow speed, and high cost.
Computer-Assisted Surveys come in two types: CATI (computer-assisted telephone interview) and CAPI (computer-assisted personal interview).
Advantages of Computer-Assisted Surveys
Speed
Error-Free Interviews (Computers ensure that there are no errors in question sequencing)
Use of Image and Audiovisuals (Showing a video of a new product during a survey for feedback).
Disadvantages of Computer-Assisted Surveys
High Setup Costs
Technical Skill Requirement (interviewers need training to operate the systems effectively).
A Self-Administered Survey is one that respondents complete on their own, without the aid of another person or a computer system. Self-Administered Surveys can be conducted as Group Self-Administered Surveys, Drop-Off Surveys, or Mail Surveys.
Group Self-Administered surveys are conducted in a group setting.
Drop-Off surveys are delivered to the respondent for later completion and return.
Mail Surveys are sent via postal mail for respondents to complete and return.
Advantages of Self-Administered Surveys
Reduced Cost (No need for interviewers or computer systems). An example would be paper surveys.
Respondent Control (Respondents can complete the survey at their own pace).
Reduced Interview-Evaluation Apprehension (Ideal for sensitive topics). An example would be an anonymous health behavior survey.
Disadvantages of Self-Administered Surveys
Respondent Control (Risk of incomplete responses, errors, or delayed return)
Lack of monitoring (No researcher available to clarify certain questions)
High Questionnaire Requirements (Requires clear instructions about the survey). For example, a poorly designed questionnaire may frustrate respondents and reduce the completion rate.
A Computer-Administered Survey is one in which a computer asks the questions and records the respondents’ answers. This type of survey does not require an interviewer.
Ways of Conducting a Computer-Administered Survey:
Online Surveys
Interactive Voice Response (IVR). An example would be a post-call customer satisfaction survey (“press 5 for very satisfied”).
Advantages of Computer-Administered Surveys
User-Friendly (Easy to design and use)
High Efficiency
No Interviewer
Disadvantages of Computer-Administered Surveys
Requires Computer-Literate Respondents
Respondent Misrepresentation
Chapter 7 (Survey Design/Attitude Measurement)
Measurement is the process of quantifying, that is, assigning numbers to properties of objects such as consumers, brands, stores, or advertisements. Key characteristics of standardized measurements are consistency and uniformity.
Objects are the entities being studied during the research process.
Properties are the specific features of an object. Examples are Brand Loyalty and Customer Satisfaction.
Objective vs. Subjective Properties in Measurement
Objective Properties are observable properties that are tangible and physically verifiable. Examples are age, income, and number of bottles purchased.
Subjective Properties are unobservable mental constructs such as attitudes or intentions. Examples are Customer Satisfaction, Brand Loyalty, and Purchase Intent. These can be measured with a rating scale, such as rating a product from 1 to 5.
Types of Measurement Scale
There are four types of measurement scale:
Nominal
Ordinal
Interval
Ratio
A Nominal Scale is used to label or classify data into categories. Examples include:
Demographic (Gender, Race, Religion)
Behavioral Data (Brand last Purchased)
Other categories such as occupation or buyer/nonbuyer
This type of data is only used for descriptive research.
An Ordinal Scale allows respondents to rank-order a set of products or brands. For example, ranking four shoe brands from 1 to 4.
An Interval Scale is a subjective scale that allows a respondent to rate a product’s features on a scale with equal distances between points. An example would be rating the taste of Starbucks coffee on a scale of 1 to 5. NOTE: An interval scale has no true zero point.
A Ratio Scale includes all the properties of nominal, ordinal, and interval scales, with the addition of a true zero point. An example would be “How many pairs of shoes do you have?”, where the answer choices start at 0 (0, 1, 2, and so on).
The Interval Scale is commonly used in marketing research. It measures constructs such as customer satisfaction, brand loyalty, and purchase intentions.
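The four scale types above map loosely onto R data types. The sketch below is one possible representation, assuming made-up answers from five respondents; the variable names are hypothetical and chosen only for illustration.

```r
# Hypothetical answers from five respondents (made-up data)
brand_last_purchased <- factor(c("Nike", "Adidas", "Nike", "Puma", "Nike"))  # nominal
brand_rank <- factor(c("1st", "3rd", "2nd", "1st", "2nd"),
                     levels = c("1st", "2nd", "3rd"), ordered = TRUE)        # ordinal
taste_rating   <- c(4, 5, 3, 4, 2)   # interval: 1-to-5 rating, no true zero
pairs_of_shoes <- c(0, 2, 5, 1, 3)   # ratio: 0 really means "none"

# Nominal: only counts per category are meaningful
table(brand_last_purchased)

# Ordinal: rank positions can be compared
brand_rank[1] < brand_rank[2]

# Interval and ratio: averages are meaningful
mean(taste_rating)
mean(pairs_of_shoes)
```

Storing nominal and ordinal answers as factors rather than plain numbers helps prevent accidentally averaging category labels.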
A Likert scale measures a respondent’s level of agreement or disagreement with statements. It is a mixture of ordinal and interval scales.
A Semantic Differential Scale is a type of scale that measures subjective perceptions like attitudes and emotions towards an object, concept or experience. Key features of Semantic Differential Scale are:
Bipolar Adjectives
Continuum of Intensity
Quantifiable results
Random flipping of adjective pairs (high price vs. low price in one item, low price vs. high price in another)
A Stapel Scale is a unipolar rating scale that measures respondents’ attitudes toward an object using a single adjective and a numerical scale. An example is: “How would you rate the helpfulness of Chase customer service?”
+5 +4 +3 +2 +1 0 -1 -2 -3 -4 -5
The 0 represents neutrality, if included.
A Slider Scale is an interactive graphical scale where respondents indicate their answers by dragging a slider to a specific value on a continuum. Advantages of Slider Scales are:
Engaging
Efficient
Mobile-Friendly
Disadvantages of Slider Scales:
Learning Curve (May confuse some respondents)
Bias Risk
Uncertain Quality
Issues with Interval scales:
Neutral option response
Symmetric vs. Nonsymmetric: Symmetric Scales are balanced, with equal numbers of positive and negative points. Nonsymmetric Scales focus only on positive responses and omit negative responses.
Chapter 8 (Designing a Questionnaire)
A Questionnaire is a tool that is used to collect data by presenting standardized questions to respondents. There is a six-step process for designing a questionnaire:
Define Research Objectives
Develop Questions
Determine Question Flow (start with general questions, move to specific ones, and end with sensitive topics)
Pretest the Questionnaire
Client Review and Approval
Launch Survey
The four Do’s of Question Wording:
Focused: Address a single issue or topic
Brief: Avoid unnecessary wording
Grammatically Simple: Use short sentences with one subject
Crystal Clear: Make sure your questions are understandable to the respondents.
The four Do Not’s of Question Wording
Leading Questions (Don’t you worry when using your credit card online?)
Loaded Questions (Should people be allowed to protect themselves from harm by using a taser in self-defense?)
Double-Barreled Questions (Were you satisfied with the restaurant’s food and service?)
Overstated Questions (How much would you pay for sunglasses that protect against UV rays known to cause blindness?)
Question Flow for a Questionnaire
The introduction is a critical part of questionnaire design because it sets the stage for the survey.
Five key functions of an Introduction:
Who is doing the survey?
What is the survey about?
How did you select me?
Motivate me to participate?
Am I qualified to take part?
Who is doing the survey? – The introduction should clearly state who is conducting the survey. This includes introducing the interviewer or any sponsor. In an undisguised survey, the sponsor is disclosed; in a disguised survey, the sponsor is withheld to avoid influencing responses.
What is the survey about? – The introduction should state the general purpose of the survey simply and briefly.
How did you select me? – The introduction should explain why the respondent was chosen, for example, “You were selected at random.”
Motivate me to participate? – The introduction should politely ask for participation, for example, “Would you mind answering a few questions for me?” Offer an incentive if possible, such as monetary rewards, product samples, or discounts. Address privacy concerns, for example by guaranteeing anonymity and confidentiality.
Am I qualified to take part? – The introduction should contain screening questions to determine whether the respondent qualifies for the survey.
Chapter 10 (RStudio)
RStudio – An Integrated Development Environment (IDE) for R that provides a user-friendly interface.
RStudio has four main windows, each serving a unique purpose:
Script Files
Environment
Console
Misc
Script Files – This window saves your scripts, holds code and comments, and can have multiple files open at a time.
Environment – This window holds your objects and lets you review command history.
Console/Command Line – This window can be used as a calculator. It does not save code, and it is where output is displayed.
Misc – This window displays files in the working directory, shows plots when they are produced, and helps with searching for files.
Script Editor: Write and save R scripts. Great for longer code blocks.
Workspace Environment: View saved data objects and command history.
Console: Run R commands directly here. Results appear instantly.
Files/Plots/Packages/Help: Access files, view plots, manage packages, and read documentation.
Key Features of the Script Editor
The Script Editor (top-left window in RStudio) is where you write and manage code you want to keep and refine.
Code Completion: Speeds up coding by suggesting code options.
Multiple-File Editing: Switch between open scripts effortlessly.
Find/Replace: Quickly search and replace text in scripts.
Workspace Environments
The Workspace Environment (top-right window) displays your current R working environment, including any user-defined objects.
Code Used When Managing Objects
ls() – Lists all objects in the workspace environment
rm(x) – Removes the object named x from the workspace environment
rm(list = ls()) – Removes all objects from the workspace environment
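The three commands above can be tried in sequence. This is a small sketch using hypothetical object names (sales, region, survey_n) that are not part of the course material.

```r
# Create three hypothetical objects in the workspace
sales    <- c(120, 95, 140)
region   <- "East"
survey_n <- 250

ls()                 # lists the objects currently in the workspace

rm(region)           # removes only the object named 'region'
exists("region")     # FALSE once it has been removed

rm(list = ls())      # removes every remaining object
length(ls())         # 0 when the workspace is empty
```

The last rm() call clears everything in the workspace, so anything worth keeping should be saved in a script first.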
User-Defined Objects include:
Vectors, Matrices, Data Frames, Lists, Functions
Miscellaneous Displays
The bottom-right window in RStudio has multiple useful tabs:
Files: Shows available files in your working directory.
Plots: Displays any plots or graphics generated by your code.
Packages: Lists all installed packages and shows which ones are currently loaded.
Help: Search for help topics or view help documentation for commands.
Chapter 11 (R Basics 2)
tidyverse – A collection of R packages designed for data science. It allows users to write simple, readable, and efficient code. It is essential for data-wrangling tasks throughout this course.
Core Tidyverse Packages
library(tidyverse): Loads the tidyverse, which includes:
dplyr: Data Manipulation
ggplot2: Visualization
tidyr: Data tidying
readr: Data Import
tibble: Enhanced data frames
forcats: Categorical variable handling
stringr: String manipulation
purrr: Functional programming
lubridate: Date/time management
head(mpg) – Shows the first six rows of the mpg dataset.
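A quick way to see what head() returns is shown below. The sketch uses mtcars, a data set that ships with base R, instead of mpg (which requires ggplot2 to be loaded); that substitution is only an assumption made so the example runs without extra packages.

```r
# mtcars ships with base R, used here in place of mpg
first_rows <- head(mtcars)           # first six rows by default
nrow(first_rows)

first_three <- head(mtcars, n = 3)   # n controls how many rows are returned
nrow(first_three)
```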
Pipe (%>%) Operator
The pipe operator is provided by the tidyverse (it comes from the magrittr package). It forwards a value, or the result of an expression, into the next expression. The pipe operator (%>%) is read as “and then”.
Why Use %>%?
Improves Readability: Code flows from top to bottom, like natural reading, and is easier to debug and modify.
Reduces Complexity: Avoids deeply nested function calls.
Increases Maintainability: Each step of the operation is clear and self-contained.
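The difference between nested calls and piped calls can be seen in a small sketch. It assumes the dplyr package is installed (attaching it makes %>% available), and the numbers are arbitrary.

```r
library(dplyr)  # attaching dplyr makes the %>% pipe available

# Nested call: has to be read from the inside out
nested <- round(mean(sqrt(c(4, 9, 16))), 1)

# Same computation with the pipe: take the values, and then ...
piped <- c(4, 9, 16) %>%
  sqrt() %>%
  mean() %>%
  round(1)

identical(nested, piped)  # both approaches give the same result
```

Each piped line does one step, so a step can be commented out or changed without untangling parentheses.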
Transforming Data with dplyr
dplyr is part of the tidyverse. It is designed for tasks like manipulating, sorting, summarizing, and joining data frames. Its clear, easy-to-read syntax makes data transformation faster and less error-prone.
select() – The select function in the dplyr package reduces a data frame to only the variables needed for the current task.
mutate() – The mutate function adds new variables (columns) to existing data.
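A minimal sketch of select() and mutate() working together is given below. It assumes dplyr is installed and uses the built-in mtcars data set; the new column name wt_lbs is made up for illustration.

```r
library(dplyr)

# mtcars ships with base R; wt is weight in 1000 lbs, mpg is miles per gallon
cars_small <- mtcars %>%
  select(mpg, wt) %>%            # keep only the two columns needed
  mutate(wt_lbs = wt * 1000)     # add a new column derived from wt

names(cars_small)
head(cars_small, 3)
```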
Why Data Visualization?
Understanding Patterns and Trends
Detecting Outliers
Simplifying Complexity
ggplot2 – A package used to construct charts and plots of data.
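As an illustration of a basic ggplot2 chart, the sketch below draws a scatter plot from the mpg data set bundled with ggplot2. The choice of variables and axis labels is an assumption made for the example, not something taken from the course.

```r
library(ggplot2)

# mpg comes with ggplot2; displ = engine size, hwy = highway mileage
p <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  labs(x = "Engine displacement (litres)", y = "Highway miles per gallon")

print(p)  # in RStudio the chart appears in the Plots tab
```

A scatter plot like this makes trends and outliers visible at a glance, which is the motivation listed under data visualization.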
Chapters 12 & 13 (Descriptive Analysis)
Descriptive Analysis – Provides a summary of data to create an overall picture (e.g., average customer ratings).
Coding Survey Responses
Before starting any statistical analysis, we must convert survey responses to numeric values. This process is called “coding”. There are two types of questions that must be coded: Closed-Ended Questions and Open-Ended Questions.
Closed-Ended Questions are easier to code. Responses are predefined, making it straightforward to enter and analyze data.
Assign Numeric Values: Each response option is given a specific numeric code (e.g., 1= yes, 2 = no).
Open-Ended Questions are more complex to code. Responses vary widely, creating a lengthy list of possible answers.
Qualitative Analysis: Coding requires categorizing or grouping similar responses, which is time-consuming and can introduce subjective interpretation.
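The numeric coding of closed-ended responses described above (1 = yes, 2 = no) could be done in R roughly as follows. The responses are made-up data, and the codebook vector is just one simple way to hold the mapping.

```r
# Hypothetical closed-ended answers (made-up data)
responses <- c("yes", "no", "yes", "yes", "no")

# Codebook: 1 = yes, 2 = no
codebook <- c(yes = 1, no = 2)

# Look up the numeric code for each answer
coded <- unname(codebook[responses])
coded
```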
Purpose of Descriptive Analysis
Provides an overview of the data, helping to summarize large datasets.
Sets the foundation for deeper analysis and insight generation.
Two key types of measures describe the information obtained in a sample:
Measures of Central Tendency – These describe the “typical” respondent or response (e.g., mean, median, mode).
Measures of Variability – These describe how similar or different respondents or responses are to the “typical” ones (e.g., range, variance, standard deviation, Frequency/Percentage Distribution).
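The three measures of central tendency can be computed on a small set of made-up ratings. Base R has mean() and median() but no built-in function for the statistical mode, so the sketch counts values with table() and picks the most frequent one.

```r
# Hypothetical customer ratings on a 1-to-5 scale (made-up data)
ratings <- c(4, 5, 3, 4, 2, 4, 5)

mean(ratings)     # arithmetic average
median(ratings)   # middle value when sorted

# Mode: count each value and take the most frequent
counts <- table(ratings)
mode_value <- as.numeric(names(counts)[which.max(counts)])
mode_value
```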
A frequency distribution is a table that shows how often each unique value appears within a data set.
A percentage distribution is derived by dividing each frequency by the total number of observations and then converting it to a percentage. This helps to express the relative proportion of each value in the data set.
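Frequency and percentage distributions can be produced with table() and prop.table() in base R. The ten answers below are made-up data used only to show the arithmetic.

```r
# Hypothetical yes/no answers from ten respondents (made-up data)
answers <- c("yes", "no", "yes", "yes", "no",
             "yes", "yes", "no", "yes", "yes")

# Frequency distribution: how often each value appears
freq <- table(answers)
freq

# Percentage distribution: each frequency divided by the total, times 100
pct <- prop.table(freq) * 100
pct
```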
Range – Identifies the distance between the lowest value (minimum) and the highest value (maximum) in an ordered set of values.
Range = Maximum – Minimum
The Standard Deviation measures how much values vary around the mean.
A low standard deviation means values are close to the mean, while a high standard deviation shows a greater spread.
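Range and standard deviation can be checked on made-up numbers. One caveat worth knowing: R's range() function returns the minimum and maximum rather than their difference, and sd() computes the sample standard deviation (dividing by n - 1).

```r
# Hypothetical ratings (made-up data)
ratings <- c(2, 4, 4, 4, 5, 5, 7, 9)

# Range = Maximum - Minimum
data_range <- max(ratings) - min(ratings)
data_range

range(ratings)   # returns c(minimum, maximum), not the difference

# Sample standard deviation
sd(ratings)

# Two hypothetical samples with the same mean but different spread
tight  <- c(4, 5, 5, 5, 6)
spread <- c(1, 3, 5, 7, 9)
sd(tight) < sd(spread)   # the tightly clustered sample has the lower sd
```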