Analyzing Qualitative Data
Qualitative Data
Definition and Characteristics
Qualitative data includes transcripts, text responses, videos, emails, etc.
It is considered more subjective than quantitative data.
Ambiguity and missing context make it challenging to analyze.
Comparison with Quantitative Data
Quantitative data refers to numbers, measurements, rankings, etc.
Qualitative data encompasses descriptive elements that cannot easily be quantified, such as interview transcripts and survey text responses.
Qualitative analysis leads researchers away from the clearly defined world of statistics into a realm filled with subjectivity.
Challenges of Qualitative Research
The subjective nature of qualitative data can create unease, particularly for those with a background in computer science or quantitative analysis.
Questions of “truth” and “accuracy” become complex in qualitative research as there are no straightforward answers.
Other issues include the size of the dataset and the need to interpret data without its original context.
Example Scenario:
A researcher analyzing a transcript of an earlier interview could misread ambiguous sentences, such as a joke or a reference to a prior discussion, because the original context is missing.
Acceptance of Subjectivity
Qualitative researchers must accept the inherent murkiness and subjectivity involved in their analyses.
Types of Qualitative Analysis
Overview
There are various methods for analyzing qualitative data, including:
Thematic Analysis
Content Analysis
Grounded Theory
And others
These methods overlap significantly, and each requires researchers to spend considerable time familiarizing themselves with the data.
Familiarization Process
Familiarization involves multiple rounds of reading, annotating, and re-annotating data.
It is an iterative process with no predetermined "correct" number of iterations. However, there should be enough iterations to ensure deep engagement with the data.
An overwhelmingly large dataset is difficult to handle, which makes informed choices about data collection strategies important.
Coding in Qualitative Research
The term “coding” relates to the creation of categories and descriptions assigned to sections of qualitative data.
Coding methods may vary, but all aim to organize and interpret data effectively.
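One way to picture coding is as a set of labeled excerpts. The sketch below is only an illustration, not a prescribed tool or workflow: the `CodedSegment` class, the `group_by_code` helper, and the excerpts and code names are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CodedSegment:
    source: str   # which transcript or response the excerpt comes from
    excerpt: str  # the passage the researcher is coding
    code: str     # the category label the researcher assigned


# Hypothetical excerpts and codes, invented for illustration only.
segments = [
    CodedSegment("P1", "I never know where my files end up.", "confusion"),
    CodedSegment("P2", "The sync icon means nothing to me.", "confusion"),
    CodedSegment("P2", "I just email things to myself instead.", "workaround"),
]


def group_by_code(items):
    """Collect the excerpts that share each code, preserving input order."""
    grouped = defaultdict(list)
    for seg in items:
        grouped[seg.code].append(seg.excerpt)
    return dict(grouped)


by_code = group_by_code(segments)
```

Grouping excerpts by code makes it easier to reread everything filed under one label and decide whether the label still fits.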
Thematic Analysis
Definition
Thematic Analysis involves iterative familiarization with the data and the application of codes.
It is widely applied in Human-Computer Interaction (HCI) for qualitative data.
This method entails identifying themes, which are patterns of shared meaning within the data.
Process of Thematic Analysis
The researcher engages with the data repeatedly, modifying codes and grouping them into larger categories, ultimately creating themes.
The number of themes, and the number of coded excerpts within each theme, varies with the narrative the researcher wishes to convey.
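The grouping step can be sketched as a mapping from candidate themes to the codes they contain. This is a minimal illustration under stated assumptions: the code names, counts, theme names, and both helper functions are hypothetical, and the grouping itself remains the researcher's judgment rather than something the code decides.

```python
# Hypothetical first-pass codes with the number of excerpts each was applied to.
code_counts = {
    "confused by icons": 4,
    "lost files": 3,
    "emails files to self": 2,
    "uses USB stick": 1,
}

# A later pass groups codes into candidate themes (a judgment call).
themes = {
    "Opaque system state": ["confused by icons", "lost files"],
    "Falling back on familiar tools": ["emails files to self", "uses USB stick"],
}


def excerpts_per_theme(themes, code_counts):
    """Sum the excerpt counts of the codes grouped under each theme."""
    return {
        theme: sum(code_counts[code] for code in codes)
        for theme, codes in themes.items()
    }


def ungrouped_codes(themes, code_counts):
    """List codes not yet assigned to any theme, as a prompt for another pass."""
    grouped = {code for codes in themes.values() for code in codes}
    return sorted(set(code_counts) - grouped)


theme_sizes = excerpts_per_theme(themes, code_counts)
leftover = ungrouped_codes(themes, code_counts)
```

The counts only describe how the data was organized. They do not indicate which theme matters most.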
Reflexive Thematic Analysis (RTA)
RTA was popularized by Braun and Clarke, who outlined several distinctions from conventional thematic analysis.
RTA emphasizes the researcher’s impact on the data, prompting researchers to acknowledge their biases through reflexivity or positionality statements.
Terminology is adjusted from “extracting” or “uncovering” themes to “generating” themes, highlighting the role of the researcher in shaping analysis.
Themes are patterns of shared meaning that tell stories of the data, NOT mere summaries of the data.
Positionality/Reflexivity Statements
Purpose
Positionality statements offer a means to reflect on potential biases and viewpoints of the researcher.
Factors influencing perspectives may include:
Upbringing
Socioeconomic class
Race
Gender
Education
Occupation
Strong beliefs that we hold
Environmental factors (location, community)
Language
The communities we belong to
Importance in RTA
Acknowledging one’s positionality is vital to recognizing how personal experiences shape research outcomes.
Discussing findings with others helps mitigate bias by exposing the researcher to various perspectives.
Example Positionality Statement:
3.4 Reflexivity
When conducting reflexive thematic analysis it is crucial to acknowledge one's positionality and how it shapes the outcome of research. I have volunteered as a librarian for the Guelph Tool Library for three years, and it is a cause about which I deeply care. In addition, Guelph, the city in which I have lived, worked, and studied for the past seven years is politically progressive, especially in sustainability and environmental issues, having elected a Green party candidate in the last three provincial elections. Lastly, I take a more critical lens to technology as a result of education I received in graduate studies on the topic of data and AI ethics. It is possible that my experience as a volunteer, a Guelph resident, and my critical opinions of technology affected my analysis. I mitigated the effects of my biases by discussing my findings with my advisor and peers so that I could hear different perspectives. When interviewing participants, I made a conscious effort to ask participants to explicitly explain their reasons for making certain choices so I did not assume their reasoning.
Content Analysis
Definition
Content analysis refers to “developing a representative description of text or other unstructured input”.
Both qualitative and quantitative techniques may be utilized, including counting phrases.
It can be applied to multimedia data but is mainly used for textual data.
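Counting phrases, the quantitative technique mentioned above, can be sketched in a few lines. This is an assumption-laden illustration: the `count_phrases` helper, the sample responses, and the phrases are hypothetical, and matching is a simple case-insensitive whole-word search rather than any standard content-analysis tool.

```python
import re
from collections import Counter


def count_phrases(texts, phrases):
    """Count case-insensitive whole-word occurrences of each phrase across texts."""
    counts = Counter()
    for text in texts:
        lowered = text.lower()
        for phrase in phrases:
            # \b keeps "hard" from matching inside a word such as "hardly".
            pattern = r"\b" + re.escape(phrase.lower()) + r"\b"
            counts[phrase] += len(re.findall(pattern, lowered))
    return counts


# Hypothetical survey responses, invented for illustration only.
responses = [
    "The app is easy to use, but setup was hard.",
    "Easy to use once you get going.",
    "Hard to find the settings. Really hard.",
]

counts = count_phrases(responses, ["easy to use", "hard"])
```

A raw count says nothing about tone or context, so it is usually read alongside the passages themselves.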
Categories of Content
Data can be sorted into two main categories: media content and audience content.
Media Content Examples:
Books, television, magazines, commercials, music lyrics, etc.
Audience Content Examples:
Interview transcripts, diary study texts, etc.
Grounded Theory
Usage Context
Grounded theory is used in areas we know little about and for which there may be no pre-existing literature or theories.
With this method, the researcher gathers data and aims to develop a theory that explains the observations collected.
Process
Similar to other qualitative methods, grounded theory involves iterative familiarization and qualitative coding.
However, it explicitly aims to yield a theory as an outcome of the research process.
Measuring Reliability and Validity
Overview of Reliability
Qualitative studies cannot achieve the exactness of quantitative data; the two are simply doing different things.
However, over the years, methods have been developed to communicate the “validity” or “reliability” of qualitative analyses.
Inter-rater reliability (IRR) describes how closely multiple researchers agree in their coding, and Cohen’s Kappa is one commonly used statistic for measuring it.
Cohen’s Kappa measures the agreement between two raters; a score above 0.6 is generally considered acceptable.
Cohen’s Kappa is computed as $\kappa = \frac{p_o - p_e}{1 - p_e}$, where $p_o$ represents the observed agreement among raters and $p_e$ signifies the agreement expected by chance. This formula allows researchers to quantify the level of agreement between raters beyond what would be expected randomly.
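As a worked example, the sketch below computes Cohen's Kappa for two raters as (p_o - p_e) / (1 - p_e), with p_o as the observed agreement and p_e as the chance agreement estimated from each rater's code frequencies. The function name `cohens_kappa` and the two lists of codes are hypothetical and invented for illustration.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's Kappa for two raters who each coded the same items in the same order."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty set of items")
    n = len(rater_a)

    # Observed agreement: share of items given the same code by both raters.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: for each code, the product of the two raters'
    # proportions of using it, summed over all codes.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    p_e = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)

    if p_e == 1:
        raise ValueError("kappa is undefined when both raters use one identical code")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical codes assigned by two researchers to the same ten excerpts.
rater_a = ["pos", "pos", "pos", "pos", "pos", "neg", "neg", "neg", "neg", "neg"]
rater_b = ["pos", "pos", "pos", "pos", "neg", "neg", "neg", "neg", "neg", "pos"]

# Here p_o = 0.8 and p_e = 0.5, so kappa is approximately 0.6.
kappa = cohens_kappa(rater_a, rater_b)
```

The raters agree on 8 of 10 excerpts, yet the score sits at about 0.6, because half of that agreement would be expected by chance when each rater uses both codes equally often.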
Validity in Qualitative Research
Qualitative studies cannot reach the exactitude of quantitative ones; they serve different research purposes.
Calculating validity rests on the assumption that an objective truth exists and that rigorous exploration can approximate it.
Some researchers question the necessity of achieving consensus in coding, viewing disagreement as reflective of differing perspectives rather than failures.
Choices for Analysis
Key Considerations
Should you use a codebook (predefined list of codes) or create codes during analysis?
A codebook is a predefined list of codes that you apply to your text. The codes can be informed by prior research and by what you are looking for in the text.
Working without a codebook can be a good approach if your study is more exploratory and you do not have a clear idea of what to expect or what you are looking for.
How many researchers will take part? Involving multiple researchers can reduce individual bias but requires resources to consolidate findings.
When collaborating, decide how coding disagreements will be handled.
Will you strive for inter-rater reliability, or is it okay if you and another researcher disagree on coding?
Will you address your own positionality, and how?
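For the question of handling coding disagreements, one possible starting point is to list the excerpts where two researchers assigned different codes, so they can be discussed. This is a sketch under assumptions, not a recommended procedure: the `find_disagreements` helper, the segment ids, and the codes are hypothetical.

```python
def find_disagreements(segment_ids, coder_a, coder_b):
    """Return (segment id, code from A, code from B) wherever the two codes differ."""
    if not (len(segment_ids) == len(coder_a) == len(coder_b)):
        raise ValueError("each segment needs exactly one code from each coder")
    return [
        (sid, a, b)
        for sid, a, b in zip(segment_ids, coder_a, coder_b)
        if a != b
    ]


# Hypothetical segment ids and codes, invented for illustration only.
segment_ids = ["P1-03", "P1-07", "P2-01", "P2-04"]
coder_a = ["confusion", "workaround", "trust", "confusion"]
coder_b = ["confusion", "trust", "trust", "workaround"]

to_discuss = find_disagreements(segment_ids, coder_a, coder_b)
```

Whether the listed items are reconciled into a single code or kept as differing perspectives depends on the choice made above about inter-rater reliability.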