Analyzing Qualitative Data: Thematic Coding and Categorizing

Codes and Coding

Coding is the process of defining the essence of the data being analyzed. It involves pinpointing and documenting segments of text or data items (like image components) that represent a shared theoretical or descriptive concept. Typically, multiple segments are identified and linked to a code name, thus categorizing all data pertaining to the same subject under a unified label. This method allows for indexing and categorizing text to form a thematic framework.

Two primary forms of analysis are enabled through coding:

  1. Text Retrieval: Gathering all text segments under the same code to consolidate examples of a phenomenon, idea, or activity.
  2. Analytic Questioning: Using the code list, particularly when organized hierarchically, to explore relationships between codes and make case-by-case comparisons.
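
The case-by-case comparison mentioned in the second item can be sketched in Python as a simple case-by-code table. This is only an illustrative sketch: the case names, the function `case_by_code`, and the code assignments are hypothetical and not taken from any real dataset.

```python
from collections import Counter

# Hypothetical (case, code) pairs, one per coded segment.
coded_segments = [
    ("Case A", "Togetherness"),
    ("Case A", "Togetherness"),
    ("Case A", "Resignation"),
    ("Case B", "Resignation"),
]


def case_by_code(pairs):
    """Return {case: {code: number of segments}} covering every case/code pair."""
    counts = Counter(pairs)
    cases = sorted({case for case, _ in pairs})
    codes = sorted({code for _, code in pairs})
    return {case: {code: counts[(case, code)] for code in codes} for case in cases}


table = case_by_code(coded_segments)
for case, row in table.items():
    print(case, row)
```

A zero in the table is as informative as a count: it flags a case where a theme is absent and invites the question of why.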

Terms like 'indices', 'themes', and 'categories' are often used interchangeably with 'codes', each emphasizing different aspects of the coding process. A structured code list with definitions is known as a coding frame, thematic framework, template, or simply a codebook. A comprehensive codebook should include the code list, definitions, and analytic notes.

Coding is most efficient when working from a transcript. Direct coding from audio, video, or field notes is possible, but it can be challenging without computer-assisted qualitative data analysis software (CAQDAS).

Code Definitions

Codes serve as a focal point for interpreting text. It is crucial to document each code's development, including its nature, underlying rationale, application guidelines, and relevant text or images. This documentation ensures consistent application and facilitates code sharing within a team. Code memos should be stored in an easily editable format and include:

  • Code label or name
  • Coder's name
  • Coding date
  • Code definition
  • Analytic notes
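
One way to keep such memos in an easily editable form is a small record type. The sketch below is a minimal illustration in Python; the class name `CodeMemo`, its field names, and the sample entry (coder, date, definition wording) are assumptions made for the example, not a prescribed format.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class CodeMemo:
    """One codebook entry holding the five items listed above."""
    label: str
    coder: str
    coded_on: date
    definition: str
    analytic_notes: list[str] = field(default_factory=list)

    def add_note(self, note: str) -> None:
        """Append a dated analytic note so the code's development stays traceable."""
        self.analytic_notes.append(f"{date.today().isoformat()}: {note}")


# Hypothetical entry: coder, date, and definition are invented for illustration.
memo = CodeMemo(
    label="Togetherness",
    coder="A. Researcher",
    coded_on=date(2024, 1, 15),
    definition="Activities the respondent describes as done jointly with a partner.",
)
memo.add_note("Check for overlap with 'Joint activities continuing'.")
print(memo.label, len(memo.analytic_notes))
```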

Mechanics of Coding

Identifying text chunks and assigning theoretical and analytic codes can be challenging initially. It requires intensive reading to discern the essence of the text. Key questions to guide this process include:

  • What is going on?
  • What are people doing?
  • What is the person saying?
  • What do these actions and statements take for granted?
  • How do structure and context serve to support, maintain, impede, or change these actions and statements?

Example Illustration

Consider an interview excerpt with Barry, a caregiver for his wife with Alzheimer's. When asked about activities he had to give up, Barry mentions dancing and indoor bowling. He also notes continuing activities like dances at the works club and car rides. Initial coding might label these activities directly.

  • Descriptive codes: ‘Dancing’, ‘Indoor bowling’, ‘Dances at works club’, ‘Drive together’.

However, a more analytical approach would categorize: 'Joint activities ceased' and 'Joint activities continuing'. Further analysis could identify 'Loss of physical co-ordination,' 'Togetherness,' and 'Doing for' as potential codes, reflecting Barry's perception and the underlying dynamics.

  • Categories: ‘Joint activities ceased’, ‘Joint activities continuing’.
  • Analytic codes: ‘Loss of physical co-ordination’, ‘Togetherness’, ‘Doing for’, ‘Resignation’, ‘Core activity’.
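
The move from descriptive codes to categories in the Barry example can be represented as a small hierarchy. The following Python sketch assumes a plain dictionary is sufficient; the variable `code_hierarchy` and the helper `category_of` are hypothetical names used only for illustration.

```python
from typing import Optional

# Categories from the Barry example, each grouping its descriptive codes.
code_hierarchy = {
    "Joint activities ceased": ["Dancing", "Indoor bowling"],
    "Joint activities continuing": ["Dances at works club", "Drive together"],
}


def category_of(descriptive_code: str, hierarchy: dict) -> Optional[str]:
    """Return the category that contains a descriptive code, or None if uncategorized."""
    for category, members in hierarchy.items():
        if descriptive_code in members:
            return category
    return None


print(category_of("Dancing", code_hierarchy))         # Joint activities ceased
print(category_of("Drive together", code_hierarchy))  # Joint activities continuing
```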

Marking the coding on paper involves jotting code names in the margin or using colors to indicate codes.

Data-Driven vs. Concept-Driven Coding

Codebook construction is an analytical process that builds a conceptual schema. Codes can be derived from data (data-driven) or from existing literature and research (concept-driven).

  • Concept-driven coding: Codes come from existing research, interview schedules, or researcher's insights. Framework analysis and template analysis employ this approach.
  • Data-driven coding: Begins without predefined codes, allowing themes to emerge from the text. Grounded theory and phenomenology use open coding, setting aside preconceptions to interpret the data organically.

Most researchers integrate both approaches, using existing knowledge to guide initial coding while remaining open to new themes emerging from the data.

What to Code

The content to code depends on the type of analysis. Approaches such as phenomenology and discourse analysis focus on specific phenomena. Common elements include:

Element         Examples
Activities      Attending meetings, traveling, shopping, going to work
Events          Joining a sports club, being made redundant, winning a prize
Strategies      Memorizing course notes, planning a holiday, conducting a SWOT analysis
Relationships   Parent and child, employer and employee, friends
Participation   Taking part in a demonstration, voting in an election, attending a local meeting
Settings        Being at home, in a train, at work
Thinking        Considering the consequences of an action, deciding whether to take redundancy
Emotions        Feeling angry, feeling sad, being happy
Values          Belief in free speech, belief in justice, belief in equality
Meanings        What someone means by an action, interpretation of an event, definition of a situation
Consequences    What happened as a result of someone's action, or as a result of something
Contexts        What was going on at the time
Accounts        How someone explains their action, how an organisation explains its action

Rather than merely describing events, aim to code for broader analytic categories like 'Activity to make friends' or 'Commitment to keeping fit'.

Retrieving Text from Codes

Coding enables methodical retrieval of thematically related text sections for:

  • Identifying the core of a code
  • Examining thematic changes within a case
  • Exploring thematic variations across cases

Practical retrieval involves:

  • Consolidating text coded similarly
  • Tagging each extract with document information

Paper Methods

Photocopy transcripts, cut them up, and store extracts in labeled folders. A tag consisting of a string of letters or numbers that indicates not only the identity of the respondent but also some basic biographical information (like age group, gender and status) will help identify where the original text came from. You might use something like ‘BBm68R’ to indicate the interview with Barry Bentlow who is male, aged 68 and retired. Put this tag at the top of each extract or slip.
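
A tag of this kind can also be generated consistently rather than typed by hand. The Python sketch below reproduces the ‘BBm68R’ example; the encoding rule (name initials, first letter of gender in lower case, age, first letter of status in upper case) is an assumption inferred from that single example, and `source_tag` is a hypothetical helper name.

```python
def source_tag(first_name: str, last_name: str, gender: str, age: int, status: str) -> str:
    """Build a compact source tag: initials + gender letter + age + status letter.

    The scheme is assumed from the one example in the text, not a standard.
    """
    initials = first_name[0].upper() + last_name[0].upper()
    return f"{initials}{gender[0].lower()}{age}{status[0].upper()}"


tag = source_tag("Barry", "Bentlow", "male", 68, "retired")
print(tag)  # BBm68R
```

With a larger sample, single-letter status codes may collide, so a lookup table of agreed abbreviations would be a safer design.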

Digital Methods

Use copy-and-paste into separate files for each code, adding a reference to the original line numbers along with the source tag.
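
The same consolidation can be scripted. The Python sketch below groups coded segments by code and prefixes each extract with its source tag and original line numbers. The `Segment` structure, the function `retrieve_by_code`, the line numbers, the placeholder extract text, and the code assignments are all hypothetical and stand in for real transcript data.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Segment:
    """A coded stretch of transcript with its provenance."""
    source_tag: str
    start_line: int
    end_line: int
    text: str
    codes: tuple[str, ...]


def retrieve_by_code(segments):
    """Return {code: [extracts]}, each extract labeled with tag and line range."""
    retrieved = defaultdict(list)
    for seg in segments:
        label = f"[{seg.source_tag}, lines {seg.start_line}-{seg.end_line}]"
        for code in seg.codes:
            retrieved[code].append(f"{label} {seg.text}")
    return dict(retrieved)


# Placeholder segments; line numbers and code assignments are invented.
segments = [
    Segment("BBm68R", 4, 6, "<extract about giving up dancing>",
            ("Joint activities ceased", "Togetherness")),
    Segment("BBm68R", 10, 12, "<extract about car rides>",
            ("Joint activities continuing", "Togetherness")),
]

by_code = retrieve_by_code(segments)
for extract in by_code["Togetherness"]:
    print(extract)
```

Each list in `by_code` could then be written to its own file, one per code, which mirrors the copy-and-paste approach described above.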

Keep retrieved text with code memos to ensure consistent definitions and refine analytic ideas.

Grounded Theory

Grounded theory focuses on inductively generating novel theories from data, which are then related to existing theories. Coding is divided into three stages:

  1. Open coding: Identify relevant categories.
  2. Axial coding: Refine and relate categories.
  3. Selective coding: Identify a core category that integrates all other categories.

Open Coding

Examine text reflectively, avoiding mere descriptions, and formulate theoretical codes by asking questions: who, when, where, what, how, how much, why, and so on. Employ constant comparison to highlight distinctive aspects of the text.

Techniques for constant comparison:

  • Analysis of word, phrase or sentence
  • Flip-flop technique
  • Systematic comparison
  • Far-out comparisons
  • Waving the red flag

Line-by-Line Coding

This approach involves coding each line of text, which forces analytic thinking and keeps the analysis close to the data. It makes you pay close attention to what the respondent is actually saying and construct codes that reflect their experience of the world, not yours or any theoretical presupposition you might hold, while still leaving room for theoretical interpretation.

Example: Coding an interview extract with Sam, a homeless man, reveals themes like relationship endings, emotional distress, and self-perception.

Refining Codes

Revisit the text to identify alternative coding options, incorporate longer passages, and identify new instances that require coding. Additionally, refine codes to transition them from descriptive to analytical.

Analytical Memos

Analytical memos are texts written by the researcher during the coding process to provide a preliminary analysis. They facilitate the coding process, theory building, and data analysis, and can change drastically as new data emerges.

Key Points

  • Coding is a fundamental analytic process for qualitative research.
  • Codes should be analytic and theoretical.
  • Grounded theory offers valuable coding techniques.
  • Constant comparison enhances coding.
  • Line-by-line coding aids in creating new codes.