What is the key difference between experiments/surveys and content analysis?
Experiments and surveys elicit new data, whereas content analysis studies existing communicative behavior.
What implications does content analysis have for sampling?
You sample content, not people.
Why does content analysis often have high ecological validity?
Because it studies behavior that occurs naturally in the real world.
Why is reliability crucial in content analysis?
Because content must be interpreted as objectively and consistently as possible.
What is content analysis (Berelson, 1952)?
A quantitative, systematic, and objective technique for describing the manifest content of communication.
What does “quantitative” mean in content analysis?
Counting how often something occurs.
What does “systematic” mean in content analysis?
Using clear, predefined rules for sampling and analysis.
What does “objective” mean in content analysis?
Coding rules are unambiguous and not dependent on personal interpretation.
What is “manifest content”?
Content that is directly observable (e.g., words, images), usually analyzed with the goal of uncovering some latent concept.
What is “latent content”?
Underlying meanings or constructs inferred from manifest content (e.g., framing, values).
What are the 8 steps in content analysis?
Develop a hypothesis.
Define the content to be analyzed (e.g., text only, no photos). → Set boundaries
Sample the content (e.g., 300 dating profiles – how much data to analyze). → Pick a sample
Select units for coding (e.g., individual words or sentences). → Select what piece
Develop a coding scheme (the rules in a "codebook"). → Create clear, unambiguous rules for categorizing the data.
Code the units: Apply the coding rules systematically
Count occurrences: Analyse how often specific elements appear.
Report the results: Share findings, using visuals and explanations.
What is a theory-driven content analysis approach?
Coding categories are based on existing theory. You select specific (categories of) words and count how often they were used based on these models.
Give examples of theory-driven word categories.
Body/sexuality words, status words, emotion words, pronouns (I, you, we).
What is a data-driven content analysis approach?
Using algorithms or machine learning to identify patterns without predefined categories.
What is a strength of the data-driven approach?
It can reveal unexpected patterns.
What is LIWC?
A dictionary-based tool that automatically counts words in predefined categories.
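A dictionary-based counter can be sketched as below. This is only the general idea: LIWC itself and its dictionaries are proprietary, so the categories and word lists here are invented, and real LIWC output is the percentage of all words falling in each category.

```python
# Toy dictionary-based word counter in the spirit of LIWC
# (categories and word lists are made up for illustration).
DICTIONARY = {
    "pronoun_i": {"i", "me", "my"},
    "pronoun_we": {"we", "us", "our"},
    "emotion": {"happy", "sad", "love", "hate"},
}

def category_rates(text):
    """Return each category's share of total words, as a percentage."""
    words = text.lower().split()
    rates = {}
    for category, wordlist in DICTIONARY.items():
        hits = sum(1 for w in words if w in wordlist)
        rates[category] = 100 * hits / len(words)
    return rates

print(category_rates("I love my dog and we love our cat"))
```

Because the same dictionary is applied mechanically to every text, two runs always give identical results, which is exactly the reliability advantage named on the next card.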
What is a key advantage of dictionary-based approaches?
High reliability and consistency.
What is a key disadvantage of dictionary-based approaches?
Words are counted literally and out of context, so negation, irony, and sarcasm are missed (lower validity).
What does it mean that coding categories must be exhaustive?
Every coding unit must fit into a category.
What does it mean that coding categories must be exclusive?
Each unit can belong to only one category.
What is intercoder reliability?
The extent to which different coders agree on coding decisions.
What does validity mean in content analysis coding?
Categories accurately represent the theoretical construct of interest.
What are the “three V’s” of big data? And what is big data?
Volume, Variety, Velocity.
Big data involves huge digital archives of "traces" left by spontaneous human behavior, often online.
What does “Volume” mean in big data?
Extremely large datasets covering many observations.
What does “Variety” mean in big data?
Different data types (text, images, audio, video).
What does “Velocity” mean in big data?
Data are generated and available rapidly, often in real time.
What is the fourth V of big data, and what does it mean?
Veracity.
Data quality: the data are accurate and truthful.
Interpretability: the data reflect everyday behaviour rather than artificial settings.
Are big data studies usually exploratory or confirmatory?
Exploratory.
Does big data research focus more on induction or deduction?
Induction.
Does big data research usually test causality?
No, it focuses mainly on correlations.
What are Opportunities in Big Data Research?
Big data may include rare phenomena and hard-to-reach populations.
Reduces the risk of error and bias associated with small samples, because samples are typically large. (However, representative samples are not guaranteed.)
It can lead to the discovery of correlations that no current theory would predict.
(But at the risk of these correlations being spurious)
Provides the opportunity to construct more sophisticated statistical models (But at the risk of these models being overfit).
Why can big data capture rare phenomena?
Because datasets are very large and comprehensive.
Why is big data still not necessarily representative?
Because “big” does not equal “all”.
What are disadvantages of Big Data?
Spurious correlations
Overfitting
Representation
What are spurious correlations in big data?
Statistically significant relationships that occur by chance.
Why are spurious correlations common in big data?
With many variables, some correlations will appear randomly.
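This can be demonstrated with pure noise. The sketch below generates 40 random variables with no real relationship between any of them, then checks how many of the 780 possible pairs nevertheless correlate fairly strongly; the variables and thresholds are arbitrary choices for illustration.

```python
import random

random.seed(1)

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# 40 variables, 20 observations each -- all pure random noise.
variables = [[random.random() for _ in range(20)] for _ in range(40)]

strong = sum(
    1
    for i in range(40)
    for j in range(i + 1, 40)
    if abs(pearson(variables[i], variables[j])) > 0.5
)
print(f"{strong} of {40 * 39 // 2} random pairs correlate with |r| > 0.5")
```

Every one of those "significant" correlations is spurious by construction, which is why big-data findings need replication or theory before they are trusted.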
What is overfitting?
A model fits existing data well but performs poorly on new data. An overly complicated model starts to explain noise and exceptions in the old data; by matching the old data too tightly, it fails to predict new data.
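A minimal sketch of overfitting, using made-up data where the true relationship is linear: a degree-9 polynomial passes through every training point, yet the plain line predicts fresh data better.

```python
import numpy as np

rng = np.random.default_rng(0)

# True relationship is linear (y = 2x); observations are noisy.
x_train = np.linspace(0, 1, 10)
y_train = 2 * x_train + rng.normal(0, 0.2, 10)
x_test = np.linspace(0, 1, 100)
y_test = 2 * x_test + rng.normal(0, 0.2, 100)

def mse(degree):
    """Train/test mean squared error of a polynomial of the given degree."""
    coeffs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    return train_err, test_err

simple_train, simple_test = mse(1)    # a straight line
complex_train, complex_test = mse(9)  # hits every training point

# The complex model fits the old data almost perfectly,
# but the simple model predicts the new data better.
print(f"degree 9: train={complex_train:.4f}, test={complex_test:.4f}")
print(f"degree 1: train={simple_train:.4f}, test={simple_test:.4f}")
```

This is the trade-off behind the "simpler models generalize better" card: the flexible model spends its extra parameters memorizing noise.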
What does representation as a disadvantage mean in Big Data?
Big data often ignores groups with limited internet access.
Why can simpler models sometimes be better?
They generalize better to new data.
What are advantages of Big Data?
High ecological validity (natural behavior)
includes rare events (e.g., plane crashes)
Reaches hard-to-reach populations
What are ethical considerations in big data?
De-anonymization: Combining a few "anonymous" data points (age + city + hobby) can identify specific individuals.
Privacy: Accessible data is not always public (e.g., closed groups). Consent is difficult to obtain.
Bots/AI: Researchers must verify if content is from real humans or automated bots.
What is de-anonymization?
Identifying individuals from supposedly anonymous data: combining a few "anonymous" data points (age, city, hobby) can identify specific individuals.
What is the issue that arises with privacy in big data?
Accessible data is not always public (e.g., closed groups). Consent is difficult to obtain.
What is the issue that arises with bots/AI in big data?
Researchers must verify if content is from real humans or automated bots.
What is the main strength of content analysis?
Systematic, quantitative study of real-world communication.
What is the main risk of big data research?
Misinterpreting correlations as meaningful or causal.
Why is content analysis and big data analysis sometimes better than experiments, surveys, and interviews?
Sometimes cheaper and more straightforward than creating a new survey/experiment.
There is no substitute for the richness, creativity, and humor of communication on social media platforms and in written news media.
Nor is there a substitute for the hate, bias, and conformity found there.
You can see the whole spectrum of human behavior by looking at content.