Case-based research: explore, describe, and explain a phenomenon and provide detailed, context-rich insights; useful for theory-building.
Variable-based research (Large-n): describe and explain relationships between variables (IVs and DVs) by identifying trends; useful for theory-testing.
Units of analysis: social elements that are emphasized in the RQ (e.g. families).
Units of observation: key elements while collecting data (e.g. family members).
Types of units: actors (individual or group), actions (single actors but also their interaction), opinions (statements), and events (whether they affect the actors).
Levels of analysis: micro, meso, macro.
Multi-level analysis: units at different levels in the same study.
Wrong-level fallacy / ecological fallacy: drawing false conclusions about the members of an organization after studying only the organization as a unit.
Types of information about the relationship between units
Formal properties: strong/weak ties, (in) formal, etc.
Substantial aspects: what does the relation concern, how are the units connected
Chapter 22 (in Grønmo 2019) Temporal studies
Temporal studies: how large/small social processes progress
Longitudinal studies: analyses of modes of development
Biographical studies: recall data
Qualitative content analysis of documents with no comparable content at different times
Time series data: repeated questions at regular times, used to combine data to express a trend.
Panel data: the same respondents are questioned at several points in time, which has the problem of drop-outs.
Panel data sees respondents as individuals; changes in their answers are considered gross changes, which allows for more detailed analysis.
Time series data sees them as a whole; changes are net changes.
Cohort analysis: analyses done on the basis of people’s age. A cohort has experienced a significant event at the same time.
Differences between cohorts are called the cohort effect / generation effect.
Within a cohort => age effect/ life-phase effect
Temporal fallacies: drawing conclusions about development while studying one point in time.
Types of Data
1 point in time/ case study: synchronous data
Multiple points in time / longitudinal studies: diachronic data
Spatial studies: similarities or differences between places, both geographically and in terms of conditions in their contexts (societies).
Comparative studies: comparing different societies or conditions in different societies; at least 2 units are systematically compared to find a causal relation.
Units as different as possible: find 1 commonality
Units as similar as possible: find 1 difference
Equivalence: in order to compare particular phenomena, we must have equivalent data about these phenomena.
Linguistic: do the same expressions have the same meaning across the compared societies?
Contextual: different contexts, does the phenomenon have the same relevance
Conceptual: do concepts have the same meaning (culture bound)
Methodological: do the same methods create the same kind of data.
Variables in multi-level analysis
Global variables: they only refer to one level of the analysis
Aggregated variables: variables at one level are used in the analysis as expressions of units at a higher level
Contextual variables: based on one level and used in the analysis of units at a lower level
Cross-level fallacies: conclusions about conditions at one level based on data from another level.
Aggregative fallacy: faulty conclusions based on data about a higher level
Atomistic fallacy: faulty conclusions based on data about a lower level
Historical-comparative studies: combines analyses of stability, change and different levels in different societies.
Key Definitions
Context: the broader context/ field/ topic of the study
Unit of analysis: social units or elements which are the focus of the study (what is the study about?)
Unit of observation: unit actually being observed in a study
Concept: an abstraction/category that enables researchers to clarify, categorize, and understand a phenomenon in the social world
Variable: a measurable (numeric) characteristic that can vary between observations; created through the operationalization of concepts.
Case Study Research
Case study research: engages in an empirical inquiry that investigates a particular phenomenon in real-life within a specific bounded system; treats cases as holistic and complex units.
Defined as: the intensive analysis of a single unit, where the researcher’s goal is to understand a larger class or similar units.
Types of cases
Typical case: being representative of a larger population (what is average)
Extreme case: an extreme version of the larger pattern / an outlier. Choose a case that’s as far away from average as possible; the aim is to explain why the case is extreme.
Deviant case: a case that does not fit the larger pattern. Aim is to explain why something is (not) happening; possibility for a new theory.
Comparative case studies: compare either same point in time and place but different case, or different points in time, or same case but across time. Important is comparability: unit homogeneity/ equivalence
Week 3 Types of comparative case studies
Diverse cases: two or more cases representing variation on a relevant condition. Cases are selected to represent the full range of values on a relevant condition/ relationship.
Most similar systems design: the cases are similar on nearly all areas but differ in one, why?
Most different systems design: the cases differ in basically everything, but have a similarity, why?
Large-n studies and Hypotheses
Planning survey questions: research question => theoretical concepts of interest => operationalization => variables => data collection and analysis.
Operationalization: criteria for how concepts are going to be measured by empirical data.
Independent variable: a variable in the analysis of the relationship that is assumed to influence another variable.
Dependent variable: a variable in the analysis of the relationship which is assumed to be influenced by one or more variables.
Hypotheses: statement about social phenomenon that can be tested empirically.
Null hypothesis: there is no significant relationship between the variables; rejected when the data show a significant relationship.
Alternative hypothesis: there is a significant relationship between the variables; accepted when the null hypothesis is rejected.
Week 4 Chapter 6.1-6.5: Source Types
3 main types of sources:
Actors: observed during action
Respondents: give answers to researchers’ questions; informants are usually questioned about other actors
Documents: written, oral, or visual presentations, which are studied in content analyses.
Data: information that has been processed, systematized, and recorded in a specific form and for the purpose of a specific analysis
Major types of research design
Ethnographic research/ participant observation: the researcher is a participant in the process that is studied
Structured observation: no participation; observations are registered on a prepared schedule
Unstructured interviews: conversations, not pre-determined questions; includes semi-structured interviews
Questionnaires / surveys: fixed questions, fixed options, with possible use of an experiment group and a control group (survey experiment).
Quantitative content analysis: structured coding system with categories (e.g. tweet study)
Elements of Qualitative and Quantitative Research
Types of RQ: quan = statistical generalizations; qual = analytical descriptions
Methodology: quan = structuring; qual = flexibility
Relation to sources: quan = distance and selectivity; qual = proximity and sensitivity
Interpretation: quan = precision; qual = relevance
Chapter 7.5-7.7 Procedure for selecting info
Specify and define each of the concepts of the study
Decompose concepts: specify dimensions for each term
Define a set of categories for each dimension
Clarify operational definitions
Different levels of measurement for variables
Nominal level: inequality between values (e.g. gender)
Ordinal level: rank order between values (e.g. education)
Interval level: distance between values (e.g. temperature/ degrees)
Ratio level: proportion between values, with a meaningful or natural zero value (e.g. age)
Big data sources
Volunteered information: social media, transactions, sousveillance, crowdsourcing, etc.
Automated information: various systems like surveillance devices, scan data, interaction data
Directed information: CCTV, drones, individual identification
Data scraping:
using data programs to extract relevant info
Types are web scraping, report mining, and screen scraping
Data mining also identifies patterns in the extracted data.
Chapter 8 (in Grønmo 2019) Sample studies
Sample studies: part of the population is chosen to form a sample; only this part is studied, but the findings are used to generalize.
Statistical generalization: based on numerical data, extensive research
Theoretical generalization: based on conceptual relevance, intensive research
Types of Samples
Population sample: all units in the study’s universe
Pragmatic sample: not meant to generalize, more exploratory/ pilot studies
Probability sample: all units have a known probability of being included in the sample
Confidence interval /statistical margin of error
Significance level => P< 0.05
Strategic sample: theoretical understanding of the social conditions being studied is required to develop theories (analytical induction) or to make a holistic generalization
Case studies: restricted to one unit
Sampling methods
Simple random sampling: random drawing from a list of all units in the study’s universe
Systematic sampling: sampling of every Nth unit on a list of all units in the universe (e.g. every 10th unit is used)
Stratified sampling: units are divided into categories according to their properties. Random drawing of units from each category.
Proportional sample: all units have the same probability of being included in the sample
Different per category: disproportional
Weighing: under-represented units receive more weight
Cluster sampling: units divided into clusters according to location. Random drawing of entire clusters.
Multi-stage probability sampling: different sampling methods are used in turn
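The probability sampling methods above can be sketched in a few lines of Python (a hypothetical universe of 100 units; the variable names and the even/odd strata are illustrative):

```python
import random

random.seed(42)  # fixed seed so the draws are reproducible
population = list(range(1, 101))  # hypothetical universe of 100 units

# Simple random sampling: random drawing from a list of all units
simple = random.sample(population, 10)

# Systematic sampling: every Nth unit on the list (here every 10th)
step = len(population) // 10
systematic = population[::step]

# Stratified sampling: divide units into categories by a property,
# then draw randomly from each category
strata = {"even": [u for u in population if u % 2 == 0],
          "odd": [u for u in population if u % 2 == 1]}
stratified = [u for group in strata.values() for u in random.sample(group, 5)]
```

Cluster and multi-stage sampling follow the same pattern: draw whole clusters at random, or chain several of these methods in turn.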
Methods for strategic sampling
Quota sampling: units are divided into specific categories from which a specific number (quota) is selected
Haphazard sampling: sampling of units that happen to be located in a particular place at a particular time
Self-selection: consists of actors who volunteer to participate (often surveys)
Snowball sampling: first actor suggests more actors to participate for the sample
Week 4 Quantitative research process
Surveys
Survey experiment: the researcher randomly assigns participants to at least two experimental conditions (vignettes) and aims to identify if there is a relationship between the manipulated variables.
Documents and records / archival data (large-n data sets)
Participant observation / ethnographic research: researchers directly observe actors in their natural environment, using extensive fieldnotes and either overt or covert participant observation
Documents and records/ archival data
Sampling Details
Sampling: deciding your units of observation
Theoretical population: who/what do you want to study?
Study population: who/what do you have access to?
Sampling strategy: how do you access the participants /data?
Types include cross-sectional, panel, cohort, and trend.
Probability sampling (quan) and Sampling Error
Probability sampling (quan): goal is being able to generalize, intending to ensure that each unit of observation in the population has an equal chance of being included in the study.
Simple random sample
Systematic sample: every nth person is chosen
Stratified sample: population is divided into sub-groups and a random sample is selected from each sub-group
Clustered sample: divide population into clusters (groups) based on physical/ geographical proximity, randomly choose one cluster to study
Sampling error: error that arises in a data collection process as a result of taking a sample from a population rather than using the whole population.
Confidence interval: probability that sample accurately reflects the population (standard is 95%)
Margin of error: range by which the population parameter may deviate from the sample parameter
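As a small worked example with hypothetical numbers, the margin of error for a sample proportion at the standard 95% confidence level uses z = 1.96:

```python
import math

# Hypothetical survey: 600 of 1,000 respondents answer "yes"
n, successes = 1000, 600
p = successes / n  # sample proportion = 0.6

# Margin of error at the 95% confidence level (z = 1.96)
z = 1.96
margin = z * math.sqrt(p * (1 - p) / n)

# Confidence interval: sample statistic plus/minus the margin of error
ci = (p - margin, p + margin)  # roughly (0.57, 0.63)
```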
Non-sampling Error & Non-probability Sampling
Non-sampling error
Population specification error: does not understand who to collect data on
Sample frame error (selection bias): wrong sub-group is used for representation
Self-selection error (volunteer basis): only those who are interested respond
Non-response error: inability to contact potential respondents or their refusal
Non-probability sampling (quan): no aim of an equal chance of being included; continue to sample until you don’t get new information
Convenience sample: volunteers who are available
Quota sample: the researcher determines what kind of characteristics are wanted in the sample, with a minimum of two groups for comparison.
Purposive sample: researcher decides who is included; specific individuals are targeted.
Snowball sample: when reaching difficult populations, respondents help find new respondents.
Week 5 Chapter 18 (in Grønmo 2019) Number of variables
Bivariate analysis: study with 2 variables
Multivariate analysis: study with 3 or more variables
Table analysis: no fixed limit of variables in a table, but focus lies on 2 or 3.
Correlation analysis: primarily bivariate. Correlation is clarified between each pair of variables and expressed in correlation coefficients that are presented in a correlation matrix (built on separate bivariate analyses)
Regression analysis: multivariate, suitable for analyzing the relationship between many variables; variables at interval or ratio level, but at the nominal level only if they are dichotomies with values 0 and 1 (dummy variables).
Dependency Relation Between Variables
Dependency relation between variables: independent variable exerts an effect while the dependent variable is affected.
Symmetrical relation: in the analysis, the variables are treated as equivalent.
How to determine these relations:
Chronology: can we assume that each unit’s value for one variable is determined before they got their value for the other variable (e.g. gender first, income later).
Causal relation: independent variable is considered the cause, the dependent variable the effect.
Spurious: bivariate relationship is due to a statistical relationship between one of the two variables and a third variable, which will disappear when analyzing the 3 variables together (e.g. income – home size – age).
Chapter 16 (in Grønmo 2019) Impressionist approach to data analysis
Impressionist approach to data analysis: conclusions drawn from researcher’s experience and impressions.
Coding: finding keywords (codes) that describe a larger section of the text. In quan, codes receive a number or value before the analysis
Descriptive: characteristics of actual or explicit text
Interpretive: researcher’s interpretations
Explanatory: researcher’s explanation of explicit elements of the text
Open coding: initial characterization of the key elements of the data, mostly descriptive codes.
Categories/ type: collection of specific common characteristics; is more systematic than open coding
Concept: theoretical construct/ general notion for a particular type of phenomena
Coding Methods and Terminology
Constant comparative methods: repeated systematic comparisons of the various elements in the data; useful for theory development (grounded theory).
Typology: multiple categories are arranged in relation to each other in a particular system.
Ideal type: representation of particular phenomenon, where the most important features of the phenomenon are isolated and described in an idealized or pure form (roughly a model, but not ideal in the normative sense).
Matrix: chart for systematizing and arranging quotes from qualitative data; can be person-centered or theme-centered, filled with text rather than numbers (figures to illustrate structural patterns).
Sociograms: actors are presented as points, with relationship between the actors illustrated by lines or arrows between the points.
Hierarchies: concern hierarchal structures, where the relationships represent relations between superior and subordinate categories or units (e.g. organizational chart).
Condensation: content of the data is presented in an abbreviated or condensed form, which is a method to obtain a comprehensive understanding.
Narrative analysis: text is organized with reference to typical elements in a story or narrative.
Discourse analysis: used to establish a comprehensive understanding of expressions of opinion and communication processes.
Discourse: system of ideas, perceptions, and concepts about conditions in society.
Grounded theory/ analytical induction / theory-generating studies: hypotheses and theories based on empirical evidence where empirical patterns are not only described but also interpreted.
Search for deviating cases: search for elements in the data that do not conform to the hypotheses, in order to formulate a new, complete theory (only in qual; in quan, the hypotheses are rejected)
CAQDAS: computer-assisted qualitative analysis software; often depends on coding, effective in the constant comparative method, but is not effective to find a holistic understanding / finding overarching patterns / interpretation of data (Examples are NVivo, ATLAS.ti, N6, MaxQDA)
Chapter 17 (in Grønmo 2019) Univariate distribution
Univariate distribution: distribution on a single variable or in a single index
Frequency: number of units registered with a particular value
Absolute frequency: actual counts
Relative frequency: percentage of total counts, total always 100%
A relative distribution/ percentage distribution is a better basis for comparisons.
Cumulative frequencies: share of a particular value, plus all the lower values, with the highest always at 100% (either absolute or relative)
Inverse: starting from high to low
Graphical: not as a table but a graph or chart (examples are bar chart, pie chart, and line chart where the line is called the curve)
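A minimal Python sketch of absolute, relative, and cumulative frequencies, using hypothetical responses on a three-value ordinal variable:

```python
from collections import Counter

# Hypothetical responses on an ordinal variable (education level 1-3)
responses = [1, 2, 2, 3, 1, 2, 3, 3, 2, 2]
n = len(responses)

absolute = Counter(responses)  # absolute frequency: actual counts

# Relative frequency: percentage of total counts (always sums to 100%)
relative = {value: 100 * count / n for value, count in sorted(absolute.items())}

# Cumulative frequency: share of a value plus all lower values,
# with the highest value always at 100%
cumulative, running = {}, 0.0
for value in sorted(absolute):
    running += relative[value]
    cumulative[value] = running
```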
Distributions & Statistical Measures
Symmetrical distribution is called the normal distribution or bell curve.
Statistical measures for Central tendency:
Nominal: mode (the value with the highest frequency)
Ordinal: median (divides the units into 2 equal parts when arranged in ascending order)
Interval/ ratio: mode or median, with possible use of the mean (sum of values / number of units)
Statistical measures for Dispersion:
Standardized modal percentage: 1 (max dispersion) to 100 (no dispersion)
Quartile deviation: units divided into 4 parts, and the difference between Q1 and Q3. Quartile deviation is (Q3-Q1)/2
Variance or standard deviation: based on distance from a unit to the mean, usually at the interval and ratio level.
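The central-tendency and dispersion measures above can be computed with Python's statistics module (hypothetical data; note that `statistics.quantiles` uses the "exclusive" quartile method by default):

```python
import statistics

# Hypothetical ages of 11 respondents, in ascending order
ages = [21, 22, 22, 23, 24, 25, 26, 28, 30, 33, 40]

mode = statistics.mode(ages)      # value with the highest frequency
median = statistics.median(ages)  # divides the units into 2 equal parts
mean = statistics.mean(ages)      # sum of values / number of units

# Quartile deviation: (Q3 - Q1) / 2
q1, _, q3 = statistics.quantiles(ages, n=4)
quartile_deviation = (q3 - q1) / 2

# Standard deviation: based on the distance from each unit to the mean
sd = statistics.stdev(ages)
```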
Week 5 Quantitative analysis
Types of Quantitative analysis:
Descriptive statistics: explaining the basic features of the data
Inferential statistics: aims to draw conclusions (inferences) from the data that can be generalized to a larger population
Measurement Levels and Descriptive Statistics
Measurement levels:
Nominal: data with no inherent order or ranking (e.g. nationality, gender, religion)
Ordinal: data with an inherent order or rank (e.g. age group, level of education)
Scale/ continuous: ordered data with a meaningful metric (e.g. GDP)
Descriptive statistics: allow us to summarize and display information about a single variable, also called univariate analysis
Frequency /percentage distribution: describes how the sample is distributed
Measures of central tendency: mean (average), median (middle point), mode (most common/frequent)
Measures of variation: range (highest value -lowest value), interquartile range (Q3- Q1), standard deviation (average spread of the data around the mean where the larger the SD, the more spread out the data is).
Outliers and Bivariate Analysis
Outliers: a data point that differs significantly from other observations, which can skew the results and SD; an outlier lies more than 1.5 * IQR above Q3 or below Q1
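A minimal sketch of the 1.5 * IQR outlier rule in Python, on hypothetical income data:

```python
import statistics

# Hypothetical incomes (in thousands); 200 is a suspected outlier
incomes = [30, 32, 35, 36, 38, 40, 41, 43, 45, 47, 200]

q1, _, q3 = statistics.quantiles(incomes, n=4)
iqr = q3 - q1  # interquartile range

# A point is flagged as an outlier if it lies more than
# 1.5 * IQR above Q3 or below Q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [x for x in incomes if x < lower or x > upper]
```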
Bivariate analysis: two variables are analyzed to determine their relationship
Correlation coefficient: measures the degree of linearity in the relationship between the variables; it lies between -1 and +1 and can be weak (-0.2 to 0.2), medium (-0.4 to -0.2 or 0.2 to 0.4), or strong (-1 to -0.4 or 0.4 to +1).
Statistical significance: tells us whether a relationship exists that is not based on chance, relying on a p-value between 0 and 1; the threshold p < 0.05 is used: if p < 0.05, the relationship is considered statistically significant.
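As an illustrative sketch (hypothetical paired data), Pearson's correlation coefficient can be computed by hand and classified with the strength thresholds from these notes:

```python
import math

# Hypothetical paired observations: study hours (x) and exam scores (y)
x = [2, 3, 5, 7, 9]
y = [50, 55, 60, 71, 80]

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n

# Pearson's r: covariance divided by the product of the spreads
cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
r = cov / math.sqrt(sum((a - mean_x) ** 2 for a in x)
                    * sum((b - mean_y) ** 2 for b in y))

# Classify strength using the thresholds from the notes
strength = "weak" if abs(r) < 0.2 else "medium" if abs(r) <= 0.4 else "strong"
```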
Multivariate Analysis
Multivariate analysis: allows the simultaneous investigation of the relationship between more than two variables. By adding more variables, we are able to control for differences in the sample.
Types of variables:
Independent
Dependent
Control: anything that is held constant or limited in a research study; it is not of direct interest to the study’s objectives but may have some impact.
Confounding variable: a variable that influences both the IV and the DV, causing a spurious association.
Spurious correlation: two events are found to be correlated despite having no logical connection.
Inferential statistics: types of analysis used to determine something about a population based on a sample, generally used for hypotheses testing.
Qualitative Analysis types
Content analysis: subjective interpretation of text data through systematic coding, condensing the raw data into categories by developing a codebook in which each code is clearly defined with in-/ exclusion criteria.
(reflexive) thematic analysis: identifying, analyzing, and reporting patterns (themes) within data
(critical) discourse analysis: examine patterns of language across texts and consider the relationship between language and the social and cultural contexts in which it is used; language is not neutral.
Narrative analysis: discourses with a clear sequential order that connects events in a meaningful way to understand how do people make sense of what happened?
Week 6 Chapter 15 (in Grønmo 2019) Data quality
Data quality is based on scientific principles of truth and logical discussion, which is why the selection of units and data collection must be carried out in a systematic manner, in accordance with the used research design.
2 criteria that must be maintained are reliability and validity
Reliability & Validity
Reliability: accuracy and trustworthiness, which is often tested by repeating research (quan)
Validity: adequacy and relevance, which needs consistency in theoretical and operational definitions; reliability is needed to have validity.
Reliability types:
Stability: consistency during repetition tested by the test-retest method, which entails a repetition with random sample of units.
Equivalence: comparison of data between studies with the same design carried out by different people, tested by the inter-subjectivity method, which compares different observers/ interviewers (deviations in findings are due to observer reliability).
Equivalence between different indicators in the same index is tested with the split-half method / internal consistency test, where data within a single study are compared across different parts of the research design.
Reliability is measured on a scale from 0 (no consistency) to 1 (full consistency), and does not take random consistency into account. This can be expressed by Scott’s pi
Pi = (% actual consistency - % random consistency) / (100% - % random consistency)
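Scott's pi can be computed from two coders' decisions as follows (a hypothetical two-category example; proportions are used in place of percentages, which is equivalent):

```python
# Two coders' decisions on the same 10 items (hypothetical data)
coder_a = ["yes", "yes", "no", "yes", "no", "no", "yes", "no", "yes", "yes"]
coder_b = ["yes", "no", "no", "yes", "no", "yes", "yes", "no", "yes", "yes"]
n = len(coder_a)

# Actual (observed) consistency: share of items coded identically
p_observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n

# Random (expected) consistency: sum of squared joint proportions,
# where each category's proportion is pooled across both coders
categories = set(coder_a) | set(coder_b)
p_expected = sum(
    ((coder_a.count(c) + coder_b.count(c)) / (2 * n)) ** 2 for c in categories
)

pi = (p_observed - p_expected) / (1 - p_expected)  # about 0.58 here
```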
Qualitative Reliability
In qual, reliability is also interpreted as credibility: findings are based on data about actual, objective conditions. This involves internal consistency between different elements of the data within the study (and in relation to the bigger picture), and external consistency between the data in the study and other available information about the conditions studied.
Validity types
Face validity: data collection is appropriate for the intention of the study
Quan:
Definitional: theoretical (intent on what to study) and operational (what actually is being studied)
Content validity: operational definition is broad
Criterion validity: consistency between data for the same concept under different definitions, involving concurrent/ predictive tests: an already established test is executed at the same time as a newer one to test the newer one’s validity.
Construct validity: data on the relationship between concepts aligns with the known relationship between them, and may be either convergent (there is a relation) or discriminant (lack of a relation)
Internal validity: experiment is satisfactory
External validity: results are realistic / generalizable
Qual
Competence validity: researcher’s competence to collect the data is deemed sufficient, due to experience.
Communicative validity:
Discussion between the researcher and others about the data: directly with the sources (actor validation) and/or with colleagues (collegial validation, which offers different academic perspectives).
Pragmatic validity: good basis for action, prescriptions and suggestions made by the study
Week 6: Evaluating quantitative research: rigor
Measurement error: the difference between observed value and the actual value
Random error: create imprecision in data and impact how reproducible the result would be. Reduce by repeating, increasing sample size, and controlling variables
Systematic error: creates bias in data and means that measurements of the same thing will vary in predictable ways. Reduce by triangulation and calibration.
Reliability & Validity Evaluation
Reliability: accuracy and consistency of data collection
Stability and test-retest
Equivalence: internal consistency assesses whether you get the same results when testing a specific concept via multiple items (like survey questions)
Equivalence (inter-subjective): the degree to which multiple observers agree on a measurement; also called inter-rater reliability
Validity: adequacy and relevance of data collection
Measurement validity
Content: does the measure adequately capture the concept’s full meaning?
Criterion (concurrent): measures the consistency between different measures for the same concept
Construct: shows the relationship between measures for different concepts
Strong correlation between concepts is convergent validity.
Weak correlation is discriminant validity.
Design validity
Internal: does the IV cause the DV? Only tested in (survey) experiments
External (population and ecological): can it be generalized to the population and across settings?
Qualitative trustworthiness criteria
Credibility: establishing that the results are believable
Transferability: applicability of findings to other settings
Dependability: accounting for the changing context in which the research is conducted
Confirmability: can it be confirmed by other researchers
Reflexivity: being open about positionality (how the researcher's background affects data collection and interpretations)
Week 7 Chapter 3 Basic ethical norms
Communalism: open available research
Universalism: pure academic criteria, objectivity
Disinterestedness: motives are purely scientific
Organized skepticism: results are discussed
Originality: no plagiarism
Humility: open about limitations
Honesty
Unethical behaviour
Plagiarism: copy other authors’ work and publish as your own
Fabrication of data
Inappropriate assigning of authorship
Rules for participants: consent, safety, anonymity, and clear information; informed consent is required.
Online research: netiquette
Ethical resources: ethical guidelines and ethical committees
The Purpose of Social Science
Analytical: understanding phenomena
Critical: finding point of critique
Constructive: improve society by changing phenomena
Shenton, A. K. (2004). Strategies for ensuring trustworthiness in qualitative research projects. Education for Information, 22(2), 63-75.