Data Analysis - informatics

Data Analysis - semester 1

49 Terms

1

Cross-sectional data

Data collected at a single point in time from a sample.

Provides a snapshot of a social phenomenon.

2

Name an example of cross-sectional data

World Values Survey

3

Representative sample

A smaller group chosen from a larger population that reflects that population's key characteristics.

4

World Values Survey

A repeated cross-sectional survey that examines the evolution of social values.

Each country is represented by a representative sample of 1200 individuals.

5

Observations

Numeric values, categorical codes, or scale scores representing variables of interest for a set of cases in a cross-sectional dataset.

6

metadata

adds meaning to data by describing it

provides descriptors such as variable names, labels, coding schemes, and measurement levels so that the data can be properly interpreted

7

microdata

data gathered from individuals

8

macrodata

data about social units

(countries, counties, organisations, etc)

9

Variable

A characteristic that varies between cases.

Can refer to facts, attitudes or social values.

10

Values

Numbers that represent a response category or response

11

Variable Name

should not be more than 8 characters.

should not have special characters.

12

String variable

Just text or text with numbers and signs

13

Numeric variable

A variable that contains only numbers

14

Numbers vs classifications

Number - number and values are the same

Classifications - the category receives a value
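
The distinction can be sketched in a few lines of Python. This is only an illustration: the trust question, its codes and its labels are hypothetical and not taken from any real dataset, and `describe` is a made-up helper.

```python
# Hypothetical value labels: codes and wording are illustrative only.
VALUE_LABELS = {1: "Most people can be trusted", 2: "You need to be very careful"}

def describe(value, value_labels=None):
    """Return what a stored value means.

    Without value labels the variable is a plain number, so the value is
    its own meaning. With value labels it is a classification, so the
    value is only a code standing for a category.
    """
    if value_labels is None:
        return value
    return value_labels.get(value, "unlabelled value")

age = describe(34)                 # number: the value 34 simply means 34
trust = describe(2, VALUE_LABELS)  # classification: 2 stands for a category
print(age, "|", trust)
```
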

15

Technical use vs substantial use of frequency tables

Technical use - see which values and value labels the variable has

Substantial use - see how respondents are distributed across the variable's values

16

Elements of a frequency table

[Image: frequency table]

17

Valid cases

respondents giving a response to the substantial question

18

missing cases

respondents refusing to answer

19

frequency

number of cases for each value

20

percent

percent of cases for each value

21

Valid percent

percentage for each valid value.

When the number of missing values is large, the difference between percent and valid percent is large.

22

Cumulative percent

each percentage is added to the ones before it, giving a running total.

useful when values are ordered.
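
The columns of a frequency table (frequency, percent, valid percent, cumulative percent) can be reproduced with a short Python sketch. It assumes that missing cases are stored as `None`, that at least one response is valid, and that the cumulative column is built from the valid percentages; the function name and the sample responses are hypothetical.

```python
from collections import Counter

def frequency_table(responses):
    """Return rows of (value, frequency, percent, valid percent, cumulative percent).

    None marks a missing case. Percent uses all cases as the base,
    valid percent uses only the valid cases.
    """
    total = len(responses)
    valid = [r for r in responses if r is not None]
    counts = Counter(valid)
    rows = []
    cumulative = 0.0
    for value in sorted(counts):
        freq = counts[value]
        percent = 100 * freq / total
        valid_percent = 100 * freq / len(valid)
        cumulative += valid_percent  # running total of valid percent
        rows.append((value, freq, round(percent, 1),
                     round(valid_percent, 1), round(cumulative, 1)))
    return rows

# Hypothetical answers on a 1-3 scale; None = missing case.
responses = [1, 2, 2, 3, 3, 3, None, None, 2, 1]
rows = frequency_table(responses)
for row in rows:
    print(row)
```

With two of the ten cases missing, percent and valid percent differ for every value (for example 20.0 against 25.0 for value 1), and the cumulative column ends at 100.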

23

Technical use vs substantial use of a contingency table

Technical - see if filters are in place

Substantial - see how one variable is associated with another, where one is considered dependent (outcome) and the other independent (predictor).

24

Contingency Table

[Image: contingency table]

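
A contingency table with percentages within the independent variable can be sketched in Python as follows. The variables (area as predictor, opinion as outcome) and the data are hypothetical, and `crosstab_percent` is an illustrative helper rather than a function from any statistics package.

```python
from collections import Counter

def crosstab_percent(pairs):
    """Cross-tabulate (independent, dependent) pairs.

    Returns {independent: {dependent: percent}}, where the percentages
    are computed within each category of the independent variable.
    """
    cell_counts = Counter(pairs)
    group_totals = Counter(ind for ind, _ in pairs)
    table = {}
    for (ind, dep), n in cell_counts.items():
        table.setdefault(ind, {})[dep] = round(100 * n / group_totals[ind], 1)
    return table

# Hypothetical cases: (area, opinion).
pairs = [
    ("urban", "agree"), ("urban", "agree"), ("urban", "disagree"),
    ("rural", "agree"), ("rural", "disagree"),
    ("rural", "disagree"), ("rural", "disagree"),
]
table = crosstab_percent(pairs)
for area, distribution in table.items():
    print(area, distribution)
```

Comparing the outcome percentages across the categories of the predictor (here 66.7% of urban against 25.0% of rural cases agreeing) is what shows whether the two variables are associated.
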
25

independent variable

the variable you change to see if it has an effect on the dependent variable

26

Concepts

Concepts are abstract terms used to describe characteristics of social units, such as gender roles, work ethic, religiosity, progressive values, and more.

27

Single-item scale

closed questions.

respondents are asked to select one or more choices that fit their situation, and the answer is recorded in a SINGLE variable

28

Multi-item scales

closed questions.

respondents are asked to select one choice that fits their situation for each of at least two variables.

29

Recoding the variable

creating a new variable by modifying an EXISTING one (e.g. collapsing categories, reversing the scale, changing values)
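
Two of these recodes, reversing a scale and collapsing categories, can be sketched in Python. The 1-5 agreement scale, the three-category grouping and the function names are assumptions made for illustration; missing values are assumed to be stored as `None` and are left missing.

```python
def reverse_scale(value, low=1, high=5):
    """Reverse a scale so that low becomes high and vice versa."""
    if value is None:
        return None  # a missing value stays missing
    return low + high - value

# Assumed grouping of a 1-5 scale into 3 categories.
COLLAPSE = {1: 1, 2: 1, 3: 2, 4: 3, 5: 3}

def collapse(value):
    """Collapse five categories into three; unknown or missing codes give None."""
    return COLLAPSE.get(value)

agreement = [1, 2, 3, 4, 5, None]                     # existing variable
agreement_rev = [reverse_scale(v) for v in agreement]  # new, reversed variable
agreement_3cat = [collapse(v) for v in agreement]      # new, collapsed variable
print(agreement_rev)
print(agreement_3cat)
```

The original list is left untouched, which mirrors the idea that a recode produces a new variable instead of overwriting the existing one.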

30

Computing a new variable

summarising a multi-item scale by creating a NEW single variable.

31

Listwise

if a case has missing values, it is ignored when computing the sum of variables or the mean of variables.

fewer valid cases.

more missing cases.

better than pairwise.

32

Pairwise

if a case has missing values, it is not ignored.

more valid cases.

fewer missing cases.

better when the number of cases available for analysis is very low.
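
The two treatments can be compared with a small Python sketch that computes a new variable (the mean of a multi-item scale) for each case. It follows the definitions on these two cards and assumes that, under the pairwise treatment, the mean is taken over whichever items were answered; the function names and the three-item cases are hypothetical.

```python
from statistics import mean

def scale_mean_listwise(items):
    """Mean of the items, or None if any item is missing (case ignored)."""
    if any(v is None for v in items):
        return None
    return mean(items)

def scale_mean_pairwise(items):
    """Mean of the answered items; None only if every item is missing."""
    available = [v for v in items if v is not None]
    return mean(available) if available else None

# Hypothetical cases, each with three items of one scale; None = missing.
cases = [[4, 5, 3], [2, None, 4], [None, None, None]]
listwise = [scale_mean_listwise(c) for c in cases]
pairwise = [scale_mean_pairwise(c) for c in cases]
print("valid cases, listwise:", sum(v is not None for v in listwise))
print("valid cases, pairwise:", sum(v is not None for v in pairwise))
```

The second case is lost under the listwise treatment but kept under the pairwise one, which is why the pairwise treatment leaves more valid cases.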

33

Bar Chart purpose

used to display a distribution of a categorical variable or to represent a metric variable with multiple distinct values.

34

Bar Chart x-axis and y-axis

x-axis - categories or groups being described

y-axis - numerical values or frequencies

35

What does a bar chart represent from a frequency table?

Valid Percent

36

Histograms vs bar charts

Histograms visualize quantitative data or numerical data, whereas bar charts display categorical variables

bar chart - COMPARING and displaying data across different categories, categorical data, bars do not touch

histogram - good for continuous data, numerical data, bars touch
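
The difference shows up in how the bar heights are obtained. A bar chart counts cases per category, while a histogram first groups a numerical variable into adjacent intervals, as in this Python sketch. The ages, the interval width and the helper name `histogram_bins` are assumptions chosen for illustration.

```python
from collections import Counter

def histogram_bins(values, start, width, n_bins):
    """Count values in n_bins adjacent intervals [start + i*width, start + (i+1)*width)."""
    counts = [0] * n_bins
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < n_bins:  # values outside the range are skipped
            counts[index] += 1
    return counts

# Bar chart input: one count per separate category.
religion = ["A", "B", "A", "C", "A"]
bar_heights = Counter(religion)

# Histogram input: counts per interval 10-19, 20-29, ..., 50-59.
ages = [18, 22, 25, 31, 34, 38, 41, 47, 52, 59]
bin_counts = histogram_bins(ages, start=10, width=10, n_bins=5)
print(dict(bar_heights))
print(bin_counts)
```

Because the intervals share their boundaries, histogram bars touch; the categories of a bar chart have no such shared boundary, so its bars are drawn apart.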

37

Clustered bar chart purpose

to visualise a contingency table with percentages within one of the variables.

shows subcategories.

38

Stacked bar chart

the same as a clustered bar chart but the bars are stacked on top of each other

39

nominal data

categories with no order
e.g. gender, ethnicity, religion

use: frequencies, percentages, bar charts

40

ordinal data

logical order

e.g. educational level, social class

use: frequencies, percentages, median

41

scale data (interval/ratio)

numeric

e.g. age, income, hours worked

use: mean, median, standard deviation, histograms
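
The "use" lines of the nominal, ordinal and scale cards can be summed up in one Python sketch that picks the summary statistics from the measurement level. The function name, the level names as strings and the sample data are hypothetical, and ordinal categories are assumed to be stored as ordered numeric codes.

```python
from collections import Counter
from statistics import mean, median, stdev

def summarise(values, level):
    """Return the summaries suited to a measurement level."""
    if level == "nominal":
        return {"frequencies": dict(Counter(values))}
    if level == "ordinal":
        # Ordered codes make the median meaningful.
        return {"frequencies": dict(Counter(values)), "median": median(values)}
    if level == "scale":
        return {"mean": mean(values), "median": median(values), "sd": stdev(values)}
    raise ValueError(f"unknown measurement level: {level}")

religion = summarise(["A", "B", "A"], "nominal")
education = summarise([1, 2, 2, 3, 3], "ordinal")   # 1 = lowest level
hours = summarise([30, 40, 40, 50], "scale")
print(religion)
print(education)
print(hours)
```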

42

frequency

raw count (how many)

43

percentages

frequencies expressed as a share out of 100

44

cumulative percent

adds percentages progressively (should end at 100%)

not meaningful for nominal data, because the categories have no order

45

bar chart

nominal or ordinal

separate bars

compares things at one point in time

46

histogram

scale data

bars touch

47

which percent to use when there is missing data

valid percent

48

what does valid percent exclude?

missing values

49

frequencies table

how often each response occurs