Chapter 14 - Quantitative Data Analysis

57 Terms

1

Database

A repository of data (what researchers access to retrieve data).

2

Dataset

A set of variables and cases that researchers use for analysis.
Often derived from a larger database.
Could be the same as a "database" if an entire database is included in the dataset.
Can also have a dataset that you create/develop with no database (like most examples we use in class).

3

Database ex

Income information for all people in NC who received Medicaid services from 19995-2017 (housed on a government server in Raleigh).

4

Dataset ex

Income information for 100 people in Pitt County who received Medicaid services in 2017 (in an Excel spreadsheet on a researcher's computer at ECU).

5

Sample Size (Sample)

A subset of a population that is used to study the population as a whole.

6

Sampling Frame

A list of all elements or other units containing the elements in a population.
A listing of the accessible population from which you'll draw your sample.

7

Rows

Cases or observations (indicate the Units of Analysis)

8

Columns

Variables (quantitative or qualitative)

9

Units of Analysis

What is being studied.
Is often people (children, adults, etc.)

10

Identification (ID) Variable

Typically, the first column in a dataset.
A unique value for each unit of analysis (different cases cannot share the same value).
Examples: case numbers, name (if all unique), Medicaid ID, etc.
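A minimal sketch of the rows/columns/ID idea, assuming a dataset stored as a list of rows. The variable names (`case_id`, `age`, `county`) and the values are made up for illustration.

```python
# Hypothetical dataset: each row is one case, each key is a variable (column).
dataset = [
    {"case_id": 1, "age": 34, "county": "Pitt"},
    {"case_id": 2, "age": 51, "county": "Pitt"},
    {"case_id": 3, "age": 34, "county": "Wake"},
]

def ids_are_unique(rows, id_column):
    """Return True when no two cases share the same value in id_column."""
    ids = [row[id_column] for row in rows]
    return len(ids) == len(set(ids))

print(ids_are_unique(dataset, "case_id"))  # True: usable as an ID variable
print(ids_are_unique(dataset, "age"))      # False: two cases share age 34
```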

11

Descriptive Statistics

Statistics used to describe the distribution of and relationship among variables.
The analysis of data that helps describe, show, or summarize data.
Two main types: measures of central tendency and measures of variability (e.g., standard deviation).

12

Central Tendency

The most common value (for variables measured at the nominal level) or the value around which cases tend to center (for a quantitative variable).
The typical or central value for a variable; the most common measures are the mean, median, and mode.

13

Mean

The arithmetic, or weighted, average of a set of numbers, computed by adding up the values of all the cases and dividing by the number of cases.
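The calculation can be sketched in a few lines. The scores below are made-up illustration values.

```python
def mean(values):
    """Add up the values of all the cases and divide by the number of cases."""
    return sum(values) / len(values)

scores = [2, 4, 4, 5, 10]  # made-up scores
print(mean(scores))  # 5.0
```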

14

Advantages to using the Mean

Widely understood and commonly used (most reliable)
A good indication of the central tendency of a distribution
Takes into account all scores
Good for interval & ratio data

15

Disadvantages to using the Mean

Can be skewed by extreme scores (best when there are no extremes).
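A small sketch of this effect, using made-up scores and Python's standard `statistics` module. Changing one score to an extreme value moves the mean a long way while the median stays put.

```python
from statistics import fmean, median

typical = [2, 4, 4, 5, 10]    # made-up scores
extreme = [2, 4, 4, 5, 100]   # same scores with one extreme value

print(fmean(typical), median(typical))  # 5.0 4
print(fmean(extreme), median(extreme))  # 23.0 4
```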

16

Median

The point that divides a distribution in half; the middle value in a sequence.
Organize the set of numbers from lowest to highest -> locate the middle number.
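The two steps above can be sketched as follows. Averaging the two middle values when the number of cases is even is a common convention assumed here; the card itself only covers the single-middle-number case.

```python
def middle_value(values):
    """Sort from lowest to highest, then locate the middle number."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    # Assumed convention for an even count: average the two middle values.
    return (ordered[middle - 1] + ordered[middle]) / 2

print(middle_value([5, 2, 10, 4, 4]))  # 4
print(middle_value([1, 3, 5, 7]))      # 4.0
```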

17

Advantages to using the Median

Easy to calculate
Is not skewed by extreme scores
Useful when needing a "middle value" (e.g., housing values)
Works for interval and ratio data

18

Disadvantages to using the Median

No information provided by more extreme scores

19

Mode

The most frequently occurring value in the distribution.
Organize the set of numbers from smallest to largest
Locate the number that appears in the list most often.
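A minimal sketch of finding the mode by counting, using made-up data. It returns a list because a distribution can have more than one most frequent value; the function name `modes` is just for illustration.

```python
from collections import Counter

def modes(values):
    """Return every value that occurs most often, in sorted order."""
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(value for value, count in counts.items() if count == highest)

children_in_home = [0, 1, 2, 2, 3, 2, 1]  # made-up counts
print(modes(children_in_home))  # [2]

majors = ["Nursing", "Biology", "Nursing", "Biology", "Social Work"]
print(modes(majors))  # ['Biology', 'Nursing']
```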

20

Advantages to using the Mode

Easy to calculate
Is not skewed by extreme scores
Useful to determine what occurred the "most" (e.g., the # of children in the home).
Can use with counts of nominal data (e.g., most common major at ECU)

21

Disadvantages to using the Mode

Information is obtained about only one response value.
Cannot be used if all the scores are different, and there might be several modes.
The most frequent score may not actually be "in the middle" of the distribution.

22

Variability (or Spread)

The extent to which cases are spread out through the distribution or clustered in just one location.
The dispersion of scores for a variable (range and SD)
The greater the difference between scores, the more spread out the distribution is.
The more tightly the scores group together, the less variability there is in the distribution.

23

Range

The true upper limit in a distribution minus the true lower limit (or the highest rounded value minus the lowest rounded value).
The difference between the highest and lowest scores -> tells us something about the variation of the scores.
(The simplest way)
Can be drastically altered by just one exceptionally high or low value.
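A short sketch with made-up scores, showing both the calculation and how a single extreme value changes it.

```python
def value_range(values):
    """Highest score minus lowest score."""
    return max(values) - min(values)

print(value_range([2, 4, 4, 5, 10]))   # 8
print(value_range([2, 4, 4, 5, 100]))  # 98: one extreme value dominates
```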

24

Standard Deviation

The square root of the average squared deviation of each case from the mean.
A statistic that indicates how far scores in a sample (or population) are spread out from the mean.
Provides a measure of the overall variation in a dataset.
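The first definition above can be followed step by step, as in this sketch with made-up scores. It divides by the number of cases, exactly as the definition reads; note that many software packages divide by one less than the number of cases when working with a sample, which gives a slightly larger result.

```python
import math

def standard_deviation(values):
    """Square root of the average squared deviation of each case from the mean."""
    mean = sum(values) / len(values)
    squared_deviations = [(value - mean) ** 2 for value in values]
    return math.sqrt(sum(squared_deviations) / len(values))

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up scores with a mean of 5
print(standard_deviation(scores))  # 2.0
```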

25

Standard Deviation Example

Small SD indicates a small amount of variability for a given data set; there will be a lot of values closer to the mean (less spread out).

26

Univariate Analysis: Qualitative Variable (nominal; ordinal)

Mode

27

Univariate Analysis: Quantitative Variable (interval; ratio)

Mean
Median
Mode
Range
Standard Deviation

28

What are Univariate Tables

Often used with qualitative (nominal/ordinal) values.
Quantitative (interval/ratio) variables need to be grouped into categories first.
Shows frequencies, percentages, or proportions of values for one variable.
Very commonly used in research.
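A minimal sketch of a univariate table for one nominal variable, showing frequencies and percentages. The list of majors is made up, and `frequency_table` is a hypothetical helper name.

```python
from collections import Counter

def frequency_table(values):
    """Map each category to (frequency, percentage of all cases)."""
    counts = Counter(values)
    total = len(values)
    return {
        category: (count, round(100 * count / total, 1))
        for category, count in sorted(counts.items())
    }

majors = ["Nursing", "Biology", "Nursing", "Social Work", "Nursing"]
for category, (count, percent) in frequency_table(majors).items():
    print(f"{category:12} {count:3} {percent:6}%")
```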

29

Bar Charts

Used with qualitative (nominal/ordinal) data
Shows the frequencies, percentage, or proportions of values for one variable
The variable distribution is displayed with solid bars separated by spaces.

30

Histograms

Used with quantitative (interval/ratio) data.
Shows frequencies, percentages, or proportions of values for one variable.
Good at showing how data is distributed.
The variable distribution is displayed with adjacent bars NOT separated to indicate that the variable is continuous.
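A rough sketch of what a histogram does behind the scenes: it groups a quantitative variable into adjacent, equal-width intervals and counts the cases in each. The ages, the starting point, and the interval width are all made-up choices for illustration.

```python
def bin_counts(values, start, width, n_bins):
    """Count how many values fall in each adjacent interval of equal width."""
    counts = [0] * n_bins
    for value in values:
        index = int((value - start) // width)
        if 0 <= index < n_bins:  # values outside the covered span are ignored
            counts[index] += 1
    return counts

ages = [18, 19, 22, 25, 31, 34, 35, 47]  # made-up ages
counts = bin_counts(ages, start=10, width=10, n_bins=4)
print(counts)  # [2, 2, 3, 1]

# Text version of the histogram: one bar per interval, no gaps between intervals.
for i, count in enumerate(counts):
    low = 10 + i * 10
    print(f"{low}-{low + 9}: {'#' * count}")
```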

31

Bar Chart Qualities

Mutually exclusive categories
Exhaustive categories
Count, percent, or proportion on y-axis
Qualitative variables (nominal/ordinal)
Space between bars

32

Histogram Qualities

Mutually exclusive categories
Exhaustive categories
Count, percent, or proportion on y-axis
Quantitative variables (interval/ratio)
No space between bars

33

Boxplots

For quantitative (interval/ratio) data
Shows/summarizes the distribution of a variable (i.e., the relative number, percent, or proportion of times that each possible score or value occurred).
Useful for comparing distributions between groups.

34

Interquartile Range

The difference between the upper and lower quartiles.
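A sketch using `statistics.quantiles` from Python's standard library to get the quartiles. The data are made up, and there are several accepted ways to compute quartiles, so other software may give slightly different values for the same data.

```python
from statistics import quantiles

def interquartile_range(values):
    """Upper quartile minus lower quartile (default 'exclusive' method)."""
    lower_quartile, _median, upper_quartile = quantiles(values, n=4)
    return upper_quartile - lower_quartile

scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]  # made-up scores
print(interquartile_range(scores))  # 6.0
```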

35

Outliers

A data point located outside the whiskers of a boxplot.
A single data point that falls far from the typical values in a group of data. May be an exception.
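A sketch of flagging outliers, assuming the widely used convention that boxplot whiskers extend at most 1.5 times the interquartile range beyond the quartiles. The card does not say how the whiskers are drawn, so the 1.5 multiplier is an assumption, and the data are made up.

```python
from statistics import quantiles

def find_outliers(values, multiplier=1.5):
    """Return values beyond the assumed whisker limits (quartile +/- 1.5 * IQR)."""
    lower_quartile, _median, upper_quartile = quantiles(values, n=4)
    iqr = upper_quartile - lower_quartile
    lower_limit = lower_quartile - multiplier * iqr
    upper_limit = upper_quartile + multiplier * iqr
    return [value for value in values if value < lower_limit or value > upper_limit]

scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 30]  # made-up scores
print(find_outliers(scores))  # [30]
```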

36

Pie Charts

Best when you'd like to emphasize one especially big value or one especially small value.

37

Why are pie charts generally not recommended for research?

Not always right for making comparisons:
- When the pieces are similar in size, it is difficult to judge their relative size.
- Problems arise especially when the variable has many levels and/or the levels are roughly the same size.

38

Univariate Graph: Qualitative Variables

Table
Bar Chart

39

Univariate Graph: Quantitative Variables

Histogram
Box Plot

40

Descriptive Statistics uses numbers, variables, proportions, etc., to

characterize (i.e., describe) or summarize information about a particular group.
Fairly intuitive.
Most statistics we have talked about in this course (e.g., mean & SD)

41

Inferential Statistics uses statistical methods to

make generalizations about a population using a sample drawn from that population.
Relies on more complex probability theories.

42

Inferential Statistics

Mathematical tools for estimating how likely it is that a statistical result based on data from a random sample is representative of the population from which the sample is assumed to have been selected.
Makes assumptions about the distribution.
"p-value"
Hypothesis testing.

43

Descriptive Statistics examples

What is the distribution of dementia by levels of social support?
What is the distribution of trauma scores by level of ACES?

44

Inferential Statistics examples

Is there a statistically significant relationship between social support and dementia in the population, based on this sample?
Is there a statistically significant relationship between trauma scores and levels of ACEs in the population, based on this sample?

45

Data Distribution

Data can be distributed in different ways: it can be spread out more on the left, more on the right, or all jumbled up.

46

Normal Distribution (bell curve)

The data is evenly (normally) distributed.
Looks like a bell and is symmetric; this shape is important for sampling and statistical analysis.

47

Skewness

The extent to which cases are clustered more at one or the other end of the distribution of a quantitative variable, rather than in a symmetric pattern around its center.
Highly skewed data can lead to misleading results.

48

Positive Skew

Skew to the right
The number of cases tapers off to the right

49

Negative Skew

Skew to the left
The number of cases tapers off to the left
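A rough rule of thumb, not a formal skewness statistic: the long tail pulls the mean toward it, so a mean above the median suggests positive skew and a mean below the median suggests negative skew. The sketch below applies that rule to made-up data; `skew_direction` is a hypothetical helper name.

```python
from statistics import fmean, median

def skew_direction(values):
    """Compare mean and median as a quick hint about the direction of skew."""
    mean_value = fmean(values)
    median_value = median(values)
    if mean_value > median_value:
        return "positive (tail to the right)"
    if mean_value < median_value:
        return "negative (tail to the left)"
    return "roughly symmetric"

print(skew_direction([1, 2, 2, 3, 3, 3, 4, 20]))         # positive (tail to the right)
print(skew_direction([1, 17, 18, 18, 19, 19, 19, 20]))   # negative (tail to the left)
print(skew_direction([1, 2, 3, 4, 5]))                   # roughly symmetric
```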

50

p-value

A number (statistic) used to indicate how likely it is that the result of a statistical test was due to chance (random variation) alone.
A small p-value points to a "significant" effect or difference.
Think of it as the probability of finding a result at least this extreme by chance if the null hypothesis were true. ("p" = "probability")
Relates to inferential rather than descriptive statistics.

51

p < 0.05

Generally, the acceptable cutoff (sometimes p < .10 for more exploratory research) for a result to be "statistically significant".
May also see p < 0.01; p < 0.001 -- these are all acceptable.

52

Hypothesis

A proposed relationship between two variables.
A tentative statement about empirical reality involving a relationship between two or more variables.

53

Null Hypothesis (Ho)

The hypothesis of no (null) effects. The hypothesis you are testing against and trying to disprove or discredit: there is no statistically significant relationship between the two variables.

54

Alternative Hypothesis (Ha)

The hypothesis of something being different than the null. The hypothesis you are testing: there is a statistically significant relationship between two variables.

55

Statistically Significant

A p-value less than 0.05 is typically considered statistically significant, in which case the null hypothesis (Ho) should be rejected.

56

Hypothesis Example

You believe there is a lower level of depression in our class than in the general population, so you give students in the class the BDI (Beck's Depression Inventory) & take the average.

57

Hypothesis Example: (Ho) or (Ha) ?

(Ho) = The average level of depression in this class is the same as the general population.
(Ha)= The average level of depression in this class is lower than (different than) the general population.

You reject the null hypothesis and accept the alternative hypothesis if the statistics indicate a difference at the p < 0.05 level: if the class really matched the general population, a difference this large would occur by chance less than 5% of the time.
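A hedged sketch of how this example could be tested. The card does not name a specific test, so a one-sided, one-sample z-test is assumed here, which requires knowing the population mean and standard deviation. The class scores, the population mean of 10, and the population SD of 6 are all made-up numbers, not real BDI norms.

```python
import math
from statistics import NormalDist, fmean

def one_sided_z_test(sample, population_mean, population_sd):
    """Return (z, p) for Ha: the sample mean is lower than the population mean."""
    standard_error = population_sd / math.sqrt(len(sample))
    z = (fmean(sample) - population_mean) / standard_error
    p_value = NormalDist().cdf(z)  # chance of a mean this low or lower if Ho is true
    return z, p_value

class_scores = [6, 8, 5, 9, 7, 4, 8, 6, 7]  # hypothetical BDI scores
z, p = one_sided_z_test(class_scores, population_mean=10, population_sd=6)

if p < 0.05:
    print("Reject Ho: the class average is significantly lower.")
else:
    print("Fail to reject Ho: no significant difference found.")
```

With these made-up numbers the p-value falls just under 0.05, so Ho is rejected; with a smaller class or a noisier population the same average could easily fail to reach significance.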
