Chapter 14 - Quantitative Data Analysis

Database

A repository of data (what researchers access to retrieve data).

Dataset

A set of variables and cases that researchers use for analysis.
Often derived from a larger database.
Could be the same as a "database" if an entire database is included in the dataset.
Can also have a dataset that you create/develop with no database (like most examples we use in class).

Database ex

Income information for all people in NC who received Medicaid services from 19995-2017 (housed on a government server in Raleigh).

Dataset ex

Income information for 100 people in Pitt County who received Medicaid services in 2017 (in an Excel spreadsheet on a researcher's computer at ECU).

Sample Size (Sample)

A subset of a population that is used to study the population as a whole.

Sampling Frame

A list of all elements or other units containing the elements in a population.
A listing of the accessible population from which you'll draw your sample.

Rows

Cases or observations (indicate the Units of Analysis)

Columns

Variables (quantitative or qualitative)

Units of Analysis

What is being studied.
Is often people (children, adults, etc.)

Identification (ID) Variable

Typically, the first column in a dataset.
A unique value for each unit of analysis (cannot have the same value for different cases).
Examples: case numbers, name (if all unique), Medicaid ID, etc.
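
The row, column, and ID-variable ideas above can be sketched in a few lines of Python. The dataset, the column names, and the helper `ids_are_unique` are all hypothetical and exist only for illustration.

```python
# Hypothetical dataset: each dict is one row (a case), each key is one column (a variable).
dataset = [
    {"case_id": 1, "county": "Pitt", "income": 18500},
    {"case_id": 2, "county": "Pitt", "income": 22000},
    {"case_id": 3, "county": "Pitt", "income": 18500},
]


def ids_are_unique(rows, id_column):
    """Return True if no two cases share the same value in id_column."""
    ids = [row[id_column] for row in rows]
    return len(ids) == len(set(ids))


n_cases = len(dataset)               # number of rows (cases)
variables = list(dataset[0].keys())  # column names (variables)

print(n_cases, variables)
print(ids_are_unique(dataset, "case_id"))  # case_id works as an ID variable
print(ids_are_unique(dataset, "income"))   # income repeats, so it cannot be an ID
```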

Descriptive Statistics

Statistics used to describe the distribution of and relationship among variables.
The analysis of data that helps describe, show, or summarize data.
Two main types: measures of central tendency and measures of variability (e.g., standard deviation)

Central Tendency

The most common value (for variables measured at the nominal level) or the value around which cases tend to center (for a quantitative variable).
The typical or central value for a variable; the most common measures are the mean, median, and mode.

Mean

The arithmetic, or weighted, average of a set of numbers, computed by adding up the values of all the cases and dividing by the number of cases.
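
As a minimal sketch of this definition, the two helpers below compute a plain mean and a weighted mean. The function names and the scores are hypothetical.

```python
def arithmetic_mean(values):
    """Add up the values of all the cases and divide by the number of cases."""
    return sum(values) / len(values)


def weighted_mean(values, weights):
    """Each value counts in proportion to its weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


scores = [2, 4, 6, 8]  # hypothetical scores
print(arithmetic_mean(scores))        # (2 + 4 + 6 + 8) / 4 = 5.0
print(weighted_mean([2, 4], [1, 3]))  # (2*1 + 4*3) / (1 + 3) = 3.5
```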

Advantages to using the Mean

Widely understood and commonly used (most reliable)
A good indication of the central tendency of a distribution
Takes into account all scores
Good for interval & ratio data

Disadvantages to using the Mean

Can be skewed by extreme scores (best when there are no extremes).
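
To see how one extreme score pulls the mean, the sketch below compares the mean and median before and after adding a single very high value. The income figures are made up for illustration.

```python
from statistics import mean, median

incomes = [30, 32, 35, 38, 40]   # hypothetical incomes, in thousands
with_extreme = incomes + [400]   # the same cases plus one extreme score

print(mean(incomes), median(incomes))            # both are 35
print(mean(with_extreme), median(with_extreme))  # mean jumps to about 95.8, median is 36.5
```

The median barely moves, which is why it is often preferred when extreme scores are present.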

Median

The point that divides a distribution in half; the middle value in a sequence.
Organize the set of numbers from lowest to highest -> locate the middle number.
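
The two steps above can be sketched as follows. The card does not say what to do with an even number of cases; averaging the two middle values is the usual convention and is assumed here. The name `median_of` is hypothetical.

```python
def median_of(values):
    """Sort the values, then return the middle one.

    With an even number of cases there is no single middle value, so this
    sketch averages the two middle values (a common convention, assumed here).
    """
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_of([7, 1, 5]))     # sorted: 1, 5, 7 -> 5
print(median_of([7, 1, 5, 3]))  # sorted: 1, 3, 5, 7 -> (3 + 5) / 2 = 4.0
```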

Advantages to using the Median

Easy to calculate
Is not skewed by extreme scores
Useful when needing a "middle value" (e.g., housing values)
Works for interval and ratio data

Disadvantages to using the Median

Provides no information about the more extreme scores

Mode

The most frequently occurring value in the distribution.
Organize the set of numbers from smallest to largest
Locate the number that appears in the list most often.

Advantages to using the Mode

Easy to calculate
Is not skewed by extreme scores
Useful to determine what occurred the "most" (e.g., the # of children in the home)
Can use with counts of nominal data (e.g., most common major at ECU)

Disadvantages to using the Mode

Information is obtained about only one response value.
Cannot be used if all the scores are different, and there might be several modes.
The most frequent score may not actually be "in the middle" of the distribution.
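
A short sketch of the mode, covering the cases listed above: a single mode, several modes, and no usable mode when every score is different. The helper name `modes` and the data are hypothetical; returning an empty list when all values differ is a choice made for this sketch.

```python
from collections import Counter


def modes(values):
    """Return every value tied for the highest count.

    Returns an empty list when there is no data or when every value
    occurs only once (no score is "most frequent").
    """
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1:
        return []
    return sorted(v for v, c in counts.items() if c == top)


print(modes([0, 1, 1, 2, 2, 3]))                 # two modes: [1, 2]
print(modes([1, 2, 3]))                          # all different: []
print(modes(["Nursing", "Biology", "Nursing"]))  # nominal data: ['Nursing']
```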

Variability (or Spread)

The extent to which cases are spread out through the distribution or clustered in just one location.
The dispersion of scores for a variable (range and SD)
The greater the difference between scores, the more spread out the distribution is.
The more tightly the scores group together, the less variability there is in the distribution.

Range

The true upper limit in a distribution minus the true lower limit (or the highest rounded value minus the lowest rounded value).
The difference between the highest and lowest scores -> tells us something about the variation of the scores.
(The simplest way)
Can be drastically altered by just one exceptionally high or low value.
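
The sketch below computes the range as highest minus lowest and shows how a single extreme value changes it. The scores and the name `value_range` are hypothetical.

```python
def value_range(values):
    """Highest score minus lowest score."""
    return max(values) - min(values)


scores = [10, 12, 13, 15]           # hypothetical scores
print(value_range(scores))          # 15 - 10 = 5
print(value_range(scores + [90]))   # one extreme value: 90 - 10 = 80
```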

Standard Deviation

The square root of the average squared deviation of each case from the mean.
A statistic that indicates how far scores in a sample (or population) are spread out from the mean.
Provides a measure of the overall variation in a dataset.
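
The definition above (square root of the average squared deviation from the mean) can be written out directly. This sketch divides by the number of cases, which matches the definition as worded; many packages instead divide by n - 1 for a sample, and that variant is shown for comparison using the standard library. The scores are hypothetical.

```python
import math
from statistics import pstdev, stdev


def population_sd(values):
    """Square root of the average squared deviation of each case from the mean."""
    m = sum(values) / len(values)
    avg_squared_dev = sum((v - m) ** 2 for v in values) / len(values)
    return math.sqrt(avg_squared_dev)


scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores with mean 5
print(population_sd(scores))       # squared deviations sum to 32; 32 / 8 = 4; sqrt = 2.0
print(pstdev(scores))              # standard-library equivalent (divides by n)
print(stdev(scores))               # sample version (divides by n - 1), slightly larger
```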

Standard Deviation Example

Small SD indicates a small amount of variability for a given data set; there will be a lot of values closer to the mean (less spread out).

Univariate Analysis: Qualitative Variable (nominal; ordinal)

Mode

Univariate Analysis: Quantitative Variable (interval; ratio)

Mean
Median
Mode
Range
Standard Deviation

What are Univariate Tables

Often used with qualitative (nominal/ordinal) values.
If quantitative (interval/ratio) variables, need to group them into categories first.
Shows frequencies, percentages, or proportions of values for one variable.
Very commonly used in research.
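
A univariate table can be sketched as a count and percentage per category. Because age is quantitative, it is grouped into categories first. The ages, the cut points in `age_group`, and the function names are all hypothetical.

```python
from collections import Counter


def age_group(age):
    """Group a quantitative value into categories (cut points are hypothetical)."""
    if age < 30:
        return "18-29"
    if age < 50:
        return "30-49"
    return "50+"


def frequency_table(values):
    """Map each category to (frequency, percentage of all cases)."""
    counts = Counter(values)
    total = len(values)
    return {cat: (n, round(100 * n / total, 1)) for cat, n in counts.items()}


ages = [22, 25, 31, 47, 52, 68, 29, 35]  # hypothetical cases
table = frequency_table([age_group(a) for a in ages])
for category, (count, percent) in sorted(table.items()):
    print(category, count, percent)
```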

Bar Charts

Used with qualitative (nominal/ordinal) data
Shows the frequencies, percentages, or proportions of values for one variable
The variable distribution is displayed with solid bars separated by spaces.

Histograms

Used with quantitative (interval/ratio) data.
Shows frequencies, percentages, or proportions of values for one variable.
Good at showing how data is distributed.
The variable distribution is displayed with adjacent bars NOT separated to indicate that the variable is continuous.

Bar Chart Qualities

Mutually exclusive categories
Exhaustive categories
Count, percent, or proportion on y-axis
Qualitative variables (nominal/ordinal)
Space between bars

Histogram Qualities

Mutually exclusive categories
Exhaustive categories
Count, percent, or proportion on y-axis
Quantitative variables (interval/ratio)
No space between bars

Boxplots

For quantitative (interval/ratio) data
Shows/summarizes the distribution of a variable (i.e., the relative number, percent, or proportion of times that each possible score or value occurred).
Useful for comparing distributions between groups.

Interquartile Range

The difference between the upper and lower quartiles.
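
As a small sketch, the interquartile range can be computed with Python's `statistics.quantiles`. Textbooks and software use slightly different rules for locating quartiles, so other tools may give slightly different values for the same data. The scores are hypothetical.

```python
from statistics import quantiles

scores = [1, 2, 3, 4, 5, 6, 7]  # hypothetical scores

# n=4 splits the data into quarters and returns the three cut points.
q1, q2, q3 = quantiles(scores, n=4)
iqr = q3 - q1  # upper quartile minus lower quartile

print(q1, q2, q3, iqr)
```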

Outliers

A data point located outside the whiskers of a boxplot.
A single data point that falls far outside the average value of a group of data points. May be an exception.
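
The card does not say where the whiskers end. A common convention, assumed here, places them at 1.5 times the interquartile range beyond the quartiles, so anything further out is flagged as an outlier. The function name, the multiplier default, and the scores are illustrative.

```python
from statistics import quantiles


def find_outliers(values, k=1.5):
    """Flag values more than k * IQR below Q1 or above Q3.

    k = 1.5 is a widely used boxplot convention, assumed for this sketch.
    """
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]


scores = [1, 2, 3, 4, 5, 6, 7, 30]  # hypothetical scores with one extreme value
outliers = find_outliers(scores)
print(outliers)  # only 30 falls outside the fences
```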

Pie Charts

Best when you'd like to emphasize one especially big value or one especially small value.

Why are pie charts generally not recommended for research?

Not always right for making comparisons:
- If the pieces are similar in size, it is difficult to judge their relative size.
- Especially problematic if the variable has many levels and/or the levels are roughly the same size.

Univariate Graph: Qualitative Variables

Table
Bar Chart

Univariate Graph: Quantitative Variables

Histogram
Box Plot

Descriptive Statistics uses numbers, variables, proportions, etc., to

characterize (i.e., describe) or summarize information about a particular group.
Fairly intuitive.
Most statistics we have talked about in this course (e.g., mean & SD)

Inferential Statistics uses statistical methods to

make generalizations about a population using a sample drawn from that population.
Relies on more complex probability theories.

Inferential Statistics

Mathematical tools for estimating how likely it is that a statistical result based on data from a random sample is representative of the population from which the sample is assumed to have been selected.
Makes assumptions about the distribution.
"p-value"
Hypothesis testing.

Descriptive Statistics examples

What is the distribution of dementia by levels of social support?
What is the distribution of trauma scores by level of ACES?

Inferential Statistics examples

Is there a statistically significant relationship between social support and dementia in the population, based on this sample?
Is there a statistically significant relationship between trauma scores and levels of ACEs in the population, based on this sample?

Data Distribution

Data can be distributed in different ways: it can be spread out more on the left, more on the right, or all jumbled up.

Normal Distribution (bell curve)

The data is symmetrically (normally) distributed around the center.
Looks like a bell and is symmetric; this shape is important for sampling and statistical analysis.
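
To illustrate the symmetry of the bell curve, the sketch below uses Python's `statistics.NormalDist`. The mean of 100 and standard deviation of 15 are hypothetical. The roughly 68% figure for scores within one standard deviation of the mean is a standard property of the normal distribution, not something stated on this card.

```python
from statistics import NormalDist

dist = NormalDist(mu=100, sigma=15)  # hypothetical mean and standard deviation

# Symmetry: points equally far below and above the mean are equally likely.
print(dist.pdf(90), dist.pdf(110))

# Share of cases within one standard deviation of the mean (about 0.68).
within_one_sd = dist.cdf(115) - dist.cdf(85)
print(round(within_one_sd, 4))
```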

Skewness

The extent to which cases are clustered more at one or the other end of the distribution of a quantitative variable, rather than in a symmetric pattern around its center.
Highly skewed data can lead to misleading results.

Positive Skew

Skew to the right
The number of cases tapers off to the right

Negative Skew

Skew to the left
The number of cases tapers off to the left
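
One way to put a number on skew direction is the moment-based skewness coefficient, which is positive when the tail runs to the right and negative when it runs to the left. The cards do not name a formula, so this particular coefficient is a choice made for the sketch, and the data are hypothetical.

```python
def skewness(values):
    """Moment-based skewness: third central moment / (second central moment ** 1.5)."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


right_tail = [1, 1, 2, 2, 3, 10]   # cases taper off to the right
left_tail = [1, 8, 9, 9, 10, 10]   # cases taper off to the left
symmetric = [1, 2, 3, 4, 5]

print(skewness(right_tail) > 0)  # positive skew
print(skewness(left_tail) < 0)   # negative skew
print(skewness(symmetric))       # 0.0 for a symmetric set of scores
```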

p-value

A number (statistic) used to judge whether the result of a statistical test is likely to be due to chance (random variation).
A small p-value indicates a "significant" effect or difference.
Think of it as the probability of getting a result at least this extreme if only chance were at work, that is, if the null hypothesis were true. ("p" = "probability")
Relates to inferential rather than descriptive statistics.

p < 0.05

Generally, the acceptable cutoff (sometimes p < .10 for more exploratory research) for a result to be "statistically significant".
May also see p < 0.01; p < 0.001 -- these are all acceptable.

Hypothesis

A proposed relationship between two variables.
A tentative statement about empirical reality involving a relationship between two or more variables.

Null Hypothesis (Ho)

The hypothesis of no (null) effects. The hypothesis you are testing against and trying to disprove or discredit; it states that there is no statistically significant relationship between the two variables.

Alternative Hypothesis (Ha)

The hypothesis of something being different from the null. The hypothesis you are testing; it states that there is a statistically significant relationship between two variables.

Statistically Significant

A p-value less than 0.05 is typically considered statistically significant, in which case the null hypothesis (Ho) should be rejected.

Hypothesis Example

You believe there is a lower level of depression in our class than in the general population, so you give students in the class the BDI (Beck Depression Inventory) and take the average.

Hypothesis Example: (Ho) or (Ha) ?

(Ho) = The average level of depression in this class is the same as in the general population.
(Ha) = The average level of depression in this class is lower than (different from) the general population.

You reject the null hypothesis and accept the alternative hypothesis if the statistics indicate a difference at the p < 0.05 significance level, meaning that a difference this large would be expected less than 5% of the time by chance alone if the null hypothesis were true.
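
The decision rule in this example can be sketched with a one-sided, one-sample z-test. The test is a choice made for this sketch; the cards do not name one, and a t-test is more typical when the population standard deviation is unknown. The population mean, the population standard deviation, and the class scores below are all made-up numbers, not real BDI norms.

```python
from math import sqrt
from statistics import NormalDist, mean

POPULATION_MEAN = 10.0  # hypothetical general-population average (not a real BDI norm)
POPULATION_SD = 8.0     # hypothetical population standard deviation
ALPHA = 0.05            # significance cutoff

# Hypothetical scores for 16 students; their mean is 6.
class_scores = [2, 3, 4, 5, 5, 6, 6, 6, 6, 7, 7, 7, 8, 8, 8, 8]


def one_sided_z_test(sample, pop_mean, pop_sd):
    """Test Ha: the sample mean is LOWER than the population mean.

    Returns (z, p_value), where p_value is the probability of a sample mean
    this low or lower if the null hypothesis (no difference) were true.
    """
    standard_error = pop_sd / sqrt(len(sample))
    z = (mean(sample) - pop_mean) / standard_error
    p_value = NormalDist().cdf(z)
    return z, p_value


z, p_value = one_sided_z_test(class_scores, POPULATION_MEAN, POPULATION_SD)
reject_null = p_value < ALPHA
print(round(z, 2), round(p_value, 4), reject_null)
```

With these made-up numbers, z is -2.0 and the p-value is about 0.023, so the null hypothesis would be rejected at the 0.05 level.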
