Module 3 descriptive stats


63 Terms

1
New cards

What is a variable?


Variables: characteristics of the observation unit that we can measure.

Ex. The number of Facebook posts a person might read each day is a variable.

2
New cards

Ex. The number of Facebook posts a person might read each day.
True or false: this is an example of a variable.

true

3
New cards

What are the 3 key pieces of info that a variable must include?

  1. what the variable represents

  2. measurement units

  3. description of the observation unit

Ex. If we wanted to observe the long jump distance for high school students, our variable might be "Long jump distance (cm) for each student". Breaking it down, 'long jump distance' is what the variable represents, 'centimeters' is the measurement unit, and 'student' is the observation unit.

4
New cards

Breaking down this example, what are the 3 key pieces of info in this variable?

Ex. If we wanted to observe the long jump distance for high school students, our variable might be "Long jump distance (cm) for each student".

Breaking it down, 'long jump distance' is what the variable represents, 'centimeters' is the measurement unit, and 'student' is the observation unit.

5
New cards

What am I? The particular value of the variable that you measure from an observation unit.

data (pl)

datum (sing)

6
New cards

What are the data and the variable in this example?

Ex. Imagine we observed three students and recorded the following long jump distances: Amelia 401 cm, Mackenzie 389 cm, and Sarine 315 cm.

The variable is "Long jump distance (cm) for each student" and the data are 401 cm, 389 cm, and 315 cm respectively.

7
New cards

What are the two main types of variables?

  1. numerical variables

  2. categorical variables

8
New cards

What am I? Variables where the data are numeric and have measurement units.

numerical variables

9
New cards

What are the two types of numerical variables?

  1. continuous numerical variables

  2. discrete numerical variables

10
New cards

What is a continuous numerical variable?

a variable that takes on continuous numbers, including fractional and decimal values. Ex. 19.5 kg

11
New cards

These are examples of what kind of variable? Ex. weight = 90.5 kg, time

continuous numerical

12
New cards

What is a discrete numerical variable?

a numerical variable that only takes on whole numbers (integers)

Ex. counting the number of people (30 people)

13
New cards

Counting the number of people (30 people) is an example of what kind of variable?

discrete numerical

14
New cards

What are categorical variables?

those where the data are a qualitative description with no measurement units

15
New cards

What are the two types of categorical variables?


1. Ordinal categorical variables


2. Nominal categorical variables

16
New cards

What variable am I? A variable that can take on qualitative values but where values are ranked on a scale (particular order). Ex. using emojis to rank how you are feeling today (Likert scale).

ordinal categorical variable

17
New cards

What variable type am I? A variable that can take on qualitative values but where values do not have any particular order. Ex. food: apples, oranges, kiwi are all fruits, in no particular order. Ex. colour


Nominal categorical variables

18
New cards

what is the average value of your sample called?


Mean

19
New cards

what is the middle value of your sample when ordered low to high?

median

20
New cards

T or F: counts are used for numerical data

False. Counts are used for categorical data.

21
New cards

What are counts?

are used for categorical variables and are the number of observations in your sample that fall into each category

22
New cards

T or F: proportions are used for numerical variables

F. Proportions are used for categorical data.

23
New cards

What are proportions?

are used for categorical variables and are the share of observations in your sample that fall into each category (%)
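A minimal sketch of counts and proportions for a categorical variable. The fruit sample below is made up for illustration and is not from the course material.

```python
from collections import Counter

# Hypothetical sample: favourite fruit reported by 8 people (made-up data).
sample = ["apple", "kiwi", "apple", "orange", "apple", "kiwi", "orange", "apple"]

# Counts: the number of observations that fall into each category.
counts = Counter(sample)

# Proportions: the share of observations that fall into each category.
n = len(sample)
proportions = {category: count / n for category, count in counts.items()}

for category, count in counts.items():
    print(f"{category}: count = {count}, proportion = {proportions[category]:.0%}")
```

The proportions always sum to 100%, which is part of why they are easier to compare across samples of different sizes than raw counts.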

24
New cards

T or F: counts are the share of observations in your sample that fall into each category (%)

False. This is the definition of proportions.

Counts: are the number of observations in your sample that fall into each category

25
New cards

T or F: using counts makes data easier to understand than using proportions

F. Proportions make it easier.

26
New cards

What is the central tendency?

describes the typical value in your sample (e.g. mean) (only for numerical data)

27
New cards

T or F: the central tendency is used for numerical data

T. The central tendency (e.g. mean, median) describes numerical data; categorical data are described with counts and proportions.

28
New cards

What is dispersion?

describes the spread of the values (e.g. variance) (only for numerical data)

29
New cards

T or F: dispersion is used for numerical data

T. Measures of dispersion such as the variance describe numerical data.

30
New cards

what is the range?

the difference between the maximum and minimum values for a numerical variable, or the difference between the max and min of the counts for a categorical variable. Used to indicate dispersion.


Ex. of range: the proportion of renewable energy jobs in each category ranges from 2% to 46%, which is a relatively wide range.

31
New cards

What is used to indicate dispersion?

the range

32
New cards

T or F: range is used for both numerical and categorical data

T

33
New cards

what is a quartile

one-quarter of your sample when the values are ranked from lowest to highest

34
New cards

how do you calculate the mean

  1. summing all of the values in your sample

  2. Dividing by the number of data points in your sample
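The two steps above can be sketched in Python. The data are the three long jump distances from the earlier card (401 cm, 389 cm, 315 cm); the variable names are only illustrative.

```python
# Long jump distances (cm) for Amelia, Mackenzie, and Sarine, from the earlier card.
jumps_cm = [401, 389, 315]

# Step 1: sum all of the values in the sample.
total = sum(jumps_cm)

# Step 2: divide by the number of data points in the sample.
mean_cm = total / len(jumps_cm)

print(f"sum = {total} cm, n = {len(jumps_cm)}, mean = {mean_cm:.1f} cm")
```

The mean keeps the units of the data (cm).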

35
New cards

Calculating Variance:

1. Calculate the mean for a sample

2. Calculate the difference between each data point and the mean, then square that value

3. Sum the squares of the differences and divide by the number of observations/data points (this gives the population variance NOT the sample variance)
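A sketch of the three steps, again using the long jump distances from the earlier card. The cross-check with `statistics.pvariance` (Python standard library) is an addition for illustration; it computes the same population variance.

```python
import statistics

# Long jump distances (cm) from the earlier card.
jumps_cm = [401, 389, 315]

# Step 1: calculate the mean for the sample.
mean_cm = sum(jumps_cm) / len(jumps_cm)

# Step 2: difference between each data point and the mean, squared.
squared_diffs = [(x - mean_cm) ** 2 for x in jumps_cm]

# Step 3: sum the squares and divide by the number of data points
# (population variance, not sample variance).
population_variance = sum(squared_diffs) / len(jumps_cm)

print(f"population variance = {population_variance:.1f} cm^2")
print(f"stdlib cross-check  = {statistics.pvariance(jumps_cm):.1f} cm^2")
```

Dividing by n - 1 instead of n in step 3 would give the sample variance, which is what `statistics.variance` computes.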

36
New cards

What is the standard deviation equation?

the square root of the variance (population variance)
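As a short illustration, the standard deviation can be computed by taking the square root of the population variance. The data are the long jump distances from the earlier card; `statistics.pvariance` and `statistics.pstdev` are Python standard-library functions used here as a convenience.

```python
import math
import statistics

# Long jump distances (cm) from the earlier card.
jumps_cm = [401, 389, 315]

# Population variance (divides by n), in cm^2.
population_variance = statistics.pvariance(jumps_cm)

# Standard deviation: the square root of the variance, back in cm.
population_sd = math.sqrt(population_variance)

print(f"variance = {population_variance:.1f} cm^2, standard deviation = {population_sd:.1f} cm")
```

Because of the square root, the standard deviation is in the same units as the data, unlike the variance.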

37
New cards

What am I? The typical squared distance from the values to the mean; measures the amount of variation in your sample (σ^2)

Variance: typical squared distance from the values to the mean; measures the amount of variation in your sample (σ^2)

38
New cards

What am I? The square root of the variance (σ)

standard deviation (σ)

39
New cards

Steps: Calculating quartiles


1. Sort the data from low → high

2. Find the 2nd quartile by splitting the data in ½, according to whether:

  • The sample has an odd # of observations → the middle # of the dataset is the second quartile.

  • The sample has an even # of observations → the average of the two values closest to the middle is the second quartile.

3. Find the 1st quartile by creating a subset of the data that is the lower-valued half of the observations, then use the rules in step 2 to find the middle value. How the lower subset is created depends on whether:

  • The sample has an odd number of observations, in which case the lower-valued subset is all values less than or equal to the second quartile.

  • The sample has an even number of observations, in which case the lower-valued subset is all values less than the second quartile. The subset does not include the second quartile.

4. Find the 3rd quartile by repeating step 3 but for the upper-valued half of the observations.
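The steps above can be sketched as a small Python function. The function name `quartiles` and the example numbers are hypothetical. One assumption: the halves are split by position in the sorted list, which matches the "less than or equal to" wording whenever there are no tied values at the median.

```python
import statistics

def quartiles(values):
    """Return (Q1, Q2, Q3) following the card's steps (illustrative sketch)."""
    ordered = sorted(values)          # Step 1: sort low -> high
    n = len(ordered)
    half = n // 2
    q2 = statistics.median(ordered)   # Step 2: middle value, or average of the two middle values
    if n % 2 == 1:
        # Odd n: both halves include the middle observation.
        lower = ordered[: half + 1]
        upper = ordered[half:]
    else:
        # Even n: the halves exclude the second quartile.
        lower = ordered[:half]
        upper = ordered[half:]
    # Steps 3 and 4: apply the step 2 rule to each half.
    return statistics.median(lower), q2, statistics.median(upper)

print(quartiles([7, 1, 3, 9, 5]))            # odd number of observations
print(quartiles([1, 2, 3, 4, 5, 6, 7, 8]))   # even number of observations
```

Other software may use a different quartile convention, so results can differ slightly from this method on small samples.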

40
New cards

what is the central tendency

With quartiles, it is the middle of the data, given by the second quartile. It is also called the median, as it is the central quartile.

41
New cards

We know 50% of the data is above and below what point?

the central tendency value

42
New cards

What shows how much variance there is in a sample? (numerical data)

dispersion

43
New cards

With quartiles, how do you tell the variation?

the range of values that contains the center-most 50% of the data. The range is between the 1st and 3rd quartiles and is called the interquartile range.

44
New cards

what is the interquartile range?

uses quartiles to describe dispersion in a numerical variable. It is the difference between the 3rd and 1st quartiles and gives the range of the innermost 50% of a numerical sample. Calc: subtract the 1st quartile from the 3rd quartile.

45
New cards

how do you calculate the IQR

subtract the 1st quartile from the 3rd quartile

46
New cards

using quartiles for numerical data pro and con (over using the mean)

Pro: The median and interquartile range are relatively robust to extreme values (they change little when a few extreme values are present)

Con: The median and interquartile range change a lot for samples with a small # of observations

47
New cards

using mean for numerical data pro and con (over using the quartiles)

Pro: The mean and the standard deviation are better when there are small numbers of observations in the sample

Con: The mean and the standard deviation are sensitive to extreme values
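A rough illustration of this trade-off, using made-up numbers. The helper `summarise` is hypothetical, and `statistics.quantiles(..., method="inclusive")` is a standard-library convention that may give slightly different quartiles than the hand method in the quartile card.

```python
import statistics

def summarise(values):
    """Mean/SD and median/IQR for a numerical sample (illustrative helper)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "sd": statistics.pstdev(values),
        "median": statistics.median(values),
        "iqr": q3 - q1,
    }

# Made-up sample, then the same sample with its largest value replaced by an extreme one.
clean = summarise([10, 12, 13, 15, 16, 18, 20])
extreme = summarise([10, 12, 13, 15, 16, 18, 200])

for name in ("mean", "sd", "median", "iqr"):
    print(f"{name}: {clean[name]:.2f} -> {extreme[name]:.2f}")
```

In this example the single extreme value shifts the mean and standard deviation a lot, while the median and interquartile range stay the same.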

48
New cards

what is Standard deviation

tells you how spread out the data is. It is a measure of how far each observed value is from the mean

49
New cards

what is effect size?

is the change in mean value of the response variable among groups (difference)

50
New cards


The difference is the calculation of what?

Effect size, calculated as Y1 - Y2, where Y1 and Y2 are the means of samples from group 1 and group 2 respectively

51
New cards

What are the two ways of calculating effect size?

the difference (-)

ratio (/)

52
New cards

The ratio is the calculation of what?

effect size Y1/Y2 where Y1 and Y2 are the means of samples from group 1 and group 2 respectively
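A minimal sketch of both effect size calculations. The two groups and their values are made up for illustration; `y1` and `y2` stand for Y1 and Y2.

```python
import statistics

# Made-up response values (say, heights in cm) for two groups.
group1 = [12.0, 15.0, 18.0]
group2 = [8.0, 10.0, 12.0]

y1 = statistics.mean(group1)  # mean of group 1
y2 = statistics.mean(group2)  # mean of group 2

# Difference: keeps the original units (cm here).
difference = y1 - y2

# Ratio: unitless, shows the relative change between groups.
ratio = y1 / y2

print(f"Y1 = {y1}, Y2 = {y2}")
print(f"difference = {difference} cm, ratio = {ratio}")
```

Here the difference says group 1 is 5 cm higher on average, while the ratio says group 1 is 1.5 times group 2.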

53
New cards

What does “mindfulness” refer to when talking about the difference?

refers to whether the difference among groups is important for your study

54
New cards

T or F: effect size is a descriptive statistic. If T, what does it describe?

T, it is a description of the change in mean among the samples you collected from different groups

55
New cards

In a case-control study effect size would be…

the change in the mean value of the response variable between the case and control groups

56
New cards


In observational studies, effect size is calculated as the change…

among groups

57
New cards

In experimental studies, effect size is calculated among…

treatment levels

58
New cards

In a single-factor experiment, effect size would be the change in mean value of the response variable among…

the levels of the factor/treatment

59
New cards

absolute effect size (define)

the simple change in the mean value between groups

60
New cards

T or F: a con of using differences is that it does not keep the original scale (units)

F: using differences keeps the units; that is a pro

61
New cards

T or F: a con of using ratios is that it does not indicate relative change

F

a pro of using ratios is that it indicates relative change

62
New cards

T or F : using ratios keeps the units of the scale

False. A con of using ratios for effect size is that it does not keep the original scale (units)

63
New cards

You are conducting a medical study to see the risk of cancer. To calculate the effect size, would you use differences or ratios?

You would use ratios: it is easier to see the risk, and units are not important. By using ratios, the researchers can evaluate the proportional increase in the risk of getting a disease.