Individual
objects described by a set of data (could be an animal, person, or thing)
Variable
any characteristic of an individual
what are the two types of variables?
Numerical and Categorical
Discrete (Numerical)
Takes only separate, countable values (such as counts); not able to take all possible real numbers in a reasonable range
Continuous (Numerical)
Able to take all possible real numbers in a reasonable range
Categorical
any variable that is not numerical
Nominal (categorical)
the categories have no meaningful order
Ordinal (categorical)
There is meaningful order among the categories
What graphs are used for categorical variables?
bar graphs and pie charts
What graphs are used for numerical variables?
Stem-and-leaf plots and histograms
How to find mean
add up all the numbers, then divide by how many numbers there are
Median
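The midpoint of a distribution: half the observations are smaller and half are larger. In the sorted data its position is (n + 1)/2, where n is the sample size; if n is even, average the two middle values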
Is mean or median resistant?
The median is resistant because extreme values have little effect on it. The mean is NOT resistant because a single extreme value can change it substantially
Is standard deviation resistant?
The standard deviation is NOT resistant
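A quick way to see what "resistant" means is to add one extreme value to a small data set and compare the summaries before and after. The sketch below uses made-up numbers (not from the course) and Python's standard `statistics` module.

```python
from statistics import mean, median, stdev

# Made-up data for illustration only
data = [2, 4, 5, 7, 9]
with_outlier = data + [100]  # same data plus one extreme value

summary = {
    "mean": (mean(data), mean(with_outlier)),
    "median": (median(data), median(with_outlier)),
    "stdev": (stdev(data), stdev(with_outlier)),
}

# The median barely moves; the mean and standard deviation jump
for name, (before, after) in summary.items():
    print(f"{name}: {before:.2f} -> {after:.2f}")
```

The median shifts only slightly, while the mean and the standard deviation change a lot, which is why only the median is called resistant.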
First quartile
the median of the lower half of the data set
Third quartile
the median of the upper half of the data set
5 number summary
min, Q1, median, Q3, max
Are quartiles resistant measures?
Quartiles are resistant
Interquartile Range (IQR)
the distance between Q1 and Q3. To find it, subtract Q1 from Q3: IQR = Q3 - Q1
1.5 IQR rule
used for identifying outliers: any values that are more than 1.5 times the IQR lower than the first quartile or higher than the third quartile are called outliers
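The five-number summary and the 1.5 IQR rule can be sketched in a few lines. This version follows the definitions on these cards (Q1 and Q3 are the medians of the lower and upper halves) and assumes the overall median is left out of both halves when the sample size is odd; textbooks and software differ on that convention. The function names and the data are made up for illustration.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    xs = sorted(values)
    n = len(xs)
    half = n // 2
    lower = xs[:half]
    # Assumption: with an odd sample size, skip the middle value itself
    upper = xs[half + 1:] if n % 2 else xs[half:]
    return xs[0], median(lower), median(xs), median(upper), xs[-1]

def outliers(values):
    """Values more than 1.5 * IQR below Q1 or above Q3."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in values if x < low_fence or x > high_fence]

data = [1, 3, 4, 5, 6, 7, 8, 9, 30]  # made-up data
print(five_number_summary(data))  # (1, 3.5, 6, 8.5, 30)
print(outliers(data))             # [30]
```

Here IQR = 8.5 - 3.5 = 5, so the fences are -4 and 16, and only 30 falls outside them.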
Scatterplot
a graphed cluster of dots, each of which represents the values of two variables
Explanatory variables go on the X axis and Response variables go on the Y axis
What does correlation (r) measure?
Correlation measures the direction and strength of the linear relationship between two quantitative variables
what does r tell us?
Whether there is a linear relationship between two variables, what direction the relationship is (positive or negative), and how strong the relationship is.
r is not resistant
r is not affected by changes in the units of measurement
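These properties of r can be checked numerically. The sketch below computes r from its usual definition, then shows that converting inches to centimeters leaves r unchanged, while adding one unusual point changes it noticeably. The data and the function name `correlation` are made up for illustration.

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation r between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up heights (inches) and weights (pounds)
heights_in = [60, 62, 65, 68, 70]
weights_lb = [110, 120, 140, 155, 170]

r_in = correlation(heights_in, weights_lb)

# Changing units (inches -> centimeters) does not change r
heights_cm = [h * 2.54 for h in heights_in]
r_cm = correlation(heights_cm, weights_lb)

# One unusual point changes r: r is not resistant
r_out = correlation(heights_in + [72], weights_lb + [90])

print(round(r_in, 4), round(r_cm, 4), round(r_out, 4))
```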
Population
the entire group of individuals we want to know about
Sample
a part of the population we actually collect information from
Parameter
a number that describes the population
Statistic
A number that can be computed from sample data without using any unknown parameters
Sampling design
describes exactly how to choose a sample from the population
Bias
the design of a statistical study is biased if it systematically favors certain outcomes
Convenience sample
a sample made up of the individuals who are easiest to reach
Voluntary response sample
a sample of people who choose themselves by responding to a general invitation
simple random sample
n individuals from a population, chosen in such a way that every set of n individuals has an equal chance of being the sample selected
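In practice an SRS is usually drawn with software. A minimal sketch with Python's standard `random` module follows; the population of 100 labeled students is hypothetical, and the seed is fixed only so the example is repeatable.

```python
import random

# Hypothetical population: 100 labeled individuals
population = [f"student_{i}" for i in range(1, 101)]

rng = random.Random(42)  # fixed seed so the draw can be reproduced

# random.sample draws without replacement, and every set of
# 10 individuals is equally likely to be the one chosen
sample = rng.sample(population, k=10)
print(sample)
```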
Inference
A conclusion reached on the basis of evidence and reasoning
Observational study
observes individuals and measures variables of interest but does not attempt to influence the responses
Experiment
deliberately imposes some treatment on individuals in order to observe their responses
Response variable
measures an outcome of a study
explanatory variable
A variable that helps explain or influences changes in a response variable.
Terms for experiments
subjects, factors, treatments, statistical significance
Subjects
individuals studied in an experiment
Factors
explanatory variables of experiments
treatments
any specific experimental condition applied to the subjects
Random
a phenomenon is random if individual outcomes are uncertain but there is nonetheless a regular distribution of outcomes in a large number of repetitions
Probability
the proportion of times the outcome would occur in a very long series of repetitions
Sample space
the set of all possible outcomes
Event
a set of outcomes; it is always a subset of the sample space
Discrete vs. Continuous Variables
a discrete variable's distribution lists each value in its sample space together with its probability, whereas a continuous variable's distribution is described by a density curve, and probabilities are areas under that curve
A researcher wants to evaluate the relationship between the weight of the diamond (carats) on a ring and its price. He collected data on 40 diamond rings, with weights varying from 0.5 carat to 10 carats, and the least-squares regression line is price (dollars) = 259.63 + 3721.01 carats, with correlation r = 0.8. Which statement would NOT be true?
About 80% of the variability in the data is explained by the least-squares regression line
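The reason that statement is not true is that the fraction of variability explained by the regression line is r squared, not r. A short check using the numbers from the question follows; the function name `predicted_price` is made up for illustration.

```python
r = 0.8
r_squared = r ** 2  # fraction of variability explained by the line
print(f"r^2 = {r_squared:.2f}")  # about 64%, not 80%

def predicted_price(carats):
    """Price in dollars predicted by the regression line in the question."""
    return 259.63 + 3721.01 * carats

# Prediction for a 1-carat diamond (inside the 0.5 to 10 carat data range)
print(f"{predicted_price(1.0):.2f}")
```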
About correlation r, which of the following is NOT true?
Correlation is a resistant measure
What type of variable is Variable X if it is the weight of apples from an orchard?
Continuous
What type of variable is Variable Y if it is student classification in college (Freshman, Sophomore, Junior, Senior)
Ordinal
What type of variable is Variable Z if it is the color of cars in the parking lot
Nominal
Is mean resistant?
not resistant
Is Standard Deviation Resistant?
The standard deviation is NOT resistant
Is the first quartile resistant
The first quartile is resistant
Is the median resistant?
median is resistant
Is the variance (square of the std) resistant?
not resistant
Is the interquartile range resistant?
it is resistant
Suppose we know that students' body weights at NMSU are normally distributed with population mean μ = 170 and population standard deviation σ = 35. If we take an SRS (simple random sample) of 100 students from NMSU, which of the following is correct?
The sample mean x̄ will have a normal distribution with mean 170 and standard deviation 3.5
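The 3.5 comes from dividing the population standard deviation by the square root of the sample size. A minimal check of that arithmetic, with a made-up helper name:

```python
from math import sqrt

def sd_of_sample_mean(sigma, n):
    """Standard deviation of the sample mean for an SRS of size n."""
    return sigma / sqrt(n)

# Values from the question: sigma = 35, n = 100
print(sd_of_sample_mean(35, 100))  # 3.5
```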
The central limit theorem gives the distribution of what statistic?
Sample mean
About correlation r, which of the following is NOT true?
Correlation r is always between 0 and 1
A study in El Paso looked at seat belt use by drivers. The researchers randomly selected some convenience store parking lots to conduct the study. When drivers left their vehicles, they were invited to answer a survey that included questions about seat belt use. 1432 drivers were invited to take the survey, but only 892 participated. In this study, who was the population and who was the sample?
The population is all the drivers in El Paso and the sample is the 892 drivers who participated
A restaurant wants to know its customers' satisfaction with its food and service. It sets up a box at the entry, and customers can fill out a survey and drop it in the box. Over a 3-month period, 500 filled surveys were collected, but 30 of them were just kids' scribbling with no real meaning. Who is the population and who is the sample?
The population is all customers of the restaurant and the sample is the 470 who carefully filled out the survey
A study wants to see if exercise will reduce the chance of having diabetes. In the study, a few staff will make phone calls randomly to people across the nation, and survey about their age, gender, how much they exercise and if they have diabetes. What type of study is this?
Observational study
The two types of statistical inferences are
confidence intervals and hypothesis tests
The parameter we make inferences about in our class is
population mean
What will give you a smaller margin of error?
Increase in sample size
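To see why, here is a small sketch assuming the usual margin of error for a population mean with known standard deviation, z* times σ/√n. The critical value 1.96 (about 95% confidence) and σ = 35 are assumptions chosen for illustration; the card itself does not give a formula.

```python
from math import sqrt

def margin_of_error(sigma, n, z_star=1.96):
    """Margin of error z* * sigma / sqrt(n); z* = 1.96 is roughly 95% confidence."""
    return z_star * sigma / sqrt(n)

# Same sigma, growing sample size: the margin of error shrinks
for n in (25, 100, 400):
    print(n, round(margin_of_error(35, n), 2))
```

Because n sits under a square root, the sample size has to be multiplied by four to cut the margin of error in half.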
At the end of the semester, students are encouraged to take the course evaluation survey to assess how the instructor delivered the course. This survey is an example of what sampling method?
Voluntary response sample
A study wants to see how different types of feed affect how cows grow. There are 4 major feeds available on the market. The researcher visited 20 ranchers nearby and collected data weekly on the feed used as well as the weight gains of the cows. This is an example of what type of study?
Observational study
About probability, which of the following statements is NOT true?
If event A and event B are dependent on each other, then P(A or B) = P(A) + P(B)
About Normal Distribution, which of the following is NOT correct?
A normal distribution's sample space is between 0 and 1
When using the normal distribution to approximate the binomial distribution, one of the necessary steps is to adjust the binomial event using the continuity correction factor (CCF). If x > 20 is the binomial event, what will the adjusted event be after applying the CCF?
x > 20.5
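Since a binomial count takes only whole numbers, x > 20 means x is 21 or more, so the matching cutoff on the normal curve is 20.5. The sketch below compares the exact binomial probability with the normal approximation with and without the correction. The values n = 50 and p = 0.4 are hypothetical, chosen only for illustration.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 50, 0.4  # hypothetical binomial setting, not from the question

# Exact binomial probability P(X > 20) = P(X >= 21)
exact = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(21, n + 1))

# Normal approximation with the same mean and standard deviation
approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
corrected = 1 - approx.cdf(20.5)   # with continuity correction
uncorrected = 1 - approx.cdf(20)   # without continuity correction

print(f"exact={exact:.4f} corrected={corrected:.4f} uncorrected={uncorrected:.4f}")
```

In this setting the corrected value lands much closer to the exact probability than the uncorrected one.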
What type of variable is student grade (A,B,C,D)
categorical, ordinal
A coin has two faces (head and tail). If two coins are tossed, and the number of heads (X) are recorded, what is the sample space of variable X?
{0, 1, 2}
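The sample space of X can be confirmed by listing every outcome of the two tosses and counting heads in each. The probabilities in this sketch additionally assume fair coins, which the question does not state.

```python
from collections import Counter
from itertools import product

# All equally likely outcomes of tossing two coins: HH, HT, TH, TT
outcomes = list(product("HT", repeat=2))

# X = number of heads in each outcome
heads_counts = Counter(outcome.count("H") for outcome in outcomes)

sample_space_X = sorted(heads_counts)
# Assumption: fair coins, so each of the 4 outcomes has probability 1/4
probabilities = {x: count / len(outcomes) for x, count in heads_counts.items()}

print(sample_space_X)   # [0, 1, 2]
print(probabilities)
```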