“status” ; state
Statistics is derived from the Latin word ________ with the meaning ________.
Plural
In _______ sense, statistics is defined as any set of numerical data (e.g. vital statistics, monthly sales)
Singular
In ______ sense, statistics is defined as a branch of science that deals with the collection, presentation, analysis, and interpretation of data
Aids in decision making
Summarizes data for public use
Roles of Statistics
Descriptive
Inferential
2 Areas of Statistics
Descriptive Statistics
Area of Statistics that is concerned with describing a set of data without drawing conclusions or inferences from it.
It includes collecting, presenting, and analyzing data.
Inferential Statistics
Area of Statistics that utilizes sample data to make inferences and draw conclusions about a larger set of data.
It includes interpreting, making inferences, hypothesis testing, determining relationships, and making predictions.
Data
facts or figures from which conclusions may be drawn
Data Set
collection of facts and figures or data
Elements/Units
entities on which data are collected
Variable
a characteristic or attribute of elements which can assume different values or labels under statistical study
Observation
set of measurements collected for a particular element
Qualitative Variable
Quantitative Variable
2 Types of Variables
Qualitative Variable
outcomes of the variable are expressed non-numerically or categorically
example: name, gender, eye color, religion, etc.
Quantitative Variable
outcomes are expressed numerically that are meaningful or indicate some sort of amount
example: age, allowance, number of students, height, etc.
Quantitative Discrete Variable
Quantitative Continuous Variable
2 Kinds of Quantitative Variables
Quantitative Discrete Variable
It is a variable which can assume a finite or, at most, a countably infinite number of values.
It is usually measured by counting. It answers the question “how many”.
example: # of students, # of children
Quantitative Continuous Variable
It is a variable which can assume infinitely many values corresponding to a line interval
It gives rise to measurement. It answers the question “how much”.
example: weight, allowance, height
Nominal
Ordinal
Interval
Ratio
Scale/Levels of Measurement of Variables
Nominal
It is a classificatory scale. It is the weakest level of measurement where numbers or symbols are used simply for labeling or categorizing subjects into different groups
example: sex (male/female)
Ordinal
It is a classificatory scale with ordering. The numbers assigned to the categories of a variable may be ranked or ordered.
example: educational attainment (elementary/HS/college/MS/PhD)
Interval
It has the properties of the nominal and ordinal levels. The distances between any two numbers on the scale are of known sizes.
It has arbitrary zero.
example: temperature in °C or °F
arbitrary zero
zero does not mean nothing
Ratio
It is the highest level of measurement. It has the properties of the nominal, ordinal, and interval levels. It is anything that is countable or measurable.
It has absolute zero or true zero.
absolute zero or true zero
zero means nothing
Primary Data
It is acquired directly from the original source of information. Data that are measured or gathered by the researchers themselves.
Secondary Data
data taken from published or unpublished data which have been previously gathered by others
Subjective Data
It means “from someone’s point of view”. Data that is commonly about perceptions, beliefs, feelings, and opinions.
Objective Data
fact-based, measurable, countable, and observable data
Interview
Questionnaire
Experimental
Observation
Registration
5 Data Collection Methods
Interview
Data Collection Methods. There is a person-to-person contact or exchange of information between the interviewer and interviewee.
It is more appropriate for complex, emotionally laden topics and for probing the sentiments underlying an expressed opinion. It provides consistent and more precise information since the interviewee may give clarifications.
It is time-consuming and has a limited field of coverage.
Questionnaire
Data Collection Methods. Data are collected by means of written responses based on a list of questions which are relevant to the problems of the study.
It is inexpensive and can cover a wide area in a shorter period of time.
It has a high possibility of incomplete responses, and respondents may not return the questionnaire, especially if it is mailed.
Experimental
Data Collection Methods. It is used when the objective is to determine the cause-and-effect relationship of certain phenomena under controlled conditions
Observation
Data Collection Methods. The researcher observes the behavior of persons and their outcomes. The potential bias caused by the interviewing process is reduced or eliminated in this method.
Registration
Data Collection Methods. This method of collecting data is enforced by certain laws, such as the registration of births, deaths, licenses, etc. The information is kept systematized and made available to all because the law requires it.
Population
entire group of observations or elements about which inferences and conclusions are made
Parameter
a numerical characteristic of the population
Sample
subset of the entire group of observations or elements from which data are collected
representative of the population
Statistic
a numerical characteristic of the sample
Census/Complete Enumeration
Sampling/Survey Sampling
General Classification of Collecting Data
Census/Complete Enumeration
process of gathering information from every unit or all the units of the population
Sampling/Survey Sampling
process of obtaining a part or subset of the population
less cost
greater accuracy
greater speed
greater scope
Why do we sample?
Probability Sampling
Nonprobability Sampling
Types of Sampling Methods
Probability Sampling
Types of Sampling Methods. Each unit in the population has a known, non-zero probability of selection; the probabilities need not be equal (they are equal in simple random sampling).
It uses some chance mechanism
Nonprobability Sampling
Types of Sampling Methods. The probability that a given element of the population is selected is unknown, so elements do not have a known chance of being included in the sample.
Elements of the population are taken depending to a large extent on the personal feelings or purpose of the researcher and without regard for some chance mechanism for choosing an element
Sampling frame
listing of all individual units in the population which is required in the execution of probability sampling methods
Simple Random Sampling (SRS)
Systematic Sampling
Stratified Sampling
Cluster Sampling
4 Types of Probability Sampling Methods
Simple Random Sampling (SRS)
Method of selecting n units out of N units in the population where all elements in the population have an equal chance of being included in the sample.
This sampling method is suitable when the population being studied is homogeneous, that is, its units have similar characteristics.
ex. draw lots, random number generator
SRS with Replacement (SRSWR)
SRS without Replacement (SRSWOR)
2 Types of Simple Random Sampling (SRS)
SRS with Replacement (SRSWR)
Type of SRS where a chosen element is always replaced before the next selection is made.
SRS without Replacement (SRSWOR)
Type of SRS where a chosen element is not replaced before the next selection is made
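The difference between the two types of SRS can be sketched in Python. This is only an illustration: the function names, the seed parameter, and the 20-unit sampling frame are assumptions made for the example, not part of the original notes.

```python
import random

def srs_with_replacement(population, n, seed=None):
    """SRSWR: each chosen unit goes back into the pool, so repeats are possible."""
    rng = random.Random(seed)
    return rng.choices(population, k=n)

def srs_without_replacement(population, n, seed=None):
    """SRSWOR: a chosen unit is not returned, so all n units are distinct."""
    rng = random.Random(seed)
    return rng.sample(population, n)

# Hypothetical sampling frame: units numbered 1 to 20.
population = list(range(1, 21))
srswr = srs_with_replacement(population, 5, seed=1)
srswor = srs_without_replacement(population, 5, seed=1)
print("SRSWR: ", srswr)
print("SRSWOR:", srswor)
```

With SRSWOR, n cannot exceed N; with SRSWR it can, because units may be drawn more than once.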
Systematic Sampling
Type of probability sampling method in which a sample is selected by taking every kth unit from an ordered population, with the first unit selected at random.
sampling interval
In systematic sampling, what is k?
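A minimal sketch of systematic sampling, assuming the common convention k = N // n (integer division) and a random start among the first k units. The function name and the 100-unit population are made up for the example.

```python
import random

def systematic_sample(ordered_population, n, seed=None):
    """Take every kth unit from an ordered population after a random start."""
    N = len(ordered_population)
    if not 1 <= n <= N:
        raise ValueError("n must be between 1 and N")
    k = N // n                    # sampling interval (assumed convention)
    rng = random.Random(seed)
    start = rng.randrange(k)      # random start among the first k units (0-based)
    return [ordered_population[start + i * k] for i in range(n)]

# Hypothetical ordered population: units numbered 1 to 100.
population = list(range(1, 101))
sample = systematic_sample(population, 10, seed=2)
print(sample)
```

Here N = 100 and n = 10, so k = 10 and consecutive selected units are always 10 positions apart.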
Stratified Sampling
It is done if the population is heterogeneous and can be subdivided into non-overlapping homogeneous subpopulations called strata.
Samples are then randomly selected from all the strata using SRS or systematic sampling
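A rough sketch of stratified sampling using SRS without replacement inside each stratum. Proportional allocation (the same sampling fraction in every stratum) is an assumption of this example; the notes do not say how many units to take per stratum. The strata names and sizes are hypothetical.

```python
import random

def stratified_sample(strata, fraction, seed=None):
    """Draw an SRS (without replacement) from every stratum.

    Assumes proportional allocation: each stratum contributes about
    `fraction` of its units, with at least one unit per stratum.
    """
    rng = random.Random(seed)
    sample = {}
    for name, units in strata.items():
        n_h = max(1, round(len(units) * fraction))
        sample[name] = rng.sample(units, n_h)
    return sample

# Hypothetical strata: 40 freshmen and 20 seniors, identified by ID number.
strata = {
    "freshmen": list(range(0, 40)),
    "seniors": list(range(100, 120)),
}
picked = stratified_sample(strata, 0.25, seed=3)
for name, units in picked.items():
    print(name, sorted(units))
```

Every stratum is represented in the sample, which is the main contrast with cluster sampling.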
Cluster Sampling
A method of sampling where a sample of distinct groups, or clusters, of elements is randomly selected and then a census of all elements in the selected clusters is taken.
Clusters are non-overlapping subpopulations which together comprise the entire population, and each is preferably formed with heterogeneous elements.
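A small sketch of one-stage cluster sampling: whole clusters are chosen at random, then every element in the chosen clusters is kept. The function name and the class sections used as clusters are hypothetical.

```python
import random

def cluster_sample(clusters, m, seed=None):
    """Randomly select m clusters, then take a census of each selected cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), m)   # SRS of cluster names
    return {name: list(clusters[name]) for name in chosen}

# Hypothetical population split into four non-overlapping sections.
clusters = {
    "Section A": ["a1", "a2", "a3"],
    "Section B": ["b1", "b2"],
    "Section C": ["c1", "c2", "c3", "c4"],
    "Section D": ["d1", "d2", "d3"],
}
selected = cluster_sample(clusters, 2, seed=4)
for name, members in selected.items():
    print(name, members)
```

Only some clusters appear in the sample, but the ones that do appear are included in full.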
Purposive Sampling
Convenience Sampling
Quota Sampling
Snowball Sampling
4 Types of Non-probability Sampling
Summation Symbol
Upper Limit
Index of Summation
Lower Limit
Summand
Parts of Summation Notation (from upper left, upper right, lower left, …)
Rules on Summation
(not sure how to enter these yet)
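For reference, here are the summation rules usually covered under this heading. Which rules the course actually listed is an assumption, since the card above was left blank; c denotes a constant, and the limits 1 to n are the usual textbook choice.

```latex
% Parts: i = index of summation, 1 = lower limit, n = upper limit, x_i = summand
\sum_{i=1}^{n} x_i = x_1 + x_2 + \cdots + x_n

% Rule 1: the sum of a constant
\sum_{i=1}^{n} c = nc

% Rule 2: a constant factor can be taken out of the sum
\sum_{i=1}^{n} c\,x_i = c \sum_{i=1}^{n} x_i

% Rule 3: the sum of a sum (or difference) is the sum (or difference) of the sums
\sum_{i=1}^{n} (x_i \pm y_i) = \sum_{i=1}^{n} x_i \pm \sum_{i=1}^{n} y_i
```

A common mistake to watch for: in general, the sum of squares is not the square of the sum, that is, $\sum x_i^2 \neq \left(\sum x_i\right)^2$.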
measure of central tendency
a value at the center or middle of a data set, that is, the value where the data tend to cluster
Mean
Measure of Central Tendency that is the average value.
It is sensitive to extreme values, is always a single value, and is used for continuous (numerical) data. It works well with many statistical methods.
Median
Measure of Central Tendency that is the middle value of an ordered data.
It is not sensitive to extreme values, is always a single value, and is used for continuous (numerical) data. It is often a good choice if there are some extreme observations.
Mode
Measure of Central Tendency that is the most frequent value. It locates the point where the observation values occur with the greatest density
It can be a single value or multiple values, or it may not exist. It can be used with categorical or continuous data. It is appropriate for data at the nominal and ordinal levels.
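The three measures can be compared on one small data set using Python's standard `statistics` module. The scores below are invented for the example; the value 300 is included on purpose as an extreme observation.

```python
import statistics

# Hypothetical scores; 300 is a deliberate extreme value.
scores = [70, 75, 75, 80, 85, 90, 300]

mean = statistics.mean(scores)        # pulled upward by the extreme value
median = statistics.median(scores)    # middle value of the ordered data
modes = statistics.multimode(scores)  # list of the most frequent value(s)

print("mean:  ", round(mean, 2))
print("median:", median)
print("mode:  ", modes)
```

The extreme value drags the mean well above the median, which is why the median is often preferred when there are outliers.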
Measure of variability or dispersion
It indicates the extent to which observations in a data set are scattered about an average. It is also used as a measure of reliability of the average value
True
True or False. The higher the measure of variability, the more dispersed the data is
Range
Variance
Standard Deviation
Standard Error of the Mean
Coefficient of Variation
Measures of Variability
Range
A measure of variability which is the difference between the highest value and the lowest value in the data set.
It uses extreme values; an outlier can greatly alter its value. It fails to communicate any information about the clustering.
Variance
A measure of variability which refers to the mean of the squared deviations of the observations from the mean. It is not a measure of absolute dispersion. It can only take values from 0 to +∞.
Standard Deviation
A measure of variability which refers to the positive square root of variance. It is the measure of absolute dispersion.
Standard Error of the Mean
A measure of variability which refers to the standard deviation of the sampling distribution of the mean. It provides a tolerance of an estimate of the mean which is calculated from a sample
Coefficient of Variation
A measure of variability which refers to the ratio of the standard deviation and the mean and is expressed in percentage.
It is unitless. It is used to compare the variability of two or more data sets.
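A short sketch computing the five measures of variability on a made-up data set. It assumes the data are a sample, so the variance uses the n - 1 divisor (`statistics.variance`); for a whole population, `statistics.pvariance` and `statistics.pstdev` divide by N instead.

```python
import math
import statistics

# Hypothetical sample of five measurements.
data = [4, 8, 6, 5, 7]

data_range = max(data) - min(data)          # highest value minus lowest value
variance = statistics.variance(data)        # sample variance (divides by n - 1)
sd = statistics.stdev(data)                 # positive square root of the variance
se = sd / math.sqrt(len(data))              # standard error of the mean
cv = sd / statistics.mean(data) * 100       # coefficient of variation, in percent

print("range:", data_range)
print("variance:", variance)
print("standard deviation:", round(sd, 4))
print("standard error:", round(se, 4))
print("CV (%):", round(cv, 2))
```

Because the CV divides by the mean, it only makes sense when the mean is not zero.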
Measures of Skewness
It describes the shape of the distribution of data. It can be revealed through a comparison of the mean, median, and mode.
skewed
A distribution of data is _____ if it is not symmetric and extends more to one side than the other
symmetric
A distribution of data is ______ if the left half of its histogram is roughly a mirror image of its right half
Positive Skew
Measures of Skewness. It is ‘skewed to the right’. It has more concentration of values below the mean.
Negative Skew
Measures of Skewness. It is ‘skewed to the left’. It has more concentration of values above the mean.
Symmetrical Distribution
Measures of Skewness. It has no skew; the normal distribution is the most common example. The three measures of central tendency have approximately the same value.
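The comparison of mean and median can be turned into a quick check. This is a rule of thumb rather than a formal skewness coefficient, and it can mislead on unusual data sets; the function name, tolerance, and data sets are assumptions for the example.

```python
import statistics

def skew_direction(data, tol=1e-9):
    """Rule of thumb: compare the mean with the median."""
    mean = statistics.mean(data)
    median = statistics.median(data)
    if mean - median > tol:
        return "positive (right) skew"    # mean pulled up by large values
    if median - mean > tol:
        return "negative (left) skew"     # mean pulled down by small values
    return "roughly symmetric"

# Hypothetical data sets.
print(skew_direction([1, 2, 3, 4, 20]))
print(skew_direction([1, 17, 18, 19, 20]))
print(skew_direction([1, 2, 3, 4, 5]))
```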
Percentile
Quartile
Measures of Location
Quartile
It divides a set of data into four groups with about 25% of the values in each group.
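Quartiles can be computed with `statistics.quantiles` from Python's standard library. Note that several conventions for locating quartiles exist, so a textbook or calculator may give slightly different values; this sketch uses the function's default ("exclusive") method, and the data are made up.

```python
import statistics

# Hypothetical data set of 11 ordered values.
data = [2, 4, 4, 5, 6, 7, 8, 9, 10, 12, 15]

# n=4 asks for the three cut points that split the data into four groups.
q1, q2, q3 = statistics.quantiles(data, n=4)

print("Q1:", q1)
print("Q2 (median):", q2)
print("Q3:", q3)
```

Q2 always coincides with the median, so only Q1 and Q3 depend on the convention chosen.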
Boxplot
Graphical Presentation that gives information about the distribution and spread of data. It shows the minimum and maximum values, Q1, Q3, and the median (Q2).
Histogram
Graphical Presentation in which the classes are marked on the horizontal axis and the class frequencies on the vertical axis.
The class frequencies are represented by the heights of the bars, and the bars are drawn adjacent to each other.
Scatterplot
Graphical Presentation that is used to evaluate the relationship between two different continuous variables.
Probability
It is a quantitative measure of uncertainty. It is a number that expresses the strength of our belief in the occurrence of an uncertain event
Random experiment
It is any process that allows researchers to obtain observations. It is any process that can be repeated under basically same conditions and yields well defined outcomes.
ex. tossing a coin
Sample space (S)
It is the set of all possible outcomes of a random experiment.
ex. {head, tail}
sample points
Elements of the sample space.
n(S)
ex. tossing a coin, n(S) = 2
number of sample points is denoted by ____
Event
A subset of the sample space
Classical Approach
Relative Frequency Approach
Subjective Probability
Three Types of Probability
Classical Approach
Based on the idea that certain occurrences are equally likely, that is, we assume that in a given experiment, all the sample points in the sample space have equal chances of occurring
a priori probability
The Classical Approach is also called ______. We can state the answer in advance without performing the experiment.
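Under the classical approach, the probability of an event E is n(E) / n(S), the number of sample points in the event over the number in the sample space. A small sketch, assuming a fair six-sided die as the experiment (the die example and function name are not from the original notes):

```python
from fractions import Fraction

def classical_probability(event, sample_space):
    """P(E) = n(E) / n(S), valid only when all sample points are equally likely."""
    if not event <= sample_space:
        raise ValueError("an event must be a subset of the sample space")
    return Fraction(len(event), len(sample_space))

# Hypothetical experiment: rolling one fair die.
sample_space = {1, 2, 3, 4, 5, 6}
even = {2, 4, 6}

p_even = classical_probability(even, sample_space)
print("P(even) =", p_even)
```

No rolling is needed to get the answer, which is what "a priori" refers to.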
Relative Frequency Approach
An experiment is conducted or observed a large number of times, and the number of times an event actually occurs is counted; that is, probabilities are determined experimentally.
a posteriori probability or empirical method
The Relative Frequency Approach is also called ______.
Subjective Approach
It is based on the beliefs of the person making the probability assessment.
Law of Large Numbers
This law states that “as a procedure is repeated again and again, the relative frequency probability of an event tends to approach the actual probability”
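The law can be illustrated with a simulated coin. This is a sketch under the assumption of a fair coin (actual probability of heads 0.5); the function name, the seed, and the numbers of tosses are arbitrary choices for the example.

```python
import random

def relative_frequency_of_heads(n_tosses, seed=0):
    """Simulate n_tosses of a fair coin and return the proportion of heads."""
    rng = random.Random(seed)
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses

# The relative frequency tends to settle near 0.5 as the number of tosses grows.
for n in (10, 100, 1_000, 10_000, 100_000):
    print(n, relative_frequency_of_heads(n))
```

Individual runs with few tosses can be far from 0.5; the law only describes what happens as the number of repetitions becomes large.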
random variable
a variable that has a single numerical value (determined by chance) for each outcome of a random experiment
It is denoted by X, Y, Z.
Discrete Random Variable
Continuous Random Variable
2 Types of Random Variables
Discrete Random Variable
Types of Random Variable. It has either a finite number of values or a countable number of values.
ex. number of heads in one coin toss (0 or 1)
Continuous Random Variable
Types of Random Variable. It has infinitely many values which can be associated with measurements on a continuous scale
Probability distribution
the listing of all possible values that a random variable can take on, together with their corresponding probabilities
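As an illustration, the probability distribution of X = number of heads in two tosses of a fair coin can be built directly from the sample space. The choice of experiment is an assumption for the example, and the sample points are treated as equally likely (classical approach).

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space for two tosses: (H,H), (H,T), (T,H), (T,T).
sample_space = list(product("HT", repeat=2))

# The random variable X assigns each outcome its number of heads.
counts = Counter(outcome.count("H") for outcome in sample_space)

# Probability of each value of X, assuming equally likely sample points.
distribution = {
    x: Fraction(c, len(sample_space)) for x, c in sorted(counts.items())
}

for x, p in distribution.items():
    print(f"P(X = {x}) = {p}")
```

The probabilities in any probability distribution must each lie between 0 and 1 and add up to 1.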