Statistics and Probability

Statistics - the science of planning studies and of collecting, organizing, presenting, analyzing or summarizing, and interpreting data, then drawing conclusions based on the data. It is also a way of reasoning, along with a collection of tools and methods designed to help us understand the world.

Collection - process of obtaining information.

Organization - determining the manner of presenting the data in tables, graphs, or charts so that logical and statistical conclusions can be drawn from the collected measurements.

Analysis of data - process of extracting relevant information from the given data, from which a numerical description can be formed.

Interpretation of Data - task of drawing conclusions from the analyzed data.

Probability - the chance that something will happen, expressed as a number from 0 to 1 (or 0% to 100%).

Descriptive - collection, organization, presentation, analysis/summarization of data.

Inferential - using data from a sample to interpret and draw conclusions about a population.

Inferential - Percentages and sample size

Descriptive - data as a whole

Universe - collection or set of entities from which the data are obtained

Population - the set of all possible values of a variable

Parameter - a value that describes something about a population

Sample - subgroup of a population

Statistic - a value that describes something about a sample

Variable - a characteristic that is observable and measurable in every unit of a universe

Qualitative - non-numeric data that express categorical attributes

Quantitative - numerical data

Types of quantitative data

  1. Discrete data - data that can be counted

  2. Continuous data - data that can be measured
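The parameter and statistic definitions above can be illustrated with a small sketch. The heights are made-up values, and taking every other element is only a stand-in for a real sampling procedure.

```python
from statistics import mean

# Hypothetical population: heights (cm) of all 10 members of a small club.
population = [150, 152, 155, 158, 160, 162, 165, 168, 170, 172]

# Hypothetical sample: a subgroup of the population (every other member).
sample = population[::2]

parameter = mean(population)  # describes the population
statistic = mean(sample)      # describes the sample

print("Population mean (parameter):", parameter)
print("Sample mean (statistic):", statistic)
```

The two values are close but not equal, which is why inferential statistics treats a statistic as an estimate of a parameter.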

Central Tendency of Ungrouped Data

Mean - the average; add all the values, then divide by the number of values.

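Median - the middle value once the data are arranged from lowest to highest; with an even number of values, it is the mean of the two middle values.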

Mode - the most frequently occurring value

One mode - unimodal

Two modes - bimodal

3 or more modes - multimodal

None - non-modal
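A minimal sketch of the three measures using Python's standard `statistics` module. The scores are made-up, and `describe_modes` is a hypothetical helper; it assumes the common convention that data have no mode when every distinct value occurs equally often.

```python
from statistics import mean, median, multimode

def describe_modes(values):
    """Return (modes, label) using the labels from these notes."""
    modes = multimode(values)
    distinct = set(values)
    # Assumed convention: if all distinct values tie, there is no mode.
    if len(distinct) > 1 and len(modes) == len(distinct):
        return [], "non-modal"
    labels = {1: "unimodal", 2: "bimodal"}
    return modes, labels.get(len(modes), "multimodal")

scores = [7, 3, 9, 3, 8, 5, 3]  # hypothetical quiz scores

print("Mean:", mean(scores))      # sum of values / number of values
print("Median:", median(scores))  # middle value of the sorted data
print("Mode:", describe_modes(scores))
```

Note that `median` sorts the data internally, so the list does not need to be arranged beforehand.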

LEVEL OF MEASUREMENT

Nominal level - simplest form of measurement. Used purely for classification and identification.

Ordinal level - variables are rank-ordered according to their magnitude or intensity

Interval level - expressed in real numbers, so data can be ranked and differences between values are meaningful, but there is no true zero point.

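Ratio level - represents the most precise level of measurement; it has the properties of the interval level plus a true zero point, so ratios between values are meaningful.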

PROBABILITY DISTRIBUTION

Concept of random variable

statistical experiment - used to describe any process by which several chance observations are obtained.

sample space - all possible outcomes of an experiment

Random variable - variable whose value is determined by the outcome of a random experiment or an event.

Discrete Random Variable - set of assumed values is countable (arises from counting)

Continuous Random Variable - set of assumed values is uncountable (arises from measurement)

Discrete probability distribution - is a table listing all possible values that a discrete variable can take on, together with the associated probabilities.
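As an illustration of such a table, the sketch below builds the distribution of the number of heads in two tosses of a fair coin. The experiment is an assumed example, not one taken from these notes.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space: all possible outcomes of tossing a coin twice.
sample_space = list(product("HT", repeat=2))

# Random variable X: number of heads in each outcome.
counts = Counter(outcome.count("H") for outcome in sample_space)

# Discrete probability distribution: each value of X with its probability,
# assuming all outcomes in the sample space are equally likely.
distribution = {x: Fraction(c, len(sample_space)) for x, c in sorted(counts.items())}

for x, p in distribution.items():
    print(f"P(X = {x}) = {p}")

# The probabilities of a valid distribution add up to 1.
print("Total:", sum(distribution.values()))
```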

NORMAL DISTRIBUTION

Normal Probability distribution - is a data distribution where the mean, median, and mode are equal and the distribution is clustered at the center

Graph of normal distribution - bell-shaped curve that is symmetrical about the mean and extends indefinitely in both directions

Total area under the normal curve - is equal to 100% or 1, with 50% or 0.5 on each side of the center.

STANDARD SCORE or Z-SCORE - equivalent value of a raw score expressed in terms of the mean and standard deviation of the distribution.
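The standard formula is z = (raw score - mean) / standard deviation. A minimal sketch, with made-up exam numbers and a hypothetical `z_score` helper:

```python
def z_score(raw_score, mean, std_dev):
    """Number of standard deviations a raw score lies above (+) or below (-) the mean."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (raw_score - mean) / std_dev

# Hypothetical exam: mean 75, standard deviation 5.
print(z_score(85, 75, 5))  # a score above the mean gives a positive z
print(z_score(70, 75, 5))  # a score below the mean gives a negative z
```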

Different types of curves according to skewness

Negatively skewed - skewed to the left and the mean has the lowest value among the three measures of central tendency.

Positively skewed - skewed to the right and the mean has the highest value among the three measures of central tendency.

No skew - all measures of central tendency are equal
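The ordering of the mean relative to the median and mode can be checked on small made-up data sets. This is a rule of thumb that holds for typical skewed data, not a guarantee for every data set.

```python
from statistics import mean, median, mode

right_skewed = [1, 2, 2, 3, 10]  # hypothetical data with a long right tail
left_skewed = [1, 8, 9, 9, 10]   # hypothetical data with a long left tail

for name, data in [("positively skewed", right_skewed),
                   ("negatively skewed", left_skewed)]:
    print(name, "mean:", mean(data), "median:", median(data), "mode:", mode(data))
```

In the first data set the extreme value 10 pulls the mean above the median and mode; in the second, the extreme value 1 pulls it below them.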

Different types of curves according to kurtosis

Mesokurtic - Normal distribution, excess kurtosis = 0 (kurtosis = 3)

Leptokurtic - High degree of peakedness, excess kurtosis > 0

Platykurtic - Low degree of peakedness, excess kurtosis < 0
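A rough sketch of the classification, assuming the values above refer to excess kurtosis (the fourth standardized moment minus 3) computed with population moments. The data sets and function names are hypothetical.

```python
def excess_kurtosis(values):
    """Population excess kurtosis: m4 / m2**2 - 3."""
    n = len(values)
    center = sum(values) / n
    m2 = sum((v - center) ** 2 for v in values) / n  # second central moment
    m4 = sum((v - center) ** 4 for v in values) / n  # fourth central moment
    return m4 / m2 ** 2 - 3

def classify_kurtosis(values, tol=1e-9):
    k = excess_kurtosis(values)
    if abs(k) <= tol:
        return "mesokurtic"
    return "leptokurtic" if k > 0 else "platykurtic"

flat = [1, 2, 3, 4, 5]                        # evenly spread values
peaked = [-10, 0, 0, 0, 0, 0, 0, 0, 0, 10]    # clustered center, heavy tails

print(classify_kurtosis(flat), excess_kurtosis(flat))
print(classify_kurtosis(peaked), excess_kurtosis(peaked))
```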

THE AREA OF A NORMAL CURVE

Z-table - table that gives the area under the standard normal curve for a given z-score

Case 1 - only one side of the curve is wholly shaded, either left or right

Case 2 - two sides of the curve are shaded, with one negative and one positive z-score (add the two areas)

Case 3 - one side of the curve is shaded, but only between two z-scores on that side (subtract the smaller area from the larger)

Case 4 - one side of the curve is partly shaded, from a z-score out to the end of the tail (subtract the table area from 0.5000)

Case 5 - both sides of the curve are shaded, from a z-score through the center to the end of the opposite tail (add the table area to 0.5000)
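The addition and subtraction rules in Cases 2 to 5 can be sketched as follows. The sketch assumes the z-table lists the area between the center (z = 0) and a given z-score; `area_0_to_z` stands in for a table lookup and is computed from the error function. All function names are hypothetical.

```python
from math import erf, sqrt

def area_0_to_z(z):
    """Area under the standard normal curve between 0 and |z| (assumed table entry)."""
    return 0.5 * erf(abs(z) / sqrt(2))

def area_between_opposite_sides(z_negative, z_positive):
    """Case 2: shaded region spans the center, so add the two areas."""
    return area_0_to_z(z_negative) + area_0_to_z(z_positive)

def area_between_same_side(z1, z2):
    """Case 3: both z-scores on one side, so subtract the smaller area."""
    return abs(area_0_to_z(z2) - area_0_to_z(z1))

def area_in_tail(z):
    """Case 4: from z out to the end of the tail, so 0.5 minus the table area."""
    return 0.5 - area_0_to_z(z)

def area_through_center_to_tail(z):
    """Case 5: from z through the center to the opposite tail, so 0.5 plus the table area."""
    return 0.5 + area_0_to_z(z)

print(round(area_between_opposite_sides(-1, 2), 4))
print(round(area_between_same_side(1, 2), 4))
print(round(area_in_tail(1.96), 4))
print(round(area_through_center_to_tail(1.96), 4))
```

Rounding to four decimal places mirrors the precision of a typical printed z-table.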

SAMPLING DESIGN

Basic concepts and procedures

Frame – a collection of units (referred to as sampling units) in a population; the materials or devices that delimit, identify, and allow access to the elements of the target population.

Survey – This refers to a method of collecting information about a population in which direct contact is made with the units of study through systematic means such as questionnaires and interview schedules.

o Census or complete enumeration - This is a survey in which data are to be collected from all elements of the target population.

o Sample survey - This refers to the gathering of information from only a fraction of the population chosen to represent the whole.

Sampling – a process of selecting samples from a given population.

SAMPLING TECHNIQUE

Sampling techniques can be grouped according to how items are selected: probability sampling and non-probability sampling.

Probability Sampling – the sample is a proportion of the population, selected in a systematic way in which every element of the population has a chance of being included in the sample.

Non-Probability Sampling – The sample is not a proportion of the population and there is no system in selecting the sample. The selection depends on the situation.

TYPES OF PROBABILITY SAMPLING

Pure Random Sampling – one in which everyone in the population of the study has an equal chance of being selected for the sample. Also called simple random sampling.

Systematic Random Sampling – In this method, a researcher develops an accurate sampling frame, picks a random starting point, and then selects every k-th element from the frame, where k is the sampling interval (population size divided by sample size).

Stratified Random Sampling – It involves splitting subjects into mutually exclusive groups (strata) and then using simple random sampling to choose members from each group.

Cluster Random Sampling – It is a way to randomly select participants from a list that is too large for simple random sampling. For example, if you wanted to choose 1000 participants from the entire population of the Philippines, it is likely impossible to get a complete list of everyone. Instead, the researcher randomly selects areas (e.g., cities or provinces) and then randomly selects participants from within those boundaries.

Multi-stage Sampling – Selection of the sample is done in two or more steps or stages, with sampling units varying in each stage. The population is first divided into a number of first-stage sampling units from which a sample is drawn. Smaller units, called the secondary sampling units, comprising the selected first-stage units then serve as the sampling units for the next stage. If needed, additional stages may be added until the units of observation for the survey are clearly identified. The smaller units comprising the samples selected from the previous stage constitute the frame for the succeeding stages.
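The probability sampling techniques above can be sketched with Python's standard `random` module. This is an illustration under stated assumptions, not a survey-grade implementation: the frame, strata, and cluster names are made up, systematic sampling is shown in its usual every-k-th-element form, and the cluster example draws from within the chosen clusters as a simple two-stage design.

```python
import random

def simple_random_sample(frame, n, rng):
    """Every element of the frame has an equal chance of selection."""
    return rng.sample(frame, n)

def systematic_sample(frame, n, rng):
    """Random start, then every k-th element (k = sampling interval)."""
    k = len(frame) // n
    start = rng.randrange(k)
    return [frame[start + i * k] for i in range(n)]

def stratified_sample(strata, n_per_stratum, rng):
    """Simple random sample drawn separately from each mutually exclusive group."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, n_per_cluster, rng):
    """Randomly pick clusters, then randomly pick units within each chosen cluster."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: rng.sample(clusters[name], n_per_cluster) for name in chosen}

rng = random.Random(42)          # seeded so the run is repeatable
frame = list(range(1, 101))      # hypothetical frame of 100 unit IDs
strata = {"grade_11": frame[:50], "grade_12": frame[50:]}
clusters = {f"city_{c}": frame[i * 20:(i + 1) * 20] for i, c in enumerate("ABCDE")}

print(simple_random_sample(frame, 10, rng))
print(systematic_sample(frame, 10, rng))
print(stratified_sample(strata, 5, rng))
print(cluster_sample(clusters, 2, 5, rng))
```

Passing the random generator in explicitly keeps each function testable and lets one seed control the whole run.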

TYPES OF NON-PROBABILITY SAMPLING

Accidental Sampling – There is no system of selection; the sample consists only of those whom the researcher or interviewer meets by chance.

Quota Sampling – A specified number of persons of certain types is included in the sample.

Convenience Sampling – a process of picking out people in the most convenient and fastest way to get reactions immediately. This method can be done by telephone interview to get the immediate reactions of a certain group to a certain issue.

Purposive Sampling – It is based on certain criteria laid down by the researcher. People who satisfy the criteria are interviewed. It is used to determine the target population of those who will be taken for the study.

Snowball Sampling – It is where research participants recruit other participants for a test or study. It is used when potential participants are hard to find.
