Practical Data Science (PPTs)

79 Terms

1
New cards

Raw, unorganized, simple facts that are sometimes useless on their own until they are processed into information

Data

2
New cards

A set of data placed in context to describe the details of a topic or research subject

Information

3
New cards

Consists of numbers representing counts or measurements

Quantitative Data

4
New cards

Can be separated into different categories that are distinguished by some non-numeric characteristics

Qualitative Data

5
New cards

Results when the number of possible values is either a finite number or a “countable” number

Discrete Data

6
New cards

Result from infinitely many possible values that correspond to some continuous scale that covers a range of values without gaps, interruptions or jumps.

Continuous Data

7
New cards

characterized by data that consist of names, labels, or categories only. The data cannot be arranged in an ordering scheme (such as low to high).

Nominal Level of Measurement

8
New cards

Can be arranged in some order, but differences between data values either cannot be determined or are meaningless.

Ordinal Level of Measurement

9
New cards

Two Sources of Data

Primary Data & Secondary Data

10
New cards

Information collected firsthand by the researcher for a specific purpose

Primary Data

11
New cards

Information that already exists and has been collected by someone else for a different purpose

Secondary Data

12
New cards

Process for Data Collection

  1. Determine the data you want to collect

  2. Set a timeframe for data collection

  3. Determine the data collection method

  4. Collect Data

  5. Analyze Data & implement the findings

13
New cards

Information is gathered through a questionnaire, mostly based on individual or group experiences regarding a particular phenomenon.

Survey

14
New cards

A qualitative method of obtaining data whose results are based on intensive engagement with respondents about a particular study.

Interview

15
New cards

Carried out by monitoring participants in a specific situation or environment at a given time and day.

Observation

16
New cards

process of examining existing documents and records of an organization for tracking changes over a period of time.

Documents & Records

17
New cards

Data are mostly collected based on the cause and effect of the two variables being studied.

Experiment

18
New cards

A subset of the population

Sample

19
New cards

the science of collecting, analyzing, interpreting, and presenting numerical data.

Statistics

20
New cards

number that describes the sample

Statistic

21
New cards

entire collection of individuals about which information is sought.

Population

22
New cards

Collection of data from every member of the population

Census

23
New cards

The study and practice of extracting knowledge and insights from large amounts of data

Data science

24
New cards

the process of selecting a subset of individuals from a population to estimate characteristics of the whole.

Sampling

25
New cards

Sampling technique that gives everyone in the population a chance of being selected

Probability Sampling

26
New cards

one or more parts of the population are favored over others

biased sampling

27
New cards

considered the most common probability sampling

simple random sampling
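
A minimal Python sketch of simple random sampling, assuming the population is a list of ID numbers 1 to 100. The function name `simple_random_sample` and the seed are illustrative choices, not something from the deck.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct members; every subset of size n is equally likely."""
    rng = random.Random(seed)          # seeded only so the run is repeatable
    return rng.sample(list(population), n)

population = list(range(1, 101))       # hypothetical population of 100 IDs
sample = simple_random_sample(population, 10, seed=42)
print(sample)
```

Sampling is done without replacement, so no subject can appear twice in the sample.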

28
New cards

A sampling done by numbering each subject of the population and selecting every kth subject.

Systematic Random Sampling
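
A minimal Python sketch of selecting every kth subject, assuming 100 numbered subjects and k = 10. The random starting point within the first k subjects is a common convention assumed here, and `systematic_sample` is a hypothetical helper name.

```python
import random

def systematic_sample(population, k, seed=None):
    """Pick a random start among the first k subjects, then every kth one."""
    rng = random.Random(seed)
    start = rng.randrange(k)           # index 0 .. k-1
    return population[start::k]

subjects = list(range(1, 101))         # subjects numbered 1 to 100
picked = systematic_sample(subjects, k=10, seed=0)
print(picked)
```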

29
New cards

Sampling method that first categorizes the respondents based on their similarities and then selects a random sample from each category.

Stratified Random Sampling
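
A minimal Python sketch of stratified random sampling, assuming a made-up list of 40 students grouped by year level. The record layout and the name `stratified_sample` are illustrative assumptions.

```python
import random
from collections import defaultdict

def stratified_sample(records, key, per_stratum, seed=None):
    """Group records into strata by key, then sample randomly within each."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for record in records:
        strata[key(record)].append(record)
    sample = []
    for label in sorted(strata):
        members = strata[label]
        # Take per_stratum members, or the whole stratum if it is smaller.
        sample.extend(rng.sample(members, min(per_stratum, len(members))))
    return sample

# Hypothetical respondents: 40 students spread evenly over 4 year levels.
students = [{"id": i, "year": 1 + i % 4} for i in range(40)]
chosen = stratified_sample(students, key=lambda s: s["year"], per_stratum=3, seed=1)
print(chosen)
```

Taking the same number from each stratum is one design choice; sampling in proportion to stratum size is another.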

30
New cards

A probability sampling method used when the population is very large. It involves dividing the population into different areas (clusters), and then the researchers randomly select some of those areas to sample.

Cluster Sampling

31
New cards

sample that is not drawn by a well-defined random method

Sample of Convenience

32
New cards

number that describes the population

parameter

33
New cards

A characteristic or attribute that can assume different values

Variable

34
New cards

Variables that have distinct categories according to some characteristic or attribute

Qualitative variables

35
New cards

are variables that can be measured or counted.

Quantitative Variables

36
New cards

Variables that assume values that can be counted

Discrete

37
New cards

variables that can assume an infinite number of values between any two specific values

Continuous

38
New cards

classifies data into categories that can be ranked; however precise differences between ranks do not exist

ordinal level of measurement

39
New cards

classifies data into mutually exclusive categories in which no order or ranking can be imposed on the data

nominal level of measurement

40
New cards

ranks data and precise differences between units of measure do exist; however there is no meaningful zero

interval level of measurement

41
New cards

possess all the characteristics of interval measurement, and there exists a true zero

ratio level of measurement

42
New cards

In an experimental study, the variable that is manipulated by the researcher

independent variable

43
New cards

The variable that is being studied to see if it has changed significantly because of the manipulation of the independent variable

dependent variable

44
New cards

A variable that influences the dependent or outcome variable but was not separated from the independent variable

confounding variable

45
New cards

the independent variable is also called the?

explanatory variable

46
New cards

the group that received the special instruction is called?

treatment group (the control group is the one that does not receive it)

47
New cards

occurs when some members of a population are systematically more likely to be selected for a study than others.

Sampling bias

48
New cards

A chance process that leads to well-defined results

Probability Experiment

49
New cards

The result of a single trial of a probability experiment

outcome

50
New cards

is the totality or the set of all possible outcomes in a single experiment

sample space

51
New cards

set of outcomes in an experiment

event

52
New cards

An experiment is called ____ or ___ if every outcome is equally likely

random or fair

53
New cards

Events A and B such that the occurrence of A does not affect the probability of B occurring, and vice versa

independent events
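
A minimal Python sketch that checks independence with the product rule P(A and B) = P(A) × P(B), assuming two fair dice as the experiment. The events and the helper `prob` are illustrative, not from the deck.

```python
from fractions import Fraction
from itertools import product

# Sample space for rolling two fair dice (outcomes assumed equally likely).
sample_space = list(product(range(1, 7), repeat=2))

def prob(event):
    """Favorable outcomes divided by all outcomes."""
    favorable = [o for o in sample_space if event(o)]
    return Fraction(len(favorable), len(sample_space))

p_a = prob(lambda o: o[0] == 6)                      # A: first die shows 6
p_b = prob(lambda o: o[1] == 6)                      # B: second die shows 6
p_a_and_b = prob(lambda o: o[0] == 6 and o[1] == 6)  # both happen

is_independent = (p_a_and_b == p_a * p_b)
print(p_a, p_b, p_a_and_b, is_independent)
```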

54
New cards

Uses the sample space S to determine the numerical chance that an event will happen; it assumes that the experiment is fair and that all outcomes are equally likely

classical probability
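
A minimal Python sketch of classical probability, P(E) = n(E) / n(S), assuming the experiment is rolling two fair dice and the event is "the sum is 7". Both the experiment and the event are illustrative choices.

```python
from fractions import Fraction
from itertools import product

# Sample space S: all 36 ordered pairs from two fair dice.
sample_space = list(product(range(1, 7), repeat=2))

# Event E: the two dice sum to 7.
event = [outcome for outcome in sample_space if sum(outcome) == 7]

p = Fraction(len(event), len(sample_space))
print(p)  # 1/6
```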

55
New cards

relies on actual experience to determine the likelihood of outcomes.

empirical probability
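
A minimal Python sketch of empirical probability as a relative frequency, (times the outcome was observed) / (total observations). The observation list is made-up data used only for illustration.

```python
from collections import Counter

# Hypothetical record of 20 bus arrivals (invented for this example).
observations = ["on time"] * 17 + ["late"] * 3

counts = Counter(observations)
p_late = counts["late"] / len(observations)
print(p_late)  # 0.15
```

Unlike classical probability, nothing here assumes equally likely outcomes; the estimate depends entirely on what was observed.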

56
New cards

uses a probability value based on an educated guess or estimate, employing opinions and inexact information

subjective probability

57
New cards

The probability of an event B in relationship to an event A, defined as the probability that event B occurs given that event A has already occurred

conditional probability
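
A minimal Python sketch of the usual formula P(B | A) = P(A and B) / P(A), assuming two fair dice. The events chosen here (A: the first die is even, B: the sum is 8) are illustrative.

```python
from fractions import Fraction
from itertools import product

sample_space = list(product(range(1, 7), repeat=2))   # two fair dice

a = [o for o in sample_space if o[0] % 2 == 0]        # A: first die is even
a_and_b = [o for o in a if sum(o) == 8]               # A and B: sum is also 8

p_a = Fraction(len(a), len(sample_space))
p_a_and_b = Fraction(len(a_and_b), len(sample_space))
p_b_given_a = p_a_and_b / p_a
print(p_b_given_a)  # 1/6
```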

58
New cards

a variable whose values are determined by chance

random variable

59
New cards

Has a finite number of possible values or an infinite number of values that can be counted

discrete random variable

60
New cards

Variables that can assume all values in the interval between any two given values

continuous random variable

61
New cards

the ______ for a random variable describes how the probabilities are distributed over the values of the random variable

probability distribution

62
New cards

a ____ consists of the values a random variable can assume and the corresponding probabilities of the values

discrete probability distribution

63
New cards

The outcomes of a binomial experiment and the corresponding probabilities of these outcomes

binomial distribution
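
A minimal Python sketch of the binomial probability formula P(X = k) = C(n, k) p^k (1 - p)^(n - k), assuming n = 4 tosses of a fair coin (p = 0.5). The function name `binomial_pmf` is an illustrative choice.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Distribution of the number of heads in 4 fair coin tosses.
dist = {k: binomial_pmf(k, 4, 0.5) for k in range(5)}
print(dist)
```

The probabilities over k = 0 to n add up to 1, which is a quick check that the outcomes and their probabilities form a complete distribution.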

64
New cards

The ____ is often used as a model of the number of arrivals at a facility within a given period of time

Poisson probability distribution
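
A minimal Python sketch of the Poisson probability formula P(X = k) = λ^k e^(-λ) / k!, assuming an average of λ = 2 arrivals per period. The rate is an invented figure for illustration.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """Probability of exactly k arrivals when the average rate is lam."""
    return lam**k * exp(-lam) / factorial(k)

lam = 2.0                                   # hypothetical average arrivals per hour
p_none = poisson_pmf(0, lam)                # no arrivals in the hour
p_at_most_3 = sum(poisson_pmf(k, lam) for k in range(4))
print(p_none, p_at_most_3)
```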

65
New cards

A normal distribution with a mean of 0 and a standard deviation of 1

standard normal distribution

66
New cards

it is the number of standard deviations that a particular X value is away from the mean

Standard Score or Z Score
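
A minimal Python sketch of the standard score, z = (x - mean) / standard deviation. The exam numbers are made up for illustration.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / std_dev

# Hypothetical exam: mean 70, standard deviation 10, one score of 85.
z = z_score(85, 70, 10)
print(z)  # 1.5
```

A positive z means the value is above the mean; a negative z means it is below.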

67
New cards

it is a quantitative measure of the extent to which the deviation of one variable from its mean matches the deviation of the other from its mean

covariance
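
A minimal Python sketch of the sample covariance, the sum of (x - mean of x)(y - mean of y) divided by n - 1. The data are invented, and dividing by n - 1 (rather than n) assumes the values are a sample, not a whole population.

```python
def sample_covariance(xs, ys):
    """Average product of paired deviations from the means (n - 1 divisor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

hours = [1, 2, 3, 4, 5]        # hypothetical hours studied
scores = [2, 4, 6, 8, 10]      # hypothetical quiz scores
cov = sample_covariance(hours, scores)
print(cov)  # 5.0
```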

68
New cards

The ____ computed from sample data measures the strength and direction of a linear relationship between two variables.

correlation coefficient
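
A minimal Python sketch of the Pearson correlation coefficient, r = (sum of paired deviation products) / sqrt((sum of squared x deviations) × (sum of squared y deviations)). The data are invented, and the deck does not say which correlation coefficient it means, so Pearson's r is an assumption here.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Strength and direction of the linear relationship, between -1 and 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]            # hypothetical paired sample data
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(round(r, 4))  # 0.7746
```

Values near 1 or -1 indicate a strong linear relationship, and the sign gives its direction.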

69
New cards

5 Steps on How to Collect Data

  1. Determine what information you want to collect

  2. Set a timeframe for data collection

  3. Determine your data collection method

  4. Collect the Data

  5. Analyze the data & implement your findings

70
New cards

In ways of obtaining data, this involves the collection of data from already published text available in the public domain.

literature sources

71
New cards

In ways of obtaining data, this is another way of gathering data for research purposes.

surveys

72
New cards

In ways of obtaining data, it is a qualitative method of obtaining data whose results are based on intensive engagement with respondents about a particular study.

interviews

73
New cards

In ways of obtaining data, this is carried out by monitoring participants in a specific situation or environment at a given time and day.

observations

74
New cards

In ways of obtaining data, this is the process of examining existing documents and records of an organization for tracking changes over a period of time.

documents & records

75
New cards

In ways of obtaining data, data are mostly collected based on the cause and effect of the two variables being studied.

experiments

76
New cards

It is a type of interview in which the interviewer asks a particular set of predetermined questions.

structured interview

77
New cards

It is a type of interview in which the interviewer asks questions which are not prepared in advance.

unstructured interview

78
New cards

Also known as a one-on-one interview. It is a data collection method in which the interviewer directly communicates with the respondent in accordance with a prepared questionnaire.

face-to-face interview

79
New cards

Components of the complete & accurate data set

  1. Content: the data itself needs to be right

  2. Form: eliminates ambiguities about the content