DM01: Data Mining

50 Terms

1

Data mining

is the process of discovering patterns, relationships, and insights from large datasets.

2

Data mining

generally refers to the transformation of data into meaningful information for evidence-based decision making.

3

Data Mining Techniques

Classification and class probability estimation, Regression, Similarity matching, Clustering, Co-occurrence grouping, Profiling, Link Prediction, Data Reduction, Causal modeling

4

Classification and class probability estimation

attempts to predict, for each individual in a population, which of a (small) set of classes this individual belongs to
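
A minimal sketch of classification with class probability estimation, assuming a k-nearest-neighbors approach (one of many possible classifiers, not one named by the cards). The `customers` data and the labels "stays" and "leaves" are made up for illustration.

```python
import math
from collections import Counter


def knn_class_probabilities(train, query, k=3):
    """Estimate class probabilities as the share of the k nearest neighbors per class."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    counts = Counter(label for _, label in nearest)
    return {label: count / k for label, count in counts.items()}


def knn_classify(train, query, k=3):
    """Predict the class with the highest estimated probability."""
    probabilities = knn_class_probabilities(train, query, k)
    return max(probabilities, key=probabilities.get)


# Hypothetical labeled individuals: (features, class)
customers = [
    ((1.0, 1.0), "stays"),
    ((1.2, 0.8), "stays"),
    ((0.9, 1.1), "stays"),
    ((5.0, 5.2), "leaves"),
    ((5.1, 4.9), "leaves"),
    ((4.8, 5.0), "leaves"),
]

predicted = knn_classify(customers, (1.1, 1.0))
probabilities = knn_class_probabilities(customers, (1.1, 1.0))
```

The first function gives the probability estimate; the second turns it into a single predicted class.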

5

Regression (“Value Estimation”)

attempts to estimate or predict, for each individual, the numerical value of some variable for that individual
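
A minimal sketch of value estimation, assuming a straight-line model fitted by ordinary least squares. The `months` and `spend` values are invented for illustration.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: months as a customer vs. monthly spend
months = [1, 2, 3, 4]
spend = [3, 5, 7, 9]

slope, intercept = fit_line(months, spend)
estimate_for_month_5 = slope * 5 + intercept
```

Unlike classification, the output here is a number rather than a class label.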

6

Similarity matching

attempts to identify similar individuals based on data known about them
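
A minimal sketch of similarity matching, assuming cosine similarity as the measure (other distance or similarity measures would work too). The `profiles` dictionary is hypothetical purchase-count data.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(target, others):
    """Return the name in `others` whose vector is most similar to `target`."""
    return max(others, key=lambda name: cosine_similarity(target, others[name]))


# Hypothetical purchase counts per product category
profiles = {"ana": (5, 1, 0), "ben": (4, 2, 0), "cara": (0, 1, 5)}
candidates = {name: vec for name, vec in profiles.items() if name != "ana"}

match = most_similar(profiles["ana"], candidates)
```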

7

Clustering

attempts to group individuals in a population together by their similarity, but not driven by any specific purpose
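
A minimal sketch of clustering, assuming the k-means algorithm with fixed starting centroids so the result is reproducible. The points and starting centroids are made up; no labels are given to the algorithm.

```python
import math


def k_means(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and re-centering."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: math.dist(point, centroids[i]),
            )
            clusters[nearest].append(point)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty)
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster
            else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical unlabeled individuals described by two features
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = k_means(points, centroids=[(0, 0), (10, 10)])
```

The groups emerge from similarity alone, which is the contrast with classification.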

8

Co-occurrence grouping

also known as frequent itemset mining, association rule discovery, and market basket analysis

9

Co-occurrence grouping

attempts to find associations between entities based on transactions involving them
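
A minimal sketch of co-occurrence grouping, assuming we only count how often pairs of items appear in the same transaction (the "support" of each pair). The `baskets` list is invented market basket data.

```python
from collections import Counter
from itertools import combinations


def pair_support(transactions):
    """Fraction of transactions in which each pair of items appears together."""
    counts = Counter()
    for basket in transactions:
        # sorted(set(...)) gives each pair one canonical order and ignores duplicates
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return {pair: count / len(transactions) for pair, count in counts.items()}


# Hypothetical transactions
baskets = [
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk", "eggs"],
    ["bread", "butter", "eggs"],
]

support = pair_support(baskets)
```

Pairs with high support, such as bread and butter here, are candidates for association rules.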

10

Profiling

also known as behavior description

11

Profiling

attempts to characterize the typical behavior of an individual, group, or population.

12

Link Prediction

attempts to predict connections between data items, usually by suggesting that a link should exist, and possibly also estimating the strength of the link
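
A minimal sketch of link prediction, assuming a common-neighbors score: two unlinked items are suggested as a link if they share neighbors, and the number of shared neighbors stands in for link strength. The `friends` graph is hypothetical.

```python
def common_neighbor_scores(graph):
    """Score each unlinked pair of nodes by how many neighbors they share."""
    scores = {}
    nodes = sorted(graph)
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            if b in graph[a]:
                continue  # already linked, nothing to predict
            shared = graph[a] & graph[b]
            if shared:
                scores[(a, b)] = len(shared)
    return scores


# Hypothetical undirected friendship graph: node -> set of neighbors
friends = {
    "ana": {"ben", "cara"},
    "ben": {"ana", "cara", "dan"},
    "cara": {"ana", "ben", "dan"},
    "dan": {"ben", "cara"},
}

suggestions = common_neighbor_scores(friends)
```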

13

Link Prediction

is common in social networking systems

14

Data reduction

attempts to take a large set of data and replace it with a smaller set of data that contains much of the important information in the larger set
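
A minimal sketch of one simple form of data reduction, assuming we drop columns that barely vary and therefore carry little information. This is only one possible technique; the rows and the `threshold` value are illustrative assumptions.

```python
from statistics import pvariance


def drop_low_variance_columns(rows, threshold=0.01):
    """Keep only the columns whose population variance exceeds `threshold`."""
    columns = list(zip(*rows))
    keep = [i for i, column in enumerate(columns) if pvariance(column) > threshold]
    reduced = [tuple(row[i] for i in keep) for row in rows]
    return keep, reduced


# Hypothetical table: the first column is constant and adds nothing
rows = [
    (1.0, 5.0, 0.2),
    (1.0, 7.0, 0.9),
    (1.0, 6.0, 0.4),
]

kept_columns, reduced_rows = drop_low_variance_columns(rows)
```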

15

Causal modeling

attempts to help us understand what events or actions actually influence others

16

Business Understanding Stage

represents a part of the craft where the analysts’ creativity plays a large role

17

Business Understanding Stage

In this stage, the design team should think carefully about the use scenario.

18

Data Understanding

It is important to understand the strengths and limitations of the data. Check whether the data is appropriate for the goals established, and check that data taken from different sources have the same format.

19

costs and benefits

A critical part of the data understanding phase is estimating the ___ of each data source and deciding whether further investment is merited

20

Data Preparation

Involves cleaning the data and checking for outliers

21

Data Preparation

Typical examples of ___ are converting data to tabular format, removing or inferring missing values, and converting data to different types
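
A minimal sketch of two of these preparation steps, assuming records arrive as dictionaries of strings: a text field is converted to a number, and missing values are inferred from the mean of the known ones. The field names and records are hypothetical, and mean imputation is just one way to infer a missing value.

```python
from statistics import mean


def prepare(records, numeric_field):
    """Convert `numeric_field` to float and fill missing values with the mean."""
    parsed = []
    for record in records:
        raw = record.get(numeric_field)
        value = float(raw) if raw not in (None, "") else None
        parsed.append({**record, numeric_field: value})

    known = [r[numeric_field] for r in parsed if r[numeric_field] is not None]
    fill_value = mean(known)  # assumes at least one known value exists
    for r in parsed:
        if r[numeric_field] is None:
            r[numeric_field] = fill_value
    return parsed


# Hypothetical raw survey rows with ages stored as text
raw_rows = [
    {"name": "ana", "age": "30"},
    {"name": "ben", "age": ""},
    {"name": "cara", "age": "40"},
]

clean_rows = prepare(raw_rows, "age")
```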

22

Modeling

Determine mathematical models to establish patterns.

23

Evaluation

Determine if results of analysis are aligned with objectives

24

quantitative and qualitative assessments

Evaluating the results of data mining includes both ___

25

Deployment

Communicate the discoveries in a timely, well-written report to serve as input for decision making in the business operation

26

Deployment

___ can also be much more subtle, such as a change to data acquisition procedures, or a change to strategy, marketing, or operations resulting from insight gained from mining the data.

27

Primary source of data

Surveys, Interviews, Focus Group Discussions, Scientific Simulations, Social Experiments, Scientific Experiments

28

Secondary source of data

books, journal articles, research studies (even unpublished), verifiable news clippings (be careful with fake news), published corporate reports, laws, ordinances, government memos, etc.

29

Nominal, Ordinal, Scalar

Types of Data or Variables in Surveys

30

Nominal

Do not have any quantitative value. These data cannot be ordered or measured.

31

Nominal

[Types of Data] Examples: Sex (Male, Female); Marital Status (Single, Married, Widowed)

32

Ordinal

Have naturally ordered categories, but the distance between the categories cannot be determined.

33

Ordinal

[Types of Data] Examples: Ranking, Likert scale (1 Strongly Disagree to 5 Strongly Agree)

34

Scalar

continuous physical attribute or quantity that can be measured.

35

Scalar

[Types of Data] Examples: Age, weight, height, time

36

Qualitative, Quantitative, Mixed

Types of Analysis or Research Methods

37

Quantitative

gathering, manipulation, and interpretation of data taken from surveys or from secondary sources (e.g., financial reports).

38

Quantitative

involves financial and statistical analysis, pattern and trend recognition, forecasting, etc.

39

Qualitative

gathering, curating, and interpreting information taken from interviews, focus group discussions, and social experiments.

40

Qualitative

It deals with understanding the meaning of concepts and factors affecting human behavior

41

Mixed Methods

A mix of both qualitative and quantitative methods, used to verify, validate, and triangulate the data gathered.

42

Sequential, Parallel

Types of mixed methods

43

Sequential

The methods are done one after the other:

1) Qualitative interviews (pre-test the questionnaire to validate the coherence of the questions in the survey);

2) Quantitative analysis of survey data;

3) Qualitative KII (key informant interviews) and FGD (focus group discussions) to get the story behind the numbers or results of the survey

44

Parallel

Qualitative and quantitative data gathering and analysis are done independently.
