lecture 7


These flashcards cover key concepts and definitions related to data mining in eHealth as introduced in the lecture.


15 Terms

1

What is Data Mining?

The process of discovering patterns in data, typically in order to extract useful and meaningful information.

2

What are the four main steps in the KDD (Knowledge Discovery in Databases) process?

1. Data Collection

2. Preprocessing

3. Data Mining Techniques

4. Evaluation and Interpretation

3

What is attribute selection, in theory and in practice?

Attribute selection is the process of selecting the most relevant variables for model building.

In theory: Having more attributes should result in more accurate patterns.

In practice: Irrelevant attributes may "confuse" data mining algorithms and make the resulting patterns less accurate.
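
A minimal sketch of how an irrelevant attribute can "confuse" an algorithm. It assumes a 1-nearest-neighbour classifier and made-up records (neither is from the lecture); `TRAIN`, `nearest_class` and `use_attrs` are hypothetical names.

```python
import math

# Hypothetical toy records: (relevant attribute, irrelevant attribute, class).
# The first attribute separates the classes; the second is unrelated noise
# measured on a much larger scale.
TRAIN = [
    (1.0, 10.0, "low"),
    (2.0, 90.0, "low"),
    (8.0, 50.0, "high"),
    (9.0, 30.0, "high"),
]

def nearest_class(query, train, use_attrs):
    """Return the class of the nearest training record (1-nearest-neighbour),
    using Euclidean distance over the attribute indices in use_attrs."""
    def dist(row):
        return math.sqrt(sum((row[i] - query[i]) ** 2 for i in use_attrs))
    return min(train, key=dist)[2]

query = (2.5, 48.0)  # the relevant value 2.5 sits in the "low" range

with_all = nearest_class(query, TRAIN, use_attrs=[0, 1])   # noise dominates the distance
with_selected = nearest_class(query, TRAIN, use_attrs=[0])  # only the relevant attribute

print(with_all, with_selected)
```

With both attributes the noisy one dominates the distance and the query is labelled "high"; after selecting only the relevant attribute it is labelled "low".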

4

What is attribute construction?

This involves creating new attributes from existing ones to make regularities more apparent.
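
A small illustration, assuming a patient record with height and weight: body mass index (weight in kg divided by height in metres squared) is a new attribute built from two existing ones. The field names and `add_bmi` are hypothetical.

```python
def add_bmi(record):
    """Return a copy of the record with a constructed 'bmi' attribute."""
    height_m = record["height_cm"] / 100
    bmi = record["weight_kg"] / (height_m ** 2)
    return {**record, "bmi": round(bmi, 1)}

# Hypothetical patient record.
patient = {"height_cm": 180, "weight_kg": 81}
enriched = add_bmi(patient)

print(enriched["bmi"])
```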

5

What is clustering in data mining, and what are some example applications?

The process of identifying groups where data points within a cluster are similar while those across clusters are dissimilar.


Applications:

  • Market basket analysis (items bought together)

  • Healthcare: identifying patients with similar treatment needs
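
One way to make the idea concrete is a tiny k-means sketch on one-dimensional data. k-means is just one clustering algorithm, chosen here as an assumption; the visit counts, the starting centroids and `kmeans_1d` are hypothetical.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Minimal 1-D k-means: assign each value to its nearest centroid,
    then move each centroid to the mean of its assigned values."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

# Hypothetical number of clinic visits per patient.
visits = [1, 2, 1, 3, 9, 10, 8, 11]
centroids, clusters = kmeans_1d(visits, centroids=[0.0, 5.0])

print(clusters)
```

Patients with few visits end up in one group and frequent visitors in the other: similar within a cluster, dissimilar across clusters.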

6

What is regression in data mining?

A model that maps a given input to a continuous numerical value, rather than to a class label.
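
As a sketch, assuming the simplest case of a straight-line fit by least squares (the lecture may cover other regression models), with made-up age and blood pressure values; `fit_line` is a hypothetical helper.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: age in years vs systolic blood pressure.
ages = [30, 40, 50, 60]
pressures = [115, 120, 125, 130]

slope, intercept = fit_line(ages, pressures)
predicted = slope * 45 + intercept  # numerical prediction for a 45-year-old

print(slope, intercept, predicted)
```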

7

What is unsupervised learning?

A type of learning that works with unlabeled data, seeking to identify patterns or groupings.

8

What is classification in data mining?

A model that can predict the value of a class attribute based on the values of a set of other attributes.

9

What is the difference between training sets and test sets in classification?

  • Training set: Examples with known classes, used to build the model

  • Test set: Examples whose classes are treated as unknown to the model, used to predict classes and check how well the model performs
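
The two roles can be sketched as follows, assuming a deliberately simple threshold "model" and made-up temperature readings; `build_model`, `predict` and the fixed 6/2 split are hypothetical choices.

```python
# Hypothetical labelled examples: (body temperature in degrees Celsius, class).
examples = [
    (36.5, "healthy"), (36.8, "healthy"), (37.0, "healthy"), (38.5, "fever"),
    (39.0, "fever"), (38.2, "fever"), (36.9, "healthy"), (38.8, "fever"),
]

# Simple fixed split: the first six examples train the model, the last two test it.
train, test = examples[:6], examples[6:]

def build_model(training_set):
    """'Learn' a threshold halfway between the mean temperatures of the two classes."""
    healthy = [t for t, c in training_set if c == "healthy"]
    fever = [t for t, c in training_set if c == "fever"]
    return (sum(healthy) / len(healthy) + sum(fever) / len(fever)) / 2

def predict(threshold, temperature):
    return "fever" if temperature > threshold else "healthy"

threshold = build_model(train)

# The test-set classes are hidden from the model and used only to score its predictions.
correct = sum(predict(threshold, t) == c for t, c in test)
accuracy = correct / len(test)

print(threshold, accuracy)
```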

10

What type of data are ANNs particularly well-suited for in healthcare?

Noisy, complex sensor data often recorded in healthcare contexts.

11

What is the goal of Support Vector Machines?

To find the maximum margin hyperplane (the plane that gives the greatest separation between classes).
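
To illustrate what "maximum margin" means, the sketch below compares a few candidate separating lines on made-up 2-D points and picks the one with the largest margin. It does not train an SVM; it only evaluates hand-picked candidates, and all names and values are hypothetical.

```python
import math

# Hypothetical 2-D points with classes -1 and +1.
points = [((1.0, 1.0), -1), ((2.0, 1.0), -1), ((4.0, 3.0), +1), ((5.0, 3.0), +1)]

def margin(w, b, data):
    """Distance from the line w.x + b = 0 to the closest point.
    The value is positive only if every point lies on its correct side."""
    norm = math.hypot(*w)
    return min(y * (w[0] * x[0] + w[1] * x[1] + b) / norm for x, y in data)

# Candidate separating lines, written as (w, b).
candidates = {
    "x = 2.5": ((1.0, 0.0), -2.5),
    "x = 3": ((1.0, 0.0), -3.0),
    "x + y = 5": ((1.0, 1.0), -5.0),
}

margins = {name: margin(w, b, points) for name, (w, b) in candidates.items()}
best = max(margins, key=margins.get)

print(margins, best)
```

All three lines separate the classes, but "x + y = 5" leaves the widest gap to the nearest points, which is the property an SVM optimises for.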

12

How is a decision tree traversed to classify an example?

  1. Start at the root node and traverse down the tree

  2. At each internal node, an attribute test is performed

  3. Based on the test outcome, follow the appropriate branch

  4. Continue until reaching a leaf node, which provides the predicted class
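
The traversal steps above can be sketched as a short loop. The tree, its attributes and its classes are made up for illustration, and the nested-dictionary layout is just one possible representation.

```python
# Hypothetical tree: internal nodes test one attribute, leaves hold a class.
tree = {
    "attribute": "temperature",
    "test": lambda v: "high" if v >= 38.0 else "normal",
    "branches": {
        "high": {"leaf": "see doctor"},
        "normal": {
            "attribute": "cough",
            "test": lambda v: "yes" if v else "no",
            "branches": {
                "yes": {"leaf": "rest at home"},
                "no": {"leaf": "healthy"},
            },
        },
    },
}

def classify(node, example):
    """Walk from the root to a leaf and return the leaf's class."""
    while "leaf" not in node:                      # still at an internal node
        outcome = node["test"](example[node["attribute"]])  # attribute test
        node = node["branches"][outcome]           # follow the matching branch
    return node["leaf"]                            # leaf gives the predicted class

print(classify(tree, {"temperature": 37.0, "cough": True}))
```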

13

Why might decision trees be preferred over neural networks in some healthcare applications?

Decision trees are white-box models: you can see how the model arrived at its prediction and which decisions it took along the way, which makes them easier to interpret than neural networks.

14

What is an example of attribute maintenance in healthcare data?

Using date of birth instead of age for better data management: a date of birth never changes, whereas a stored age goes out of date and has to be updated.
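
A short sketch of why this works: when only the date of birth is stored, the age can be derived for any date whenever it is needed. `age_on` and the example dates are hypothetical.

```python
from datetime import date

def age_on(date_of_birth, on_date):
    """Age in whole years on a given date, derived from the date of birth."""
    years = on_date.year - date_of_birth.year
    # Subtract one year if the birthday has not yet occurred in that year.
    if (on_date.month, on_date.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years

dob = date(1990, 6, 15)  # hypothetical patient
age_before_birthday = age_on(dob, date(2024, 6, 14))
age_on_birthday = age_on(dob, date(2024, 6, 15))

print(age_before_birthday, age_on_birthday)
```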

15

When should you use an SVM?

SVMs work best on problems with a clear separation between classes.