These flashcards cover key concepts and definitions related to data mining in eHealth as introduced in the lecture.
What is Data Mining?
The process of discovering patterns in data, typically involving extracting useful and meaningful information.
What are the four main steps in the KDD process?
1. Data Collection
2. Preprocessing
3. Data Mining Techniques
4. Evaluation and Interpretation
What is attribute selection, in theory and in practice?
Attribute selection is the process of selecting the most relevant variables for model building.
In theory: Having more attributes should result in more accurate patterns
In practice: Irrelevant attributes may "confuse" data mining algorithms
What is attribute construction?
This involves creating new attributes from existing ones to make regularities more apparent.
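As an illustration of attribute construction, the sketch below derives a body mass index from weight and height. The field names (`weight_kg`, `height_cm`, `bmi`) and the sample values are hypothetical and not taken from the lecture.

```python
def add_bmi(record):
    """Return a copy of the record with a constructed 'bmi' attribute."""
    height_m = record["height_cm"] / 100
    new_record = dict(record)  # leave the original record untouched
    new_record["bmi"] = round(record["weight_kg"] / height_m ** 2, 1)
    return new_record

# Hypothetical patient record
patient = {"weight_kg": 80.0, "height_cm": 200.0}
patient_with_bmi = add_bmi(patient)
```

A single constructed attribute like this can expose a regularity that is hard to see from weight and height taken separately.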
What is clustering in data mining, and what are some applications?
The process of identifying groups where data points within a cluster are similar while those across clusters are dissimilar.
Market basket analysis (items bought together)
Healthcare: identifying patients with similar treatment needs
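The card does not name a clustering algorithm. As one possible illustration, here is a minimal one-dimensional k-means sketch; the data (yearly clinic visits per patient) and the starting centroids are made up for the example.

```python
def k_means_1d(values, centroids, iterations=10):
    """Group numbers around the nearest centroid, then re-centre; repeat."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for v in values:
            # Assign each value to the index of its closest centroid
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty)
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return clusters, centroids

# Hypothetical data: clinic visits per patient in one year
visits = [1, 2, 2, 3, 11, 12, 14]
clusters, centroids = k_means_1d(visits, centroids=[1.0, 14.0])
```

With this data the patients fall into a low-visit group and a high-visit group, which mirrors the idea of finding patients with similar treatment needs.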
What is regression in data mining?
A model that maps a given input to a numerical value
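A minimal sketch of such a model, assuming simple least-squares linear regression with a single input (the card does not specify a method). The data points are invented for illustration.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data lying exactly on y = 2x + 1
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
predicted = slope * 5 + intercept  # numerical output for a new input
```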
Unsupervised Learning
A type of learning that works with unlabeled data, seeking to identify patterns or groupings.
What is classification in data mining?
A model that predicts the value of a class attribute based on the values of the other attributes.
What is the difference between training sets and test sets in classification?
Training set: Known-class examples used to build a model
Test set: Examples whose class is treated as unknown; the model predicts their classes
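A minimal sketch of how the two sets are commonly used together. It assumes a toy nearest-neighbour classifier and invented heart-rate data; neither comes from the lecture. The test labels are hidden from the model and used only afterwards to check its predictions.

```python
# Hypothetical labelled examples: (resting heart rate, class)
training_set = [(58, "normal"), (64, "normal"), (72, "normal"),
                (104, "elevated"), (110, "elevated")]
test_set = [(61, "normal"), (108, "elevated"), (85, "elevated")]

def predict(heart_rate):
    """Return the class of the closest training example (1-nearest neighbour)."""
    nearest = min(training_set, key=lambda example: abs(example[0] - heart_rate))
    return nearest[1]

# The model sees only the inputs of the test set, never the labels
predictions = [predict(x) for x, _ in test_set]

# The withheld labels are compared with the predictions afterwards
correct = sum(p == actual for p, (_, actual) in zip(predictions, test_set))
accuracy = correct / len(test_set)
```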
What type of data are ANNs particularly well-suited for in healthcare?
Noisy, complex sensor data often recorded in healthcare contexts.
What is the goal of Support Vector Machines?
To find the maximum margin hyperplane (the plane that gives the greatest separation between classes).
How is a decision tree traversed to classify an example?
Start at the root node and traverse down the tree
At each internal node, an attribute test is performed
Based on the test outcome, follow the appropriate branch
Continue until reaching a leaf node, which provides the predicted class
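The traversal steps above can be sketched as a short loop. The tree, its attribute names, and its thresholds are hypothetical and chosen only to make the example readable.

```python
# Hypothetical toy tree: internal nodes test one attribute, leaves hold a class
tree = {
    "attribute": "temperature_c",
    "threshold": 38.0,
    "below": {"label": "no fever"},
    "at_or_above": {
        "attribute": "heart_rate",
        "threshold": 100,
        "below": {"label": "fever"},
        "at_or_above": {"label": "fever with tachycardia"},
    },
}

def classify(node, example):
    """Walk from the root to a leaf and return the leaf's class."""
    while "label" not in node:                 # still at an internal node
        value = example[node["attribute"]]     # perform the attribute test
        branch = "below" if value < node["threshold"] else "at_or_above"
        node = node[branch]                    # follow the matching branch
    return node["label"]                       # leaf node: predicted class

prediction = classify(tree, {"temperature_c": 38.6, "heart_rate": 88})
```

Each step in the loop is a readable test on one attribute, so the path from root to leaf can be read back as the explanation for a prediction.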
Why might decision trees be preferred over neural networks in some healthcare applications?
Decision trees are white-box models: you can see how the model arrived at its output and which decisions it took along the way, which makes them easier to interpret.
What is an example of attribute maintenance in healthcare data?
Using date of birth instead of age for better data management.
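A stored age goes stale as time passes, whereas a date of birth lets the age be computed whenever it is needed. A minimal sketch, with a hypothetical helper name `age_on`:

```python
from datetime import date

def age_on(date_of_birth, on_date):
    """Return the age in whole years on a given date."""
    # Has the birthday already occurred in the year of on_date?
    had_birthday = (on_date.month, on_date.day) >= (date_of_birth.month, date_of_birth.day)
    return on_date.year - date_of_birth.year - (0 if had_birthday else 1)

# Hypothetical patient born on 15 June 1990
age_before = age_on(date(1990, 6, 15), date(2024, 6, 14))
age_after = age_on(date(1990, 6, 15), date(2024, 6, 15))
```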
When should you use an SVM?
Best for problems with clear separation between classes