COMP8160 eHealth Complete Flashcards


Flashcards generated from lecture notes.


62 Terms

New cards

What is data in the context of healthcare?

Raw facts or measurements without context. Examples: blood pressure reading 180/110, temperature 38.5°C.

New cards

What is information in the context of healthcare?

Data with meaning and context. Example: Blood pressure 180/110 is classified as "high blood pressure."

New cards

What is knowledge in the context of healthcare?

Understanding what information means and what actions to take. Example: High blood pressure increases heart attack risk and requires treatment.

New cards

Identify data, information, and knowledge: A patient's cholesterol level is 250 mg/dL. The doctor says this is high and recommends statins.

Data: Cholesterol level of 250 mg/dL. Information: This level is classified as "high." Knowledge: High cholesterol needs treatment with statins to reduce heart disease risk.

New cards

What is KDD (Knowledge Discovery in Databases)?

The process of discovering useful knowledge from data through 4 steps:

1) Collect data,

2) Clean and prepare it,

3) Find patterns,

4) Turn patterns into useful knowledge.

New cards

What are the main types of medical data?

Six main types:

1. Narrative (clinical notes)

2. Structured text (standardized forms)

3. Numerical measurements (lab values)

4. Signal data (ECG, EEG)

5. Images (X-rays, MRIs)

6. Genetic information

New cards

What are the four types of measurement scales used in healthcare data?

Nominal scales: Categories with no order (e.g., blood types)
Ordinal scales: Categories with a meaningful order (e.g., pain scale 1-10)
Interval scales: Equal intervals but no true zero (e.g., temperature in Celsius)
Ratio scales: Equal intervals with a meaningful zero (e.g., weight, height)

New cards

What is PPG (Photoplethysmography)?

Technology that measures blood volume changes in vessels using light to determine heart rate and other cardiovascular metrics.

New cards

Explain how PPG sensors measure heart rate.

1) LED light shines onto skin

2) Blood absorbs light proportionally to blood volume

3) Photodetector measures reflected light

4) Changes in light intensity (more absorption during heartbeats, less between beats) are processed to calculate heart rate

New cards

How does an Apple Watch use PPG to measure heart rate?

It uses green LED lights and photodetectors to detect blood flow variations in the wrist, applying signal processing algorithms to calculate heart rate from the pattern of light absorption.

New cards

What are the advantages and limitations of PPG-based devices?

Advantages: Inexpensive, low power consumption, portable, convenient to wear.

Limitations: Sensitive to motion artifacts, less precise than ECG for detailed analysis.

New cards

What are the limitations of green light PPG?

Green light is absorbed by skin (weakening the signal), is affected by skin tone (melanin absorbs green light), and cannot reach deeper tissue because hemoglobin strongly absorbs it.

New cards

What are the differences between green light and infrared light PPG sensors?

Green light PPG sensors have better signal-to-noise ratio and resistance to motion artifacts but are affected by skin tone and can't penetrate deep tissue. Infrared light PPG sensors penetrate 10x deeper into tissues, are less affected by skin characteristics (melanin, tattoos), but require more advanced signal processing to filter motion noise.

New cards

What advantages do infrared light PPG sensors offer?

Infrared light can penetrate much deeper into tissue, is less affected by skin tone variations, and can measure additional biometrics beyond heart rate (such as oxygen saturation, hydration, and muscle oxygen).

New cards

Why is green light often used in wrist-based PPG sensors?

Green light provides good signal-to-noise ratio and resistance to motion artifacts, making it effective for wrist-based measurements despite being less penetrating than infrared.

New cards

What is Pulse Wave Velocity (PWV)?

The speed at which the pressure wave moves along an artery. It can be calculated with the formula PWV = Distance/Time delay between pulse waves.
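As a quick illustration of the formula above, here is a minimal sketch. The function name and the sample values (two measurement sites 0.5 m apart, a 0.0625 s delay) are made up for the example and do not come from the lecture notes.

```python
def pulse_wave_velocity(distance_m: float, delay_s: float) -> float:
    """Return PWV in m/s: distance between two sites divided by the pulse delay."""
    if delay_s <= 0:
        raise ValueError("time delay must be positive")
    return distance_m / delay_s


# Hypothetical values: sites 0.5 m apart, pulse arrives 0.0625 s later.
pwv = pulse_wave_velocity(0.5, 0.0625)
print(pwv)  # 8.0 (m/s)
```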

New cards

What is Heart Rate Variability (HRV)?

The variation in the time intervals between successive heartbeats (the inter-beat intervals), measured from the differences between consecutive intervals.

New cards

How does HRV relate to stress levels?

Lower variation between heartbeats generally indicates higher stress; higher variation generally indicates relaxation.

New cards

What are the key HRV metrics used for stress monitoring?

AVNN (average of inter-beat intervals), SDNN (standard deviation of intervals), RMSSD (root mean square of successive differences).
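The three metrics can be computed from a list of inter-beat intervals as in the sketch below. The interval values are invented for illustration, and SDNN is computed here as the sample standard deviation (dividing by n - 1), which is an assumption; some tools divide by n instead.

```python
import math


def hrv_metrics(ibi_ms):
    """Return (AVNN, SDNN, RMSSD) for inter-beat intervals given in milliseconds."""
    n = len(ibi_ms)
    avnn = sum(ibi_ms) / n
    # Sample standard deviation of the intervals (assumption: n - 1 denominator).
    sdnn = math.sqrt(sum((x - avnn) ** 2 for x in ibi_ms) / (n - 1))
    # Differences between each interval and the one before it.
    diffs = [b - a for a, b in zip(ibi_ms, ibi_ms[1:])]
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return avnn, sdnn, rmssd


# Hypothetical inter-beat intervals in milliseconds.
avnn, sdnn, rmssd = hrv_metrics([800, 810, 790, 820, 780])
print(f"AVNN={avnn:.1f} ms, SDNN={sdnn:.2f} ms, RMSSD={rmssd:.2f} ms")
```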

New cards

What is the difference between supervised and unsupervised learning?

Supervised learning uses labeled data with known answers, while unsupervised learning finds patterns in unlabeled data without predefined categories.

New cards

What is descriptive data mining?

Finding patterns in existing data without making predictions. Examples: clustering and association rule mining.

New cards

What is predictive data mining?

Using historical data patterns to make predictions about future outcomes. Examples: classification and regression.

New cards

What is Classification in data mining?

Assigning items to predefined categories based on patterns learned from labeled training data.

New cards

What is Regression in data mining?

Finding a model (function) that maps a given input (attribute values) to a numeric prediction.

New cards

What are some applications of regression in healthcare?

Predicting patient recovery time, forecasting blood glucose levels, estimating hospital stay length, and calculating optimal drug dosages.

New cards

What is attribute selection in data mining?

Choosing only the useful features from data and ignoring irrelevant ones to improve model performance and reduce complexity.

New cards

What is attribute construction in data mining?

Creating new, more useful features from existing ones to help find patterns that aren't visible in the original data.

New cards

What are common preprocessing steps for healthcare data?

Filling in missing values, standardizing values to comparable scales, converting categories to numbers, removing outliers, and balancing class distributions.

New cards

How does class imbalance affect classification in healthcare?

With rare diseases, a model could achieve high accuracy by always predicting "healthy," making accuracy a misleading metric for performance.
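A small sketch of why accuracy misleads here. The dataset is hypothetical (10 diseased patients out of 1000), and the "model" simply predicts healthy for everyone.

```python
# Hypothetical dataset: 1 = diseased, 0 = healthy (1% prevalence).
labels = [1] * 10 + [0] * 990

# A trivial "model" that always predicts healthy.
predictions = [0] * len(labels)

correct = sum(p == y for p, y in zip(predictions, labels))
accuracy = correct / len(labels)

# Sensitivity: share of diseased patients the model actually catches.
detected = sum(p == 1 and y == 1 for p, y in zip(predictions, labels))
sensitivity = detected / sum(labels)

print(accuracy)     # 0.99
print(sensitivity)  # 0.0
```

The model scores 99% accuracy while detecting none of the diseased patients.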

New cards

What is clustering in data mining?

Grouping similar data objects together while keeping dissimilar objects in different groups, without using predefined categories.

New cards

How does K-means clustering work?

A 4-step process:

1. Select k initial centroids

2. Assign each point to the nearest centroid

3. Recalculate centroids as the means of their assigned points

4. Repeat steps 2-3 until convergence (groups stop changing)
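The four steps can be sketched in plain Python as follows. This is an illustrative version, not a reference implementation: it takes the first k points as the initial centroids (an assumption made so the result is deterministic; random initialization is more common), and the sample points are made up.

```python
import math


def kmeans(points, k, max_iter=100):
    """Minimal K-means: returns (centroids, assignment) for a list of tuples."""
    # Step 1: pick k initial centroids (assumption: simply the first k points).
    centroids = [tuple(p) for p in points[:k]]
    assignment = None
    for _ in range(max_iter):
        # Step 2: assign each point to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Step 4: stop when the groups no longer change.
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # Step 3: recompute each centroid as the mean of its assigned points.
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, assignment


# Hypothetical 2-D points forming two obvious groups.
points = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (2.0, 1.0), (8.0, 9.0), (9.0, 8.0)]
centroids, assignment = kmeans(points, k=2)
print(assignment)  # [0, 1, 0, 0, 1, 1]
```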

New cards

What is the objective function that K-means tries to minimize?

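The sum of squared distances between each point and its cluster center: J(V) = ∑ⱼ ∑ᵢ ||xᵢ - μⱼ||², where the outer sum runs over the k clusters, the inner sum runs over the points xᵢ assigned to cluster j, and μⱼ is the center of cluster j.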

New cards

How do you calculate the distance between a point and a centroid in K-means?

Use Euclidean distance:

1. Find the difference for each feature

2. Square each difference

3. Add all squared differences

4. Take the square root of the total
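The same four steps, written out one per line. The function name and the sample point and centroid are chosen only for illustration.

```python
import math


def euclidean_distance(point, centroid):
    """Distance between a point and a centroid with the same number of features."""
    diffs = [a - b for a, b in zip(point, centroid)]  # 1. difference per feature
    squared = [d ** 2 for d in diffs]                 # 2. square each difference
    total = sum(squared)                              # 3. add the squared differences
    return math.sqrt(total)                           # 4. square root of the total


# Hypothetical point (1, 2) and centroid (4, 6): differences -3 and -4.
d = euclidean_distance((1.0, 2.0), (4.0, 6.0))
print(d)  # 5.0
```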

New cards

What is the difference between K-means and Partitioning Around Medoids (PAM)?

K-means uses calculated means as centroids which may not be actual data points; PAM uses existing data points (medoids) as centers.

New cards

What are "black-box" models in data mining?

Models that provide predictions without explaining their reasoning process (e.g., neural networks, SVMs).

New cards

What are "white-box" models in data mining?

Models that provide interpretable decision processes (e.g., decision trees, rule-based systems).

New cards

Why is model interpretability particularly important in healthcare applications?

Medical professionals need to understand and validate the reasoning behind predictions for patient safety, trust, and regulatory compliance.

New cards

What is a Confusion Matrix?

A table showing prediction performance with:
• True Positives (TP): Correctly predicted "Yes"
• False Positives (FP): Wrongly predicted "Yes"
• True Negatives (TN): Correctly predicted "No"
• False Negatives (FN): Wrongly predicted "No"

New cards

How do you calculate Accuracy from a confusion matrix?

Accuracy = (TP + TN) / (TP + TN + FP + FN)

New cards

How do you calculate Sensitivity (True Positive Rate)?

Sensitivity = TP / (TP + FN)

New cards

How do you calculate Specificity (True Negative Rate)?

Specificity = TN / (TN + FP)
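The accuracy, sensitivity, and specificity formulas can be checked together with a short sketch. The counts below are invented for the example.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Return accuracy, sensitivity and specificity from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Hypothetical counts: 50 diseased patients, 100 healthy patients.
metrics = confusion_metrics(tp=40, fp=10, tn=90, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these counts the sketch gives accuracy ≈ 0.867, sensitivity 0.8, and specificity 0.9.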

New cards

Why is accuracy alone insufficient for healthcare applications?

Different types of errors have different consequences. Missing a disease (false negative) can be fatal, while a false alarm (false positive) causes anxiety and unnecessary tests.

New cards

When should you prioritize sensitivity in medical tests?

When missing the disease is very dangerous and treatment is safe and effective. For example, screening for treatable cancers.

New cards

When should you prioritize specificity in medical tests?

When the treatment is risky, expensive, or has serious side effects.

New cards

If you increase a test's threshold, what happens to sensitivity and specificity?

Raising the threshold:
• Decreases sensitivity (more false negatives)
• Increases specificity (fewer false positives)
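A small sketch of this trade-off, assuming that a higher test score means "more likely diseased" and that a patient is called positive when the score is at or above the threshold. The scores and labels are invented.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Classify as positive when score >= threshold, then return (sens, spec)."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(p == 1 and y == 1 for p, y in zip(preds, labels))
    fn = sum(p == 0 and y == 1 for p, y in zip(preds, labels))
    tn = sum(p == 0 and y == 0 for p, y in zip(preds, labels))
    fp = sum(p == 1 and y == 0 for p, y in zip(preds, labels))
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical test scores and true labels (1 = diseased).
scores = [0.1, 0.3, 0.4, 0.6, 0.7, 0.9]
labels = [0, 0, 1, 0, 1, 1]

low = sensitivity_specificity(scores, labels, threshold=0.35)
high = sensitivity_specificity(scores, labels, threshold=0.65)
print(low)   # sensitivity 1.0, specificity about 0.67
print(high)  # sensitivity about 0.67, specificity 1.0
```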

New cards

What is Cross-Validation?

A technique to evaluate model performance by partitioning data into multiple training and testing subsets.

New cards

How does k-fold cross-validation work?

Split data into k equal parts. Use k-1 parts for training and 1 part for testing. Repeat k times using a different part for testing each time. Average the results.
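The splitting procedure can be sketched as below. This version works on example indices only and does not shuffle or stratify (an assumption made for brevity; in practice the data is usually shuffled first). The function name is made up for the example.

```python
def k_fold_indices(n, k):
    """Yield (train_indices, test_indices) for each of the k rounds."""
    fold_size = n // k
    indices = list(range(n))
    for i in range(k):
        start = i * fold_size
        # The last fold absorbs any remainder when n is not divisible by k.
        stop = start + fold_size if i < k - 1 else n
        test = indices[start:stop]
        train = indices[:start] + indices[stop:]
        yield train, test


# 10-fold cross-validation on 1000 examples.
splits = list(k_fold_indices(1000, 10))
train, test = splits[0]
print(len(splits), len(train), len(test))  # 10 900 100
```

A model would be trained and scored once per split, and the k scores averaged.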

New cards

What is stratification in cross-validation?

Ensuring each fold has the same mix of classes as the full dataset, preventing bias from uneven distribution of classes.

New cards

In 10-fold cross-validation with 1000 examples, how many examples are used for training in each round?

900 examples (90%) for training, 100 examples (10%) for testing in each round.

New cards

In 5-fold cross-validation with 5000 examples, how many examples are used for testing in each round?

1000 examples (20%) are used for testing in each round.

New cards

What is Leave-One-Out Cross-Validation (LOOCV)?

A special case of k-fold cross-validation where k equals the number of examples, so each example is used once as a test case.

New cards

When is Leave-One-Out Cross-Validation most appropriate?

When working with small datasets where maximizing the amount of training data is crucial.

New cards

For a dataset with 9,000 MRI images, if using 10-fold cross-validation, how many images would be used for training in each fold?

8,100 images (90% of 9,000) would be used for training in each fold.

New cards

What is Bayes' Theorem for medical testing?

P(disease|positive test) = [sensitivity × P(disease)] / [sensitivity × P(disease) + (1-specificity) × P(no disease)]

New cards

What is prior probability in diagnostic testing?

The estimated probability of disease before testing (often the disease prevalence in the population).

New cards

What is posterior probability in diagnostic testing?

The updated probability of disease after incorporating test results.

New cards

Given a test with 95% sensitivity and 90% specificity for a disease with 1% prevalence, what is the probability of having the disease after a positive test?

P(D|T+) = (0.95×0.01)/[(0.95×0.01)+(0.1×0.99)] = 0.0095/0.1085 ≈ 8.76%
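The same calculation as a short sketch, using the sensitivity, specificity, and prevalence from the card above. The function name is chosen for illustration.

```python
def posterior_given_positive(sensitivity, specificity, prevalence):
    """P(disease | positive test) via Bayes' theorem."""
    true_pos = sensitivity * prevalence               # diseased and test positive
    false_pos = (1 - specificity) * (1 - prevalence)  # healthy but test positive
    return true_pos / (true_pos + false_pos)


p = posterior_given_positive(sensitivity=0.95, specificity=0.90, prevalence=0.01)
print(f"{p:.2%}")  # 8.76%
```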

New cards

Why does disease prevalence significantly impact the predictive value of tests?

In low-prevalence populations, even highly specific tests will generate many false positives relative to true positives, reducing positive predictive value.

New cards

How is a workflow formally defined?

WF = (T, P, C, A, S₀) where:
• T is the task set
• P is the precedence matrix
• C is the conflict matrix
• A is the precondition set
• S₀ is the initial state

New cards

What is a Precedence Matrix in workflow modeling?

A matrix P = (pᵢⱼ)ₘₓₘ where pᵢⱼ = 1 indicates that task j must be completed before task i can start.
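One way to read a precedence matrix is sketched below: a task can start once every task marked as its predecessor is done. The three-task matrix, the 0-based indexing, and the helper name are all assumptions made for the example; the notes do not give a concrete workflow.

```python
def startable_tasks(P, done):
    """Return tasks not yet done whose predecessors are all in `done`.

    P[i][j] == 1 means task j must be completed before task i can start.
    """
    m = len(P)
    return [
        i
        for i in range(m)
        if i not in done
        and all(j in done for j in range(m) if P[i][j] == 1)
    ]


# Hypothetical workflow: task 1 needs task 0; task 2 needs tasks 0 and 1.
P = [
    [0, 0, 0],
    [1, 0, 0],
    [1, 1, 0],
]

print(startable_tasks(P, set()))   # [0]
print(startable_tasks(P, {0}))     # [1]
print(startable_tasks(P, {0, 1}))  # [2]
```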

New cards

What is a Conflict Matrix in workflow modeling?

A matrix C = (cᵢⱼ)ₘₓₘ where cᵢⱼ = 1 indicates tasks i and j cannot be performed simultaneously.

New cards

What are the four possible states of a task in workflow modeling?

S(Tᵢ) = 0: Not executable, not executed
S(Tᵢ) = 1: Executable, not executed
S(Tᵢ) = 2: Not executable, executed
S(Tᵢ) = 3: Executable, executed