Vocabulary flashcards covering core terms and concepts from the lecture on SYFLOW, including its differentiable rule-learning mechanism, use of normalizing flows, and measures of subgroup exceptionality.
SYFLOW
A neuro-symbolic method that learns exceptional subgroups by jointly optimizing subgroup rules and target-distribution models end-to-end with KL-divergence.
Subgroup Discovery
A data-mining task that finds population subsets whose target variable behaves exceptionally compared with the whole data and returns human-readable rules.
Subgroup
A sub-population of samples selected by a rule; membership is denoted S=1 when the rule's conditions hold and S=0 otherwise.
Predicate π(xi)
A Boolean test on one feature that returns 1 inside a chosen interval and 0 otherwise; building block of a rule.
Soft Predicate
A differentiable relaxation π̂(xi;α,β,t)∈[0,1] that smoothly approximates an interval test, controlled by temperature t.
Temperature Parameter (t)
Controls the steepness of soft predicates; t→0 yields crisp (binary) interval membership.
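A minimal sketch of how a soft predicate and its temperature interact. The product-of-two-sigmoids form below is an assumption chosen for illustration and may differ from the exact relaxation used in SYFLOW; the names `sigmoid` and `soft_predicate` are hypothetical.

```python
import math

def sigmoid(z: float) -> float:
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_predicate(x: float, alpha: float, beta: float, t: float) -> float:
    # Illustrative relaxation of the test alpha <= x <= beta (assumed form):
    # one sigmoid rises at the lower bound, the other falls at the upper bound.
    return sigmoid((x - alpha) / t) * sigmoid((beta - x) / t)

# The same point inside [0, 1], evaluated at a high and a low temperature.
warm = soft_predicate(0.5, 0.0, 1.0, t=1.0)    # smooth, well below 1
cold = soft_predicate(0.5, 0.0, 1.0, t=0.01)   # nearly crisp, close to 1
outside = soft_predicate(2.0, 0.0, 1.0, t=0.01)  # close to 0
```

As t shrinks, values inside the interval approach 1 and values outside approach 0, while larger t keeps the function smooth enough for gradients to reach α and β.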
Soft Rule s(x)
The probabilistic subgroup-membership function obtained by combining soft predicates; outputs values in [0,1].
Differentiable Rule Induction
Learning subgroup rules via gradient descent on soft predicates and soft rules instead of combinatorial search.
Weighted Harmonic Mean
The aggregation used in SYFLOW to combine predicate outputs into a conjunction while keeping gradients stable.
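A hedged sketch of a weighted harmonic mean acting as a soft conjunction. The weights, the `eps` guard, and the function name `soft_rule` are assumptions for illustration, not SYFLOW's actual implementation.

```python
def soft_rule(preds, weights, eps=1e-12):
    # Weighted harmonic mean: sum(w) / sum(w / p).
    # A single predicate near 0 drives the result toward 0 (AND-like behavior),
    # while a weight of 0 removes that predicate from the rule.
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    denom = sum(w / max(p, eps) for w, p in zip(weights, preds))
    return total / denom

all_true = soft_rule([1.0, 1.0], [1.0, 1.0])   # both predicates hold
one_false = soft_rule([1.0, 0.0], [1.0, 1.0])  # one predicate fails
ignored = soft_rule([0.5, 0.0], [1.0, 0.0])    # failing predicate has weight 0
```

The clamp `max(p, eps)` only avoids division by zero in this sketch; a real implementation would likely handle the boundary differently.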
α and β (Thresholds)
Learned lower (α) and upper (β) bounds that define the interval of each soft predicate.
Normalizing Flow
An invertible neural transformation that maps a simple base distribution to a complex target distribution while allowing exact likelihoods.
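To illustrate what "exact likelihoods" means, here is a one-dimensional sketch using the simplest possible invertible map, an affine transform of a standard normal. It is a toy stand-in for a neural flow, and the function names are hypothetical.

```python
import math

def std_normal_logpdf(z: float) -> float:
    # Log-density of the base distribution N(0, 1).
    return -0.5 * (z * z + math.log(2.0 * math.pi))

def affine_flow_logpdf(y: float, scale: float, shift: float) -> float:
    # Flow: y = scale * z + shift, with z ~ N(0, 1).
    # Change of variables: log p_Y(y) = log p_Z(z) + log |dz/dy|,
    # where z = (y - shift) / scale and |dz/dy| = 1 / |scale|.
    z = (y - shift) / scale
    return std_normal_logpdf(z) - math.log(abs(scale))

logp = affine_flow_logpdf(1.0, scale=2.0, shift=1.0)
```

A neural flow replaces the affine map with a stack of learned invertible layers, but the likelihood is still computed with the same change-of-variables formula.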
Neural Spline Flow
A normalizing-flow architecture that uses monotonic piecewise rational-quadratic splines for flexible, invertible density estimation.
Target Distribution (PY)
The probability distribution of the target variable across the whole dataset.
Conditional Distribution (PY|S=1)
The target distribution restricted to the subgroup; used to measure exceptionality.
KL-Divergence
An asymmetric measure of how one probability distribution differs from another; SYFLOW maximizes it between the subgroup's target distribution and the overall target distribution to capture exceptionality.
Size-Corrected KL
The KL-divergence between subgroup and global targets multiplied by n_s^γ to balance exceptionality with subgroup size.
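A sketch of how a size-corrected KL score could be estimated from samples. It assumes per-sample log-densities under the subgroup model and the global model are already available, takes n_s to be the sum of soft memberships, and uses γ = 0.5 as an arbitrary default; all of these choices and the function name are assumptions for illustration.

```python
import math

def size_corrected_kl(log_p_sub, log_p_all, membership, gamma=0.5):
    # n_s: (soft) number of samples in the subgroup (assumed definition).
    n_s = sum(membership)
    if n_s <= 0:
        return 0.0
    # Membership-weighted estimate of KL(P_{Y|S=1} || P_Y):
    # average of log p_sub(y) - log p_all(y) over subgroup members.
    kl = sum(s * (a - b)
             for s, a, b in zip(membership, log_p_sub, log_p_all)) / n_s
    # The size correction trades off exceptionality against subgroup size.
    return (n_s ** gamma) * kl

membership = [1.0, 1.0, 0.0, 0.0]
log_p_sub = [-1.0, -1.0, -5.0, -5.0]
log_p_all = [-2.0, -2.0, -2.0, -2.0]
score = size_corrected_kl(log_p_sub, log_p_all, membership, gamma=0.5)
plain = size_corrected_kl(log_p_sub, log_p_all, membership, gamma=0.0)
```

With γ = 0 the score reduces to the plain KL estimate; larger γ favors bigger subgroups.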
Exceptionality
The extent to which the subgroup’s target distribution differs from the overall distribution.
Diversity Regularizer (λ)
An extra term that penalizes similarity between newly found subgroups and already discovered ones to encourage variety.
Continuous Optimization
Using gradient-based methods on differentiable objectives rather than discrete search over rule space.
Pre-discretization
Manual binning of continuous features prior to subgroup discovery; avoided by SYFLOW’s learned thresholds.
Combinatorial Explosion
The rapid growth of candidate rules in traditional subgroup discovery, causing scalability issues.
Branch-and-Bound Algorithms
Exact or heuristic search methods that prune rule spaces; scale poorly compared to SYFLOW’s gradient approach.
HOMO-LUMO Gap
The energy difference between the highest occupied and lowest unoccupied molecular orbitals; used as a target in the materials case study.
Bhattacharyya Coefficient
A statistic that measures the overlap between two probability distributions; used to assess subgroup exceptionality.
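For discrete (for example, histogram-binned) distributions, the coefficient is the sum of the square roots of the bin-wise products. A minimal sketch, with a hypothetical function name:

```python
import math

def bhattacharyya_coefficient(p, q):
    # BC(p, q) = sum_i sqrt(p_i * q_i) for two discrete distributions
    # defined over the same bins; 1 means identical, 0 means no overlap.
    return sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))

same = bhattacharyya_coefficient([0.5, 0.5], [0.5, 0.5])
disjoint = bhattacharyya_coefficient([1.0, 0.0], [0.0, 1.0])
partial = bhattacharyya_coefficient([0.9, 0.1], [0.1, 0.9])
```

A lower coefficient between the subgroup and global target distributions indicates less overlap, and therefore a more exceptional subgroup.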
Rule Membership Function σ(x)
The crisp binary function, obtained in the limit t→0, that assigns each sample to the subgroup (1) or not (0).