These vocabulary flashcards summarize essential concepts, components, algorithms, and theoretical notions introduced in the lecture on Differentiable Pattern Set Mining and the BinaPs model. Reviewing them will reinforce understanding of the paper’s terminology, methodology, and context.
Pattern Mining
The task of discovering interpretable symbolic statements (patterns) that reveal structure in data.
Frequent Pattern Mining
Classical approach that returns all patterns whose frequency exceeds a user-defined threshold, often producing many redundant or spurious results.
Pattern Set Mining
Modern reformulation that seeks a small set of patterns whose combination describes the data well.
Twice-Exponential Search Space
The size of the candidate space when enumerating all possible pattern sets; it grows doubly exponentially with the number of features.
Heuristic Search
A non-exhaustive strategy used by many pattern-set miners to explore an intractably large search space.
Autoencoder
A neural network that encodes inputs into a latent representation and then reconstructs the original data from that representation.
Binary Pattern Networks (BinaPs)
A novel binarized autoencoder that learns sets of conjunctive patterns through binary activations and weights.
Binary Activation Function
An activation that outputs 0 or 1, enabling neuron outputs to be interpreted as presence or absence of a pattern.
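A minimal sketch of such a step activation (PyTorch; the function name is illustrative, not from the paper's code):

```python
import torch

def binary_step(x: torch.Tensor) -> torch.Tensor:
    # 1 where the pre-activation is positive, 0 otherwise, so each
    # neuron's output reads as pattern present / pattern absent.
    return (x > 0).float()

print(binary_step(torch.tensor([-0.3, 0.0, 1.2])))  # tensor([0., 0., 1.])
```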
Binarized Weights
Weights constrained to {0,1} (during forward passes) so each weight can be interpreted as item-in-pattern membership.
Continuous Weights
Real-valued parameters updated during back-propagation; they are stochastically binarized before each forward pass.
Conjunctive Pattern
A set of features that must co-occur; in BinaPs each hidden neuron represents one such conjunction.
Pattern Layer (Hidden Layer)
The single linear hidden layer in BinaPs whose neurons correspond directly to learned patterns.
Mirrored Weight Matrix
Design choice where the decoder weight matrix equals the encoder matrix transposed, ensuring symmetric reconstruction.
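A sketch of the mirrored encode/decode pass under simplifying assumptions (deterministic rounding instead of stochastic binarization, decoder threshold omitted; all identifiers are illustrative):

```python
import torch

def step(x):
    return (x > 0).float()  # binary activation

def forward(x, W_cont, bias):
    W = (W_cont > 0.5).float()        # binarized weights for this pass
    hidden = step(x @ W.t() + bias)   # encoder: which patterns fire
    recon = step(hidden @ W)          # decoder: same matrix, transposed
    return hidden, recon

x = torch.tensor([[1., 1., 0., 1.]])  # one binary transaction, 4 items
W_cont = torch.rand(2, 4)             # capacity of 2 candidate patterns
bias = torch.tensor([-1.5, -0.5])     # negative thresholds per pattern
print(forward(x, W_cont, bias))
```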
Bias Term (BinaPs)
Learnable negative threshold that enforces a minimum number of items before a hidden neuron fires.
Reconstruction Loss
Objective measuring how accurately the autoencoder recreates the input; minimizing it drives pattern discovery.
Data-Sparsity-Aware Loss
A weighted reconstruction loss that compensates for the imbalance between many 0s and few 1s in sparse data.
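One way such a weighting can be realized (a sketch, not necessarily the paper's exact formulation): scale errors on the rare 1-entries by the inverse density, so that reconstructing everything as 0 stops being a cheap optimum.

```python
import torch

def sparsity_aware_loss(recon: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    # Fraction of 1-entries in the batch; clamped to avoid division by zero.
    density = target.float().mean().clamp(min=1e-6)
    # Up-weight errors on the rare 1s, leave errors on 0s at weight 1.
    weights = torch.where(target > 0, 1.0 / density, torch.ones_like(target))
    return (weights * (recon - target) ** 2).mean()

target = torch.tensor([[1., 0., 0., 0.]])
print(sparsity_aware_loss(torch.zeros_like(target), target))  # heavily penalized
```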
Gradient-Based Optimization
Learning approach that updates parameters via gradients (e.g., Adam) instead of combinatorial search.
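A toy training loop illustrating the idea: Adam updates continuous parameters via back-propagated gradients rather than enumerating candidates. For clarity this sketch uses a smooth surrogate in place of the binary activations; all names are illustrative.

```python
import torch

x = torch.tensor([[1., 1., 0., 1.]])      # one binary transaction
W = torch.rand(2, 4, requires_grad=True)  # continuous weights, 2 patterns
opt = torch.optim.Adam([W], lr=0.05)

for _ in range(100):
    opt.zero_grad()
    hidden = torch.sigmoid(x @ W.t())     # smooth stand-in for the binary step
    recon = torch.sigmoid(hidden @ W)
    loss = ((recon - x) ** 2).mean()      # plain reconstruction loss
    loss.backward()                       # gradients instead of enumeration
    opt.step()
print(loss.item())
```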
Straight-Through Estimator (STE)
Technique that approximates the gradient of a binary activation by passing the upstream gradient unchanged.
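A minimal PyTorch realization of this estimator (a sketch; the class name is illustrative):

```python
import torch

class BinaryStepSTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        return (x > 0).float()   # hard step in the forward pass

    @staticmethod
    def backward(ctx, grad_output):
        # Straight-through: treat the step as identity and pass the
        # upstream gradient back unchanged.
        return grad_output

x = torch.tensor([-0.5, 0.7], requires_grad=True)
BinaryStepSTE.apply(x).sum().backward()
print(x.grad)  # tensor([1., 1.])
```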
Gated Straight-Through Estimator
Modified STE used in BinaPs that blocks negative gradients when the corresponding neuron was inactive.
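A sketch of the gating rule as described above, built on the STE from the previous card (my reading of the rule, not the paper's exact code):

```python
import torch

class GatedStepSTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        out = (x > 0).float()
        ctx.save_for_backward(out)
        return out

    @staticmethod
    def backward(ctx, grad_output):
        (out,) = ctx.saved_tensors
        # Pass the gradient where the neuron fired or the gradient is
        # non-negative; block negative gradients at inactive neurons.
        gate = (out > 0) | (grad_output >= 0)
        return torch.where(gate, grad_output, torch.zeros_like(grad_output))

x = torch.tensor([-0.5, -0.5, 0.7], requires_grad=True)
y = GatedStepSTE.apply(x)
(y * torch.tensor([-1.0, 1.0, 1.0])).sum().backward()
print(x.grad)  # tensor([0., 1., 1.]) -- negative gradient at the inactive neuron is zeroed
```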
Stochastic Binarization
Process of converting continuous weights to 0/1 by clamping each value into [0,1] and treating it as a Bernoulli success probability.
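A sketch combining the clamp and Bernoulli draw described in the surrounding cards:

```python
import torch

def stochastic_binarize(w_cont: torch.Tensor) -> torch.Tensor:
    # Clamp continuous weights into [0, 1], then treat each value as
    # the success probability of an independent Bernoulli draw.
    p = w_cont.clamp(0.0, 1.0)
    return torch.bernoulli(p)

print(stochastic_binarize(torch.tensor([-0.2, 0.5, 0.9, 1.3])))
# e.g. tensor([0., 1., 1., 1.]) -- the middle two entries are random
```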
Capacity (c)
Upper bound on the number of hidden neurons (patterns) in a BinaPs model.
NP-Hard
Hardness notion indicating a problem is at least as hard as the hardest problems in NP; deciding the optimal pattern-set size is NP-hard.
Bipartite Dimension
Minimum number of bicliques needed to cover all edges of a bipartite graph; its computation is NP-hard and relates to optimal pattern count.
Boolean Matrix Factorization (BMF)
Decomposition of a binary matrix into two Boolean factors; closely related to mining pattern sets.
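The Boolean product replaces sums with logical OR; a tiny NumPy illustration (example matrices chosen for clarity):

```python
import numpy as np

A = np.array([[1, 0],
              [1, 1]])        # transactions x patterns
B = np.array([[1, 1, 0],
              [0, 1, 1]])     # patterns x items

# Boolean product: X[i, j] = OR_k (A[i, k] AND B[k, j]).
X = (A @ B) > 0
print(X.astype(int))
# [[1 1 0]
#  [1 1 1]]
```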
Tiling
Covering the 1-entries of a binary matrix with combinatorial rectangles (tiles), each representing a pattern together with the transactions that support it.
Minimum Description Length (MDL) Principle
Information-theoretic framework that selects models yielding the shortest combined encoded length of model plus data.
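In standard MDL notation (not specific to this lecture), the best model $M$ for data $D$ minimizes the two-part code length:

$$
L(M, D) = L(M) + L(D \mid M)
$$

where $L(M)$ is the length of the encoded model and $L(D \mid M)$ the length of the data encoded with the model's help.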
Maximum Entropy Modeling
Approach that selects the distribution with highest entropy subject to constraints, used to evaluate pattern interestingness.
Sparse Data
Dataset in which 1-entries (true events) are rare compared to 0-entries; common in transaction databases.
Dense Data
Dataset where 1-entries are frequent; e.g., genomic copy-number matrices.
Signal-to-Noise Ratio
Ratio of true pattern occurrences (signal) to random flips or errors (noise) in the data.
Local Minimum
A parameter setting where loss cannot be reduced by small moves but is not the global optimum.
Clamp Function
Utility function that bounds a value within a specified range [a,b].
Bernoulli Distribution
Probability distribution over outcomes {0,1}; used for stochastic weight binarization.
Biclique
Complete bipartite subgraph; corresponds to a perfect pattern covering a subset of items and transactions.
Adam Optimizer
Adaptive stochastic gradient descent method commonly used to train neural networks.
GPU Acceleration
Leveraging graphics processing units to perform parallel computations, drastically speeding up BinaPs training.
Escaping Poor Minima
A benefit of stochastic binarization: occasional random weight flips let training escape poor local minima and prevent dead neurons.
Asso Algorithm
Boolean matrix factorization method that iteratively adds patterns based on association accuracy.
Slim Algorithm
MDL-based pattern miner that greedily builds a code table to compress data; may overfit on noisy data.
Desc Algorithm
Pattern-set miner that uses Maximum Entropy models to select patterns explaining data distributions.
Synthetic Data
Artificially generated dataset used to evaluate algorithms under controlled conditions (e.g., known ground truth).