CS6262 Lecture 16 - Machine Learning for Security
69 Terms
1
What is the primary goal of applying machine learning to intrusion detection?
To automatically and quickly identify new attacks and stop bad behavior early
2
Which type of analytics models normal network behavior and identifies deviations?
Anomaly detection
3
Which analytics approach combines misuse and anomaly detection?
Hybrid detection
4
Which type of analytics detects known attacks using signatures?
Misuse detection
5
Which type of analytics is capable of detecting zero-day attacks?
Anomaly detection
6
What is the main objective of machine learning given training examples?
To learn a function that can predict outputs from inputs
7
What occurs during the training phase of machine learning?
The algorithm learns a function by minimizing prediction error using labeled examples
8
What happens during the testing phase of machine learning?
The learned function is applied to new, unseen data to predict output values
9
Why should data used in machine learning be drawn from real-world applications?
To ensure that training and test data reflect realistic conditions
10
How is real-world data typically prepared for machine learning?
It is randomly split into training and test datasets
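A minimal sketch of the random split described in this card, using only the standard library. The function name `train_test_split`, the 25% test fraction, and the toy `records` list are illustrative assumptions, not details from the lecture.

```python
import random

def train_test_split(examples, test_fraction=0.25, seed=0):
    """Shuffle a copy of `examples` and split it into (train, test)."""
    rng = random.Random(seed)      # fixed seed so the split is repeatable
    shuffled = list(examples)      # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical labeled examples: (id, label)
records = [(i, "attack" if i % 5 == 0 else "normal") for i in range(20)]
train, test = train_test_split(records)
print(len(train), len(test))
```

Every example lands in exactly one of the two sets, so the test set stays unseen during training.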
11
What is used as input for learning a predictive function in supervised machine learning?
Feature vectors with associated labels
12
What is a feature vector in the context of object recognition?
A numerical representation of an object based on extracted characteristics
13
What is the purpose of labeling feature vectors during training?
To provide correct outputs for learning the predictive function
14
What determines which features are useful in a machine learning model?
Features depend on the specific application being analyzed
15
What is the most important property of a good machine learning model?
The ability to generalize from training data to unseen test data
16
What does generalization mean in machine learning?
Correctly predicting outputs for new examples not seen during training
17
Which type of machine learning finds patterns or structure in unlabeled data?
Unsupervised learning
18
Which type of machine learning uses labeled data to learn a model that maps inputs to outputs?
Supervised learning
19
Which type of machine learning uses datasets where only some examples are labeled?
Semi-supervised learning
20
What is error rate in machine learning performance metrics?
The fraction of incorrect predictions (equal to 1 minus accuracy)
21
What is accuracy in machine learning performance metrics?
The fraction of correct predictions
22
What is precision in machine learning metrics?
The fraction of correct positive predictions among all predicted positives
23
What is recall in machine learning metrics?
The fraction of correct positive predictions among all actual positives
24
Can precision and recall be used beyond binary classification?
Yes, they can be generalized to multi-class applications
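The four metrics defined in the preceding cards can be computed from the counts of true/false positives and negatives. The sketch below covers the binary case only; the function name `binary_metrics` and the toy labels are assumptions made for illustration.

```python
def binary_metrics(y_true, y_pred, positive="attack"):
    """Return accuracy, error rate, precision, and recall for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,      # fraction of correct predictions
        "error_rate": (fp + fn) / total,    # fraction of incorrect predictions
        # Guard against division by zero when a class is never predicted/present.
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

# Hypothetical ground truth and predictions
y_true = ["attack", "attack", "normal", "normal", "normal", "attack"]
y_pred = ["attack", "normal", "normal", "attack", "normal", "attack"]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

For a multi-class problem, one common approach is to call the same function once per class, treating that class as `positive` and all others as negative.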
25
What type of problems involve training datasets with attributes and class labels?
Classification problems
26
What is the goal of a machine learning classification model?
To output a class label based on a set of input attributes or features
27
How is a decision tree constructed in the training process?
By repeatedly partitioning data until each partition contains examples from only one class
28
From what alternative perspective can a decision tree be viewed?
A set of rules describing the decision logic
29
What is the first step in building a decision tree?
Find the best attribute to use as the root
30
What metrics are used to determine which attribute best partitions a dataset into subsets?
Entropy and information gain
31
What does entropy measure in the context of decision trees?
The impurity of a dataset based on its class label distribution (lower entropy means a purer dataset)
32
When is entropy at its maximum?
When examples are evenly distributed among different classes
33
When is entropy at its minimum?
When all examples in a dataset belong to a single class
34
What does high information gain for an attribute indicate?
The attribute produces purer subsets when partitioning the dataset
35
What is the purpose of information gain in decision tree construction?
To evaluate how well an attribute separates samples according to their classifications
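The entropy and information gain cards above can be made concrete with a short sketch. It assumes the standard Shannon entropy (base 2) and defines information gain as the parent entropy minus the size-weighted entropy of the subsets; the toy `rows` and the attribute names are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(rows, attribute, label_key="label"):
    """Entropy reduction obtained by partitioning `rows` on `attribute`."""
    parent = entropy([r[label_key] for r in rows])
    partitions = {}
    for r in rows:
        partitions.setdefault(r[attribute], []).append(r[label_key])
    # Weighted average entropy of the subsets produced by the split.
    remainder = sum(len(p) / len(rows) * entropy(p) for p in partitions.values())
    return parent - remainder

# Hypothetical connection records with class labels
rows = [
    {"flag": "S0", "service": "http", "label": "attack"},
    {"flag": "S0", "service": "http", "label": "attack"},
    {"flag": "SF", "service": "http", "label": "normal"},
    {"flag": "SF", "service": "smtp", "label": "normal"},
]
# Pick the root attribute: the one with the highest information gain.
best = max(["flag", "service"], key=lambda a: information_gain(rows, a))
print(best)
```

In this toy data, splitting on `flag` yields subsets that each contain a single class, so it has the higher gain and would be chosen as the root.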
36
What does clustering do with training examples?
Assigns them into different clusters based on a distance measure
37
What is commonly used to measure similarity between examples in clustering?
A distance function
38
What is typically predetermined before beginning clustering?
The number of clusters
39
How is cluster membership determined?
By assigning samples to the cluster with the closest centroid
40
When does the clustering process stop?
When clusters converge and membership no longer changes
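The clustering cards above (fixed number of clusters, nearest-centroid membership, stop on convergence) match the k-means procedure. The following is a minimal sketch under that assumption, using Euclidean distance; the function name, the random initialization, and the sample points are illustrative choices.

```python
import math
import random

def kmeans(points, k, seed=0, max_iter=100):
    """Cluster `points` (tuples of floats) into `k` groups; returns (centroids, assignment)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)   # start from k distinct data points
    assignment = None
    for _ in range(max_iter):
        # Assign each point to the cluster with the closest centroid.
        new_assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_assignment == assignment:
            break                        # membership no longer changes: converged
        assignment = new_assignment
        # Recompute each centroid as the mean of its current members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return centroids, assignment

# Hypothetical 2-D feature vectors forming two well-separated groups
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
centroids, assignment = kmeans(points, k=2)
print(assignment)
```

The result depends on the initial centroids, so in practice the algorithm is often run several times with different seeds.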
41
For detecting botnet command-and-control using supervised learning, what type of data is needed?
Labeled data with known C&C communication examples
42
What makes classifiers effective in machine learning security applications?
Smart features that capture domain knowledge
43
How can intrusion detection be viewed in machine learning terms?
As a classification problem distinguishing normal traffic from attack traffic
44
What is the goal when applying machine learning to intrusion detection?
To partition mixed traffic into pure-class subsets such as normal or specific attack types
45
What type of features are useful when partitioning traffic for intrusion detection?
Features with high information gain
46
What does raw network data need to be summarized into for machine learning processing?
Connection records
47
What types of attributes are typically included in a connection record?
Timestamp, duration, source IP, destination IP, bytes, service, and flag
48
What does the connection flag SF indicate?
That the connection was established and terminated normally (both SYN and FIN were seen)
49
What does the connection flag REJ indicate?
That the connection request was rejected
50
What is one method for constructing useful intrusion detection features?
Using temporal and statistical patterns associated with attacks
51
What is an example of a temporal/statistical pattern useful for detecting intrusions?
Many S0 connections (connection attempts that received no reply) to the same service or host in a short time period
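A temporal/statistical feature of this kind can be sketched as a sliding-window count over connection records. The `Conn` record, the 2-second window, and the sample data below are assumptions for illustration; S0 is taken to mean a connection attempt that received no reply.

```python
from dataclasses import dataclass

@dataclass
class Conn:
    timestamp: float   # seconds
    dst_host: str
    service: str
    flag: str          # e.g. "SF", "S0", "REJ"

def s0_count_same_host(records, window=2.0):
    """For each record (in time order), count S0 connections to the same
    destination host within the preceding `window` seconds, itself included."""
    records = sorted(records, key=lambda r: r.timestamp)
    counts = []
    for i, cur in enumerate(records):
        n = 0
        for prev in records[: i + 1]:
            in_window = cur.timestamp - prev.timestamp <= window
            if in_window and prev.dst_host == cur.dst_host and prev.flag == "S0":
                n += 1
        counts.append(n)
    return counts

# Hypothetical traffic: a burst of unanswered SYNs to one host, then normal traffic
records = [
    Conn(0.0, "10.0.0.5", "http", "S0"),
    Conn(0.5, "10.0.0.5", "http", "S0"),
    Conn(1.0, "10.0.0.5", "http", "S0"),
    Conn(1.2, "10.0.0.9", "http", "SF"),
    Conn(5.0, "10.0.0.5", "http", "SF"),
]
print(s0_count_same_host(records))
```

A rising count for one host is the kind of signal a classifier can use to separate a SYN flood from normal connections. The nested loop is quadratic; a deque-based window would scale better on real traffic.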
52
What is the first step in the high-level process of building intrusion detection models from network data?
Collect raw audit data such as packets
53
After capturing raw data, what is the next step in preparing it for machine learning?
Summarize data into connection records
54
What do we search for in connection records to help build intrusion detection features?
Frequent patterns
55
How do we identify unique intrusion-related behaviors from discovered patterns?
Compare frequent patterns to determine those associated specifically with intrusions
56
What do we construct after identifying unique intrusion-related patterns?
Features used to train classification models
57
How is the machine learning feature construction and improvement process described?
Iterative, with each step repeated to improve performance
58
What approach is used to discover patterns within the data?
Data mining algorithms
59
What is the purpose of association rule mining in this context?
To find associations among features, such as many S0 HTTP connections
60
Why may basic association rule and frequent episode algorithms produce useless patterns?
They can include irrelevant attributes not useful for intrusion detection
61
What modification is applied to basic pattern-finding algorithms for intrusion detection?
Restricting results to patterns involving essential or reference attributes
62
What is an axis attribute in the context of intrusion detection pattern mining?
The most important attribute, such as the service, that must appear in any association
63
Why is the axis attribute required in computed associations?
To eliminate patterns that involve only non-essential attributes
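The axis-attribute restriction can be illustrated with a brute-force frequent-pattern count that simply skips any attribute combination lacking the axis attribute. This is a simplified sketch, not the mining algorithm used in the lecture; the function name, the support threshold, and the records are hypothetical.

```python
from collections import Counter
from itertools import combinations

def frequent_patterns(records, axis="service", min_support=0.5):
    """Count attribute-value combinations that include the axis attribute
    and keep those appearing in at least `min_support` of the records."""
    counts = Counter()
    for rec in records:
        items = sorted(rec.items())              # canonical order for each pattern
        for size in range(1, len(items) + 1):
            for combo in combinations(items, size):
                # Discard patterns made only of non-essential attributes.
                if any(attr == axis for attr, _ in combo):
                    counts[combo] += 1
    threshold = min_support * len(records)
    return {combo: n for combo, n in counts.items() if n >= threshold}

# Hypothetical connection records
records = [
    {"service": "http", "flag": "S0", "dst_host": "victim"},
    {"service": "http", "flag": "S0", "dst_host": "victim"},
    {"service": "http", "flag": "S0", "dst_host": "victim"},
    {"service": "smtp", "flag": "SF", "dst_host": "mail"},
]
patterns = frequent_patterns(records)
for combo, n in patterns.items():
    print(dict(combo), n)
```

A pattern such as `flag=S0` on its own is frequent here but is dropped because it does not mention the service, while `service=http, flag=S0` is kept.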
64
After computing associations involving axis attributes, what is the next step?
Compute sequential patterns involving the associations
65
What is the purpose of constructing features from intrusion-specific patterns?
To build classifiers that detect intrusions
66
Why is dataset selection challenging in intrusion detection?
Because there is no perfect way to label data and thus no perfect IDS dataset
67
What evaluation dataset is used to assess the intrusion detection approach?
The DARPA evaluation dataset
68
How many attack types are included in the DARPA dataset used for evaluation?
38 attack types
69
What are the four categories of attack types in the DARPA dataset?
Denial-of-Service, probing, remote-to-local, and user-to-root