Flashcards about Graph Analysis for Cyber-Attacks and Malware
What does Network Analysis allow us to do?
Understand how different malware samples are related by looking for commonalities between large numbers of malware samples, based on analysis of their shared attributes.
What are some examples of shared attributes used in Network Analysis?
IP Addresses, Hostnames, Strings, Code signatures, Graphics
What can we identify by analyzing the relationships between large numbers of malware samples?
Cyber-attack campaigns, Common / Novel malware tactics, Sources of malware e.g. malware infected websites/servers, Command and control networks, Signatures of known attackers/organisations
What does a graph consist of?
Vertices (also called nodes) and edges.
In a graph, what do vertices and edges typically represent?
Vertices represent the objects being studied (e.g., malware sample), and Edges represent the relationships between vertices (e.g., two malware samples receive commands from the same IP address).
How does plotting a graph of malware connections help in identifying related malware?
Similar malware files will share many edges and cluster together, aiding in identifying clusters of related malware.
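A minimal sketch of this idea, assuming made-up sample names and documentation-range IP addresses. It links two samples when they share a callback IP and treats each connected component as a cluster; connected components are only one simple stand-in for "clustering together", not necessarily the method the course uses.

```python
from itertools import combinations

# Hypothetical data: sample name -> callback IP addresses observed for it.
callbacks = {
    "sample_a": {"203.0.113.5", "203.0.113.9"},
    "sample_b": {"203.0.113.5"},
    "sample_c": {"203.0.113.9"},
    "sample_d": {"198.51.100.7"},
    "sample_e": {"198.51.100.7"},
    "sample_f": {"192.0.2.44"},
}


def build_graph(callbacks):
    """Return an adjacency map linking samples that share a callback IP."""
    graph = {sample: set() for sample in callbacks}
    for left, right in combinations(callbacks, 2):
        if callbacks[left] & callbacks[right]:
            graph[left].add(right)
            graph[right].add(left)
    return graph


def clusters(graph):
    """Return the connected components of the graph as sorted lists."""
    seen, found = set(), []
    for start in graph:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        found.append(sorted(component))
    return sorted(found)


graph = build_graph(callbacks)
for group in clusters(graph):
    print(group)
```

Here sample_a, sample_b and sample_c end up in one cluster, sample_d and sample_e in another, and sample_f on its own.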
Besides malware samples, what else can be treated as nodes in a graph?
Attributes of the malware samples, such as callback IP addresses.
How does including IP addresses as nodes help in graph analysis?
It helps identify command and control servers: an IP address that is connected to many malware samples is a likely candidate.
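As a rough illustration, the sketch below counts how many distinct samples connect to each IP node and keeps the IPs shared by several samples. The observations, the function name `shared_ips` and the `min_samples` cut-off are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical (sample, callback IP) edges of a sample/IP graph.
observations = [
    ("sample_a", "203.0.113.5"),
    ("sample_b", "203.0.113.5"),
    ("sample_c", "203.0.113.5"),
    ("sample_c", "198.51.100.7"),
    ("sample_d", "192.0.2.44"),
]


def shared_ips(observations, min_samples=2):
    """Return {ip: sample count} for IPs contacted by at least min_samples samples."""
    # set() drops duplicate observations so each sample counts once per IP.
    degree = Counter(ip for _, ip in set(observations))
    return {ip: count for ip, count in degree.items() if count >= min_samples}


candidates = shared_ips(observations)
print(candidates)
```

A high count only flags an IP for closer inspection; shared infrastructure such as a public DNS resolver would also score highly.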
How can attributes enrich a graph in malware analysis?
Edges can be weighted by the percentage of shared code between malware samples, and nodes can have attributes like file size.
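One possible way to hold such attributes, using plain dictionaries. The sample names, file sizes and shared-code fractions are invented for illustration, and `neighbours_by_weight` is a hypothetical helper.

```python
# Node attributes: each malware sample carries its file size in bytes.
nodes = {
    "sample_a": {"file_size": 48_128},
    "sample_b": {"file_size": 51_200},
    "sample_c": {"file_size": 912_384},
}

# Edge weights: fraction of code shared between the two samples (0.0 to 1.0).
edges = {
    frozenset({"sample_a", "sample_b"}): 0.82,
    frozenset({"sample_a", "sample_c"}): 0.10,
}


def neighbours_by_weight(sample, edges):
    """Return (neighbour, weight) pairs for a sample, strongest link first."""
    pairs = []
    for pair, weight in edges.items():
        if sample in pair:
            (other,) = pair - {sample}
            pairs.append((other, weight))
    return sorted(pairs, key=lambda item: item[1], reverse=True)


print(neighbours_by_weight("sample_a", edges))
```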
What does this real-world example graph show?
This graph shows a group of malware samples (nodes) and their connections (edges). The edges link malware samples that “call back” to the same hostnames and IP addresses, indicating that they were likely deployed by the same attacker(s).
What can we use the connections between malware samples for?
The connections can be used to differentiate between a coordinated attack and random attackers.
What is a bipartite graph?
A graph whose nodes can be divided into two groups where neither group contains internal connections.
What are the two groups of nodes in a bipartite graph that shows shared attributes between malware samples?
One group is for malware samples, and the other group is for the domain names the malware samples “call back” to.
What specific characteristic makes this graph a bipartite graph?
Callbacks never directly connect to other callbacks, and malware samples never directly connect to other malware samples.
How does a bipartite graph projection let us examine malware sample similarity?
The projection links malware samples only if they have attributes in the other partition in common, such as sharing the same “call back” domain names.
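A small sketch of a projection onto the malware partition. The edge list and the `.example` domain names are made up, and `project_onto_samples` is a hypothetical helper; it records which domains each linked pair has in common.

```python
from itertools import combinations

# Hypothetical bipartite edges: (malware sample, callback domain).
bipartite_edges = [
    ("sample_a", "evil.example"),
    ("sample_b", "evil.example"),
    ("sample_b", "cdn.example"),
    ("sample_c", "cdn.example"),
    ("sample_d", "other.example"),
]


def project_onto_samples(bipartite_edges):
    """Link two samples when they share a domain; return {pair: shared domains}."""
    samples_by_domain = {}
    for sample, domain in bipartite_edges:
        samples_by_domain.setdefault(domain, set()).add(sample)

    projection = {}
    for domain, samples in samples_by_domain.items():
        for left, right in combinations(sorted(samples), 2):
            projection.setdefault((left, right), set()).add(domain)
    return projection


projection = project_onto_samples(bipartite_edges)
for pair, domains in sorted(projection.items()):
    print(pair, sorted(domains))
```

sample_d shares no domain with any other sample, so it has no edge in the projection.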
What do the connections between malware samples on the bipartite projection show?
They reveal the overall “social network” of the malware samples and can show related malware used in different attack campaigns.
When visualizing malware graphs, how should nodes be displayed to reflect their relationships?
Neighbouring nodes should be close together on screen, and the distance between nodes should reflect their path distance in the graph.
What is the distortion problem in visualizing malware graphs?
As the number of nodes increases, distortion must be introduced to display all nodes, and it can only be minimized, not eliminated.
How do force-directed algorithms minimize layout distortion in malware graphs?
Force-directed algorithms minimize layout distortion using physical simulations of spring-like forces between nodes.
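The card does not say which force-directed algorithm is meant, so the sketch below is an assumption: a bare-bones simulation in the style of Fruchterman and Reingold, where every pair of nodes repels with strength k²/d and connected nodes attract with strength d²/k. The parameters `k`, `iterations` and `start_temp` are illustrative defaults, not tuned values.

```python
import math


def force_layout(nodes, edges, k=1.0, iterations=200, start_temp=0.1):
    """Return {node: (x, y)} from a simple spring-and-repulsion simulation."""
    nodes = sorted(nodes)
    # Deterministic start: spread the nodes evenly around a unit circle.
    pos = {}
    for i, node in enumerate(nodes):
        angle = 2 * math.pi * i / len(nodes)
        pos[node] = [math.cos(angle), math.sin(angle)]

    for step in range(iterations):
        # The temperature caps how far a node may move; it cools over time.
        temp = start_temp * (1 - step / iterations)
        force = {node: [0.0, 0.0] for node in nodes}

        # Every pair of nodes repels with strength k^2 / distance.
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx, dy = pos[a][0] - pos[b][0], pos[a][1] - pos[b][1]
                dist = math.hypot(dx, dy) or 1e-9
                push = k * k / dist
                force[a][0] += dx / dist * push
                force[a][1] += dy / dist * push
                force[b][0] -= dx / dist * push
                force[b][1] -= dy / dist * push

        # Connected nodes attract like a spring with strength distance^2 / k.
        for a, b in edges:
            dx, dy = pos[a][0] - pos[b][0], pos[a][1] - pos[b][1]
            dist = math.hypot(dx, dy) or 1e-9
            pull = dist * dist / k
            force[a][0] -= dx / dist * pull
            force[a][1] -= dy / dist * pull
            force[b][0] += dx / dist * pull
            force[b][1] += dy / dist * pull

        # Move each node along its net force, but no further than temp.
        for node in nodes:
            fx, fy = force[node]
            size = math.hypot(fx, fy)
            if size > 0:
                move = min(size, temp)
                pos[node][0] += fx / size * move
                pos[node][1] += fy / size * move

    return {node: tuple(p) for node, p in pos.items()}


# Hypothetical graph: two triangles of related samples joined by one edge.
layout = force_layout(
    ["a", "b", "c", "d", "e", "f"],
    [("a", "b"), ("b", "c"), ("a", "c"), ("d", "e"), ("e", "f"), ("d", "f"), ("c", "d")],
)
for node, (x, y) in layout.items():
    print(f"{node}: ({x:.2f}, {y:.2f})")
```

Connected nodes settle at a distance of roughly `k`, while unconnected nodes are pushed further apart, which is how the layout approximates path distance on screen.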
How can shared code analysis be used to help with reverse engineering?
Shared code analysis reveals previously analyzed code shared with other malware, helping with reverse engineering and uncovering the deployer of the malware.
What does shared code analysis do?
Shared code analysis compares malware samples to estimate the percentage of shared code.
What are the different types of features for Shared Code Analysis?
Assembly instruction sub-sequences, Strings, Import Address Table, and Dynamic API calls.
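Two of these feature types can be sketched with the standard library alone, assuming the instruction mnemonics have already been produced by a disassembler (not shown). Both helper names, the n-gram length and the minimum string length are assumptions for illustration.

```python
import re


def instruction_ngrams(mnemonics, n=3):
    """Return the set of length-n sub-sequences of an instruction list."""
    return {tuple(mnemonics[i:i + n]) for i in range(len(mnemonics) - n + 1)}


def printable_strings(data, min_length=4):
    """Return runs of printable ASCII of at least min_length bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return {match.decode("ascii") for match in pattern.findall(data)}


# Hypothetical inputs: a short mnemonic sequence and a fragment of raw bytes.
print(instruction_ngrams(["push", "mov", "call", "ret"]))
print(printable_strings(b"\x00\x01http://evil.example\x00ab\x00cmd.exe\xff"))
```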
What is Bag of Words (BoW)?
A method where the description of each malware sample is based only on the presence or absence of features, discarding the order of features.
What are the steps to use Bag of Words (BoW)?
Specify a vocabulary, represent each malware sample in terms of the features, and specify a similarity function for comparing malware samples.
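The first two steps might look like the sketch below, which assumes each sample is described by a list of imported API names. The sample data and the helper names `build_vocabulary` and `to_bag` are hypothetical.

```python
# Hypothetical features per sample (note the repeated and ordered entries).
samples = {
    "sample_a": ["CreateFileA", "WriteFile", "CreateFileA", "connect"],
    "sample_b": ["connect", "send", "recv"],
}


def build_vocabulary(samples):
    """Step 1: the vocabulary is every distinct feature seen in any sample."""
    return sorted({feature for features in samples.values() for feature in features})


def to_bag(features, vocabulary):
    """Step 2: mark each vocabulary term as present (1) or absent (0)."""
    present = set(features)  # order and repeat counts are discarded here
    return [1 if term in present else 0 for term in vocabulary]


vocabulary = build_vocabulary(samples)
bags = {name: to_bag(features, vocabulary) for name, features in samples.items()}
print(vocabulary)
print(bags)
```

The repeated `CreateFileA` in sample_a contributes a single 1, which is the "presence or absence" property of the representation.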
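How do you calculate the Jaccard Index?
Jaccard Index = number of attributes shared by both samples / total number of distinct attributes across the two samples (the size of the intersection divided by the size of the union).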
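A short sketch of the calculation on Python sets. Returning 0.0 when both samples have no attributes is an assumption made here to avoid dividing by zero; the source does not cover that case.

```python
def jaccard_index(a, b):
    """Return |a ∩ b| / |a ∪ b| for two collections of attributes."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # assumption: two empty attribute sets count as dissimilar
    return len(a & b) / len(union)


# Two attributes shared out of four distinct attributes in total.
print(jaccard_index({"connect", "send", "recv"}, {"send", "recv", "Sleep"}))
```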
What does the similarity matrix allow us to do?
The similarity matrix allows us to visually compare the similarity of a large number of malware samples.
What do the cells of the similarity matrix show?
Square blocks of bright cells show that all samples from the same family are similar to each other.
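To see where the bright squares come from, the sketch below builds a small matrix of pairwise Jaccard similarities. The four samples, their family prefixes and their feature sets are invented, and rows are ordered by name so that samples from the same family sit next to each other.

```python
def jaccard_index(a, b):
    """Size of the intersection divided by the size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity_matrix(feature_sets):
    """Return (names, matrix) where matrix[i][j] compares names[i] and names[j]."""
    names = sorted(feature_sets)
    matrix = [
        [jaccard_index(feature_sets[row], feature_sets[col]) for col in names]
        for row in names
    ]
    return names, matrix


# Hypothetical feature sets for two families of two samples each.
feature_sets = {
    "fam1_a": {"connect", "send", "recv", "CreateMutexA"},
    "fam1_b": {"connect", "send", "recv", "Sleep"},
    "fam2_a": {"RegSetValueA", "WriteFile", "Sleep"},
    "fam2_b": {"RegSetValueA", "WriteFile", "DeleteFileA"},
}

names, matrix = similarity_matrix(feature_sets)
for name, row in zip(names, matrix):
    print(name, " ".join(f"{value:.2f}" for value in row))
```

The high values fall in two 2x2 blocks along the diagonal, one per family, while cells comparing samples from different families stay near zero.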
How do you turn a similarity matrix into a graph?
Add an edge between a pair of malware samples if the similarity (Jaccard Index) between them is greater than a threshold.
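A minimal sketch of the thresholding step, assuming the pairwise similarities have already been computed. The scores, the helper name `threshold_edges` and the default threshold of 0.5 are illustrative choices, not values from the source.

```python
# Hypothetical pairwise Jaccard similarities between malware samples.
similarities = {
    ("fam1_a", "fam1_b"): 0.60,
    ("fam1_b", "fam2_a"): 0.17,
    ("fam2_a", "fam2_b"): 0.50,
    ("fam1_a", "fam2_b"): 0.00,
}


def threshold_edges(similarities, threshold=0.5):
    """Keep only the pairs whose similarity is strictly greater than threshold."""
    return {pair: score for pair, score in similarities.items() if score > threshold}


print(threshold_edges(similarities))
print(threshold_edges(similarities, threshold=0.4))
```

Lowering the threshold adds edges and merges clusters; raising it removes edges and splits them, so the value is usually chosen by inspecting the resulting graph.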