Week 10 Lecture 1: Graph Analysis for Cyber-Attacks and Malware

Flashcards about Graph Analysis for Cyber-Attacks and Malware

28 Terms

1

What does Network Analysis allow us to do?

Understand how different malware samples are related by looking for commonalities across large numbers of samples, based on analysis of their shared attributes.

2

What are some examples of shared attributes used in Network Analysis?

IP Addresses, Hostnames, Strings, Code signatures, Graphics

3

What can we identify by analyzing the relationships between large numbers of malware samples?

Cyber-attack campaigns, common or novel malware tactics, sources of malware (e.g. malware-infected websites/servers), command and control networks, and signatures of known attackers/organisations.

4

What does a graph consist of?

Vertices (also called nodes) and edges.

5

In a graph, what do vertices and edges typically represent?

Vertices represent the objects being studied (e.g., malware samples), and edges represent the relationships between vertices (e.g., two malware samples receive commands from the same IP address).
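
As a minimal sketch of this idea, the snippet below builds a graph in which vertices are malware samples and an edge joins two samples that share a callback IP address. The sample names and IP addresses are invented for illustration.

```python
from itertools import combinations

# Hypothetical data: sample name -> callback IP addresses observed for it.
callbacks = {
    "sample_a": {"10.0.0.1", "10.0.0.2"},
    "sample_b": {"10.0.0.2"},
    "sample_c": {"10.0.0.3"},
}

# Vertices are the malware samples being studied.
vertices = set(callbacks)

# An edge joins two samples that share at least one callback IP address.
edges = {
    (u, v)
    for u, v in combinations(sorted(vertices), 2)
    if callbacks[u] & callbacks[v]
}

print(edges)
```

Here only sample_a and sample_b are connected, because they are the only pair with a callback IP address in common.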

6

How does plotting a graph of malware connections help in identifying related malware?

Similar malware files will share many edges and cluster together, aiding in identifying clusters of related malware.

7

Besides malware samples, what else can be treated as nodes in a graph?

The attributes of the malware samples, such as callback IP addresses.

8

How does including IP addresses as nodes help in graph analysis?

It helps identify command and control servers: an IP address node that is connected to many malware samples is common to all of them.
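
One way to sketch this: treat each (sample, IP address) pair as an edge, then look for IP address nodes linked to many samples. The function name `shared_ips`, the `min_samples` cut-off, and the data are assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical (malware sample, callback IP) edges, using documentation IP ranges.
sample_ip_edges = [
    ("sample_a", "203.0.113.5"),
    ("sample_b", "203.0.113.5"),
    ("sample_c", "203.0.113.5"),
    ("sample_c", "198.51.100.7"),
]

def shared_ips(edges, min_samples=2):
    """Return {ip: samples} for IP nodes linked to at least min_samples samples."""
    samples_by_ip = defaultdict(set)
    for sample, ip in edges:
        samples_by_ip[ip].add(sample)
    return {ip: s for ip, s in samples_by_ip.items() if len(s) >= min_samples}

# IP addresses contacted by three or more samples are candidates worth a closer look.
candidates = shared_ips(sample_ip_edges, min_samples=3)
print(candidates)
```

A highly connected IP address is only a candidate; it still needs to be confirmed as a command and control server by other means.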

9

How can attributes enrich a graph in malware analysis?

Edges can be weighted by the percentage of shared code between malware samples, and nodes can have attributes like file size.
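
A small sketch of an attributed graph, assuming plain dictionaries as the storage format. The file sizes, shared-code fractions, and the `strong_edges` helper are hypothetical.

```python
# Node attributes: sample name -> attribute dictionary (hypothetical values).
node_attrs = {
    "sample_a": {"file_size": 48_128},
    "sample_b": {"file_size": 51_200},
    "sample_c": {"file_size": 9_216},
}

# Edge weights: unordered sample pair -> fraction of shared code (0.0 to 1.0).
edge_weight = {
    frozenset({"sample_a", "sample_b"}): 0.82,
    frozenset({"sample_b", "sample_c"}): 0.15,
}

def strong_edges(weights, min_weight):
    """Return the sample pairs whose shared-code weight is at least min_weight."""
    return {pair for pair, w in weights.items() if w >= min_weight}

print(strong_edges(edge_weight, 0.5))
```

Keeping the weight on the edge means the same graph can be filtered at different strengths without being rebuilt.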

10

What does this real-world example graph show?

This graph shows a group of malware samples (nodes) and their connections (edges). The edges link malware samples that “call back” to the same hostnames and IP addresses, indicating that they were likely deployed by the same attacker(s).

11

What can we use the connections between malware samples for?

The connections can be used to differentiate between a coordinated attack and random attackers.

12

What is a bipartite graph?

A graph whose nodes can be divided into two groups such that neither group contains internal connections: every edge joins a node in one group to a node in the other.

13

What are the two groups of nodes in a bipartite graph that shows shared attributes between malware samples?

One group is for malware samples, and the other group is for the domain names the malware samples “call back” to.

14

What specific characteristic determines that the graph is bipartite?

Callbacks never directly connect to other callbacks, and malware samples never directly connect to other malware samples.
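
This property can be checked mechanically. The sketch below assumes the two groups are already known (samples and callback domains) and verifies that every edge crosses between them. The function name and data are illustrative.

```python
def is_bipartite_split(edges, group_a, group_b):
    """Return True if every edge joins a node in group_a to a node in group_b."""
    return all(
        (u in group_a and v in group_b) or (u in group_b and v in group_a)
        for u, v in edges
    )

# Hypothetical groups and edges.
samples = {"sample_a", "sample_b"}
domains = {"evil.example", "bad.example"}
edges = [
    ("sample_a", "evil.example"),
    ("sample_b", "evil.example"),
    ("sample_b", "bad.example"),
]

print(is_bipartite_split(edges, samples, domains))

# A direct sample-to-sample edge breaks the bipartite property.
print(is_bipartite_split(edges + [("sample_a", "sample_b")], samples, domains))
```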

15

How does a bipartite graph projection let us examine malware sample similarity?

The projection links malware samples only if they have attributes in the other partition in common, such as sharing the same ‘call back’ domain names.
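
A minimal sketch of projecting a sample/domain bipartite graph onto the sample partition. The function name `project_onto_samples` and the data are assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations

def project_onto_samples(sample_domain_edges):
    """Link two samples if they share at least one callback domain."""
    samples_by_domain = defaultdict(set)
    for sample, domain in sample_domain_edges:
        samples_by_domain[domain].add(sample)

    projected = set()
    for samples in samples_by_domain.values():
        # Every pair of samples attached to the same domain gets an edge.
        for u, v in combinations(sorted(samples), 2):
            projected.add((u, v))
    return projected

# Hypothetical bipartite edges: (malware sample, callback domain).
bipartite_edges = [
    ("s1", "evil.example"),
    ("s2", "evil.example"),
    ("s2", "other.example"),
    ("s3", "other.example"),
]

print(project_onto_samples(bipartite_edges))
```

In the projected graph s1 and s3 are not directly linked, because they share no domain, but both connect to s2.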

16

What do the connections between malware samples on the bipartite projection show?

They help to reveal the overall “social network” of these malware samples and can show related malware used in different attack campaigns.

17

When visualizing malware graphs, how should nodes be displayed to reflect their relationships?

Neighbouring nodes should be close together on screen, and the distance between nodes should reflect their path distance in the graph.

18

What is the distortion problem in visualizing malware graphs?

As the number of nodes increases, distortion must be introduced to display all nodes, and it can only be minimized, not eliminated.

19

How do force-directed algorithms minimize layout distortion in malware graphs?

Force-directed algorithms minimize layout distortion using physical simulations of spring-like forces between nodes.
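
To make the spring analogy concrete, here is a deliberately simplified sketch: every pair of nodes repels, and every edge behaves like a spring with a preferred length. It is not a specific published algorithm or any library's layout routine, and all parameter names and default values are assumptions chosen for illustration.

```python
import math
import random

def force_directed_layout(nodes, edges, iterations=500, spring_length=1.0,
                          repulsion=0.1, step=0.05, max_move=0.5, seed=0):
    """Return {node: (x, y)} from a simple spring-and-repulsion simulation."""
    nodes = list(nodes)
    rng = random.Random(seed)
    pos = {n: [rng.uniform(-1, 1), rng.uniform(-1, 1)] for n in nodes}

    for _ in range(iterations):
        force = {n: [0.0, 0.0] for n in nodes}

        # Every pair of nodes repels, which spreads the layout out.
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                dx = pos[u][0] - pos[v][0]
                dy = pos[u][1] - pos[v][1]
                dist = math.hypot(dx, dy) or 1e-9
                push = repulsion / dist ** 2
                force[u][0] += push * dx / dist
                force[u][1] += push * dy / dist
                force[v][0] -= push * dx / dist
                force[v][1] -= push * dy / dist

        # Each edge acts like a spring pulling its ends toward spring_length.
        for u, v in edges:
            dx = pos[v][0] - pos[u][0]
            dy = pos[v][1] - pos[u][1]
            dist = math.hypot(dx, dy) or 1e-9
            pull = dist - spring_length
            force[u][0] += pull * dx / dist
            force[u][1] += pull * dy / dist
            force[v][0] -= pull * dx / dist
            force[v][1] -= pull * dy / dist

        # Move each node a small step along its net force, capped at max_move.
        for n in nodes:
            fx, fy = force[n]
            magnitude = math.hypot(fx, fy)
            scale = step if step * magnitude <= max_move else max_move / magnitude
            pos[n][0] += scale * fx
            pos[n][1] += scale * fy

    return {n: tuple(p) for n, p in pos.items()}

layout = force_directed_layout(["a", "b", "c"], [("a", "b"), ("b", "c")])
print(layout)
```

After enough iterations, connected nodes settle at roughly the spring length from each other, so neighbouring nodes end up close together on screen.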

20

How can shared code analysis be used to help with reverse engineering?

Shared code analysis reveals previously analyzed code shared with other malware, helping with reverse engineering and uncovering the deployer of the malware.

21

What does shared code analysis do?

Shared code analysis compares malware samples to estimate the percentage of shared code.

22

What are the different types of features for Shared Code Analysis?

Assembly instruction sub-sequences, Strings, Import Address Table, and Dynamic API calls.

23

What is Bag of Words (BoW)?

A method where the description of each malware sample is based only on the presence or absence of features, discarding the order of features.

24

What are the steps to use Bag of Words (BoW)?

Specify a vocabulary, represent each malware sample in terms of the features, and specify a similarity function for comparing malware samples.
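
The first two steps can be sketched as follows, assuming features have already been extracted as lists of strings (here, invented import names and a domain string). The helper names are hypothetical.

```python
def build_vocabulary(samples):
    """Step 1: the vocabulary is every distinct feature seen in any sample."""
    return sorted(set().union(*samples.values()))

def to_presence_vector(features, vocabulary):
    """Step 2: mark each vocabulary term as present (1) or absent (0).

    Order and repeat counts in the original feature list are discarded.
    """
    present = set(features)
    return [1 if term in present else 0 for term in vocabulary]

# Hypothetical extracted features per sample.
samples = {
    "sample_a": ["CreateFileA", "connect", "connect", "evil.example"],
    "sample_b": ["connect", "RegSetValueA"],
}

vocabulary = build_vocabulary(samples)
vectors = {name: to_presence_vector(f, vocabulary) for name, f in samples.items()}
print(vocabulary)
print(vectors)
```

The third step, the similarity function, then compares these presence/absence descriptions; the Jaccard Index is one such function.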

25

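How do you calculate the Jaccard Index?

Jaccard Index = (number of attributes shared by both samples) / (total number of distinct attributes across both samples), i.e. |A ∩ B| / |A ∪ B| for attribute sets A and B.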
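
A short sketch of the calculation on two feature sets. Returning 0.0 when both sets are empty is an assumption made here to avoid dividing by zero; the feature names are invented.

```python
def jaccard_index(a, b):
    """Shared attributes divided by total distinct attributes across both sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # Assumption: two empty attribute sets count as dissimilar.
    return len(a & b) / len(union)

# Two shared attributes (b, c) out of four distinct attributes (a, b, c, d).
print(jaccard_index({"a", "b", "c"}, {"b", "c", "d"}))  # 0.5
```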

26

What does the similarity matrix allow us to do?

The similarity matrix allows us to visually compare the similarity of large numbers of malware samples.

27

What do the cells of the similarity matrices demonstrate?

Square sections of bright cells show that all samples from the same family are similar to each other.
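
The block pattern can be reproduced numerically. This sketch assumes samples are ordered by family and uses the Jaccard Index as the similarity function; the family names and features are invented.

```python
def jaccard_index(a, b):
    """Shared attributes divided by total distinct attributes."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def similarity_matrix(names, features):
    """Cell [r][c] holds the similarity between sample r and sample c."""
    return [[jaccard_index(features[r], features[c]) for c in names] for r in names]

# Hypothetical feature sets, listed in family order.
features = {
    "fam1_a": {"x", "y", "z"},
    "fam1_b": {"x", "y"},
    "fam2_a": {"p", "q"},
    "fam2_b": {"p", "q", "r"},
}
names = list(features)
matrix = similarity_matrix(names, features)

for name, row in zip(names, matrix):
    print(name, [round(value, 2) for value in row])
```

High values cluster in the top-left and bottom-right 2x2 blocks (one per family), while the cross-family cells are zero.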

28

How do you represent a similarity matrix as a graph?

Add an edge between a pair of malware samples if the similarity (Jaccard Index) between them is greater than a threshold.
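
The thresholding rule can be sketched as below. The threshold value of 0.5, the function name, and the feature sets are assumptions for illustration; a suitable threshold depends on the data.

```python
from itertools import combinations

def jaccard_index(a, b):
    """Shared attributes divided by total distinct attributes."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def similarity_graph(features, threshold):
    """Return {(u, v): similarity} for sample pairs above the threshold."""
    edges = {}
    for u, v in combinations(sorted(features), 2):
        similarity = jaccard_index(features[u], features[v])
        if similarity > threshold:
            edges[(u, v)] = similarity
    return edges

# Hypothetical feature sets for four samples.
features = {
    "fam1_a": {"x", "y", "z"},
    "fam1_b": {"x", "y"},
    "fam2_a": {"p", "q"},
    "fam2_b": {"p", "q", "r"},
}

print(similarity_graph(features, threshold=0.5))
```

Raising the threshold removes weaker edges and splits the graph into tighter clusters; lowering it merges clusters.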