Software Side Channels - 18

Last updated 1:30 PM on 4/15/26

26 Terms

1

What is a Microarchitectural Attack?

Abuse hardware side-channels from software by exploiting microarchitectural components

2

What is Microarchitecture?

  • (Assembly) instructions are the lowest abstraction in programming

    • Defined by the Instruction Set Architecture

  • But hardware needs to execute these instructions, including:

    • Decoding instructions

    • Interacting with memory

    • Detecting faults

  • All of this is implemented in the microarchitecture

  • Analogy: the CPU is the interpreter for assembly instructions. The microarchitecture is the "code" for the interpreter

3

What are Micro-Architectural Components?

  • Modern CPUs are complex

    • Multi-stage instruction pipelines

    • Out-of-order execution

    • Branch prediction

    • MicroOps

  • Huge gains for performance

    • But not designed for security

4

What are the types of Micro-Architectural Components?

Branch Predictors and CPU caches

5

What are Branch Predictors?

  • Control flow of programs is decided by branches

    • The branch prediction unit (BPU) tries to predict the target for each branch

  • General good heuristic:

    • "For already encountered branches, assume the same branch target again"

  • Often true in practice, e.g.:

    • Loops

    • Frequently called functions

    • Indirect branches for class methods in OOP

6

What is Speculative Execution?

  • CPU assumes the prediction of the BPU is correct

    • And begins to speculatively execute the instructions at the predicted branch target

    • No need to wait on the computation result for data-dependent branches

    • This happens in the microarchitecture(!)

  • If prediction was correct:

    • Huge performance gain

    • Results from speculative execution are architecturally committed

  • If prediction was incorrect ("misprediction"):

    • Discard results of speculative execution

    • Restart pipeline at correct branch target

    • More or less the same performance as if branch prediction were not in place

7

What is Direct Branch Prediction?

  • Branches to fixed targets

    • May be conditional

    • In (x86) assembly, e.g.: jmp, jnz, jz, etc

  • Performed via the "Pattern History Table" (PHT)

    • Keeps track of previously taken/non-taken history

    • Uses the resulting history as index into the pattern table

    • Optional: Also as index into branch target address cache

  • Simplest form: Pattern table is a saturating 2-bit counter:

    • MSB used for prediction

    • For taken branches, increment entry

    • For not-taken, decrement entry

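The saturating 2-bit counter described above can be sketched as a small simulation (purely illustrative; real PHTs index many such counters by branch history):

```python
# Toy saturating 2-bit counter: counts 0..3, the MSB is the prediction.

def update(counter, taken):
    """Increment on a taken branch, decrement on not-taken, saturating at 0 and 3."""
    return min(counter + 1, 3) if taken else max(counter - 1, 0)

def predict(counter):
    """MSB of the 2-bit counter: values 2 and 3 predict 'taken'."""
    return counter >= 2

# A loop branch: taken three times, then falls through once.
counter = 0
for taken in [True, True, True, False]:
    counter = update(counter, taken)

print(predict(counter))  # True: one not-taken branch doesn't flip the prediction
```

This hysteresis is why a single loop exit does not mispredict the next run of the loop.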
8

What is Indirect Branch Prediction?

  • Branches with targets computed at run-time

    • In (x86) assembly, e.g.: jmp [rax], call [rdi]

    • More complicated to predict

  • Prediction via:

    • Branch History Buffer (BHB)

    • Branch Target Buffer (BTB)

  • Creates tags for prediction by combining:

    • Source

    • Destination

    • History

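A toy model of the tag-based lookup described above (the class name, tag scheme, and addresses are invented for illustration; real BTBs hash and truncate these values):

```python
# Toy indirect-branch predictor: a table keyed by a tag combining branch
# source and recent history, mapping to the predicted target.

class ToyBTB:
    def __init__(self):
        self.table = {}

    def _tag(self, source, history):
        # Real hardware hashes/truncates these; the toy keeps the full tuple.
        return (source, tuple(history))

    def train(self, source, history, target):
        """Record the resolved target for this (source, history) pair."""
        self.table[self._tag(source, history)] = target

    def predict(self, source, history):
        """Predicted target, or None if this tag was never seen."""
        return self.table.get(self._tag(source, history))

btb = ToyBTB()
btb.train(0x401000, [1, 0, 1], 0x402000)      # learn: call site -> method
print(hex(btb.predict(0x401000, [1, 0, 1])))  # 0x402000
```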
9

What problem do CPU caches solve?

Accessing data from memory is a bottleneck

  • Data transfer via bus

  • Memory controller to map physical addresses to DIMMs

  • Complex organization of memory on DIMM (Banks, Ranks, Rows)

10

How do CPU caches solve this problem?

CPU Caches:

  • Microarchitectural component between memory and CPU

  • Caches memory contents

  • Usually implemented with SRAM

  • Small sizes

  • Fast access times

11

What are the main ideas with CPU Caches?

  • Previously accessed data & instruction:

    • Have a high likelihood to be accessed again

    • Let's keep them close to the CPU!

  • When data is loaded from DIMM, bring it into cache

    • Subsequent operations are on value in cache

    • Huge performance optimization

  • When cache is full and new values are loaded from memory:

    • Write data back from cache to memory ("evict" old values)
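The load-into-cache, evict-when-full idea above can be modeled with a tiny cache (least-recently-used eviction and the address-to-value "memory" dict are simplifications for this sketch; real CPUs use set-associative structures and approximate LRU):

```python
from collections import OrderedDict

# Tiny cache model: a load fills the cache, and when the cache is full
# the least-recently-used line is evicted.

class ToyCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()  # address -> cached value, in LRU order

    def access(self, addr, memory):
        if addr in self.lines:              # hit: serve from cache, fast
            self.lines.move_to_end(addr)
            return self.lines[addr], "hit"
        value = memory[addr]                # miss: slow fetch from memory
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)  # evict least-recently-used line
        self.lines[addr] = value
        return value, "miss"

memory = {a: a * 10 for a in range(8)}
cache = ToyCache(capacity=2)
print(cache.access(0, memory)[1])  # miss: first touch
print(cache.access(0, memory)[1])  # hit: now cached
print(cache.access(1, memory)[1])  # miss
print(cache.access(2, memory)[1])  # miss: cache full, evicts line 0
print(cache.access(0, memory)[1])  # miss: 0 was evicted
```

The hit/miss pattern in the final lines is exactly the observable that the cache attacks below exploit.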

12

What are the types of caches?

  • I-Cache: Instruction Cache, for executed code

  • D-Cache: Data Cache, for accessed data

13

What is the cache granularity?

  • Organized in ā€œLinesā€

  • Usual cache line size: 64 bytes

  • Single memory access brings (or evicts) full line from cache
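With 64-byte lines, mapping an address to its cache line is plain integer division; a sketch:

```python
# Everything inside the same 64-byte window shares one cache line,
# so a single access caches (or evicts) all of it.

LINE_SIZE = 64  # bytes; a common line size on current x86 CPUs

def line_of(addr):
    """Index of the cache line containing this address."""
    return addr // LINE_SIZE

print(line_of(0x1000))       # 64
print(line_of(0x1000 + 63))  # 64: same line, one access caches both bytes
print(line_of(0x1000 + 64))  # 65: first byte of the next line
```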

14

What is the Cache Hierarchy?

  • L1: Very Small & Very Fast

    • Typically located directly on the CPU

  • L2: Small & Fast

    • Typically located on CPU or in proximity

  • L3: Large & Slow

    • Typically shared across multiple cores

  • Memory: Very large & very slow

15

What side channel does the CPU cache introduce?

A timing side channel: accesses to cached data are fast, while accesses that must go to memory are slow, so access latency reveals whether an address was recently used
16

What is the Flush+Reload cache attack?

  • Leaks information from a victim process to a spy process

    • Assumes shared memory between victim and spy

    • Can even work via L3

    • Shared code and libraries(!)

  • Strategy:

    • 1) Spy flushes the monitored cache line

    • 2) Victim carries out operations

    • 3) Spy probes cache and measures time:

      • Fast access: victim brought value into cache

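The three steps can be simulated over a toy model of the shared cache (a set of "cached addresses" stands in for real cache state, and "fast"/"slow" for measured latency; this is a protocol sketch, not a working attack):

```python
# Toy simulation of the Flush+Reload protocol.

shared_cache = set()  # lines of shared memory currently cached

def flush(addr):
    """Step 1: the spy flushes the monitored line from the cache."""
    shared_cache.discard(addr)

def victim(secret_addr):
    """Step 2: the victim's secret-dependent access caches the line."""
    shared_cache.add(secret_addr)

def reload(addr):
    """Step 3: the spy probes; a fast reload means the line was cached."""
    return "fast" if addr in shared_cache else "slow"

monitored = [0x10, 0x20, 0x30]  # e.g. code lines of a shared library
for addr in monitored:
    flush(addr)
victim(0x20)                    # victim touches one monitored line
leak = [a for a in monitored if reload(a) == "fast"]
print([hex(a) for a in leak])   # ['0x20']: the spy learns which line was used
```

In a real attack, `flush` is `clflush` and `reload` is a timed memory access via a cycle counter.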
17

What are some other cache attacks?

  • Flush & Flush

  • Evict & Reload

  • Prime & Probe

18

What is the Flush & Flush attack?

  • Mechanism: Flush takes longer when data is cached

  • Use-case: Same as Flush+Reload, but stealthier

19

What is the Evict & Reload attack?

  • Mechanism: Use cache eviction policies to evict sensitive data from cache

  • Use-case: clflush (or similar) instruction unavailable, shared memory available

20

What is the Prime & Probe attack?

  • Mechanism:

    • (1) Attacker primes the cache by filling it with its own data.

    • (2) If a later access is slow, the victim accessed the corresponding cache line in between

  • Use-case: clflush (or similar) instruction unavailable, shared memory unavailable
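A toy model of the two steps (a direct-mapped cache with a made-up set count; "slow" is inferred from line ownership instead of measured timing):

```python
# Toy Prime+Probe: no shared memory or flush instruction needed --
# the attacker only observes which cache set lost its data.

N_SETS = 4
cache = [None] * N_SETS  # direct-mapped toy cache: one owner per set

def access(owner, set_index):
    cache[set_index] = owner  # whoever accesses last owns the set's line

# (1) Attacker primes: fills every set with its own data.
for s in range(N_SETS):
    access("attacker", s)

# Victim makes a secret-dependent access that maps to set 2.
access("victim", 2)

# (2) Attacker probes: a set that lost the attacker's data would be slow.
slow_sets = [s for s in range(N_SETS) if cache[s] != "attacker"]
print(slow_sets)  # [2]: reveals which set the victim touched
```

Note the attacker learns only the cache *set*, not the exact address, which is why Prime+Probe works without shared memory.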

21

What do these cache attacks allow?

  • Breaking cryptography by learning:

    • Which parts of the code were executed (e.g., Square-and-Multiply in RSA)

    • Which parts of the data were accessed (e.g., look-up tables in AES)

  • Building covert channels

    • Transmitting information between two processes not allowed to communicate

  • Exfiltrating data during transient execution attacks

    • We can probe whether certain data ended up in cache

22

What is the Platypus attack?

Uses Intel's Running Average Power Limit (RAPL) functions to measure power consumption of target operations

  • Similar to traditional power side channels

23

What is RAPL?

Running Average Power Limit

24

How can you measure instruction energy consumption using RAPL?

  • RAPL allows measuring the consumed energy over a sampling period

    • Update intervals up to 50µs

  • Energy is closely related to power consumption:

    • Energy is the power consumption accumulated over time

  • RAPL measurements can target four different domains

    • package (PKG), power planes (PP0 and PP1), and DRAM

  • Similar interfaces available for AMD

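Since RAPL reports cumulative energy, average power over a window is the energy delta divided by the time delta; a sketch with made-up sample values (not real RAPL readings):

```python
# Average power from two cumulative energy samples: P = dE / dt.

def avg_power_w(e0_uj, e1_uj, dt_us):
    """Average power in watts from two cumulative energy samples.

    Microjoules divided by microseconds equals joules per second (watts).
    """
    return (e1_uj - e0_uj) / dt_us

# Two hypothetical PKG-domain samples taken 50 microseconds apart.
print(avg_power_w(1_000_000, 1_000_800, 50))  # 16.0 (watts)
```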
25

What is an example platypus with an unprivileged attacker?

  • Breaking KASLR within 20 seconds

  • Using the (unprivileged) powercap interface directly

    • Requires Intel TSX for fault suppression

    • Not possible on modern systems

26

What is an example platypus with a privileged attacker?

  • Side-channel attack on execution in SGX enclave

  • Recovering RSA private keys within 100 minutes

  • Leaking AES-NI keys within 26 to 277 hours