Topic 3 - Files and Indexing (Drives, RAID, Indices)

Last updated 12:05 AM on 2/23/26

40 Terms

1
New cards

What are the Primary/Secondary/Tertiary types in the Storage Pyramid in order (fastest to slowest)?

Primary: CPU registers → Cache Memory → Main Memory

Secondary: Solid State Drives, Hard Disk Drives

Tertiary Storage: Tapes/Optical Disk Libraries

2
New cards

Which tier of the Storage Pyramid is volatile, and what are its types in order (fastest to slowest)?

The Primary Storage types are all volatile
CPU Registers → Cache Memory → Main Memory

3
New cards

What are the 4 main physical components of a Hard Drive?

  1. Platters (hold the data)

  2. Spindle (the axle that spins the platters)

  3. R/W Heads (read and write the data)

  4. Actuator Arm (moves the R/W heads over the platters)

4
New cards

What is a sector/block?

The basic unit of data transfer between the disk and main memory

5
New cards

Explain the 3 sources of delay (use names) for R/W operations on a HDD

  1. Seek Time - Physical movement of the R/W heads takes time

  2. Rotational Latency - Must wait for the block (on the platter) to rotate to where the head is positioned

  3. Transfer Time - R/W data from/to the platter surface
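
The three delays add up to the access time for one block. A minimal sketch, assuming the common textbook approximation that the average rotational latency is half of one full rotation; the function name, its parameters, and the sample numbers are hypothetical and not from the cards.

```python
def access_time_ms(seek_ms, rpm, transfer_rate_mb_s, block_bytes):
    """Estimate the time to read one block: seek + rotational latency + transfer."""
    # One full rotation in milliseconds.
    rotation_ms = 60_000 / rpm
    # Assumption: on average the head waits half a rotation for the block.
    rotational_latency_ms = rotation_ms / 2
    # Transfer time for the block (1 MB is taken as 10**6 bytes here).
    transfer_ms = block_bytes / (transfer_rate_mb_s * 1_000_000) * 1000
    return seek_ms + rotational_latency_ms + transfer_ms


# Illustrative numbers only: 9 ms seek, 7200 RPM, 100 MB/s, 4096-byte block.
print(round(access_time_ms(9, 7200, 100, 4096), 3))
```

With numbers like these, seek time and rotational latency dominate and the transfer time is a small fraction of the total.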

6
New cards

What is disk mirroring?

Disk mirroring is the technique of writing the same data to two drives, so that the system effectively has two identical copies of the drive.

7
New cards

What are the advantages/disadvantages to RAID 1?

Advantages:

  • System survives if one of the drives fails

  • Opens the possibility for reading from both drives simultaneously

Disadvantages:

  • Expense of 2x the # of drives

8
New cards

What is RAID 0?

Striping

9
New cards

What is RAID 1?

Mirroring

10
New cards

What is disk striping?

Disk striping is a technique where logical units of data (like a file) are distributed across several disks
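
One way to picture striping is as a mapping from a logical block number to a disk and a position on that disk. A minimal sketch, assuming round-robin placement with a stripe unit of exactly one block; the function name `locate_stripe_unit` is hypothetical.

```python
def locate_stripe_unit(logical_block, num_disks):
    """Map a logical block number to (disk index, block offset on that disk)."""
    # Assumption: blocks are dealt out round-robin, one block per disk per stripe.
    disk = logical_block % num_disks
    offset = logical_block // num_disks
    return disk, offset


# Consecutive logical blocks land on different disks, so they can be accessed in parallel.
for block in range(6):
    print(block, locate_stripe_unit(block, 3))
```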

11
New cards

What are the advantages / disadvantages of RAID 0?

Advantages:

  • Increases performance (since read/writes can be parallelized)

Disadvantages:

  • Increased probability/chance of the system failing due to one of the drives failing (if one goes out, the whole system is unrecoverable)

  • Lack of data replication

12
New cards

What are the four kinds of parity schemes in RAID?

even, odd, mark, none

13
New cards

Describe how the parity scheme works in RAID 4.

In RAID 4, a single dedicated drive stores the parity computed across the data drives. If one of the drives fails, its contents can be reconstructed from the parity drive and the data on the surviving drives.
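
The reconstruction step can be sketched with bytewise XOR, which is the usual way parity is computed; the cards do not name the operation, so treat XOR here as an assumption. The helper `xor_blocks` and the sample bytes are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data drives, each holding one 2-byte block of the same stripe.
data = [b"\x0f\xa0", b"\x33\x0c", b"\x55\xff"]

# The parity drive stores the XOR of the data blocks.
parity = xor_blocks(data)

# Simulate losing data drive 1: XOR the survivors with the parity block.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt == data[1])
```

The same function rebuilds a lost parity block: XOR the data blocks again.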

14
New cards

What is the minimum number of physical disks required for the parity schemes in RAID 4 / RAID 5, and why?

Parity schemes require at least 3 drives. With only 2 disks, the parity disk would essentially be a mirror of the single data disk.

In the minimum case of 3 drives, if any 1 drive dies, the system survives: a lost data drive is rebuilt from the parity and the surviving data drive, and a lost parity drive is recomputed onto a replacement drive from the data drives.

15
New cards

What is the difference between RAID 4 and RAID 5?

RAID 4 and RAID 5 both use a parity scheme to survive drive failures

RAID 4 stores all parity data on a single dedicated disk
RAID 5 distributes/stripes the parity data across all disks
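
The difference shows up in which disk holds the parity block of each stripe. A minimal sketch, assuming RAID 4 keeps parity on the last disk and RAID 5 rotates it backwards by one disk per stripe; real RAID 5 layouts vary, and the function name `parity_disk` is hypothetical.

```python
def parity_disk(stripe, num_disks, level):
    """Return the index of the disk holding the parity block for a stripe."""
    if level == 4:
        # Assumption: the dedicated parity disk is the last one.
        return num_disks - 1
    if level == 5:
        # Assumption: one possible rotation; parity moves one disk per stripe.
        return (num_disks - 1 - stripe) % num_disks
    raise ValueError("only RAID 4 and RAID 5 are sketched here")


for stripe in range(4):
    print(stripe, parity_disk(stripe, 4, 4), parity_disk(stripe, 4, 5))
```

In RAID 4 every write must update the same parity disk; rotating the parity spreads those updates over all disks.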

16
New cards

Events A and B are independent if?

P(A ∩ B) = P(A) · P(B)
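
Independence is what allows drive-failure probabilities to be multiplied. A minimal sketch, assuming every drive fails independently with the same probability p over some fixed period; the function names and the value p = 0.1 are hypothetical.

```python
def raid0_failure_prob(p, n):
    """RAID 0 survives only if all n drives survive, so it fails otherwise."""
    return 1 - (1 - p) ** n


def raid1_failure_prob(p, copies=2):
    """A mirror loses data only if every copy fails."""
    return p ** copies


p = 0.1  # illustrative per-drive failure probability
print(raid0_failure_prob(p, 2))  # higher than a single drive
print(raid1_failure_prob(p))     # lower than a single drive
```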

17
New cards

What does RAID stand for? What did the “I” originally stand for?

Redundant Array of Independent Disks

Originally meant “Inexpensive”

18
New cards

Which RAID levels are standardized?

RAID levels 0-6

19
New cards

What is RAID 6? Min. # of disks? Main performance diff. to RAID 5? How many drives can fail concurrently and system survive?

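Distributed “Double Parity” scheme, requires at least 4 disks.

Essentially the same as RAID 5, except two independent parity blocks are stored per stripe, which allows up to 2 drives to fail concurrently.

Compared to RAID 5, RAID 6 has slower write performance, since two parity blocks must be updated on each write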
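
The cost of each level's redundancy can be compared by counting how many disks' worth of space hold user data. A minimal sketch, assuming all disks are the same size and RAID 1 keeps a full copy on every disk; the minimum of 2 disks for RAID 0 and RAID 1 is an assumption, and the names `MIN_DISKS`, `PARITY_DISKS` and `usable_disks` are hypothetical.

```python
# Minimum disk counts: the RAID 4/5/6 values come from the cards; 0 and 1 are assumed.
MIN_DISKS = {0: 2, 1: 2, 4: 3, 5: 3, 6: 4}
# Disks' worth of space spent on parity (RAID 1 is handled separately).
PARITY_DISKS = {0: 0, 4: 1, 5: 1, 6: 2}


def usable_disks(level, n):
    """Return how many disks' worth of capacity hold user data."""
    if level not in MIN_DISKS:
        raise ValueError(f"unsupported RAID level: {level}")
    if n < MIN_DISKS[level]:
        raise ValueError(f"RAID {level} needs at least {MIN_DISKS[level]} disks")
    if level == 1:
        return 1  # every disk holds the same copy
    return n - PARITY_DISKS[level]


for level in (0, 1, 5, 6):
    print(f"RAID {level} with 4 disks: {usable_disks(level, 4)} disks of usable space")
```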

20
New cards

What is the technology behind SSD storage called?

NAND flash memory (NAND-based non-volatile memory)

21
New cards

What are the advantages/disadvantages to SSDs?

Advantages

  • No mechanical parts

  • Excellent read performance

  • Smaller physical size

Disadvantages

  • Write speed is slower than read speed

  • # of erasures per cell is limited (cells eventually die)

  • More expensive than HDDs (per unit storage)

22
New cards

What is the definition of Blocking Factor (bf)?

The number of whole records that can be stored in a single block

23
New cards

What is Internal Fragmentation?

Unallocatable storage within an allocation unit (such as a block)

24
New cards

Give a brief informal description of how to calculate the byte amount of Internal Fragmentation in a block

Calculate the blocking factor, bf (the # of whole records that fit in a block)

Multiply bf by the record size to get the bytes in use, then subtract that from the block size. The remainder is the IF size in bytes
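
The calculation can be sketched in a few lines. This assumes fixed-length records that may not span blocks and a block with no header overhead; the function names and the sample sizes are hypothetical.

```python
def blocking_factor(block_size, record_size):
    """Number of whole records that fit in one block."""
    return block_size // record_size


def internal_fragmentation(block_size, record_size):
    """Bytes left over in a block after storing bf whole records."""
    used = blocking_factor(block_size, record_size) * record_size
    return block_size - used


# Illustrative sizes: 4096-byte blocks and 300-byte records.
print(blocking_factor(4096, 300))         # 13 records per block
print(internal_fragmentation(4096, 300))  # 196 bytes unusable per block
```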

25
New cards

What are the two ways of locating records within a block? Explain both.

  1. Fixed-Length Records

  • a) Packed/Contiguous Allocation (use the record count, qty, to index)

  • b) Unpacked Allocation (use a bitmap field to tell which slots are occupied)

  2. Variable-Length Records

  • Requires a record directory (location + length for each record)
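
The three lookups can be sketched side by side. This is an illustration under simplifying assumptions: the block is a plain byte string, the bitmap is a list of booleans, and the directory is a list of (offset, length) pairs; all function names are hypothetical.

```python
def packed_offset(i, record_size, qty):
    """Packed fixed-length records: record i starts at i * record_size."""
    if not 0 <= i < qty:
        raise IndexError("no such record in this block")
    return i * record_size


def unpacked_offset(slot, record_size, bitmap):
    """Unpacked fixed-length records: the bitmap says whether a slot is in use."""
    if not bitmap[slot]:
        return None  # empty slot
    return slot * record_size


def variable_record(block, directory, i):
    """Variable-length records: the directory gives (offset, length) per record."""
    offset, length = directory[i]
    return block[offset:offset + length]


print(packed_offset(2, 100, qty=5))
print(unpacked_offset(1, 100, [True, False, True]))
print(variable_record(b"alicebobcarol", [(0, 5), (5, 3), (8, 5)], 1))
```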

26
New cards

What is the definition of an Index?

A file containing structured references to records of another file

27
New cards

What is a Candidate Key (CK)?

A field (or set of fields) that uniquely identifies a record

28
New cards

What is a Primary Key (PK)?

The candidate key chosen to identify the records of that file

29
New cards

What is a Secondary Key?

Any non-candidate key

30
New cards

What is a Sort Key?

The key used to order records within a file

31
New cards

What are two ways of classifying an index, and strategies for both?

  1. Ordered

  • a) Single-level (Sorted file)

  • b) Multi-level (B+Tree)

  2. Unordered

  • Hashing

32
New cards

What is a Primary Index? (Specifically, what 3 characteristics describe a Primary Index)

  1. The indexed field is a candidate key

  2. The index records are sorted on the key

  3. The DB file records are sorted on the key

33
New cards

How many primary indices can exist per file?

At most one

34
New cards

What is a Clustered Index? (Specifically, what 3 characteristics describe a clustered Index)

  1. The indexed field is a secondary key

  2. The index records are sorted on the key

  3. The DB file records are sorted on the key

35
New cards

Per file, the # of primary indices + the # of clustered indices cannot _________.

Exceed one

36
New cards

Index-to-file references can be per-record or per-block. What does this mean?

This is saying that indices can point to the exact location of a record, or to the block that it is located within

37
New cards

What is a Secondary Index? (Specifically, what 3 characteristics describe a secondary Index)

  1. The indexed field is any field

  2. The index records are sorted on the key

  3. The DB file records are not sorted on the key

38
New cards

How many secondary indices can you have per file?

As many as you want, but special index construction may be needed

39
New cards

What is an example of a clustered index?

Student records where the clustered index is on the “Name” field (a non-CK field; sorting on it clusters duplicate names together)

40
New cards

What are the other two ways of categorizing indices? Explain both, give a pro for each

  1. Dense Indices

  • Holds a pointer to every record in the file of interest

  • Allows existence search w/ index only

  2. Sparse Indices

  • Holds a pointer to blocks of records in the file of interest

  • Faster to search the index, since it is smaller than dense
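
The trade-off can be sketched with a small sorted file. This assumes the data file is sorted on the key (which a sparse index needs), that the sparse index stores the first key of each block, and that a pointer is a (block, slot) pair; the sample keys and all names are hypothetical.

```python
from bisect import bisect_left, bisect_right

# A data file of three blocks, sorted on the key.
blocks = [[3, 8, 12], [17, 21, 30], [34, 40, 55]]

# Dense index: one entry per record.
dense_keys = [key for block in blocks for key in block]
dense_ptrs = [(b, s) for b, block in enumerate(blocks) for s in range(len(block))]

# Sparse index: one entry per block (the block's first key).
sparse_keys = [block[0] for block in blocks]


def dense_lookup(key):
    """Existence is decided from the index alone; no data block is read."""
    i = bisect_left(dense_keys, key)
    if i < len(dense_keys) and dense_keys[i] == key:
        return dense_ptrs[i]
    return None


def sparse_lookup(key):
    """Find the candidate block in the smaller index, then search inside it."""
    b = bisect_right(sparse_keys, key) - 1
    if b < 0:
        return None  # key is smaller than every key in the file
    if key in blocks[b]:
        return (b, blocks[b].index(key))
    return None


print(dense_lookup(21), sparse_lookup(21))
print(dense_lookup(20), sparse_lookup(20))
```

Both lookups return the same answers; the sparse index has 3 entries instead of 9 but must read a data block to confirm that a key exists.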
