What are the 5 data ownership roles?
Data Owner (accountable for the data), Data Custodian (daily maintenance and administration), Data Steward (ensures context and meaning are understood), Data Processor (manipulates/stores/moves data on behalf of owner), Data Controller (international law term for data owner).
Who is the data owner in a cloud environment?
The cloud customer. The cloud provider is the data processor. Even if the provider is negligent, the customer remains legally responsible for their data.
What is the difference between a data custodian and a data steward?
Custodian = technical role, handles daily maintenance and applies security controls. Steward = governance role, ensures data is used properly and its meaning is understood.
What is a data processor?
Any organization or person who manipulates, stores, or moves data on behalf of the data owner. The cloud provider is typically the data processor for a customer's data.
True or False: Data owners are not legally responsible for data once it is handed to a cloud provider.
False. Data owners remain legally responsible for all data they own, even if a data processor several times removed from them causes a breach.
What are the 3 common bases for data classification?
Sensitivity (impact of unauthorized disclosure), Jurisdiction (where data is stored or sourced from, e.g. EU GDPR), and Criticality (how essential the data is to organizational survival).
What is data mapping?
The process of normalizing and translating data shared between organizations or systems so that classifications and security controls carry over properly. Without it, data may lose its protections when transferred.
What is data labeling and what should a label include?
Marking data so its classification and handling requirements are clear. Labels typically include: date of creation, date of scheduled destruction, confidentiality level, handling directions, access limitations, source, jurisdiction, and applicable regulation.
What are the 3 methods of data discovery?
Label-based (uses existing labels/tags), Metadata-based (uses data about data, like file properties), and Content-based (inspects the actual content using term searches and pattern matching for items like SSNs or dates).
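As a rough illustration of content-based discovery, the sketch below scans free text with regular expressions. The two patterns (U.S. SSNs written as 123-45-6789 and ISO-style dates) and the `scan_content` helper are assumptions made for this example, not part of any particular discovery tool.

```python
import re

# Hypothetical pattern for U.S. Social Security numbers written as 123-45-6789.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# Hypothetical pattern for ISO-style dates such as 2021-03-15.
DATE_PATTERN = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


def scan_content(text: str) -> dict[str, list[str]]:
    """Return every substring of text that matches each pattern."""
    return {
        "ssn": SSN_PATTERN.findall(text),
        "date": DATE_PATTERN.findall(text),
    }


sample = "Employee 123-45-6789 was hired on 2021-03-15."
findings = scan_content(sample)
print(findings)
```

A real scanner would need more patterns and some validation of matches, since a string shaped like an SSN is not necessarily one.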
What is the difference between structured, semi-structured, and unstructured data?
Structured = organized in a relational database (easiest to search). Semi-structured = uses tags/elements like JSON or XML (flexible but searchable). Unstructured = unsorted, like emails or free-form text (hardest to search).
What type of data is stored in a MySQL database?
Structured data. Relational databases organize data into tables with a defined schema.
What type of data is JSON or XML?
Semi-structured data. It uses tags and elements to create fields without requiring rigid structure.
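A small sketch of why semi-structured data stays searchable without a rigid schema: each JSON record names its own fields, so a query can target a field even when records differ. The sample records are made up for illustration.

```python
import json

# Two records that do not share the same set of fields.
records = json.loads("""
[
  {"name": "Ada", "email": "ada@example.com"},
  {"name": "Lin", "phone": "555-0100", "tags": ["vip"]}
]
""")

# Search by field name: find the records that carry a "phone" field.
with_phone = [r["name"] for r in records if "phone" in r]
print(with_phone)
```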
What are the 6 key functions an IRM system should provide?
Persistent protection, dynamic policy control, automatic expiration, continuous auditing, replication restrictions, and remote rights revocation.
What is IRM and how does it differ from DRM?
DRM (Digital Rights Management) controls access to media like movies/games. IRM (Information Rights Management) applies DRM techniques to individual files and documents, controlling actions like editing, copying, printing, forwarding, and deleting.
What is provisioning in the context of IRM?
The process of giving users the appropriate rights and permissions based on their roles. It ensures IRM doesn't disrupt business operations while still managing data rights effectively.
What is a legal hold (litigation hold)?
A requirement to preserve data relevant to a legal case or investigation. It overrides normal data retention and destruction policies until the hold is lifted.
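The override can be pictured as one extra check in front of the normal destruction rule. This is a minimal sketch; the `Record` class and `may_destroy` function are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Record:
    name: str
    destroy_after: date      # scheduled destruction date from the retention policy
    legal_hold: bool = False


def may_destroy(record: Record, today: date) -> bool:
    """Allow destruction only if the retention period has ended and no hold is active."""
    if record.legal_hold:
        return False  # the hold overrides the retention schedule
    return today > record.destroy_after


today = date(2025, 6, 1)
expired = Record("2019 invoices", destroy_after=date(2024, 1, 1))
held = Record("2019 contracts", destroy_after=date(2024, 1, 1), legal_hold=True)
print(may_destroy(expired, today), may_destroy(held, today))
```

Once the hold is lifted (`held.legal_hold = False`), the normal retention rule applies again.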
What key items must a data retention policy address?
Retention periods, applicable regulations and compliance requirements, data classification, archiving procedures, deletion processes, and monitoring/maintenance/enforcement.
What is crypto-shredding and why is it the best option in the cloud?
Cryptographic erasure - destroying the encryption keys so the data becomes permanently unreadable. It's the only practical secure disposal method in cloud environments where physical destruction, degaussing, and overwriting aren't feasible.
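The idea can be shown with a toy example. The sketch below uses a one-time-pad XOR from the standard library purely as a stand-in for a vetted cipher and a managed key service, which a real deployment would use instead; `key_store` and `xor_bytes` are hypothetical names.

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data with a key of equal length (toy one-time pad, illustration only)."""
    if len(data) != len(key):
        raise ValueError("key must be the same length as the data")
    return bytes(d ^ k for d, k in zip(data, key))


plaintext = b"customer record 42"
key_store = {"record-42": secrets.token_bytes(len(plaintext))}  # hypothetical key store

# Only the ciphertext is written to cloud storage.
ciphertext = xor_bytes(plaintext, key_store["record-42"])

# While the key exists, the data can be recovered.
recovered = xor_bytes(ciphertext, key_store["record-42"])

# Crypto-shredding: destroy the key. The ciphertext may still sit on the
# provider's disks, but nothing remains that can turn it back into plaintext.
del key_store["record-42"]
```

The point of the design is that the customer only has to control and destroy the key, not the provider's physical media.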
What data destruction methods are available on-premises but NOT in the cloud?
Physical destruction (burning, shredding, drilling), degaussing (magnetic erasure), and overwriting. In the cloud, only crypto-shredding is reliably effective.
What is permissions creep?
When users accumulate access rights over time beyond what their current role requires, often because permissions were never removed when roles changed. It's a risk that data labeling and IRM provisioning help prevent.
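One simple way to spot permissions creep is to compare what a user holds against what the current role requires. The role names and permission strings below are hypothetical.

```python
# Hypothetical role definitions: the permissions each role actually needs.
ROLE_PERMISSIONS = {
    "analyst": {"read_reports"},
    "engineer": {"read_reports", "deploy_code"},
}


def excess_permissions(current_role: str, granted: set[str]) -> set[str]:
    """Return permissions a user holds that the current role does not require."""
    return granted - ROLE_PERMISSIONS[current_role]


# A user who moved from engineer to analyst but kept the old rights.
granted = {"read_reports", "deploy_code"}
creep = excess_permissions("analyst", granted)
print(creep)
```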
What does an audit policy for data typically include?
Audit periods, audit scope, audit responsibilities (internal/external), audit processes and procedures, applicable regulations, and monitoring/maintenance/enforcement.
What is packet capture and why is it limited in cloud environments?
Packet capture records network traffic for security analysis. It's commonly available on-premises but is unavailable or severely limited in SaaS and PaaS cloud environments due to shared infrastructure.
True or False: System owners are always the data owners of the data on their systems.
False. A cloud provider may own the underlying infrastructure (system owner) but the customer owns the data stored on it (data owner).
What is the difference between data ingress and egress costs?
Ingress = cost of moving data INTO the cloud (usually free or low). Egress = cost of moving data OUT of the cloud (often significant). This matters for data discovery because pulling large datasets out for analysis can be expensive.
What are the 3 types of data analytics methods?
Data mining (analyzing large datasets to find unknown patterns), real-time analytics (analyzing data as it is created), and business intelligence (iterative tools that detect trends across historical and recent data).
What is NOT a common IRM function?
Crypto-shredding. IRM functions are: persistent protection, dynamic policy control, automatic expiration, continuous auditing, replication restrictions, and remote rights revocation. Crypto-shredding is a data destruction method, not an IRM function.
What is NOT a common data discovery method?
User-based discovery. The three common methods are label-based, metadata-based, and content-based.
What do all policies need to include?
Policy maintenance, policy monitoring, and policy enforcement. Policy transference is not a standard policy component.
What is the difference between redundancy and resilience?
Redundancy is backup equipment capacity to take over when primary fails; resilience is the data center's overall ability to keep operating despite failures.
What are the key factors in choosing a data center location?
Cost of electricity, high-speed connectivity availability, likelihood of natural disasters, temperature control challenges, proximity to other data centers.
What is the build vs. buy decision for data centers?
Build involves high cost but full control over design, location, and customization; buy involves leasing space in an existing facility, shared staffing and infrastructure, lower cost, less control.
What are the 4 Tier 1 data center requirements?
UPS system, area to house IT systems, dedicated cooling, power generator. Protects against human error only - NOT outages or disasters.
What does Tier 2 add over Tier 1?
Redundant capacity components for power and cooling: generators, UPS devices, chillers, cooling units, pumps, fuel tanks. Critical operations are NOT interrupted by planned maintenance of those components.
What is Tier 3 known as and what does it add?
"Concurrently maintainable site infrastructure" - adds multiple distribution paths so only one path is needed at a time while others are maintained.
What is Tier 4?
The highest level - independent and physically isolated redundant systems at both component and distribution path levels, not disrupted by planned or unplanned events.
How much downtime does Tier 1 allow per year?
Up to 28.8 hours per year.
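The 28.8-hour figure follows from an availability percentage. The sketch below assumes the commonly cited Tier 1 availability of 99.671%, which the card itself does not state.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def annual_downtime_hours(availability_percent: float) -> float:
    """Convert an availability percentage into allowed downtime hours per year."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


# 99.671% is an assumed input: the availability commonly cited for Tier 1.
tier1_downtime = annual_downtime_hours(99.671)
print(round(tier1_downtime, 1))  # 28.8
```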
What is tenant partitioning?
Separating multiple tenants in the same physical data center space at rack, cage, bay, or facility level using locked racks, cages, cameras, and security staff.
What is KVM access?
Keyboard, Video, and Mouse access - a method of directly accessing server hardware that requires access controls, logging, and monitoring.
What is distributed resource scheduling?
Managing resources across a cluster to optimize reliable service delivery, providing resources to VMs to meet SLAs, allowing migrations during maintenance.
What is dynamic optimization?
Assesses performance in real time and takes action to meet desired targets using real-time data and defined goals.
What is maintenance mode in virtualization?
Safely removes a host from a cluster for maintenance by transferring running guest OSes to other nodes first.
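A simplified sketch of the evacuation step: every guest on the host is moved to whichever other node currently runs the fewest guests. The placement rule and the `enter_maintenance` name are assumptions for this example; real schedulers weigh CPU, memory, and affinity rules.

```python
def enter_maintenance(cluster: dict[str, list[str]], host: str) -> dict[str, list[str]]:
    """Move every guest off host onto the other nodes and return the new placement."""
    others = [h for h in cluster if h != host]
    if not others:
        raise ValueError("no other node available to receive guests")
    # Work on a copy so the caller's view of the cluster is left unchanged.
    placement = {h: list(guests) for h, guests in cluster.items()}
    for guest in cluster[host]:
        # Assumed rule: pick the node running the fewest guests right now.
        target = min(others, key=lambda h: len(placement[h]))
        placement[target].append(guest)
    placement[host] = []  # host is now empty and safe to service
    return placement


cluster = {"node1": ["vm-a", "vm-b"], "node2": ["vm-c"], "node3": []}
after = enter_maintenance(cluster, "node1")
print(after)
```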
What is high availability (HA) in virtualization?
Guest OSes can be moved to other hardware if a failure occurs, allowing systems to continue operating where hardware failure would cause an outage.
What is ephemeral computing?
Quickly standing up virtual systems then shutting them down when no longer needed, enabling horizontal scaling using many smaller systems.
What is serverless technology?
Replaces constantly running servers with code that runs only when needed, billed on an as-used basis for efficient resource use.
What is the difference between tightly coupled and loosely coupled storage clusters?
Tightly coupled = devices directly connected to a shared physical backplane; loosely coupled = devices connected only logically, so performance does NOT scale additively.
What is RAID and how does it work?
Redundant Array of Independent Disks - stores data across multiple disks using striping, mirroring, or parity; in the redundant levels, if one drive fails, data can be recovered from the others.
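Recovery from a failed drive can be sketched with XOR parity, the approach used by parity-based levels such as RAID 5. The block contents and the `xor_blocks` helper are made up for this example.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks striped across three drives, plus a parity block on a fourth.
data_blocks = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data_blocks)

# Simulate losing the second drive and rebuilding its block
# from the surviving data blocks and the parity block.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt)  # b'BBBB'
```

Plain striping without parity or mirroring (RAID 0) offers no such recovery.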
What is data dispersion?
Distributing data among multiple data centers or locations to prevent data loss or availability issues from disruptions.
What is RDP and what are its security controls?
Remote Desktop Protocol - Windows native remote desktop over encrypted channel. Controls include strong passwords, MFA, restrict users, account lockout policies.
What is SSH and what are its security controls?
Secure Shell - command-line access to Linux/Unix systems. Controls include MFA, SSH certificates, firewalls.
What is a jumpbox/bastion host?
A system placed at the boundary between a lower-security and higher-security zone, acting as a controlled entry point.
What are virtual clients?
Software tools allowing remote connection to a VM as if it were your local system, with processing and data staying in a trusted data center.
What is a SOC?
Security Operations Center - centralized facility for continuous monitoring of network performance and security controls.
What tools does a SOC monitor?
Firewalls, IDS/IPS logs, honeypots, AI/ML detection tools.
What are the 3 goals of incident response?
Minimize loss of value/assets, continue service provision, halt increase of damage.
What are the 6 incident response steps in order?
Preparation, Identification, Containment, Eradication, Recovery, Post-Incident Activities.
What happens in the Containment phase of incident response?
Short-term: isolate the affected network segment; long-term: temporary fixes to allow production use while rebuilding clean systems.
What happens in Post-Incident Activities?
Full documentation, a retrospective, root cause analysis, and an evaluation of improvements to the IR process, all completed within 2 weeks of the incident.
What is multivendor pathway connectivity?
Ensuring connectivity through more than one ISP, where the ISPs do NOT share upstream dependencies, so that a single event cannot disrupt every path.
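Checking for shared upstream dependencies amounts to a set intersection. The ISP and carrier names below are hypothetical.

```python
# Hypothetical map of each ISP to the upstream carriers it depends on.
UPSTREAMS = {
    "isp_a": {"carrier_1", "carrier_2"},
    "isp_b": {"carrier_2", "carrier_3"},
    "isp_c": {"carrier_4"},
}


def shared_upstreams(first: str, second: str) -> set[str]:
    """Return the upstream carriers that both ISPs depend on."""
    return UPSTREAMS[first] & UPSTREAMS[second]


# isp_a and isp_b both rely on carrier_2, so one carrier outage hits both.
print(shared_upstreams("isp_a", "isp_b"))
# isp_a and isp_c share nothing, so they form independent pathways.
print(shared_upstreams("isp_a", "isp_c"))
```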
What is instance isolation?
Each VM should be logically isolated from others using firewalls and security group controls.
What is host isolation?
Physically and logically isolating underlying host servers from each other to minimize connections.
What are hypervisor hardening best practices?
Patch and update to vendor standards, restrict superuser accounts, require MFA, use logging and alerting, limit access to authorized users.
What is JIT (Just in Time) management?
Privileged access management that provides rights only when needed, assigning privileges on the fly.
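A minimal sketch of the idea, assuming privileges are granted with an expiry time and checked on each use. The `JitAccess` class is a hypothetical in-memory model, not the API of any real privileged access management product.

```python
from datetime import datetime, timedelta


class JitAccess:
    """Hypothetical in-memory store of time-limited privilege grants."""

    def __init__(self) -> None:
        self._grants: dict[tuple[str, str], datetime] = {}

    def grant(self, user: str, privilege: str, now: datetime, minutes: int) -> None:
        """Grant a privilege that expires after the given number of minutes."""
        self._grants[(user, privilege)] = now + timedelta(minutes=minutes)

    def is_allowed(self, user: str, privilege: str, now: datetime) -> bool:
        """A privilege is usable only while an unexpired grant exists."""
        expires = self._grants.get((user, privilege))
        return expires is not None and now < expires


jit = JitAccess()
start = datetime(2025, 1, 1, 9, 0)
jit.grant("alice", "db_admin", now=start, minutes=30)

print(jit.is_allowed("alice", "db_admin", start + timedelta(minutes=10)))  # True
print(jit.is_allowed("alice", "db_admin", start + timedelta(minutes=31)))  # False
```

Because nothing is granted by default, a user with no active grant holds no standing privileges at all.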
What are common physical security design elements for data centers?
Fencing, guard patrols, video surveillance, controlled entry points, fire detection and suppression.
What is power resilience in data centers?
Connecting to multiple grids, distinct physical paths for power, UPS for brief outages, redundant generators for longer outages.
What container security practices are important?
Secure container images, secure orchestration platform, monitor environment, manage secrets, validate signatures.