Chapter 10 MAIO

Last updated 5:12 PM on 6/4/26

41 Terms

1

What are the 4 main learning goals of Lecture 9 on AI and Risks?

Understanding AI risks, understanding what makes AI robust, understanding the dangers of emergent AI capabilities and behaviors, and mastering AI system governance.

2

How is AI applied in front-office banking operations?

Using smile-to-pay facial scanning, biometrics for authentication, conversational bots for basic servicing, humanoid robots in branches, and micro-expression analysis via virtual loan officers.

3

How is AI applied in back-office banking operations?

Leveraging machine learning to detect fraud patterns and cybersecurity attacks, alongside real-time transaction analysis for risk monitoring.

4

What are 5 prominent AI use cases in the insurance industry?

Automated underwriting, automated inspections, claims fraud detection, claims adjudication, and customer lifetime value prediction.

5

What regulatory constraint did UK watchdogs place on banks using algorithmic loan approvals?

Banks must prove their technology does not worsen discrimination against minorities, recognize inherent algorithmic flaws, improve transparency, and assume legal responsibility.

6

What are the top 3 common entities associated with AI harms in the AI Incident Database?

1. Facebook (48 incidents), 2. Tesla (36 incidents), and 3. Google (28 incidents).

7

List 4 notable corporate AI incidents logged in the AI Incident Database.

Cybercriminals abusing ChatGPT to develop malware, OpenAI's "Sky" voice assistant allegedly imitating Scarlett Johansson's voice without licensing, users bypassing ChatGPT content filters, and Kenyan data annotators exposed to graphic content.

8

What 3 core characteristics make contemporary AI risks boundless?

Advanced models have direct access to the internet, possess self-replication capabilities because we taught them to code, and understand human behaviors deeply.

9

What is the mathematical formula for the decomposition of risk?

$\text{Risk} \approx \text{Vulnerability} \times \text{Hazard Exposure} \times \text{Hazard}$

10

Define the 3 components that determine risk according to the decomposition framework.

Vulnerability is a factor increasing susceptibility to damaging effects; Hazard Exposure is the extent to which elements are subjected to danger; Hazard is the source of danger with potential to harm.
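
As a worked illustration of the multiplicative decomposition, the sketch below scores each factor on a 0-to-1 scale and multiplies them. The function name `risk_score`, the 0-to-1 scale, and the example numbers are assumptions made for illustration; the cards give only the formula and the definitions.

```python
def risk_score(vulnerability: float, exposure: float, hazard: float) -> float:
    """Risk ~ Vulnerability x Hazard Exposure x Hazard, each scored in [0, 1] (assumed scale)."""
    factors = (("vulnerability", vulnerability), ("exposure", exposure), ("hazard", hazard))
    for name, value in factors:
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return vulnerability * exposure * hazard


# Hypothetical scores for one system before and after a robustness fix.
baseline = risk_score(vulnerability=0.5, exposure=0.8, hazard=0.5)
hardened = risk_score(vulnerability=0.25, exposure=0.8, hazard=0.5)

print(f"baseline risk: {baseline:.2f}")  # 0.20
print(f"hardened risk: {hardened:.2f}")  # 0.10
```

Because the factors multiply, halving any one of them halves the overall score, and driving any one to zero removes the risk entirely.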

11

How are Robustness, Monitoring, and Alignment mapped onto the risk decomposition framework?

Robustness withstands hazards; Monitoring identifies emergent hazards; Alignment reduces the probability and severity of inherent model hazards.

12

What best-selling book explores how machines can learn human values?

"The Alignment Problem: Machine Learning and Human Values" by Brian Christian.

13

What are the 2 primary classifications of attacks targeting AI model robustness?

Adversarial Attacks and Privacy Attacks.

14

What is an Adversarial Attack?

An attack designed to fool a machine learning model into making mispredictions by injecting a small, virtually imperceptible perturbation (adversarial example) into the input data.
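
One common way to build such a perturbation is the fast gradient sign method (FGSM), which the cards do not name; it is used here only as an illustrative technique. The sketch below applies it to a tiny logistic-regression classifier. The weights, input, and step size `eps` are made-up values, and with only three features the step is not "imperceptible" the way it can be for a high-dimensional image.

```python
import math


def predict(w, b, x):
    """Probability of class 1 under a logistic-regression model."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def sign(value):
    return (value > 0) - (value < 0)


def fgsm(w, b, x, y, eps):
    """Move each input feature by eps in the direction that increases the loss."""
    p = predict(w, b, x)
    # Gradient of the cross-entropy loss with respect to the input is (p - y) * w.
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]


# Hypothetical model and input; the true label is 1.
w, b = [2.0, -3.0, 1.0], 0.0
x, y = [0.2, -0.1, 0.1], 1

x_adv = fgsm(w, b, x, y, eps=0.2)

print(f"clean input:       p(class 1) = {predict(w, b, x):.2f}")      # above 0.5
print(f"adversarial input: p(class 1) = {predict(w, b, x_adv):.2f}")  # below 0.5
```

No feature moves by more than `eps`, yet the predicted class flips from 1 to 0.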

15

Give 3 real-world examples of visual adversarial attacks.

Applying small pieces of tape to a Stop sign so that a network predicts a 45 mph speed-limit sign, rotating a handwritten digit 7 by -35 degrees so that it is classified as a 3, and wearing specially designed glasses to impersonate John Malkovich with a 100% success rate.

16

What is Adversarial Prompting in Large Language Models?

A jailbreaking method (e.g., the DAN / "Do Anything Now" prompt) where a user forces the LLM to pretend it has broken free from typical safety constraints and rules.

17

What is the primary successful defense used in practice to avoid adversarial examples?

Incorporating adversarial examples directly into the model's training loop (adversarial training).

18

What is the core objective of a Privacy Attack against a machine learning model?

To gain unauthorized knowledge about the private training data, the model parameters, or specific properties of the dataset.

19

What are the 3 common types of Privacy Attacks?

Model Inversion, Data Extraction, and Membership Inference.

20

What happens during a Model Inversion attack?

A malicious client queries a white-box or black-box model to reconstruct representative training inputs, such as recovering a face image using only a person's name in a facial recognition system.

21

What is a Data Extraction attack and where is it most common?

An attack where a client queries a model to extract exact training samples by exploiting model memorization; this is heavily prevalent in Large Language Models.

22

What is a Membership Inference attack and how is it defended against?

An attack where a client uses a model to determine whether a specific individual's data point was included in the training dataset; it is defended using differential privacy.
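
To make the differential-privacy defense concrete, here is a minimal sketch of the Laplace mechanism applied to a counting query. The card only names differential privacy; the choice of the Laplace mechanism, the helper names, and the toy records are assumptions for illustration.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng):
    """Release a count with epsilon-differential privacy via the Laplace mechanism."""
    true_count = sum(1 for record in records if predicate(record))
    # Adding or removing one person changes a count by at most 1 (sensitivity 1),
    # so the noise scale is sensitivity / epsilon = 1 / epsilon.
    return true_count + laplace_noise(1.0 / epsilon, rng)


rng = random.Random(0)
# Hypothetical dataset: 100 customers, 25 of whom defaulted.
records = [{"defaulted": i % 4 == 0} for i in range(100)]

noisy = private_count(records, lambda r: r["defaulted"], epsilon=1.0, rng=rng)
print(f"noisy count: {noisy:.1f} (true count is 25)")
```

The noise hides whether any single individual's record is in the data, which is exactly what a membership inference attack tries to learn; a smaller `epsilon` means more noise and stronger privacy at the cost of accuracy.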

23

Contrast "Machine Unlearning" and "Data Minimization" under GDPR compliance.

Machine Unlearning honors the right to be forgotten (Article 17) by allowing users to withdraw consent and have their data removed from trained models; Data Minimization (Article 5(1)(c)) restricts training data to what is strictly necessary.

24

Give 2 real-world legal examples involving Machine Unlearning.

A lawsuit against Stable Diffusion for training models on private medical images without consent, and a copyright lawsuit against GitHub Copilot (Microsoft/OpenAI) for using open-source code without proper attribution.

25

What financial penalty was issued regarding a violation of Data Minimization rules in ML?

The Dutch Tax Administration was fined 2.75 million euros for using nationality data in an eligibility prediction model that discriminated against specific groups.

26

What is the difference between Emergent Capabilities and Emergent Behaviors?

Emergent Capabilities are new abilities (like arithmetic or translation) that are suddenly unlocked at critical parameter scales; Emergent Behaviors are nonobvious, difficult-to-foresee side effects of training (like AI bias).

27

What are 3 distinct examples of emergent capabilities observed in scaled AI training?

Spikes in Massive Multitask Language Understanding (MMLU) accuracy, solving 3-digit addition, and multi-agent systems inventing non-obvious strategies (e.g., the Tic-Tac-Toe memory bomb).

28

Why are emergent capabilities risky from a safety perspective?

They can lead to the emergence of unintended goals, most notably Self-Preservation, which is instrumentally useful for an agent to avoid being shut off.

29

What does "Instrumental Convergence" mean for advanced AI systems?

The tendency for sufficiently advanced agents to pursue similar intermediate goals (such as power, self-preservation, cognitive enhancement, and resource acquisition) regardless of their primary instructions.

30

Give an example of how emergent behavior bias affects non-native English speakers.

Text-analysis tools built to detect AI-generated writing falsely branded essays written by non-native English speakers from China as AI-generated 61% of the time.

31

What 3 core areas serve as the structural sources of bias and unfairness in AI systems?

1. Training data (human history/underrepresentation), 2. AI/ML model design (algorithmic bias/benchmarks), and 3. AI deployment (popularity exposure feedback loops).

32

What is the fundamental objective of the AI Alignment problem?

To create an agent that behaves in accordance with human intentions by aligning the model's implicit goals and values with those of its user.

33

What are 2 real-world harms linked to misaligned corporate algorithms?

A Facebook internal report revealing that 64% of people who joined extremist groups did so because algorithms steered them there, and a Belgian man committing suicide after being encouraged by an AI chatbot named Eliza.

34

Contrast "The Cobra Effect" and "Goodhart's Law" in optimization.

The Cobra Effect occurs when a well-intentioned measure backfires and produces the opposite result; Goodhart's Law states that when a measure becomes a target, it ceases to be a good measure.

35

How does a Reward Learning architecture work to prevent alignment issues?

Instead of manual reward specification, an RL algorithm interacts with an environment while a Reward Predictor is dynamically updated based on continuous Human Feedback.
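
The reward-predictor update can be sketched with a pairwise preference model. The example below assumes a linear reward over two hand-picked features and a Bradley-Terry style likelihood for human comparisons; the features, feedback pairs, and learning rate are hypothetical, and the cards do not specify any of them.

```python
import math


def reward(weights, features):
    """Linear reward predictor (an assumed, deliberately simple model)."""
    return sum(w * f for w, f in zip(weights, features))


def update(weights, preferred, rejected, lr=0.1):
    """One gradient-ascent step on the log-likelihood of a human preference."""
    margin = reward(weights, preferred) - reward(weights, rejected)
    p = 1.0 / (1.0 + math.exp(-margin))  # P(human prefers `preferred`)
    # Gradient of log p with respect to the weights is (1 - p) * (preferred - rejected).
    return [w + lr * (1.0 - p) * (fp - fr) for w, fp, fr in zip(weights, preferred, rejected)]


# Each behavior is described by [task_progress, rule_violations].
# Each pair is (behavior the human preferred, behavior the human rejected).
feedback = [
    ([0.9, 0.0], [0.9, 1.0]),
    ([0.5, 0.0], [0.2, 0.0]),
    ([0.8, 0.0], [1.0, 1.0]),
]

weights = [0.0, 0.0]
for _ in range(200):
    for preferred, rejected in feedback:
        weights = update(weights, preferred, rejected)

print(f"learned weights: progress={weights[0]:.2f}, violations={weights[1]:.2f}")
```

The predictor ends up rewarding progress and penalizing rule violations without anyone writing that reward by hand. In a full system an RL agent would be trained against this predictor while new human comparisons keep refining it.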

36

What are the 3 pillars of the HHH Framing for AI alignment?

Helpful (fulfills the user's intent), Honest (provides accurate information and asserts only what matches its own beliefs), and Harmless (refuses to generate dangerous or destructive content).

37

Distinguish between a Truthful AI and an Honest AI.

A Truthful AI avoids asserting false statements relative to real-world facts (refusing to answer is truthful); an Honest AI ensures its output matches its internal state and beliefs.

38

What are the maximum financial penalties for breaching the EU General Data Protection Regulation (GDPR)?

Fines of up to 4% of an organization's annual global turnover or 20 million euros, whichever is higher, as demonstrated by Meta's landmark 1.2 billion euro fine for mishandling transatlantic data transfers.
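
The cap is the higher of the two figures, so it only scales with turnover for large organizations. A minimal arithmetic sketch follows; the function name and the example turnovers are illustrative assumptions.

```python
def gdpr_max_fine(annual_global_turnover_eur: float) -> float:
    """Upper-tier GDPR cap: the higher of 4% of global turnover or EUR 20 million."""
    return max(annual_global_turnover_eur * 4 / 100, 20_000_000)


# Hypothetical organizations.
small_firm = gdpr_max_fine(100_000_000)      # 4% would be 4 million, so the 20 million floor applies
large_firm = gdpr_max_fine(50_000_000_000)   # 4% is 2 billion

print(f"small firm cap: EUR {small_firm:,.0f}")
print(f"large firm cap: EUR {large_firm:,.0f}")
```

The two figures cross at a turnover of 500 million euros; below that, the 20 million euro figure is the binding maximum.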

39

What are the 4 risk tiers established by the EU AI Act?

1. Unacceptable Risk (Prohibited), 2. High Risk (Requires Conformity Assessment), 3. Limited Risk (Requires Transparency), and 4. Minimal Risk (No mandatory obligations; voluntary Codes of Conduct).

40

Map specific AI applications to the 4 risk levels of the EU AI Act.

Unacceptable: Social scoring and dark-pattern manipulation; High: Education, employment, and justice; Limited: Chatbots and deepfakes; Minimal: Spam filters and video games.
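
For revision purposes, the mapping above can be written as a small lookup table. This is a study aid built only from the examples on these cards, not a legal classifier; the names `RiskTier`, `EXAMPLES`, and `classify` are made up for illustration.

```python
from enum import Enum


class RiskTier(Enum):
    UNACCEPTABLE = "prohibited"
    HIGH = "conformity assessment"
    LIMITED = "transparency"
    MINIMAL = "code of conduct"


# Example applications taken from the flashcards.
EXAMPLES = {
    "social scoring": RiskTier.UNACCEPTABLE,
    "dark-pattern manipulation": RiskTier.UNACCEPTABLE,
    "education": RiskTier.HIGH,
    "employment": RiskTier.HIGH,
    "justice": RiskTier.HIGH,
    "chatbot": RiskTier.LIMITED,
    "deepfake": RiskTier.LIMITED,
    "spam filter": RiskTier.MINIMAL,
    "video game": RiskTier.MINIMAL,
}


def classify(application: str) -> RiskTier:
    """Look up the tier for one of the example applications (case-insensitive)."""
    key = application.strip().lower()
    if key not in EXAMPLES:
        raise KeyError(f"no flashcard example for {application!r}")
    return EXAMPLES[key]


for app in ("Social scoring", "Employment", "Chatbot", "Spam filter"):
    tier = classify(app)
    print(f"{app}: {tier.name} ({tier.value})")
```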

41

List 6 strict compliance obligations required for providers of High-Risk AI systems under the EU AI Act.

Implementing a risk management system, ensuring data governance, drawing up technical documentation, maintaining automatic system logging, ensuring human oversight, and affixing a CE marking.