Power Seeking
The argument that some advanced AI systems will seek power in order to achieve their goals and may use that power in ways that are catastrophic to humanity
How might AI successfully gain power?
supercapability, supernumerosity, human delegation
supercapability
AI may surpass humans in the skills needed to gain power and may be able to evade human oversight
supernumerosity
Software can replicate itself cheaply and easily
Human delegation
Humans outsource sensitive tasks to AI before the risk is apparent or concerning
→ AI gains control of military weapons or the economy
Instrumental convergence
many agents pursuing many goals will converge on similar instrumental strategies
→ gain resources, self-preservation, resist shutdown, increase capabilities
AI alignment
ensuring AI does what humans intend may be difficult because of reward misspecification and goal misgeneralization
reward misspecification
designers reward the wrong thing accidentally
→ ex. in a boat-racing game, designers award points for hitting targets along the course as a proxy for winning; an AI racer learns to go in circles collecting those points rather than finishing the race
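The boat-racing example can be sketched as a toy reward loop. The names (`proxy_reward`, `hit_target`, `finish_race`) are hypothetical, made up for illustration; no real game API is being used.

```python
# Toy illustration of reward misspecification (hypothetical names, not a real game).
# Intended goal: finish the race. Rewarded proxy: points from targets on the course.

def proxy_reward(action: str) -> int:
    """Designers reward point targets, assuming points track race progress."""
    return 10 if action == "hit_target" else 0

def intended_goal_achieved(actions: list) -> bool:
    """The outcome the designers actually wanted."""
    return "finish_race" in actions

# A reward-maximizing agent loops on the proxy instead of finishing.
agent_actions = ["hit_target"] * 5           # circles collecting targets
total_reward = sum(proxy_reward(a) for a in agent_actions)

print(total_reward)                           # high proxy reward: 50
print(intended_goal_achieved(agent_actions))  # intended goal never achieved: False
```

The agent scores perfectly on the reward the designers wrote down while never doing the thing they actually wanted, which is the core of the misspecification worry.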
goal misgeneralization
AI behaves correctly in training situations but fails in new environments
→ AI is unable to correctly apply the skills learned during training to new situations
The singularity
Once AI improvement is done by AI itself, progress becomes exponential until AI surpasses human intelligence
→ caused by power seeking
Chalmers’ Argument for the singularity
Once human-level AI exists, it may trigger recursive self-improvement, leading to superintelligence and an “intelligence explosion”
→ AI = Human Level
→ AI+ = intelligence beyond human level
→ AI++ = vastly superhuman intelligence
Proportionality thesis
increases in intelligence produce proportionate increases in the ability to design better intelligence
→ exponential growth
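The proportionality thesis can be sketched numerically: if each generation designs a successor proportionately smarter than itself, intelligence compounds geometrically. The constant `k` here is an illustrative assumption, not a measured value.

```python
# Sketch of the proportionality thesis: each generation's design ability is
# proportional to its intelligence, so intelligence compounds geometrically.

k = 1.5              # proportionality constant (hypothetical, k > 1)
intelligence = 1.0   # AI: human-level baseline

for generation in range(5):
    intelligence *= k   # each system designs a successor k times as capable

print(intelligence)  # 7.59375 after 5 generations: exponential growth
```

With any fixed k > 1 the sequence AI → AI+ → AI++ grows exponentially in the number of generations; the "possible defeaters" below are reasons k might fall to 1 or the recursion might stop.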
Possible defeaters of the singularity
resource limits
technological barriers
humans halt development
there is skepticism about whether humans would actually halt development, because possessing an advanced AI system confers extreme economic and military advantages
Power-seeking argument for catastrophic risk
Advanced AI could pose a catastrophic risk because goal-directed AI systems may seek power as a means to achieve their goals, which could bring them into conflict with humans
main worry
not necessarily “evil AI”
ordinary goal pursuit may naturally produce power-seeking behavior, and power plus misalignment could be catastrophic
How power seeking could lead to catastrophe
if AI goals conflict with human interests
humans may become obstacles
AI may try to prevent shutdown or correction
It may seize resources humans depend on
this could lead to massive loss of human control and possible catastrophe
Why might we build these systems anyway?
competition pressures
economic incentives
deceptive alignment
AI might appear safe during testing but behave dangerously after deployment
Orthogonality
it is sometimes thought that alignment will arise naturally as AI becomes more intelligent
the orthogonality thesis denies this
the intelligence that concerns us in AI is pure competence, which is compatible with any goal or subgoal, including power seeking
opacity and deceptive alignment
opacity gives a false sense of safety because humans cannot see what the AI has actually learned to pursue (e.g., harmful proxy goals)