Philosophy Final

0.0(0)
Studied by 0 people
call kaiCall Kai
learnLearn
examPractice Test
spaced repetitionSpaced Repetition
heart puzzleMatch
flashcardsFlashcards
GameKnowt Play
Card Sorting

1/17

encourage image

There's no tags or description

Looks like no tags are added yet.

Last updated 7:27 PM on 4/30/26
Name
Mastery
Learn
Test
Matching
Spaced
Call with Kai

No analytics yet

Send a link to your students to track their progress

18 Terms

1
New cards

Power Seeking

The argument that some advanced AI systems will seek power in order to achieve goals and use the power in ways that are catastrophic to humanity

2
New cards

how might AI successfully gain power

supercapability, supernumerosity, human delegation

3
New cards

supercapability

Ai may surpass humans in skills needed to gain power and will be able to evade human oversight

4
New cards

supernumerosity

Software can be replicate itself cheaply and easily

5
New cards

Human delegation

humans outsource sensitive tasks to AI before the risk is apparent or concerning

→ AI gain control of military weapons or the economy

6
New cards

Intrumental convergance

many agents pursuing many goals will converge on similar instrumental strategies

→ gain resources, self-preservation, resist shutdown, increase capabilities

7
New cards

AI alignment

ensuring AI does what humans intend may be difficult because of reward misspecification and goal misgeneralization

8
New cards

reward misspecification

designers reward the wrong thing accidentally

→ ex. a boat racing game rewards the player with points when they win a race, however an AI racer goes in circles collecting points that way rather than winning the race

9
New cards

goal misgeneralization

AI behaves correctly in training situations but fails in new environments

→ AI is unable to apply the skills it has been taught to new situations correctly

10
New cards

The singularity

Once AI improvement is done by AI itself, progress becomes exponential until AI surpasses human intelligence

→ caused by power seeking

11
New cards

Chalmers’ Argument for the singularity

Once human-level AI exists, it may trigger recursive self-improvement, leading to superintelligence and an “intelligence explosion“

→ AI = Human Level

→ AI+ = intelligence beyond human level

→ AI++ = vastly superhuman intelligence

12
New cards

Proportionality thesis

increases in intelligence produce proportionate increases in the ability to design better intelligence

→ exponential growth

13
New cards

Possible defeaters of the singularity

  1. resource limits

  2. technological barriers

  3. humans halt development

    1. there is some skepticism about whether this would happen because having an advanced AI system leads to extreme economic and military advantages

14
New cards

Power-seeking argument for catastrophic risk

  • Advanced AI could pose a catastrophic risk because goal-directed AI systems may seek power as a means to achieve their goals, which could bring them into conflict with humans

  • main worry

    • not necessarily “evil AI“

    • ordinary goal pursuit may naturally produce power-seeking behavior, and power plus misalignment could be catastrophic

15
New cards

How power seeking could lead to catastrophe

  • if AI goals conflict with human interest

    • humans may become obstacles

    • AI may try to prevent shutdown or correction

    • It may seize resources humans depend on

  • this could lead to massive loss of human control and possible catastrophe

16
New cards

Why might we build these systems anyway

  1. competition pressures

  2. economic incentives

  3. deceptive alignment

    1. AI might appear safe during testing but behave dangerously after deployment

17
New cards

Orthogonality

  • it is thought that alignment can arise naturally

  • orthogonality thesis denies this

    • intelligence that concerns AI is purely competence, it is compatible with any goal or subgoal, including power seeking

18
New cards

opacity and deceptive alignment

opacity gives a false sense of safety because humans do not know what AI is collecting data on (harmful false proxies)