Competitive Programming with Large Reasoning Models

Description and Tags

Vocabulary flashcards related to competitive programming and large reasoning models.

9 Terms

1

Competitive programming

A challenging benchmark for evaluating reasoning and coding proficiency, demanding advanced computational thinking and problem-solving skills.

2

Large Reasoning Models (LRMs)

Language models trained via reinforcement learning to reason through extended chains of thought before answering.

3

Chain-of-thought reasoning

An internal process where the model works through a challenging problem step by step before answering, refined by reinforcement learning to identify and correct errors, break down complex tasks, and explore alternative solution paths.

4

CodeForces

A programming competition website that hosts live contests and attracts strong competitors from around the world.

5

OpenAI o1

A large language model trained with reinforcement learning to tackle complex reasoning tasks, including programming, and also trained to use external tools for executing and verifying code.

6

OpenAI o1-ioi

A fine-tuned system based on o1 tailored to compete in the 2024 International Olympiad in Informatics (IOI), incorporating specialized test-time inference strategies engineered for competitive programming.
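To make "test-time inference strategy" concrete, here is a minimal generic sketch of one such strategy: sample many candidate programs, discard those that fail the public tests, group the survivors by their behavior on extra probe inputs, and submit one from the largest group. This is an illustration only and is not claimed to be the o1-ioi pipeline; `select_candidate`, `public_tests`, `probe_inputs`, and the use of plain Python functions as stand-ins for candidate programs are all assumptions made for the example.

```python
from collections import defaultdict
from typing import Callable, List, Optional, Sequence, Tuple

# Hypothetical stand-in: a candidate "program" is a function from int to int.
Program = Callable[[int], int]

_FAILED = object()  # sentinel marking a candidate that crashed on an input


def _safe_run(program: Program, x: int):
    """Run a candidate, treating any exception as a failed run."""
    try:
        return program(x)
    except Exception:
        return _FAILED


def select_candidate(
    candidates: Sequence[Program],
    public_tests: Sequence[Tuple[int, int]],
    probe_inputs: Sequence[int],
) -> Optional[Program]:
    """Filter by public tests, cluster by behavior, pick from the largest cluster."""
    # Step 1: keep only candidates that pass every public (input, expected) pair.
    passing: List[Program] = [
        c for c in candidates
        if all(_safe_run(c, x) == y for x, y in public_tests)
    ]
    if not passing:
        return None

    # Step 2: cluster survivors by their outputs on the probe inputs.
    clusters = defaultdict(list)
    for c in passing:
        signature = tuple(_safe_run(c, x) for x in probe_inputs)
        clusters[signature].append(c)

    # Step 3: assume agreement among many samples signals correctness.
    largest = max(clusters.values(), key=len)
    return largest[0]


# Toy task (assumed for illustration): return the square of the input.
candidates = [
    lambda x: x * x,       # correct
    lambda x: x ** 2,      # correct, same behavior
    lambda x: abs(x) * x,  # wrong for negative inputs, but passes public tests
    lambda x: x + x,       # fails the public test (3, 9)
]
chosen = select_candidate(candidates, [(2, 4), (3, 9)], [-2, 0, 5])
```

In this toy run the two correct candidates agree on every probe input and form the largest cluster, so one of them is chosen even though the public tests alone could not rule out `abs(x) * x`.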

7

OpenAI o3

A model with significantly advanced reasoning capabilities that does not depend on human-defined, coding-specific test-time strategies; instead, complex test-time reasoning strategies emerged naturally from end-to-end RL.

8

HackerRank Astra dataset

A dataset composed of project-oriented coding challenges, designed to simulate real-world software development tasks and assess problem-solving skills in complex, multi-file, long-context scenarios.

9

SWE-bench Verified

A human-validated subset of SWE-bench that more reliably evaluates AI models' ability to solve real-world software issues.