AI in Software Engineering
Al in Software Engineering - Comprehensive Study Notes
Page 1
Introduction
Presenters:
Cigdem Sengul
Rumyana Neykova
Affiliation:
Computer Science Department - Brunel University London
Page 2
Part 1: The Current Landscape
Theme: How AI is reshaping the way software is built
Page 3
AI Coding Tools — Adoption in 2025
Statistics on AI in Developer Workflows:
76% of professional developers use AI tools or plan to do so.
41% of all code is now generated by AI.
82% of developers utilize AI on a weekly or daily basis in their working processes.
3.5× average return on investment (ROI) from enterprise AI investments.
Sources:
Stack Overflow Dev Survey 2024
EliteBrains 2025
Microsoft Market Study 2025
Areas of AI Utilization by Developers:
Code writing: 82%
Debugging: 68%
Documentation: 57%
Code review: 45%
Page 4
The AI Paradox
Observation: AI tools enhance developers' speed but do not correlate with increased software delivery.
Quote: "AI coding assistants are making developers more productive at writing code. But why aren't most enterprises actually delivering more software?" — Bill Staples, CEO of GitLab (GitLab, Feb 2026)
Challenges in Software Delivery:
Time Allocation:
Code accounts for only 10–20% of a developer's day.
Developers spend 80–90% on reviews, security scans, pipeline waits, and compliance checks.
Tool Sprawl:
60% of teams use 5 or more development tools.
49% use 5 or more AI tools, leading to fragmented toolchains that diminish time savings from AI.
Quality Concerns:
48% of AI-generated code may harbor security vulnerabilities.
Review backlogs can increase as output speed rises.
Sources:
GitLab Global DevSecOps Report 2025
Faros AI Productivity Paradox Report 2025
Page 5
Signals from the Frontier
Reporting from Leading Companies as of Early 2026
Anthropic:
90–95% of Claude Code’s codebase is written by Claude itself.
Spotify:
Shipped 50+ features in 2025 through AI-driven workflows. Notably, top developers haven’t written code since December.
Microsoft:
Approximately 30% of code generated by AI, with CEO Satya Nadella attributing major output gains to AI tools.
Google:
21% of code classified as AI-assisted, documented as one of the largest enterprise-level AI coding deployments.
Barriers to Adoption:
Regulatory: Compliance with GDPR and financial/health services regulations.
Technical: Legacy systems and low test coverage create integration challenges.
Cultural: Resistance from teams and leadership alongside issues of trust.
Economic: High costs of enterprise tools prevent many SMEs from accessibility.
Page 6
Part 2: AI Across the Software Development Life Cycle (SDLC)
Phases of SDLC:
Planning
Requirements
Design
Coding
Testing
Deployment
Maintenance
Page 7
LLM Papers in SDLC
Research State:
Most mature phase is Development at 56.65% of studies.
Requirements and Design phases are the least explored at <5%.
Quality Assurance is emerging at 15.14%.
Maintenance gaining traction at 22.71%.
Page 8
What LLMs Can and Cannot Do - Evidence-Based Summary
Development Phase:
Successful Areas:
Code generation (72% success rate)
Documentation
Refactoring
Code review
Limitations:
Struggles with business logic and system-level understanding.
Testing Phase:
Successful Areas:
Unit test generation and oracle generation
Regression tests with a 48% bug detection rate (Tian et al., 2023)
Limitations:
Limited effectiveness in domain edge cases and integration testing.
Requirements Phase:
Successful Areas:
Classification, generation, translation to templates.
Effective augmentation tool for story inspiration.
Limitations:
Little industrial validation (3.9% of studies).
Needs human analysts for context clarification.
Design Phase:
Successful Areas:
UML generation and specification synthesis yielding 21% improvement (SpecSyn, EASE 2023).
Limitations:
Weak understanding of design patterns and architectural reasoning; least explored (0.92%).
Maintenance Phase:
Successful Areas:
Automated problem resolution (APR) leads to effective bug reporting (162/337 bugs at $0.42/bug).
Limitations:
Risky for system-wide changes and context-sensitive legacy systems.
Page 9
Spotlight: Requirements Engineering with AI
Objective: Automate the understanding of software requirements.
Processes Involved:
Elicitation: Utilizing conversational agents to interview stakeholders, summarize needs, and detect conflicts.
Specification: LLMs converting meeting notes/user stories into formal requirements formats (e.g., IEEE 830/EARS).
Validation: Models flagging ambiguous, incomplete, or contradictory requirements pre-development.
Traceability: AI linking requirements to code, tests, and documentation for impact analysis and coverage reporting.
Page 10
Which AI Coding Tool?
Claude Code:
Acts as an agentic terminal, autonomously writing, running, testing, and committing full-codebase features.
Cursor:
AI-native IDE designed for deep, context-aware edits in a tailored environment.
GitHub Copilot:
Best for real-time suggestions, chat assistance, and pull request summaries within established enterprises.
Amazon CodeWhisperer:
Specialized for deep integration with AWS tools and security scanning for cloud-based development.
Tabnine:
Optimal choice for on-premise requirements in regulated industries ensuring data privacy.
Page 11
Part 3: Prompting & RAG
Purpose: Foundations underpinning functionality when using LLMs.
Page 12
Prompting Strategies
Impact of Model Communication on Output Quality:
Zero-shot Prompting:
Example Prompt: "Generate a JUnit test for a function that matches players by skill level."
Result: Generic test without edge case considerations.
Few-shot Prompting:
Example Prompt: "Here are 2 test examples. Generate tests for match_players() in the same style."
Result: Includes fixture setup, edge case assertions, and consistent naming - higher completeness.
Chain-of-thought Prompting:
Example Prompt: "First identify valid inputs and edge cases, then generate JUnit tests for match_players()."
Result: Systematic reasoning covers varied scenarios, providing the best coverage without examples.
Role/System Prompt:
Example Prompt: "You are a senior security engineer reviewing code for vulnerabilities."
Sets model persona and limitations crucial for agentic contexts.
Structured Output:
Prompt: "Respond only in JSON with keys: issue, severity, recommendation."
This format facilitates downstream parsing, essential for integration with tools.
Page 13
Beyond Prompting: Importance of Prompting
Exploration of whether prompting's quality truly influences outcomes.
Page 14
The Evolution of Technology Stack
Previous State: Prompting through chat interfaces and one-shot generation with human involvement for every task.
Current State: Introduction of Retrieval-Augmented Generation (RAG) and Tool Use.
RAG allows AI to fetch context before responding, enhancing the accuracy of outputs.
Future Outlook: Agents capable of independent actions, utilizing a Model Context Protocol (MCP) to connect multiple tools seamlessly.
Page 15
Retrieval-Augmented Generation (RAG)
Process: How LLMs Address Questions Using Real-world Knowledge:
User Query - Initiates the process.
Retrieve Documents - Pulls relevant documents from a database according to the query.
Augment Prompt - The retrieved context is modified for the LLM response.
Generate Response - The LLM synthesizes context plus query for an answer.
Advantages:
Reduced hallucination risks, permit real-time knowledge updates, and ensure traceable sources through semantic search.
Page 16
RAG Architecture
Summary of Workflow Steps:
Encode Documents into a Vector Database
Execute Similarity Searches
Generate Queries for LLM
Return Responses
Page 17
RAG-Bingo
A participatory element involving various concepts within RAG, prompting, and AI tools.
Page 18
Part 4: Agents & the Future
Focus: Transitioning to autonomous action with multi-step execution.
Page 19
What Does "Agentic AI" Mean?
Definition: Moving from single-turn interactions to autonomous, multi-step task execution.
Traditional LLM Interaction Workflow:
User writes prompt.
Model generates text.
User copies results to run and evaluates outcome.
Process repeats manually.
Agentic AI Workflow:
User specification leads to autonomous planning, tool usage, iteration, and presentation of results.
Key Insight: The agent dictates the subsequent actions rather than the user.
Page 20
Anatomy of an LLM Agent
Explores the structural components and functionalities of an AI agent's architecture.
Page 21
AI Test Generation Agent Workflow
Input: Function Code
Workflow Steps:
Agent Configuration (Including Memory Settings, LLM Parameters, Tool Selections)
ReAct Generation Loop: Analyzing function, generating tests, validating results, refining strategies.
Page 22
Model Context Protocol (MCP)
Purpose: Standardizing AI agent interactions with tools and data sources.
Overview:
MCP serves as an open standard allowing LLMs an interface to various resources including APIs and databases.
Benefits:
Decouples context provision from model logic, enhances integration flexibility, and promotes controlled actions across various applications.
Page 23
5 Trends to Watch in AI Software Engineering
Full SDLC Automation:
GitLab Duo aims to automate the entire software lifecycle from issue identification to deployment.
Specification-Driven IDEs:
Example: Kiro (Amazon AWS) focuses on generating requirements and implementation directly from natural language specifications.
AI as Sub-Agent Teams:
Claude Cowork orchestrates sub-agents for parallel workflows in a shorter timeframe.
VS Code as Agent Command Center:
Evolving IDEs coordinating multiple specialized AI agents.
Recursive AI Development:
AI tools developing additional AI, creating a rapid iteration environment.
Page 24
Part 5: Risks
Exploring the Risks of AI Technologies in Software Engineering.
Page 25
When AI Gets it Wrong
Case Study: A CTO reflects on the challenges of balancing innovative AI development with practical software creation.
Insights: The dilemma of potentially using non-reliable AI approaches versus foundational practices could lead to a trade-off between speed and maintainability.
Page 26
Risks Associated with AI in Software Engineering
Hallucination & Accuracy:
LLM outputs may contain plausible but incorrect code, highlighting the need for human oversight.
Security & Data Privacy Risks:
Usage of cloud-based AI may expose proprietary code, necessitating a clear understanding of data retention policies.
Licensing & Copyright Issues:
Potential legal ambiguities surrounding reproduction of GPL-licensed code by AI tools.
Cost, Dependency, & Environmental Concerns:
Financial lessons from API pricing and the ecological footprint of AI inference processes should inform decision-making.
Page 27
The Changing Developer Skillset
Skills in Decline:
Memorizing syntax, writing boilerplate code, and basic unit tests rapidly automated.
Skills on the Rise:
Crafting precise specifications, critical evaluation of AI outputs, system design, and AI literacy for orchestration.
Page 28
Key Takeaways
AI is widespread yet not transformational at an enterprise level.
Coding remains a small percentage of the entire SDLC, with bottlenecks occurring downstream.
Understanding prompting and Retrieval-Augmented Generation (RAG) as foundational; mastery of agents and MCP is essential for future adaptability.
Shift towards higher abstraction levels, moving from direct coding to conceptual specifications and intents.
Engineers need to cultivate AI literacy encompassing orchestration, evaluative skills, and informed judgment.
Page 29
Try This at Home
Input your project brief into LLMs to identify ambiguous requirements.
Install and interact with Cursor for AI-assisted coding tasks, reflecting on outcomes.
Experiment with function test generation prompts across zeros, few-shot, and chain-of-thought models.
Initiate a security vulnerability review on a piece of code and verify the findings.
Engage in discussion regarding AI productivity metrics and skills maintenance in the context of evolving technology.