Software Engineering Interview Study Guide 2025-2026

Core Backend Concepts

REST & HTTP Fundamentals

REST (Representational State Transfer) Definition: An architectural style for providing standards between computer systems on the web.
The 6 Constraints of REST: * (1) Client-Server Separation: The user interface and data storage concerns are independent. * (2) Statelessness: Every request must carry all information needed to be processed. The server stores zero client context between requests. * (3) Cacheable: Responses must define themselves as cacheable or not to prevent clients from reusing stale data. * (4) Uniform Interface: Simplifies and decouples the architecture, allowing each part to evolve independently. * (5) Layered System: A client cannot ordinarily tell whether it is connected directly to the end server or an intermediate (e.g., load balancer). * (6) Code on Demand (Optional): Servers can temporarily extend client functionality by transferring executable code (e.g., JavaScript).
HTTP Methods: PUT vs. PATCH: * PUT: Replaces the entire resource. It is idempotent. If fields are missing in the request, they get nulled out in the database. * PATCH: Partially updates a resource. It only touches the specific fields sent in the request.
Essential HTTP Status Codes: * $200$ OK: Success. * $201$ Created: Success, resource created. * $204$ No Content: Success, but nothing to return. * $400$ Bad Request: Client-side error in request. * $401$ Unauthorized: Not authenticated. * $403$ Forbidden: Authenticated, but lacks required permissions. * $404$ Not Found: Resource non-existent. * $409$ Conflict: Conflict with current state (e.g., duplicate entries). * $422$ Unprocessable Entity: Semantic errors (often used for validation errors). * $429$ Too Many Requests: Rate limit exceeded. * $500$ Internal Server Error: Generic server-side failure. * $503$ Service Unavailable: Server temporary overload or maintenance.
REST vs. GraphQL: * REST: Chosen for simplicity, established standards, superior caching, and typical CRUD APIs. * GraphQL: Chosen for flexible client queries, avoiding over-fetching/under-fetching, or supporting diverse frontend clients. * Trade-offs: GraphQL introduces complexity overhead, such as the $N+1$ problem if not batched, and more difficult caching logic.
HATEOAS (Hypermedia As The Engine Of Application State): * Concepts: Server responses include hypermedia links to valid next actions. * Advantage: Allows clients to discover the API dynamically. * Practicality: Rarely implemented in real-world scenarios but a frequent interview topic.

Authentication & Security

JWT vs. Session-based Authentication: * JWT (JSON Web Token): Stateless, scales horizontally because no server-side storage is needed. Disadvantage: Cannot invalidate tokens before expiry; requires short $TTL$ and refresh tokens. * Session: Stored on the server. Advantage: Easy to revoke sessions. Disadvantage: Requires sticky sessions or a shared session store like Redis in distributed environments.
SQL Injection Prevention: * Method: Always use parameterized queries or prepared statements. * Warning: Never interpolate user input directly into SQL strings. ORMs typically handle this by default. Do not rely solely on input sanitization.
CSRF (Cross-Site Request Forgery): * Definition: A malicious site tricks a user's browser into performing an unwanted action on a site where they are logged in. * Mitigations: Use CSRF tokens (server-generated nonces in forms), use the SameSite cookie attribute, and check Origin/Referer headers.
Encryption Types: * Symmetric: One shared key (e.g., AES). Fast and suitable for bulk data. * Asymmetric: Public/private key pair (e.g., RSA, ECC). Slower, but allows key exchange without shared secrets. HTTPS uses asymmetric to exchange a symmetric session key.
OAuth2 Flow Simplified: * $1$ : Resource Owner (user) grants permission to the Client app. * $2$ : Authorization Server issues an access token. * $3$ : Client uses the token to call the Resource Server (API). * Note: Access tokens are short-lived; refresh tokens extend sessions without re-login.

Caching

The Caching Layers: Browser cache → CDN cache → Load balancer cache → Application-level (Redis/Memcached) → DB query cache → CPU cache.
Cache Eviction Strategies: * LRU (Least Recently Used): The most common default. * LFU (Least Frequently Used): Prioritizes keeping high-frequency data. * FIFO (First-In, First-Out): Simple queue logic. * TTL (Time To Live): Expiry based on a timer.
Cache Patterns: * Cache-aside (Lazy Loading): Application checks the cache; on a miss, queries the database, stores result in cache, and returns. It is resilient to cache failures. * Write-through/Write-behind: Alternative patterns where cache updates happen during writes.
When NOT to Cache: Regularly changing data, security-sensitive/per-user responses, or when strict consistency is required. Over-caching leads to stale data bugs.

Advanced API Topics

Idempotency: * Definition: An operation is idempotent if calling it $N$ times has the same effect as calling it once. * Status: GET, PUT, and DELETE are idempotent. POST is not. * Importance: Critical for retry logic on network failures.
API Versioning Approaches: * (1) URL Path: /api/v1/users (Most common and clearest). * (2) Header: Accept: application/vnd.api+json;version=1. * (3) Query Parameter: /users?version=1.
Rate Limiting: * Definition: Limits requests per time window per client/IP/API key. * Algorithms: Token bucket (allows bursts), Leaky bucket (strict constant rate), Fixed window counter, Sliding window log. * Implementation: Use Redis for distributed limiting. Return HTTP $429$ with a Retry-After header.
Real-time Communication: * Polling: Simple but wasteful. * Long-polling: More efficient than basic polling. * SSE (Server-Sent Events): One-direction push (Server → Client); simpler than WebSockets. * WebSockets: Full duplex (bi-directional); best for chat/collaboration.

Database & SQL Deep-Dive

ACID & Transactions

Atomicity: All or nothing. Either all operations commit or none do. Example: Transferring money involves debit + credit; both must succeed.
Consistency: The database moves from one valid state to another. Constraints (Foreign keys, NOT NULL) are enforced after every transaction.
Isolation: Concurrent transactions do not see each other's intermediate states. Users see consistent snapshots.
Durability: Committed transactions persist even on crash. Data is written to disk or the WAL (Write-Ahead Log).
Isolation Levels (PostgreSQL): * READ COMMITTED: Default; reads only committed data. * REPEATABLE READ: Consistent snapshot for the entire transaction duration. * SERIALIZABLE: Strictest; prevents phantoms. Note: Higher isolation increases locking and lowers throughput.
Transaction Phenomena: * Dirty Read: Reading uncommitted data (Prevented by READ COMMITTED). * Phantom Read: Different results when re-running a query in the same transaction due to other inserts/deletes (Prevented by SERIALIZABLE).

Indexes & Query Optimization

B-tree Index: A balanced tree structure. Provides $O( ext{log } n)$ lookups and supports range queries. The default index for PostgreSQL/MySQL.
When Not to Index: Avoid on low-cardinality columns (e.g., booleans), write-heavy tables (as indexes slow down INSERT/UPDATE/DELETE), or very small tables where full scans are faster.
Covering Index: An index containing all columns required by a query, allowing index-only scans without touching the table data.
EXPLAIN ANALYZE Indicators: * Large Seq Scans (should ideally be Index Scans). * High rows/cost estimates. * Nested loop joins on extremely large datasets. * Mismatch between actual and estimated row count (Suggests stale statistics; solved by running ANALYZE).
N+1 Query Problem: Occurs when querying one record and then looping to perform $N$ sub-queries for related records. * Fixes: Use JOINs, SELECT IN (...), eager loading, or prefetch_related.

Key SQL Knowledge

Window Functions: Non-aggregating aggregates. Example: ROW_NUMBER() OVER (PARTITION BY dept ORDER BY salary DESC). Ranks employees per department without collapsing rows. Includes: RANK, DENSE_RANK, LAG, LEAD, SUM/AVG OVER.
WHERE vs. HAVING: WHERE filters rows before grouping. HAVING filters groups after GROUP BY. Aggregate functions cannot be used in WHERE.
NULL Comparison: NULL = NULL returns UNKNOWN, not TRUE. Use IS NULL or IS NOT NULL.
Common Table Expressions (CTEs) vs. Subqueries vs. Temp Tables: * CTE (WITH clause): Readable and reusable; non-materialized by default in PostgreSQL. * Subquery: Inline; can be slow if correlated. * Temp Table: Materialized to disk; useful for multi-step ETL or indexing intermediate results.
Finding/Deleting Duplicates: * Find: SELECT col, COUNT(*) FROM table GROUP BY col HAVING COUNT(*) > 1. * Delete: Use ROW_NUMBER() partition, keep records where $rn=1$ , and delete others.

PostgreSQL Specifics

Row-Level Security (RLS): Policies that control row visibility/modification based on the user's role. Key for multi-tenant SaaS. Enable via ALTER TABLE ... ENABLE ROW LEVEL SECURITY.
Partial Index: An index with a WHERE clause targeting specific rows. Example: CREATE INDEX ON orders(user_id) WHERE status='pending'.
pg_cron: A PostgreSQL extension used for scheduling SQL-based jobs like cleanup or aggregations internally.
Statement Timeouts: Triggered when a query exceeds statement_timeout. Causes: missing indexes, expensive JOINs, or $N+1$ triggers.

System Design

The RESHADED Framework

R - Requirements: Clarify functional and non-functional requirements (Scale, Latency, Availability).
E - Estimation: Back-of-the-envelope calculations for DAU, QPS, and storage.
S - Storage: Choose SQL vs NoSQL, sketch schema, and estimate storage volume.
H - High-Level Design: Map main components (LB, API Servers, DB, Cache, Queue).
A - API Design: Define endpoints and data contracts.
D - Deep Dive: Focus on bottlenecks (e.g., DB sharding, caching).
E - Edge Cases: Failures, clock skew, hot partitions.
D - Done: Summarize trade-offs.

Core System Design Concepts

CAP Theorem: Choose only 2 out of: Consistency, Availability, Partition Tolerance. * CA: SQL. * CP: HBase, MongoDB. * AP: Cassandra, DynamoDB.
Scaling: Horizontal scaling (adding more servers) relies on stateless applications and load balancers.
Sharding: Partitioning DB per shard key (e.g., user_id). Risks: Hotspots, resharding complexity.
Replication: Leader-follower (Primary-replica) sets for scaling reads and High Availability (HA).
Message Queues: Async decoupling (Kafka, RabbitMQ, SQS). Useful for retry queues and event-driven architecture.
CDN (Content Delivery Network): Edge servers close to users serving static content. Reduces origin load.
Load Balancer: Distributes traffic. Layer $4$ (TCP) or Layer $7$ (HTTP-aware). Uses Health Checks and SSL termination.

Classic System Design Blueprints

URL Shortener (TinyURL): Hash long URL to $6$ – $8$ char base $62$ key. Use KV store (Redis/DynamoDB). Handle collisions and rate limits.
Chat App (WhatsApp): WebSockets for real-time. Cassandra for message storage. Presence service for status. Media stored in S3.
Social Feed (Twitter): Push model (fanout on write) for normal users. Pull model (fanout on read) for celebrities. Hybrid for scalability.
Notification System: Service routing to email/SMS/push. Includes templates, retry queues, and deduplication IDs.
File Upload Service: Use presigned URLs from the server for direct upload to object stores (S3). Use chunking for large files.

Data Structures & Algorithms (Live Coding)

Most Frequently Tested Patterns

Two Pointers: Used for sorted arrays, pairs, triplets, and palindromes (e.g., 3Sum, Container With Most Water).
Sliding Window: Used for subarrays or substrings of fixed/variable size (e.g., Longest substring without repeat).
Hash Map / Set: Essential for $O(1)$ lookups, frequency counts, and deduplication (e.g., Two Sum, Group Anagrams).
BFS / Graph: Shortest path in unweighted graphs or level order traversal (e.g., Number of Islands, Word Ladder).
DFS / Backtracking: Exploring all paths, combinations, or permutations (e.g., N-Queens, Subsets).
Binary Search: Sorted input or finding boundary conditions (e.g., Search in Rotated Array).
Stack / Monotonic Stack: Brackets matching and "next greater element" problems.
Dynamic Programming: Optimal substructure and overlapping subproblems (e.g., Coin Change, Knapsack).
Heap / Priority Queue: Top-K elements, median streaming, or task scheduling.

Big-O Complexity Cheat Sheet

Structure	Access	Search	Insert	Delete	Notes
Array	$O(1)$	$O(n)$	$O(n)$	$O(n)$	Random access; cache-friendly
Linked List	$O(n)$	$O(n)$	$O(1)$	$O(1)$	Requires pointer to node
Hash Table	—	$O(1)$	$O(1)$	$O(1)$	Average case; $O(n)$ worst on collisions
BST (Balanced)	—	$O(\text{log } n)$	$O(\text{log } n)$	$O(\text{log } n)$	AVL / Red-Black trees
Heap	$O(n)$	—	$O(\text{log } n)$	$O(\text{log } n)$	Min/Max in $O(1)$
Stack/Queue	—	—	$O(1)$	$O(1)$	LIFO / FIFO

Live Coding Approach (Time Management)

Clarify ( $2$ min): Ask about edge cases and expected data formats. Do not start coding.
Brute Force ( $1$ min): State the brute force complexity verbally.
Optimize ( $2$ min): Think aloud (e.g., "I can use a hashmap to reduce $O(n^2)$ .").
Code ( $15$ min): Write clean, readable code with meaningful variable names.
Test ( $3$ min): Trace through examples manually. Catching your own bugs is a positive signal.
Analysis: Always state Time and Space complexity unprompted at the end.
Critical Rule: If stuck for $5$ minutes, ask for a hint.

Odoo-Specific Questions

Core Architecture

Odoo ORM: Maps Python classes to PostgreSQL tables. Inherits from models.Model. Prefetching loads fields for multiple records in batches.
Model Types: * models.Model: Records are persisted in the database. * models.TransientModel: Temporary records (e.g., for wizards); auto-deleted periodically. * models.AbstractModel: Used as a mixin to share logic; not instantiated as a table.
Module Structure: Includes __manifest__.py, models/, views/ (XML), security/ (CSV + record rules), controllers/, and data/.
Inheritance: * Classical (_inherit = 'model'): Extends the existing model in-place. * Prototype (_inherit + _name): Creates a new model based on a parent copy. * Delegation (_inherits): Embeds parent via Many2one; fields are shared via foreign key.
Access Rules: * Access Rights (ir.model.access.csv): CRUD permissions per group for a model. * Record Rules (ir.rule): Dynamic domain-based filters (e.g., "User sees only their own leads").

ORM & Performance

Avoiding N+1: Odoo ORM prefetches by default. Avoid search() inside loops. Use record.mapped('field') for batch related fields.
Raw SQL: Use self.env.cr.execute('SELECT...', params) with $\%s$ placeholders. Use fetchall() for results.
Lazy Loading: Fields are not fetched until accessed. Accessing one field triggers a batch fetch for all records in the prefetch set.
Debugging: Enable --log-level=debug. Use EXTENDED ANALYZE in psql. Check for indexes on _order or frequently searched columns.
Computed Fields: Defined with compute='_method'. store=True makes it a searchable DB column; store=False (default) computes on the fly and is not searchable.

Custom Module Development

Creating from Scratch: Directory → Manifest → Import models → Define class → Access CSV → View XML → Install.
Extending Views: Create XML with inherit_id. Use xpath with positions: after, before, inside, replace, or attributes.
Server Actions: ir.actions.server. Can be automated triggers. Access records via env, self, etc.
QWeb: XML-based templating engine for reports and websites. Directives: t-if, t-foreach, t-field, t-call.

Quick Cheat Sheet

ACID: Atomicity, Consistency, Isolation, Durability.
CAP: Consistency, Availability, Partition Tolerance.
REST: Stateless HTTP style with $6$ constraints.
JWT: Stateless auth token (Header, Payload, Signature).
Idempotency: $N$ calls = $1$ call result.
Sharding: Horizontal DB partitioning.
CDN: Edge servers for latency reduction.
RLS: Database policies for row visibility.
TransientModel: Odoo temporary storage (wizard).
Big-O $O(\text{log } n)$ : Halving the problem at each step (Binary Search).

Final Mindset Tips

Think Out Loud: Reasoning is more valuable than just the correct solution.
Honesty: If you don't know, explain how you would find out rather than bluffing.
Trade-offs Over Perfection: In system design, acknowledge that no single "right" answer exists.
Own Your Projects: Be prepared to justify every technical decision in your portfolio.
Ask Clarifying Questions: Signals senior-level systematic thinking.