Software Engineering Interview Study Guide 2025-2026

Core Backend Concepts

REST & HTTP Fundamentals

  • REST (Representational State Transfer) Definition: An architectural style for providing standards between computer systems on the web.

  • The 6 Constraints of REST:     * (1) Client-Server Separation: The user interface and data storage concerns are independent.     * (2) Statelessness: Every request must carry all information needed to be processed. The server stores zero client context between requests.     * (3) Cacheable: Responses must define themselves as cacheable or not to prevent clients from reusing stale data.     * (4) Uniform Interface: Simplifies and decouples the architecture, allowing each part to evolve independently.     * (5) Layered System: A client cannot ordinarily tell whether it is connected directly to the end server or an intermediate (e.g., load balancer).     * (6) Code on Demand (Optional): Servers can temporarily extend client functionality by transferring executable code (e.g., JavaScript).

  • HTTP Methods: PUT vs. PATCH:     * PUT: Replaces the entire resource. It is idempotent. If fields are missing in the request, they get nulled out in the database.     * PATCH: Partially updates a resource. It only touches the specific fields sent in the request.

  • Essential HTTP Status Codes:     * 200200 OK: Success.     * 201201 Created: Success, resource created.     * 204204 No Content: Success, but nothing to return.     * 400400 Bad Request: Client-side error in request.     * 401401 Unauthorized: Not authenticated.     * 403403 Forbidden: Authenticated, but lacks required permissions.     * 404404 Not Found: Resource non-existent.     * 409409 Conflict: Conflict with current state (e.g., duplicate entries).     * 422422 Unprocessable Entity: Semantic errors (often used for validation errors).     * 429429 Too Many Requests: Rate limit exceeded.     * 500500 Internal Server Error: Generic server-side failure.     * 503503 Service Unavailable: Server temporary overload or maintenance.

  • REST vs. GraphQL:     * REST: Chosen for simplicity, established standards, superior caching, and typical CRUD APIs.     * GraphQL: Chosen for flexible client queries, avoiding over-fetching/under-fetching, or supporting diverse frontend clients.     * Trade-offs: GraphQL introduces complexity overhead, such as the N+1N+1 problem if not batched, and more difficult caching logic.

  • HATEOAS (Hypermedia As The Engine Of Application State):     * Concepts: Server responses include hypermedia links to valid next actions.     * Advantage: Allows clients to discover the API dynamically.     * Practicality: Rarely implemented in real-world scenarios but a frequent interview topic.

Authentication & Security

  • JWT vs. Session-based Authentication:     * JWT (JSON Web Token): Stateless, scales horizontally because no server-side storage is needed. Disadvantage: Cannot invalidate tokens before expiry; requires short TTLTTL and refresh tokens.     * Session: Stored on the server. Advantage: Easy to revoke sessions. Disadvantage: Requires sticky sessions or a shared session store like Redis in distributed environments.

  • SQL Injection Prevention:     * Method: Always use parameterized queries or prepared statements.     * Warning: Never interpolate user input directly into SQL strings. ORMs typically handle this by default. Do not rely solely on input sanitization.

  • CSRF (Cross-Site Request Forgery):     * Definition: A malicious site tricks a user's browser into performing an unwanted action on a site where they are logged in.     * Mitigations: Use CSRF tokens (server-generated nonces in forms), use the SameSite cookie attribute, and check Origin/Referer headers.

  • Encryption Types:     * Symmetric: One shared key (e.g., AES). Fast and suitable for bulk data.     * Asymmetric: Public/private key pair (e.g., RSA, ECC). Slower, but allows key exchange without shared secrets. HTTPS uses asymmetric to exchange a symmetric session key.

  • OAuth2 Flow Simplified:     * 11: Resource Owner (user) grants permission to the Client app.     * 22: Authorization Server issues an access token.     * 33: Client uses the token to call the Resource Server (API).     * Note: Access tokens are short-lived; refresh tokens extend sessions without re-login.

Caching

  • The Caching Layers: Browser cache → CDN cache → Load balancer cache → Application-level (Redis/Memcached) → DB query cache → CPU cache.

  • Cache Eviction Strategies:     * LRU (Least Recently Used): The most common default.     * LFU (Least Frequently Used): Prioritizes keeping high-frequency data.     * FIFO (First-In, First-Out): Simple queue logic.     * TTL (Time To Live): Expiry based on a timer.

  • Cache Patterns:     * Cache-aside (Lazy Loading): Application checks the cache; on a miss, queries the database, stores result in cache, and returns. It is resilient to cache failures.     * Write-through/Write-behind: Alternative patterns where cache updates happen during writes.

  • When NOT to Cache: Regularly changing data, security-sensitive/per-user responses, or when strict consistency is required. Over-caching leads to stale data bugs.

Advanced API Topics

  • Idempotency:     * Definition: An operation is idempotent if calling it NN times has the same effect as calling it once.     * Status: GET, PUT, and DELETE are idempotent. POST is not.     * Importance: Critical for retry logic on network failures.

  • API Versioning Approaches:     * (1) URL Path: /api/v1/users (Most common and clearest).     * (2) Header: Accept: application/vnd.api+json;version=1.     * (3) Query Parameter: /users?version=1.

  • Rate Limiting:     * Definition: Limits requests per time window per client/IP/API key.     * Algorithms: Token bucket (allows bursts), Leaky bucket (strict constant rate), Fixed window counter, Sliding window log.     * Implementation: Use Redis for distributed limiting. Return HTTP 429429 with a Retry-After header.

  • Real-time Communication:     * Polling: Simple but wasteful.     * Long-polling: More efficient than basic polling.     * SSE (Server-Sent Events): One-direction push (Server → Client); simpler than WebSockets.     * WebSockets: Full duplex (bi-directional); best for chat/collaboration.

Database & SQL Deep-Dive

ACID & Transactions

  • Atomicity: All or nothing. Either all operations commit or none do. Example: Transferring money involves debit + credit; both must succeed.

  • Consistency: The database moves from one valid state to another. Constraints (Foreign keys, NOT NULL) are enforced after every transaction.

  • Isolation: Concurrent transactions do not see each other's intermediate states. Users see consistent snapshots.

  • Durability: Committed transactions persist even on crash. Data is written to disk or the WAL (Write-Ahead Log).

  • Isolation Levels (PostgreSQL):     * READ COMMITTED: Default; reads only committed data.     * REPEATABLE READ: Consistent snapshot for the entire transaction duration.     * SERIALIZABLE: Strictest; prevents phantoms. Note: Higher isolation increases locking and lowers throughput.

  • Transaction Phenomena:     * Dirty Read: Reading uncommitted data (Prevented by READ COMMITTED).     * Phantom Read: Different results when re-running a query in the same transaction due to other inserts/deletes (Prevented by SERIALIZABLE).

Indexes & Query Optimization

  • B-tree Index: A balanced tree structure. Provides O(extlogn)O( ext{log } n) lookups and supports range queries. The default index for PostgreSQL/MySQL.

  • When Not to Index: Avoid on low-cardinality columns (e.g., booleans), write-heavy tables (as indexes slow down INSERT/UPDATE/DELETE), or very small tables where full scans are faster.

  • Covering Index: An index containing all columns required by a query, allowing index-only scans without touching the table data.

  • EXPLAIN ANALYZE Indicators:     * Large Seq Scans (should ideally be Index Scans).     * High rows/cost estimates.     * Nested loop joins on extremely large datasets.     * Mismatch between actual and estimated row count (Suggests stale statistics; solved by running ANALYZE).

  • N+1 Query Problem: Occurs when querying one record and then looping to perform NN sub-queries for related records.     * Fixes: Use JOINs, SELECT IN (...), eager loading, or prefetch_related.

Key SQL Knowledge

  • Window Functions: Non-aggregating aggregates. Example: ROW_NUMBER() OVER (PARTITION BY dept ORDER BY salary DESC). Ranks employees per department without collapsing rows. Includes: RANK, DENSE_RANK, LAG, LEAD, SUM/AVG OVER.

  • WHERE vs. HAVING: WHERE filters rows before grouping. HAVING filters groups after GROUP BY. Aggregate functions cannot be used in WHERE.

  • NULL Comparison: NULL = NULL returns UNKNOWN, not TRUE. Use IS NULL or IS NOT NULL.

  • Common Table Expressions (CTEs) vs. Subqueries vs. Temp Tables:     * CTE (WITH clause): Readable and reusable; non-materialized by default in PostgreSQL.     * Subquery: Inline; can be slow if correlated.     * Temp Table: Materialized to disk; useful for multi-step ETL or indexing intermediate results.

  • Finding/Deleting Duplicates:     * Find: SELECT col, COUNT(*) FROM table GROUP BY col HAVING COUNT(*) > 1.     * Delete: Use ROW_NUMBER() partition, keep records where rn=1rn=1, and delete others.

PostgreSQL Specifics

  • Row-Level Security (RLS): Policies that control row visibility/modification based on the user's role. Key for multi-tenant SaaS. Enable via ALTER TABLE ... ENABLE ROW LEVEL SECURITY.

  • Partial Index: An index with a WHERE clause targeting specific rows. Example: CREATE INDEX ON orders(user_id) WHERE status='pending'.

  • pg_cron: A PostgreSQL extension used for scheduling SQL-based jobs like cleanup or aggregations internally.

  • Statement Timeouts: Triggered when a query exceeds statement_timeout. Causes: missing indexes, expensive JOINs, or N+1N+1 triggers.

System Design

The RESHADED Framework

  • R - Requirements: Clarify functional and non-functional requirements (Scale, Latency, Availability).

  • E - Estimation: Back-of-the-envelope calculations for DAU, QPS, and storage.

  • S - Storage: Choose SQL vs NoSQL, sketch schema, and estimate storage volume.

  • H - High-Level Design: Map main components (LB, API Servers, DB, Cache, Queue).

  • A - API Design: Define endpoints and data contracts.

  • D - Deep Dive: Focus on bottlenecks (e.g., DB sharding, caching).

  • E - Edge Cases: Failures, clock skew, hot partitions.

  • D - Done: Summarize trade-offs.

Core System Design Concepts

  • CAP Theorem: Choose only 2 out of: Consistency, Availability, Partition Tolerance.     * CA: SQL.     * CP: HBase, MongoDB.     * AP: Cassandra, DynamoDB.

  • Scaling: Horizontal scaling (adding more servers) relies on stateless applications and load balancers.

  • Sharding: Partitioning DB per shard key (e.g., user_id). Risks: Hotspots, resharding complexity.

  • Replication: Leader-follower (Primary-replica) sets for scaling reads and High Availability (HA).

  • Message Queues: Async decoupling (Kafka, RabbitMQ, SQS). Useful for retry queues and event-driven architecture.

  • CDN (Content Delivery Network): Edge servers close to users serving static content. Reduces origin load.

  • Load Balancer: Distributes traffic. Layer 44 (TCP) or Layer 77 (HTTP-aware). Uses Health Checks and SSL termination.

Classic System Design Blueprints

  • URL Shortener (TinyURL): Hash long URL to 6688 char base6262 key. Use KV store (Redis/DynamoDB). Handle collisions and rate limits.

  • Chat App (WhatsApp): WebSockets for real-time. Cassandra for message storage. Presence service for status. Media stored in S3.

  • Social Feed (Twitter): Push model (fanout on write) for normal users. Pull model (fanout on read) for celebrities. Hybrid for scalability.

  • Notification System: Service routing to email/SMS/push. Includes templates, retry queues, and deduplication IDs.

  • File Upload Service: Use presigned URLs from the server for direct upload to object stores (S3). Use chunking for large files.

Data Structures & Algorithms (Live Coding)

Most Frequently Tested Patterns

  • Two Pointers: Used for sorted arrays, pairs, triplets, and palindromes (e.g., 3Sum, Container With Most Water).

  • Sliding Window: Used for subarrays or substrings of fixed/variable size (e.g., Longest substring without repeat).

  • Hash Map / Set: Essential for O(1)O(1) lookups, frequency counts, and deduplication (e.g., Two Sum, Group Anagrams).

  • BFS / Graph: Shortest path in unweighted graphs or level order traversal (e.g., Number of Islands, Word Ladder).

  • DFS / Backtracking: Exploring all paths, combinations, or permutations (e.g., N-Queens, Subsets).

  • Binary Search: Sorted input or finding boundary conditions (e.g., Search in Rotated Array).

  • Stack / Monotonic Stack: Brackets matching and "next greater element" problems.

  • Dynamic Programming: Optimal substructure and overlapping subproblems (e.g., Coin Change, Knapsack).

  • Heap / Priority Queue: Top-K elements, median streaming, or task scheduling.

Big-O Complexity Cheat Sheet

Structure

Access

Search

Insert

Delete

Notes

Array

O(1)O(1)

O(n)O(n)

O(n)O(n)

O(n)O(n)

Random access; cache-friendly

Linked List

O(n)O(n)

O(n)O(n)

O(1)O(1)

O(1)O(1)

Requires pointer to node

Hash Table

O(1)O(1)

O(1)O(1)

O(1)O(1)

Average case; O(n)O(n) worst on collisions

BST (Balanced)

O(log n)O(\text{log } n)

O(log n)O(\text{log } n)

O(log n)O(\text{log } n)

AVL / Red-Black trees

Heap

O(n)O(n)

O(log n)O(\text{log } n)

O(log n)O(\text{log } n)

Min/Max in O(1)O(1)

Stack/Queue

O(1)O(1)

O(1)O(1)

LIFO / FIFO

Live Coding Approach (Time Management)

  • Clarify (22 min): Ask about edge cases and expected data formats. Do not start coding.

  • Brute Force (11 min): State the brute force complexity verbally.

  • Optimize (22 min): Think aloud (e.g., "I can use a hashmap to reduce O(n2)O(n^2).").

  • Code (1515 min): Write clean, readable code with meaningful variable names.

  • Test (33 min): Trace through examples manually. Catching your own bugs is a positive signal.

  • Analysis: Always state Time and Space complexity unprompted at the end.

  • Critical Rule: If stuck for 55 minutes, ask for a hint.

Odoo-Specific Questions

Core Architecture

  • Odoo ORM: Maps Python classes to PostgreSQL tables. Inherits from models.Model. Prefetching loads fields for multiple records in batches.

  • Model Types:     * models.Model: Records are persisted in the database.     * models.TransientModel: Temporary records (e.g., for wizards); auto-deleted periodically.     * models.AbstractModel: Used as a mixin to share logic; not instantiated as a table.

  • Module Structure: Includes __manifest__.py, models/, views/ (XML), security/ (CSV + record rules), controllers/, and data/.

  • Inheritance:     * Classical (_inherit = 'model'): Extends the existing model in-place.     * Prototype (_inherit + _name): Creates a new model based on a parent copy.     * Delegation (_inherits): Embeds parent via Many2one; fields are shared via foreign key.

  • Access Rules:     * Access Rights (ir.model.access.csv): CRUD permissions per group for a model.     * Record Rules (ir.rule): Dynamic domain-based filters (e.g., "User sees only their own leads").

ORM & Performance

  • Avoiding N+1: Odoo ORM prefetches by default. Avoid search() inside loops. Use record.mapped('field') for batch related fields.

  • Raw SQL: Use self.env.cr.execute('SELECT...', params) with %s\%s placeholders. Use fetchall() for results.

  • Lazy Loading: Fields are not fetched until accessed. Accessing one field triggers a batch fetch for all records in the prefetch set.

  • Debugging: Enable --log-level=debug. Use EXTENDED ANALYZE in psql. Check for indexes on _order or frequently searched columns.

  • Computed Fields: Defined with compute='_method'. store=True makes it a searchable DB column; store=False (default) computes on the fly and is not searchable.

Custom Module Development

  • Creating from Scratch: Directory → Manifest → Import models → Define class → Access CSV → View XML → Install.

  • Extending Views: Create XML with inherit_id. Use xpath with positions: after, before, inside, replace, or attributes.

  • Server Actions: ir.actions.server. Can be automated triggers. Access records via env, self, etc.

  • QWeb: XML-based templating engine for reports and websites. Directives: t-if, t-foreach, t-field, t-call.

Quick Cheat Sheet

  • ACID: Atomicity, Consistency, Isolation, Durability.

  • CAP: Consistency, Availability, Partition Tolerance.

  • REST: Stateless HTTP style with 66 constraints.

  • JWT: Stateless auth token (Header, Payload, Signature).

  • Idempotency: NN calls = 11 call result.

  • Sharding: Horizontal DB partitioning.

  • CDN: Edge servers for latency reduction.

  • RLS: Database policies for row visibility.

  • TransientModel: Odoo temporary storage (wizard).

  • Big-O O(log n)O(\text{log } n): Halving the problem at each step (Binary Search).

Final Mindset Tips

  • Think Out Loud: Reasoning is more valuable than just the correct solution.

  • Honesty: If you don't know, explain how you would find out rather than bluffing.

  • Trade-offs Over Perfection: In system design, acknowledge that no single "right" answer exists.

  • Own Your Projects: Be prepared to justify every technical decision in your portfolio.

  • Ask Clarifying Questions: Signals senior-level systematic thinking.