Chapter 7 – Databases in Applications: Comprehensive Study Notes
Location in Data Storage and Multi-Tiered Perspective
- "Location, location, location": in application design the physical/ logical location of data is as crucial as in real estate.
- Multi-tiered (client, application, database) environments provide several storage possibilities; architects can balance
- Performance (latency, bandwidth usage)
- Offline capability (local caches on smartphones, tablets)
- Security / regulatory controls (certain data must remain on servers)
- Client/server principles drive all modern mobile services (text messaging, browsing, image upload/download).
- Immediate, sub-second response times are now a user expectation → tight understanding of request/response lifecycle mandatory.
Client/Server Fundamentals
- Definition: split application processing between an intelligent front-end client and an intelligent back-end server.
- Either machine may reside on the same box or across a network.
- Key historical drivers (past 25 yrs): inexpensive PCs, GUIs, LANs/WANs, smartphones/tablets, broadband/wireless comms.
- Electronic commerce pattern: client (PC/phone) → dynamic Web page → database → real-time information to user.
- Legacy mainframe apps often re-partitioned into client/server to profit from lower-cost networks & modern interfaces.
- Benefits of the model
- Flexibility (choose best platform per tier)
- Scalability (upgrade one tier w/o total redesign)
- Extensibility (add new data types/operations).
Client/Server Application Logic Layers
- Presentation Logic
- Handles UI rendering & user input validation.
- Typically executes in browser, desktop app, or mobile app.
- Business Logic
- Implements rules, calculations, workflow.
- May be split (fat client vs. fat server) or concentrated in middle tier.
- Storage Logic
- CRUD access, indexing, physical storage.
- Lives inside DBMS on database server.
Architecture Types
- Two-Tier
- Thick client: business logic local → \text{more CPU/RAM on client, lighter servers}.
- Thin client: business logic on server → browser acts mostly as display.
- Distributed: logic divided.
- Three-Tier (most Web systems)
- Tier 1 = client/browser (thin).
- Tier 2 = application server(s) with code in C#, Java, Python, PHP …
- Tier 3 = database server.
- N-Tier
- Adds further specialized tiers (caching, micro-services, reporting) → more flexibility/manageability, higher complexity.
Typical Web Application Stack Components
- Database Server: Oracle, SQL Server, MySQL, PostgreSQL …
- Web Server: Apache, IIS, Nginx – receives HTTP/HTTPS.
- Application Server / Frameworks: .NET Core, Java EE, Node.js, Django, Flask, PHP.
- Browser (client): Chrome, Firefox, Edge, Safari, mobile WebView.
Canonical Configuration Examples
- Microsoft: IIS + ASP.NET + SQL Server/Oracle.
- Java: Apache + Oracle or IBM DB2 + JSP/Servlet/EJB.
- LAMP: Linux + Apache + MySQL + PHP/Python/Perl.
Databases in Three-Tier Execution Flow
- User requests URL.
- Web server inspects suffix:
- .html/.htm → static file returned directly.
- .jsp, .aspx, .py, .php … → forwarded to application server.
- Application server:
- Parses template, executes business logic.
- Uses middleware to create DB connection.
- Submits SQL, receives result set.
- Renders dynamic HTML/XML/JSON back to browser.
Middleware & Database-Oriented Connectivity
- Middleware = "glue" enabling inter-application operation without low-level socket coding.
- Database-oriented middleware: API + driver.
- ODBC, ADO.NET (VB.NET, C#) — Microsoft ecosystem.
- JDBC — Java.
- Python DB-API connectors (psycopg2, mysqlclient …).
- Generic 6-step DB access pattern (language-agnostic):
- Identify / register driver.
- \text{conn} = \text{openConnection()}.
- \text{result} = \text{conn.execute(query)}.
- Process \text{result}.
- Loop as necessary.
- \text{conn.close()}.
Java Web Application (JSP) Walk-Through
- JSP blends HTML (view) & Java (controller/model) — dynamic pages compiled to servlets.
- registration.jsp example responsibilities:
- Display form (FirstName, LastName, userName, password, confirmPassword …).
- On submit: server-side Java checks for
- Missing fields
- Mismatch passwords
- If errors → redisplay with message via
<% if … %>
blocks. - Else → build SQL INSERT, execute using
conn.prepareStatement
, executeUpdate
, conn.commit()
. - Redirect to success.jsp on commit.
- Java-inside-JSP delimiters:
<%
code %>
— hidden from client. - ResultSet navigation:
- Cursor starts before first row.
.next()
→ advance & return boolean.- Common mappings:
- INTEGER ↔
int
via getInt()/setInt()
- CHAR/VARCHAR ↔
String
via getString()/setString()
- DATE ↔
java.util.Date
via getDate()/setDate()
- TIME ↔
java.sql.Time
- TIMESTAMP ↔
java.sql.Timestamp
. - Default: forward-only, read-only; advanced types allow scroll/update.
- Maintenance drawback: intermix of HTML, Java, SQL → solved via MVC frameworks (Struts, Spring MVC, JSF, etc.).
Python Web Application (Django) Example
- Python: cross-platform, readable syntax; booming due to web frameworks.
- Django embodies MVC (or MTV: Model–Template–View).
- Architecture tiers:
- Client: HTML + JavaScript, no embedded Python.
- Application: Python classes
- Model: one per table (attributes → columns).
- View: functions/classes responding to HTTP verbs; returns JSON, HTML, etc.
- Serializer (when using Django REST Framework): converts model objects → JSON.
- Database: configured in
settings.py
(ENGINE driver, NAME path/URL).
- Example
EmployeeViewSet
:Employee.objects.all().order_by('LastName')
retrieves rows.EmployeeSerializer(queryset, many=True)
–> JSON.
- Switching DBMS: edit
settings.py
; ORM abstracts SQL. - Minor column changes: update Model; run migrations; View code untouched.
Key Considerations for Three-Tier Developers
- Core CRUD/transaction code patterns are language-neutral; only API placement differs.
- Devs must balance:
- Connection management (pooling, min lifespan).
- Stored procedure vs. ORM vs. inline SQL.
- Transaction boundaries / isolation.
- Security at every tier.
Stored Procedures
- Definition: precompiled SQL modules stored & executed inside DBMS engine.
- Pros:
- Performance via pre-parsing/optimizing.
- Reduced network traffic (logic server-side).
- Security (expose proc, hide tables).
- Data integrity (single canonical implementation for multiple apps).
- Leaner client code.
- Cons:
- Longer dev & test cycles (proprietary languages: PL/SQL, T-SQL).
- Portability issues when migrating DBMS.
- Example: PL/SQL procedure
checkUserExists
returning BOOL; Java would call via CallableStatement.
Transactions & the ACID Model
- Business transaction = sequence of CRUD steps that together accomplish a task (e.g., place order).
- ACID properties:
- \textbf{Atomic} – all or nothing.
- \textbf{Consistent} – DB constraints satisfied pre & post.
- \textbf{Isolated} – no intermediate state visible.
- \textbf{Durable} – committed changes survive crashes.
- SQL keywords:
BEGIN TRANSACTION
/ START TRANSACTION
→ begin log.COMMIT
→ make permanent.ROLLBACK
→ undo to begin.AUTOCOMMIT
ON/OFF.
- Best practices: keep transactions short, gather user input beforehand, validate early, release locks ASAP.
Concurrency Control
- Necessity: multi-user environment → simultaneous access → potential data integrity conflict.
- Two broad philosophies:
- Pessimistic (assume conflict likely) → locking.
- Optimistic (assume conflict rare) → versioning.
- CPU actually executes one instruction at a time, rapidly interleaving transactions to appear parallel.
Lost Updates & Inconsistent Reads
- Lost update example (John & Marsha joint account): simultaneous withdraws cause one overwrite.
- Inconsistent/dirty/unrepeatable reads: transaction reads data mid-update.
- Remedy: serializable isolation level via locks or MVCC.
Serializability & Locking Fundamentals
- Goal: produce schedule equivalent to some sequential order.
- Implementation: lock data being changed until transaction ends or aborts.
- Locking granularity:
- Database
- Table
- Block/Page (common)
- Record (row)
- Field (rare)
- Lock types:
- Shared (S/read) → many readers, no writers.
- Exclusive (X/write) → single writer, no readers/writers.
Deadlocks
- Condition: T1 waits for resource held by T2 while T2 waits for resource held by T1 (cycle).
- Prevention (impractical): conservative two-phase locking; order resources.
- Resolution: DBMS detects cycles via wait-for/resource matrix; chooses victim, rolls it back, releases locks, restarts.
Versioning (Optimistic Concurrency / MVCC)
- Concept: each transaction sees snapshot at start time.
- On update, DBMS writes new record version; commit merges if no conflict.
- If conflict (two updates same row): first commit wins, second rolls back & retries.
- Advantage: readonly transactions run without blocking writers; high performance.
Database Connections & Pooling
- Web & application servers may share hardware; DB often remote.
- Opening sockets, auth, session memory expensive; DBMS caps concurrent sessions.
- Connection pool driver keeps small set open; application borrows/returns.
- Dev guideline: open late, close early; reuse prepared statements when possible.
Three-Tier Benefits Recap
- Scalability: add application servers, TP monitors.
- Technological flexibility: swap DBMS or move middle tier to cloud containers.
- Lower long-term cost: upgrade modules not monolith.
- Better business alignment: fine-grained services.
- Improved customer service: multiple client types hitting same logic.
- Competitive agility & reduced project risk.
- Trade-offs: higher initial cost, need skilled personnel, standards/gov issues.
Security Across the Stack
- Comprehensive program covers physical, OS, network, DBMS, application, personnel.
- Threats
- Accidental loss (user error, HW failure)
- Theft/fraud (electronic & physical)
- Privacy/confidentiality breaches
- Integrity compromise
- Availability denial (sabotage, virus)
- Regulatory drivers: Sarbanes-Oxley, PCI-DSS, GDPR, etc.
Server-Side Controls
- Servers in locked rooms; limited admin accounts.
- Multi-layer passwords (OS + DBMS accounts).
- GRANT/REVOKE least-privilege.
Network Controls
- Dangers: eavesdropping, spoofing, packet capture.
- Countermeasures: encryption (TLS), mutual authentication, IP filtering, VPN.
- Maintain audit trails for forensics.
Web Security Particularities
- Dynamic pages introduce SQL-Injection risk → cleanse/filter input, parameterized queries.
- User authentication: HTML login forms → session cookies.
- Session security: HTTPS (SSL/TLS) encrypts cookies & payload.
- Web server hardening: minimize open ports, remove unused packages, limit compiler/interpreter access.
Data Privacy & Legislation
- Users’ right to know, correct, delete personal data under GDPR.
- Platform for Privacy Preferences (P3P) communicates site policy vs. browser/user preferences.
- Employers may legally monitor employee communications on corporate assets; encryption recommended for private comms.
Glossary & Key Terms
- Application Partitioning – distributing code across tiers.
- API – callable routines exposing service (ODBC, JDBC …).
- Two-Phase Locking (2PL) – growing phase acquire locks, shrinking release; ensures serializability.
- Thick vs. Thin Client – amount of business logic stored client-side.
- Transaction Boundaries – begin/commit/rollback points.
- TP Monitor – middleware coordinating transactions among multiple resources.
- JSON – lightweight text format for data interchange, native to JS.
Big-Picture Takeaways
- Selecting the right partitioning and concurrency strategy is fundamental to performance, integrity, and maintainability.
- Middleware APIs abstract vendor specifics but devs must still understand underlying DB constraints (transactions, locks, security).
- Three-tier + cloud/container orchestration is current mainstream; expect continued movement toward micro-services, serverless functions using same core principles.