AM

Chapter 7 – Databases in Applications: Comprehensive Study Notes

Location in Data Storage and Multi-Tiered Perspective

  • "Location, location, location": in application design the physical/ logical location of data is as crucial as in real estate.
  • Multi-tiered (client, application, database) environments provide several storage possibilities; architects can balance
    • Performance (latency, bandwidth usage)
    • Offline capability (local caches on smartphones, tablets)
    • Security / regulatory controls (certain data must remain on servers)
  • Client/server principles drive all modern mobile services (text messaging, browsing, image upload/download).
  • Immediate, sub-second response times are now a user expectation → tight understanding of request/response lifecycle mandatory.

Client/Server Fundamentals

  • Definition: split application processing between an intelligent front-end client and an intelligent back-end server.
  • Either machine may reside on the same box or across a network.
  • Key historical drivers (past 25 yrs): inexpensive PCs, GUIs, LANs/WANs, smartphones/tablets, broadband/wireless comms.
  • Electronic commerce pattern: client (PC/phone) → dynamic Web page → database → real-time information to user.
  • Legacy mainframe apps often re-partitioned into client/server to profit from lower-cost networks & modern interfaces.
  • Benefits of the model
    • Flexibility (choose best platform per tier)
    • Scalability (upgrade one tier w/o total redesign)
    • Extensibility (add new data types/operations).

Client/Server Application Logic Layers

  • Presentation Logic
    • Handles UI rendering & user input validation.
    • Typically executes in browser, desktop app, or mobile app.
  • Business Logic
    • Implements rules, calculations, workflow.
    • May be split (fat client vs. fat server) or concentrated in middle tier.
  • Storage Logic
    • CRUD access, indexing, physical storage.
    • Lives inside DBMS on database server.

Architecture Types

  • Two-Tier
    • Thick client: business logic local → \text{more CPU/RAM on client, lighter servers}.
    • Thin client: business logic on server → browser acts mostly as display.
    • Distributed: logic divided.
  • Three-Tier (most Web systems)
    • Tier 1 = client/browser (thin).
    • Tier 2 = application server(s) with code in C#, Java, Python, PHP …
    • Tier 3 = database server.
  • N-Tier
    • Adds further specialized tiers (caching, micro-services, reporting) → more flexibility/manageability, higher complexity.

Typical Web Application Stack Components

  • Database Server: Oracle, SQL Server, MySQL, PostgreSQL …
  • Web Server: Apache, IIS, Nginx – receives HTTP/HTTPS.
  • Application Server / Frameworks: .NET Core, Java EE, Node.js, Django, Flask, PHP.
  • Browser (client): Chrome, Firefox, Edge, Safari, mobile WebView.

Canonical Configuration Examples

  • Microsoft: IIS + ASP.NET + SQL Server/Oracle.
  • Java: Apache + Oracle or IBM DB2 + JSP/Servlet/EJB.
  • LAMP: Linux + Apache + MySQL + PHP/Python/Perl.

Databases in Three-Tier Execution Flow

  • User requests URL.
  • Web server inspects suffix:
    • .html/.htm → static file returned directly.
    • .jsp, .aspx, .py, .php … → forwarded to application server.
  • Application server:
    1. Parses template, executes business logic.
    2. Uses middleware to create DB connection.
    3. Submits SQL, receives result set.
    4. Renders dynamic HTML/XML/JSON back to browser.

Middleware & Database-Oriented Connectivity

  • Middleware = "glue" enabling inter-application operation without low-level socket coding.
  • Database-oriented middleware: API + driver.
    • ODBC, ADO.NET (VB.NET, C#) — Microsoft ecosystem.
    • JDBC — Java.
    • Python DB-API connectors (psycopg2, mysqlclient …).
  • Generic 6-step DB access pattern (language-agnostic):
    1. Identify / register driver.
    2. \text{conn} = \text{openConnection()}.
    3. \text{result} = \text{conn.execute(query)}.
    4. Process \text{result}.
    5. Loop as necessary.
    6. \text{conn.close()}.

Java Web Application (JSP) Walk-Through

  • JSP blends HTML (view) & Java (controller/model) — dynamic pages compiled to servlets.
  • registration.jsp example responsibilities:
    • Display form (FirstName, LastName, userName, password, confirmPassword …).
    • On submit: server-side Java checks for
    • Missing fields
    • Mismatch passwords
    • If errors → redisplay with message via <% if … %> blocks.
    • Else → build SQL INSERT, execute using conn.prepareStatement, executeUpdate, conn.commit().
    • Redirect to success.jsp on commit.
  • Java-inside-JSP delimiters: <% code %> — hidden from client.
  • ResultSet navigation:
    • Cursor starts before first row.
    • .next() → advance & return boolean.
    • Common mappings:
    • INTEGER ↔ int via getInt()/setInt()
    • CHAR/VARCHAR ↔ String via getString()/setString()
    • DATE ↔ java.util.Date via getDate()/setDate()
    • TIME ↔ java.sql.Time
    • TIMESTAMP ↔ java.sql.Timestamp.
    • Default: forward-only, read-only; advanced types allow scroll/update.
  • Maintenance drawback: intermix of HTML, Java, SQL → solved via MVC frameworks (Struts, Spring MVC, JSF, etc.).

Python Web Application (Django) Example

  • Python: cross-platform, readable syntax; booming due to web frameworks.
  • Django embodies MVC (or MTV: Model–Template–View).
  • Architecture tiers:
    • Client: HTML + JavaScript, no embedded Python.
    • Application: Python classes
    • Model: one per table (attributes → columns).
    • View: functions/classes responding to HTTP verbs; returns JSON, HTML, etc.
    • Serializer (when using Django REST Framework): converts model objects → JSON.
    • Database: configured in settings.py (ENGINE driver, NAME path/URL).
  • Example EmployeeViewSet:
    • Employee.objects.all().order_by('LastName') retrieves rows.
    • EmployeeSerializer(queryset, many=True) –> JSON.
  • Switching DBMS: edit settings.py; ORM abstracts SQL.
  • Minor column changes: update Model; run migrations; View code untouched.

Key Considerations for Three-Tier Developers

  • Core CRUD/transaction code patterns are language-neutral; only API placement differs.
  • Devs must balance:
    • Connection management (pooling, min lifespan).
    • Stored procedure vs. ORM vs. inline SQL.
    • Transaction boundaries / isolation.
    • Security at every tier.

Stored Procedures

  • Definition: precompiled SQL modules stored & executed inside DBMS engine.
  • Pros:
    • Performance via pre-parsing/optimizing.
    • Reduced network traffic (logic server-side).
    • Security (expose proc, hide tables).
    • Data integrity (single canonical implementation for multiple apps).
    • Leaner client code.
  • Cons:
    • Longer dev & test cycles (proprietary languages: PL/SQL, T-SQL).
    • Portability issues when migrating DBMS.
  • Example: PL/SQL procedure checkUserExists returning BOOL; Java would call via CallableStatement.

Transactions & the ACID Model

  • Business transaction = sequence of CRUD steps that together accomplish a task (e.g., place order).
  • ACID properties:
    • \textbf{Atomic} – all or nothing.
    • \textbf{Consistent} – DB constraints satisfied pre & post.
    • \textbf{Isolated} – no intermediate state visible.
    • \textbf{Durable} – committed changes survive crashes.
  • SQL keywords:
    • BEGIN TRANSACTION / START TRANSACTION → begin log.
    • COMMIT → make permanent.
    • ROLLBACK → undo to begin.
    • AUTOCOMMIT ON/OFF.
  • Best practices: keep transactions short, gather user input beforehand, validate early, release locks ASAP.

Concurrency Control

  • Necessity: multi-user environment simultaneous access potential data integrity conflict.
  • Two broad philosophies:
    1. Pessimistic (assume conflict likely) → locking.
    2. Optimistic (assume conflict rare) → versioning.
  • CPU actually executes one instruction at a time, rapidly interleaving transactions to appear parallel.

Lost Updates & Inconsistent Reads

  • Lost update example (John & Marsha joint account): simultaneous withdraws cause one overwrite.
  • Inconsistent/dirty/unrepeatable reads: transaction reads data mid-update.
  • Remedy: serializable isolation level via locks or MVCC.

Serializability & Locking Fundamentals

  • Goal: produce schedule equivalent to some sequential order.
  • Implementation: lock data being changed until transaction ends or aborts.
  • Locking granularity:
    • Database
    • Table
    • Block/Page (common)
    • Record (row)
    • Field (rare)
  • Lock types:
    • Shared (S/read) → many readers, no writers.
    • Exclusive (X/write) → single writer, no readers/writers.

Deadlocks

  • Condition: T1 waits for resource held by T2 while T2 waits for resource held by T1 (cycle).
  • Prevention (impractical): conservative two-phase locking; order resources.
  • Resolution: DBMS detects cycles via wait-for/resource matrix; chooses victim, rolls it back, releases locks, restarts.

Versioning (Optimistic Concurrency / MVCC)

  • Concept: each transaction sees snapshot at start time.
  • On update, DBMS writes new record version; commit merges if no conflict.
  • If conflict (two updates same row): first commit wins, second rolls back & retries.
  • Advantage: readonly transactions run without blocking writers; high performance.

Database Connections & Pooling

  • Web & application servers may share hardware; DB often remote.
  • Opening sockets, auth, session memory expensive; DBMS caps concurrent sessions.
  • Connection pool driver keeps small set open; application borrows/returns.
  • Dev guideline: open late, close early; reuse prepared statements when possible.

Three-Tier Benefits Recap

  • Scalability: add application servers, TP monitors.
  • Technological flexibility: swap DBMS or move middle tier to cloud containers.
  • Lower long-term cost: upgrade modules not monolith.
  • Better business alignment: fine-grained services.
  • Improved customer service: multiple client types hitting same logic.
  • Competitive agility & reduced project risk.
  • Trade-offs: higher initial cost, need skilled personnel, standards/gov issues.

Security Across the Stack

  • Comprehensive program covers physical, OS, network, DBMS, application, personnel.
  • Threats
    • Accidental loss (user error, HW failure)
    • Theft/fraud (electronic & physical)
    • Privacy/confidentiality breaches
    • Integrity compromise
    • Availability denial (sabotage, virus)
  • Regulatory drivers: Sarbanes-Oxley, PCI-DSS, GDPR, etc.

Server-Side Controls

  • Servers in locked rooms; limited admin accounts.
  • Multi-layer passwords (OS + DBMS accounts).
  • GRANT/REVOKE least-privilege.

Network Controls

  • Dangers: eavesdropping, spoofing, packet capture.
  • Countermeasures: encryption (TLS), mutual authentication, IP filtering, VPN.
  • Maintain audit trails for forensics.

Web Security Particularities

  • Dynamic pages introduce SQL-Injection risk → cleanse/filter input, parameterized queries.
  • User authentication: HTML login forms → session cookies.
  • Session security: HTTPS (SSL/TLS) encrypts cookies & payload.
  • Web server hardening: minimize open ports, remove unused packages, limit compiler/interpreter access.

Data Privacy & Legislation

  • Users’ right to know, correct, delete personal data under GDPR.
  • Platform for Privacy Preferences (P3P) communicates site policy vs. browser/user preferences.
  • Employers may legally monitor employee communications on corporate assets; encryption recommended for private comms.

Glossary & Key Terms

  • Application Partitioning – distributing code across tiers.
  • API – callable routines exposing service (ODBC, JDBC …).
  • Two-Phase Locking (2PL) – growing phase acquire locks, shrinking release; ensures serializability.
  • Thick vs. Thin Client – amount of business logic stored client-side.
  • Transaction Boundaries – begin/commit/rollback points.
  • TP Monitor – middleware coordinating transactions among multiple resources.
  • JSON – lightweight text format for data interchange, native to JS.

Big-Picture Takeaways

  • Selecting the right partitioning and concurrency strategy is fundamental to performance, integrity, and maintainability.
  • Middleware APIs abstract vendor specifics but devs must still understand underlying DB constraints (transactions, locks, security).
  • Three-tier + cloud/container orchestration is current mainstream; expect continued movement toward micro-services, serverless functions using same core principles.