Database
• Collection of related data stored centrally or at multiple sites.
• Core of nearly every information system; supports transaction processing, decision support, analytics.
• Example: A university DB storing student, course, and payroll information in one logical repository.
Database Management System (DBMS)
• Software layer that creates, stores, maintains, and secures database files.
• Improves data sharing, integrity, security, and reduces program–data dependence.
• Popular packages include Oracle, MySQL, Microsoft SQL Server.
Data Hierarchy
• Logical structure: fields → records → files → databases.
• Ensures organized storage and retrieval; mirrors real-world entities.
Data Dictionary
• "Metadata repository" holding field names, data types, defaults, validation rules, indexes, relationships.
• DBA uses it for impact analysis when altering structures.
Data Model
• Blueprint that dictates how data are represented (entities, attributes, relationships), manipulated, and kept consistent.
• Must define:
– Data structures (tables, objects, nodes)
– Operations (insert, update, delete, query)
– Integrity rules (referential, domain, semantic)
• Choosing the right model (relational, object-oriented, etc.) affects scalability and performance.
Relational Model
• Organizes data into two-dimensional tables (relations). Rows = records (tuples), columns = attributes (fields).
• Mathematical foundation in set theory & relational algebra; enables \sigma (selection), \pi (projection), \bowtie (join) operations.
• Keys:
– Primary key uniquely identifies each row.
– Foreign key enforces referential integrity by pointing to a primary key in another table.
Hierarchical Model
• Data organized in a tree; one parent per child.
• Fast for 1-to-many look-ups (e.g., organizational charts).
• Drawback: redundancy if many-to-many relationships exist.
Network Model
• Extends hierarchical by allowing multiple parents per node, enabling many-to-many links.
• Uses sets and records; navigation requires explicit path traversal.
Object-Oriented Database
• Encapsulates both state (attributes) and behavior (methods) in objects.
• Supports encapsulation (grouping) and inheritance (reuse).
• Suitable for complex data (CAD, multimedia, scientific).
• Querying often via OQL (Object Query Language).
Encapsulation
• Bundles data with code manipulating it → simplifies maintenance and modeling of real-world objects like images, graphs.
Inheritance
• New object classes derive properties/methods from existing ones → faster development and consistency.
Logical vs. Physical Views
• Logical view – how users conceptually see data: tables, forms, reports.
• Physical view – how data are actually stored on media: blocks, files, indexes.
• DBMS translates logical requests into physical operations.
Create, Read, Update, Delete (CRUD)
• Fundamental operations; access control lists specify which roles can perform them.
• Example: Customer support rep may Read & Update a ticket, but not Delete records.
Normalization
• Process of decomposing tables to eliminate redundancy and anomalies; progresses through 1\text{NF} \rightarrow 2\text{NF} \rightarrow 3\text{NF} \rightarrow ….
• Benefits: smaller tables, consistent updates, simpler maintenance.
Distributed DBMS (DDBMS)
• Spreads data over multiple servers; provides location transparency and improved availability.
• Three key data-placement strategies:
– Fragmentation (horizontal, vertical, mixed)
– Replication (full/partial copies)
– Allocation (hybrid of both, placing data where accessed most).
Fragmentation
• Horizontal – each fragment holds subset of rows.
• Vertical – each holds subset of columns.
• Mixed – combination; improves locality but complicates queries.
Replication
• Duplicates tables/sites; boosts read performance & fault tolerance; risks consistency challenges handled by concurrency protocols.
Allocation
• Sites keep data they access frequently; minimizes WAN latency, balances load.
Random vs. Sequential Access File Structures
• Random – direct access to any record (e.g., hashing). Ideal for small, ad-hoc lookups.
• Sequential – records processed in fixed order; efficient for batch reporting.
Indexed Sequential Access Method (ISAM)
• Hybrid; small set → index → random access; large set → sequential scan. Balances performance.
Structured Query Language (SQL)
• Declarative 4^{\text{th}}-generation language; keywords like SELECT, INSERT, UPDATE, DELETE.
• Example query:
SELECT product_name, SUM(quantity) AS total
FROM sales
GROUP BY product_name
HAVING SUM(quantity) > 100;
Query By Example (QBE)
• Graphical interface; users fill templates; DBMS converts to SQL.
• Logical operators (AND, OR, NOT) refine criteria.
Data-Driven Web Site
• Web front-end connected to a database; dynamic pages generated via queries.
• Enables e-commerce catalogs, account dashboards, content management.
Data Warehouse
• Subject-oriented, integrated, time-variant, non-volatile repository for decision support.
• Typically refreshed via ETL pipelines; supports OLAP, data mining.
Data Mart
• Department-scale subset of a warehouse (e.g., marketing), cheaper & faster to deploy.
Data Lake
• Central storage for raw, structured & unstructured data; schema applied "on read."
• Supports big-data analytics using Hadoop, Spark.
Extraction, Transformation & Loading (ETL)
• Extract \rightarrow Transform \rightarrow Load pipeline prepares disparate data for warehouses.
– Extract: pull from ERP, CRM, flat files.
– Transform: cleanse, deduplicate, format, derive.
– Load: bulk insert into warehouse tables.
Big Data
• Characterized by the 5 V’s: Volume, Velocity, Variety, Veracity, Value.
• Requires parallel processing frameworks (MapReduce), NoSQL stores, in-memory computing.
Business Analytics (BA)
• Uses stats, predictive models, data mining to glean insights; feeds dashboards for executives.
• Tools: Power BI, Tableau, Python libraries, R.
Online Transaction Processing (OLTP)
• Real-time, record-level processing for routine tasks (POS, ATM).
• Emphasizes ACID properties: Atomicity + Consistency + Isolation + Durability.
Online Analytical Processing (OLAP)
• Multidimensional analysis (cubes) → slice-and-dice, drill-down, roll-up.
• Power users examine sales by {\text{product},\text{region},\text{time}} axes simultaneously.
Data-Mining Analysis
• Discovers hidden patterns via clustering, association rules (e.g., market basket {\text{diapers}}\rightarrow{\text{beer}}), classification.
Text-Mining Analysis
• Extracts sentiment, entities from large text corpora; useful for social-media monitoring.
Power BI & Tableau
• Visualization platforms turning raw data into interactive dashboards; enable self-service BI.
Database Administrators (DBA)
• Responsibilities: design schema, tune indexes, manage security, backup/restore, ensure uptime.
Data Communication
• Electronic transfer of data between locations; foundation for distributed systems, IoT.
Components
• Sender/Receiver
• Hardware (NICs, routers, modems)
• Communication media
• Protocols
• Software (OS, drivers)
Communication Media
• Conducted (guided) – twisted pair, coaxial, fiber optics.
• Radiated (unguided) – microwave, radio (Wi-Fi, cellular), infrared.
Bandwidth vs. Throughput
• Bandwidth – theoretical capacity measured in \text{bps}.
• Throughput – actual achieved rate under real conditions.
Narrowband vs. Broadband
• Narrowband ≤ 56 kbps (legacy voice lines).
• Broadband transmits multiple signals concurrently; cable, DSL, fiber.
Attenuation
• Signal power loss over distance; mitigated by repeaters, amplifiers, fiber.
Modem (Modulator-Demodulator)
• Converts digital ↔ analog signals for transmission over legacy phone lines.
Multiplexer
• Combines several input signals on one line; saves cost.
Network Topologies
• Bus – linear backbone, terminators at ends. Cheap but collision-prone.
• Ring – nodes form closed loop; token-passing avoids collisions.
• Star – central hub/switch; easy fault isolation.
• Mesh – each node connects to all; high redundancy (used in WAN backbones).
• Hierarchical (Tree) – mix of star plus additional branching levels.
LAN / MAN / WAN
• LAN – < 1 km, high speed.
• MAN – spans city (fiber rings).
• WAN – global, leased lines, satellites.
Wireless Technologies
• Wi-Fi (IEEE 802.11) – 2.4/5 GHz, 120–300 ft.
• Bluetooth – personal area network (~800 ft).
• WiMAX (802.16) – 3–30 mi metro coverage.
• Mobile/Cellular Network – cells served by base stations; supports TDMA, CDMA.
Multiple Access Techniques
• FDMA – by frequency bands.
• TDMA – splits channel into time slots; 3\times voice capacity.
• CDMA – spread-spectrum code; high security and user density.
Routing Concepts
• Router – directs packets; maintains routing table (destination → next hop).
• Static routes – manually defined (good for small, stable LANs).
• Dynamic routes – learned via protocols (RIP, OSPF).
• Centralized routing – one node computes all paths.
• Distributed routing – each node computes own best path.
Packet
• Wrapper of user data + headers (source/dest IP, sequence, checksum); enables store-and-forward networks.
Protocols
• Rules for syntax, error control, flow control, \text{bps}.
• TCP/IP suite underpins the Internet; OSI 7-layer model standardizes exchanges.
Client/Server & N-Tier Architectures
• Two-tier: UI ↔ DB server.
• N-tier: adds application servers to balance load; easier scaling, security segregation.
Controller & NIC
• Interface hardware enabling network access; contains MAC address.
Convergence
• Integration of voice, video, data; enables VoIP, video conferencing on same IP network.
DSL, Cable, Fiber
• DSL uses copper phone lines; asynchronous speeds.
• Cable shares coax; higher speeds but contention.
• Fiber – highest capacity, low attenuation.
Internet & Backbone
• Global "network of networks"; backbone = high-capacity fiber links run by Tier-1 ISPs using BGP routing.
ARPANET
• 1969 DoD project; packet-switching proof-of-concept; four original nodes (UCLA, SRI, UCSB, Univ. Utah).
DNS Protocol
• Converts human domain names to IP addresses via hierarchy of root, TLD, authoritative servers.
URL
• Uniform Resource Locator format: scheme://host:port/path?query#fragment.
HTML & Hypertext/Hypermedia
• HTML tags (\<h1>, \<img>, etc.) structure web pages.
• Hypertext = clickable links; hypermedia extends to audio/video.
Web 2.0
• Shift to interactive, user-generated content (wikis, blogs, social networks); enables e-collaboration.
Blog / Wiki / Podcast
• Blog = personal journal; wiki = collaborative editing; podcast = downloadable audio series.
Search Engines & Directories
• Use crawlers, indexers, and query processors; early Yahoo = manual directory, Google = PageRank algorithm.
Navigational Tools
• Browsers (Chrome, Edge), bookmarks, history facilitate "surfing" the Web.
Instant Messaging, IRC, Discussion Groups, Newsgroups
• Real-time or asynchronous communication; precursor to modern social media & Slack.
RSS Feeds
• XML-based syndication; feed readers aggregate headlines automatically.
Internet Telephony & VoIP
• Conveys voice packets over IP; savings vs. PSTN; requires QoS to minimize \text{delay} \le 150\,\text{ms}.
Internet of Things (IoT), IoE, IIoT
• IoT – billions of sensor-equipped physical objects online.
• IoE – adds people, processes, data for holistic connectivity.
• IIoT – industrial applications (predictive maintenance, SCADA).
• Smart home = consumer IoT implementation (thermostats, lights).
Intranet & Extranet
• Intranet – private corporate network using Internet tech; enhances internal comms.
• Extranet – extends limited access to business partners (B2B portals).
Social Networking
• Platforms (Facebook, LinkedIn) enabling user profiles, relationships graph.
Systems Development Life Cycle (SDLC)
• Waterfall model:
Planning Phase
• Define problem not symptoms; form task force; produce preliminary feasibility study.
Feasibility Study Dimensions
• Economic – cost/benefit (ROI, NPV = \sum \frac{CF_t}{(1+r)^t}).
• Technical – availability/compatibility of tech.
• Operational – acceptance by users; alignment w/ procedures.
• Scheduling – can it be done on time?
• Legal – compliance (Info Privacy Act, GDPR).
Internal vs. External Users
• Internal employees vs. external stakeholders; both provide requirements.
Joint Application Design (JAD)
• Facilitated workshops merging users, managers, IT to accelerate consensus.
Computer-Aided Systems Engineering (CASE)
• Diagramming, code generation, repository tools that automate SDLC tasks; improves documentation.
Design Phase
• Choose best alternative; produce detailed specs: data models, UI mock-ups, hardware needs.
Implementation Phase
• Code, test, install; deliver training; prepare conversion strategy:
– Plunge (direct)
– Parallel
– Phased-in / Phased-out
– Pilot
Maintenance Phase
• Post-deployment changes, optimizations, patches; absorbs \approx 80\% of life-cycle cost.
Prototyping Approaches
• Proof-of-Concept – validates feasibility.
• Selling – demonstrates benefits to secure funding.
• RAD – merges planning & analysis, iterative builds.
• Extreme Programming (XP) – short cycles, pair programming, continual testing.
• Agile Methodology – incremental delivery, embraces change, limits scope per sprint.
Low-Code / No-Code Platforms
• Drag-and-drop components; enable self-sourcing by power users; speed MVP delivery but risk shadow IT.
Service-Oriented Architecture (SOA)
• Decomposes functionality into reusable services accessed via open protocols (SOAP, REST). Encourages agility.
Outsourcing vs. Insourcing vs. Crowdsourcing
• Outsourcing – hire vendor; focus on core competences, might lose control.
• Insourcing – internal dev team; better IP retention.
• Crowdsourcing – open call to large online community; leverages diversity, cost-effective.
IT Project Management
• Initiation, planning (Gantt/PERT), execution, monitoring (EV =\text{% complete}\times\text{budget}), closure.
• Tools: MS Project, Jira.
Requests for Information (RFI) & Proposal (RFP)
• Formal procurement docs; RFI = market scan; RFP = solicit bids with specs.
Conversion Costs & Risk
• Direct-cutover cheaper but risky; parallel safest but doubles operations temporarily.
Scheduling Techniques
• Critical Path = \text{longest path through PERT network}; slack time helps allocate resources.