C

MIS Modules 03, 06, 07 & 10 – Comprehensive Study Notes

Module 03 – Databases, Big Data & Business Analytics

  • Database
    • Collection of related data stored centrally or at multiple sites.
    • Core of nearly every information system; supports transaction processing, decision support, analytics.
    • Example: A university DB storing student, course, and payroll information in one logical repository.

  • Database Management System (DBMS)
    • Software layer that creates, stores, maintains, and secures database files.
    • Improves data sharing, integrity, security, and reduces program–data dependence.
    • Popular packages include Oracle, MySQL, Microsoft SQL Server.

  • Data Hierarchy
    • Logical structure: fields → records → files → databases.
    • Ensures organized storage and retrieval; mirrors real-world entities.

  • Data Dictionary
    • "Metadata repository" holding field names, data types, defaults, validation rules, indexes, relationships.
    • DBA uses it for impact analysis when altering structures.

  • Data Model
    • Blueprint that dictates how data are represented (entities, attributes, relationships), manipulated, and kept consistent.
    • Must define:
    – Data structures (tables, objects, nodes)
    – Operations (insert, update, delete, query)
    – Integrity rules (referential, domain, semantic)
    • Choosing the right model (relational, object-oriented, etc.) affects scalability and performance.

  • Relational Model
    • Organizes data into two-dimensional tables (relations). Rows = records (tuples), columns = attributes (fields).
    • Mathematical foundation in set theory & relational algebra; enables \sigma (selection), \pi (projection), \bowtie (join) operations.
    • Keys:
    Primary key uniquely identifies each row.
    Foreign key enforces referential integrity by pointing to a primary key in another table.

  • Hierarchical Model
    • Data organized in a tree; one parent per child.
    • Fast for 1-to-many look-ups (e.g., organizational charts).
    • Drawback: redundancy if many-to-many relationships exist.

  • Network Model
    • Extends hierarchical by allowing multiple parents per node, enabling many-to-many links.
    • Uses sets and records; navigation requires explicit path traversal.

  • Object-Oriented Database
    • Encapsulates both state (attributes) and behavior (methods) in objects.
    • Supports encapsulation (grouping) and inheritance (reuse).
    • Suitable for complex data (CAD, multimedia, scientific).
    • Querying often via OQL (Object Query Language).

  • Encapsulation
    • Bundles data with code manipulating it → simplifies maintenance and modeling of real-world objects like images, graphs.

  • Inheritance
    • New object classes derive properties/methods from existing ones → faster development and consistency.

  • Logical vs. Physical Views
    Logical view – how users conceptually see data: tables, forms, reports.
    Physical view – how data are actually stored on media: blocks, files, indexes.
    • DBMS translates logical requests into physical operations.

  • Create, Read, Update, Delete (CRUD)
    • Fundamental operations; access control lists specify which roles can perform them.
    • Example: Customer support rep may Read & Update a ticket, but not Delete records.

  • Normalization
    • Process of decomposing tables to eliminate redundancy and anomalies; progresses through 1\text{NF} \rightarrow 2\text{NF} \rightarrow 3\text{NF} \rightarrow ….
    • Benefits: smaller tables, consistent updates, simpler maintenance.

  • Distributed DBMS (DDBMS)
    • Spreads data over multiple servers; provides location transparency and improved availability.
    • Three key data-placement strategies:
    Fragmentation (horizontal, vertical, mixed)
    Replication (full/partial copies)
    Allocation (hybrid of both, placing data where accessed most).

  • Fragmentation
    Horizontal – each fragment holds subset of rows.
    Vertical – each holds subset of columns.
    Mixed – combination; improves locality but complicates queries.

  • Replication
    • Duplicates tables/sites; boosts read performance & fault tolerance; risks consistency challenges handled by concurrency protocols.

  • Allocation
    • Sites keep data they access frequently; minimizes WAN latency, balances load.

  • Random vs. Sequential Access File Structures
    Random – direct access to any record (e.g., hashing). Ideal for small, ad-hoc lookups.
    Sequential – records processed in fixed order; efficient for batch reporting.

  • Indexed Sequential Access Method (ISAM)
    • Hybrid; small set → index → random access; large set → sequential scan. Balances performance.

  • Structured Query Language (SQL)
    • Declarative 4^{\text{th}}-generation language; keywords like SELECT, INSERT, UPDATE, DELETE.
    • Example query:

    SELECT product_name, SUM(quantity) AS total
    FROM sales
    GROUP BY product_name
    HAVING SUM(quantity) > 100;
    
  • Query By Example (QBE)
    • Graphical interface; users fill templates; DBMS converts to SQL.
    • Logical operators (AND, OR, NOT) refine criteria.

  • Data-Driven Web Site
    • Web front-end connected to a database; dynamic pages generated via queries.
    • Enables e-commerce catalogs, account dashboards, content management.

  • Data Warehouse
    • Subject-oriented, integrated, time-variant, non-volatile repository for decision support.
    • Typically refreshed via ETL pipelines; supports OLAP, data mining.

  • Data Mart
    • Department-scale subset of a warehouse (e.g., marketing), cheaper & faster to deploy.

  • Data Lake
    • Central storage for raw, structured & unstructured data; schema applied "on read."
    • Supports big-data analytics using Hadoop, Spark.

  • Extraction, Transformation & Loading (ETL)
    • Extract \rightarrow Transform \rightarrow Load pipeline prepares disparate data for warehouses.
    – Extract: pull from ERP, CRM, flat files.
    – Transform: cleanse, deduplicate, format, derive.
    – Load: bulk insert into warehouse tables.

  • Big Data
    • Characterized by the 5 V’s: Volume, Velocity, Variety, Veracity, Value.
    • Requires parallel processing frameworks (MapReduce), NoSQL stores, in-memory computing.

  • Business Analytics (BA)
    • Uses stats, predictive models, data mining to glean insights; feeds dashboards for executives.
    • Tools: Power BI, Tableau, Python libraries, R.

  • Online Transaction Processing (OLTP)
    • Real-time, record-level processing for routine tasks (POS, ATM).
    • Emphasizes ACID properties: Atomicity + Consistency + Isolation + Durability.

  • Online Analytical Processing (OLAP)
    • Multidimensional analysis (cubes) → slice-and-dice, drill-down, roll-up.
    • Power users examine sales by {\text{product},\text{region},\text{time}} axes simultaneously.

  • Data-Mining Analysis
    • Discovers hidden patterns via clustering, association rules (e.g., market basket {\text{diapers}}\rightarrow{\text{beer}}), classification.

  • Text-Mining Analysis
    • Extracts sentiment, entities from large text corpora; useful for social-media monitoring.

  • Power BI & Tableau
    • Visualization platforms turning raw data into interactive dashboards; enable self-service BI.

  • Database Administrators (DBA)
    • Responsibilities: design schema, tune indexes, manage security, backup/restore, ensure uptime.

Module 06 – Data Communications & Networking

  • Data Communication
    • Electronic transfer of data between locations; foundation for distributed systems, IoT.

  • Components
    • Sender/Receiver
    • Hardware (NICs, routers, modems)
    • Communication media
    • Protocols
    • Software (OS, drivers)

  • Communication Media
    Conducted (guided) – twisted pair, coaxial, fiber optics.
    Radiated (unguided) – microwave, radio (Wi-Fi, cellular), infrared.

  • Bandwidth vs. Throughput
    Bandwidth – theoretical capacity measured in \text{bps}.
    Throughput – actual achieved rate under real conditions.

  • Narrowband vs. Broadband
    Narrowband ≤ 56 kbps (legacy voice lines).
    Broadband transmits multiple signals concurrently; cable, DSL, fiber.

  • Attenuation
    • Signal power loss over distance; mitigated by repeaters, amplifiers, fiber.

  • Modem (Modulator-Demodulator)
    • Converts digital ↔ analog signals for transmission over legacy phone lines.

  • Multiplexer
    • Combines several input signals on one line; saves cost.

  • Network Topologies
    Bus – linear backbone, terminators at ends. Cheap but collision-prone.
    Ring – nodes form closed loop; token-passing avoids collisions.
    Star – central hub/switch; easy fault isolation.
    Mesh – each node connects to all; high redundancy (used in WAN backbones).
    Hierarchical (Tree) – mix of star plus additional branching levels.

  • LAN / MAN / WAN
    LAN – < 1 km, high speed.
    MAN – spans city (fiber rings).
    WAN – global, leased lines, satellites.

  • Wireless Technologies
    Wi-Fi (IEEE 802.11) – 2.4/5 GHz, 120–300 ft.
    Bluetooth – personal area network (~800 ft).
    WiMAX (802.16) – 3–30 mi metro coverage.
    Mobile/Cellular Network – cells served by base stations; supports TDMA, CDMA.

  • Multiple Access Techniques
    FDMA – by frequency bands.
    TDMA – splits channel into time slots; 3\times voice capacity.
    CDMA – spread-spectrum code; high security and user density.

  • Routing Concepts
    Router – directs packets; maintains routing table (destination → next hop).
    Static routes – manually defined (good for small, stable LANs).
    Dynamic routes – learned via protocols (RIP, OSPF).
    Centralized routing – one node computes all paths.
    Distributed routing – each node computes own best path.

  • Packet
    • Wrapper of user data + headers (source/dest IP, sequence, checksum); enables store-and-forward networks.

  • Protocols
    • Rules for syntax, error control, flow control, \text{bps}.
    TCP/IP suite underpins the Internet; OSI 7-layer model standardizes exchanges.

  • Client/Server & N-Tier Architectures
    Two-tier: UI ↔ DB server.
    N-tier: adds application servers to balance load; easier scaling, security segregation.

  • Controller & NIC
    • Interface hardware enabling network access; contains MAC address.

  • Convergence
    • Integration of voice, video, data; enables VoIP, video conferencing on same IP network.

  • DSL, Cable, Fiber
    DSL uses copper phone lines; asynchronous speeds.
    Cable shares coax; higher speeds but contention.
    Fiber – highest capacity, low attenuation.

Module 07 – Internet & Web Technologies

  • Internet & Backbone
    • Global "network of networks"; backbone = high-capacity fiber links run by Tier-1 ISPs using BGP routing.

  • ARPANET
    • 1969 DoD project; packet-switching proof-of-concept; four original nodes (UCLA, SRI, UCSB, Univ. Utah).

  • DNS Protocol
    • Converts human domain names to IP addresses via hierarchy of root, TLD, authoritative servers.

  • URL
    • Uniform Resource Locator format: scheme://host:port/path?query#fragment.

  • HTML & Hypertext/Hypermedia
    HTML tags (\<h1>, \<img>, etc.) structure web pages.
    Hypertext = clickable links; hypermedia extends to audio/video.

  • Web 2.0
    • Shift to interactive, user-generated content (wikis, blogs, social networks); enables e-collaboration.

  • Blog / Wiki / Podcast
    • Blog = personal journal; wiki = collaborative editing; podcast = downloadable audio series.

  • Search Engines & Directories
    • Use crawlers, indexers, and query processors; early Yahoo = manual directory, Google = PageRank algorithm.

  • Navigational Tools
    • Browsers (Chrome, Edge), bookmarks, history facilitate "surfing" the Web.

  • Instant Messaging, IRC, Discussion Groups, Newsgroups
    • Real-time or asynchronous communication; precursor to modern social media & Slack.

  • RSS Feeds
    • XML-based syndication; feed readers aggregate headlines automatically.

  • Internet Telephony & VoIP
    • Conveys voice packets over IP; savings vs. PSTN; requires QoS to minimize \text{delay} \le 150\,\text{ms}.

  • Internet of Things (IoT), IoE, IIoT
    IoT – billions of sensor-equipped physical objects online.
    IoE – adds people, processes, data for holistic connectivity.
    IIoT – industrial applications (predictive maintenance, SCADA).
    • Smart home = consumer IoT implementation (thermostats, lights).

  • Intranet & Extranet
    Intranet – private corporate network using Internet tech; enhances internal comms.
    Extranet – extends limited access to business partners (B2B portals).

  • Social Networking
    • Platforms (Facebook, LinkedIn) enabling user profiles, relationships graph.

Module 10 – Systems Development & Project Management

  • Systems Development Life Cycle (SDLC)
    • Waterfall model:

    1. Planning
    2. Requirements Gathering & Analysis
    3. Design
    4. Implementation
    5. Maintenance
      • Each phase produces deliverables feeding the next.
  • Planning Phase
    • Define problem not symptoms; form task force; produce preliminary feasibility study.

  • Feasibility Study Dimensions
    Economic – cost/benefit (ROI, NPV = \sum \frac{CF_t}{(1+r)^t}).
    Technical – availability/compatibility of tech.
    Operational – acceptance by users; alignment w/ procedures.
    Scheduling – can it be done on time?
    Legal – compliance (Info Privacy Act, GDPR).

  • Internal vs. External Users
    • Internal employees vs. external stakeholders; both provide requirements.

  • Joint Application Design (JAD)
    • Facilitated workshops merging users, managers, IT to accelerate consensus.

  • Computer-Aided Systems Engineering (CASE)
    • Diagramming, code generation, repository tools that automate SDLC tasks; improves documentation.

  • Design Phase
    • Choose best alternative; produce detailed specs: data models, UI mock-ups, hardware needs.

  • Implementation Phase
    • Code, test, install; deliver training; prepare conversion strategy:
    Plunge (direct)
    Parallel
    Phased-in / Phased-out
    Pilot

  • Maintenance Phase
    • Post-deployment changes, optimizations, patches; absorbs \approx 80\% of life-cycle cost.

  • Prototyping Approaches
    Proof-of-Concept – validates feasibility.
    Selling – demonstrates benefits to secure funding.
    RAD – merges planning & analysis, iterative builds.
    Extreme Programming (XP) – short cycles, pair programming, continual testing.
    Agile Methodology – incremental delivery, embraces change, limits scope per sprint.

  • Low-Code / No-Code Platforms
    • Drag-and-drop components; enable self-sourcing by power users; speed MVP delivery but risk shadow IT.

  • Service-Oriented Architecture (SOA)
    • Decomposes functionality into reusable services accessed via open protocols (SOAP, REST). Encourages agility.

  • Outsourcing vs. Insourcing vs. Crowdsourcing
    Outsourcing – hire vendor; focus on core competences, might lose control.
    Insourcing – internal dev team; better IP retention.
    Crowdsourcing – open call to large online community; leverages diversity, cost-effective.

  • IT Project Management
    • Initiation, planning (Gantt/PERT), execution, monitoring (EV =\text{% complete}\times\text{budget}), closure.
    • Tools: MS Project, Jira.

  • Requests for Information (RFI) & Proposal (RFP)
    • Formal procurement docs; RFI = market scan; RFP = solicit bids with specs.

  • Conversion Costs & Risk
    • Direct-cutover cheaper but risky; parallel safest but doubles operations temporarily.

  • Scheduling Techniques
    • Critical Path = \text{longest path through PERT network}; slack time helps allocate resources.