BUS 391 Midterm #1
BUS 391 WINTER 2026 MIDTERM Comprehensive Study Guide
1. Week One
a. The Difference Between Data and Information
Data:
Definition: Raw, unprocessed facts and figures without context.
Examples: Numbers, text, observations.
Information:
Definition: Processed, organized, and contextualized data that has meaning and value for decision-making.
Example: '25' is data; '25 degrees Celsius in San Francisco' is information.
Transformation of Data to Information:
Data becomes information when it is processed to answer questions such as who, what, when, where, and why.
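The transformation above can be sketched in a few lines of Python. This is only an illustration: the `Reading` class and the `to_information` helper are hypothetical names invented for this example, not part of the course material. The point is that the bare value 25 becomes information only once a unit and a location are attached.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    """Hypothetical container that attaches context to a raw value."""
    value: float  # the raw datum, e.g. 25
    unit: str     # what is being measured
    place: str    # where it was measured


def to_information(reading: Reading) -> str:
    """Combine the raw value with its context into a meaningful statement."""
    return f"{reading.value:g} {reading.unit} in {reading.place}"


raw_data = 25  # data: a number with no context
info = to_information(Reading(raw_data, "degrees Celsius", "San Francisco"))
print(info)
```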
b. How Do Current Computers 'Think'
Computers process information using binary code (0s and 1s).
The Central Processing Unit (CPU) executes instructions through a fetch-decode-execute cycle.
Logic gates perform Boolean operations (AND, OR, NOT) on electrical signals.
Transistors act as electronic switches that control the flow of electricity.
Memory stores data and instructions temporarily (RAM) or permanently (storage).
Algorithms provide step-by-step instructions for solving problems.
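As a rough illustration of how Boolean operations combine into arithmetic, the sketch below models the three gates named above as small Python functions and wires them into a half adder (a circuit that adds two bits). The function names are assumptions made for this example; real gates are built from transistors, not Python code.

```python
def AND(a: int, b: int) -> int:
    return a & b


def OR(a: int, b: int) -> int:
    return a | b


def NOT(a: int) -> int:
    return 1 - a


def XOR(a: int, b: int) -> int:
    # Built only from AND, OR, NOT: (a OR b) AND NOT (a AND b)
    return AND(OR(a, b), NOT(AND(a, b)))


def half_adder(a: int, b: int) -> tuple:
    """Add two bits and return (sum_bit, carry_bit)."""
    return XOR(a, b), AND(a, b)


for a in (0, 1):
    for b in (0, 1):
        sum_bit, carry_bit = half_adder(a, b)
        print(f"{a} + {b} -> carry={carry_bit}, sum={sum_bit}")

# The decimal number 25 written in binary
binary_25 = format(25, "b")
print(binary_25)
```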
2. Business Management Systems
a. Moore's Law
Definition: States that the number of transistors on a computer chip doubles approximately every two years (often quoted as every 18 months).
Associated Effects:
The price of transistors correspondingly decreases over time.
Results in exponential growth in computing power and performance.
Enables smaller, faster, and more efficient computers.
Historical Context: First observed by Gordon Moore in 1965; it has broadly tracked chip progress since then, although the pace has slowed in recent years.
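The exponential growth implied by Moore's Law can be worked through with a small calculation. This is a sketch under stated assumptions: `projected_transistors` is a hypothetical helper, the starting count of 1,000 is made up, and the doubling period is left as a parameter because sources quote both 18 months and two years.

```python
def projected_transistors(initial: float, years: float, doubling_period_years: float) -> float:
    """Transistor count after `years`, doubling once every `doubling_period_years`."""
    return initial * 2 ** (years / doubling_period_years)


# Starting from a made-up chip with 1,000 transistors, look 6 years ahead.
with_18_months = projected_transistors(1000, 6, 1.5)  # 4 doublings
with_24_months = projected_transistors(1000, 6, 2.0)  # 3 doublings
print(with_18_months, with_24_months)
```

Even a small change in the assumed doubling period produces a large difference after a few years, which is why the exact figure matters when the law is used for forecasting.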
b. DBMS (Database Management System)
Definition: Software that creates, manages, and controls access to databases.
Key Functions:
Provides data security, integrity, and consistency.
Allows multiple users to access data simultaneously.
Examples: MySQL, Oracle, Microsoft SQL Server, PostgreSQL.
DBMS Functions include data definition, manipulation, querying, and administration.
c. Decision-making Information Systems
Definition: Systems designed to support business decision-making activities.
Types include:
Management Information Systems (MIS).
Decision Support Systems (DSS).
Executive Information Systems (EIS).
Purpose: Provide reports, analytics, and insights from organizational data, helping managers make informed decisions.
d. Implementing Change Brings Short-term Stress and Reduced Productivity
Definition: Change management is challenging for organizations and employees.
Issues:
Learning curve associated with new systems temporarily decreases efficiency.
Employees may resist change due to comfort with existing processes.
Solutions:
Proper training and support can minimize disruption.
Long-term benefits typically outweigh short-term productivity losses.
e. On-premises Solutions
Definition: Software and hardware located physically within an organization's facilities.
Characteristics:
Organization has full control over infrastructure and data.
Requires significant capital investment in servers, networking, and maintenance.
Organization responsible for security, updates, and disaster recovery.
Often necessary for regulatory compliance or data sensitivity requirements.
f. Cloud-based Options
Definition: Services hosted on remote servers accessed via the internet.
Key Benefits:
Lower upfront costs with subscription-based pricing models.
Scalability: easily adjust resources based on demand.
Provider handles maintenance, updates, and security.
Accessible from anywhere with internet connection.
Types include:
Public cloud.
Private cloud.
Hybrid cloud.
g. SaaS (Software as a Service)
Definition: Cloud-based software delivery model where applications are hosted by a vendor.
Characteristics:
Users access software through web browsers, no local installation required.
Examples: Salesforce, Microsoft 365, Google Workspace, Dropbox.
Subscription-based pricing (monthly or annual fees).
Automatic updates and maintenance handled by provider.
Reduces IT overhead and infrastructure costs.
h. Business Intelligence
Definition: Technologies and strategies for analyzing business data.
Purpose: Transforms raw data into meaningful insights for decision-making.
Key Components include data mining, reporting, querying, and visualization.
Tools: Tableau, Power BI, QlikView, SAP BusinessObjects.
Benefits: Helps identify trends, patterns, and opportunities; supports strategic planning and competitive advantage.
i. Big Data
Definition: Extremely large and complex datasets that traditional processing cannot handle.
Characterization: Described by the Four V's:
Volume: Massive amounts of data (terabytes to petabytes).
Velocity: High speed of data generation and processing (real-time or near real-time).
Variety: Different types and formats of data (structured, semi-structured, unstructured).
Veracity: Uncertainty and quality of data (accuracy, trustworthiness).
Sources: Include social media, sensors, transactions, and IoT devices.
Requires specialized technologies like Hadoop and Spark.
Enables predictive analytics and data-driven decision-making.
j. Decision Support Systems (DSS)
Definition: Interactive computer-based systems that help decision-makers use data and models.
Applications: Support semi-structured and unstructured decision-making tasks.
Components:
Database.
Model base.
User interface.
Features: Provide what-if analysis and scenario planning capabilities; used for strategic planning, resource allocation, and risk assessment.
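A what-if analysis can be as simple as running the same model with different inputs and comparing the results. The sketch below assumes a hypothetical profit model and invented scenario numbers; a real DSS would draw its data from the database and its formulas from the model base.

```python
def projected_profit(units: int, price: float, unit_cost: float, fixed_costs: float) -> float:
    """Hypothetical model: contribution margin times volume, minus fixed costs."""
    return units * (price - unit_cost) - fixed_costs


# Invented scenarios a manager might compare
scenarios = {
    "baseline": dict(units=1000, price=20.0, unit_cost=12.0, fixed_costs=5000.0),
    "price_cut": dict(units=1300, price=18.0, unit_cost=12.0, fixed_costs=5000.0),
    "cheaper_supplier": dict(units=1000, price=20.0, unit_cost=11.0, fixed_costs=5000.0),
}

results = {name: projected_profit(**inputs) for name, inputs in scenarios.items()}
best_scenario = max(results, key=results.get)
for name, profit in results.items():
    print(f"{name}: {profit:.2f}")
print("Best:", best_scenario)
```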
k. Learning Management Systems (LMS)
Definition: Software applications for administration, documentation, and delivery of educational content.
Examples: Moodle, Canvas, Blackboard, Google Classroom.
Features: Include course management, assessment tools, and progress tracking.
Applications: Support both academic institutions and corporate training programs, enabling remote and self-paced learning.
3. Databases
a. Relational Databases
Definition: Organize data into tables (relations) with rows and columns.
Structure: Tables linked through primary and foreign keys.
Basis: Based on relational model developed by E.F. Codd.
Integrity: Reduces redundancy through normalization and enforces data integrity through key and constraint rules.
ACID properties:
Atomicity: Transactions are all-or-nothing.
Consistency: Transactions bring the database from one valid state to another.
Isolation: Transactions occur independently without interference.
Durability: Once a transaction is committed, it remains so even in the event of a failure.
Examples: MySQL, PostgreSQL, Oracle, Microsoft SQL Server.
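Primary keys, foreign keys, and atomicity can be seen together in a short example using Python's built-in `sqlite3` module. This is a minimal sketch: the `customers` and `orders` tables and their columns are invented for illustration, and SQLite is used only because it ships with Python. The second insert violates the foreign key, so the whole transaction is rolled back and no order is stored.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customers(id),"
    " total REAL NOT NULL)"
)
conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Ada')")
conn.commit()

# Atomicity: both inserts succeed together or neither is kept.
try:
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute("INSERT INTO orders (customer_id, total) VALUES (1, 19.99)")
        conn.execute("INSERT INTO orders (customer_id, total) VALUES (999, 5.00)")  # no such customer
except sqlite3.IntegrityError as exc:
    print("Transaction rolled back:", exc)

order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print("Orders stored:", order_count)
```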
b. SQL (Structured Query Language)
Definition: Standard language for managing and manipulating relational databases.
Main Operations:
SELECT (query).
INSERT (add data).
UPDATE (modify data).
DELETE (remove data).
Subcategories:
DDL (Data Definition Language): CREATE, ALTER, DROP.
DML (Data Manipulation Language): INSERT, UPDATE, DELETE.
Features: Supports joins, aggregations, and complex queries.
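The main operations can be tried out with Python's built-in `sqlite3` module. The `products` table and its rows are made up for this sketch; the SQL statements themselves are standard.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the table structure
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, category TEXT, price REAL)")

# DML: INSERT three rows
conn.executemany(
    "INSERT INTO products (name, category, price) VALUES (?, ?, ?)",
    [("Pen", "Office", 2.0), ("Notebook", "Office", 4.0), ("Mug", "Kitchen", 8.0)],
)

# DML: UPDATE one row, DELETE another
conn.execute("UPDATE products SET price = 5.0 WHERE name = 'Notebook'")
conn.execute("DELETE FROM products WHERE name = 'Mug'")

# Query: SELECT with an aggregation per category
rows = conn.execute(
    "SELECT category, COUNT(*), AVG(price) FROM products GROUP BY category ORDER BY category"
).fetchall()
print(rows)
```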
c. NoSQL
Definition: Non-relational databases designed for large-scale data and flexibility.
Types:
Document stores: MongoDB.
Key-value stores: Redis.
Column-family stores: Cassandra.
Graph databases: Neo4j.
Characteristics: Schema-less or flexible schema design; horizontal scalability for handling big data.
Usage: Optimized for specific use cases and performance.
d. Cloud Database
Definition: Database running on cloud computing platform.
Types: Can be SQL or NoSQL.
Examples: Amazon RDS, Google Cloud SQL, Azure SQL Database.
Benefits: Include scalability, automated backups, and high availability; typically offered on a pay-per-use pricing model.
e. Data Warehouse
Definition: Centralized repository that stores integrated data from multiple sources.
Characteristics: Optimized for query and analysis rather than transaction processing; subject-oriented, integrated, time-variant, and non-volatile.
Purpose: Supports business intelligence and reporting; utilizes ETL (Extract, Transform, Load) processes.
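The ETL steps can be sketched as three small functions. Everything specific here is an assumption for illustration: the two source systems, their field names, and the `fact_sales` table are invented, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

# Extract: rows as they might arrive from two hypothetical source systems
store_sales = [{"sku": "A1", "amount": "19.99", "date": "2026-01-05"}]
web_sales = [{"SKU": "a1", "total_cents": 2500, "day": "2026-01-06"}]


def transform(store_rows, web_rows):
    """Convert both sources to one schema: (sku, amount_usd, sale_date, channel)."""
    unified = []
    for r in store_rows:
        unified.append((r["sku"].upper(), float(r["amount"]), r["date"], "store"))
    for r in web_rows:
        unified.append((r["SKU"].upper(), r["total_cents"] / 100, r["day"], "web"))
    return unified


def load(conn, rows):
    """Write the unified rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_sales "
        "(sku TEXT, amount_usd REAL, sale_date TEXT, channel TEXT)"
    )
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)", rows)
    conn.commit()


warehouse = sqlite3.connect(":memory:")
load(warehouse, transform(store_sales, web_sales))
loaded = warehouse.execute(
    "SELECT sku, amount_usd, channel FROM fact_sales ORDER BY sale_date"
).fetchall()
print(loaded)
```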
f. Data Mart
Definition: Subset of a data warehouse focused on a specific business area or department.
Comparisons: Smaller and more focused than an enterprise data warehouse.
Examples: Sales data mart, Marketing data mart, Finance data mart.
Benefits: Faster implementation and lower cost than a full data warehouse; provides tailored data access for specific user groups.
g. Big Data
Reference: See section 2.i for detailed explanation.
In Database Context: Data sets too large for a traditional DBMS require distributed processing frameworks.
h. Four V's of Big Data
Volume: Massive amounts of data (measured in terabytes to petabytes).
Velocity: High speed of data generation and processing (real-time or near real-time).
Variety: Different types and formats of data (structured, semi-structured, unstructured).
Veracity: Uncertainty and quality of data (accuracy, trustworthiness).
Additional Context: Some models add Value as the fifth V.
i. Online Analytical Processing (OLAP)
Definition: Technology for performing multidimensional analysis on large data volumes.
Functions: Enables complex queries and calculations; supports operations such as slice, dice, drill-down, roll-up, and pivot.
Organization: Data organized in cubes with dimensions and measures.
Optimization: Optimized for read-heavy operations and analysis; contrasts with OLTP (Online Transaction Processing).
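Two of the cube operations, roll-up and slice, can be illustrated with plain Python. This is a toy sketch: the sales facts are invented, and `roll_up` and `slice_cube` are hypothetical helpers, not the API of any OLAP product.

```python
from collections import defaultdict

# Each fact is (year, region, product, sales).
# Dimensions: year, region, product. Measure: sales.
facts = [
    (2025, "West", "Laptop", 100),
    (2025, "East", "Laptop", 80),
    (2025, "West", "Phone", 50),
    (2026, "West", "Laptop", 120),
    (2026, "East", "Phone", 70),
]
DIMENSIONS = ("year", "region", "product")


def roll_up(rows, keep):
    """Aggregate sales, keeping only the named dimensions."""
    idx = [DIMENSIONS.index(d) for d in keep]
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[i] for i in idx)] += row[3]
    return dict(totals)


def slice_cube(rows, dimension, value):
    """Fix one dimension to a single value."""
    i = DIMENSIONS.index(dimension)
    return [r for r in rows if r[i] == value]


by_year = roll_up(facts, ["year"])
west_by_product = roll_up(slice_cube(facts, "region", "West"), ["product"])
print(by_year)
print(west_by_product)
```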
j. Data Mining
Definition: Process of discovering patterns, correlations, and insights from large datasets.
Techniques:
Classification.
Clustering.
Regression.
Association rules.
Methodology: Uses statistical and machine learning algorithms.
Applications: Market basket analysis, customer segmentation, fraud detection.
Outcome: Transforms raw data into actionable business intelligence.
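Market basket analysis starts by counting how often items are bought together. The sketch below computes the support of each item pair (the share of baskets containing both items). The baskets are invented and `pair_support` is a hypothetical helper; full association-rule algorithms such as Apriori build on this kind of count.

```python
from collections import Counter
from itertools import combinations

# Invented shopping baskets
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
    {"bread", "milk"},
]


def pair_support(transactions):
    """Return {(item_a, item_b): fraction of transactions containing both}."""
    counts = Counter()
    for basket in transactions:
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    n = len(transactions)
    return {pair: count / n for pair, count in counts.items()}


support = pair_support(baskets)
most_common_pair = max(support, key=support.get)
print(support)
print(most_common_pair)
```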
4. Computers and Software
a. Four Basic Computing Functions
Input: Receiving data and instructions (e.g., keyboard, mouse, sensors).
Processing: Manipulating and calculating data (CPU operations).
Storage: Saving data for later use (e.g., temporarily in RAM; permanently on hard drives and SSDs).
Output: Presenting processed information (e.g., monitor, printer, speakers).
b. Examples of Embedded Computers
Automotive:
Engine control units
Anti-lock braking systems
Infotainment systems
Home Appliances:
Smart refrigerators
Washing machines
Thermostats
Medical Devices:
Pacemakers
Insulin pumps
Diagnostic equipment
Consumer Electronics:
Digital cameras
Smartwatches
Fitness trackers
Industrial:
Manufacturing robots
ATMs
Point-of-sale systems
c. What is a Server
Definition: Computer or software that provides services to other computers (clients).
Types:
Web servers.
File servers.
Database servers.
Email servers.
Characteristics:
High processing power, large storage, redundancy.
Designed for 24/7 operation and handling multiple requests.
Can be physical hardware or virtual machines.
d. Wi-Fi
Definition: Wireless networking technology using radio waves.
Standards: Based on IEEE 802.11 standards.
Frequencies: Common frequencies include 2.4 GHz and 5 GHz.
Versions: 802.11n, 802.11ac, 802.11ax (Wi-Fi 6).
Purpose: Enables wireless internet access within limited range.
e. Bluetooth
Definition: Short-range wireless technology for data exchange.
Operation: Operates in the 2.4 GHz frequency band.
Range: Typically 10 to 100 meters, depending on device class.
Uses: Common applications include headphones, keyboards, mice, and file transfers.
Advantages: Lower power consumption compared to Wi-Fi.
f. Ethernet
Definition: Wired networking technology for local area networks.
Media: Uses twisted-pair or fiber optic cables.
Speeds: Range from 10 Mbps (original) to 100 Gbps and beyond (modern).
Reliability: Generally more reliable and faster than wireless connections.
Standard: Based on IEEE 802.3 standard.
g. Operating Systems
Definition: System software that manages hardware and software resources.
Functions:
Process management
Memory management
File systems
Security
User Interface: Acts as an intermediary between user and hardware.
Examples: Windows, macOS, Linux, iOS, Android.
h. Utility Software
Definition: Programs that perform maintenance and optimization tasks.
Examples: Antivirus, disk cleanup, backup software, file compression.
Purpose: Helps keep computer running efficiently and securely; often included with operating systems or available separately.
i. Productivity Software
Definition: Applications that help users complete tasks and create content.
Types:
Word processors: Microsoft Word, Google Docs.
Spreadsheets: Excel, Google Sheets.
Presentation software: PowerPoint, Google Slides.
Other Tools: Email clients, calendars, project management tools.
5. Computer Networks and the Internet
a. Computer Networks
Definition: Interconnected computing devices that can exchange data and share resources.
Components: Computers, routers, switches, cables, wireless access points.
Purposes: Enable communication, resource sharing, and collaboration.
Types by Size: PAN, LAN, MAN, WAN.
b. Local Area Networks (LAN)
Definition: Network covering small geographic area (e.g., building, campus).
Characteristics: High data transfer rates and low latency.
Ownership: Owned and managed by a single organization.
Technologies: Ethernet, Wi-Fi.
Functions: Enables file sharing, printer sharing, and internal communication.
c. Wide Area Networks (WAN)
Definition: Network covering a large geographic area (e.g., countries, continents).
Purpose: Connects multiple LANs together.
Example: The Internet is the largest WAN.
Infrastructure: Uses leased telecommunication lines, satellites, or cellular networks.
Characteristics: Generally slower than LANs but enables global connectivity.
d. Internet Service Provider (ISP)
Definition: Company that provides internet access to customers.
Types: Cable, DSL, fiber optic, satellite, mobile broadband.
Examples: Comcast, AT&T, Verizon, Spectrum.
Functions: Provides infrastructure and connectivity to the internet backbone.
e. Internet Protocol Address (IP)
Definition: Unique numerical identifier assigned to each device on a network.
IPv4 Format: Four octets (32 bits) written in dotted decimal (e.g., 192.168.1.1).
Purpose: Enables routing of data packets across networks.
Types: Can be static (permanent) or dynamic (changes periodically).
f. IPv6
Definition: Next generation Internet Protocol addressing system.
Address Length: 128-bit addresses (vs. 32-bit for IPv4).
Format: Eight groups of hexadecimal digits (e.g., 2001:0db8:85a3:0000:0000:8a2e:0370:7334).
Advantage: Provides a virtually unlimited address space (2^128 addresses); developed to address IPv4 address exhaustion.
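The IPv4 and IPv6 formats described in the two sections above can be inspected with Python's standard `ipaddress` module. The two addresses are the examples already used in this guide; the variable names are only for illustration.

```python
import ipaddress

v4 = ipaddress.ip_address("192.168.1.1")
v6 = ipaddress.ip_address("2001:0db8:85a3:0000:0000:8a2e:0370:7334")

# Address length in bits: 32 for IPv4, 128 for IPv6
print(v4.version, v4.max_prefixlen)
print(v6.version, v6.max_prefixlen)

# The four octets of the IPv4 address
octets = list(v4.packed)
print(octets)

# IPv6 shorthand: leading zeros dropped, one run of zero groups replaced by "::"
short_v6 = v6.compressed
print(short_v6)

# Size of each address space
ipv4_total = 2 ** 32
ipv6_total = 2 ** 128
print(ipv4_total, ipv6_total)
```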
g. Transmission Control Protocol (TCP)
Definition: Core protocol that ensures reliable, ordered delivery of data.
Characteristics: Connection-oriented protocol (establishes connection before transmission).
Features: Includes error detection (checksums) and retransmission of lost or corrupted segments.
Reliability: Guarantees packet delivery in correct order.
Usage: Employed in applications requiring reliability (e.g., web, email, file transfer).
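Ordered delivery depends on sequence numbers: each segment records where its bytes belong, so the receiver can put segments back in order and notice gaps. The following is a toy model of that idea, not real TCP; `split_into_segments` and `reassemble` are hypothetical helpers, and details such as handshakes, acknowledgments, and checksums are left out.

```python
def split_into_segments(message: bytes, size: int):
    """Label each chunk with its starting byte offset, similar to a sequence number."""
    return [(offset, message[offset:offset + size]) for offset in range(0, len(message), size)]


def reassemble(segments):
    """Receiver side: drop duplicates, sort by sequence number, and detect gaps."""
    unique = dict(segments)  # a duplicate segment simply overwrites the same key
    data = b""
    for seq in sorted(unique):
        if seq != len(data):
            raise ValueError(f"missing bytes starting at offset {len(data)}")
        data += unique[seq]
    return data


message = b"reliable, ordered delivery"
segments = split_into_segments(message, 8)

# Simulate segments arriving out of order, with one duplicate
arrived = [segments[2], segments[0], segments[3], segments[1], segments[0]]
received = reassemble(arrived)
print(received)
```

In real TCP a detected gap triggers retransmission rather than an error; the exception here only marks the point where that would happen.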
h. TCP/IP
Definition: Suite of communication protocols used for the internet and similar networks.
Combination: Combines TCP (reliability) with IP (routing).
Structure: Four layers: Application, Transport, Internet, Network Access.
Importance: The foundation of internet communication; platform-independent and scalable.
i. Resources Provided by the Internet
World Wide Web (WWW): Websites and web applications.
Email: Electronic messaging system.
File Transfer Protocol (FTP): Method for file sharing.
Cloud Storage and Services: Remote data storage.
Communication: Voice and video communication (VoIP, video conferencing).
Media: Streaming media, social networks, online gaming.
j. Tier 1 Providers
Definition: Top-level ISPs that form the backbone of the internet.
Characteristics: Can reach every other network on the internet without purchasing transit.
Interconnection: Peer with each other through settlement-free interconnection.
Examples: AT&T, Verizon, NTT, Level 3.
Operations: Operate high-capacity fiber optic networks globally.
k. Undersea Cables
Definition: Fiber optic cables laid on the ocean floor connecting continents.
Importance: Carry the vast majority of international data traffic (commonly cited as about 99%).
Characteristics: Provide high bandwidth and low latency for global communications.
Role: Critical infrastructure for internet, telecommunications, and finance.
Coverage: Hundreds of cables spanning more than a million kilometers worldwide.
l. World Wide Web (WWW)
Definition: Information system of interlinked hypertext documents accessed via the internet.
Inventor: Created by Tim Berners-Lee in 1989.
Protocols: Uses HTTP/HTTPS protocol and URLs for addressing.
Content: Created with HTML, CSS, JavaScript.
Access: Accessed through web browsers such as Chrome, Firefox, Safari, Edge.
m. Web 2.0
Definition: Second generation of web characterized by user-generated content.
Characteristics: Interactive and collaborative platforms.
Examples: Social media (Facebook, Twitter), wikis, blogs, video sharing (YouTube).
Evolution: Shift from static web pages to dynamic web applications.
Technologies: Involves AJAX, APIs, cloud computing.
n. Web 3.0
Definition: Emerging vision of semantic and decentralized web.
Features: Key aspects include Blockchain, decentralization, AI integration.
Purpose: Semantic web where machines can understand and interpret content.
Technologies: Cryptocurrencies, NFTs, decentralized apps (dApps).
Focus: On user ownership and privacy.
o. Deep Web
Definition: Content not indexed by standard search engines.
Content Types: Includes password-protected sites, databases, private networks.
Size: Much larger than the surface web (visible web).
Examples: Email inboxes, online banking, medical records, academic databases.
Characteristics: Not inherently illegal or malicious.
p. Dark Web
Definition: Small portion of deep web intentionally hidden and anonymous.
Accessibility: Requires special software to access (e.g., Tor browser).
Purpose: Provides anonymity for users and website operators.
Usage: Used for both legitimate (privacy, whistleblowing) and illegal activities.
Access Characteristics: Not indexed by search engines and uses .onion domains.
6. Understanding Database Concepts
a. Hierarchy of Data
Bit: Smallest unit of data (0 or 1).
Byte: 8 bits; can represent a single character (e.g., one ASCII letter).
Field: Single piece of information (e.g., First Name, Age).
Record: Collection of related fields (e.g., one customer's information).
Table/File: Collection of related records.
Database: Collection of related tables.
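The hierarchy can be walked from the bottom up in Python. This is only an analogy: the names and values are invented, and Python dictionaries and lists stand in for records, tables, and a database.

```python
# Bit and byte: the letter "A" is stored as one byte, which is 8 bits
char = "A"
byte_value = char.encode("ascii")[0]
bits = format(byte_value, "08b")
print(byte_value, bits)

# Field: a single piece of information
first_name = "Ada"

# Record: a collection of related fields
record = {"first_name": first_name, "age": 36}

# Table: a collection of related records
customers = [record, {"first_name": "Alan", "age": 41}]

# Database: a collection of related tables
database = {"customers": customers, "orders": []}
print(len(database), len(database["customers"]), len(record))
```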
b. Major Access Objects
Tables: Store data in rows and columns.
Queries: Retrieve and manipulate data from tables.
Forms: User-friendly interface for data entry and viewing.
Reports: Formatted presentation of data for printing or viewing.
Macros: Automated actions and processes.
Modules: VBA code for advanced functionality.
7. Requirements Gathering Processes
a. Importance of Planning
Objectives: Prevents scope creep and project delays.
Stakeholder Alignment: Ensures alignment among stakeholders and clear expectations.
Cost Reduction: Reduces costs by identifying issues early.
Development Roadmap: Provides roadmap for development and implementation.
Success Metrics: Establishes success criteria and measurable goals.
b. Requirements
Functional requirements: What the system must do (features, capabilities).
Business requirements: High-level organizational needs and objectives.
User requirements: Specific needs of end users.
Characteristics: Must be clear, specific, measurable, and testable.
Methods: Gathered through interviews, surveys, observation, and documentation review.
c. Non-functional Requirements
Definition: Define how the system should perform (quality attributes).
Key Areas:
Performance: Speed, response time, throughput.
Security: Authentication, authorization, data protection.
Usability: User-friendliness, accessibility, learnability.
Reliability: Uptime, fault tolerance, recoverability.
Scalability: Ability to handle growth.
8. Artificial Intelligence
a. What is AI?
Definition: Computer systems capable of performing tasks requiring human intelligence.
Key Areas: Learning, reasoning, problem-solving, perception, language understanding.
Types: Narrow AI (designed for specific tasks) vs. General AI (human-level intelligence across domains).
Applications: Include virtual assistants, recommendation systems, autonomous vehicles, and medical diagnosis.
b. Machine Learning
Definition: Subset of AI where systems learn from data without being explicitly programmed.
Functionality: Algorithms identify patterns and make predictions.
Improvements: Performance improves with more data and experience.
Main Types:
Supervised learning.
Unsupervised learning.
Reinforcement learning.
c. Deep Learning
Definition: Subset of machine learning using artificial neural networks with multiple layers.
Inspiration: Models inspired by human brain structure and function.
Strengths: Excels at processing complex, unstructured data (e.g., images, speech, text).
Requirements: Requires large datasets and significant computational power.
Applications: Include image recognition, natural language processing, and game playing.
d. Cognitive Computing
Definition: AI systems simulating human thought processes.
Components: Combines machine learning, natural language processing, and reasoning.
Functionality: Can understand context, interpret meaning, and learn continuously.
Example: IBM Watson used for complex decision support and human-computer interactions.
e. Machine Learning Methods
i. Supervised Learning
Definition: Learns from labeled training data (input-output pairs).
Function: Algorithm learns to map inputs to correct outputs.
Types:
Classification: Produces discrete outputs.
Regression: Produces continuous outputs.
Examples: Spam detection, house price prediction, image classification.
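A regression example makes the input-output mapping concrete. The sketch below fits a straight line to labeled pairs using ordinary least squares in plain Python. The house sizes and prices are invented, and `fit_line` is a hypothetical helper; real projects would normally use a library such as scikit-learn.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Labeled training data (invented): size in square meters -> price in thousands
sizes = [50, 80, 100, 120]
prices = [150, 240, 300, 360]

slope, intercept = fit_line(sizes, prices)

# Prediction for an unseen input
predicted_price = slope * 90 + intercept
print(slope, intercept, predicted_price)
```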
ii. Unsupervised Learning
Definition: Learns from unlabeled data without predefined outputs.
Purpose: Discovers hidden patterns and structures in data.
Types:
Clustering: Groups similar items.
Dimensionality reduction.
Examples: Customer segmentation, anomaly detection, recommendation systems.
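Clustering can be illustrated with a small one-dimensional k-means loop. No labels are supplied; the algorithm groups values purely by similarity. The spending figures and starting centers are invented, and `k_means_1d` is a simplified, hypothetical helper rather than a production implementation.

```python
def k_means_1d(values, centers, iterations=10):
    """Repeatedly assign each value to its nearest center, then recompute the centers."""
    centers = list(centers)
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # New center = mean of its cluster (unchanged if the cluster is empty)
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters


# Unlabeled data (invented): monthly spend per customer
spend = [10, 12, 11, 95, 100, 105]
centers, clusters = k_means_1d(spend, centers=[0, 50])
print(centers)
print(clusters)
```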
iii. Reinforcement Learning
Definition: Agent learns through trial and error by interacting with an environment.
Feedback: Receives rewards or penalties for actions taken.
Objective: Goal is to maximize cumulative rewards over time.
Examples: Game playing (e.g., AlphaGo), robotics, autonomous vehicles.
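The trial-and-error loop can be sketched with a multi-armed bandit, one of the simplest reinforcement learning settings. Assumptions to note: the three payouts are invented and fixed (real environments are usually noisy), `run_bandit` is a hypothetical helper, and the epsilon-greedy rule shown here is just one common way to balance exploring and exploiting.

```python
import random


def run_bandit(payouts, steps=200, epsilon=0.1, seed=0):
    """Epsilon-greedy agent choosing among actions with fixed rewards."""
    rng = random.Random(seed)
    estimates = [0.0] * len(payouts)  # the agent's current value estimate per action
    pulls = [0] * len(payouts)
    total_reward = 0.0
    for step in range(steps):
        if step < len(payouts):
            arm = step  # try every action once at the start
        elif rng.random() < epsilon:
            arm = rng.randrange(len(payouts))  # explore: pick a random action
        else:
            arm = max(range(len(payouts)), key=lambda i: estimates[i])  # exploit
        reward = payouts[arm]  # feedback from the environment
        pulls[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / pulls[arm]  # running average
        total_reward += reward
    return estimates, pulls, total_reward


estimates, pulls, total_reward = run_bandit([1.0, 5.0, 2.0])
best_arm = max(range(len(estimates)), key=lambda i: estimates[i])
print(estimates, pulls, best_arm)
```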
f. Neural Networks
Definition: Computing systems inspired by biological neural networks.
Structure: Consist of interconnected nodes (neurons) organized in layers.
Learning: Learn by adjusting connection weights during training.
i. Input Layer
Function: Receives initial data/features.
Structure: Each node represents one feature or attribute; no computation occurs, simply passes data forward.
ii. Hidden Layer(s)
Definition: Intermediate layers between input and output.
Functionality: Performs transformations and extracts features.
Characteristics: Deep networks have multiple hidden layers; each layer applies activation functions to introduce non-linearity.
iii. Output Layer
Function: Produces final prediction or classification.
Nodes: The number of nodes depends on task (e.g., 1 for regression, multiple for classification).
Probability: May use softmax for generating probability distributions.
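The three layer types can be traced in a single forward pass through a tiny network. This is a sketch with hand-picked weights chosen for readability (a trained network would learn them), and `dense`, `relu`, and `softmax` are written out by hand here rather than taken from a library.

```python
import math


def relu(x):
    """Activation function: passes positive values, blocks negative ones."""
    return max(0.0, x)


def dense(inputs, weights, biases, activation=None):
    """One fully connected layer: weighted sum per node, then optional activation."""
    outputs = []
    for node_weights, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(node_weights, inputs)) + bias
        outputs.append(activation(z) if activation else z)
    return outputs


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Input layer: two features, passed forward unchanged
features = [1.0, 2.0]

# Hidden layer: three nodes with ReLU activation
hidden = dense(features, [[0.5, -0.25], [-1.0, 1.0], [0.25, 0.25]], [0.0, 0.0, 0.0], relu)

# Output layer: two nodes (one per class), converted to probabilities
scores = dense(hidden, [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]], [0.0, 0.0])
probabilities = softmax(scores)
print(hidden, scores, probabilities)
```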
g. Types of Data Used in AI
i. Structured Data
Definition: Organized in a defined format (e.g., tables, databases).
Searchability: Easily searchable and analyzable.
Examples: Spreadsheets, relational databases, CSV files.
Data Types: Includes numerical and categorical data.
ii. Unstructured Data
Definition: Lacks a predefined format or organization.
Processing: Requires processing before analysis.
Examples: Text documents, images, videos, audio, social media posts.
Significance: Comprises the majority of data generated today.
iii. Historical Data
Definition: Past data used to train AI models.
Importance: Enables pattern recognition and trend analysis.
Quality: Its quality and quantity affect model performance; used for predictions and forecasting.
iv. Internal vs. External Data
Internal: Data generated within an organization (e.g., sales, operations, HR).
External: Data from outside sources (e.g., market data, social media, government statistics).
Combination: Combining both types provides comprehensive insights; internal data is more controlled, whereas external data offers broader context.
h. Generative AI
Definition: AI that creates new content (e.g., text, images, music, code).
Mechanism: Learns patterns from training data to generate similar outputs.
Technologies: Includes Large Language Models (LLMs), Generative Adversarial Networks (GANs), and Diffusion models.
Examples: ChatGPT, DALL-E, Midjourney, GitHub Copilot.
Applications: Content creation, design, coding assistance, and other creative work.
i. Current Problems with AI
Bias and Fairness: Models can perpetuate or amplify societal biases.
Explainability: Difficulty understanding how AI makes decisions (black box problem).
Data Quality and Privacy: Models require large amounts of data, raising privacy concerns.
Hallucinations: AI generating false or nonsensical information.
Job Displacement and Ethical Concerns.
Environmental Impact: High energy consumption for training.
j. Bias
Definition: Systematic errors in AI predictions favoring certain outcomes.
Sources: Biased training data, algorithm design, feature selection.
Types:
Selection bias.
Confirmation bias.
Historical bias.
Consequences: Unfair treatment, discrimination, reinforced stereotypes.
Mitigation: Diverse training data, fairness metrics, regular audits.
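One simple fairness metric compares how often each group receives a positive outcome; the difference between those rates is often called the demographic parity difference. The sketch below uses invented loan decisions and a hypothetical `selection_rate` helper. It shows one metric among many and does not by itself establish that a model is fair or unfair.

```python
def selection_rate(decisions):
    """Share of positive (1) decisions in a list of 0/1 outcomes."""
    return sum(decisions) / len(decisions)


# Invented loan decisions (1 = approved) for two applicant groups
approvals = {
    "group_a": [1, 1, 1, 0],
    "group_b": [1, 0, 0, 0],
}

rates = {group: selection_rate(decisions) for group, decisions in approvals.items()}
parity_gap = max(rates.values()) - min(rates.values())
print(rates)
print(parity_gap)
```

A gap of zero means both groups are approved at the same rate; a large gap is a signal to audit the data and the model.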
k. Prompt Engineering
Definition: Crafting effective inputs (prompts) to get desired outputs from AI models.
Importance: Particularly important for Large Language Models (LLMs).
Techniques: Clear instructions, providing context, examples (few-shot prompting), role assignment.
Methodology: An iterative process of testing and refining prompts for better results; it's an emerging skill for maximizing AI tool effectiveness.
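The techniques listed above can be combined in a reusable prompt template. This sketch only builds the prompt text; it does not call any AI service. The `build_prompt` helper, the role, and the sample reviews are all invented for illustration.

```python
def build_prompt(role, context, examples, question):
    """Assemble a prompt with a role, context, few-shot examples, and the new input."""
    lines = [f"You are {role}.", f"Context: {context}", ""]
    for sample_input, sample_output in examples:
        lines += [f"Input: {sample_input}", f"Output: {sample_output}", ""]
    lines += [f"Input: {question}", "Output:"]
    return "\n".join(lines)


prompt = build_prompt(
    role="a customer-support analyst",
    context="Classify each review as Positive or Negative. Answer with one word.",
    examples=[
        ("The laptop arrived early and works great.", "Positive"),
        ("The screen cracked after two days.", "Negative"),
    ],
    question="Battery life is much better than I expected.",
)
print(prompt)
```

Refining a prompt then becomes a matter of changing one argument at a time (for example, adding an example or tightening the context) and comparing the model's answers.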
Good luck on your midterm!