Course Code: COM162
Introduction to Databases and Database Environment.
Understand common uses of database systems.
Learn the limitations of file-based systems compared to databases.
Distinguish between databases and database management systems (DBMS).
Identify components and roles within the database environment.
Explore ANSI-SPARC three-level architecture.
Comprehend logical and physical data independence.
Differentiate between DDL (Data Definition Language) and DML (Data Manipulation Language).
Recognize the significance of data models and conceptual modelling.
Understand client-server architecture and its advantages for DBMS.
Databases are essential for modern applications and systems.
They organize information, bringing order to digital chaos.
Databases are crucial gatekeepers of information power in contexts from social media to finance.
Relational databases revolutionized data storage, management, and retrieval.
The evolution of databases over 30+ years has significantly impacted society and the economy.
They are a key development in software engineering, affecting organizational operations.
Advances in hardware and communications technologies (e.g., Internet and e-commerce) have boosted database importance.
Common use cases include:
Supermarket purchases
Credit card transactions
Holiday bookings
Library management
Insurance transactions
Internet searches
University studies
The file-based system is the ancestor of modern databases, now considered largely outdated.
It consists of applications that independently manage their data.
Typically limited to small-scale applications without complex data needs.
Difficulty arises when needing to cross-reference information across files.
Example challenges:
Querying specific features like "three-bedroom properties with a garage."
Calculating averages (e.g., average rent).
Generating total salary reports easily.
Departments access their files via dedicated application programs.
Each program handles its own data entries, maintenance, and reports.
The application's code defines the physical structure and data storage.
Separation and isolation of data across programs leads to:
Users may miss out on useful, cross-application data.
Duplication of data wastes resources and can create inconsistencies.
Changes to data structures in file-based systems are complex and labor-intensive.
Modifications require all related programs to be updated, risking errors.
Different application programs may not work well together due to varied file formats.
Fixed queries lead to proliferation of specific application programs with limited scope.
The database approach addresses file-based limitations through:
Centralizing data definition independent of application programs.
Implementing controls over data access and manipulation.
A database serves as a large, integrated repository for data used by multiple departments.
Data definitions are distinct from application programs, allowing flexibility in modifications.
Essential components include:
Entities: distinct objects like people or events.
Attributes: properties describing the entities.
Relationships: associations among entities.
A DBMS interacts with applications and the database, offering functions such as:
Data storage, retrieval, and updates.
Providing user-accessible catalogs.
Supporting transactions and ensuring data consistency.
Additional functions include:
Recovery services for data integrity.
Authorisation and security services to protect data.
Integration with communication software for data sharing.
Users communicate with the database through various application programs.
The DBMS manages the data's physical structure and storage, not the applications directly.
Illustration showing integration of various applications with the DBMS.
Confirming that a centralized database facilitates coordinated access without redundant data.
Key components of a DBMS include:
Programmers
Database administrator (DBA)
Application programs and Queries
Database schema and management utilities
Main elements in the DBMS environment:
Hardware: Ranging from PCs to networked systems.
Software: DBMS, operating systems, and relevant application software.
Data: Information utilized by the organization, encapsulated in a schema.
Procedures: Governance rules and guidelines for DB usage.
People: Stakeholders including developers and end-users.
Control of data redundancy and enhanced data consistency.
Allows for increased information retrieval from integrated data.
Shared data resources enhance inter-departmental collaboration.
Improved integrity guarantees validity across data stored.
Enhanced security at the database level, unlike isolated files.
Standardized data processes enforced across the organization.
Improved productivity and resource management aided with centralized control.
DBA control helps balance resource use effectively.
Enhanced data access and system responsiveness.
Improved maintenance through data independence, limiting impact of changes.
Complexity in design and maintenance necessitates skilled knowledge.
Storage size demands and potential high costs for DBMS solutions.
Transitioning from existing systems may incur additional costs and performance issues.
Data Administrator (DA): Manages data resources and strategic planning.
Database Administrator (DBA): Handles database design, implementation, security, and performance.
Further reading on the roles of DA and DBA recommended.
Logical Database Designer: Focuses on data structure and integrity.
Physical Database Designer: Maps logical structures to physical formats.
Application Developers: Create programs enabling user functionality.
End-Users: Utilize the database that meets their needs.
1960s: Hierarchical and network data models.
1970s: Emergence of the relational model with SQL.
1990s: Growth in object-oriented models and data warehousing.
2000s: Rise of NoSQL databases for unstructured data.
Content Addressable Storage (CAS): Simplifies data retrieval by description rather than location.
AI and Natural Language Processing: Users can interact through natural language queries.
Scalable Data Mining Algorithms: Enhancing large dataset processing capabilities.
Trends include mobile database accessibility and web service integration.
Advancements in security through computer forensics and self-tuning systems.
The ANSI-SPARC framework outlines three abstraction levels:
External Level: User view of the database.
Conceptual Level: Community data view and relationships.
Internal Level: Physical data storage representation.
Introduced by ANSI in 1975, forming a foundation for most DBMS designs.
Differences among the levels emphasize the concept of multiple external views relating to one conceptual view.
Uniform data access, autonomy from changes across schema levels, and DBA flexibility in modifications without user impact.
Logical Data Independence: Changes to the conceptual schema should not affect external schemas.
Physical Data Independence: Modifications to internal structures should not require external or conceptual schema changes.
Illustrates mappings among external schemas, conceptual schema, and internal schema.
DDL: Description, naming, and constraints of entities in the database.
DML: Operations for data querying, insertion, modification, and deletion.
Includes SQL, form and report generators, and application generators designed for ease of use.
A coherent data model should clarify organizational data through:
Structural rules
Allowed operations
Integrity rules
Object-Based Models: Utilize entities and their relationships.
Record-Based Models: Include relational and hierarchical structures.
Conceptual modelling captures organizational data requirements accurately and comprehensively.
Clients manage interfaces while servers store the database; advantages include better access, performance gains, reduced costs, and enhanced consistency.
Review of applications, distinctions between file-based and database approaches, role of DBMS, architecture models, data independence, and the advantages and disadvantages associated with DBMS.