1/168
Looks like no tags are added yet.
Name | Mastery | Learn | Test | Matching | Spaced | Call with Kai |
|---|
No study sessions yet.
What is a database?
A structured collection of related data.
What is the purpose of a database?
To organize data for efficient retrieval and management.
What does a database represent?
Aspects of the real world.
For whom is a database designed?
For a specific purpose or audience.
What is persistent storage in databases?
Data remains after the program ends.
What does organized structure mean in databases?
Data follows defined patterns.
What are shared resources in the context of databases?
Multiple users can access the data.
What is integrated data in databases?
Related information is linked together.
What does controlled redundancy refer to in databases?
Minimized duplicate data.
Types of Databases
1. Document
2. Graph
3. Relational
4. Analytical (OLAP)
5. Key Value
What are fields?
1. Columns
2. Individual pieces of information
Example: Name, Address, Phone
What are records?
1. Rows
2. Complete sets of related data
Example: One customer's information
When would you use a database?
1. When you need to handle large amounts of data
2. When there are complex & enforced relationships
3. When there are multiple user access
4. When there are data integrity rules
5. When there is a need for powerful querying
When would you use a spreadsheet?
1. When there's smaller databases
2. When there are simple ad hoc relationships
3. When there is usually a single user
4. When there needs to be manual validation
5. When there is a need for visualization and analysis
Database Management System (DBMS)
A computer program that is used to create, process, and administer a database
Why don't many organizations develop their own DBMS?
Due to the complexities and DBMS development process
What are some examples of DBMS?
Oracle Microsoft, and IBM
What is the main difference between a database and a DBMS?
A database is a collection of tables, relationships, and metadata, whereas a DBMS is a software program designed to organize and administer a database.
Data Redundancy
Duplication of data
Leads to inconsistent data
How do DBMS solve file management issues?
The use of a DBMS makes it easier to secure data and information. DBMS allows for the creation of access constraints so that only authorized users are able to access the data.
Users are assigned a different set of access rules; this helps to protect data and users from identity theft, data leaks, and the misuse of data.
Data Integrity
The database is reliable, accurate, and aligned to the goals of the organization
Factors influencing data integrity
1. Domain integrity
2. Entity integrity
3. Referential integrity
4. User-defined integrity
Domain Integrity
Ensures that data entered into a field adheres to predefined rules and constraints
Example: Short text, long text, number
Entity Integrity
Ensures that every table has a primary key and that each row is uniquely identifiable
Primary Keys
Unique identifier for each instance of an entity
Often auto increment and cannot be NULL
Example: StudentID must exist in the Students table
Foreign Keys
Links one table to another by referencing the primary key of another table.
enables referential integrity between related data
Example: CustomerID in an Orders table referencing CustomerID in a Customers table
Composite Primary Keys
Exists in some tables such as CourseEnrollment where Student ID + CourseID together act as the Primary Key
Referential Integrity
Accuracy and consistency of data within a table relationship in a database
User-Defined Integrity
Allows custom business roles to enforce specific data requirements
Example: A salary cannot have negative values
Data Centralization
Critical in increasing data integrity
1. When data is centralized, it means it is stored in only one place
2. When multiple lists and data sources are maintained, information can become inconsistent leading to decreased data integrity
Benefits of Data Centralization
Consistent data, faster services, better decision-making
Relational Databases
Organize data into tables based on structured data groupings
Use links called relationships between tables
Tables
Hold information about the objects to be represented in the database
When would a relationship be created?
When there is a common filed that is included in both tables
How many tables can be linked using a relationship in a relational database?
Two or more
How is the creation of relationships achieved?
1. A primary key value (in the primary or parent table)
2. Foreign keys (in associated tables)
One-to-one Relationship
Each instance of one entity is associated with one instance of another entity.
Example: Each driver's license has one unique owner
One-to-Many
An instance of one entity can be associated with multiple instances of another entity
Example: A professor could be associated with multiple students
Many-to-Many
Instances of one entity can be associated with multiple instances of another entity
Example: Students can enroll in multiple sources each semester, and each course can have multiple students enrolled
What does referential integrity require?
Whenever a foreign key value is used, it must reference a valid, existing primary key in the parent table
What is another term for a query?
A question
What does a query do in a database?
Requests information from a table (or combinations of tables)
What is one of the most popular query languages used in creating queries?
Structured Query Language (SQL)
What can queries help you do with data?
Find specific data quickly by filtering based on specific criteria
What are two additional functions of queries besides finding data?
Calculate/summarize data and automate data management tasks
What is a Select Query Type?
Select rows and columns that meet certain criteria
What is a Aggregate Query Type?
Calculate the sum, average, or other aggregate functions
What is a Crosstab Query Type?
Perform calculations then group the results by two sets of values
What is a Action Query Type?
Four basic types of action queries are used to append data, delete data, update data, and make new tables
What is a Parameter Query Type?
Prompt the user for input values in order to run/execute the query
What is a SQL Query Type?
Specific queries use specific SQL statements to execute the query
SQL
Used for human interface and communication with relational databases
Considered the standard language
What does SQL use?
User-generated lines of code (statements) to answer questions against the database
What is the purpose of entity-relationship diagrams (ERD)?
Used in the design of relational databases to visually represent the relationships between different data entities.
What are the three main elements of entity-relationship diagrams (ERD)?
Entities, relationships, and attributes.
Entities
Represented by rectangles
Represent objects or occurrences
Typically entered into tables
Have two characteristics
Are often nouns
Relationships
Represented by lines that connect the entities
Describe how entities interact with each other
Are represented as verbs
Categorized as one-to-one, one-to-many, or many-to-many
Attributes
Often included in the rectangle representing the entity
Are data points that are stored for each entity
Can be identified as either a primary or foreign key
Normalization and denormalization
Two database design strategies used to organize and structure data to balance integrity and simplicity in database structure
What is normalization in the context of databases?
Normalization involves structuring data and tables to reduce redundancy and anomalies, as well as to increase data integrity.
What does normalization often involve?
Normalization often involves dividing data into multiple organized tables.
What type of databases is normalization applied to?
Normalization is applied to relational databases.
In what type of systems is normalization commonly used?
Normalization is commonly used in systems where transaction integrity is critical.
How does normalization help with data consistency?
Normalization can maintain data consistency and help to avoid anomalies during insertions, updates, and deletions.
What is database denormalization?
Combines data from multiple tables into a single table.
How does denormalization affect data retrieval speed?
It increases retrieval speed by reducing complex joins across multiple database tables.
In what scenarios is database denormalization commonly used?
In data warehousing scenarios where the focus is on reading large volumes of data quickly.
How are updates typically handled in denormalized databases?
Updates are infrequent or handled in batch operations.
What method does Data Normalization use?
Organized approach of breaking down/simplifying tables to eliminate data redundancy and undesirable data characteristics
What method does ERD use?
Structurally representing database design via the use of diagrams
Involves the use of different symbols and connectors that help to visualize two different types of information
What must each table cell contain in 1NF?
A single atomic (indivisible) value
What must each column have in 1NF?
A unique name
What must all entities in a column be of in 1NF?
The same data type
What must each record be in 1NF?
Unique
What must a table be in before it can be converted to 2NF?
1NF
What is required for a table to be in 2NF regarding non-key attributes?
No partial dependencies: non-key attributes must depend on the ENTIRE primary key.
When is 2NF most relevant?
When you have a composite primary key.
What is the first requirement for a table to be in Third Normal Form (3NF)?
Must be in Second Normal Form (2NF) first.
What must be eliminated to achieve Third Normal Form (3NF)?
Transitive dependencies.
What is a transitive dependency?
A situation where a non-key column depends on another non-key column, rather than depending directly on the primary key.
Business Intelligence
Includes the technologies, computer applications, and procedures for the collection, analysis, and presentation of business information to help support decision making
What are Business Intelligence (BI) systems fundamentally classified as?
Data-driven Decision Support Systems (DSS)
What do BI systems help businesses to do?
Make better strategic decisions
What types of views do BI systems provide about operations?
Historic, current, and future views
What types of storage do BI systems use to create information output?
Data warehouses, data marts, in-memory computing, and other analytic platforms
What is a data warehouse?
A repository of data and information that organizations analyze to make informed business and operational decisions.
How do data and analytics benefit organizations?
They allow organizations to create and maintain a competitive advantage.
What types of systems flow data into a data warehouse?
Data flows into a data warehouse from transactional systems, databases, and other data-generating sources.
How often does information flow into a data warehouse?
Information flows into a data warehouse at regular intervals.
Who has access to a data warehouse within an organization?
Data scientists, key decision-makers (KDMs), and data specialists have access to the data warehouse.
What tools are used to analyze data in a data warehouse?
Data is analyzed using BI tools, SQL, and a variety of analytics applications.
What is a data mart?
A data mart is a subsection of a data warehouse designed specifically for individual departments or business functions.
Information Policies
Should be created and implemented by all organizations, regardless of the size of the organization
What do information policies specify?
The rules used in database design (how data is structured)
Who has access to the data
How the data is collected and maintained
Where information and data are distributed
Example: Only selected faculty and staff have access to students' educational records
What is Data Governance?
An important component of a data management plan.
What does Data Governance include?
Personnel, processes, and technology needed to oversee and secure an organization's data and data assets.
What do data governance policies help to ensure an organization data is?
Valid
Understandable
Complete
Accessible
Data Administration
Responsible for the policies and procedures that are used to manage an organization's data
What do common data administration tasks include?
Development of information policies, data planning, database designs, security, and how internal-users and end-users use data
Database Administrators (DBA)
Technically specific role that is usually part of an organization's IT department