Unit 1
Data
Raw facts
Information
Transformed/processed data that facilitates decision making
Data management
A process that focuses on data collection, storage, and retrieval.
Database
A shared, integrated computer structure that houses a collection of related data.
Contains two types of data: end-user data (raw facts) and metadata.
Metadata
Data about data characteristics and relationships
Database management system (DBMS)
the collection of programs that manages the database structure and controls access to the data stored in the database.
Data inconsistency
A condition in which different versions of the same data yield different (inconsistent) results.
query
a specific request issued to the DBMS for data manipulation
centralized database
A single database located at a single site.
Distributed database
A logically related database that is stored in two or more physically independent sites.
cloud database
A database that is created and maintained using cloud services such as Microsoft Azure or Amazon AWS.
general-purpose database
A database that contains a wide variety of data used in multiple disciplines.
Unstructured data
Data that exists in its original, raw state; that is, in the format in which it was collected.
Structured data
Data that has been formatted to facilitate storage, use, and information generation.
Semistructured data
Data that has already been processed to some extent.
Extensible Markup Language (XML)
A metalanguage used to represent and manipulate data elements.
XML database
supports the storage and management of semistructured XML data.
Problems with File Systems
Data redundancy
Data inconsistency
Lack of data integrity
Difficult to access
No multi-user access
3 levels of data abstraction are:
external, conceptual, and internal
ANSI stands for:
American National Standards Institute
External model
end user’s view of the data environment
external schema
a specific representation of an external view
conceptual model
represents a global view of the entire database by the entire organization.
It integrates all external views (entities, relationships, etc.) into a single global view of the data in the enterprise.
The conceptual model is _____________ of both software and hardware.
independent
Data redundancy
when the same data is stored unnecessarily in different places.
Data inconsistency
when different and conflicting versions of the same data appear in different places.
Data integrity
Data is accurate and verifiable.
Database system
An organization of components that defines and regulates the collection, storage, management, and use of data in a database environment.
query language
a nonprocedural language: one that lets the user specify what must be done without having to specify how.
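As a rough illustration of "what, not how", the sketch below uses Python's built-in sqlite3 module and SQL as the query language. The `customer` table, its columns, and the helper name `names_over` are made up for this example. The query states which rows are wanted; the DBMS decides how to find them.

```python
import sqlite3

def names_over(min_balance):
    """Return customer names with a balance above min_balance (hypothetical data)."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    conn.execute("CREATE TABLE customer (name TEXT, balance REAL)")
    conn.executemany(
        "INSERT INTO customer VALUES (?, ?)",
        [("Ann", 120.0), ("Bob", 40.0), ("Cy", 300.0)],
    )
    # The query says WHAT is wanted; no loops or access paths are spelled out.
    rows = conn.execute(
        "SELECT name FROM customer WHERE balance > ? ORDER BY name",
        (min_balance,),
    ).fetchall()
    conn.close()
    return [name for (name,) in rows]
```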
software independence
the model does not depend on the DBMS software used to implement the model.
hardware independence
model does not depend on the hardware used in the implementation of the model.
internal model
is the representation of the database as seen by the DBMS.
internal schema
depicts a specific representation of an internal model, using the database constructs supported by the chosen database.
logical independence
A condition in which the internal model can be changed without affecting the conceptual model.
physical model
operates at the lowest level of abstraction
describes the way data is saved on storage media
Primary Key
an attribute or combination of attributes that uniquely identifies any given row.
Key
consists of one or more attributes that determine other attributes.
Functional dependence
the value of one or more attributes determines the value of one or more other attributes.
In functional dependency, the attribute whose value determines another is called the ____________ or the key. The attribute whose value is determined by the other attribute is called the _______________.
determinant, dependent
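A functional dependence can be checked against sample rows. This is a minimal sketch, assuming each row is a Python dict; the function name `holds` and the student attributes in the usage note are hypothetical. It can only show that a dependence is violated by the given data, not prove that it holds in general.

```python
def holds(rows, determinant, dependent):
    """Return True if, in these rows, the determinant attributes
    always map to the same values of the dependent attributes."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        value = tuple(row[a] for a in dependent)
        # setdefault stores the first value seen for this determinant;
        # any later, different value breaks the dependence.
        if seen.setdefault(key, value) != value:
            return False
    return True
```

For example, with rows holding `stu_num`, `stu_name`, and `major`, calling `holds(rows, ["stu_num"], ["stu_name"])` asks whether the student number determines the student name.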
composite key
a key that is composed of more than one attribute
superkey
a key that functionally determines every attribute in the row.
candidate key
is a superkey without any unnecessary attributes.
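The difference between a superkey and a candidate key can be sketched as follows, assuming rows are dicts and judging only by the sample data at hand (a real key is a property of the table design, not of one data set). The function names are hypothetical.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if no two rows share the same values for attrs."""
    keys = [tuple(row[a] for a in attrs) for row in rows]
    return len(set(keys)) == len(keys)

def is_candidate_key(rows, attrs):
    """True if attrs is a superkey and no proper subset of it is one."""
    if not is_superkey(rows, attrs):
        return False
    for size in range(1, len(attrs)):
        for subset in combinations(attrs, size):
            if is_superkey(rows, subset):
                return False  # an attribute in attrs is unnecessary
    return True
```

With student rows, `["stu_num", "stu_name"]` may be a superkey, but if `["stu_num"]` alone is already unique, only the latter is a candidate key.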
Entity Integrity
is the condition in which each row in the table has its own unique identity.
2 requirements of primary key
all of the values in the primary key must be unique
no key attribute in the primary key can contain a null
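The two requirements above can be checked mechanically. A minimal sketch, assuming rows are dicts and a null is represented by Python's `None`; the function name is hypothetical.

```python
def satisfies_entity_integrity(rows, primary_key):
    """Check both primary key requirements: no nulls and no duplicates."""
    seen = set()
    for row in rows:
        key = tuple(row.get(attr) for attr in primary_key)
        if any(value is None for value in key):
            return False  # a key attribute contains a null
        if key in seen:
            return False  # the key value is not unique
        seen.add(key)
    return True
```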
Foreign Key
is the primary key of one table that has been placed into another table to create a common attribute.
Secondary key
a key that is used strictly for data retrieval purposes.
SELECT
used to list all of the rows, or it can yield only rows that match a specified criterion.
PROJECT
is a unary operator that yields all values for selected attributes.
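SELECT and PROJECT can be sketched on a table held as a list of dicts: SELECT filters rows, PROJECT keeps columns. The function names and the representation are assumptions for illustration, and this PROJECT does not remove duplicate rows.

```python
def select(rows, predicate):
    """Keep only the rows for which predicate(row) is true."""
    return [row for row in rows if predicate(row)]

def project(rows, attrs):
    """Keep only the listed attributes of every row (duplicates are kept)."""
    return [{a: row[a] for a in attrs} for row in rows]
```

For example, `project(select(rows, lambda r: r["price"] > 5), ["name"])` lists the names of all rows whose price exceeds 5.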
UNION
combines all rows from two tables, excluding duplicate rows (tables must have the same attribute characteristics).
INTERSECT
yields only the rows that appear in both tables.
DIFFERENCE
yields all rows in one table that are not found in the other table; that is, it subtracts one table from the other.
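UNION, INTERSECT, and DIFFERENCE can be sketched on tables held as lists of tuples. This assumes both tables have the same attributes in the same order and that neither contains duplicate rows; the function names are hypothetical.

```python
def union(a, b):
    """All rows from both tables, with duplicate rows listed once."""
    result = []
    for row in a + b:
        if row not in result:
            result.append(row)
    return result

def intersect(a, b):
    """Only the rows that appear in both tables."""
    return [row for row in a if row in b]

def difference(a, b):
    """Rows of a that are not found in b (a minus b)."""
    return [row for row in a if row not in b]
```

Note that `difference(a, b)` and `difference(b, a)` generally give different results, since the order of subtraction matters.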
PRODUCT
yields all possible pairs of rows from two tables (Cartesian product).
JOIN
allows information to be intelligently combined from two or more tables.
natural join
links tables by selecting only the rows with common values in their common attributes.
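PRODUCT and the natural join can be compared side by side. In this sketch, rows are dicts and the "common attributes" are simply the keys that both rows share; the function names and the product/vendor data in the usage note are made up for illustration.

```python
def product(a, b):
    """Every possible pairing of a row from a with a row from b."""
    return [(row_a, row_b) for row_a in a for row_b in b]

def natural_join(a, b):
    """Combine only the row pairs whose common attributes have equal values."""
    joined = []
    for row_a, row_b in product(a, b):
        common = row_a.keys() & row_b.keys()
        if all(row_a[k] == row_b[k] for k in common):
            # The common attribute appears once in the merged row.
            joined.append({**row_a, **row_b})
    return joined
```

Joining a product table and a vendor table that share a `vend_code` attribute keeps only the products whose vendor code matches a vendor row; unmatched rows are dropped.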
DIVIDE
used to answer questions about one set of data being associated with all values of data in another set of data.
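A simplified sketch of DIVIDE, assuming the first table is a list of two-value pairs and the second is a list of required values. It returns each left-hand value that is paired with every required value. The function name and the data shapes are assumptions for illustration.

```python
def divide(pairs, required):
    """Return the left values that are associated with ALL required values."""
    required = set(required)
    associated = {}
    for left, right in pairs:
        associated.setdefault(left, set()).add(right)
    # Keep a left value only if its associated set covers every required value.
    return sorted(left for left, rights in associated.items() if required <= rights)
```

For example, if the pairs are (student, course) and the required values are two course codes, the result is the list of students who took both courses.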
Data Dictionary
provides a detailed description (metadata) of all tables in the database created by the user and designer.
system catalog
can be described as a detailed system data dictionary that describes all objects within the database (table names, number of columns, etc.).
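As one concrete case, SQLite exposes its catalog through the `sqlite_master` table and the `PRAGMA table_info` statement. The sketch below uses Python's built-in sqlite3 module to list each table and its column names; the helper name `describe_tables` and the sample tables are hypothetical, and other DBMSs expose their catalogs differently.

```python
import sqlite3

def describe_tables(conn):
    """Map each table name to its list of column names, read from the catalog."""
    catalog = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # Each table_info row is (cid, name, type, notnull, dflt_value, pk).
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        catalog[table] = [column[1] for column in columns]
    return catalog

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vendor (vend_code INTEGER PRIMARY KEY, vend_name TEXT)")
conn.execute("CREATE TABLE product (prod_code TEXT PRIMARY KEY, vend_code INTEGER)")
```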