Data Management - CRASH COURSE

143 Terms

1
Data
Refers to the collected resource itself; the basic data discovered, investigated, collected, and created in the real world. That is, data are related to a fact, which refers to a natural state free of the values and judgments of human beings.
2
Information
Refers to various data that are organized, classified, and systematized for certain purposes, according to certain rules. When data are processed in a certain form, the information necessary for achieving a specific purpose is created.
3
Knowledge
Refers to matter that is generalized from several items of concrete information. It is created by analyzing and studying the meaning of, and the relationships between, items of information. Correlations between items of information must be established to convert information into knowledge.
4
Wisdom
A state in which an individual can understand and apply knowledge. The mental ability to acquire, understand, apply, and develop knowledge.
5
Batch processing system
Data are collected over a certain period, or until a certain amount has accumulated, and are then processed all at once.
6
Online processing system
When data are transferred to a computer, the computer immediately processes the data (Real-time processing system).
7
Distributed processing system
Data are processed by connecting processors and databases, which are geographically dispersed, over the network.
8
File processing system
Refers to a method of storing and retrieving paper documents that was widely used before the computer was invented. However, the meaning of the term began to include computerized records in the 1960s. Such a file system is designed to arrange and manage the data recorded by the user on a physical disk. In general, a hierarchical file system with a directory structure is used.
9
Database
Has the characteristics of being integrated, operational, stored, and shared. In the past, data were inevitably saved in duplicate when recorded on paper, because even the same data could not be shared in real time. Data were also saved in duplicate across several files even when the file system was in use. The database, however, manages data centrally in one place, so duplication can be eliminated as far as possible.
10
Integrated data
This means that, in principle, the same data are not duplicated: redundancy is avoided and kept to a minimum.
11
Stored data
Data are saved in a storage medium that can be accessed by a computer (tape, disk, etc.).
12
Operational data
Data which are required to perform the unique functions of an organization. (Temporary data used while performing a task, such as simple input/output, are not operational data.)
13
Shared data
Data which are jointly owned, maintained, and used by several application programs within an organization.
14
Database System (DBS)
Refers to a computer-centered system that creates necessary information by storing and managing data in a database. It is composed of four components: the database, the database language, the users, and the DBMS.
15
Database language
A tool that provides an interface between users and the system.
16
Users
Database administrator (DBA), database application programmer, database user.
17
Database Management System (DBMS)
System software that provides database construction and utilization functions. It was devised to solve the problems of the file system, namely data dependency and redundancy. It manages the database so that it can be shared by all application programs, acting as a mediator between application programs and data.
18
Data independence
The concept that the database structure and the application programs that use it are separate. It is broadly divided into logical independence and physical independence.
19
Logical independence
When the conceptual schema is changed, the external schema is not affected. An application program is not affected by a change in the logical structure of the database.
20
Physical independence
When the internal schema is changed, the external and conceptual schemas are not affected. An application program and an external schema are not affected by a structural change of the storage device.
21
ANSI-SPARC 3-level database architecture
A data independence model for the DBMS and its interfaces, proposed by the Standards Planning and Requirements Committee (SPARC), a special subcommittee of the X3 committee under the American National Standards Institute (ANSI). It consists of an external level, a conceptual level, and an internal level.
22
External schema
The schema at the external (view) level, that is, a personal database schema as seen by an individual user. It is composed of the viewpoints of several users, each being one user's view of the database. The database views of users or application programmers are defined here.
23
Conceptual schema
The conceptual and logical schema at the conceptual level. It describes the definition of the database content, integrating all users' perspectives. It gives a view of the entire database that integrates the data required by all application programs and users, including the data saved in the database and the relationships between the data.
24
Internal schema
The schema at the internal level. It describes the format in which the database is physically stored, that is, how data are actually stored on the physical device.
25
Database Administrator (DBA)
Takes responsibility for the configuration and overall administration of the database to ensure that the functions of the system are performed properly.
26
Data Architect (DA)
Establishes, models, and systemizes the policies and standards for data-related elements such as data, database, data standard, and data security.
27
DDL compiler
Processes the schema specified in the DDL into an internal form (i.e., metadata) and stores it in the system catalog. The DBMS can then access and use this catalog information when necessary.
28
Query processor
Processes high-level queries submitted by general users (i.e., the queries are parsed, analyzed, and compiled). The database access code is then generated and sent to the runtime database handler for execution.
29
DML pre-compiler
Extracts the DML commands embedded in an application program and sends them to the DML compiler to be compiled into object code for database access.
30
DML compiler
Generates object code by parsing and compiling the DML commands it receives.
31
Runtime database handler
Executes a user's DML query by performing database operations, such as search or update, through the stored data manager.
32
Transaction manager
Checks compliance with integrity constraints and the user's access rights while processing a transaction. It also handles concurrency control among transactions and recovery when a failure occurs.
33
Stored data manager
Manages access to the user database and the catalog stored on disk. It does so by issuing requests to the file manager of the OS.
34
Hierarchical database
Stores data hierarchically in a tree structure that represents the relationships between superiors (parents) and subordinates (children).
35
Network database
Stores data by expanding the tree form of a hierarchical database into a network form. Pointers are used to maintain a many-to-many relationship between records and to link data.
36
Relational database (RDB)
Based on the relational data model proposed by E.F. Codd in 1970. Major commercial products include Oracle, SQL Server, DB2, Informix, Sybase, and Ingres.
37
Object-oriented database (OODB)
The relational database cannot create new types of data or expand existing types and has difficulty processing unstructured complex information such as multimedia. Also, the standard query language ‘SQL’ expresses data relationships by values, making it difficult to find and process mutually related entities when expressing a complex object. Accordingly, a database that can search and store information based on an object model emerged, namely, the object-oriented database.
38
Object relational database (ORDB)
The object-oriented database was introduced to solve the weaknesses of the relational database for new advanced applications, but it showed certain limitations when used in an enterprise environment. To overcome this, the object relational database was introduced by combining the existing relational database with concepts of the object-oriented database and expanding its functions.
39
XML (Extensible Markup Language)
HTML (HyperText Markup Language) is mainly used to create and format web documents in a web environment, but it remains unsuitable for specifying structured data extracted from a database. Therefore, the World Wide Web Consortium (W3C) has developed an extensible markup language (XML) as a standard language that is used to structure and exchange data in a web environment.
40
XML Document Type Definition (DTD)
A document that defines the form of an XML document in a consistent structure. XML DTD is used to validate XML.
41
XML Schema
A more powerful definition language than XML DTD, in which data types can also be declared. It is a W3C Recommendation that specifies the structure and constraints of an XML document.
42
XPath
A language of path expressions for addressing parts of an XML document. It extends plain paths so that search conditions can be included in the XML path expression.
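A minimal sketch of a path expression with and without a search condition. It relies on the limited XPath subset supported by Python's xml.etree.ElementTree, not a full XPath engine, and the sample document and its element names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; element and attribute names are illustrative only.
XML_TEXT = """
<library>
  <book year="1970"><title>A Relational Model</title></book>
  <book year="1976"><title>The Entity-Relationship Model</title></book>
</library>
"""

root = ET.fromstring(XML_TEXT)

# Path expression only: every <title> under every <book>.
all_titles = [t.text for t in root.findall("./book/title")]

# Path expression with a search condition (predicate) on an attribute.
titles_1976 = [t.text for t in root.findall("./book[@year='1976']/title")]

print(all_titles)
print(titles_1976)
```

The predicate in square brackets is what turns a plain path into a query with a condition.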
43
XQuery
A standard XML query language used to search XML-based data. It can extract the intended information from an XML file as if querying a database.
44
XSL (Extensible Stylesheet Language)
A language for specifying style sheets, which are used to present XML data in various forms.
45
XLL (XML Linking Language)
The XML Linking Language (XLL), also called the Extensible Linking Language, describes the connections and relationships between XML documents. It was later standardized as XLink (together with XPointer).
46
XML parser
Checks the grammar and syntax (tree structure, etc.) of an XML document (validation check).
47
XML syntax analyzer
Analyzes the syntax structure of an XML document (SAX, DOM).
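The two analyzer styles named on this card can be compared with a short sketch. It uses Python's standard xml.dom.minidom and xml.sax modules as stand-ins for DOM and SAX; the sample document and the OrderHandler class are hypothetical.

```python
import xml.dom.minidom
import xml.sax

# Hypothetical sample document.
XML_TEXT = "<orders><order id='1'/><order id='2'/></orders>"

# DOM: the whole document is parsed into an in-memory tree, then navigated.
dom = xml.dom.minidom.parseString(XML_TEXT)
dom_ids = [node.getAttribute("id") for node in dom.getElementsByTagName("order")]


# SAX: the parser streams through the document and fires callbacks per element.
class OrderHandler(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.ids = []

    def startElement(self, name, attrs):
        if name == "order":
            self.ids.append(attrs.getValue("id"))


handler = OrderHandler()
xml.sax.parseString(XML_TEXT.encode("utf-8"), handler)
sax_ids = handler.ids

print(dom_ids, sax_ids)
```

Both approaches read the same ids. DOM keeps the full tree available for random access, while SAX never holds more than the current event.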
48
XSL engine
Converts an XML document to a document format with expression information.
49
Multimedia database
Developed to efficiently search and manage unstructured multimedia data, such as text, images, audio, and video, which are characterized by their large size and complexity.
50
Main memory database (MMDB)
Unlike general commercial databases in which a database is stored on a disk, the main memory database manages and manipulates a database by keeping it in the memory.
51
Embedded database
General commercial databases are not suitable for the embedded system, which has limited memory and special performance goals. The embedded database was developed for the embedded system, so as to allow specific functions to be used in a limited embedded environment.
52
Mobile database
Exclusively used by mobile devices. The mobile database, which is installed in a mobile device, processes the data generated during field work and sends them to the central server for synchronization.
53
Spatial database
A set of non-spatial data represented by letters and numbers and spatial data represented by coordinates. The spatial database was initially developed because a technology for processing “unstructured data”, such as that of a geographic information system, was needed as a core technology for a type of guided missile that strikes a preset target by tracking the target's location using geographical information.
54
Column base database
Physically stores data based on columns. The data storage method of the relational database is not defined as row base or column base, but general relational databases use a physical storage structure based on rows.
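As a rough illustration of the difference in physical layout, the sketch below holds the same three records in a row-based and a column-based arrangement. The data and variable names are hypothetical, and in-memory Python lists only approximate how a storage engine lays out pages on disk.

```python
# Hypothetical sample records.
rows = [
    {"id": 1, "city": "Seoul", "amount": 120},
    {"id": 2, "city": "Busan", "amount": 80},
    {"id": 3, "city": "Seoul", "amount": 200},
]

# Row-based layout: all values of one record sit together.
row_store = [tuple(r.values()) for r in rows]

# Column-based layout: all values of one column sit together.
column_store = {name: [r[name] for r in rows] for name in rows[0]}

# An aggregate over one column reads only that column's values.
total_amount = sum(column_store["amount"])

print(row_store)
print(column_store)
print(total_amount)
```

Column layouts tend to favor analytical queries that scan a few columns over many records, while row layouts favor reading or writing whole records.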
55
Conceptual data modeling
Has a high level of abstraction, focuses on the business, and models at a comprehensive level. It is often used for enterprise data modeling and for establishing the enterprise architecture. It expresses what exists in the real world as a database design at a high level of abstraction.
56
Logical data modeling
Accurately expresses keys, attributes, relationships, etc. in a detailed data model that serves as the basis for developing a system. It converts the result of the conceptual design, which is created for human understanding, into a logical structure that can be easily stored in a database.
57
Physical data modeling
Designs the model with physical characteristics in mind, in order to improve performance and storage efficiency, so that the data can actually be implemented in a database. It determines the physical storage structure of the database based on the logical structure design.
58
Entity-Relationship Model (ER Model)
A form of notation for data models created by Peter Chen in 1976. This notation uses a rectangle, a diamond, and an ellipse to represent an entity, a relationship, and an attribute respectively.
59
Entity
Refers to a meaningful unit of information in the real world. An entity includes both physical objects and conceptual objects.
60
Weak entity
An entity without its own identifier. It is represented using a two-line rectangle.
61
Relationship
Refers to the correlation between entities and is represented using a diamond.
62
Attribute
Represents the intrinsic nature of an entity or relationship and is represented using an ellipse.
63
Identifier (key attribute)
An attribute or a set of attributes whose value is always unique within an entity set (e.g., student ID, vehicle registration number, etc.). Represented by underlining the attribute's name.
64
Weak entity separator (partial key attribute)
A weak entity does not have its own identifier, so it must be connected to another entity that plays the role of an identifying entity. The identifier of the identifying entity is combined with some attributes of the weak entity to serve as the identifier. The weak-entity attributes used for this purpose are called separator (discriminator) or partial key attributes.
65
Multi-valued attribute
An attribute that can have multiple values at the same time. It is distinguished from general attributes (single-valued attributes) by marking it with a two-line ellipse.
66
Derived attribute
Refers to an attribute that can be derived from other stored data. It is distinguished from general attributes (stored attributes) by marking it with a dotted ellipse.
67
Composite attribute
Refers to an attribute that can be decomposed into two or more elements, and is distinguished from general attributes (simple attribute) by displaying the links between the attributes.
68
Extended Entity-Relationship (EER) Model
Adds several useful concepts to the basic ER model.
69
Generalization
Integration of multiple entity types (sub types) into one upper-level entity type (super type).
70
Specialization
Separation of one entity type (super type) into multiple lower-level entity types (sub types).
71
Aggregation
Defines a new entity with a set of multiple entities. It is called an IS-PART-OF relationship.
72
Crow's Foot Notation
An ERD notation mainly used in industry.
73
Fan trap
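Can occur when there are entity types A, B, and C, with an N:1 relationship between entities A and B and a 1:N relationship between entities B and C. Because both relationships fan out from B, it becomes ambiguous which instance of A is related to which instance of C.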
74
Chasm trap
Means that desired information cannot be found due to the disconnection of the information flow when there is an optional relationship rather than a mandatory relationship.
75
Object-Relational Mapping (ORM)
A technique in which a table of the relational database is mapped to a class used in the object-oriented design.
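The table-to-class correspondence can be sketched in a few lines. This is a toy mapping built on sqlite3 and dataclasses, not how any particular ORM library works; the Student class and the helper functions create_table, save, and load_all are hypothetical.

```python
import sqlite3
from dataclasses import dataclass, fields, astuple


@dataclass
class Student:  # hypothetical class; corresponds to a "student" table
    student_id: int
    name: str


def create_table(conn, cls):
    # One column per dataclass field; the table is named after the class.
    cols = ", ".join(f.name for f in fields(cls))
    conn.execute(f"CREATE TABLE {cls.__name__.lower()} ({cols})")


def save(conn, obj):
    # One object becomes one row.
    placeholders = ", ".join("?" for _ in fields(obj))
    conn.execute(
        f"INSERT INTO {type(obj).__name__.lower()} VALUES ({placeholders})",
        astuple(obj),
    )


def load_all(conn, cls):
    # Each row becomes one object again.
    cols = ", ".join(f.name for f in fields(cls))
    cur = conn.execute(f"SELECT {cols} FROM {cls.__name__.lower()}")
    return [cls(*row) for row in cur]


conn = sqlite3.connect(":memory:")
create_table(conn, Student)
save(conn, Student(1, "Kim"))
save(conn, Student(2, "Lee"))
students = load_all(conn, Student)
print(students)
```

The class plays the role of the table, each field the role of a column, and each instance the role of a record.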
76
Integrity
Refers to the need to protect data from problems such as lost updates in order to maintain their accuracy, validity, consistency, and stability. Three types of integrity are defined depending on the viewpoint.
77
Domain integrity
Ensures the integrity of each field in a table. Constraints such as the data type and whether a null value is allowed can be defined for a field. Each attribute value must be atomic and must be a value defined in the domain.
78
Key integrity
All records in a table should be identifiable from each other.
79
Entity integrity
All tables must have a primary key, and each primary key value must be unique and must not be null.
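A small sketch of this rule, using SQLite through Python's sqlite3 module as an example engine. The student table and the try_insert helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# NOT NULL is spelled out because SQLite, for historical reasons, tolerates
# NULL in a non-INTEGER PRIMARY KEY column unless it is declared NOT NULL.
conn.execute(
    "CREATE TABLE student (student_id TEXT PRIMARY KEY NOT NULL, name TEXT)"
)
conn.execute("INSERT INTO student VALUES ('S001', 'Kim')")


def try_insert(student_id, name):
    """Report whether the engine accepts or rejects the row."""
    try:
        conn.execute("INSERT INTO student VALUES (?, ?)", (student_id, name))
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


duplicate_result = try_insert("S001", "Lee")  # same key as an existing row
null_result = try_insert(None, "Park")        # missing key
fresh_result = try_insert("S002", "Choi")     # new, non-null key
print(duplicate_result, null_result, fresh_result)
```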
80
Referential integrity
Data in two tables in a reference relationship must always have consistent values. A foreign key must be either a null value or a value that exists in the primary key of the table referenced by the foreign key. While a foreign key refers to a primary key value, the referenced record cannot be deleted and its key cannot be changed (unless a rule such as cascading is defined).
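The cases described on this card can be tried out with a short sketch. It uses SQLite via Python's sqlite3 module, where foreign key enforcement must be switched on per connection; the department and employee tables and the attempt helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    """CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY,
        dept_id INTEGER REFERENCES department(dept_id)
    )"""
)
conn.execute("INSERT INTO department VALUES (10, 'Sales')")


def attempt(sql):
    """Report whether the engine accepts or rejects the statement."""
    try:
        conn.execute(sql)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


results = {
    "existing parent": attempt("INSERT INTO employee VALUES (1, 10)"),
    "null foreign key": attempt("INSERT INTO employee VALUES (2, NULL)"),
    "missing parent": attempt("INSERT INTO employee VALUES (3, 99)"),
    "delete referenced parent": attempt("DELETE FROM department WHERE dept_id = 10"),
}
print(results)
```

No ON DELETE action is declared here, so the engine falls back to refusing the delete while a child row still points at the parent.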
81
Super key
An attribute or set of attributes (a subset of all fields in a table) that can uniquely identify a record.
82
Candidate key
A candidate for the primary key. It is a super key that satisfies both uniqueness and minimality. That is, if any attribute is removed from a candidate key, the remaining attributes can no longer uniquely identify a record.
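The difference between a super key and a candidate key can be explored on sample data. Keys are properly a property of the schema, so this sketch only shows which attribute sets the given rows permit as keys; the rows and the functions is_super_key and candidate_keys are hypothetical.

```python
from itertools import combinations

# Hypothetical sample records.
rows = [
    {"student_id": 1, "email": "a@x.org", "name": "Kim"},
    {"student_id": 2, "email": "b@x.org", "name": "Lee"},
    {"student_id": 3, "email": "c@x.org", "name": "Kim"},
]


def is_super_key(attrs, rows):
    """True if no two rows agree on every attribute in attrs (uniqueness)."""
    seen = {tuple(r[a] for a in attrs) for r in rows}
    return len(seen) == len(rows)


def candidate_keys(rows):
    """Super keys that contain no smaller super key (minimality)."""
    names = sorted(rows[0])
    found = []
    # Smaller sets are tried first, so any superset of a found key is skipped.
    for size in range(1, len(names) + 1):
        for attrs in combinations(names, size):
            contains_key = any(set(k) <= set(attrs) for k in found)
            if not contains_key and is_super_key(attrs, rows):
                found.append(attrs)
    return found


print(is_super_key(("name", "student_id"), rows))
print(candidate_keys(rows))
```

Here (name, student_id) is a super key but not a candidate key, because student_id alone is already unique.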
83
Primary key
A unique identifier selected from among candidate keys to distinguish a specific record in the table. Like the candidate key, the primary key has the attribute of uniqueness and minimality, but a null value is not allowed.
84
Foreign key
When a foreign key of Table A refers to Table B, it refers to a key that can uniquely identify a record of Table B. A foreign key consists of one field or a subset of all fields. Unlike a primary key, it may contain duplicate values, and it may be null unless a constraint forbids it. It must refer to a field that has a unique value in Table B.
85
Normalization
Normalization theory is a foundation for building a sound system and is one of the most important theories to understand when building a system in the field. It decomposes tables step by step to remove duplicated data so that data can be processed stably. The first normal form is the starting point for normalization.
86
Denormalization
The process of integrating data models for normalized entities, attributes, and relationships in order to improve system performance and simplify development and operations.
87
Insertion anomaly
An anomaly in which unwanted information must also be inserted when inserting certain information.
88
Deletion anomaly
An anomaly in which necessary information is also deleted when deleting certain information.
89
Update anomaly
An anomaly in which the same content must be updated repeatedly in several records when modifying certain information. If some copies are missed, the data become inconsistent.
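The update and deletion anomalies can be reproduced on a small unnormalized data set. The enrollment records below are hypothetical, and plain Python lists stand in for a table.

```python
# Hypothetical unnormalized records: the instructor's office is repeated
# in every enrollment row.
enrollments = [
    {"student": "Kim", "course": "DB", "instructor": "Park", "office": "R101"},
    {"student": "Lee", "course": "DB", "instructor": "Park", "office": "R101"},
]

# Update anomaly: changing the office in only one row leaves contradictory data.
partial = [dict(r) for r in enrollments]
partial[0]["office"] = "R202"
offices_after_partial_update = {
    r["office"] for r in partial if r["instructor"] == "Park"
}

# Deletion anomaly: removing every enrollment in the course also removes
# the only record of the instructor's office.
after_delete = [r for r in enrollments if r["course"] != "DB"]
office_still_known = any(r["instructor"] == "Park" for r in after_delete)

print(offices_after_partial_update)
print(office_still_known)
```

Storing instructors in a table of their own would leave one office value to update and would keep it when enrollments are deleted.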
90
Functional dependency
Let X and Y be subsets of the fields defined in table R. If, for any two records t1 and t2 of R, the Y values are always the same whenever the X values are the same, then Y is functionally dependent on X (written X → Y).
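The definition translates almost directly into a check over sample records. A dependency is a statement about every legal state of the table, so the sketch can refute a dependency but can only show that the given rows are consistent with one. The function name holds and the rows are hypothetical.

```python
def holds(rows, x, y):
    """Return True if x -> y is not violated by any pair of rows."""
    seen = {}
    for r in rows:
        x_val = tuple(r[a] for a in x)
        y_val = tuple(r[a] for a in y)
        # The first Y value seen for an X value must match every later one.
        if seen.setdefault(x_val, y_val) != y_val:
            return False
    return True


# Hypothetical sample records.
rows = [
    {"student_id": 1, "dept": "CS", "building": "E1"},
    {"student_id": 2, "dept": "CS", "building": "E1"},
    {"student_id": 3, "dept": "Math", "building": "N2"},
]

print(holds(rows, ["dept"], ["building"]))    # dept -> building
print(holds(rows, ["dept"], ["student_id"]))  # dept -> student_id
```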
91
Armstrong's axioms
A set of inference rules used to infer all functional dependencies on a relational database. It includes reflexivity rule, augmentation rule, and transitivity rule.
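In practice the axioms are usually applied through the attribute closure algorithm, which repeatedly adds every attribute that the given dependencies determine. The sketch below assumes dependencies are given as (left side, right side) pairs of sets; the function name closure and the sample dependencies are hypothetical.

```python
def closure(attrs, fds):
    """Return all attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left side is already determined, so is the right side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


# Hypothetical dependencies: A -> B and B -> C.
fds = [({"A"}, {"B"}), ({"B"}, {"C"})]

print(sorted(closure({"A"}, fds)))
print(sorted(closure({"B"}, fds)))
```

The closure of A contains C, which is the transitivity rule at work: A → B and B → C give A → C.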
92
First Normal Form (1NF)
A relation is in 1NF if all its attributes have atomic values.
93
Second Normal Form (2NF)
A relation is in 2NF if it is in 1NF and every non-primary-key attribute is fully functionally dependent on the primary key.
94
Third Normal Form (3NF)
A relation is in 3NF if it is in 2NF and no non-primary-key attribute is transitively dependent on the primary key.
95
Boyce-Codd Normal Form (BCNF)
A relation is in BCNF if every determinant is a candidate key.
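A BCNF check can be sketched with attribute closures. It uses the common formulation that the determinant of every nontrivial dependency must be a super key, that is, its closure must cover the whole relation. The functions closure and bcnf_violations and the sample relation are hypothetical.

```python
def closure(attrs, fds):
    """Return all attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(relation, fds):
    """Return the nontrivial dependencies whose determinant is not a super key."""
    violations = []
    for lhs, rhs in fds:
        trivial = rhs <= lhs
        if not trivial and not relation <= closure(lhs, fds):
            violations.append((lhs, rhs))
    return violations


# Hypothetical relation: each instructor teaches exactly one course.
relation = {"student", "course", "instructor"}
fds = [
    ({"student", "course"}, {"instructor"}),
    ({"instructor"}, {"course"}),
]

print(bcnf_violations(relation, fds))
```

The relation is in 3NF but not in BCNF, because instructor determines course without being a key.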
96
Fourth Normal Form (4NF)
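A relation is in 4NF if it is in BCNF and has no nontrivial multi-valued dependencies (MVD) other than those determined by a super key. Normalizing to 4NF separates two or more independent multi-valued dependencies that occur in one relation.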
97
Fifth Normal Form (5NF)
If every join dependency that exists in relation R is implied by the candidate keys of relation R, relation R is in the fifth normal form.
98
Index
A data structure that organizes database record information to perform search operations quickly.
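The effect of an index on a search can be observed in the query plan. This sketch uses SQLite through Python's sqlite3 module; the member table, the index name idx_member_email, and the plan helper are hypothetical, and the exact plan wording differs between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE member (member_id INTEGER PRIMARY KEY, email TEXT, name TEXT)"
)
conn.executemany(
    "INSERT INTO member (email, name) VALUES (?, ?)",
    [(f"user{i}@example.org", f"User {i}") for i in range(1000)],
)

QUERY = "SELECT name FROM member WHERE email = ?"


def plan(sql):
    """Return SQLite's textual query plan for sql."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN " + sql, ("user42@example.org",)
    ).fetchall()
    # The last column of each plan row holds the human-readable detail.
    return " ".join(row[-1] for row in rows)


plan_before = plan(QUERY)  # without an index the table has to be scanned
conn.execute("CREATE INDEX idx_member_email ON member (email)")
plan_after = plan(QUERY)   # with the index the plan refers to idx_member_email

print(plan_before)
print(plan_after)
```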
99
View
A table that is virtually created by collecting the desired data only from one or more tables.
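A short sketch of a view, using SQLite through Python's sqlite3 module. The employee table and the sales_staff view are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee "
    "(emp_id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "Kim", "Sales", 5000), (2, "Lee", "HR", 4500), (3, "Park", "Sales", 5200)],
)

# The view collects only the desired rows and columns; salary stays hidden.
conn.execute(
    "CREATE VIEW sales_staff AS "
    "SELECT emp_id, name FROM employee WHERE dept = 'Sales'"
)
VIEW_QUERY = "SELECT emp_id, name FROM sales_staff ORDER BY emp_id"
sales_rows = conn.execute(VIEW_QUERY).fetchall()

# The view stores no rows of its own: new base-table data shows up when queried.
conn.execute("INSERT INTO employee VALUES (4, 'Choi', 'Sales', 4800)")
sales_rows_after = conn.execute(VIEW_QUERY).fetchall()

print(sales_rows)
print(sales_rows_after)
```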
100
Distributed database
A database that is logically integrated and shared so that users can recognize it as a single database even though it is physically distributed among multiple computers on a network.