Database
collection of data organized within a computing system to efficiently serve multiple applications by centralizing information and reducing redundancy
Structured Data
follows a pre-defined data model and is easy to search; the most mature applications exist for it
Unstructured Data
has no pre-defined data model and is harder to search; applications are relatively recent and still developing
Semi-Structured
tag-driven structure that identifies data elements and their hierarchy, but does not follow the formal structure of a data model
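As a rough illustration of tag-driven structure, the sketch below parses a small JSON document and lists the tags with their position in the hierarchy. The document contents and the `tag_paths` helper are hypothetical, made up for this example.

```python
import json

# Hypothetical semi-structured document: keys act as tags and nesting gives
# the hierarchy, but the two orders do not share the same set of fields.
raw = (
    '{"customer": {"name": "Ana", "orders": ['
    '{"id": 1, "items": ["pen", "ink"]}, {"id": 2}]}}'
)
doc = json.loads(raw)

def tag_paths(node, prefix=""):
    """Yield the path of every tag found in the document."""
    if isinstance(node, dict):
        for key, value in node.items():
            path = f"{prefix}/{key}"
            yield path
            yield from tag_paths(value, path)
    elif isinstance(node, list):
        for item in node:
            yield from tag_paths(item, prefix)

paths = sorted(set(tag_paths(doc)))
```

Note that the second order has no `items` tag: the tags describe the data, yet nothing forces every element to follow one fixed model.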
Entity
person, place, thing, or event on which we store and maintain information
Attribute
each characteristic or quality describing a particular entity
Bit
represents the smallest unit of data a computer can handle
Byte
represents a single character, which can be a letter, number, or another symbol
Field
a grouping of characters into a word, a group of words, or a complete number, such as a person’s name or age
Record
a group of related fields
File
a group of records of the same type
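The hierarchy from bit to file can be sketched in a few lines. The names and values below are illustrative assumptions, and a Python list stands in for a file.

```python
# One byte (8 bits) encodes one character in ASCII.
char = "A"
byte = char.encode("ascii")        # a single byte
bits = format(byte[0], "08b")      # the 8 bits that make up that byte

# A field groups characters, a record groups related fields,
# and a file groups records of the same type.
record = {"name": "Ana Silva", "age": 34}
file_of_records = [record, {"name": "Ben Okoro", "age": 41}]

# Every record in the file has the same fields.
same_type = all(r.keys() == record.keys() for r in file_of_records)
```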
Database Management System (DBMS)
software that permits an organization to centralize data, manage them efficiently, and provide access to the stored data by application programs
Relationship
association between entities
Data Redundancy
the presence of duplicate data in multiple data files
Data Inconsistency
the presence of different values for the same attribute when the same data are stored in multiple locations
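A minimal sketch of how redundancy leads to inconsistency, assuming two hypothetical files that each keep their own copy of a customer's city:

```python
# The same customer is stored redundantly in two separate files.
billing_file = [{"customer_id": 1, "name": "Ana Silva", "city": "Toronto"}]
shipping_file = [{"customer_id": 1, "name": "Ana Silva", "city": "Ottawa"}]

def find_inconsistencies(file_a, file_b, key, attribute):
    """Return (key, value_in_a, value_in_b) wherever the two copies disagree."""
    b_by_key = {row[key]: row for row in file_b}
    return [
        (row[key], row[attribute], b_by_key[row[key]][attribute])
        for row in file_a
        if row[key] in b_by_key
        and row[attribute] != b_by_key[row[key]][attribute]
    ]

conflicts = find_inconsistencies(billing_file, shipping_file, "customer_id", "city")
```

Here `conflicts` holds one entry, `(1, "Toronto", "Ottawa")`: the city was updated in one file but not the other.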
Program-Data Dependence
the close relationship between data stored in files and the software programs that update and maintain those files. Any change in data organization or format requires a change in all the programs associated with those files
Requirement Analysis
based on business processes: what do users need, and what should the database do?
Conceptual (Logical) Design
high-level description (often E/R models)
Schema Refinement
consistency, normalization
Relational Database (SQL)
built around a single concept for modelling data: relations, which represent related data as two-dimensional tables
Records (tuples)
the rows of a relation; each record holds the attribute values for one instance of an entity
Primary Key
used to identify rows (tuples) in the relation, to establish connections to other relations, and for storage purposes (most important key in a table)
Foreign Key
essentially a lookup field: it holds the primary key of a row in another table, so that row's data can be looked up
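The two key types can be seen together in a small sketch using Python's built-in `sqlite3` module. The `supplier` and `part` tables and their contents are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE supplier (
    supplier_id INTEGER PRIMARY KEY,   -- primary key: identifies each row
    name        TEXT NOT NULL
);
CREATE TABLE part (
    part_id     INTEGER PRIMARY KEY,
    part_name   TEXT NOT NULL,
    supplier_id INTEGER REFERENCES supplier(supplier_id)  -- foreign key
);
INSERT INTO supplier VALUES (1, 'Acme');
INSERT INTO part VALUES (10, 'Door latch', 1);
""")

# Follow the foreign key in part to look up the matching supplier row.
supplier_id = conn.execute(
    "SELECT supplier_id FROM part WHERE part_id = 10"
).fetchone()[0]
supplier_name = conn.execute(
    "SELECT name FROM supplier WHERE supplier_id = ?", (supplier_id,)
).fetchone()[0]
```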
Select
creates a subset consisting of all records in the file that meet stated criteria
Join
combines relational tables to provide the user with more information than is available in individual tables
Project
creates a subset consisting of columns in a table, permitting the user to create new tables that contain only the information required
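The three relational operations map onto ordinary SQL. The sketch below uses `sqlite3` with hypothetical `supplier` and `part` tables: select filters rows, join combines tables, and project keeps only chosen columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE supplier (supplier_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE part (part_id INTEGER PRIMARY KEY, part_name TEXT,
                   supplier_id INTEGER REFERENCES supplier(supplier_id));
INSERT INTO supplier VALUES (1, 'Acme'), (2, 'Bolt Co');
INSERT INTO part VALUES (10, 'Door latch', 1), (11, 'Hinge', 2), (12, 'Seal', 1);
""")

# Select: the subset of rows that meet the stated criteria.
selected = conn.execute(
    "SELECT * FROM part WHERE part_id IN (10, 11) ORDER BY part_id"
).fetchall()

# Join: combine both tables to get more information than either holds alone.
joined = conn.execute("""
    SELECT part.part_id, part.part_name, supplier.name
    FROM part JOIN supplier ON part.supplier_id = supplier.supplier_id
    ORDER BY part.part_id
""").fetchall()

# Project: the subset of columns that is actually required.
projected = conn.execute(
    "SELECT part_name FROM part ORDER BY part_id"
).fetchall()
```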
Data Definition (DBMS)
capability to specify the structure and the content of the DB.
Data Dictionary
stores definition of data elements and their characteristics
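As an illustration, SQLite lets a table's structure be defined with `CREATE TABLE` and then read back with `PRAGMA table_info`, which behaves like a very small data dictionary. The `student` table is a made-up example, and a full DBMS data dictionary would record much more (ownership, usage, security, and so on).

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition: specify the structure of the table.
conn.execute("""
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        age        INTEGER
    )
""")

# Data dictionary (simplified): read back each data element's characteristics.
# table_info rows are (cid, name, type, notnull, dflt_value, pk).
data_dictionary = [
    {"column": row[1], "type": row[2],
     "required": bool(row[3]), "primary_key": bool(row[5])}
    for row in conn.execute("PRAGMA table_info(student)")
]
```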
Data Manipulation
capability to add, change, delete, and retrieve data in the DB, including querying and reporting
Data Manipulation Language
a language associated with a database management system that end users and programmers use to manipulate data in the database
Query
request for data from a database
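SQL is the most widely used data manipulation language. A short sketch with `sqlite3`, using a hypothetical `student` table, shows adding, changing, deleting, and then querying data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT, major TEXT)"
)

# Add rows.
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Ana", "MIS"), (2, "Ben", "Finance"), (3, "Chen", "MIS")],
)
# Change a row.
conn.execute("UPDATE student SET major = 'Accounting' WHERE student_id = 2")
# Delete a row.
conn.execute("DELETE FROM student WHERE student_id = 3")

# Query: a request for data from the database.
rows = conn.execute(
    "SELECT name, major FROM student ORDER BY student_id"
).fetchall()
```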
Normalization
process of creating small, stable, yet flexible and adaptive data structures & substructures from complex groups of data
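A minimal sketch of the idea, assuming a hypothetical flat order list in which the customer name is repeated on every order. Splitting it into two smaller structures stores each customer name once.

```python
# Unnormalized data: customer details repeat on every order row.
flat_orders = [
    {"order_id": 1, "customer_id": 7, "customer_name": "Ana", "item": "pen"},
    {"order_id": 2, "customer_id": 7, "customer_name": "Ana", "item": "ink"},
    {"order_id": 3, "customer_id": 8, "customer_name": "Ben", "item": "pad"},
]

# Normalized: one structure per entity, linked by customer_id.
customers = {row["customer_id"]: row["customer_name"] for row in flat_orders}
orders = [
    {"order_id": row["order_id"], "customer_id": row["customer_id"], "item": row["item"]}
    for row in flat_orders
]

# Renaming a customer is now a single change instead of one per order.
customers[7] = "Ana Silva"
names_on_orders = [customers[o["customer_id"]] for o in orders]
```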
Referential Integrity
rules to ensure that relationships between coupled database tables remain consistent
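The sketch below shows one such rule being enforced in SQLite: a part may not point to a supplier that does not exist. The tables are hypothetical, and note that SQLite only enforces foreign keys once `PRAGMA foreign_keys = ON` has been set on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite
conn.execute("CREATE TABLE supplier (supplier_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("""
    CREATE TABLE part (
        part_id     INTEGER PRIMARY KEY,
        supplier_id INTEGER NOT NULL REFERENCES supplier(supplier_id)
    )
""")
conn.execute("INSERT INTO supplier VALUES (1, 'Acme')")
conn.execute("INSERT INTO part VALUES (10, 1)")   # supplier 1 exists: accepted

try:
    conn.execute("INSERT INTO part VALUES (11, 99)")  # supplier 99 does not exist
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

part_count = conn.execute("SELECT COUNT(*) FROM part").fetchone()[0]
```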
Entity-Relationship Diagram
methodology for documenting databases illustrating the relationship between various entities in the database
Non-Relational Database Management System (NoSQL)
database management system for working with large quantities of structured and unstructured data that would be difficult to analyze with a relational model
Composite
an element that is both an entity and a relationship, such as a linking table that resolves a many-to-many relationship
One-To-One
single valued in both directions
One-To-Many
one entity has multivalued relationship with another (but not the reverse)
Many-To-Many
multivalued in both directions
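One way to make the three cardinalities concrete is to classify a set of observed pairs. The `cardinality` helper and the sample pairs are hypothetical, and the function only describes the data it is given, not the rule a database designer intends.

```python
from collections import defaultdict

def cardinality(pairs):
    """Classify (a, b) pairs as one-to-one, one-to-many, or many-to-many."""
    a_to_b, b_to_a = defaultdict(set), defaultdict(set)
    for a, b in pairs:
        a_to_b[a].add(b)
        b_to_a[b].add(a)
    a_multi = any(len(values) > 1 for values in a_to_b.values())
    b_multi = any(len(values) > 1 for values in b_to_a.values())
    if a_multi and b_multi:
        return "many-to-many"      # multivalued in both directions
    if a_multi or b_multi:
        return "one-to-many"       # multivalued in one direction only
    return "one-to-one"            # single valued in both directions

person_passport = [("Ana", "P1"), ("Ben", "P2")]
supplier_part = [("Acme", "latch"), ("Acme", "seal")]
student_course = [("Ana", "DB"), ("Ana", "Stats"), ("Ben", "DB")]
```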
Blockchain
distributed database technology that enables firms and organizations to create and verify transactions on a network nearly instantaneously without a central authority
Cloud Database
type of database service built and accessed through a cloud computing platform
Big Data
data sets with volumes so huge that they are beyond the ability of typical relational DBMS to capture, store, and analyze. The data are often unstructured or semistructured
Data Warehouse
a database, with reporting and query tools, that stores current and historical data extracted from various operational systems and consolidated for management reporting and analysis
Data Mart
a small data warehouse containing only a portion of the organization’s data for a specified function or population of users
Hadoop
open source software framework managed by the Apache Software Foundation that enables distributed parallel processing of huge amounts of data across inexpensive computers
In-Memory Computing
technology for very rapid analysis and processing of large quantities of data by storing the data in the computer’s main memory rather than in secondary storage
Analytic Platforms
preconfigured hardware-software system that is specifically designed for high-speed analysis of large datasets
Data Lake
repository for raw unstructured or structured data that, for the most part, has not yet been analyzed; the data can be accessed in many ways
Online Analytical Processing (OLAP)
capability for manipulating and analyzing large volumes of data from multiple perspectives
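"Multiple perspectives" means the same facts can be summarized along different dimensions. A toy sketch, with made-up sales figures and a hypothetical `rollup` helper, totals units by region, by month, and by product and region together:

```python
from collections import defaultdict

# Hypothetical sales facts with three dimensions: product, region, month.
sales = [
    {"product": "bolts", "region": "East", "month": "Jan", "units": 100},
    {"product": "bolts", "region": "West", "month": "Jan", "units": 50},
    {"product": "nuts",  "region": "East", "month": "Jan", "units": 70},
    {"product": "bolts", "region": "East", "month": "Feb", "units": 30},
]

def rollup(facts, *dimensions):
    """Total the units for every combination of the chosen dimensions."""
    totals = defaultdict(int)
    for fact in facts:
        totals[tuple(fact[d] for d in dimensions)] += fact["units"]
    return dict(totals)

by_region = rollup(sales, "region")
by_month = rollup(sales, "month")
by_product_region = rollup(sales, "product", "region")
```

Real OLAP tools precompute and index such summaries so that large volumes can be explored interactively; this sketch only shows the idea of changing the viewpoint.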
Data Mining
analysis of large pools of data to find patterns and rules that can be used to guide decision making and predict future behaviour
Association
occurrences linked to a single event
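A common way to quantify an association is the confidence of a rule: of the purchases (the single event) that include one item, what share also include another? The baskets and the `confidence` helper below are illustrative assumptions.

```python
# Each set is one purchase event (a shopping basket).
baskets = [
    {"chips", "cola"},
    {"chips", "cola", "salsa"},
    {"chips"},
    {"cola"},
    {"bread"},
]

def confidence(baskets, antecedent, consequent):
    """Share of baskets containing `antecedent` that also contain `consequent`."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    return sum(consequent in b for b in with_antecedent) / len(with_antecedent)

chips_then_cola = confidence(baskets, "chips", "cola")
```

In this sample, two of the three baskets with chips also contain cola, so the confidence is 2/3.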
Sequences
events are linked over time
Classification
recognizes patterns that describe the group to which an item belongs by examining existing items that have been classified and by inferring a set of rules
Clustering
works like classification when no groups have yet been defined; a data mining tool can discover different groupings within data, such as finding affinity groups for bank cards or partitioning a database into groups of customers based on demographics and types of personal investments
Forecasting
uses a series of existing values to forecast what other values will be
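One of the simplest ways to turn existing values into an estimate of the next one is a moving average. This is only one of many forecasting techniques; the function name, window size, and sales figures are assumptions for illustration.

```python
def moving_average_forecast(values, window=3):
    """Estimate the next value as the mean of the last `window` values."""
    if window < 1 or len(values) < window:
        raise ValueError("need at least `window` existing values")
    return sum(values[-window:]) / window

monthly_sales = [100, 110, 120, 130]
next_month = moving_average_forecast(monthly_sales)  # mean of 110, 120, 130
```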
Text Mining
discovery of patterns and relationships from large sets of unstructured data
Sentiment Analysis
mining text comments in an email message, blog, social media conversation, or survey form to detect favourable and unfavourable opinions about specific subjects
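A very rough sketch of the idea counts favourable and unfavourable words in a comment. The word lists and the `sentiment` function are hypothetical, and real sentiment analysis tools use far richer language models than this.

```python
import re

# Tiny hand-made lexicons, for illustration only.
POSITIVE = {"great", "love", "fast"}
NEGATIVE = {"slow", "broken", "hate"}

def sentiment(text):
    """Label a comment by comparing counts of positive and negative words."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "favourable"
    if score < 0:
        return "unfavourable"
    return "neutral"

labels = [
    sentiment("I love the fast delivery"),
    sentiment("The app is slow and broken"),
    sentiment("It arrived on Tuesday"),
]
```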
Web Mining
discovery and analysis of useful patterns and information from the World Wide Web
Database Server
a computer in a client/server environment that is responsible for running a DBMS to process SQL statements and perform database management tasks
Data Quality Audit
structured survey of the accuracy and level of completeness of the data in an information system
Data Cleansing (data scrubbing)
consists of activities for detecting and correcting data in a database that are incorrect, incomplete, improperly formatted, or redundant
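The sketch below applies a few typical cleansing steps to hypothetical customer rows: fixing formatting, rejecting incomplete rows, and dropping redundant ones. The `cleanse` function and its rules are assumptions chosen for the example.

```python
def cleanse(rows):
    """Return (clean_rows, rejected_rows) after basic scrubbing."""
    seen_emails, clean, rejected = set(), [], []
    for row in rows:
        # Improperly formatted: normalize whitespace and letter case.
        name = (row.get("name") or "").strip().title()
        email = (row.get("email") or "").strip().lower()
        # Incomplete or incorrect: set aside for manual review.
        if not name or "@" not in email:
            rejected.append(row)
            continue
        # Redundant: keep only the first row for each email address.
        if email in seen_emails:
            continue
        seen_emails.add(email)
        clean.append({"name": name, "email": email})
    return clean, rejected

raw_rows = [
    {"name": " ana silva ", "email": "ANA@EXAMPLE.COM"},
    {"name": "Ana Silva", "email": "ana@example.com"},
    {"name": "Ben", "email": ""},
]
clean_rows, rejected_rows = cleanse(raw_rows)
```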
Data Governance
encompasses policies and procedures through which data can be managed as an organizational resource
Data Administration
responsible for the specific policies and procedures through which data can be managed as an organizational resource
Database Administration
responsible for defining and organizing the structure and content of the database, and maintaining the DB