Data
Raw facts collected from a target group. Data can take various forms, such as numbers and text.
Examples of Data
StudentID, First Name, Last Name, Date of Birth, etc.
Information
What you get after raw data is processed into useful insights (aka processed data).
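A minimal sketch of the difference between data and information, assuming a handful of made-up student records. The names, values, and the `students_per_birth_year` helper are hypothetical and only serve the illustration.

```python
from collections import Counter
from datetime import date

# Raw data: hypothetical student records (all values invented for illustration).
students = [
    {"student_id": 1, "first_name": "Ana", "last_name": "Lee", "date_of_birth": date(2003, 5, 14)},
    {"student_id": 2, "first_name": "Ben", "last_name": "Cruz", "date_of_birth": date(2004, 1, 30)},
    {"student_id": 3, "first_name": "Cara", "last_name": "Diaz", "date_of_birth": date(2003, 11, 2)},
]

def students_per_birth_year(records):
    """Process raw records into information: how many students were born in each year."""
    return Counter(record["date_of_birth"].year for record in records)

# Information: a summary that answers a question the raw rows do not answer directly.
summary = students_per_birth_year(students)
print(dict(summary))  # {2003: 2, 2004: 1}
```

The individual rows are data; the count per birth year is information, because it has been processed into something that supports a decision.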
Structured Data
Data with clearly defined types, which makes it easy to tell valid values from invalid ones.
Examples of Structured Data
Numbers (e.g., product price, product bar code)
Text (e.g., Product Name, Color, etc.)
Dates (e.g., Order Date, Shipment Date, etc.)
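One way to picture why structured data makes valid and invalid values easy to identify is a type check against a schema. This is only a sketch: `ORDER_SCHEMA`, its field names, and `is_valid` are assumptions made up for the example.

```python
from datetime import date

# Hypothetical schema: each field name maps to the type its values must have.
ORDER_SCHEMA = {"product_name": str, "price": float, "order_date": date}

def is_valid(record, schema=ORDER_SCHEMA):
    """Return True when the record has exactly the schema's fields, each with the expected type."""
    if set(record) != set(schema):
        return False
    return all(isinstance(record[field], expected) for field, expected in schema.items())

good = {"product_name": "Desk Lamp", "price": 19.99, "order_date": date(2024, 3, 1)}
bad = {"product_name": "Desk Lamp", "price": "cheap", "order_date": date(2024, 3, 1)}

print(is_valid(good))  # True
print(is_valid(bad))   # False: "cheap" is text, not a number
```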
Unstructured Data
A type of data that has only loosely defined valid and invalid values. Tends to be very complex and difficult to analyze.
Examples of Unstructured Data
Images
Audio
Videos
Documents
Metadata
Descriptions of the properties or characteristics of the actual data, including the data type of the data, file sizes, allowable values, and data context.
In simple terms, it describes the user data.
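As a rough illustration, SQLite (through Python's built-in `sqlite3` module) lets you look at user data and metadata side by side. The `student` table and its columns are invented for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute(
    "CREATE TABLE student ("
    "student_id INTEGER PRIMARY KEY, first_name TEXT NOT NULL, date_of_birth TEXT)"
)
conn.execute("INSERT INTO student VALUES (1, 'Ana', '2003-05-14')")

# User data: the actual values stored in the table.
rows = conn.execute("SELECT * FROM student").fetchall()

# Metadata: PRAGMA table_info returns one row per column, in the form
# (cid, name, type, notnull, dflt_value, pk).
columns = conn.execute("PRAGMA table_info(student)").fetchall()
column_types = {name: col_type for _, name, col_type, *_ in columns}

print(rows)          # [(1, 'Ana', '2003-05-14')]
print(column_types)  # {'student_id': 'INTEGER', 'first_name': 'TEXT', 'date_of_birth': 'TEXT'}
```

The first result is the data itself; the second describes the data (column names and data types), which is what metadata means here.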
Database
A collection of data that is specially structured for storage and retrieval.
Database Management System (DBMS)
A system/program/application used to create, process, and administer databases.
How many databases can a DBMS hold?
A single DBMS can store and host multiple databases.
What are the different types of DBMS?
1) Relational
2) Object-orientated
3) NoSQL
4) Single-User/Multi-User
Relational DBMS
Most commonly used for structured data. Data is stored in tables with rows and columns.
Object-orientated DBMS
This DBMS stores data in the form of objects, similar to how they are used in object-orientated programming (think of RStudio, where we store vectors in objects).
NoSQL DBMS
This type of DBMS is designed to handle unstructured or semi-structured data and provides more flexibility than a traditional relational DBMS.
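To get a feel for the flexibility, here is a small sketch of the document style that many NoSQL systems use, simulated with plain JSON text. It is not a real NoSQL DBMS; the documents, the `_id` field, and the `find` helper are hypothetical.

```python
import json

# Hypothetical product "documents": each one may carry different fields,
# which a fixed table with rows and columns would not allow as easily.
documents = [
    {"_id": 1, "name": "Desk Lamp", "color": "black"},
    {"_id": 2, "name": "E-book", "file_format": "EPUB", "tags": ["fiction", "digital"]},
]

# Store each document as JSON text, the way a document store might.
stored = [json.dumps(doc) for doc in documents]

def find(collection, field, value):
    """Return decoded documents whose `field` equals `value`; documents without the field are skipped."""
    decoded = (json.loads(text) for text in collection)
    return [doc for doc in decoded if field in doc and doc[field] == value]

matches = find(stored, "color", "black")
print([doc["name"] for doc in matches])  # ['Desk Lamp']
```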
Single User DBMS
This type of DBMS can only be used by ONE user at a time.
Good for smaller databases.
Some examples are Microsoft Access and LibreOffice Base
Multi-User DBMS
This type of DBMS can be used by multiple users.
Good for larger databases that need to be accessed by multiple people.
Some examples are Microsoft SQL Server, IBM DB2, Oracle, and MySQL
What were the two systems used BEFORE DBMS?
1) Manual File Systems
2) Computerized file systems
Manual File Systems
A type of data management that relied on file folders and filing cabinets. Data was usually kept on paper and organized by folder.
Computerized file systems
Data processing (DP) specialists created computer-based systems to track data and produce required reports.
What are some disadvantages of file processing?
1) Program data dependence
2) No standards for defining and manipulating data
3) No centralized control of data
What problems result from program-data dependence?
Data redundancy and inconsistency
Lengthy development times
Excessive program maintenance
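A small sketch of why program-data dependence leads to extra maintenance: the program below hard-codes a fixed-width file layout, so a change to the file silently breaks it. The layout, the records, and `read_name_v1` are all made up for illustration.

```python
# Hypothetical fixed-width layout, hard-coded inside the program:
# characters 0-4 hold the student ID, characters 5-14 hold the first name.
def read_name_v1(line):
    """Extract the first name, assuming the ID field is exactly 5 characters wide."""
    return line[5:15].strip()

old_record = f"{42:05d}{'Ana':<10}"  # "00042" followed by "Ana" padded to 10 characters

# The file layout changes: the ID field grows from 5 to 8 characters.
new_record = f"{42:08d}{'Ana':<10}"

print(read_name_v1(old_record))  # Ana
print(read_name_v1(new_record))  # 042Ana (wrong: the program still assumes the old layout)
```

Every program that reads the file has to be found and rewritten whenever the layout changes, which is where the lengthy development times and excessive maintenance come from.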
What problems result from having no centralized control of data?
Limited data sharing and lack of data security
What are some advantages of the Database Approach?
1) Program-data independence
2) SQL (Structured Query Language)
3) Centralized Control of Data
SQL (Structured Query Language)
A standard language to define and manipulate data
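A brief sketch of both halves of that definition, run through Python's built-in `sqlite3` module so it is self-contained. The `product` table and its rows are invented for the example, and other DBMS products may differ in SQL details.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# Define data: CREATE TABLE describes the structure.
conn.execute(
    "CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT NOT NULL, price REAL)"
)

# Manipulate data: INSERT, UPDATE, and DELETE change the contents.
conn.execute("INSERT INTO product (name, price) VALUES (?, ?)", ("Desk Lamp", 19.99))
conn.execute("INSERT INTO product (name, price) VALUES (?, ?)", ("Notebook", 3.50))
conn.execute("UPDATE product SET price = ? WHERE name = ?", (17.99, "Desk Lamp"))
conn.execute("DELETE FROM product WHERE name = ?", ("Notebook",))

# SELECT retrieves whatever is left.
remaining = conn.execute("SELECT name, price FROM product").fetchall()
print(remaining)  # [('Desk Lamp', 17.99)]
```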
What are the steps to create a relational database?
1) Create a database model
2) Create your tables
3) Specify relationships between tables
4) Insert data
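The four steps can be sketched with SQLite through Python's built-in `sqlite3` module. The model (one customer places many orders), the table names, and the rows are assumptions made up for this example.

```python
import sqlite3

# Step 1: the database model, written here only as a comment (hypothetical):
#   one customer places many orders.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when this is enabled

# Step 2: create the tables.
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# Step 3: specify the relationship with a foreign key.
conn.execute("""
    CREATE TABLE customer_order (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        order_date TEXT NOT NULL
    )
""")

# Step 4: insert data.
conn.execute("INSERT INTO customer VALUES (1, 'Ana Lee')")
conn.execute("INSERT INTO customer_order VALUES (100, 1, '2024-03-01')")

def try_orphan_order():
    """Try to insert an order for a customer that does not exist; return True if it was accepted."""
    try:
        conn.execute("INSERT INTO customer_order VALUES (101, 999, '2024-03-02')")
    except sqlite3.IntegrityError:
        return False
    return True

orphan_accepted = try_orphan_order()
orders = conn.execute(
    "SELECT c.name, o.order_date FROM customer c "
    "JOIN customer_order o ON o.customer_id = c.customer_id"
).fetchall()

print(orphan_accepted)  # False: the relationship blocks the invalid row
print(orders)           # [('Ana Lee', '2024-03-01')]
```

Declaring the relationship in step 3 is what lets the DBMS reject an order that points at a missing customer.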
Model
A simplification of reality