Quiz #2 - Data Management


20 Terms

1

Difficulties in managing data: Scattering

• Data increases exponentially with time and gets scattered throughout organizations

• Collected by many individuals, using different servers, locations, databases & formats

2

Difficulties in managing data: Sources

• Multiple sources of data

Internal Sources: Corporate databases, company documents

Personal Sources: Personal thoughts, opinions, experiences

External Sources: Commercial databases, reports, Web data, sensors

3

Difficulties in managing data: Redundancy

Information systems don’t communicate with each other, resulting in duplicate data

4

Difficulties in managing data: Information Changes

• Data inconsistency: information changes

Ex’s: customers move or change their contact info, companies get bought out, employees turn over

- Various copies of the data may then disagree

5

Difficulties in managing data: Data rot or degradation

• Physical machine issues

– Storage media wears out over time; impacted by temperature, humidity, exposure to light

– Old media is hard to play back once the devices that read it become legacy hardware

• Ex’s: 8-track players; floppy drives

6

Difficulties in managing data: Data security, quality, and integrity

Data is vulnerable: its security, quality, and integrity can easily be jeopardized

7

Data Governance 

  • An approach to managing information across an entire organization.

  • Involves a set of business processes designed to ensure data is handled in a certain way

  • Goal is to make data available, transparent & useful for people authorized to access it

8

Database Management Systems (DBMS)

  • Set of programs that provide users with tools to add, delete, access, and analyze data stored in one location

  • Interface between applications & a database. Ex’s: Oracle, Microsoft SQL Server

9

The things DBMS minimize

  • Data redundancy

    • same data stored in multiple locations

  • Data inconsistency

    • Various copies of the data don’t agree

  • Data isolation

    • applications cannot access data associated with other applications
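
A minimal sketch of how redundancy turns into inconsistency, assuming two hypothetical applications (`billing_app` and `shipping_app`, names and records invented for illustration) that each keep their own copy of the same customers. It only illustrates the problem a DBMS is meant to minimize; it is not how a DBMS works internally.

```python
# Hypothetical data: the same customers stored separately by two applications.
billing_app = {
    "C001": {"name": "Ana Ruiz", "city": "Austin"},
    "C002": {"name": "Li Wei", "city": "Boston"},
}
shipping_app = {
    "C001": {"name": "Ana Ruiz", "city": "Denver"},  # customer moved; only this copy was updated
    "C002": {"name": "Li Wei", "city": "Boston"},
}


def find_inconsistencies(copy_a, copy_b):
    """Return the IDs stored in both copies whose records disagree."""
    shared_ids = copy_a.keys() & copy_b.keys()
    return sorted(cid for cid in shared_ids if copy_a[cid] != copy_b[cid])


# Redundancy: IDs that exist in more than one place.
redundant_ids = sorted(billing_app.keys() & shipping_app.keys())
# Inconsistency: redundant copies that no longer agree.
inconsistent_ids = find_inconsistencies(billing_app, shipping_app)

print("redundant:", redundant_ids)
print("inconsistent:", inconsistent_ids)
```

Storing each customer once, in one shared database, removes the second copy and with it the chance for the copies to drift apart.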

10

The things DBMS maximize

  • Data Security

    • With all the data in one place, there’s a high risk of losing everything all at once

    • DBMS have high security measures to minimize mistakes & deter attacks

  • Data Integrity

    • Data must meet certain constraints

    • Ex: A student’s GPA cannot be negative

    • Ex: no alphabetic characters in a Social Security number

  • Data Independence

    • Applications & data are independent of one another

    • All applications can use the data as it’s not tied to just one system
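
The two integrity examples above can be sketched as constraints that the database itself enforces. This is an illustration only, using Python's built-in `sqlite3` module with an in-memory database; the table layout, column names, and sample values are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE students (
        student_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        gpa        REAL CHECK (gpa >= 0),                -- GPA cannot be negative
        ssn        TEXT CHECK (ssn NOT GLOB '*[^0-9]*')  -- digits only
    )
""")


def try_insert(row):
    """Attempt an insert; return True if the DBMS accepted the row."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO students (name, gpa, ssn) VALUES (?, ?, ?)", row
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok_valid = try_insert(("Jimbo Brown", 3.21, "000000000"))
ok_negative_gpa = try_insert(("Bad GPA", -1.0, "000000000"))
ok_letters_in_ssn = try_insert(("Bad SSN", 3.0, "00000000X"))
row_count = conn.execute("SELECT COUNT(*) FROM students").fetchone()[0]

print(ok_valid, ok_negative_gpa, ok_letters_in_ssn, row_count)
```

Because the rules live in the table definition, every application that writes to the table is held to them.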

11

Database Management System: Hierarchy

  • Flat File Database →

  • Relational Database: Hierarchical Database; RDBMS →

  • NoSQL: Key-Value; Column-Oriented; Document-Oriented; Graph DB
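
As a rough picture of the four NoSQL families listed above, the sketch below mimics the shape of the data each one stores using plain Python structures. No real NoSQL product is involved, all names and values are invented, and the column-oriented case in particular is only a loose approximation.

```python
# Key-value: an opaque value looked up by a single key.
key_value = {"session:42": "user=789546;expires=3600"}

# Document-oriented: self-describing records whose fields may differ.
documents = [
    {"_id": 1, "name": "Jimbo Brown", "gpa": 3.21},
    {"_id": 2, "name": "Dana Lee", "clubs": ["chess", "robotics"]},
]

# Column-oriented: values grouped by column rather than by row.
columns = {"name": ["Jimbo Brown", "Dana Lee"], "gpa": [3.21, 3.85]}

# Graph: nodes plus edges that describe relationships (student -> courses).
graph = {"Jimbo Brown": ["MIS101"], "Dana Lee": ["MIS101", "CS200"]}

# Each shape makes a different kind of question cheap to answer.
session = key_value["session:42"]                               # direct lookup
docs_with_gpa = [d["name"] for d in documents if "gpa" in d]    # optional fields
average_gpa = sum(columns["gpa"]) / len(columns["gpa"])         # scan one column
classmates = sorted(s for s, c in graph.items() if "MIS101" in c)  # follow edges

print(docs_with_gpa, round(average_gpa, 2), classmates)
```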

12

Relational Database Model

  • Most used DB architecture

  • Based on concept of two-dimensional tables, Rows & Columns

  • Data organized into one or more tables

  • Tables related to one another by means of a common field

  • Disadvantage:

    • Large-scale DBs can have many interrelated tables, making the overall design complex and slowing search & access times
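
A small sketch of two tables related through a common field, using Python's built-in `sqlite3` module. The `students` and `permits` tables, their columns, and the sample rows are assumptions for illustration; `student_id` plays the role of the common field.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE permits  (permit_no INTEGER PRIMARY KEY, student_id INTEGER, lot TEXT);
    INSERT INTO students VALUES (789546, 'Jimbo Brown'), (789547, 'Dana Lee');
    INSERT INTO permits  VALUES (1, 789546, 'Lot A'), (2, 789547, 'Lot C');
""")

# The JOIN matches rows from both tables wherever student_id agrees.
rows = conn.execute("""
    SELECT s.name, p.lot
    FROM students AS s
    JOIN permits AS p ON p.student_id = s.student_id
    ORDER BY s.name
""").fetchall()

print(rows)
```

Every additional related table adds another join of this kind, which is where the design complexity and slower access in large databases come from.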

13

Relational Model: Data Hierarchy

  • Field : words describing an item (master data)

    • Ex’s: Student ID, student name, GPA

  • Record: grouping of related fields representing an item (transactional data)

    • Ex: fields grouped together to represent a student

  • Table or Data File: grouping of related records representing an entity

    • Ex: group of records for all students

  • Database: grouping of related tables or files

    • Ex: Students table & courses table
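
The four levels above can be seen in a tiny in-memory database. This is a sketch using Python's built-in `sqlite3` module; the course ID `MIS101` and its title are made-up sample values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # database: a grouping of related tables
conn.executescript("""
    CREATE TABLE students (student_id INTEGER, name TEXT, gpa REAL);
    CREATE TABLE courses  (course_id TEXT, title TEXT);
    INSERT INTO students VALUES (789546, 'Jimbo Brown', 3.21);
    INSERT INTO courses  VALUES ('MIS101', 'Intro to Information Systems');
""")

# Tables: groupings of related records.
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]

cursor = conn.execute("SELECT * FROM students")
fields = [col[0] for col in cursor.description]  # fields: the column names
record = cursor.fetchone()                       # record: one student's fields together

print(tables)
print(fields)
print(record)
```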

14

The data model 

a diagram that represents the entities in the database and their relationships 

15

Data Model Components

  • Entity: person, place or thing about which information is retained

    • Like a table or data file

    • Ex’s: Student, parking permit, class, professor

  • Attribute:  each characteristic of an entity

    • Like a field

    • Ex’s: Student name, id, address

  • Instance: a specific representation of the entity.

    • Like a record

    • Ex:  Jimbo Brown, 789546, 3.21

16

Data Model Identifiers

  • Entities have identifiers, which are attributes that can uniquely specify an instance.

  • These are called:

  • Primary Key (PK)

    • uniquely identifies a record or entity instance

    • Student ID #, email address or social security #

    • Parking Permit #, License plate #

  • Foreign Key (FK)

    • Attribute that carries identifying info but doesn’t uniquely identify a record in its own table.

    • Matches the primary key of a related table, so it uniquely identifies a record there
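
A minimal sketch of both kinds of key, using Python's built-in `sqlite3` module. The `students` and `permits` tables and the sample numbers are assumptions for illustration. Note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
    CREATE TABLE students (
        student_id INTEGER PRIMARY KEY,   -- PK: uniquely identifies a student
        name       TEXT NOT NULL
    );
    CREATE TABLE permits (
        permit_no  INTEGER PRIMARY KEY,   -- PK of the permits table
        student_id INTEGER NOT NULL
                   REFERENCES students(student_id)  -- FK: points at one student
    );
    INSERT INTO students VALUES (789546, 'Jimbo Brown');
    INSERT INTO permits  VALUES (1001, 789546);
""")


def insert_rejected(sql, params):
    """Return True if the DBMS refuses the insert on a key violation."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return False
    except sqlite3.IntegrityError:
        return True


# A second student with the same primary key is refused.
duplicate_pk_rejected = insert_rejected(
    "INSERT INTO students VALUES (?, ?)", (789546, "Someone Else")
)
# A permit that refers to a student who does not exist is refused.
orphan_fk_rejected = insert_rejected(
    "INSERT INTO permits VALUES (?, ?)", (1002, 999999)
)

print(duplicate_pk_rejected, orphan_fk_rejected)
```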

17

Retrieving information

  • Retrieving Data is the most common DB operation

  • Structured Query Language (SQL) - allows users to perform complicated searches by using relatively simple statements or keywords.

  • Typical Keywords

    • SELECT – specify the wanted attributes

    • FROM – specify the table to be used

    • WHERE – specify conditions to apply in the query
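
The three keywords above combine into a single query. The sketch below runs one through Python's built-in `sqlite3` module; the `students` table and its rows are sample data invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT, gpa REAL);
    INSERT INTO students VALUES
        (789546, 'Jimbo Brown', 3.21),
        (789547, 'Dana Lee',    3.85),
        (789548, 'Sam Ortiz',   2.40);
""")

# SELECT picks the attributes, FROM names the table, WHERE filters the rows.
honor_roll = conn.execute(
    "SELECT name, gpa FROM students WHERE gpa >= ? ORDER BY gpa DESC",
    (3.0,),
).fetchall()

print(honor_roll)
```

`ORDER BY` is not one of the three keywords listed; it is added here only so the result order is predictable.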

18

Big Data

  • Diverse & high-volume set of information that requires new forms of processing to enable enhanced capabilities of information systems like:

    • decision making, insight discovery, and process optimization.

  • Can be utilized in a reasonable amount of time only by sophisticated information systems

  • Consists largely of unstructured data:

    • Doesn’t fit into the rows & columns of a table the way traditional, structured data fits into relational databases

19

Characteristics of Big Data

  • Volume: Creates data management problems, but also means it’s incredibly valuable

    • Ex: airplane engine creates 10 TB in 30 minutes; 25,000 flights/day

  • Velocity: The rate at which data flows in is rapidly increasing

    • Ex: The Internet connects customers fast; sites capture your clicks & recommend interests to you, generating data fast

  • Variety: Data is nontraditional & can come in many different unstructured formats:

    • Ex: satellite imagery, audio streams, digital music files, web content, documents, comments by users
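
To get a feel for the Volume example, here is a back-of-the-envelope calculation. The 10 TB per 30 minutes and 25,000 flights/day figures come from the card above; one engine per flight and only 30 minutes of recorded data per flight are simplifying assumptions, so the result is a deliberately low estimate, not a real measurement.

```python
TB_PER_ENGINE_PER_30_MIN = 10  # from the example above
FLIGHTS_PER_DAY = 25_000       # from the example above
ENGINES_PER_FLIGHT = 1         # assumption: count a single engine per flight
HALF_HOURS_PER_FLIGHT = 1      # assumption: only 30 minutes of data per flight

tb_per_day = (
    TB_PER_ENGINE_PER_30_MIN
    * FLIGHTS_PER_DAY
    * ENGINES_PER_FLIGHT
    * HALF_HOURS_PER_FLIGHT
)
pb_per_day = tb_per_day / 1000  # decimal units: 1 PB = 1000 TB

print(f"{tb_per_day:,} TB/day = {pb_per_day:,.0f} PB/day")
```

Raising either assumption (more engines, longer flights) multiplies the total accordingly.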

20

Managing Big Data

  • Drivers of Big Data:

    • Cloud Computing for powerful and scalable IT resources

    • Open-source software, which makes Big Data affordable for most organizations to process

  • NoSQL databases are used instead of relational DBs because they can process both unstructured & structured data

    • Ex’s: Neo4j, MongoDB, CouchDB