_____ are the raw facts that our systems and processes generate and collect
Data
____ is data that has been processed, analyzed and put into context
information
______ ____ is data that is organized into well defined tables
structured data
_____ _____ is data that is not easily organized into tables
unstructured data
where does most unstructured data come from?
machine data
what kind of data is organized in rows and columns.
tabular data
each column in a table is called a ____
field
each row in a table is called a ____
record
____ _____ provide a simple way to store data values indexed by a key value
key value pair
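A quick sketch of key value pairs using a Python dictionary. The `inventory` data is made up for illustration.

```python
# Hypothetical sample data: each key maps to one stored value.
inventory = {"apples": 12, "pears": 7}

# Look a value up by its key.
apple_count = inventory["apples"]

# Add a new pair, then overwrite the value stored under an existing key.
inventory["plums"] = 3
inventory["pears"] = 9

# A missing key can fall back to a default instead of raising KeyError.
missing = inventory.get("kiwis", 0)
```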
______ data type is used to store quantitative data
numeric
_______ values are whole numbers that do not have decimal places. they may be positive or negative
integer
____ values contain decimal places and are used for exact values
decimal
_____ _____ have decimals but use less storage and store approximate values
floating point
_____ values provide a better option than floating point data types for storing information about money
currency
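A small sketch of why floating point is a poor fit for money. It uses Python's built-in `float` and the standard `decimal` module as stand-ins for a database's floating point and decimal/currency types; the amounts are made up.

```python
from decimal import Decimal

# Floating point stores binary approximations, so small errors creep in.
float_total = 0.1 + 0.2
float_is_exact = (float_total == 0.3)

# Decimal stores exact base-10 values, which is what money needs.
decimal_total = Decimal("0.10") + Decimal("0.20")
decimal_is_exact = (decimal_total == Decimal("0.30"))
```

Here `float_is_exact` ends up `False` and `decimal_is_exact` ends up `True`.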
a _____ value is either true or false
boolean
booleans are also known as ____
flags
we can store text data in character _____
strings
_______ encoding transforms alphanumeric characters into binary form
ASCII
________ provides extended character sets, including non-English characters
UNICODE
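A rough sketch of character encoding in Python. UTF-8 is used here as one common way to encode Unicode text (an assumption, since the cards do not name a specific Unicode encoding), and the sample strings are made up.

```python
# ASCII maps each character to a number, which is stored in binary.
letter = "A"
code_point = ord(letter)                  # 65
binary_form = format(code_point, "08b")   # '01000001'

# Plain English text fits in ASCII, one byte per character.
ascii_bytes = "data".encode("ascii")

# A non-English character needs Unicode; ASCII cannot represent it.
word = "café"
utf8_bytes = word.encode("utf-8")         # the accented letter takes two bytes
try:
    word.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
```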
which data type is fixed length and uses ASCII
CHAR
which data type has a variable length and uses ASCII
VARCHAR
which datatype is fixed length and uses UNICODE
NCHAR
a __________ is capable of storing up to 4 gigabytes of variable length character data in one field
CLOB
______ data takes on values from a limited set of possibilities. it may be numeric or text based
discrete data
_____ data takes on possible values from a range. the data is numeric
continuous
______ data is text data that is discrete, grouping items into a limited number of categories
categorical
_____ data types store values independent of display format
date
_____ data type stores values with an optional time zone
time
_____ data type stores a date and time value in a timestamp
datetime
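A short sketch of the date, time, and datetime ideas using Python's standard `datetime` module as a stand-in for database types. The specific date and time are made up.

```python
from datetime import date, time, datetime, timezone

# A date value is stored independent of how it is displayed.
d = date(2024, 3, 15)
iso_form = d.isoformat()            # '2024-03-15'
us_form = d.strftime("%m/%d/%Y")    # '03/15/2024'

# A time value can carry an optional time zone.
t = time(14, 30, tzinfo=timezone.utc)

# A datetime combines both into a single timestamp.
stamp = datetime.combine(d, t)
stamp_text = stamp.isoformat()      # '2024-03-15T14:30:00+00:00'
```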
multimedia and other large files are stored as pointers using a data type called _____
binary large object (BLOB)
_____ data types store geometric and geographic information
spatial
_______ is data that is used to describe 2D and 3D shapes and objects
Geometric
______ data represents locations on the Earth's surface
geographic
_____ is a Microsoft implementation of a unique identifier
GUID
____ is the current standard for implementing a unique identifier
UUID
both GUID and UUID are ______ bit hexadecimal IDs
128
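A minimal sketch that generates a UUID with Python's standard `uuid` module and checks its size. `uuid4` (a random UUID) is chosen only for illustration.

```python
import uuid

# Generate a random (version 4) UUID.
new_id = uuid.uuid4()

# 16 bytes * 8 bits = 128 bits.
bit_length = len(new_id.bytes) * 8

# The same value written as 32 hexadecimal digits.
hex_digits = new_id.hex

# The usual text form adds four hyphens: 8-4-4-4-12.
text_form = str(new_id)
```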
what file format has value separated by commas
CSV
a TSV file is similar to a CSV but uses ____ instead of commas
tabs
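A small sketch showing that CSV and TSV differ only in the separator, using Python's standard `csv` module. The sample rows are made up.

```python
import csv
import io

# The same hypothetical table, once comma separated and once tab separated.
csv_text = "name,score\nAda,90\nLin,85\n"
tsv_text = "name\tscore\nAda\t90\nLin\t85\n"

# csv.reader defaults to commas; pass delimiter="\t" for TSV.
csv_rows = list(csv.reader(io.StringIO(csv_text)))
tsv_rows = list(csv.reader(io.StringIO(tsv_text), delimiter="\t"))
```

Both readers produce the same list of rows, with every value read as a string.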
______ allows for data structures with key value pairs
JSON
what are the data types you'll find in JSON?
strings
numbers
booleans
arrays
objects
null
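A quick sketch of JSON's key value structure and data types, parsed with Python's standard `json` module. The `raw` document is made up.

```python
import json

# Hypothetical JSON text with a string, a number, a boolean, and an array.
raw = '{"name": "Ada", "age": 36, "active": true, "tags": ["admin", "dev"]}'

# json.loads turns the text into a Python dict of key value pairs.
record = json.loads(raw)

# Show which Python type each JSON value was mapped to.
types = {key: type(value).__name__ for key, value in record.items()}
```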
what language uses tags and is used for formatting web pages
HTML
which language uses tags and key value pairs
XML
____ ____ store data in a computer readable format
binary files
a .DAT file is an example of a ___ ____
binary file
database _____ are special purpose database fields that help organize tables and define relationships
keys
the ______ key uniquely identifies rows in a table
primary key
_____ are rules that are enforced by the database
constraints
_____ keys define the relationships between two tables
foreign keys
A set of rules that Access uses to ensure that the data between related tables is valid.
referential integrity
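A minimal sketch of primary keys, foreign keys, and referential integrity, using SQLite through Python's standard `sqlite3` module rather than Access. The `customers` and `orders` tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Each table has a primary key; orders.customer_id is a foreign key.
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customers(id))"
)

conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Ada')")
conn.execute("INSERT INTO orders (id, customer_id) VALUES (10, 1)")

# Customer 99 does not exist, so the database rejects this row.
try:
    conn.execute("INSERT INTO orders (id, customer_id) VALUES (11, 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```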
an _______ database supports day to day business operations
OLTP
online transaction processing
Manipulation of information to create business intelligence in support of strategic decision making
OLAP
online analytical processing
_____ _____ occur when data is isolated in a department
data silos
a data ______ is a company wide database that aggregates data from many OLTP databases
data warehouse
a data ____ is a subset of a data warehouse serving a specific part of the organization/department
data mart
a data ___ stores data in its raw, native format and is complex to use
lake
a ______ combines elements of data lakes and data warehouses
lakehouse
Developed to handle large sets of data that are not easily organized into tables, columns, and rows
non relational database
______ databases are excellent at modeling relationships between objects
graph databases
_____ _____ are optimized for the storage of large documents in JSON, XML, and similar formats
document stores
non relational databases gain efficiency by reducing _____
overhead
_____ organize data
schema
the _____ schema contains a fact table at its center with supplemental dimension tables surrounding it, and it is used for OLAP
star
_____ tables act as intermediary tables to model many to many relationships
bridge
the _______ schema uses multiple levels of dimension tables and it is used for OLAP
snowflake
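A rough sketch of a star schema: one fact table in the center that points at two dimension tables. It uses SQLite through Python's standard `sqlite3` module, and every table name, column name, and value is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables describe the "who/what/where" of each sale.
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE dim_store (store_id INTEGER PRIMARY KEY, city TEXT);

-- The fact table holds the measurements plus keys into each dimension.
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    store_id INTEGER REFERENCES dim_store(store_id),
    amount INTEGER
);

INSERT INTO dim_product VALUES (1, 'books'), (2, 'games');
INSERT INTO dim_store VALUES (1, 'Oslo'), (2, 'Lima');
INSERT INTO fact_sales VALUES (1, 1, 20), (1, 2, 30), (2, 1, 50);
""")

# A typical OLAP style question: total sales per product category.
totals = dict(conn.execute("""
    SELECT p.category, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.category
"""))
```

A snowflake schema would split a dimension further, for example by moving `category` into its own table that `dim_product` references.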
________ ______ is the practice of using a network of remote servers hosted on the Internet to store, manage, and process data, rather than a local server or a personal computer.
Cloud computing
the cloud is _____ meaning you can increase capacity with increased demand
scalable
______ scaling is when you add more servers to the pool to meet increased demand
horizontal scaling
______ scaling is when you add more resources (CPU, memory, etc...) to existing servers to meet increased demand
vertical scaling
in cloud computing, ____ means you can expand and contract your server resources at will
elasticity
most cloud services are ____ _____, meaning you only pay for what you use
measured services
in a _____ cloud an organization uses a dedicated cloud infrastructure
a company may build and run its own cloud or pay a company to run a cloud dedicated to that one company
private cloud
in a _____ cloud an organization uses a cloud provider and shares the server resources with other companies
public cloud
in a _____ cloud an organization uses both private and public cloud models
hybrid cloud
when using a public cloud model, security of the cloud is typically handled using a ____ _____ model, meaning that the cloud provider assumes some of the responsibility and the cloud user assumes the rest.
shared responsibility
in a _____ cloud it is similar to a public cloud but the resources are only shared with other similar businesses
community cloud
a type ____ hypervisor is one that is directly installed on top of the hardware aka bare metal hypervisor
type 1
a type _____ hypervisor is one that is installed on top of an existing OS
type 2
when practicing virtualization security you must make sure to ______ each VM and ensure that each server only has access to its own memory and storage
isolate
virtualization platform must be ______ against security vulnerabilities
patched
creating VMs is extremely easy, but they need to be managed to avoid VM ______, which results in unused and unmaintained servers
sprawl
______ are a lightweight alternative to virtualized servers.
containers
containers run inside of a _____ ____
containerization platform
_____ storage allocates a large chunk of storage for access as a disk volume
block storage
_____ storage stores files and individual objects managed by the cloud service provider
object storage
_____ storage provides shared hierarchical storage
file storage
which is more expensive, block or object storage?
block storage
you only pay for what you use for which two types of storage?
object and file storage
block storage is allocated and paid for in ______ ______ blocks
drive sized
with _____ storage you can choose the type of drive your files are stored on
EX. SSD or HDD
block
what is the most widely used data analytics software?
spreadsheets
Microsoft Excel
_______ _____ allow skilled developers to write their own software
programming languages
_____ is a popular programming language dedicated to analytics
R
the R programming language is free, open source and simplifies data analysis using _____
tidyverse
What IDE do most R developers use?
RStudio
what is the most popular general purpose programming language?
Python
what python library is used for data analysis
pandas
which programming language is used for big data?
Scala
_____ editors are basic programs that you can use to write code
they offer syntax highlighting but can't run the code
text editors
_____ allow you to write and execute code in cells and immediately see the results
notebooks
an _____ ____ ___ is a dedicated programming environment with a full suite of tools
integrated development environment