Database
Structured systems for storing, retrieving and managing data
Come in various types, e.g. relational and NoSQL databases
Raw data is often stored in databases before being processed
Data is organised in a structured format using data models, which define how data is related and stored
Secure storage and access control of sensitive data is handled within databases
Why invented
Previously, data was managed by file-processing applications, which were hard to implement
Databases provide standardisation of concepts
Types
Key-Value Store
Simplest NoSQL databases
Every item in the database is stored as an attribute name (or 'key') and value
Store data in a schema-less way
Store data as maps:
HashMaps or associative arrays
Provide very efficient average-case access to data (typically constant-time lookup by key)
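Example (sketch only): the key-value idea above can be illustrated with a small in-memory store. The class name KeyValueStore and its put/get/delete methods are hypothetical, chosen for illustration; Python's built-in dict plays the role of the hash map.

```python
class KeyValueStore:
    """Hypothetical in-memory key-value store backed by a hash map (dict)."""

    def __init__(self):
        self._data = {}  # key -> value; no schema is imposed on values

    def put(self, key, value):
        # Insert or overwrite; average-case constant time for a hash map
        self._data[key] = value

    def get(self, key, default=None):
        # Lookup by key; returns `default` when the key is absent
        return self._data.get(key, default)

    def delete(self, key):
        # Remove the key if present; ignore it otherwise
        self._data.pop(key, None)


store = KeyValueStore()
store.put("user:1", {"name": "Ada"})  # value can be any object
store.put("counter", 42)
store.delete("user:1")
```

Values are opaque to the store: it only looks things up by key and never inspects or queries the contents of a value.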
Document Databases
Pair each key with a complex data structure known as a document
The central concept is the notion of a "document", corresponding to a row/tuple in an RDBMS.
Documents:
Loosely structured sets of key/value pairs in documents, e.g., XML, JSON, BSON
Encapsulate and encode data in some standard formats or encodings
Treated as a whole, avoiding splitting a document into its constituent key/value pairs
Documents are retrieved via a unique key that identifies the document, or via an API or query language that retrieves documents based on their contents
Documents are schema-free, i.e., different documents can have structures and schema that differ from one another (an RDBMS requires that each row contain the same columns)
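Example (sketch only): a toy document store, assuming JSON as the encoding. The names DocumentStore, insert, get and find are hypothetical and not taken from any real product. It shows retrieval by unique key, retrieval by content, and documents whose structures differ.

```python
import json


class DocumentStore:
    """Hypothetical document store: each key maps to one JSON-encoded document."""

    def __init__(self):
        self._docs = {}  # doc_id -> JSON string (the document is kept whole)

    def insert(self, doc_id, document):
        # Encode the whole document in a standard format (JSON here)
        self._docs[doc_id] = json.dumps(document)

    def get(self, doc_id):
        # Retrieval via the unique key that identifies the document
        raw = self._docs.get(doc_id)
        return None if raw is None else json.loads(raw)

    def find(self, **criteria):
        # Retrieval based on contents: match top-level fields exactly
        matches = []
        for raw in self._docs.values():
            doc = json.loads(raw)
            if all(doc.get(field) == value for field, value in criteria.items()):
                matches.append(doc)
        return matches


docs = DocumentStore()
# Documents need not share a structure
docs.insert("p1", {"name": "Ada", "city": "London"})
docs.insert("p2", {"name": "Alan", "city": "London", "languages": ["en"]})
docs.insert("p3", {"title": "Invoice", "total": 12.5})
```

The linear scan in find is only for illustration; real document databases typically use indexes for content-based queries.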
Column Family Stores
Optimised for queries over large datasets; store columns of data together, sorted by row-key
Data stored in a column-oriented way
Focused on storing data efficiently, as needed for very large data
Avoids consuming space for storing nulls
Values indexed by row-key, column-key and timestamp
Related columns are grouped in column families
Ordered based on row-key
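Example (sketch only): the indexing scheme above (row-key, column family, column-key, timestamp) can be modelled with nested maps. ColumnFamilyStore and its methods are hypothetical names; real column family stores use on-disk sorted structures rather than Python dicts.

```python
from collections import defaultdict


class ColumnFamilyStore:
    """Hypothetical sparse table: row_key -> family -> column -> versions."""

    def __init__(self):
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row_key, family, column, value, timestamp):
        # Only cells that are written exist, so nulls take up no space
        versions = self._rows[row_key][family].setdefault(column, [])
        versions.append((timestamp, value))
        versions.sort(key=lambda pair: pair[0])  # keep versions ordered by timestamp

    def get(self, row_key, family, column, timestamp=None):
        # Return the newest value, or the newest one at or before `timestamp`
        versions = self._rows.get(row_key, {}).get(family, {}).get(column, [])
        visible = [v for ts, v in versions if timestamp is None or ts <= timestamp]
        return visible[-1] if visible else None

    def scan(self):
        # Rows are returned in row-key order
        return sorted(self._rows)


table = ColumnFamilyStore()
table.put("user#2", "profile", "name", "Alan", timestamp=1)
table.put("user#1", "profile", "name", "Ada", timestamp=1)
table.put("user#1", "profile", "name", "Ada L.", timestamp=2)
table.put("user#1", "activity", "last_login", "2024-01-01", timestamp=3)
```

Here "profile" and "activity" stand in for column families: related columns sit together, and a row that never sets a column simply has no entry for it.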
Graph Databases
Used to store information about data networks, such as social connections
Graph-oriented
Everything is stored as an edge, a node or an attribute
Each node and edge can have any number of attributes
Both the nodes and edges can be labelled
Labels can be used to narrow searches
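Example (sketch only): a minimal labelled property graph. The Graph class and the labels used ("Person", "Company", "FRIEND", "WORKS_AT") are hypothetical and only illustrate how labels narrow a search.

```python
class Graph:
    """Hypothetical property graph: labelled nodes and edges with attributes."""

    def __init__(self):
        self.nodes = {}  # node_id -> {"label": str, "attrs": dict}
        self.edges = []  # list of (src, dst, label, attrs)

    def add_node(self, node_id, label, **attrs):
        # A node can carry any number of attributes
        self.nodes[node_id] = {"label": label, "attrs": attrs}

    def add_edge(self, src, dst, label, **attrs):
        # Edges are directed, labelled, and can carry attributes too
        self.edges.append((src, dst, label, attrs))

    def neighbours(self, node_id, edge_label=None, node_label=None):
        # Follow outgoing edges, optionally narrowing by edge or node label
        result = []
        for src, dst, label, _attrs in self.edges:
            if src != node_id:
                continue
            if edge_label is not None and label != edge_label:
                continue
            if node_label is not None and self.nodes[dst]["label"] != node_label:
                continue
            result.append(dst)
        return result


g = Graph()
g.add_node("ada", "Person", age=36)
g.add_node("alan", "Person")
g.add_node("acme", "Company")
g.add_edge("ada", "alan", "FRIEND", since=2020)
g.add_edge("ada", "acme", "WORKS_AT")
```

For instance, g.neighbours("ada", edge_label="FRIEND") follows only friendship edges, while node_label="Company" keeps only neighbours that are companies.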