Database

Structured systems for storing, retrieving and managing data

  • Come in various types, e.g. relational databases

  • Raw data is often stored in databases before being processed

  • Data is organised in a structured format using data models, which define how data is related and stored

  • Secure storage and access control of sensitive data is handled within databases

Why invented

Previously, data was handled by file-processing applications, which were hard to implement
Databases standardise the concepts involved

Types

Key-Value Store

  • The simplest type of NoSQL database

  • Every item in the database is stored as an attribute name (or 'key') and value

  • Store data in a schema-less way

  • Store data as maps:

    • HashMaps or associative arrays

    • Provide very efficient access to data by key: constant time, O(1), in the average case
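
As a minimal sketch of the idea above, a key-value store can be modelled with a Python dict, which is itself a hash map. The class name KeyValueStore and its put/get/delete methods are hypothetical names chosen for illustration, not the API of any real product.

```python
class KeyValueStore:
    """Toy in-memory key-value store backed by a dict (a hash map)."""

    def __init__(self):
        self._items = {}

    def put(self, key, value):
        # No schema: any value can be stored under any key.
        self._items[key] = value

    def get(self, key, default=None):
        # Hash lookup: constant time on average.
        return self._items.get(key, default)

    def delete(self, key):
        # Removing a missing key is silently ignored in this sketch.
        self._items.pop(key, None)


store = KeyValueStore()
store.put("user:1", {"name": "Ada"})  # value is an opaque blob to the store
store.put("visits", 42)
```

Note that the store never looks inside a value; it can only find data by its key.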

Document Databases

  • Pair each key with a complex data structure known as a document

  • The central concept is the notion of a "document", corresponding to a row/tuple in an RDBMS.

  • Documents:

    • Loosely structured sets of key/value pairs, e.g., in XML, JSON or BSON

    • Encapsulate and encode data in some standard formats or encodings

    • Treated as a whole, avoiding splitting a document into its constituent key/value pairs

  • Documents are retrieved via a unique key that represents the document, or via an API or query language that retrieves documents based on their contents

  • Documents are schema-free, i.e., different documents can have structures and schema that differ from one another (an RDBMS requires that each row contain the same columns)
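
The points above can be sketched with a toy in-memory document store. This is only an illustration under stated assumptions: DocumentStore, insert, get and find are hypothetical names, JSON is picked as the encoding, and the "query language" is reduced to simple equality matching on top-level fields.

```python
import json


class DocumentStore:
    """Toy document store: each key maps to one JSON-encoded document."""

    def __init__(self):
        self._docs = {}

    def insert(self, doc_id, document):
        # The document is encoded and stored as a whole, not split into pairs.
        self._docs[doc_id] = json.dumps(document)

    def get(self, doc_id):
        # Retrieval via the unique key that represents the document.
        raw = self._docs.get(doc_id)
        return None if raw is None else json.loads(raw)

    def find(self, **criteria):
        # Retrieval based on contents: keep documents whose top-level
        # fields equal every given criterion.
        results = []
        for raw in self._docs.values():
            doc = json.loads(raw)
            if all(doc.get(field) == value for field, value in criteria.items()):
                results.append(doc)
        return results


docs = DocumentStore()
# Schema-free: the three documents below do not share the same fields.
docs.insert("p1", {"name": "Ada", "city": "London"})
docs.insert("p2", {"name": "Alan", "city": "London", "languages": ["en"]})
docs.insert("p3", {"title": "Notes"})
london = docs.find(city="London")
```

Here find scans every document; a real document database would normally use indexes for content-based queries.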

Column Family Stores

  • Optimised for queries over large datasets; store columns of data together, sorted by row-key

  • Data stored in a column-oriented way

    • Focused on storing data efficiently, as needed for very large data

    • Avoids consuming space for storing nulls

    • Values indexed by row-key, column-key and timestamp

    • Related columns are grouped in column families

    • Ordered based on row-key
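
A rough sketch of this data model, assuming nested dicts are an acceptable stand-in for the on-disk layout: ColumnFamilyStore and its put/get/scan methods are hypothetical names, and the row ordering is produced by sorting at scan time rather than by keeping the data physically sorted.

```python
class ColumnFamilyStore:
    """Toy column-family store: row-key -> family -> column -> versions."""

    def __init__(self):
        self._rows = {}

    def put(self, row_key, family, column, value, timestamp):
        # Only cells that are written exist, so no space is spent on nulls.
        versions = (
            self._rows.setdefault(row_key, {})
            .setdefault(family, {})
            .setdefault(column, {})
        )
        versions[timestamp] = value

    def get(self, row_key, family, column, timestamp=None):
        # A value is addressed by row-key, column-key and timestamp.
        versions = self._rows.get(row_key, {}).get(family, {}).get(column)
        if not versions:
            return None
        if timestamp is None:
            return versions[max(versions)]  # latest version
        return versions.get(timestamp)

    def scan(self):
        # Rows are returned ordered by row-key.
        for row_key in sorted(self._rows):
            yield row_key, self._rows[row_key]


table = ColumnFamilyStore()
table.put("user2", "profile", "name", "Alan", timestamp=1)
table.put("user1", "profile", "name", "Ada", timestamp=1)
table.put("user1", "profile", "name", "Ada L.", timestamp=2)
table.put("user1", "contact", "email", "ada@example.org", timestamp=1)
row_order = [row_key for row_key, _ in table.scan()]
```

"profile" and "contact" play the role of column families here: related columns are grouped under each, and user2 simply has no "contact" cells at all.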

Graph Databases

  • Used to store information about data networks, such as social connections

  • Graph-oriented

  • Everything is stored as an edge, a node or an attribute

  • Each node and edge can have any number of attributes

  • Both the nodes and edges can be labelled

  • Labels can be used to narrow searches
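
A small sketch of the node/edge/attribute model described above, assuming directed edges and a single label per node and per edge. The Graph class, its method names and the example labels (Person, FRIEND_OF, WORKS_AT) are all hypothetical and chosen only for illustration.

```python
class Graph:
    """Toy property graph: labelled nodes and edges, each with attributes."""

    def __init__(self):
        self.nodes = {}  # node_id -> {"label": str, "attrs": dict}
        self.edges = []  # (source, label, target, attrs) tuples

    def add_node(self, node_id, label, **attrs):
        # A node can carry any number of attributes.
        self.nodes[node_id] = {"label": label, "attrs": attrs}

    def add_edge(self, source, label, target, **attrs):
        # Edges are directed here and can carry attributes too.
        self.edges.append((source, label, target, attrs))

    def neighbours(self, node_id, edge_label=None):
        # The optional edge label narrows the search.
        return [
            target
            for source, label, target, _ in self.edges
            if source == node_id and (edge_label is None or label == edge_label)
        ]


g = Graph()
g.add_node("ada", "Person", name="Ada")
g.add_node("alan", "Person", name="Alan")
g.add_node("acme", "Company", name="Acme")
g.add_edge("ada", "FRIEND_OF", "alan", since=2020)
g.add_edge("ada", "WORKS_AT", "acme")
friends = g.neighbours("ada", edge_label="FRIEND_OF")
```

Without the label filter, g.neighbours("ada") follows every outgoing edge; with it, only the social connection is returned.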