describe lossy compression
Reducing file size by permanently discarding some of the information in the file (e.g. JPEG/JPG, MP3)
describe lossless compression
Reducing file size without losing any information: an algorithm can reconstruct the original data exactly
Examples of lossless compression
run length encoding
dictionary encoding
describe run length encoding
Runs of repeated data in a file are replaced by one instance of the data plus the number of times it occurs. It is a method of lossless compression. For images, colours may be averaged out first, which makes that step lossy
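A minimal Python sketch of the run length idea described above. The function names `rle_encode` and `rle_decode` are illustrative, not from the cards, and the sketch works on characters of a string rather than pixels or bytes.

```python
from itertools import groupby


def rle_encode(text: str) -> list[tuple[str, int]]:
    """Collapse each run of identical characters into (character, run length)."""
    return [(char, len(list(run))) for char, run in groupby(text)]


def rle_decode(pairs: list[tuple[str, int]]) -> str:
    """Expand (character, run length) pairs back into the original string."""
    return "".join(char * count for char, count in pairs)


encoded = rle_encode("AAAABBBCCD")
print(encoded)              # [('A', 4), ('B', 3), ('C', 2), ('D', 1)]
print(rle_decode(encoded))  # AAAABBBCCD
```

Decoding gives back exactly the input, which is what makes the method lossless.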
describe dictionary compression
Each frequently-used word is given an index equivalent, and its representation is stored in a dictionary, allowing the original data to be restored
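A small Python sketch of dictionary compression, under a simplifying assumption: every word gets an index, not only the frequently-used ones. The names `dict_compress` and `dict_decompress` are made up for this example.

```python
def dict_compress(text: str) -> tuple[list[str], list[int]]:
    """Replace each word with its index in a dictionary built on first appearance."""
    dictionary: list[str] = []
    index_of: dict[str, int] = {}
    encoded: list[int] = []
    for word in text.split(" "):
        if word not in index_of:
            index_of[word] = len(dictionary)
            dictionary.append(word)
        encoded.append(index_of[word])
    return dictionary, encoded


def dict_decompress(dictionary: list[str], encoded: list[int]) -> str:
    """Look every index up in the dictionary to restore the original text."""
    return " ".join(dictionary[i] for i in encoded)


words, codes = dict_compress("the cat sat on the mat the end")
print(words)  # ['the', 'cat', 'sat', 'on', 'mat', 'end']
print(codes)  # [0, 1, 2, 3, 0, 4, 0, 5]
```

The dictionary has to be stored or sent along with the indices, otherwise the original data cannot be restored.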
Use of encryption
scrambles data to keep it secure (third parties cannot read it). Only the intended recipient, who holds the key, is able to decipher it
types of encryption
symmetric and asymmetric
describe symmetric encryption
sender and receiver both have the same private key (shared via key exchange) that is used to both encrypt and decrypt data
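A toy Python illustration of the "same key for both directions" idea. XOR with a repeating key is assumed here only because it is short; it is not a secure cipher, and `xor_cipher` is a hypothetical name.

```python
from itertools import cycle


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """XOR every byte with the repeating key; running it twice undoes it."""
    return bytes(byte ^ key_byte for byte, key_byte in zip(data, cycle(key)))


shared_key = b"shared-key"                            # known to sender and receiver
ciphertext = xor_cipher(b"meet at noon", shared_key)  # sender encrypts
plaintext = xor_cipher(ciphertext, shared_key)        # receiver decrypts
print(plaintext)  # b'meet at noon'
```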
describe asymmetric encryption
there is a public key and a private key that are mathematically related to each other (=key pair). senders encrypt data with the receiver’s public key, and it can then be decrypted only with the receiver’s private key
Use of asymmetric encryption
digital signatures- to prove a message is from you, you can encrypt it with your private key so that anyone can decrypt it with your public key
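A toy sketch of a key pair in Python, using textbook RSA with tiny primes. The numbers (61, 53, 17) and the name `apply_key` are assumptions for illustration; real systems use very large keys, padding, and sign a hash of the message rather than the message itself.

```python
# Toy key pair built from tiny primes (illustrative values only).
p, q = 61, 53
n = p * q                 # modulus, part of both keys
phi = (p - 1) * (q - 1)
e = 17                    # public exponent: (e, n) is the public key
d = pow(e, -1, phi)       # private exponent: (d, n) is the private key (Python 3.8+)


def apply_key(value: int, exponent: int) -> int:
    """Raise value to the exponent modulo n; used for both directions."""
    return pow(value, exponent, n)


message = 65

# Encryption: anyone encrypts with the public key, only the private key decrypts.
ciphertext = apply_key(message, e)
recovered = apply_key(ciphertext, d)

# Signature: the owner applies the private key, anyone checks with the public key.
signature = apply_key(message, d)
signature_is_valid = apply_key(signature, e) == message

print(recovered, signature_is_valid)  # 65 True
```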
Describe hashing
putting an input through an algorithm (a hash function) to get a fixed-size value (=hash). The function is non-invertible: the input cannot be recovered from the hash
-mapping a text/series of numbers to an output
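A short Python sketch of the fixed-size property, assuming SHA-256 from the standard `hashlib` module as the hash function (the cards do not name one). `sha256_hex` is a helper name made up here.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 hash of the text as a hexadecimal string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


short_hash = sha256_hex("a")
long_hash = sha256_hex("a much longer input " * 100)

# Both hashes have the same length, whatever the size of the input.
print(len(short_hash), len(long_hash))  # 64 64
```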
uses of hashing
passwords
hash tables
describe hashing in passwords
the hash values of users’ passwords are stored in the database instead of the passwords themselves, so that hackers only see the hashes and cannot revert them back to the original passwords
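A hedged Python sketch of storing and checking a password hash with the standard library. The random salt and the iteration count are additions that the card does not mention; they are common practice, included here as assumptions. `hash_password` and `check_password` are hypothetical names.

```python
import hashlib
import hmac
import os
from typing import Optional


def hash_password(password: str, salt: Optional[bytes] = None) -> tuple[bytes, bytes]:
    """Return (salt, hash); only these two values would be stored in the database."""
    if salt is None:
        salt = os.urandom(16)  # assumption: a fresh random salt per user
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)
    return salt, digest


def check_password(attempt: str, salt: bytes, stored_digest: bytes) -> bool:
    """Hash the login attempt the same way and compare it with the stored hash."""
    _, digest = hash_password(attempt, salt)
    return hmac.compare_digest(digest, stored_digest)


salt, stored = hash_password("correct horse")
print(check_password("correct horse", salt, stored))  # True
print(check_password("wrong guess", salt, stored))    # False
```

The original password is never stored: logging in works by hashing the attempt and comparing hashes.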
describe hash tables
A data structure that uses a bucket array and a hash function: each item is stored in the bucket given by the hash value of its key. Data can be looked up in constant time on average. Collisions (different keys with the same hash value) can occur, and are resolved by a second hash or by storing a list in the bucket
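A minimal Python sketch of a hash table that resolves collisions with a list in each bucket. The class name `ChainedHashTable` and the default of 8 buckets are assumptions; Python's built-in `hash` stands in for the hash function.

```python
class ChainedHashTable:
    """Bucket array where each bucket is a list of (key, value) pairs."""

    def __init__(self, bucket_count: int = 8) -> None:
        self.buckets = [[] for _ in range(bucket_count)]

    def _bucket(self, key):
        # The hash value, reduced to a valid index, picks the bucket.
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value) -> None:
        bucket = self._bucket(key)
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # key already present: overwrite
                return
        bucket.append((key, value))       # new key, or a collision: extend the list

    def get(self, key, default=None):
        for existing_key, value in self._bucket(key):
            if existing_key == key:
                return value
        return default


table = ChainedHashTable()
table.put("AB12 CDE", "Ada")
print(table.get("AB12 CDE"))  # Ada
```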
properties of a good hash function
hash output is smaller than the input (=searching for the hash is quicker than searching for the key)
low chance of collisions
quick hash calculation
What is an entity
category of item that has data/information stored about it. Entities can be made into records, with their information (attributes) stored in fields
Describe flat-file databases
Table that stores the attributes about one entity in a single file
disadvantages of flat-file databases
creates redundant/ repeated data
what is a primary key
unique identifier for each record in a table
what is a secondary key
A field, other than the primary key, that is indexed to increase the speed of querying data (its values do not need to be unique, and it can be searched on instead of the primary key)
what are relational databases
multiple tables linked to each other by foreign keys (a commonly-shared field which is a primary key in one of the tables)
what are the 3 different relationships in databases
one to one (one record of an entity is linked to exactly one record of another entity)
one to many (one record of an entity is linked to multiple records of another entity, e.g. the same foreign key value appears in many rows of the linked table)
many to many (multiple records of one entity are linked to multiple records of another entity, and vice versa)
how can entity relationships be shown through diagrams
entity relationship modelling (shows the entities involved, the name of the relationship and the degree/ type of relationship)
for many-to-many relationships, a third table (linking table) is required to store 2 sets of foreign keys (reduce redundancies)
what is a composite primary key
a unique identifier of a table consisting of more than one attribute (e.g. two foreign keys in tables linking many-to-many relationships)
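A sketch of a linking table with a composite primary key, run through Python's built-in `sqlite3` module. The `student`, `course` and `enrolment` tables are invented for the example, and `is_duplicate_rejected` is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE course  (course_id  INTEGER PRIMARY KEY, title TEXT NOT NULL);

    -- Linking table: two foreign keys that together form the primary key.
    CREATE TABLE enrolment (
        student_id INTEGER NOT NULL REFERENCES student(student_id),
        course_id  INTEGER NOT NULL REFERENCES course(course_id),
        PRIMARY KEY (student_id, course_id)
    );

    INSERT INTO student VALUES (1, 'Ada'), (2, 'Alan');
    INSERT INTO course  VALUES (10, 'Maths'), (20, 'Physics');
    INSERT INTO enrolment VALUES (1, 10), (1, 20), (2, 10);
""")


def is_duplicate_rejected(student_id: int, course_id: int) -> bool:
    """Try to insert a pairing; True means the composite key blocked it."""
    try:
        conn.execute("INSERT INTO enrolment VALUES (?, ?)", (student_id, course_id))
    except sqlite3.IntegrityError:
        return True
    return False


print(is_duplicate_rejected(1, 10))  # True: this pairing already exists
print(is_duplicate_rejected(2, 20))  # False: a new pairing is accepted
```

Neither foreign key is unique on its own (student 1 appears twice), but each pair can only appear once.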
describe normalisation
the process of creating the best possible layout for relational databases (+ turning flat-file databases into relational databases)
= no redundancy, consistent data, no issues adding/removing data and being able to perform complex queries
describe different types of normalisation
first normal form (every attribute holds a single, atomic value; field names are unique; the table has a primary key)
second normal form (is in 1NF and has no partial dependencies, where a non-key attribute depends on only part of a composite primary key rather than the whole of it)
third normal form (is in 2NF and has no non-key/transitive dependencies, where the value of one field is determined by another non-key field)
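A small Python sketch of removing one transitive dependency, using invented order data in which `customer_name` is determined by `customer_id` rather than by the key `order_id`. The function name `split_out_customers` is hypothetical.

```python
flat_orders = [
    {"order_id": 1, "customer_id": 7, "customer_name": "Ada", "item": "pen"},
    {"order_id": 2, "customer_id": 7, "customer_name": "Ada", "item": "ink"},
    {"order_id": 3, "customer_id": 9, "customer_name": "Alan", "item": "pad"},
]


def split_out_customers(rows):
    """Move customer details into their own table, keyed by customer_id."""
    customers = {}
    orders = []
    for row in rows:
        customers[row["customer_id"]] = {"customer_name": row["customer_name"]}
        orders.append({
            "order_id": row["order_id"],
            "customer_id": row["customer_id"],  # foreign key into customers
            "item": row["item"],
        })
    return customers, orders


customers, orders = split_out_customers(flat_orders)
print(customers)  # {7: {'customer_name': 'Ada'}, 9: {'customer_name': 'Alan'}}
```

After the split, each customer name is stored once, so changing it cannot leave inconsistent copies behind.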
describe indexing within databases
method of storing the positions of records, ordered by certain attributes/keys, so that they can be found quickly. The primary key is indexed automatically; secondary keys are also indexed because they are likely to be queried
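A sketch of indexing a secondary key with Python's built-in `sqlite3` module. The `car` table and the index name `idx_car_make` are invented, and the exact wording of the query plan depends on the SQLite version, so the example only checks whether the index is mentioned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE car (reg TEXT PRIMARY KEY, make TEXT, speed INTEGER)")

# Index a secondary key that is likely to be queried.
conn.execute("CREATE INDEX idx_car_make ON car (make)")


def query_plan(sql: str, params: tuple = ()) -> str:
    """Return SQLite's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column holds the detail text


plan = query_plan("SELECT * FROM car WHERE make = ?", ("Ford",))
print("idx_car_make" in plan)  # True
```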
different ways of capturing data
user fills in data manually
magnetic ink character recognition (MICR) in bank cheques
optical mark recognition (OMR) in multiple-choice tests
barcode readers
optical character recognition (OCR): scans printed text into files that can be edited
sensors
describe data pre-processing
selecting and managing relevant data before any other processes (like analysis, exchange, etc) are performed on it
e.g. only storing a car’s registration if its speed is above the speed limit (with SQL)
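The speed-camera example could be sketched in Python as a filter applied before anything is stored. The limit of 70 and the sample readings are invented; in SQL the same selection would be a WHERE clause on the speed.

```python
SPEED_LIMIT = 70  # assumed limit for the example


def keep_speeding(readings, limit=SPEED_LIMIT):
    """Keep only the registrations whose recorded speed is above the limit."""
    return [registration for registration, speed in readings if speed > limit]


readings = [("AB12 CDE", 82), ("XY34 ZZZ", 64), ("LM56 NOP", 70), ("QR78 STU", 91)]
print(keep_speeding(readings))  # ['AB12 CDE', 'QR78 STU']
```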
describe the process of exchanging data
transferring the collected and pre-processed data.
e.g. electronic data interchange (EDI), protocol of automatic data exchange between organisations without human interaction
Data is usually transferred as a comma-separated values file (.csv), or as JSON/XML
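A short Python sketch of writing the same records as CSV and as JSON with the standard library. The field names and sample records are invented for the example.

```python
import csv
import io
import json

records = [
    {"reg": "AB12 CDE", "speed": 82},
    {"reg": "QR78 STU", "speed": 91},
]


def to_csv(rows) -> str:
    """Write the records as comma-separated values with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["reg", "speed"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_json(rows) -> str:
    """Write the same records as a JSON array of objects."""
    return json.dumps(rows)


print(to_csv(records))
print(to_json(records))
```

CSV drops the data types (everything becomes text), while JSON keeps numbers as numbers.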
what performs database management
database management systems (DBMS)
-e.g. Oracle, MySQL, Bigtable
describe the join method in sql
allows for the combination of multiple tables based on a common field (select… from table1 join table2 on table1.field1=table2.field2)
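A runnable sketch of a join, using Python's built-in `sqlite3` module. The `owner` and `car` tables and their contents are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE owner (owner_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE car   (reg TEXT PRIMARY KEY, owner_id INTEGER);
    INSERT INTO owner VALUES (1, 'Ada'), (2, 'Alan');
    INSERT INTO car VALUES ('AB12 CDE', 1), ('XY34 ZZZ', 2), ('LM56 NOP', 1);
""")

# Combine the two tables on their common field, owner_id.
rows = conn.execute("""
    SELECT owner.name, car.reg
    FROM owner
    JOIN car ON owner.owner_id = car.owner_id
    ORDER BY car.reg
""").fetchall()

print(rows)  # [('Ada', 'AB12 CDE'), ('Ada', 'LM56 NOP'), ('Alan', 'XY34 ZZZ')]
```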
how to create a table in sql
-’CREATE TABLE’
-specify attributes and their data type, whether it must be filled in (‘not null’), and if it is a primary key
sql attribute data types
char(n)- string with fixed size n
varchar(n)- variable length string up to length n
boolean
integer
float
date - day/month/year
time - hour/minute/second
currency
sql purpose of alter
-ALTER TABLE:
allows table columns to be added (ADD), deleted (DROP), or modified (MODIFY)
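A sketch of CREATE TABLE followed by ALTER TABLE, run through Python's built-in `sqlite3` module. The `member` table is invented. SQLite is assumed here for convenience: it accepts ADD COLUMN but has no MODIFY, so only ADD is shown, and other DBMSs may differ in syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Each attribute gets a data type; NOT NULL means it must be filled in.
conn.execute("""
    CREATE TABLE member (
        member_id INTEGER PRIMARY KEY,
        name      VARCHAR(50) NOT NULL,
        joined    DATE
    )
""")

# Add a new column to the existing table.
conn.execute("ALTER TABLE member ADD COLUMN email VARCHAR(100)")


def column_names(table: str) -> list[str]:
    """List the column names of a table using SQLite's table_info pragma."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


print(column_names("member"))  # ['member_id', 'name', 'joined', 'email']
```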
describe referential integrity
Ensuring data consistency/accuracy across linked tables: a record that is still referenced by a foreign key in another table cannot be removed on its own (e.g. a cascade delete removes the referencing records along with it)
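A sketch of referential integrity being enforced, using Python's built-in `sqlite3` module. The tables are invented, and the `PRAGMA foreign_keys = ON` line is specific to SQLite, which does not check foreign keys by default. `try_delete_owner` is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
conn.executescript("""
    CREATE TABLE owner (owner_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE car (
        reg      TEXT PRIMARY KEY,
        owner_id INTEGER NOT NULL REFERENCES owner(owner_id)
    );
    INSERT INTO owner VALUES (1, 'Ada');
    INSERT INTO car VALUES ('AB12 CDE', 1);
""")


def try_delete_owner(owner_id: int) -> bool:
    """Attempt the delete; False means the DBMS refused to break a link."""
    try:
        conn.execute("DELETE FROM owner WHERE owner_id = ?", (owner_id,))
    except sqlite3.IntegrityError:
        return False
    return True


print(try_delete_owner(1))  # False: a car still refers to this owner
```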
describe transaction processing
Handling transactions: a transaction is one or more operations on data that are executed together as a single unit
describe record locking + disadvantage
Preventing simultaneous access of data in a record to avoid inconsistencies/ loss of updates
-can create deadlock, where 2 users have each locked a record and are each waiting for the other user’s record to become free to use
describe solution to deadlocking
serialisation: makes sure transactions do not overlap in time, so they cannot interfere with each other or lead to lost updates
timestamp ordering: each record has a read timestamp and a write timestamp, updated whenever it is read or written. A transaction notes the read timestamp when it starts; if that timestamp has changed by the time it tries to write, another transaction has used the record, so the transaction is cancelled
commitment ordering: orders transactions based on when they were initiated and the dependencies between them
Describe a use of redundancy
Storing a copy of the data in a different physical location, so that important information is not lost
describe data integrity
maintenance and consistency of data, reflecting the reality it represents
-done using referential integrity
Rules for all DBMS transactions
atomicity : transaction processed entirely or not at all
consistency: a transaction upholds the referential integrity rules between linked tables, so every change leaves the database in a valid state
isolation: running transactions simultaneously gives the same result as running those same transactions one after another, so transactions are not interrupted by other transactions
durability: once a transaction has been processed its effects remain (not lost from system failure), because the DBMS writes them to non-volatile memory
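A sketch of atomicity using Python's built-in `sqlite3` module. The `account` table, the CHECK rule and the `transfer` function are invented for the example; the connection's `with` block commits if every statement succeeds and rolls back if one fails.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account (
        name    TEXT PRIMARY KEY,
        balance INTEGER NOT NULL CHECK (balance >= 0)
    );
    INSERT INTO account VALUES ('alice', 100), ('bob', 50);
""")


def transfer(sender: str, receiver: str, amount: int) -> bool:
    """Move money as one transaction: both updates happen, or neither does."""
    try:
        with conn:  # commit on success, roll back on any exception
            conn.execute("UPDATE account SET balance = balance + ? WHERE name = ?",
                         (amount, receiver))
            conn.execute("UPDATE account SET balance = balance - ? WHERE name = ?",
                         (amount, sender))
    except sqlite3.IntegrityError:
        return False
    return True


def balances() -> dict:
    return dict(conn.execute("SELECT name, balance FROM account"))


print(transfer("alice", "bob", 500))  # False: the debit would break the CHECK rule
print(balances())                     # {'alice': 100, 'bob': 50}
```

The credit to bob ran first, but it was undone when the debit failed, so the database never shows half a transfer.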