Unit 6.2 Data Integrity

Aims

  • Understand what is meant by the term data validation and the different types of data validation that can be used to ensure data is reasonable and sensible.

  • Understand what is meant by the term data verification and the different types of data verification that can be used during data entry and data transfer.

Data Integrity

Data integrity is the accuracy, consistency and reliability of data.

Integrity is important because companies make decisions based on their data.

The General Data Protection Regulation (GDPR) requires companies to keep personal data accurate and up to date.

Two methods, validation and verification, improve integrity, but neither can guarantee that data is 100% accurate.

Data Validation

Checks that data entered is reasonable and sensible. Used when data is entered into a computer system.

E.g. it can check that a DOB is in the correct format, but not that it is in fact your DOB.

Validation | Description | E.g.
Range Check | Data must fall between two boundaries | A teenager's age must be between 13 and 19
Type Check | Data is of the right type, e.g. integer, letter or text | Age accepts only numbers; name accepts only letters
Length Check | Data entered has a certain number of characters | Password must have at least 8 characters
Presence Check | Checks that data has been entered | Password box must be filled in
Format Check | Ensures data is in the correct format/structure | Dates structured as DD/MM/YYYY; email has an @ sign
Existence Check | Checks whether data (or a file name) already exists in another file | Checks that a username is unique
Limit Check | Checks either the upper or the lower limit of the data | Exam mark can't be > 75; a February date can't be > 28 (or 29 in a leap year)
Check Digit | Extra digit, calculated from the other digits, added onto the end of a long number for error checking | Checks that a barcode number has been entered correctly
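The checks above can be sketched as small Python functions. This is only an illustration: the function names, the choice to treat all input as text from a form, and the DD/MM/YYYY pattern are assumptions made for the example.

```python
import re


def presence_check(value: str) -> bool:
    """Presence check: something other than blank space was entered."""
    return value.strip() != ""


def type_check(value: str) -> bool:
    """Type check: the value contains only digits (a whole number)."""
    return value.isdecimal()


def range_check(value: int, low: int, high: int) -> bool:
    """Range check: the value lies between two boundaries (inclusive)."""
    return low <= value <= high


def length_check(value: str, minimum: int) -> bool:
    """Length check: the value has at least `minimum` characters."""
    return len(value) >= minimum


def format_check(value: str) -> bool:
    """Format check: the value looks like DD/MM/YYYY."""
    return re.fullmatch(r"\d{2}/\d{2}/\d{4}", value) is not None


def limit_check(value: int, upper: int) -> bool:
    """Limit check: only the upper limit is tested."""
    return value <= upper


print(range_check(15, 13, 19))      # True: 15 is a teenager's age
print(format_check("31/02/2008"))   # True: right format, even though the date does not exist
```

The last line shows why validation only checks that data is reasonable: 31/02/2008 passes the format check although it is not a real date.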

E.g. checks on an email registration form

Type | How it can be used
Range Check | Day must be between 1 and 31; month must be between 1 and 12
Type Check | First name only accepts letters; day only accepts numbers
Length Check | Password is at least 8 characters; gender is at least 1 character long
Presence Check | Can be applied to all fields
Format Check | DOB must be DD/MM/YYYY; email address must include @
Limit Check | Day can have a max value of 31; month can have a max value of 12
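A rough sketch of how several of these checks might be combined on a registration form. The field names (first_name, day, month, password, email) and the error messages are made up for the example, and every field is assumed to arrive as text.

```python
def validate_registration(form: dict) -> list:
    """Return a list of error messages; an empty list means every check passed."""
    errors = []

    # Presence check: applied to all fields.
    for field in ("first_name", "day", "month", "password", "email"):
        if not form.get(field, "").strip():
            errors.append(f"{field}: required")
    if errors:
        return errors  # no point running the other checks on missing data

    # Type check: first name only accepts letters.
    if not form["first_name"].isalpha():
        errors.append("first_name: letters only")

    # Type check, then range check, on day and month.
    for field, upper in (("day", 31), ("month", 12)):
        value = form[field]
        if not value.isdecimal():
            errors.append(f"{field}: numbers only")
        elif not 1 <= int(value) <= upper:
            errors.append(f"{field}: must be between 1 and {upper}")

    # Length check: password is at least 8 characters.
    if len(form["password"]) < 8:
        errors.append("password: at least 8 characters")

    # Format check: email address must include @.
    if "@" not in form["email"]:
        errors.append("email: must include @")

    return errors


good = {"first_name": "Sam", "day": "14", "month": "02",
        "password": "secret123", "email": "sam@example.com"}
print(validate_registration(good))  # []
```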

Data Verification

Compares one set of data with another to ensure both match, e.g. when it is being entered or transferred (in case of corruption).

Methods used during data transfer include parity checks and checksums.

Two methods used during data entry:

  • Visual Checks - proofreading both copies to check they are the same

  • Double Entry - entering data twice so the two entries can be compared to see if they match, e.g. when setting up a new password
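Double entry can be sketched in a couple of lines. The function name is made up for the example, and a real form would also validate the password itself.

```python
def double_entry_matches(first_entry: str, second_entry: str) -> bool:
    """Double entry verification: the two entries must be identical."""
    return first_entry == second_entry


print(double_entry_matches("Tr33house!", "Tr33house!"))  # True
print(double_entry_matches("Tr33house!", "Tr33hosue!"))  # False: typing slip in the second entry
```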

Parity Checks

A single bit is added to a binary data sequence so that the total number of bits with value 1 is even or odd. This helps detect errors during transmission of data.

Even parity - if the number of 1s is already even, the parity bit is set to 0; if the number of 1s is odd, the parity bit is set to 1 to make the total even.

Odd parity - if the number of 1s is even, the parity bit is set to 1 to make the total odd; if the number of 1s is already odd, the parity bit is set to 0 to keep it odd.

Byte parity adds a single bit to each byte (8 bits), whereas block parity applies the check to a block of data containing many bytes.
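The even and odd parity rules can be tried out with a short sketch. The function names are made up for the example, the bits are held as a text string of 0s and 1s, and placing the parity bit at the end of the sequence is an assumption.

```python
def parity_bit(bits: str, even: bool = True) -> str:
    """Work out the parity bit for a string of 0s and 1s."""
    ones = bits.count("1")
    if even:
        # Even parity: the total number of 1s must end up even.
        return "0" if ones % 2 == 0 else "1"
    # Odd parity: the total number of 1s must end up odd.
    return "1" if ones % 2 == 0 else "0"


def add_parity(bits: str, even: bool = True) -> str:
    """Sender: append the parity bit to the data (assumed position: the end)."""
    return bits + parity_bit(bits, even)


def check_parity(received: str, even: bool = True) -> bool:
    """Receiver: count the 1s, parity bit included, and test even/odd."""
    ones = received.count("1")
    return ones % 2 == 0 if even else ones % 2 == 1


sent = add_parity("1011001")      # four 1s, so the even parity bit is 0
print(sent)                       # 10110010
print(check_parity(sent))         # True: arrived unchanged
print(check_parity("10110011"))   # False: one bit flipped in transit
print(check_parity("10110111"))   # True: two bits flipped, error not detected
```

The last line shows a limit of the method: if two bits are flipped, the count of 1s is still even, so the error goes unnoticed.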

Checksums

A value calculated from a data sequence to check integrity and detect errors that can occur during transmission or storage. There are many methods.

Modulo 10 method

  1. Write out the number.

  2. Starting from the left, give each digit a weighting: the 1st digit has a weighting of 1, the 2nd has a weighting of 3, and the 1, 3 pattern repeats for all digits.

  3. Multiply each digit by its weighting.

  4. Add up all the results.

  5. Divide the total by 10; the digit after the decimal point is the remainder.

  6. Subtract the remainder from 10.

  7. The number left is the checksum.
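The seven steps can be followed in a short sketch. The function names are made up for the example. Turning a result of 10 into 0 is an assumption borrowed from barcode-style schemes such as ISBN-13, since the steps above do not say what to do when the remainder is 0.

```python
def modulo10_check_digit(number: str) -> int:
    """Follow the Modulo 10 steps for a string of digits."""
    weights = (1, 3)  # step 2: weightings repeat 1, 3, 1, 3, ...
    # Steps 3 and 4: multiply each digit by its weighting and add up.
    total = sum(int(digit) * weights[i % 2] for i, digit in enumerate(number))
    remainder = total % 10        # step 5: remainder after dividing by 10
    return (10 - remainder) % 10  # step 6; a result of 10 becomes 0 (assumption)


def is_valid_modulo10(full_number: str) -> bool:
    """Recalculate the check digit and compare it with the last digit given."""
    return modulo10_check_digit(full_number[:-1]) == int(full_number[-1])


# Worked example using the first 12 digits of an ISBN-13:
# the weighted total is 93, the remainder is 3, and 10 - 3 = 7.
print(modulo10_check_digit("978030640615"))  # 7
print(is_valid_modulo10("9780306406157"))    # True
print(is_valid_modulo10("9780306406158"))    # False: last digit mistyped
```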

There is also a Modulo 11 method, but it only works with 7 digits.

E.g. each digit is given a weighting, starting from 7 and going down to 1. Each digit is multiplied by its weighting and the results are added to make a total. The total is divided by 11 and the remainder is subtracted from 11 to give the check digit. If the check digit is 10, X is used.
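A sketch of the Modulo 11 method as described here, for a 7-digit number with weightings 7 down to 1. The function name is made up for the example. Returning 0 when the subtraction gives 11 (a remainder of 0) is an assumption, as the notes only cover the case where the check digit is 10.

```python
def modulo11_check_digit(number: str) -> str:
    """Modulo 11 check digit for a 7-digit number, weightings 7 down to 1."""
    if len(number) != 7 or not number.isdecimal():
        raise ValueError("expected exactly 7 digits")
    weights = range(7, 0, -1)  # 7, 6, 5, 4, 3, 2, 1
    total = sum(int(digit) * weight for digit, weight in zip(number, weights))
    result = 11 - (total % 11)  # subtract the remainder from 11
    if result == 10:
        return "X"              # 10 is written as X
    if result == 11:
        return "0"              # assumption: a remainder of 0 gives check digit 0
    return str(result)


# 1*7 + 2*6 + 3*5 + 4*4 + 5*3 + 6*2 + 7*1 = 84; 84 / 11 leaves remainder 7; 11 - 7 = 4
print(modulo11_check_digit("1234567"))  # 4
print(modulo11_check_digit("0000001"))  # X
```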
