1/169
Looks like no tags are added yet.
Name | Mastery | Learn | Test | Matching | Spaced |
---|
No study sessions yet.
Abstraction
a way of separating logical and physical parts of a problem e.g. the london underground map. You get rid of unnecessary details
Problem abstraction
where you keep removing details until the problem reduces to one that has already been solved
Modelling and Simulation
Building a model of a real world object can be used to solve a particular problem such as Aircraft simulation, Climate change models
Precondition
is the logic part of a statement statement that has to be true for the processing part of the algorithm before it can function
Specifiying a precondition before creating an algorithm ensures
The data is ok to process e.g. if it was empty it could cause the program to crash.
The function is reusable
It reduces unnecessary checks
It makes programs easier to debug and maintain
It makes programs clearer and shorter
Progamming standards that make programs easy to resuse include
Documenting inputs, outputs and preconditions
Variables should use camelCase or PascalCase
All variables should be local to that module
The documentation should contain information such as who made it, what it does and when it was written.
The code should be annotated where necessary
The module should not be greater than one page of code
Caching
the temporary storage of data instructions so that the information can be achieved quicker than performing a calculation e.g. frequently accessed web pages can be stored locally
The advantages pf caching
It's faster to access information that is cached
This saves the cost of bandwidth
It also reduces the load on web services in a client-server environment
Disadvantages of caching
There is slower performance if the result is not found in the cache
Sometimes the cache can be "stale" meaning that it doesn't contain the latest updated data. E.g. When using a cached database on available products you might think an item is still available when it is actually not
Decomposition
This is where a problem is broken down into a number of subproblems so that each subproblem completes a part of the bigger problem
Structured Programming
This is where the aim is to improve the quality of a program by -
Modularization - Breaking the program down into subroutines
Structured Code - The subroutines should use sequence, selection and iteration
Recursion
Top Down Design Model
A program can have many sub-procedures which are called from the main program and these sub-procedures can also have sub-tasks and so we use a hierarchy chart to show the overall program structure.
When following a hierarchy chart we execute from the left to the right
Benefits of modularisation
Large problems are broken down into smaller problems which are easier to manage
Each subroutine (module) can be easily tested
Modules can be reused several times in a program
Frequently used modules can be saved in a library and used by other programs
Lots of programmers can work on different modules at the same time saving time
It is easier to find errors taking less time to debug
Programs are easier to maintain
When creating modules we should
Use meaningful identifiers (for variable names)
Define and document inputs, outputs and preconditions
Add meaningful comments
Make them self contained by passing parameters and using local variables
An Algorithm should
Have clear and precise steps that produce a correct output for any valid input.
Allow for invalid inputs
Terminate at some point
Execute in as few steps as possible
Be understandable by other people so that they can modify it
To help design algorithms we use these solutions
Hierarchy Charts - Used to identify major tasks and breaking them down into sub tasks
Flow Charts - Used to work out individual subroutines
Pseudocode - Used to translate easily into program code
Validation routines
check that a user has entered a sensible value so it can be processed
Concurrent / Parallel Processing
where multiple processors execute instructions simultaneously
Having cpus that have multiple cores means that they can assign different tasks to different cpu cores
Methods of problem solving
Trial and error
Enumeration - listing all possible cases
Simulation
Theoretical approach
Creative solution
Simulation
the process of designing a model of a real system to try different solutions without having to waste money on doing it in real life
Enumeration
a complete, ordered listing of all the items in a collection
Theoretical Approach
having data to predict outcomes (eg knowing the height of a jump and predicting the outcome is safer than trial and error)
Divide and conquer
break the problem into subproblems, recursively solve the subproblems, combine the answers
Exhaustive search
an algorithm that explores all the possibilities and finds the best one
Backtracking
where you go part way along a route and then you backtrack to see if there is a better one
Heuristic Methods
an educated guess
Heuristic methods are used in applications such as
Routing messages across the internet
Transportation
Virus Checking
DNA Analysis
Artificial Intelligence
Data Mining
This is the process of collecting and analysing huge amounts of data. Large organisations such as Apple and Google collect the data and "mine" it to find connections and associations such as target advertisements that are tailored for individual customers based on what they have searched for or talked about
Performance Modelling
This is how we can find out how efficient and algorithm is. One measure of efficiency is the Big-O Notation. This measures the suitability of algorithms based on their execution time and how much storage they take up
Pipelining
This is where multiple instructions are overlapped in execution. Instructions enter the "pipeline" at one end and at each stage, part of the instruction is completed and moves onto the next stage whilst another instruction enters the pipeline
Variable declaration
a name that points to a memory location that also specifies the data type and leaves some space for the memory to increase
Static data structures
Can't grow, shrink, or be freed during execution (array)
Dynamic data structures
Allocates and deallocates memory from the heap (an area of memory especially used for this purpose)
Excessive allocation of memory without deallocation can cause an overflow (python)
Arrays
It must be a fixed size
The items must be the same data type
It can have more than 1 dimension
It has a fixed number of items
It can hold an item of other two structures
Tuples
It is an ordered set of values
It may have elements of mixed types (string/integer/real/boolean)
However it is immutable, meaning the elements of a tuple cannot change
It's a fixed size (static)
It has to have a fixed number of items
The items can be of different data types
Records
These are composed of a fixed number of fields of different data types and they resemble a spreadsheet. A record can be implemented as an object and a record can be treated like a tuple.
There has to be a fixed number of items
The items can be different data types
Elementary data type examples
Integer, real, boolean, char
Composite data type examples
String, array
Abstract data type examples
list, stack, queue
Abstract data type
a way of describing the data and possible operations (a queue of print jobs)
Main operations of abstract data types
Add the item to the rear of the queue
Remove the item from the front of the queue
Check if the queue is empty
Check if the queue is full
Static Queues
We can't add to a full queue and we can't remove from an empty queue so we create a variable called maxSize and we use another variable called size to hold the number of items in the queue
enQueue(item)
adds an item to the rear
deQueue
Removes and returns an item from the front
isEmpty
indicates if the queue is empty
isFull
indicates if the queue is full
Circular Queues
reuses the spaces that have been freed by dequeuing from the front of the array
Priority Queues
allows items to jump the queue depending on their importance
Queue (array) advantages
Simple to program
Predictable memory usage
Circle queue advantages
Can reuse free spaces
Priority queue advantages:
Gives preference to more important items
Queue (array) disadvantages
Fixed length
Single pass
Can't reuse spaces
Circular queue disadvantages
Slightly more complex to program
Priority queue disadvantages
Additional processing required to keep order
List
an abstract data type as it allows us to view the data and perform operations without viewing how they are implemented
Linked list
an ordered set of data elements, each containing a link to its successor (and sometimes its predecessor)
isEmpty()
Test for empty list
append(item)
Add a new item to the end of a list
remove(item)
Remove first occurrence of an item from list
count(item)
Return the number of occurrences of item in list
len(item (or list name) )
Return the number of items in the list
index(item)
Return the position of item
insert(pos,item)Add a new item at position pos
pop()
Remove and return the last item in the list
pop(pos)
Remove and return the item at position pos
Linked lists nodes
Each node has two parts - The data and the pointer of the next node
A start pointer identifies the first node in the list
The nextFree pointer shows the index of the next free space in the array.
When the first data is entered the pointer becomes Null and the start becomes 0 and next free becomes 1
Stacks
A stack has a last in first out method (LIFO) Where the item at the top / the recently added one is the first one removed
Basic stack operations
Add an item to the top
Remove an item from the top
Check if the stack is full
Check if the stack is empty
Push(item)
adds an item to the top of the stack
pop()
removes and returns the item from the top of the stack
IsFull
Checks if the stack is full
isEmpty()
Checks if the stack is empty
peek()
returns the top item without removing it
size()
Returns the number of items on the stack
Overflow
When you try to push to a full stack
Underflow
When you try to pop an item from an empty stack
Call stack
a system level data structure which provides the mechanism for passing parameters and return addresses to subroutines
Subroutine Calls
Parameters are saved onto the stack
The return address (Used to return the values) are saved onto the stack.
Execution is transferred into the subroutine code
Subroutine execution
Stack space is allocated for local variables
The subroutine code executes
The return address is retrieved
The parameters are popped
Execution is transferred back to the return address
Hash Tables
can find any record in the data set instantly no matter how large the dataset.
The address of the data item is to be stored and calculated from the data itself using a hashing algorithm / hash function.
The algorithm will take some part of the record e.g. the primary key, to map the record in the destination address in a hash table.
Collisions
happens when an algorithm generates the same address for different primary keys. The simplest way to overcome this is to put the item in the next free slot in the table. However this can lead to "clustering" of items in the table. A different solution to this is to change the skip value by adding 1, 3, 5 , 7 etc
The Mid-square Method
squares the item, uses some part of the resulting digits (in this case the middle numbers) and then perform the MOD function to get a result
The Folding Method
divides the item into equal sized pieces and adds them together then performs the MOD function to get an address
Alphanumeric Data
where letters need to be converted to integers to be stored in a table. It converts the letters into the ASCII code and then applies a hashing algorithm to the resulting series of digits.
Deletions
Items that are to be deleted must be replaced with a dummy item or marker e.g. Deleted. When searching for an item which hashes to that spot, if it is not found, the search will continue until it is found or a free space is located. If a new item is to be added, the address that contains "deleted" will replace the dummy item
Dictionary
an abstract data type and is made up of associated pairs. Each pair has a key and a value which is accessed through the key
ADD Key
Value - adds a new value and key
DELETE Key
Value - deletes a value and key
AMEND Key
Value - amends a value and key
Return Key
returns the value associated with the key entered
Undirected Graphs
a weighted graph but without arrows meaning you can travel in any direction so long as there is a connection
Unweighted, Directed Graph
a graph which shows the direction to each node but does not have a weight associated with an edge
Adjacency Matrix
Each row and column represent a node. [row, column] indicates a connection, an unweighted graph can show this by putting a 1 in the position.
In a weighted, undirected graph each entry represents the weight
Advantages of adjacency matrices
they are convenient to work with and adding a new edge is simple
it's quicker to find relationships with it
Disadvantages of adjacency matrices
a sparse graph with not many edges will leave a lot of cells empty and wasting a lot of memory space
Adjacency List
a way of creating a graph using a list of nodes and each node points to a list of adjacent nodes
Advantages of adjacency lists
only uses storage for connections that exist and so therefore more space - efficient. It is a good way of representing large, sparsely connected graphs
Disadvantages of adjancency lists
It takes longer to find relationships between different nodes than a matrix
Binary Trees
a tree where each node has a maximum number of two children
Each node of a tree can be broken down into 3 parts
The data
A left pointer to a child
A right pointer to a child
Balanced binary tree
is where the nodes are put in a way so the height is kept to a minimum to allow for more efficient searching