Lecture 1 Notes: Problem Solving and Programming

Week 1 Overview

Definition of Computer
What is Programming?
History of Programming Languages
Types of Programming Languages

What is a Computer?

A computer is an electronic device that manipulates information, or data.
It has the ability to store, retrieve, and process data.
We can use a computer to type documents, play games, listen to music, browse the internet, and more.
It also lets us create or edit spreadsheets, presentations, and even videos.

Computer System

The computer is an electronic machine that performs four general operations:
1. Input
2. Storage
3. Processing
4. Output

Functional Components of a Computer

Input ("read"): Takes data from an external device and stores it for later use.
Manipulate: Processes the stored data by executing the instructions within a program.
Output ("write"): Sends results to an external device.
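The three functions above can be sketched in a few lines of Python. This is only an illustration: the data is hard-coded instead of being read from a real device, and the names `raw_text`, `stored`, and `process` are invented for the example.

```python
def process(numbers):
    """Manipulate: compute the average of the stored values."""
    return sum(numbers) / len(numbers)

# Input ("read"): a real program might read this from a keyboard or a file;
# it is hard-coded here so the sketch runs on its own.
raw_text = "4 8 15"
stored = [int(token) for token in raw_text.split()]  # kept in memory for later use

result = process(stored)    # Manipulate: execute instructions on the stored data
print("Average:", result)   # Output ("write"): send the result to the screen
```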

Different Parts of a Computer

A computer can be viewed as comprising four units:
1. Processing unit: Has two distinct functions:
  - Control unit: Controls the behavior of the entire computer.
  - ALU (Arithmetic Logic Unit): Performs operations on data, such as arithmetic operations and comparisons of values.
2. Memory unit: Stores information that can be retrieved, modified, and processed.
  - Comprises main memory and secondary storage.
3. Input unit: Inputs information from keyboards or files on disks into the memory.
4. Output unit: Outputs information to printers or graphic displays from the memory.

What is a Program?

A program is a list of instructions that is executed by a computer to accomplish a particular task.
Creating those instructions is programming, done by a programmer.

What is Computer Programming?

Programming consists of giving a computer instructions on what task to perform in order to solve problems.
It's a collaboration between humans and computers, in which humans create instructions for a computer to follow (code) in a language that the machine understands.

Why Programming?

Comparison between Humans and Machines:

Humans:
- Natural language
- Large vocabulary
- Complex syntax
- Semantic ambiguity
Machines:
- Binary language
- Small vocabulary
- Simple syntax
- No semantic ambiguity

Humans and Machines:

Programming language
- Ex: Pascal, C++, Python, Java, …
- Vocabulary: restricted
- Syntax: small and restricted
- Semantics: almost no ambiguity

Software

Software consists of programs written to perform specific tasks.
Application programs: Perform a specific task for the user.
- Examples: word processors, spreadsheets, and games (e.g., Microsoft Office, MATLAB, etc.)
System programs: Take control of the computer, such as an operating system.
- Examples: Linux, Windows XP, etc.

Types of Computer Programs

System Software
Middleware Software
Application Software
Programming Software

Computer Language

A language is a medium of communication; programming languages are the most common way for people to communicate instructions to a computer.
Computers understand only binary numbers (0 and 1) when performing operations, but programming languages have been developed to suit different types of work on a computer.
The binary number system has two digits: 0 and 1.
A single digit (0 or 1) is called a bit, short for binary digit. A byte is made up of 8 bits.
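As a small illustration of bits and bytes, the sketch below writes a number as the 8 bits of one byte and reads it back. The helper names `to_byte_bits` and `from_bits` are made up for this example.

```python
BITS_PER_BYTE = 8

def to_byte_bits(number):
    """Write a number from 0 to 255 as a string of 8 bits (one byte)."""
    return format(number, "08b")

def from_bits(bits):
    """Read a string of 0s and 1s back as a number."""
    return int(bits, 2)

bits = to_byte_bits(5)
print(bits)              # 00000101
print(len(bits))         # 8 bits make up one byte
print(from_bits(bits))   # 5
```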

Binary Language

Data and instructions (numbers, characters, strings, etc.) are encoded as binary numbers: a series of bits (zeros and ones) grouped into one or more bytes.
Encoding and decoding of data into binary is performed automatically by the system based on the encoding scheme.

Encoding Schemes:

Numeric Data: Encoded as binary numbers
Non-Numeric Data: Encoded as binary numbers using representative code
- ASCII – 1 byte per character
- Unicode – 1 to 4 bytes per character, depending on the encoding (UTF-8: 1 to 4 bytes, UTF-16: 2 or 4 bytes, UTF-32: 4 bytes)
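A quick way to see an encoding scheme at work is to ask Python how many bytes a character occupies once encoded. The sketch below is illustrative only: the three sample characters and the helper `byte_length` are choices made for this example, and the byte count depends on which encoding is selected.

```python
def byte_length(character, encoding):
    """Number of bytes the character occupies in the given encoding."""
    return len(character.encode(encoding))

samples = {
    "letter A": "A",
    "e with acute accent": "\u00e9",
    "euro sign": "\u20ac",
}

for name, character in samples.items():
    print(name,
          "| code:", ord(character),
          "| UTF-8 bytes:", byte_length(character, "utf-8"),
          "| UTF-16 bytes:", byte_length(character, "utf-16-le"))
```

The letter A needs a single byte in both ASCII and UTF-8, while the other two characters fall outside ASCII and need more than one byte in UTF-8.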

Binary Number System

A binary number is made up of only 0s and 1s.
Example: 110100.
The digital world uses binary digits.
Binary numbers have many uses in mathematics and beyond.

Binary Number System vs Decimal

Decimal
- Base 10, ten digits (0-9)
- The position (place) values are integral powers of 10: $10^0$ (ones), $10^1$ (tens), $10^2$ (hundreds), $10^3$ (thousands)…
- n decimal digits represent $10^n$ unique values
Binary
- Base 2, two digits (0-1)
- The position (place) values are integral powers of 2: $2^0$ (1), $2^1$ (2), $2^2$ (4), $2^3$ (8), $2^4$ (16), $2^5$ (32), $2^6$ (64)…
- n binary digits represent $2^n$ unique values
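The place-value idea can be checked by hand or with a short program. The sketch below converts the example 110100 to decimal by adding up powers of 2; the function name `binary_to_decimal` is invented for this illustration, and Python's built-in `int(..., 2)` and `bin()` are shown for comparison.

```python
def binary_to_decimal(bits):
    """Add up digit * place value, where the place values are powers of 2."""
    total = 0
    for position, digit in enumerate(reversed(bits)):
        total += int(digit) * 2 ** position
    return total

print(binary_to_decimal("110100"))  # 52, i.e. 32 + 16 + 4
print(int("110100", 2))             # 52, using the built-in conversion
print(bin(52))                      # 0b110100, converting back to binary
print(2 ** 6)                       # 64 unique values fit in 6 binary digits
```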

ASCII Table

American Standard Code for Information Interchange
- Contains letters, numbers, control characters, and other symbols.
- Each character is assigned a unique 7-bit code.
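A few rows of the ASCII table can be reproduced with Python's built-in `ord` and `chr`. This is a small sketch; the helper `ascii_row` and the sample text are made up for the example.

```python
def ascii_row(character):
    """Return (character, decimal code, 7-bit binary code) for one ASCII character."""
    code = ord(character)
    return character, code, format(code, "07b")

for character in "Hi!":
    print(ascii_row(character))

print(chr(65))     # A: going from code back to character
print(ord("\n"))   # 10: the newline control character also has a code
```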

Generations of Programming Languages

What is a Programming Language?
- English is a natural language. It has words, symbols, and rules of grammar.
- A programming language also has words, symbols, and rules.
- These rules are called syntax.

Generations of Programming Languages

Generation	Programming Language
1st GL	Machine Language
2nd GL	Assembly Language
3rd GL	High-Level Language
4th GL	Very High-Level Language

The concept of language generations, sometimes called levels, is closely connected to the advances in technology that brought about computer generations.

Machine Language (1st Generation)

The most basic programming language, also known as ‘Machine Code’.
Consists of 1s and 0s, so it is directly understood by the computer.
All other languages must be translated into machine code before their instructions can be carried out; programs written directly in machine code execute faster because no translation is needed.
- E.g., Games, simulation programs, real-time applications are written in machine code.
Disadvantages:
1. Machine-specific – a program written for one type of machine won’t run on another.
2. Time-consuming, laborious, and cumbersome to work with.
3. Error-prone.
4. Debugging is tedious.

Assembly Language (2nd Generation)

In the 1950s, machine code gave way to assembly language.
Uses mnemonics (abbreviations that represent instructions in a more memorable way) and numbers (0-9) instead of 0s and 1s.
- E.g., ADD, ADX, SBX
Usage: applications where timing and storage space are critical, parts of operating systems, and device drivers that control devices such as printers and CD-ROM drives.
Advantage:
1. Easier for the programmer to use than machine code – debugging is easier.
Disadvantages:
1. Needs to be translated to machine code.
2. Machine-specific.

1st vs 2nd Generation Language

Machine language is a binary language that can run on a computer directly.
Assembly language is a low-level programming language written with mnemonics; it must be converted into machine code by software called an assembler before it can run.
This is the major difference between the two.

High-Level Language (3rd Generation)

3GLs are called procedural languages or high-level languages.
They are easier to understand because they resemble our own English language more than 1GLs and 2GLs.
Examples of 3GLs are BASIC, COBOL, Pascal, Fortran, C, C++, Perl, and Ada.
High-level languages were developed with the programmer in mind.
- E.g., ALGOL (ALGOrithmic Language), FORTRAN (FORmula TRANslation) & COBOL (Common Business Oriented Language).
Others followed: C, C++, C#, R, …
A single instruction in a high-level language usually translates into many instructions in machine language.
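The one-to-many relationship can be observed, by analogy, with Python's standard `dis` module. Note the assumption: CPython translates source code into bytecode for its own virtual machine, not into real machine language, so this is only a stand-in for what a compiler to machine code does. The function `area_with_border` is invented for the example.

```python
import dis

def area_with_border(width, height, border):
    # One high-level statement...
    return (width + 2 * border) * (height + 2 * border)

# ...becomes several lower-level instructions.
instructions = list(dis.get_instructions(area_with_border))
for instruction in instructions:
    print(instruction.opname, instruction.argrepr)

print(len(instructions), "low-level instructions for one return statement")
```

The exact instruction names and count vary between Python versions, but there is always more than one.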

High-Level Language (3rd Generation) Advantages

High-level languages are programmer-friendly.
Provide a higher level of abstraction from machine languages.
Machine-independent language.
Easy to learn.
Less error-prone; errors are easier to find and fix.
High-level programming results in better programming productivity.

Very High-Level Language (4th Generation)

The 3GLs are procedural in nature (the HOW of the problem gets coded).
The procedures require knowledge of how the problem will be solved.
4GLs are non-procedural; only the WHAT of the problem is coded, i.e., only ‘What is required’ is to be specified, and the rest gets done automatically.
A large 3GL program may be replaced by a single 4GL statement.
The main aim of 4GLs is to cut down on development and maintenance time and make programming easier for users.
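The HOW versus WHAT contrast can be sketched with SQL, which is commonly classified as a 4GL (it is not named in these notes, so treat it as an assumed example). The order data and variable names below are invented; the SQL runs through Python's standard `sqlite3` module on an in-memory database.

```python
import sqlite3

orders = [("pen", 3), ("book", 12), ("pen", 5), ("lamp", 20)]

# Procedural style (3GL): spell out HOW to compute the total, step by step.
pen_total = 0
for item, amount in orders:
    if item == "pen":
        pen_total += amount

# Non-procedural style (4GL, here SQL): state WHAT is required.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE orders (item TEXT, amount INTEGER)")
connection.executemany("INSERT INTO orders VALUES (?, ?)", orders)
(sql_total,) = connection.execute(
    "SELECT SUM(amount) FROM orders WHERE item = 'pen'"
).fetchone()
connection.close()

print(pen_total, sql_total)  # 8 8
```

The loop describes each step of the search and the addition; the SELECT statement only describes the result wanted and leaves the steps to the database engine.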

Translation Programs

A program written in a high-level language is called source code.
To convert the source code into machine code, translators are needed.
A translator takes a program written in source language as input and converts it into a program in target language as output.
It also detects and reports errors during translation.

Types of Translator

Assemblers
Compilers
Interpreters

Source Code is the code that is input to a translator.
Executable code is the code that is output from the translator.

Assembler

An Assembler converts an assembly program into machine code.
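A toy assembler makes the idea concrete. Everything about the instruction set below is hypothetical and invented for this sketch: three mnemonics, 8-bit instructions, with the high 4 bits holding the opcode and the low 4 bits holding an operand from 0 to 15. Real assemblers target a real processor's instruction set and are far more involved.

```python
# Hypothetical opcode table (not taken from any real processor).
OPCODES = {"LOAD": 0b0001, "ADD": 0b0010, "STORE": 0b0011}

def assemble(source):
    """Translate each 'MNEMONIC operand' line into an 8-bit binary string."""
    machine_code = []
    for line in source.strip().splitlines():
        mnemonic, operand = line.split()
        word = (OPCODES[mnemonic] << 4) | int(operand)  # opcode in high bits
        machine_code.append(format(word, "08b"))
    return machine_code

program = """
LOAD 7
ADD 5
STORE 9
"""

print(assemble(program))  # ['00010111', '00100101', '00111001']
```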

Compiler

A compiler is a translator that converts programs written in a high-level language into a low-level language.
It translates the entire program and also reports the errors in the source program encountered during the translation.
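Python's built-in `compile()` function gives a feel for whole-program translation. One caveat: it produces bytecode for the Python virtual machine, not machine code, so this is an analogy and not a true compiler to machine language. In the sketch below, the third line of the sample program is missing a closing parenthesis, and the variable names are invented for the example.

```python
source = '''print("step 1")
print("step 2")
print("step 3"
'''

try:
    # The entire program is translated before any of it runs.
    code_object = compile(source, "<example>", "exec")
except SyntaxError as error:
    code_object = None
    print("Error reported during translation, line", error.lineno)

if code_object is not None:
    exec(code_object)  # only reached when translation succeeds
```

Because translation fails, neither "step 1" nor "step 2" is printed, even though those two lines are correct.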

Compiler Advantages

Fast in execution.
The object/executable code produced by a compiler can be distributed or executed without having to have the compiler present.
The object program can be used whenever required without the need for recompilation.

Compiler Disadvantages

Debugging a program is much harder, so a compiler is less helpful for locating errors.
When an error is found and corrected, the whole program has to be recompiled.

Interpreter

An interpreter translates the program line by line and reports an error as soon as it encounters one during translation.
It directly executes the operations specified in the source program when the input is given by the user.
It gives better error diagnostics than a compiler.
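The line-by-line behaviour can be imitated with a toy interpreter. This is a simplified sketch, not how a production interpreter is built: the function `interpret` is invented for the example and simply hands each line to Python's built-in `exec`, stopping at the first line that fails.

```python
def interpret(lines):
    """Execute statements one at a time; stop at the first failing line."""
    variables = {}
    for number, line in enumerate(lines, start=1):
        try:
            exec(line, {}, variables)  # translate and run just this line
        except Exception as error:
            return variables, f"line {number}: {type(error).__name__}"
    return variables, None

state, problem = interpret(["x = 2", "y = x * 10", "z = y / 0", "w = 1"])
print(state)    # {'x': 2, 'y': 20}
print(problem)  # line 3: ZeroDivisionError
```

Lines 1 and 2 have already run by the time the error on line 3 is reported, and line 4 is never reached.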

Interpreter Advantages

Good at locating errors in programs.
Debugging is easier since the interpreter stops when it encounters an error.
If an error is detected, there is no need to retranslate the whole program.

Interpreter Disadvantages

Rather slow in execution.
No object code is produced, so translation has to be done every time the program is run.
For the program to run, the Interpreter must be present.