unit 1 text book
Computer Programming
Computer programming, or coding, involves designing, writing, testing, debugging, and maintaining the source code of computer programs. This code is written in programming languages such as Java, C++, C#, and Python. The purpose is to create instructions that make computers perform specific tasks or exhibit desired behaviors. It often requires expertise in several subjects, including the application domain, algorithms, and formal logic. Programming is one phase of software engineering. There is debate over whether programming is an art form, a craft, or an engineering discipline; good programming arguably applies all three to produce efficient and evolvable software. Programmers generally do not need licenses or standardized certifications.
Because the discipline covers many areas, which may or may not include critical applications, it is debatable whether licensing should be required for the profession as a whole, "software engineers" included. In most cases, the discipline is self-governed by the entities that require the programming, and sometimes very strict environments are defined (e.g. United States Air Force use of AdaCore and security clearance). However, representing oneself as a "Professional Software Engineer" without a license from an accredited institution is illegal in many parts of the world.
Another ongoing debate is the extent to which the programming language used in writing computer programs affects the form that the final program takes. This debate is analogous to that surrounding the Sapir-Whorf hypothesis in linguistics and cognitive science, which postulates that the nature of a particular spoken language influences the habitual thought of its speakers: different language patterns yield different patterns of thought. This idea challenges the possibility of representing the world perfectly with language, because it acknowledges that the mechanisms of any language condition the thoughts of its speaker community.
History of Computing Devices
Ancient cultures had no conception of computing beyond simple arithmetic. The abacus, invented in Sumeria around 2500 BC, was the only mechanical aid to numerical computation in early history. Later, the Antikythera mechanism, built in ancient Greece around 100 BC, became the earliest known mechanical calculator to use gears of various sizes and configurations. It tracked the Metonic cycle, which is still used in lunar-to-solar calendars, and calculated the dates of the Olympiads.
Al-Jazari, a medieval scientist, built programmable automata in 1206 AD. They used pegs and cams in a wooden drum to trigger levers that operated percussion instruments, and moving the pegs produced different drum rhythms. The Jacquard loom, developed in 1801, used pasteboard cards with punched holes to control cloth weaving patterns. Charles Babbage adopted punched cards around 1830 for his Analytical Engine. Ada Lovelace wrote what is regarded as the first computer program, intended for the Analytical Engine, to calculate Bernoulli numbers.
The synthesis of numerical calculation, predetermined operation and output, along with a way to organize and input instructions in a manner relatively easy for humans to conceive and produce, led to the modern development of computer programming. Development of computer programming accelerated through the Industrial Revolution.
In the late 1880s, Herman Hollerith invented the recording of data on a medium that could then be read by a machine. The earlier uses of machine-readable media described above had been for control, not data. "After some initial trials with paper tape, he settled on punched cards … " To process these punched cards, first known as "Hollerith cards", he invented the tabulator and the keypunch machine. These three inventions were the foundation of the modern information processing industry. In 1896 he founded the Tabulating Machine Company (which later became the core of IBM). The addition of a control panel (plugboard) to his 1906 Type I Tabulator allowed it to do different jobs without having to be physically rebuilt. By the late 1940s, there were a variety of control-panel programmable machines, called unit record equipment, to perform data-processing tasks.
The invention of the von Neumann architecture allowed computer programs to be stored in computer memory. Early programs had to be crafted using the instructions (elementary operations) of the particular machine, often in binary notation.
Evolution of Programming Languages
Each model of computer likely used different instructions (machine language) to do the same task. Later, assembly languages were developed that let the programmer specify each instruction in a text format, entering an abbreviation for each operation code instead of a number and specifying addresses in symbolic form (e.g., ADD X, TOTAL). Entering a program in assembly language is usually more convenient, faster, and less prone to human error than using machine language. However, because an assembly language is little more than a different notation for a machine language, any two machines with different instruction sets also have different assembly languages.
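To make the idea of assembly language as "a different notation for a machine language" concrete, here is a minimal sketch of a toy assembler in Python. The opcode numbers, the symbol addresses, and the function name `assemble` are all invented for illustration and do not correspond to any real machine.

```python
# Toy assembler: translates mnemonic text into numeric machine words.
# The opcode table and memory addresses below are hypothetical.
OPCODES = {"LOAD": 1, "ADD": 2, "STORE": 3}
SYMBOLS = {"X": 100, "TOTAL": 101}

def assemble(line):
    """Turn a line such as 'ADD X, TOTAL' into a tuple of numbers."""
    mnemonic, operands = line.split(maxsplit=1)
    # Replace each symbolic address with its numeric address.
    addresses = [SYMBOLS[name.strip()] for name in operands.split(",")]
    return (OPCODES[mnemonic], *addresses)

print(assemble("ADD X, TOTAL"))  # (2, 100, 101)
```

A machine with a different instruction set would need a different opcode table, which is why assembly programs are not portable between such machines.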
In 1954, FORTRAN was invented; it was the first high-level programming language to have a functional implementation, as opposed to just a design on paper. (A high-level language is, in very general terms, any programming language that allows the programmer to write programs in terms that are more abstract than assembly language instructions, i.e. at a level of abstraction "higher" than that of an assembly language.) It allowed programmers to specify calculations by entering a formula directly (e.g. Y = X^2 + 5X + 9). The program text, or source, is converted into machine instructions by a special program called a compiler, which translates the FORTRAN program into machine language. In fact, the name FORTRAN stands for "Formula Translation". Many other languages were developed, including some for commercial programming, such as COBOL. Programs were mostly still entered using punched cards or paper tape.
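The formula example carries over almost verbatim to a modern high-level language. A minimal sketch in Python, assuming the formula is meant as X squared plus 5X plus 9 (the function name `formula` is hypothetical):

```python
def formula(x):
    # Same calculation as the formula Y = X^2 + 5X + 9 in the text.
    return x**2 + 5 * x + 9

y = formula(2)
print(y)  # 23
```

The programmer writes the formula once, and the language implementation takes care of turning it into the individual machine operations.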
By the late 1960s, data storage devices and computer terminals became inexpensive enough that programs could be created by typing directly into the computers. Text editors were developed that allowed changes and corrections to be made much more easily than with punched cards. (Usually, an error in punching a card meant that the card had to be discarded and a new one punched to replace it.)
As time has progressed, computers have made giant leaps in processing power. This has brought about newer programming languages that are more abstracted from the underlying hardware. Popular languages of the modern era include C++, C#, Visual Basic, Pascal, HTML, Java, JavaScript, Perl, PHP, SQL and dozens more. Although these high-level languages usually incur greater overhead, the increase in speed of modern computers has made their use much more practical than in the past. These increasingly abstracted languages are typically easier to learn and allow the programmer to develop applications much more efficiently and with less source code. However, high-level languages are still impractical for a few programs, such as those where low-level hardware control is necessary or where maximum processing speed is vital.
Computer programming has become a popular career in the developed world, particularly in the United States, Europe, Scandinavia, and Japan. Due to the high labour cost of programmers in these countries, some forms of programming have been increasingly subject to offshore outsourcing (importing software and services from other countries, usually at a lower wage). This makes programming career decisions in developed countries more complicated, while increasing economic opportunities for programmers in less developed areas, particularly China and India.
Modern Programming Quality Requirements
Whatever the approach to software development may be, the final program must satisfy some fundamental properties. The following properties are among the most relevant:
- Reliability: How often the results of a program are correct. This depends on the conceptual correctness of algorithms and on minimizing programming mistakes.
- Robustness: How well a program anticipates problems not due to programmer error, including incorrect or corrupt data, unavailable resources, and user errors.
- Usability: The ergonomics of a program; the ease with which it can be used for its intended purpose, or in some cases even for unanticipated purposes. This includes the textual, graphical, and hardware elements that improve clarity and intuitiveness.
- Portability: The range of computer hardware and operating system platforms on which the source code of a program can be compiled or interpreted and run.
- Maintainability: The ease with which a program can be modified to make improvements, fix bugs, or adapt to new environments. Good practices during initial development make the difference.
- Efficiency/Performance: The amount of system resources a program consumes: processor time, memory space, disk usage, and network bandwidth. This also covers the correct disposal of temporary files and the absence of memory leaks.
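As an illustration of robustness, the following is a minimal sketch in Python of a function that anticipates incorrect or corrupt input instead of crashing. The function name `parse_reading` and the plausibility bounds are assumptions made for this example only.

```python
import math

def parse_reading(text, low=-90.0, high=60.0):
    """Return the reading as a float, or None if the text is unusable.

    The bounds are illustrative assumptions, not a real specification.
    """
    try:
        value = float(text)
    except (TypeError, ValueError):
        return None  # corrupt or missing data
    if math.isnan(value) or not (low <= value <= high):
        return None  # a number, but not a plausible one
    return value

readings = ["21.5", "abc", "", None, "nan", "1e9"]
print([parse_reading(r) for r in readings])
# [21.5, None, None, None, None, None]
```

Returning `None` is only one possible design; a program could equally log the problem, substitute a default, or ask the user again.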
Readability of Source Code
In computer programming, readability refers to the ease with which a human reader can comprehend the purpose, control flow, and operation of source code. It affects the aspects of quality above, including portability, usability and most importantly maintainability. Readability is important because programmers spend the majority of their time reading, trying to understand and modifying existing source code, rather than writing new source code. Unreadable code often leads to bugs, inefficiencies, and duplicated code. A study found that a few simple readability transformations made code shorter and drastically reduced the time to understand it. Following a consistent programming style often helps readability. However, readability is more than just programming style. Many factors, having little or nothing to do with the ability of the computer to efficiently compile and execute the code, contribute to readability. Some of these factors include:
- Different indentation styles (whitespace)
- Comments
- Decomposition
- Naming conventions for objects (such as variables, classes, procedures, etc.)
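The factors listed above can be seen in a small before-and-after sketch in Python. Both functions compute the same result; the names `f`, `is_even`, and `sum_of_even_squares` are made up for this illustration.

```python
# Hard to read: cryptic names, cramped layout, no explanation.
def f(a):
    t = 0
    for i in a:
        if i % 2 == 0: t += i * i
    return t

# Easier to read: descriptive names, a small helper, a docstring.
def is_even(number):
    return number % 2 == 0

def sum_of_even_squares(numbers):
    """Add up the squares of the even numbers in `numbers`."""
    return sum(n * n for n in numbers if is_even(n))

print(f([1, 2, 3, 4]), sum_of_even_squares([1, 2, 3, 4]))  # 20 20
```

The computer treats both versions alike; the difference matters only to the person who has to read or change the code later.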
Software Development
Software development commonly involves requirements analysis, design, implementation, testing, and debugging. Various approaches exist for each of these tasks. Use case analysis is popular for requirements analysis. Many programmers use forms of Agile software development, where the various stages of software development are integrated into short cycles lasting a few weeks rather than years. Popular modeling techniques include Object-Oriented Analysis and Design (OOAD) and Model-Driven Architecture (MDA). The Unified Modeling Language (UML) is a notation used for both OOAD and MDA. A similar technique used for database design is Entity-Relationship Modeling (ER Modeling). Implementation techniques include imperative languages (object-oriented or procedural), functional languages, and logic languages.
Algorithmic Complexity
Computer programming focuses on discovering and implementing the most efficient algorithms for a given class of problem. Algorithms are classified into orders using Big O notation, which expresses resource use, such as execution time or memory consumption, in terms of the size of an input. Expert programmers know established algorithms and their complexities to choose the best algorithms for the circumstances.
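The difference between complexity classes can be made visible by counting steps. The sketch below, in Python, compares a linear search, which is O(n), with a binary search, which is O(log n) on sorted data. The function names and the step counters are additions for illustration.

```python
def linear_search(items, target):
    """O(n): examine items one by one; return (index, steps)."""
    steps = 0
    for index, item in enumerate(items):
        steps += 1
        if item == target:
            return index, steps
    return -1, steps

def binary_search(items, target):
    """O(log n) on a sorted list: halve the search range each step."""
    low, high, steps = 0, len(items) - 1, 0
    while low <= high:
        steps += 1
        mid = (low + high) // 2
        if items[mid] == target:
            return mid, steps
        if items[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return -1, steps

data = list(range(1_000_000))          # already sorted
print(linear_search(data, 999_999))    # (999999, 1000000)
print(binary_search(data, 999_999))    # same index, at most 20 steps
```

For a million items, the linear search needs up to a million comparisons while the binary search needs at most twenty, which is the kind of gap that Big O notation is designed to express.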
Measuring Language Usage
It is difficult to determine which programming languages are the most popular. Some languages are popular for particular kinds of applications (e.g., COBOL in corporate data centers, FORTRAN in engineering, scripting languages in Web development, and C in embedded applications), while others are used to write many different kinds of applications. Many applications use a mix of several languages. New languages are often designed around the syntax of a previous language with new functionality added (e.g., C++ adds object orientation to C, and Java adds memory management and bytecode to C++). Methods of measuring programming language popularity include counting job advertisements, book sales, courses, and existing lines of code.
Debugging
Debugging is a very important task in the software development process, because an incorrect program can have significant consequences for its users. Some languages are more prone to some kinds of faults because their specification does not require compilers to perform as much checking as other languages do. A static code analysis tool can help detect some possible problems. Debugging is often done with IDEs such as Eclipse, KDevelop, NetBeans, Code::Blocks, and Visual Studio. Standalone debuggers such as gdb are also used; these often provide less of a visual environment and are usually driven from a command line.
Programming Languages
Different programming languages support different styles of programming (called programming paradigms). The choice of language depends on company policy, suitability to the task, availability of third-party packages, or individual preference. Trade-offs involve finding enough programmers who know the language, the availability of compilers, and the efficiency with which programmes written in a given language execute. Languages exist on a spectrum from "low-level" to "high-level"; low-level languages are more machine-oriented and faster, while high-level languages are more abstract and easier to use but execute less quickly. According to Allen Downey, basic instructions appear in every language:
- input: Gather data from the keyboard, a file, or some other device.
- output: Display data on the screen or send data to a file or other device.
- arithmetic: Perform basic arithmetical operations like addition and multiplication.
- conditional execution: Check for certain conditions and execute the appropriate sequence of statements.
- repetition: Perform some action repeatedly, usually with some variation.
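The five basic instructions listed above can all appear in a program of only a few lines. In this minimal Python sketch, an in-memory text stream stands in for the keyboard or a file, and the numbers in it are made-up sample data.

```python
import io

# Stand-in for a keyboard or file: one number per line (made-up data).
source = io.StringIO("4\n-7\n10\n")

total = 0
for line in source:                 # repetition
    number = int(line)              # input
    if number < 0:                  # conditional execution
        continue                    # skip negative values
    total = total + number * 2      # arithmetic
print("Total:", total)              # output: Total: 28
```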
Programmers
Computer programmers write computer software. Their jobs usually involve:
- Coding
- Compilation
- Debugging
- Documentation
- Integration
- Maintenance
- Requirements analysis
- Software architecture
- Software testing
- Specification
Computer Experiment
In the scientific context, a computer experiment refers to mathematical modeling using computer simulation. It has become common to call such experiments in silico. This area includes computational physics, computational chemistry, computational biology and other similar disciplines.
Computer Simulation
In a computer simulation, a "computer" model typically replaces a traditional mathematical model. Whereas a mathematical model is traditionally solved analytically, a computer model can be solved numerically: this is what a computer simulation of a system (typically a physical system) is about. In a computer experiment, a computer model is used to make inferences about some underlying system. The idea is that the computer model takes the place of an experiment we cannot do; the phrase in silico experiment is also used. At the moment, for example, the debate on climate change is being informed largely by evaluations of climate simulators running on some of the largest computers in the world, which are being used to investigate the impact of a substantial increase in the atmospheric concentration of greenhouse gases such as carbon dioxide. In this case, the accumulation of many simulations with different initial conditions forms an experiment.
Statistics and Computer Experiments
Computer experiments can be seen as a branch of applied statistics because the user must account for three sources of uncertainty. First, the models often contain parameters whose values are not certain; second, the models themselves are imperfect representations of the underlying system; and third, data collected from the system that might be used to calibrate the models are imperfectly measured. However, most practitioners of computer experiments do not see themselves as statisticians.
History of Computer Experiments
The first computer experiments were probably conducted at Los Alamos National Laboratory to study the behaviour of nuclear weapons. Since then, the use of computer models has branched out into large parts of the physical and environmental sciences (where they are sometimes referred to as process models), and in medicine. Because computer experiments have developed in such a wide range of applications there is little standardisation of the terminology.
As a general guide, in this article learning about the model parameters using data from the system is referred to as (model) calibration, while learning about the system behaviour itself is referred to as (system) prediction. Combining both of these, i.e., using the model and system data to make predictions about the system, is referred to as calibrated prediction.
Constructing a Simulator
The simulator is the computer code that we actually evaluate: the outputs of the simulator correspond, usually directly, to measurable aspects of the system. It is important to understand the process of creating a simulator, because this allows us to make judgements about how similar two or more simulators of the same system are. Without this information it is difficult to combine information from different simulators, because we do not know to what extent we can treat them as independent sources of information. In most applications there are typically three parts to a simulator:
Simulator = Model + Treatment + Solver
The Model
The model is a mathematical model for the system of interest. In the physical sciences a model typically comprises a collection of state variables, plus fundamental laws and equations of state. These variables exist and evolve in space and time. For example, an ocean model might include:
- Velocity in each of three directions
- Pressure
- Temperature
- Salinity
- Density
Subject to:
- Conservation of momentum
- Continuity equation (conservation of mass)
- Conservation of temperature and salinity
And with equations of state, e.g.
- Relationship of density to temperature, salinity and pressure, and perhaps also a model for the formation of sea-ice
The state variables for the ocean model are expressed as a continuum in space and time, and the fundamental laws as partial differential equations. Even at this stage, though, simplifications may be made. For example, it is common to treat seawater as incompressible.
The Treatment
The treatment makes the model applicable to a particular instance, such as the Earth during 1750-2100. This includes boundary conditions describing ocean margins and topography, initial conditions quantifying the state vector at the start of 1750, and forcing functions describing external influences on the oceans over the period. These forcings mainly describe events at the surface of the ocean, such as temperature, winds, and exchanges of freshwater through evaporation and precipitation. Historical values of the forcings can be inferred from data, while 'future' values are specified according to a particular scenario. Large-scale climate modelling couples an ocean model with an atmosphere model, so that the forcing at the margin between the ocean and the atmosphere does not have to be prescribed, but can be inferred.
The Solver
The solver turns the model and the treatment into a calculation that approximates the evolution of the state vector. This usually requires discretizing the problem, replacing the continuum with a lattice of discrete points. For an ocean simulator, the Earth's surface might be divided into rectangles, and the ocean itself into a number of layers. This division is typically fixed for a given simulator, and the number of cells is referred to as the simulator's resolution. This process can necessitate further adjustment. There may be processes with characteristic scales that are smaller than a grid cell, or a time-step. These do not get picked up by the simulator, which behaves as though the state vector is constant over each cell and time-step. These so-called sub-grid-scale processes need to be put back in if they are thought to be a large component of the model.
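The three-part split into model, treatment, and solver can be sketched on a problem far simpler than an ocean. The Python example below uses one-dimensional diffusion as a stand-in model; the grid size, coefficient, initial state, and the names `step` and `simulate` are all assumptions chosen for illustration, and the explicit finite-difference scheme is only one of many possible solvers.

```python
# Model: 1-D diffusion, du/dt = D * d2u/dx2 (a stand-in for a real model).
D = 1.0

# Treatment: one particular instance, with initial and boundary conditions.
n_cells, length = 11, 1.0
initial_state = [0.0] * n_cells
initial_state[n_cells // 2] = 1.0      # warm spot in the middle
boundary_value = 0.0                   # both ends held at zero

# Solver: explicit finite differences on a fixed grid (the resolution).
dx = length / (n_cells - 1)
dt = 0.4 * dx * dx / D                 # below this scheme's stability limit of 0.5

def step(state):
    """Advance the discretized state vector by one time-step."""
    new = [boundary_value] * len(state)
    for i in range(1, len(state) - 1):
        curvature = state[i - 1] - 2 * state[i] + state[i + 1]
        new[i] = state[i] + D * dt / dx**2 * curvature
    return new

def simulate(state, n_steps):
    """Apply the solver repeatedly to evolve the state vector."""
    for _ in range(n_steps):
        state = step(state)
    return state

final = simulate(initial_state, 50)
print([round(u, 3) for u in final])
```

Changing only the treatment (for example, a different initial state) reuses the same model and solver, while changing `n_cells` changes the resolution at which the continuum is approximated.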
The Simulator Function
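The simulator can be viewed as a deterministic function that maps inputs (coefficients, initial conditions, forcing functions) into outputs: the same inputs always give the same outputs. Because the function is often too complex for its output to be predicted without running it, some practitioners nevertheless treat the output as random.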
Sources of Uncertainty
There are four different sources of uncertainty in a computer experiment:
- Uncertainty about the simulator behaviour: we are uncertain about what would happen if we evaluated the simulator at a particular input value x.
- Uncertainty about the 'correct' simulator input: the inputs are uncertain because of imprecise or faulty measurements. This would typically include initial conditions (starting values of the state vector).
- Uncertainty about model error: we know that the model is an imperfect representation of the system, but cannot precisely define the discrepancy.
- Statistical uncertainty: uncertainty associated with the use of correction techniques and error correction.
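One of these sources, uncertainty about the 'correct' simulator input, can be explored by evaluating the simulator at many plausible input values. The Python sketch below does this for a toy cooling formula; the simulator itself, the measured value, and the size of the measurement error are all hypothetical, and random sampling is just one simple way to propagate the uncertainty.

```python
import math
import random
import statistics

def simulator(k, t=10.0):
    """Hypothetical deterministic simulator: temperature after time t."""
    return 20.0 + 80.0 * math.exp(-k * t)

rng = random.Random(42)                 # fixed seed so runs are repeatable
measured_k, measurement_sd = 0.1, 0.01  # assumed measurement and its error

# Sample plausible 'correct' inputs and evaluate the simulator at each.
outputs = [simulator(rng.gauss(measured_k, measurement_sd))
           for _ in range(10_000)]

print("output at measured input:", round(simulator(measured_k), 2))
print("mean of outputs:", round(statistics.mean(outputs), 2))
print("spread (std dev):", round(statistics.stdev(outputs), 2))
```

The spread of the outputs gives a rough picture of how much the input uncertainty alone contributes; it says nothing about model error, which needs separate treatment.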
Computer Simulation
A computer simulation applies a computer model or program to simulate a system. Computer simulations have become a useful part of the mathematical modelling of many natural systems in physics (computational physics), astrophysics, chemistry, and biology (computational biology), and of human systems in economics, psychology, social science, and engineering. Simulations can be used to explore new technology and to estimate the performance of complex systems. Computer simulations vary from computer programs that run a few minutes, to network-based groups of computers running for hours, to ongoing simulations that run for days. The scale of events being simulated by computer has far exceeded anything possible (or perhaps even imaginable) using traditional paper-and-pencil mathematical modeling.
Simulation vs. Model
A computer model refers to the algorithms and equations used to capture the behavior of the system being modeled. A computer simulation, by contrast, refers to the actual running of the program that contains these equations or algorithms. Simulation, therefore, refers to an instance of running a model. In practice, however, the terms model and simulation are often used interchangeably.
History of Computer Simulations
Computer simulation developed hand in hand with the rapid growth of the computer, following its first large-scale deployment during the Manhattan Project. Simulation is often used as an adjunct to, or substitute for, modeling systems for which simple closed-form analytic solutions are not possible.
Data Preparation
The external data requirements of simulations and models vary widely. Input sources also vary widely:
- Sensors and other physical devices connected to the model;
- Controls used to direct the progress of the simulator in some way;
- Historical data entered by hand;
- Current data extracted as by-product from other processes;
- Values output for the purpose of other simulations, models, or processes.
Because of this variety, and because diverse simulation systems have many elements in common, there are a large number of specialized simulation languages. The best-known of these may be Simula (sometimes called Simula-67, after the year 1967 when it was proposed). There are now many others. Systems that accept data from external sources must be very careful to know what they are receiving.
Types of Computer Models
Models can be classified by various attributes:
- Stochastic or deterministic
- Steady-state or dynamic
- Continuous or discrete
- Local or distributed
For time-stepped simulations, there are two main classes:
- Simulations that store data in regular grids and require only next-neighbor access (stencil codes)
- Models where the underlying graph is not a regular grid (meshfree method class)
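A stencil code updates every cell of a regular grid from the values of its immediate neighbours. As a minimal sketch in Python, the example below uses Conway's Game of Life, a well-known cellular automaton, as the update rule. Treating cells outside the grid as dead is an assumption made here to keep the example short.

```python
def life_step(grid):
    """One time-step of Conway's Game of Life on a fixed grid.

    Cells outside the grid are treated as dead (an assumption made here).
    """
    rows, cols = len(grid), len(grid[0])
    new = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # The stencil: the eight immediate neighbours of cell (r, c).
            neighbours = sum(
                grid[r + dr][c + dc]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0)
                and 0 <= r + dr < rows
                and 0 <= c + dc < cols
            )
            if neighbours == 3 or (grid[r][c] == 1 and neighbours == 2):
                new[r][c] = 1
    return new

blinker = [[0, 0, 0],
           [1, 1, 1],
           [0, 0, 0]]
print(life_step(blinker))  # [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
```

Because each new cell depends only on old values of nearby cells, the grid is written into a fresh copy rather than updated in place.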
Further types of simulations:
- Static simulations use equations that define the relationships between elements of the modeled system and attempt to find a state in which the system is in equilibrium
- Dynamic simulations model changes in a system in response to changing input signals
- Stochastic models use random number generators to model chance or random events
- A discrete event simulation (DES) manages events in time; it maintains a queue of events sorted by the simulated time they should occur.
- A continuous dynamic simulation performs numerical solution of differential-algebraic equations or differential equations (either partial or ordinary)
- A special type of discrete simulation that does not rely on a model with an underlying equation is agent-based simulation
- Distributed models run on a network of interconnected computers
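The event queue at the heart of a discrete event simulation can be sketched with a priority queue. The Python example below uses the standard `heapq` module; the event names, the rule that each arrival causes a departure two time units later, and the function name `run_events` are all made up for illustration.

```python
import heapq

def run_events(initial_events, until):
    """Process (time, name) events in order of simulated time."""
    queue = list(initial_events)
    heapq.heapify(queue)                   # queue ordered by event time
    log = []
    while queue:
        time, name = heapq.heappop(queue)  # always the earliest event
        if time > until:
            break
        log.append((time, name))
        if name == "arrival":
            # Hypothetical rule: each arrival is served 2 time units later.
            heapq.heappush(queue, (time + 2, "departure"))
    return log

print(run_events([(5, "arrival"), (1, "arrival")], until=10))
# [(1, 'arrival'), (3, 'departure'), (5, 'arrival'), (7, 'departure')]
```

Simulated time jumps directly from one event to the next, so nothing is computed for the intervals in which no event occurs.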
CGI Simulations
Output data can be presented in tables or matrices or, using computer-generated imagery (CGI) animation, as graphs and moving images. Weather forecasting models, for example, show moving rain and snow clouds against a map that uses numeric coordinates and numeric timestamps. Large amounts of data can be displayed graphically in motion, as changes occur during a simulation run.
Computer Simulation in Science
Generic types of computer simulation in science, classified by the underlying mathematical description:
- Numerical simulation of differential equations that cannot be solved analytically
- Stochastic simulation, typically used for discrete systems where events occur probabilistically
Specific examples include:
- Statistical simulations based upon an agglomeration of a large number of input profiles
- Agent-based simulation in ecology
- Time-stepped dynamic models in hydrology, such as the SWMM and DSSAM models
- Computer simulations have also been used to formally model theories of human cognition and performance
- Computer simulation using molecular modeling for drug discovery
- Computer simulation for studying the selective sensitivity of bonds by mechanochemistry during grinding of organic molecules
- Computational fluid dynamics simulations simulate flowing air, water, and other fluids
Notable simulations include Donella Meadows' World3, James Lovelock's Daisyworld, and Thomas Ray's Tierra.
Simulation Environments
Graphical environments to design simulations have been developed. Notable ones include:
- Open Source Physics (Java, with Easy Java Simulations)
Simulation Contexts
Computer simulations are used in a wide variety of practical contexts, such as:
- Analysis of air pollutant dispersion using atmospheric dispersion modeling
- Design of complex systems such as aircraft and logistics systems
- Design of noise barriers
- Flight simulators to train pilots
- Weather forecasting
- Emulation
- Forecasting of prices on financial markets (for example Adaptive Modeler)
- Behavior of structures under stress and other conditions
- Design of industrial processes, such as chemical processing plants
- Strategic management and organizational studies
- Reservoir simulation
- Process engineering simulation tools
- Robot simulators
- Urban simulation models