Programming Language Concepts - Functional Programming

Scheme: A Lisp Dialect

Developed at MIT in the 1970s by Guy L. Steele Jr. and Gerald Jay Sussman, Scheme emerged as a sophisticated effort to simplify and deeply understand the core principles of Lisp. It aimed to create a more minimalist and theoretically pure dialect, more closely aligning with Alonzo Church's lambda calculus.

It is widely used for teaching programming due to its minimalist syntax, which echoes Lisp's elegance. This simplicity makes it highly accessible for beginners, allowing them to grasp powerful functional programming concepts without being encumbered by complex syntactic rules. Scheme's clear and consistent semantic model significantly aids in developing a profound understanding of computation.

Key Features:

Static Scoping (Lexical Scoping): This fundamental characteristic ensures that variables are resolved based on their position within the code during compile-time/definition-time, rather than at runtime. This design choice contributes to more predictable and robust code, preventing common variable shadowing issues and enabling advanced constructs like closures (functions that capture their surrounding environment).
Functions as First-Class Entities: A cornerstone of functional programming, this means functions are treated with the same capabilities as any other data type (e.g., numbers, strings, booleans). Consequently, functions can be:
- Passed as arguments to other functions (enabling higher-order functions like map or filter).
- Returned as results from other functions (supporting dynamic function generation and currying).
- Assigned to variables or stored in data structures (allowing flexible manipulation and composition).
Distinct Boolean Evaluation from Lisp: Scheme refined certain behaviors inherited from earlier Lisp versions, introducing some incompatibilities for the sake of theoretical consistency. A notable distinction is in boolean evaluation:
- In traditional Lisp, an empty list (nil or ()) evaluates to false in a boolean context, similar to how many other languages treat empty collections as 'falsy'.
- In Scheme, any list (including an empty one, (), often denoted as '()) evaluates to true in a boolean context. Only the explicit boolean value #f (false) is considered false; all other values, including numbers, strings, and lists, are treated as true. This design choice underscores Scheme's strict adherence to a precise boolean logic.

Example of Boolean Difference:

(define a '())
(if a
    (display "true")
    (display "false")) ; In Scheme, this outputs "true"; in Lisp, it outputs "false"

In this Scheme illustration, a variable a is bound to the empty list (). The if expression evaluates a. Since () is not #f, Scheme interprets it as true, leading "true" to be displayed. Conversely, in classic Lisp, () is equivalent to nil, which is considered false, resulting in "false".

Functions in Functional Programming

Functions achieve first-class status, granting them significant operational flexibility that profoundly enhances a language's expressive power and promotes elegant programming paradigms:

Passed as arguments to subprograms: This enables the construction of higher-order functions that abstract over specific operations or control flows. Prime examples include map, filter, and reduce utilities, which accept a function to be applied to elements within a collection.
Returned as results from subprograms: This allows functions to dynamically generate or customize other functions, supporting sophisticated concepts such as currying, closures, and factory functions for creating specialized procedures.
Used as operands: Functions can be directly assigned to variables, stored within complex data structures (like lists, vectors, or hash maps), or incorporated into expressions just like any other data value. This intrinsic flexibility makes functions exceptionally composable and adaptable.

Example: Applying a Function to List Values

Consider the definition of a powerful higher-order function named apply-to-all in Scheme, which takes another function (fun) and a list of values, then processes each element by applying fun to it.

Function Definition:

(define (apply-to-all fun values)
    (if (null? values) '()
        (cons (fun (car values)) (apply-to-all fun (cdr values)))))

Let's break down this recursive definition:

(define (apply-to-all fun values)): This line declares a function named apply-to-all that expects two arguments: fun (intended to be a function) and values (expected to be a list).
(if (null? values) '()): This establishes the base case for the recursion. If values is an empty list (verified by null?), the function returns an empty list '(), preventing infinite recursion.
(cons (fun (car values)) (apply-to-all fun (cdr values))): This is the recursive step. (car values) extracts the first element of the list. (fun (car values)) applies the provided fun to this first element. (cdr values) retrieves the rest of the list (all elements except the first). (apply-to-all fun (cdr values)) recursively calls itself on the remainder of the list. Finally, cons constructs a new list by prepending the result of (fun (car values)) to the list produced by the recursive call.

This apply-to-all function demonstrates remarkable versatility, capable of evaluating any function passed to it, thereby showcasing the robust power of higher-order functions:

Example: Using with even? to filter even numbers from a list:

(apply-to-all even? '(3 8 12 5 22)) ; Returns (#f #t #t #f #t)

Here, the even? predicate is applied to each number in the list. For instance, 3 is not even (#f), while 8 is even (#t), and so forth. The outcome is a new list composed of boolean values, indicating the result of even? for each original element.

Example: Defining and applying a square function to a list:

(define (square x) (* x x))
(apply-to-all square '(2 5 8 100 12345678)) ; Returns (4 25 64 10000 152415765279684)

Initially, a square function is defined to compute the square of its input. Subsequently, apply-to-all utilizes this square function on every number within the provided list. The generated output is a new list containing the squares of the original numeric values.

Lisp Data Types and Structures

Lisp's fundamental design is anchored in two primary data object types: atoms and lists. This remarkably simple yet profoundly powerful foundational structure allows for the uniform representation of both data and executable code.

Atoms: These are the indivisible, elementary data elements in Lisp. They can take various forms:
- Numbers: Including integers (10, -3.5) and floating-point values.
- Symbols: Often used as identifiers (e.g., A, foo, +) or as symbolic data themselves, playing a crucial role in Lisp's symbolic computation capabilities.
- Strings: Sequences of characters enclosed in double quotes (e.g., "hello").
Lists: These are ordered collections that can contain zero or more Lisp objects. Crucially, a list's elements can themselves be either atoms or other lists, enabling deeply nested and complex data structures. Lists serve as the primary means of structuring data in Lisp.

The iconic S-expression (Symbolic Expression) is the parenthesized collection of sublists and/or atoms. This elegant syntax is central to Lisp's design, providing a universal representation for both data structures and program code.

Example: (A B (C D) E). This S-expression represents a list containing four elements: the atom A, the atom B, a nested sublist (C D) (which itself comprises two atoms), and the atom E. This recursive nature allows for the construction of arbitrarily complex data hierarchies.

Historically, Lisp was considered a typeless language in the sense that variables do not possess inherent, fixed types. Instead, the values themselves carry type tags, meaning type checking is predominantly dynamic, occurring at runtime rather than compile-time.

Internally, Lisp lists are constructed and stored as single-linked lists, often referred to as cons cells. Each cons cell is a pair with two pointers: the car (content of address register) points to the current element of the list, and the cdr (content of decrement register) points to the rest of the list (or nil to signify the end of the list). This elegant structure underpins the efficient execution of fundamental list manipulation operations like cons, car, and cdr.

Lisp Interpretation

Lambda notation is intrinsically used in Lisp for specifying and defining functions. Derived directly from Alonzo Church's lambda calculus, it offers a concise and powerful mechanism for creating anonymous functions, which are functions not bound to an identifier.

Example: (lambda (x y) (+ x y)) defines an anonymous function that accepts two arguments, x and y, and returns their sum. This function can then be immediately applied to specific arguments or bound to a variable.

One of Lisp's most distinguishing features is its homoiconicity, meaning that function applications and data share the exact same notation. This property allows for their interchangeable interpretation, treating code itself as a form of mutable data, which enables powerful metaprogramming capabilities.

Example: The S-expression (A B C) can be interpreted in two fundamental ways:

Data interpretation: When treated as data, (A B C) simply represents a list composed of three distinct atoms: A, B, and C. This is a literal piece of structured data.
Function application: If A is identified as a function or an operator, then (A B C) signifies the act of applying function A to the parameters B and C. In this context, the Lisp interpreter would first evaluate A to a procedure, and then evaluate B and C to their respective values before invoking A with these resulting arguments.

The Lisp interpreter dynamically determines the meaning of an S-expression based on the context, specifically whether the first element of a list evaluates to a recognized function or a special form.

Scheme Expressions

Basic Syntax

In Scheme, all computational constructs, whether representing raw data or executable program code, are uniformly built using S-expressions (Symbolic Expressions). This fundamental syntactic consistency is a defining trait and a cornerstone of Scheme's simplicity and power.

An expression in Scheme can fundamentally be either:

Atom: An indivisible unit of data, such as a number, a string, a boolean value, a character, or a symbol.
List: A finite sequence of zero or more S-expressions, always enclosed within parentheses.

Lists are formed by packaging expressions within parentheses. Conventionally, the initial element of a list typically designates the operation or function to be executed, with the subsequent elements serving as its arguments.

Types of Expressions

Scheme supports various intrinsic types of expressions for representing different kinds of values:

Numbers: Encompassing integers (e.g., 42), real/floating-point numbers (e.g., -3.14), and rational numbers (e.g., 1/2).
Strings: Textual data enclosed within double-quotes (e.g., "hello world").
Identifiers: Symbols employed as names for variables, functions, or special syntactic forms (e.g., my-variable, +, define).
Characters: Individual characters prefixed with #\ (e.g., #\a, #\space, #\newline).
Booleans: The primary boolean values are #t for true and #f for false. Notably, in Scheme's boolean context, all other values (including the empty list, numbers, and strings) are interpreted as true.

Example Value Representations:

42: An integer number literal.
"Scheme is fun!": A string literal.
#t: The boolean true value.
#\c: The character 'c'.
'symbol: A quoted symbol, which prevents its evaluation and treats symbol purely as data.
(list 1 2 3): An expression that, when evaluated, constructsa list containing the numeric values 1, 2, and 3.