1/92
Looks like no tags are added yet.
Name | Mastery | Learn | Test | Matching | Spaced | Call with Kai |
|---|
No analytics yet
Send a link to your students to track their progress
Data Type
defines a collection of data values and a set of predefined operations on those values
Descriptor
the collection of the attributes of a variable -- used for type checking and by allocation/deallocation operations
Descriptor -- static attributes
required only at compile time -- built by the compiler as part of the symbol table
Descriptor -- dynamic attributes
part or all of the descriptor must be maintained during execution
Primitive Data Types
those not defined in terms of other data types
4 numeric types
Integer -- Floating-point -- Complex -- Decimal
Integer
the most common primitive numeric data type
Java's 4 signed integer sizes
byte -- short -- int -- long
Floating-point
models real numbers but only as approximations -- stored in binary on most computers
2 properties defining floating-point values
Precision (accuracy of fractional part, measured in bits) -- Range (range of fractions and exponents)
Complex
each value consists of two floats: real part and imaginary part -- supported by Fortran and Python -- Python literal form: (7 + 3j)
Decimal
stores a fixed number of decimal digits with decimal point at a fixed position -- primary data type for business processing -- essential to COBOL
Advantage of Decimal type
accuracy of decimal values
2 disadvantages of Decimal type
limited range since no exponents are allowed -- representation wastes memory
Boolean
introduced by ALGOL 60 -- represents switches and flags -- two values: true and false -- enhances readability
Boolean in C89
no Boolean type -- uses int -- nonzero is true -- zero is false
Character types
stored as numeric codings -- ASCII (8-bit) or Unicode (16-bit)
ASCII
American Standard Code for Information Interchange -- 8-bit code -- traditionally most commonly used
Unicode (UCS-2)
16-bit character coding -- first widely used in Java
UCS-4 / UTF-32
4-byte character code developed by Unicode Consortium and ISO after 1991 -- described in ISO/IEC 10646 Standard published in 2000
Character String Type
values are sequences of characters
2 design issues for Character String Types
Is it a primitive type or a special kind of array? -- Is the length of objects static or dynamic?
5 typical string operations
Assignment -- Comparison -- Catenation -- Substring Reference -- Pattern Matching
C/C++ string library functions
strcpy (copy) -- strcat (catenate) -- strcmp (lexicographic compare) -- strlen (length excluding null)
C strings
terminated with null character represented by zero -- implemented as arrays of char
Java strings
String class (constant/immutable) -- StringBuffer class (changeable)
Python strings
immutable -- similar to Java's String class
3 string length options
Static Length -- Limited Dynamic Length -- Dynamic Length
Static Length String
length set when string is created -- immutable -- Java String class, C++ standard library, .NET (C#, F#)
Limited Dynamic Length String
varying length up to a declared fixed maximum -- example: C
Dynamic Length String
no maximum length -- requires dynamic storage allocation/deallocation overhead -- examples: Perl, JavaScript
Static Length String descriptor
compile-time -- 3 fields: name of type, type's length, address of first char
Limited Dynamic Length String descriptor
run-time -- stores fixed maximum length and current length
Dynamic Length String descriptor
run-time -- stores only current length -- allocation/deallocation is biggest implementation problem
3 approaches to dynamic string storage
linked list (extra storage, pointer chasing slows operations) -- arrays of pointers to heap characters (faster than linked list but extra memory) -- adjacent storage cells (slower alloc/dealloc but faster operations and less storage -- typically used)
Enumeration Type
all possible values, which are named constants, are enumerated in the definition -- implicitly assigned integer values 0, 1, ...
Enumeration constants
implicitly assigned 0, 1, ... but can be explicitly assigned any integer literal
Enumeration in Java
added in Java 5.0 -- all enum types are implicitly subclasses of predefined class Enum
Languages without enumeration types
Perl -- JavaScript -- PHP -- Python -- Ruby -- Lua
Array
a homogeneous aggregate of data elements -- individual element identified by its position relative to the first element
Indexing/Subscripting (array)
a mapping from indices to elements
Languages without array range checking
C -- C++ -- Perl -- Fortran
Languages with array range checking
Java -- ML -- C#
Ada array subscript notation
uses parentheses () -- reduces readability because () also used for subprogram parameters
C-based array subscript notation
uses square brackets []
2 distinct types in an array type
element type -- type of the subscripts
Associative Array
an unordered collection of data elements indexed by keys -- each element is a key-value pair
Associative Array in Perl
called hashes -- names begin with %
Associative Array in Python
called dictionaries
Lua Table
an associative array where both keys and values can be any type
Record
a possibly heterogeneous aggregate of data elements where individual elements are identified by names -- implemented as struct in C, C++, C#
Difference between record and array
record elements (fields) are not referenced by indices -- fields are named with identifiers
Tuple
a data type similar to a record except elements are not named
Python Tuple
immutable -- mixed types allowed -- indexing begins at 1 -- catenated with + -- deleted with del
ML/F# Tuples
used to allow functions to return multiple values
List (Scheme/Common Lisp)
delimited by parentheses -- elements not separated by punctuation -- data and code have the same form
Scheme list operation -- CAR
returns the first element of its list parameter
Scheme list operation -- CDR
returns the remainder of the list after the first element is removed
Scheme list operation -- CONS
puts its first parameter into its second parameter (a list) to make a new list
Scheme list operation -- LIST
returns a new list of its parameters
ML list operations
hd (head -- equivalent to CAR) -- tl (tail -- equivalent to CDR) -- :: for CONS -- elements in brackets separated by commas -- must be same type
Python List
mutable -- heterogeneous -- serves as Python's arrays -- indices begin at zero
Union
a type whose variables are allowed to store different type values at different times during execution
F# Union
declared with type statement using OR operators (|) -- accessed via pattern-matching structure
Pointer Type
variables have a range of values consisting of memory addresses and a special value nil
nil (pointer)
not a valid address -- indicates pointer cannot currently reference any memory cell
2 uses of pointers
provide indirect addressing -- manage dynamic memory (heap access)
2 fundamental pointer operations
assignment -- dereferencing
Dereferencing (C++)
explicitly specified with () as a prefix unary operator -- p -> age is equivalent to (p).age
C/C++ pointer operators
* for dereferencing -- & for address-of
Dangling Pointer
a pointer that points to a heap-dynamic variable that has been deallocated
4 reasons dangling pointers are dangerous
location may be reallocated to a new variable -- type checks become invalid -- new value has no relation to old dereferenced value -- storage manager may fail
C/C++ pointers
extremely flexible but dangerous -- can point anywhere in memory -- pointer arithmetic is possible
Type Checking
ensuring the operands of an operator are of compatible types
Compatible Type
legal for the operator OR allowed to be implicitly converted by compiler-generated code
Coercion
automatic/implicit type conversion -- example: int coerced to float when added to float in Java
Type Error
application of an operator to an operand of an inappropriate type
Static type bindings
nearly all type checking can be done at compile time
Dynamic type bindings
type checking must be done at run time
Languages with only dynamic type checking
JavaScript -- PHP
Strong Typing
a language is strongly typed if type errors are always detected -- types of all operands determinable at compile or run time
Strongly typed languages
Java -- C#
NOT strongly typed languages
C -- C++ (because union types are not type checked)
Advantage of strong typing
allows detection of misuses of variables that result in type errors
Coercion weakens strong typing
coercion results in loss of error detection -- C/C++ less reliable than ML/F# which have no coercions
Type Equivalence
two types are equivalent if an operand of one type can be substituted for the other without coercion
2 approaches to type equivalence
Name Type Equivalence -- Structure Type Equivalence
Name Type Equivalence
two variables have equivalent types if in the same declaration or using the same type name -- easy to implement but highly restrictive
Structure Type Equivalence
two variables have equivalent types if their types have identical structures -- more flexible but harder to implement
C type equivalence
uses both -- name equivalence for struct, enum, and union -- structure equivalence for other nonscalar types
Type Theory
a broad area of study in mathematics, logic, computer science, and philosophy
2 branches of Type Theory in CS
Practical (data types in commercial PLs) -- Abstract (typed lambda calculus)
Type System
a set of types and the rules that govern their use in programs