NumPy Arrays
Introduction to NumPy Arrays
NumPy arrays are designed for data analysis, offering specific functionalities such as element-wise mathematics and matrix operations, unlike general-purpose Python data structures like lists, tuples, dictionaries, and sets.
NumPy arrays are similar to lists as they store sequences of objects but are optimized for numerical operations and handling multi-dimensional data.
Key Differences Between Lists and NumPy Arrays
Heterogeneous vs. Homogeneous Data: Lists can store items of different types (heterogeneous), whereas NumPy arrays store items of the same type (homogeneous).
Dimensionality: NumPy arrays have dimensions, allowing them to be one-dimensional (like lists), two-dimensional (tables or matrices), or multi-dimensional.
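The two differences above can be seen in a small sketch. The variable names are illustrative, and .ndim (the number of dimensions) is a standard array attribute that these notes do not otherwise cover.

```python
import numpy as np

# A list keeps each item's own type.
mixed_list = [1, 2.5, 3]
list_types = [type(item).__name__ for item in mixed_list]
print(list_types)  # ['int', 'float', 'int']

# An array converts every item to one common type.
mixed_array = np.array(mixed_list)
print(mixed_array.dtype)  # float64

# Arrays carry an explicit number of dimensions.
one_d = np.array([1, 2, 3])
two_d = np.array([[1, 2, 3], [4, 5, 6]])
print(one_d.ndim, two_d.ndim)  # 1 2
```

Here the integers are converted to floats so that all elements share a single type.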
Accessing NumPy Arrays
To use NumPy arrays, import the NumPy package:
    import numpy as np
The alias np is a common shorthand for NumPy.
If NumPy is not installed, it can be installed from the command line:
    pip install numpy
The Anaconda distribution of Python includes NumPy and other data science packages by default.
Creating NumPy Arrays
NumPy arrays can be created by passing a list into the np.array() function. Example:
    my_list = [1, 2, 3, 4]
    my_array = np.array(my_list)
    type(my_array)  # Output: numpy.ndarray
Multi-dimensional arrays can be created by passing a list of lists to np.array():
    my_array_2d = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
Array Attributes
.shape: Returns the dimensions of the array as a tuple; for a 2D array this is (rows, columns).
    my_array_2d.shape  # Output: (2, 4)
.size: Returns the total number of elements in the array.
    my_array_2d.size  # Output: 8
.dtype: Returns the data type of the elements in the array.
    my_array_2d.dtype  # Output: int64 (the default integer type can differ by platform)
Special Array Creation Functions
np.identity(n): Creates an identity matrix of dimension n × n, with ones on the diagonal.
    identity_matrix = np.identity(5)
np.ones(shape): Creates an array filled with ones.
    ones_array = np.ones((2, 4))
np.zeros(shape): Creates an array filled with zeros.
    zeros_array = np.zeros((4, 6))
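As a quick check of what these functions produce, the sketch below builds small versions of each array and inspects them. The size 3 for the identity matrix is chosen only to keep the printout short.

```python
import numpy as np

identity_matrix = np.identity(3)   # 3 x 3, ones on the diagonal
ones_array = np.ones((2, 4))       # 2 rows, 4 columns of ones
zeros_array = np.zeros((4, 6))     # 4 rows, 6 columns of zeros

print(identity_matrix)
print(ones_array.shape, zeros_array.shape)  # (2, 4) (4, 6)

# All three functions produce floating-point arrays by default.
print(ones_array.dtype)  # float64
```

Note that the shape is passed as a single tuple, which is why np.ones((2, 4)) has two sets of parentheses.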
Indexing and Slicing
Indexing and slicing in NumPy arrays are similar to Python lists.
    my_array = np.array([1, 2, 3, 4, 5])
    my_array[3]   # Returns 4 (index 3)
    my_array[3:]  # Returns [4, 5] (from index 3 to the end)
Reversing an array using slicing:
    my_array[::-1]  # Returns [5, 4, 3, 2, 1]
Indexing in Two-Dimensional Arrays
To index a 2D array, use comma-separated indices within square brackets: array[row, column].
    two_d_array = np.array([[1, 2, 3, 4, 5, 6],
                            [7, 8, 9, 10, 11, 12],
                            [13, 14, 15, 16, 17, 18]])
    two_d_array[1, 4]  # Returns 11 (row 1, column 4)
Slicing in two dimensions allows you to extract segments of the array.
Reversing a two-dimensional array:
    two_d_array[::-1, ::-1]  # Reverses both dimensions
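Extracting segments with two-dimensional slices might look like the following sketch. It rebuilds the same 3 x 6 array so that it runs on its own; the result names are illustrative.

```python
import numpy as np

two_d_array = np.array([[1, 2, 3, 4, 5, 6],
                        [7, 8, 9, 10, 11, 12],
                        [13, 14, 15, 16, 17, 18]])

first_row = two_d_array[0, :]      # every column of row 0
third_column = two_d_array[:, 2]   # every row of column 2
block = two_d_array[0:2, 1:4]      # rows 0-1, columns 1-3

print(first_row)     # [1 2 3 4 5 6]
print(third_column)  # [ 3  9 15]
print(block)         # [[ 2  3  4]
                     #  [ 8  9 10]]

# Reversing both dimensions moves the last element to the top-left corner.
reversed_both = two_d_array[::-1, ::-1]
print(reversed_both[0, 0])  # 18
```

As with lists, the end index of a slice is excluded, so 1:4 selects columns 1, 2, and 3.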
Array Manipulation
.reshape(shape): Changes the dimensions of an array; the total number of elements must stay the same. The function form np.reshape(array, shape) is equivalent.
    reshaped_array = np.reshape(two_d_array, (6, 3))
np.ravel(array, order='C' or 'F'): Unravels a multi-dimensional array into a one-dimensional array.
    order='C' unravels by rows (C style, the default).
    order='F' unravels by columns (Fortran style).
    raveled_array = np.ravel(two_d_array, order='C')
.flatten(): Flattens a multi-dimensional array into a one-dimensional array (returns a copy).
    flattened_array = two_d_array.flatten()
.T: Returns the transpose of a two-dimensional array (rows become columns and vice versa).
    transposed_array = two_d_array.T
np.flipud(array): Flips an array vertically (up-down).
np.fliplr(array): Flips an array horizontally (left-right).
np.rot90(array, k=n): Rotates an array by 90 degrees n times (counterclockwise by default).
np.roll(array, shift, axis): Shifts elements in an array along a specified axis; elements pushed past the end wrap around to the start.
    rolled_array = np.roll(two_d_array, 2, axis=1)  # shift each row by 2 columns
If the axis argument is not specified, the array is flattened before rolling and then restored to its original shape.
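The effect of these functions is easiest to see on a small array. The sketch below uses a hypothetical 2 x 3 array named small so the results fit in a comment.

```python
import numpy as np

small = np.array([[1, 2, 3],
                  [4, 5, 6]])

by_rows = np.ravel(small, order='C')     # [1 2 3 4 5 6]
by_columns = np.ravel(small, order='F')  # [1 4 2 5 3 6]

flipped_ud = np.flipud(small)   # [[4 5 6], [1 2 3]]
flipped_lr = np.fliplr(small)   # [[3 2 1], [6 5 4]]
rotated = np.rot90(small)       # [[3 6], [2 5], [1 4]]

# Rolling along axis=1 shifts within each row.
rolled_rows = np.roll(small, 1, axis=1)  # [[3 1 2], [6 4 5]]

# Without axis, the roll runs over the flattened array,
# and the result keeps the original 2 x 3 shape.
rolled_flat = np.roll(small, 1)          # [[6 1 2], [3 4 5]]

print(by_columns)
print(rotated)
print(rolled_flat)
```

flipud, fliplr, rot90, and roll are functions in the np namespace and are called with the array as an argument.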
Array Concatenation
np.concatenate((array1, array2), axis): Joins two arrays along a specified axis.
    new_array = np.array([[19, 20, 21], [22, 23, 24], [25, 26, 27]])
    concatenated_array = np.concatenate((two_d_array, new_array), axis=1)
When concatenating, the arrays must have the same shape in every dimension except the one being joined. Here both arrays have 3 rows, so they can be joined along axis=1 to give a 3 x 9 array.
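To illustrate how the axis argument interacts with array shapes, the sketch below joins arrays along both axes. The array extra_row is a hypothetical addition used only for this illustration.

```python
import numpy as np

two_d_array = np.array([[1, 2, 3, 4, 5, 6],
                        [7, 8, 9, 10, 11, 12],
                        [13, 14, 15, 16, 17, 18]])
new_array = np.array([[19, 20, 21], [22, 23, 24], [25, 26, 27]])

# axis=1 joins side by side: both arrays have 3 rows, so this works.
side_by_side = np.concatenate((two_d_array, new_array), axis=1)
print(side_by_side.shape)  # (3, 9)

# axis=0 stacks vertically: 6 columns vs. 3 columns do not line up.
try:
    np.concatenate((two_d_array, new_array), axis=0)
    vertical_join_worked = True
except ValueError:
    vertical_join_worked = False
print(vertical_join_worked)  # False

# A hypothetical row with 6 columns can be stacked along axis=0.
extra_row = np.array([[19, 20, 21, 22, 23, 24]])
stacked = np.concatenate((two_d_array, extra_row), axis=0)
print(stacked.shape)  # (4, 6)
```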
Element-Wise Math Operations
NumPy arrays allow for efficient element-wise math operations, which are faster than using loops with lists.
Scalar operations are applied to each element in the array.
    array = np.array([1, 2, 3, 4])
    array + 100  # Adds 100 to each element
    array * 2    # Multiplies each element by 2
    array ** 2   # Squares each element
    array % 2    # Modulus 2 for each element
Element-wise operations between two NumPy arrays of the same dimension:
    small_array = np.array([[1, 2], [3, 4]])
    small_array + small_array   # Adds corresponding elements
    small_array * small_array   # Multiplies corresponding elements
    small_array ** small_array  # Element-wise exponentiation
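For comparison with plain lists, the following sketch shows the same doubling done with a list comprehension and with an array. The names values, doubled_list, and repeated_list are illustrative and not part of the notes above.

```python
import numpy as np

values = [1, 2, 3, 4]

# With a list, element-wise math needs an explicit loop or comprehension.
doubled_list = [v * 2 for v in values]

# With an array, the operator is applied to every element at once.
array = np.array(values)
doubled_array = array * 2

print(doubled_list)   # [2, 4, 6, 8]
print(doubled_array)  # [2 4 6 8]

# The same operator on a list repeats the list instead of doing math.
repeated_list = values * 2
print(repeated_list)  # [1, 2, 3, 4, 1, 2, 3, 4]

# Element-wise exponentiation raises each element to its own power.
small_array = np.array([[1, 2], [3, 4]])
print(small_array ** small_array)  # [[  1   4]
                                   #  [ 27 256]]
```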
NumPy Math Functions
np.mean(array, axis): Calculates the mean of an array (optionally along a specified axis).
np.std(array): Calculates the standard deviation of an array.
np.sum(array, axis): Calculates the sum of elements in an array (optionally along a specified axis).
np.log(array): Calculates the natural logarithm of each element.
np.sqrt(array): Calculates the square root of each element.
np.dot(array1, array2): Calculates the dot product of two arrays (for 1D arrays) or performs matrix multiplication (for 2D arrays).
    row1 = two_d_array[0, :]
    row2 = two_d_array[1, :]
    dot_product = np.dot(row1, row2)
Matrix multiplication with np.dot():
    matrix_product = np.dot(small_array, small_array)
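The axis argument and the two behaviors of np.dot() can be checked with the arrays used earlier in these notes. The sketch below rebuilds them so it runs on its own; the result names are illustrative.

```python
import numpy as np

two_d_array = np.array([[1, 2, 3, 4, 5, 6],
                        [7, 8, 9, 10, 11, 12],
                        [13, 14, 15, 16, 17, 18]])
small_array = np.array([[1, 2], [3, 4]])

total = np.sum(two_d_array)                # 171, over all elements
column_sums = np.sum(two_d_array, axis=0)  # [21 24 27 30 33 36]
row_sums = np.sum(two_d_array, axis=1)     # [21 57 93]
row_means = np.mean(two_d_array, axis=1)   # [ 3.5  9.5 15.5]

# 1D inputs: dot product (multiply matching elements, then sum).
dot_product = np.dot(two_d_array[0, :], two_d_array[1, :])  # 217

# 2D inputs: matrix multiplication, which differs from element-wise *.
matrix_product = np.dot(small_array, small_array)  # [[ 7 10], [15 22]]
elementwise = small_array * small_array            # [[ 1  4], [ 9 16]]

print(column_sums, row_sums)
print(dot_product)
print(matrix_product)
```

axis=0 collapses the rows and gives one value per column, while axis=1 collapses the columns and gives one value per row.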
Conclusion
NumPy arrays are powerful and efficient for numerical calculations, especially with multi-dimensional data.
However, NumPy arrays are limited to homogeneous data types. For datasets with mixed data types, pandas DataFrames are typically used instead.