The NumPy Handbook

Get started with NumPy stress-free

Welcome to this comprehensive guide on NumPy! Whether you're a beginner or looking to refresh your knowledge, this blog has got you covered. We'll explore the fundamental aspects of NumPy, including array attributes, creation methods, and powerful array manipulations, all the way up to advanced techniques like Boolean indexing.

In this concise and quick-to-read blog, you'll gain a solid understanding of NumPy and its capabilities. By the end, you'll be equipped with the essential tools to start coding with confidence.

Let's dive in and unlock the power of NumPy together!


When is NumPy useful?

NumPy arrays are useful when:

  1. We are working with numbers and mathematical operations on them (sine, cosine, and so on) rather than strings.

  2. Elements are homogeneous in size and type. To work around this constraint, we can create an array with dtype=object that holds arbitrary Python objects (including arrays of different types), at the cost of speed.

  3. The structure has a fixed size and does not need to grow or shrink. Otherwise, a list is preferable.
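
To make the homogeneity point concrete, here is a small sketch (the variable names are only illustrative) showing that NumPy picks one common type for all elements, and that dtype=object is the escape hatch when the values really are mixed:

```python
import numpy as np

# All integers: NumPy picks an integer dtype (the exact width depends on the platform)
ints = np.array([1, 2, 3])

# Mixing ints and floats: every element is upcast to float64
mixed = np.array([1, 2.5, 3])

# Truly mixed values only fit in an object array, which stores Python objects
objects = np.array([1, "two", 3.0], dtype=object)

print(ints.dtype)
print(mixed.dtype)    # float64
print(objects.dtype)  # object
```

Object arrays lose most of the speed benefit, so they are best treated as a last resort.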


Arrays and Attributes

  1. arr.ndim : number of axes (dimensions) of the array

  2. arr.shape : the size of the array along each dimension. For a matrix it is (rows, columns); for a 3-D array it is (number of 2-D blocks, rows in each block, columns in each block)

  3. arr.size : total number of elements in the array

     import numpy as np
     arr = np.array([[1, 2, 3], [4, 5, 6]])
     print("Array:", arr)
     >>Out: Array: [[1 2 3]
      [4 5 6]]
     print("Shape:", arr.shape)
     >>Out: Shape: (2, 3)
     print("Size:", arr.size)
     >>Out: Size: 6
     print("Number of Dimensions:", arr.ndim)
     >>Out: Number of Dimensions: 2
    

What do you think is the output for this code snippet?

import numpy as np
a = np.array([[1,2,3],[10,20,30,40]])
print(a.shape)
print(a.size)
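
Answer: the two rows have different lengths, so NumPy cannot build a regular 2-D array. In NumPy 1.24 and later the snippet above raises a ValueError; older versions only warned and fell back to an array of objects. The sketch below asks for that fallback explicitly with dtype=object, which should behave the same way across versions:

```python
import numpy as np

rows = [[1, 2, 3], [10, 20, 30, 40]]

# Explicitly request an object array: each element is a Python list
a = np.array(rows, dtype=object)

print(a.shape)  # (2,)
print(a.size)   # 2
```

The result is a 1-D array holding two lists, not a 2-D array of numbers.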

Creating arrays

From List

import numpy as np
list_2d = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
array_2d = np.array(list_2d)
list_3d = [[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]], [[13, 14, 15], [16, 17, 18]]]
array_3d = np.array(list_3d)

print("2D Array:")
print(array_2d)
print("Shape of 2D Array:", array_2d.shape)
print()

print("3D Array:")
print(array_3d)
print("Shape of 3D Array:", array_3d.shape)

From Tuples

import numpy as np
tuple_data = ((1, 2, 3), (4, 5, 6))
array_2d = np.array(tuple_data)
print(array_2d)
>>Out: [[1 2 3]
 [4 5 6]]

Using arange() and linspace()

The key difference between np.arange() and np.linspace() is that np.arange() generates values based on a step size, while np.linspace() generates values based on the desired number of elements within a given interval.

import numpy as np
#using np.arange(start,stop,step)
array_arange = np.arange(0, 10, 2)
print("Array using np.arange():", array_arange)
>>Out: Array using np.arange(): [0 2 4 6 8]
# Using np.linspace(start,stop,num)
array_linspace = np.linspace(0, 1, 5)
print("Array using np.linspace():", array_linspace)
>>Out: Array using np.linspace(): [0.   0.25 0.5  0.75 1.  ]
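
One more difference worth seeing side by side: np.arange() excludes the stop value, while np.linspace() includes it by default. The following is a minimal sketch; the variable names are just for illustration:

```python
import numpy as np

# Step-based: 1 is the stop value and is NOT included
by_step = np.arange(0, 1, 0.25)

# Count-based: 5 points, and the stop value 1 IS included
by_count = np.linspace(0, 1, 5)

# endpoint=False makes linspace leave out the stop value as well
no_end = np.linspace(0, 1, 4, endpoint=False)

print(by_step)   # [0.   0.25 0.5  0.75]
print(by_count)  # [0.   0.25 0.5  0.75 1.  ]
print(no_end)    # [0.   0.25 0.5  0.75]
```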

Using Random

import numpy as np

# Generate a 1D array of 5 random integers between 0 and 9 (10 is excluded)
array_random_integers = np.random.randint(10, size=5)
print("Random integers array:", array_random_integers)
>>Out: Random integers array: [2 4 1 6 8]  (values differ on every run)
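
If you need the same random numbers on every run (for tests or tutorials), one option is NumPy's Generator API with a fixed seed. This is a sketch of that alternative, not a replacement for the snippet above; the seed value 42 is arbitrary:

```python
import numpy as np

# Two generators created with the same seed produce the same sequence
rng = np.random.default_rng(seed=42)
draws = rng.integers(0, 10, size=5)        # integers from 0 to 9

rng_again = np.random.default_rng(seed=42)
same_draws = rng_again.integers(0, 10, size=5)

print(np.array_equal(draws, same_draws))  # True
```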

Accessing and Manipulations on Array

Accessing an element

Indexing starts from 0. Negative indices are allowed and count backwards from the end, so -1 refers to the last element.

import numpy as np
array = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(array[1,2])
>>Out: 6
array_3d = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]], [[13, 14, 15], [16, 17, 18]]])
print(array_3d[1, 0, 2])
>>Out: 9
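
Negative indexing on the same 2D array can be sketched like this (the variable names are illustrative):

```python
import numpy as np

array = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

last_row = array[-1]          # last row
last_element = array[-1, -1]  # last row, last column
second_last = array[0, -2]    # first row, second-to-last column

print(last_row)      # [7 8 9]
print(last_element)  # 9
print(second_last)   # 2
```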

Sorting

import numpy as np
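array = np.array([3, 1, 5, 2, 4])

# Sort the array in ascending order (np.sort returns a sorted copy)
sorted_arr = np.sort(array)
print("Sorted array:", sorted_arr)
>>Out: Sorted array: [1 2 3 4 5]

# Sort the array in descending order by reversing the ascending result
sorted_arr_desc = np.sort(array)[::-1]
print("Sorted array (descending):", sorted_arr_desc)
>>Out: Sorted array (descending): [5 4 3 2 1]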
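
Two related tools are worth a quick look: np.argsort(), which returns the indices that would sort the array, and the axis argument of np.sort() for 2D arrays. The example data below is made up for illustration:

```python
import numpy as np

array = np.array([3, 1, 5, 2, 4])

# Indices that would put the array in ascending order
order = np.argsort(array)
print(order)         # [1 3 0 4 2]
print(array[order])  # [1 2 3 4 5]

grid = np.array([[9, 1, 5],
                 [3, 7, 2]])

# axis=1 sorts within each row, axis=0 sorts within each column
rows_sorted = np.sort(grid, axis=1)
cols_sorted = np.sort(grid, axis=0)
print(rows_sorted)
print(cols_sorted)
```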

Insertion and Deletion

import numpy as np
array = np.array([1, 2, 3, 4, 5])
# Insertion using np.insert(array_name, index, value); returns a new array
array = np.insert(array, 2, 10)
print("Array after insertion:", array)
>>Out: Array after insertion: [ 1  2 10  3  4  5]
# Deletion using np.delete(array_name, index); returns a new array
array = np.delete(array, 4)
print("Array after deletion:", array)
>>Out: Array after deletion: [ 1  2 10  3  5]
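
Note that np.insert() and np.delete() do not change the array you pass in; they return a new one. A short sketch, with illustrative names, that also deletes several indices in one call:

```python
import numpy as np

original = np.array([1, 2, 3, 4, 5])

inserted = np.insert(original, 2, 10)   # new array with 10 at index 2
deleted = np.delete(original, [0, 4])   # new array without indices 0 and 4

print(original)  # [1 2 3 4 5]  (unchanged)
print(inserted)  # [ 1  2 10  3  4  5]
print(deleted)   # [2 3 4]
```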

Reshape

import numpy as np
original_array = np.array([[1, 2, 3], [4, 5, 6]])
reshaped_array = original_array.reshape((1, 6))
print(reshaped_array)
>>Out: [[1 2 3 4 5 6]]

What will be the output of the code snippet below:

import numpy as np
original_array = np.arange(45)
reshaped_array = original_array.reshape((5, -1))
print(reshaped_array)

Answer: The -1 in the reshape call tells NumPy to calculate that dimension automatically from the total number of elements. Here the original array has 45 elements, so the -1 is replaced with 45 / 5 = 9 and the reshaped array has shape (5, 9). Use this when you know the number of rows you want and let NumPy figure out the rest.
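
The same idea works for either dimension, and it fails when the sizes do not divide evenly. A minimal sketch, where the (7, -1) case is chosen only to trigger the error:

```python
import numpy as np

original_array = np.arange(45)

five_rows = original_array.reshape((5, -1))
five_cols = original_array.reshape((-1, 5))
print(five_rows.shape)  # (5, 9)
print(five_cols.shape)  # (9, 5)

# 45 is not divisible by 7, so NumPy cannot infer the missing dimension
message = None
try:
    original_array.reshape((7, -1))
except ValueError as error:
    message = str(error)
print(message)
```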

Slicing

import numpy as np
array = np.array([[1, 2, 3, 4],
                  [5, 6, 7, 8],
                  [9, 10, 11, 12]])

# Slicing rows and columns
print("First row:", array[0, :])  # Output: [1 2 3 4]
print("Second column:", array[:, 1])  # Output: [2 6 10]

# Slicing a sub-array
print("Sub-array (2x3):")
print(array[1:3, 0:3])  # rows 1 and 2, columns 0 to 2 (the stop index is excluded)
# Output:
# [[ 5  6  7]
#  [ 9 10 11]]

# Step slicing
print("Every alternate element in the first row:", array[0, ::2])  # Output: [1 3]
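
One behaviour that often surprises beginners: a basic slice is a view onto the original data, not a copy, so writing to the slice also changes the original array. The sketch below uses illustrative names and .copy() to get an independent array:

```python
import numpy as np

array = np.array([[1, 2, 3, 4],
                  [5, 6, 7, 8],
                  [9, 10, 11, 12]])

# A slice shares memory with the original array
view = array[0, :2]
view[0] = 100
print(array[0, 0])  # 100

# .copy() creates independent data, so the original stays untouched
independent = array[1, :2].copy()
independent[0] = -1
print(array[1, 0])  # 5
```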

Boolean Indexing

Boolean indexing allows you to filter or pick entries from an array based on a boolean condition. You index the array with a boolean array (or an expression that produces one), and the result is a new array containing only the elements whose matching boolean value is True.

import numpy as np
array = np.array([1, 2, 3, 4, 5, 6])
# Create a boolean array based on a condition
condition = array > 3
# [False, False, False, True, True, True]
# Use the boolean array to index the original array
filtered_array = array[condition]
# [4, 5, 6]

Boolean indexing provides a powerful and flexible way to select elements from an array based on specific conditions, allowing you to perform operations and computations on the selected elements efficiently.
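
Conditions can be combined with & (and) and | (or), each wrapped in parentheses, and a boolean mask can also be used to assign values. A small sketch with made-up thresholds:

```python
import numpy as np

array = np.array([1, 2, 3, 4, 5, 6])

# Combine conditions: even AND greater than 2
even_and_large = array[(array % 2 == 0) & (array > 2)]
print(even_and_large)  # [4 6]

# Assign through a mask: cap every value above 4 at 4
capped = array.copy()
capped[capped > 4] = 4
print(capped)  # [1 2 3 4 4 4]
```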

Mathematical Operations

Operations on arrays are vectorized: they are applied element-wise to the whole array at once, without writing an explicit Python loop.

import numpy as np

# Create an array
arr = np.array([1, 2, 3, 4, 5])
# Add 2 to each element
arr_plus_2 = np.add(arr, 2)
print(arr_plus_2)
>>Out: [3 4 5 6 7]
# Subtract 2 from each element
arr_minus_2 = np.subtract(arr, 2)
print(arr_minus_2)
>>Out: [-1  0  1  2  3]
# Square each element
arr_squared = np.square(arr)
print(arr_squared)
>>Out: [ 1  4  9 16 25]
# Compute the sine of each element (values are in radians)
arr_sin = np.sin(arr)
print(arr_sin)
>>Out: [ 0.84147098  0.90929743  0.14112001 -0.7568025  -0.95892427]
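
The arithmetic operators give the same results as the functions above, and they also work element-wise between two arrays of the same shape. A brief sketch, where the second array is invented for the example:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

plus_2 = arr + 2      # same as np.add(arr, 2)
minus_2 = arr - 2     # same as np.subtract(arr, 2)
squared = arr ** 2    # same as np.square(arr)

# Element-wise product of two arrays with the same shape
weights = np.array([10, 20, 30, 40, 50])
weighted = arr * weights
print(weighted)  # [ 10  40  90 160 250]
```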

Linear Algebra

The Dot Product

The dot product is an operation that takes two vectors and returns a scalar value. In NumPy, you can compute the dot product of two vectors using the np.dot() function or the @ operator.

The dot product of two vectors is only defined when they have the same number of elements. If the sizes differ, NumPy raises a ValueError. For 2D arrays, np.dot() and @ perform matrix multiplication, so the number of columns in the first array must match the number of rows in the second.

import numpy as np
array1 = np.array([[1, 2], [3, 4]])
array2 = np.array([[5, 6], [7, 8]])

# Compute the dot product (matrix product for 2D arrays) using np.dot()
dot_product = np.dot(array1, array2)
print(dot_product)
>>Out: [[19 22]
 [43 50]]

# Alternatively, compute the same product using the @ operator
dot_product_alt = array1 @ array2
print(dot_product_alt)
>>Out: [[19 22]
 [43 50]]
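
For completeness, here is the 1D case described above, where the result is a single number, together with the error you get when the sizes do not match. The vectors are made up for illustration:

```python
import numpy as np

v = np.array([1, 2, 3])
w = np.array([4, 5, 6])

# 1*4 + 2*5 + 3*6 = 32
scalar = np.dot(v, w)
print(scalar)  # 32
print(v @ w)   # 32

# Vectors of different sizes cannot be combined
mismatch_error = None
try:
    np.dot(v, np.array([1, 2]))
except ValueError as error:
    mismatch_error = error
print(type(mismatch_error).__name__)  # ValueError
```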

Eigenvalues and Eigenvectors

In linear algebra, given a square matrix, an eigenvector is a non-zero vector that only changes by a scalar factor when a linear transformation is applied to it. The corresponding scalar factor is called the eigenvalue. Eigenvectors and eigenvalues are used to analyze and understand linear transformations and systems of linear equations.

In NumPy, you can use the np.linalg.eig() function to compute the eigenvalues and eigenvectors of a matrix. Here's an example:

import numpy as np
matrix = np.array([[3, 1], [2, 2]])

# Compute eigenvalues and eigenvectors
eigenvalues, eigenvectors = np.linalg.eig(matrix)
print(eigenvalues)
>>Out: [4. 1.]
print(eigenvectors)
>>Out: [[ 0.70710678 -0.4472136 ]
 [ 0.70710678  0.89442719]]
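
Each column of the eigenvectors array is one eigenvector, paired with the eigenvalue at the same position. A quick way to convince yourself is to check the defining relation (matrix times vector equals eigenvalue times vector) for every pair, as in this sketch:

```python
import numpy as np

matrix = np.array([[3, 1], [2, 2]])
eigenvalues, eigenvectors = np.linalg.eig(matrix)

# Check matrix @ v == eigenvalue * v for each (eigenvalue, column) pair
checks = []
for i, value in enumerate(eigenvalues):
    vector = eigenvectors[:, i]   # i-th column is the i-th eigenvector
    checks.append(np.allclose(matrix @ vector, value * vector))

print(checks)  # [True, True]
```

np.allclose() is used instead of == because the results are floating-point numbers and may differ by tiny rounding errors.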

Conclusion

That's it for this beginner-friendly guide. I hope you found it useful. Please like, share, and follow for more!