Getting Started with NumPy
What is NumPy?
NumPy, which stands for Numerical Python, is a popular library for Python that is used for numerical computations. Here are some reasons that make NumPy great:
comes with over 200 pre-defined methods and properties for manipulating data.
extremely performant in terms of computational speed and memory consumption compared to standard Python. This is because NumPy densely packs data of the same type into a contiguous array, in contrast to standard Python lists, which hold references to objects of possibly different types stored at different memory locations.
works well with other data-related libraries such as Pandas, Scikit-learn and Matplotlib. In fact, many data-related libraries are built on top of NumPy, so you can use NumPy arrays directly with these libraries.
Installing and importing NumPy
To install NumPy, refer to the official documentation. Once installed, we can import NumPy like so:
import numpy as np
Note that, by convention, we always use the alias np for NumPy.
Constructing a NumPy array
Unlike standard Python lists, NumPy arrays hold data of a single type. This means that you cannot have a NumPy array that stores strings and numbers side by side as different types; NumPy converts mixed input to one common type.
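As a small illustration of the single-type rule, the sketch below passes a mixed list to np.array and inspects the resulting data type. The variable name mixed is hypothetical and not part of the original tutorial.

```python
import numpy as np

# Hypothetical mixed input: one number and one string
mixed = np.array([3, "six"])

# NumPy chooses one common type for all elements;
# here the number is converted to a string
print(mixed.dtype.kind)  # 'U', i.e. a Unicode string dtype
print(repr(mixed[0]))    # the first element is now the string '3'
```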
We can construct a NumPy array from a Python standard list like so:
arr
array([3, 6, 2])
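A minimal sketch of this construction, assuming the array is built by passing the list [3, 6, 2] to np.array:

```python
import numpy as np

# Build a 1D NumPy array from a standard Python list
arr = np.array([3, 6, 2])

print(repr(arr))  # array([3, 6, 2])
```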
We can construct a two-dimensional NumPy array using a nested Python list like so:
arr_two
array([[6, 3, 1], [8, 4, 2]])
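A minimal sketch of the two-dimensional case, assuming the nested list shown in the output above is passed to np.array:

```python
import numpy as np

# Each inner list becomes one row of the 2D array
arr_two = np.array([[6, 3, 1], [8, 4, 2]])

print(arr_two.ndim)  # 2, since the array has two dimensions
```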
To get the number of values in each dimension, use the shape property:
(2, 3)
Here, we can interpret this array as a matrix with 2 rows and 3 columns.
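The shape lookup can be sketched as follows, assuming arr_two is the 2D array constructed earlier. The names n_rows and n_cols are hypothetical and only illustrate unpacking the tuple.

```python
import numpy as np

arr_two = np.array([[6, 3, 1], [8, 4, 2]])

# shape is a tuple with one entry per dimension
print(arr_two.shape)  # (2, 3)

# For a 2D array, the tuple unpacks into rows and columns
n_rows, n_cols = arr_two.shape
```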
We can also construct NumPy arrays using pre-defined functions. For instance, to construct a 2x3 two-dimensional array of zeros:
arr
array([[0., 0., 0.], [0., 0., 0.]])
Similarly, to construct a 2x3 two-dimensional array of ones:
arr
array([[1., 1., 1.], [1., 1., 1.]])
To construct a 3x3 identity matrix:
arr
array([[1., 0., 0.], [0., 1., 0.], [0., 0., 1.]])
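The three constructions above can be sketched with np.zeros, np.ones and np.eye. The exact calls used in the original are not shown, so this choice of functions is an assumption, and the variable names below are hypothetical (the tutorial reuses arr for each result).

```python
import numpy as np

# 2x3 array filled with zeros; the shape is passed as a tuple
zeros = np.zeros((2, 3))

# 2x3 array filled with ones
ones = np.ones((2, 3))

# 3x3 identity matrix (ones on the diagonal, zeros elsewhere)
identity = np.eye(3)

# All three default to floating-point values, hence the "0." and "1." output
print(zeros.dtype)  # float64
```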
Accessing values
Consider the following NumPy array:
arr
array([[6, 3, 1], [8, 4, 9], [3, 2, 0], [1, 5, 7]])
Accessing single values
To access the value at the second row and third column:
arr[1,2]
9
To access the value at the first row and the second-to-last column:
arr[0,-2]
3
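Both lookups can be reproduced with the self-contained sketch below, assuming arr is built with np.array from the values shown above. Indices start at 0, and negative indices count from the end.

```python
import numpy as np

arr = np.array([[6, 3, 1], [8, 4, 9], [3, 2, 0], [1, 5, 7]])

# Row index 1 (second row), column index 2 (third column)
print(arr[1, 2])   # 9

# Row index 0 (first row), column index -2 (second from the end)
print(arr[0, -2])  # 3
```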
Accessing rows
For your reference, we show arr here once again:
arr
array([[6, 3, 1], [8, 4, 9], [3, 2, 0], [1, 5, 7]])
To access the second row:
arr[1]
array([8, 4, 9])
To access the second and third rows:
arr[[1,2]]
array([[8, 4, 9], [3, 2, 0]])
To access the second (inclusive) to fourth (inclusive) rows:
arr[1:4]
array([[8, 4, 9], [3, 2, 0], [1, 5, 7]])
To access all rows from the third row (inclusive):
arr[2:]
array([[3, 2, 0], [1, 5, 7]])
To access all rows up to the second row (inclusive):
arr[:2]
array([[6, 3, 1], [8, 4, 9]])
To access all rows except the last two rows:
arr[:-2]
array([[6, 3, 1], [8, 4, 9]])
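The row selections in this section can be collected into one runnable sketch. The variable names are hypothetical; the key point is that a slice start:stop includes the row at index start and excludes the row at index stop.

```python
import numpy as np

arr = np.array([[6, 3, 1], [8, 4, 9], [3, 2, 0], [1, 5, 7]])

second_row = arr[1]           # row at index 1
rows_1_and_2 = arr[[1, 2]]    # rows at indices 1 and 2, via a list of indices
rows_1_to_3 = arr[1:4]        # rows at indices 1, 2 and 3 (index 4 is excluded)
from_index_2 = arr[2:]        # rows from index 2 to the end
first_two = arr[:2]           # rows at indices 0 and 1
all_but_last_two = arr[:-2]   # every row except the last two

print(rows_1_to_3.shape)  # (3, 3)
```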
Accessing columns
For your reference, we show arr here once again:
arr
array([[6, 3, 1], [8, 4, 9], [3, 2, 0], [1, 5, 7]])
To get the second column:
arr[:,1]
array([3, 4, 2, 5])
Here, the : before the comma indicates that we want to retrieve all rows of the second column.
To get the second and third columns:
arr[:,[1,2]]
array([[3, 1], [4, 9], [2, 0], [5, 7]])
The slicing syntax with : works exactly the same for columns as it does for rows. For instance, to get all columns starting from the second column:
arr[:,1:]
array([[3, 1], [4, 9], [2, 0], [5, 7]])
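The column selections can be sketched in the same way, assuming the same arr as before. The variable names are hypothetical.

```python
import numpy as np

arr = np.array([[6, 3, 1], [8, 4, 9], [3, 2, 0], [1, 5, 7]])

# ":" selects every row; the value after the comma selects columns
second_col = arr[:, 1]         # 1D array holding the column at index 1
cols_1_and_2 = arr[:, [1, 2]]  # columns at indices 1 and 2, via a list
from_col_1 = arr[:, 1:]        # all columns from index 1 onwards, via a slice

# For this array the list and the slice select the same columns
print(np.array_equal(cols_1_and_2, from_col_1))  # True
```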
Array mathematics
Consider the following two 2D arrays:
By default, standard arithmetic operations are performed element-by-element:
arr_one + arr_two
array([[5, 9], [8, 7]])
Here, the first value is 5 since the sum of the first value of arr_one (3) and the first value of arr_two (2) is 5.
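A runnable sketch of element-wise addition is given below. The definitions of the two arrays are not shown in the text, so the values here are assumptions: only the first entries (3 and 2) come from the explanation above, and the remaining entries were chosen so that the sum matches the output shown.

```python
import numpy as np

# Assumed values; only arr_one[0, 0] == 3 and arr_two[0, 0] == 2 are given in the text
arr_one = np.array([[3, 4], [5, 6]])
arr_two = np.array([[2, 5], [3, 1]])

# Each entry of the result is the sum of the entries at the same position
total = arr_one + arr_two
print(repr(total))  # array([[5, 9], [8, 7]])

# Other arithmetic operators behave the same way, e.g. element-wise product
product = arr_one * arr_two
```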
However, whenever there is a mismatch in the dimensions of the arrays, NumPy will attempt to perform a process known as broadcasting. As an example, consider the following arrays with different dimensions:
Adding the two arrays returns:
arr_one + arr_two
array([[5, 7], [6, 9]])
Here, the smaller array arr_two has been broadcast such that it is repeated as many times as needed for the dimensions to match up. In this case, NumPy is doing the following:
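Conceptually, the broadcast can be sketched as below. The original array definitions are not shown, so the values of arr_one and arr_two are assumptions chosen to reproduce the output above. np.broadcast_to is used here only to make the repetition visible; NumPy does not actually copy the smaller array when broadcasting.

```python
import numpy as np

# Assumed values, chosen so that the sum matches the output shown above
arr_one = np.array([[1, 2], [2, 4]])  # shape (2, 2)
arr_two = np.array([4, 5])            # shape (2,)

# arr_two is treated as if it were repeated along the rows
stretched = np.broadcast_to(arr_two, arr_one.shape)
print(repr(stretched))  # array([[4, 5], [4, 5]])

# Adding the stretched array gives the same result as adding arr_two directly
print(np.array_equal(arr_one + arr_two, arr_one + stretched))  # True
```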
Apart from basic arithmetic, NumPy also offers an extensive range of mathematical functions such as abs, log and sin.
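As a brief illustration, the sketch below applies np.abs, np.log and np.sin to small arrays. The input values are hypothetical and were picked so the results are easy to check by hand. Like the arithmetic operators, these functions act on each element separately.

```python
import numpy as np

# Absolute value of each element
abs_vals = np.abs(np.array([-3, 0, 2]))         # [3, 0, 2]

# Natural logarithm of each element
log_vals = np.log(np.array([1.0, np.e]))        # approximately [0, 1]

# Sine of each element (inputs in radians)
sin_vals = np.sin(np.array([0.0, np.pi / 2]))   # approximately [0, 1]
```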
Fancy indexing
Fancy indexing is used to access multiple values in a NumPy array by passing in an array as the index.
Accessing specific values
Consider the following 1D NumPy array:
arr
array([5, 2, 6, 7])
To access the values at indices 1, 0 and 3:
ind = [1,0,3]
arr[ind]
array([2, 5, 7])
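A self-contained version of this lookup, assuming arr is built with np.array from the values shown above. The last two lines are an extra illustration, not part of the original, showing that an index may appear more than once.

```python
import numpy as np

arr = np.array([5, 2, 6, 7])

# The result follows the order of the indices, not the order in arr
ind = [1, 0, 3]
picked = arr[ind]
print(repr(picked))  # array([2, 5, 7])

# Indices can repeat; the value is then returned once per occurrence
repeated = arr[[0, 0, 2]]
print(repr(repeated))  # array([5, 5, 6])
```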
Consider the following 2D NumPy array:
arr
array([[5, 8, 3], [6, 7, 2]])
To fetch multiple values in this array:
indices_row = [0,1,0]
indices_column = [2,0,1]
arr[indices_row, indices_column]
array([3, 6, 8])
Here, we're fetching the values at (0,2)=3, (1,0)=6 and (0,1)=8.
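The pairing of row and column indices can be made explicit with the sketch below, which compares the fancy-indexing result against a plain Python loop. The name manual is hypothetical and used only for this comparison.

```python
import numpy as np

arr = np.array([[5, 8, 3], [6, 7, 2]])

indices_row = [0, 1, 0]
indices_column = [2, 0, 1]

# The i-th row index is paired with the i-th column index
picked = arr[indices_row, indices_column]
print(repr(picked))  # array([3, 6, 8])

# Equivalent lookup written as an explicit loop over the index pairs
manual = [int(arr[r, c]) for r, c in zip(indices_row, indices_column)]
print(manual)  # [3, 6, 8]
```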
Initializing another array
Consider the following 1D NumPy array:
Suppose we wanted to create a 2D array using some of the values in arr. To do so, we must first create a 2D array of indices:
indices
array([[1, 3], [0, 0]])
Now, to create the array with the values that correspond to these indices:
arr[indices]
array([[8, 7], [5, 5]])
Notice how the shape of the resulting array is the same as that of the indices.
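A runnable sketch of this step is given below. The definition of arr is not shown in the text, so its values are partly assumed: the outputs above imply 5 at index 0, 8 at index 1 and 7 at index 3, while the value at index 2 (6 here) is a placeholder.

```python
import numpy as np

# Values at indices 0, 1 and 3 follow from the output above;
# the value at index 2 is an assumption
arr = np.array([5, 8, 6, 7])

# 2D array of indices into arr
indices = np.array([[1, 3], [0, 0]])

# Each index is replaced by the value of arr at that position
result = arr[indices]
print(repr(result))  # array([[8, 7], [5, 5]])

# The result takes its shape from the index array, not from arr
print(result.shape == indices.shape)  # True
```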
Assigning values
Consider the following NumPy array:
arr
array([[5, 8, 3], [6, 7, 2]])
To change the values 3 and 7:
indices_row = [0,1]
indices_column = [2,1]
arr[indices_row, indices_column] = 10
arr
array([[ 5, 8, 10], [ 6, 10, 2]])
Here, notice how we assigned a scalar value of 10 instead of [10,10]. A scalar value of 10 is simply broadcast (i.e. repeated) to match the appropriate size.
Of course, if you wanted to assign individual values instead, you could just supply an array, like so:
arr[indices_row, indices_column] = [10,20]
arr
array([[ 5, 8, 10], [ 6, 20, 2]])
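Both assignments can be run end to end with the sketch below, assuming arr starts from the values shown at the top of this section. The names after_scalar and after_list are hypothetical and only keep a copy of each intermediate state for inspection.

```python
import numpy as np

arr = np.array([[5, 8, 3], [6, 7, 2]])

# Positions (0, 2) and (1, 1), which hold the values 3 and 7
indices_row = [0, 1]
indices_column = [2, 1]

# Scalar assignment: 10 is written to every selected position
arr[indices_row, indices_column] = 10
after_scalar = arr.copy()  # [[5, 8, 10], [6, 10, 2]]

# List assignment: one value per selected position, in order
arr[indices_row, indices_column] = [10, 20]
after_list = arr.copy()    # [[5, 8, 10], [6, 20, 2]]
```

Note that the assignment modifies arr in place, which is why copies are taken here.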