Skip to content

Numpy Crash course

This notes are notes for the Numpy crash course.

Numpy is a linear library for Python, it is an essential building block for other PyData ecosystem ( Pandas, scipy, scikit-lean, etc)

Using NumPy

Normally numpy is import as np

import numpy as np

NumPy Arrays

NumPy came in vector and matrices, Vectors are 1 dimension, matrix are 2D dimensions.

Create NumPy Arrays

we can create an array from a python list

1
2
3
my_list = [1,2,3]
np.array(my_list)
# array([1, 2, 3])

and can be a list of list as well

1
2
3
4
5
my_matrix = [[1,2,3],[4,5,6],[7,8,9]]
np.array(my_matrix)
#array([[1, 2, 3],
#       [4, 5, 6],
#       [7, 8, 9]])

Build-in Methods

arange

Return evenly spaced value within a given interval

1
2
3
4
np.arange(0,10)
# array([0,1,2,3,4,5,6,7,8,9])
np.arange(0,11,2)
# array([0, 2, 4, 6, 8, 10])

zeros and ones

Generate vectors or matrix of zeros or ones

np.zero(3)
# array([0. , 0. , 0.])

np.zero((5,5))
# array([[0., 0., 0., 0., 0.],
#        [0., 0., 0., 0., 0.],
#        [0., 0., 0., 0., 0.],
#        [0., 0., 0., 0., 0.],
#        [0., 0., 0., 0., 0.]])

np.ones(3)
# array([1., 1., 1.])

np.ones((3,3))
# array([[1., 1., 1.],
#       [1., 1., 1.],
#       [1., 1., 1.]])

linspace

Return evenly spaced numbers over a specific interval

np.linspace(0,10,3)
# array([0., 5., 10.])

np.linspace(0,5,20)
# array([0.        , 0.26315789, 0.52631579, 0.78947368, 1.05263158,
#       1.31578947, 1.57894737, 1.84210526, 2.10526316, 2.36842105,
#       2.63157895, 2.89473684, 3.15789474, 3.42105263, 3.68421053,
#       3.94736842, 4.21052632, 4.47368421, 4.73684211, 5.        ])

#Note that .linspace() includes the stop value. To obtain an array of common fractions, increase the number of items:

np.linspace(0,5,21)
# array([0.  , 0.25, 0.5 , 0.75, 1.  , 1.25, 1.5 , 1.75, 2.  , 2.25, 2.5 ,
#       2.75, 3.  , 3.25, 3.5 , 3.75, 4.  , 4.25, 4.5 , 4.75, 5.  ])

eye identity matrix

Create a identity matrix

1
2
3
4
5
np.eye(4)
# array([[1., 0., 0., 0.],
#       [0., 1., 0., 0.],
#       [0., 0., 1., 0.],
#       [0., 0., 0., 1.]])

Random

Here some of the ways we can create random numbers

rand

Creates an array of the given shape with random uniform distribution over [0,1)

1
2
3
4
5
6
7
8
np.random.rand(2)
# array([0.37065108, 0.89813878])
np.random.rand(5,5)
# array([[0.03932992, 0.80719137, 0.50145497, 0.68816102, 0.1216304 ],
#       [0.44966851, 0.92572848, 0.70802042, 0.10461719, 0.53768331],
#       [0.12201904, 0.5940684 , 0.89979774, 0.3424078 , 0.77421593],
#       [0.53191409, 0.0112285 , 0.3989947 , 0.8946967 , 0.2497392 ],
#       [0.5814085 , 0.37563686, 0.15266028, 0.42948309, 0.26434141]])

randint

This generate a integer from low (inclusive) to high (exclusive)

1
2
3
4
np.random.randint(1,100)
#61
np.random.randint(1,100,10)
# array([39, 50, 72, 18,27, 59, 15, 97, 11, 14])

seed

It is use to create a random state that can be reproducible, i means, the result will be the same everything we use the same seed

1
2
3
4
5
6
7
np.random.seed(42)
np.random.rand(4)
# array([0.37454012, 0.95071431, 0.73199394, 0.59865848])

np.random.seed(42)
np.random.rand(4)
# array([0.37454012, 0.95071431, 0.73199394, 0.59865848])

Array Attributes and Methods

To explain the attributes and methods we need to create a vector and matrix

1
2
3
4
5
arr = np.arange(25)
# array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
#       17, 18, 19, 20, 21, 22, 23, 24])
ranarr = np.random.randint(0,50,10)
# array([38, 18, 22, 10, 10, 23, 35, 39, 23,  2])

Reshape - reshape

Return the same data of the vector or matrix but in a different shape

1
2
3
4
5
6
arr.reshape(5,5)
# array([[ 0,  1,  2,  3,  4],
#       [ 5,  6,  7,  8,  9],
#       [10, 11, 12, 13, 14],
#       [15, 16, 17, 18, 19],
#       [20, 21, 22, 23, 24]])

max, min, argmax, argmin - max,min,argmax,argmin

Let start with ranarr

ranarr = np.random.randint(0,50,10)
# array([38, 18, 22, 10, 10, 23, 35, 39, 23,  2])

the maximum number in the array

ranarr.max()
# 39
the index of this maximum number
ranarr.argmax()
# 7
now for the minimum
1
2
3
4
ranarr.min()
# 2
ranarr.argmin()
# 9

Shape - shape

shape is an attribute and not a method

# Vector
arr.shape
#(25,)

# Notice the two sets of brackets
arr.reshape(1,25)
# array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15,
#        16, 17, 18, 19, 20, 21, 22, 23, 24]])

arr.reshape(1,25).shape
# (1, 25)

arr.reshape(25,1)
#array([[ 0],
#       [ 1],
#       [ 2],
#       [ 3],
#       [ 4],
#       [ 5],
#       [ 6],
#       [ 7],
#       [ 8],
#       [ 9],
#       [10],
#       [11],
#       [12],
#       [13],
#       [14],
#       [15],
#       [16],
#       [17],
#       [18],
#       [19],
#       [20],
#       [21],
#       [22],
#       [23],
#       [24]])

arr.reshape(25,1).shape
# (25, 1)

dtype

In order to know the data type of the object

1
2
3
4
5
6
7
arr.dtype
# dtype('int32')

arr2 = np.array([1.2, 3.4, 5.6])
arr2.dtype

#dtype('float64')

Numpy Indexing and Selection

To select an item in the array we can use a syntax similar to the one use to pick up elements of a list, in the following example we will:

1. Create an array

import numpy as np#create an array
arr = np.arange(0,11)

2. Select a single element

arr[7]
#7

3. Select a range of elements

arr[0:5]
#array([0,1,2,3,4])\

Broadcasting

The differences between Python list and Numpy arrays can be simplify as; python list you can only reassign values to part of the list with the same size and shape, if you want to replace X number of elements you will need to pass in a new x element list, this is explain better with an example.

In the example: 1. Create an array. 2. Slice part of the array. 3. We will change the sliced array. 4. Display the original array.

Notice the elements of the array, that belong to the sliced array, were change. This is because the data is not copied in order to avoid memory problems.

numpy broadcasting

import numpy as np
#create an array
arr= np.arange(0,10)
#slice the array
sliced_arr = arr[0:6]
#change the values in the slice
sliced_arr[:] = 100
# print the original array to show the changes
print(arr)
#array([100, 100, 100, 100, 100, 100,   6,   7,   8,   9])

If you want to make a copy of the array you can use copy() like new_arr = arr.copy()

Indexing 2d arrays (matrices)

The syntax will be arr_2d[row][col] or arr_2d[row,col], the latter the most common used.

1. Create the matrix

import numpy as np
arr_2d = np.arange([5,10,15],[20,25,30],[35,40,45])

2. Select base in index

a row

arr_2d[1]
#array([20,25,30])
a value
arr_2d[1][0]
#20

3. select a matrix inside the matrix

1
2
3
arr_2d[:2,1:] # top right corner
# array([25,30],
#       [40,45])

Conditional selection

We can select elements of the arrays base in a condition, let say we want to know what elements are bigger than 4.

1
2
3
4
5
import numpy as np
arr = np.arange(0,10)
print(arr>4)
#array([False, False, False, False,  True,  True,  True,  True,  True,
#        True])
we can save this array of boolean values and use it to slice or select elements of the original array base in this condition

1
2
3
4
5
bool_arr = arr>4
arr[bool_arr]
# array([ 5,  6,  7,  8,  9, 10])
arr[arr>4]
# array([ 5,  6,  7,  8,  9, 10])

Operations

Arithmetic

Numpy allows operation including matrix with matrix and scalar with matrix.

1. Addition, multiplication , subtraction and division

import numpy as np

arr = np.arange(0,10)
arr
#array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

#addition
arr + arr
#array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18])

# Multiplication
arr * arr
#array([ 0,  1,  4,  9, 16, 25, 36, 49, 64, 81])

#Subtraction
arr - arr
#array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

#Division
arr/arr
#array([nan,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.])

numpy will notify us when the division is not possible or the division by 0

1
2
3
1/arr
#array([       inf, 1.        , 0.5       , 0.33333333, 0.25      ,
#       0.2       , 0.16666667, 0.14285714, 0.125     , 0.11111111])

and we have the exponential as well

arr**3
#array([  0,   1,   8,  27,  64, 125, 216, 343, 512, 729])

Universal Array function

With Numpy we can perform different function to the matrices, square root, logarithmic and geometric functions.

 np.sqrt(arr)
 #array([0.        , 1.        , 1.41421356, 1.73205081, 2.        ,
 #      2.23606798, 2.44948974, 2.64575131, 2.82842712, 3.        ])

# Exponential (e^)
np.exp(arr)
#array([1.00000000e+00, 2.71828183e+00, 7.38905610e+00, 2.00855369e+01,
#       5.45981500e+01, 1.48413159e+02, 4.03428793e+02, 1.09663316e+03,
#       2.98095799e+03, 8.10308393e+03])

#Trigonometric
np.sin(arr)
#array([ 0.        ,  0.84147098,  0.90929743,  0.14112001, -0.7568025 ,
#       -0.95892427, -0.2794155 ,  0.6569866 ,  0.98935825,  0.41211849])

#Natural Logarithm
np.log(arr)
#array([      -inf, 0.        , 0.69314718, 1.09861229, 1.38629436,
#       1.60943791, 1.79175947, 1.94591015, 2.07944154, 2.19722458])

Statistics

as an example of the statistic function that can be perform in numpy we have sum, mean and max

arr = np.arange(0,10)

arr.sum()
#45

arr.mean()
#4.5

arr.max()
#9

and other examples of statistic functions

other example of Statistics operation

Axis Logic

Wen we work with 2D arrays (Matrix) the array term , axis 0 is the vertical axis ( rows ), and axis 1 is the horizontal ( columns )

so let do sum on the 0 axis, basically sum all the elements vertically, it make sense after the code.

arr_2d = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12]])
arr_2d
#array([[ 1,  2,  3,  4],
#       [ 5,  6,  7,  8],
#       [ 9, 10, 11, 12]])

arr_2d.sum(axis=0)
#array([15, 18, 21, 24])
#[(1+5+9), (2+6+10), (3+7+11), (4+8+12)]

arr_2d.sum(axis=1)
#array([10, 26, 42])