Last time, we saw how to use matplotlib
to create plots. As an example, let's plot $y=\sin(x)$ and $y=\cos(x)$ for $0 \leq x \leq 2\pi$.
import matplotlib.pyplot as plt
First, let's generate a list of $x$-values that we will use for plotting. Let's use N=1000
data points.
from math import pi
x_list = []
a = 0 # Left-end of the x-interval
b = 2*pi # Right-end of the x-interval
N = 1000 # Number of sub-intervals to divide (a,b) into
dx = (b - a)/N # Width of each sub-interval
for i in range(N+1):
x = a + i*dx
x_list.append(x)
Now, we want to plug each one of these $x$-values into the $\sin$ and $\cos$ functions:
from math import sin, cos
sin_x_list = []
cos_x_list = []
for x in x_list:
sin_x_list.append(sin(x))
cos_x_list.append(cos(x))
Now plot:
plt.plot(x_list, sin_x_list, 'r-', label='$y=\sin(x)$')
plt.plot(x_list, cos_x_list, 'b-', label='$y=\cos(x)$')
plt.xlabel('$x$')
plt.ylabel('$y$')
plt.title('Sine and cosine')
plt.legend()
Exercise: Rewrite the cells above to use list comprehensions instead for
loops to generate x_list
, sin_x_list
, cos_x_list
.
from math import pi, sin, cos
a = 0 # Left-end of the x-interval
b = 2*pi # Right-end of the x-interval
N = 1000 # Number of sub-intervals to divide (a,b) into
dx = (b - a)/N # Width of each sub-interval
x_list = [a + i*dx for i in range(N+1)]
sin_x_list = [sin(x) for x in x_list]
cos_x_list = [cos(x) for x in x_list]
plt.plot(x_list, sin_x_list, 'r-', label='$y=\sin(x)$')
plt.plot(x_list, cos_x_list, 'b-', label='$y=\cos(x)$')
plt.xlabel('$x$')
plt.ylabel('$y$')
plt.title('Sine and cosine')
plt.legend()
Some thoughts:
x_list
, where we take evenly spaced points spanning an intervalsin(x_list)
).sin(x_list)
The NumPy module contains many useful tools for numerical calculations in Python.
import numpy as np
The basic building blocks from NumPy are arrays, which in many ways behave like lists. We can define an array using np.array
:
my_array = np.array([1,2,3,4,5,6])
my_list = [1,2,3,4,5,6]
type(my_list)
type(my_array)
We access elements and slices of arrays just like lists:
my_array[3]
my_array[-1]
my_array[1:5]
However, NumPy arrays behave very differently than lists with arithmetic operations:
my_list * 3
my_array * 3
We see that:
In general, arithmetic operations on arrays are done element-wise.
my_array + 4
my_array % 2
my_array ** 3
3 ** my_array
Can we take the sin
and cos
of arrays?
sin(my_array)
Note: We imported the sin
and cos
functions from the math
module, but they are not designed to work with arrays. Instead, we can either import sin
and cos
from the numpy
module, or we can use np.sin
and np.cos
.
np.sin(my_array)
What about generating arrays of evenly spaced points? We can use the np.linspace
:
help(np.linspace)
Repeating our first exercise now with arrays:
a = 0
b = 2*pi
N = 1000
x_array = np.linspace(a,b,N+1)
sin_x_array = np.sin(x_array)
cos_x_array = np.cos(x_array)
plt.plot(x_array, sin_x_array, 'r-', label='$y=\sin(x)$')
plt.plot(x_array, cos_x_array, 'b-', label='$y=\cos(x)$')
plt.xlabel('$x$')
plt.ylabel('$y$')
plt.title('Sine and cosine')
plt.legend()
Let's compare evaluation times between for
loops, list comprehensions, and numpy
arrays:
from time import time
a = 0 # Left-end of the x-interval
b = 2*pi # Right-end of the x-interval
N = 10**6
t0 = time()
x_list = []
dx = (b - a)/N # Width of each sub-interval
for i in range(N+1):
x = a + i*dx
x_list.append(x)
sin_x_list = []
cos_x_list = []
for x in x_list:
sin_x_list.append(sin(x))
cos_x_list.append(cos(x))
t1 = time()
print(t1 - t0)
t0 = time()
dx = (b - a)/N # Width of each sub-interval
x_list = [a + i*dx for i in range(N+1)]
sin_x_list = [sin(x) for x in x_list]
cos_x_list = [cos(x) for x in x_list]
t1 = time()
print(t1 - t0)
t0 = time()
x_array = np.linspace(a,b,N+1)
sin_x_array = np.sin(x_array)
cos_x_list = np.cos(x_array)
t1 = time()
print(t1 - t0)
We've seen that we can use np.array
to convert lists to arrays and np.linspace
to generate arrays of equally spaced points. Some other options:
np.zeros
can generate arrays full of zerosnp.ones
can generate arrays full of 1snp.arange
works just like the normal range
function, except that it returns an array insteadnp.zeros(5)
np.ones(7)
np.arange(2,10,3)
Notice: the np.zeros
function and np.ones
function produce arrays filled with floats, while the np.arange
function returns an array full of integers.
In general, unlike lists, NumPy arrays can only be filled with one datatype. You can check what datatype an array holds using the .dtype
attribute:
my_array = np.zeros(5)
print(my_array.dtype)
my_array = np.arange(5)
print(my_array.dtype)
You can change the datatype when defining arrays using np.zeros
or np.ones
by using the dtype
keyword argument:
np.zeros(5, dtype=int)
np.zeros(10, dtype=bool)
np.ones(10, dtype=bool)
Like with lists, we can take slices of NumPy arrays:
N = 50
my_list = [i for i in range(N)]
my_array = np.arange(N)
list_slice = my_list[1::2]
array_slice = my_array[1::2]
print(list_slice)
print(array_slice)
What happens if we modify these slices?
list_slice[0] = 99
print('Modified slice:')
print(list_slice)
print('List:')
print(my_list)
array_slice[0] = 99
print('Modified slice:')
print(array_slice)
print('Array:')
print(my_array)
Observations:
ptriples = [[3, 4, 5],
[4, 3, 5],
[5, 12, 13],
[6, 8, 10],
[8, 6, 10],
[8, 15, 17],
[9, 12, 15],
[12, 5, 13],
[12, 9, 15],
[12, 16, 20],
[15, 8, 17],
[15, 20, 25],
[16, 12, 20],
[20, 15, 25]]
How can we plot the $(a,b)$ pairs from our Pythagorean triples?
a_list = []
b_list = []
c_list = []
for ptriple in ptriples:
a_list.append(ptriple[0])
b_list.append(ptriple[1])
c_list.append(ptriple[2])
plt.plot(a_list, b_list, 'o')
plt.xlabel('$a$')
plt.ylabel('$b$')
plt.title('Pythagorean tuples $(a,b)$')
Recall: For project 1, we wrote a function get_primes
that generated a list of all primes less than or equal to the input.
# Import the square root function
from math import sqrt
def is_prime(n):
# Check for divisors between 2 and sqrt(n)
for d in range(2,int(sqrt(n))+1):
# Check if d divides n
if n % d == 0:
return False
return True
def get_primes(N):
# Initialize an empty list of twin primes
primes = []
# Check if n is prime for n = 2, 3, ..., N-1
for n in range(2,N):
if is_prime(n):
# If n is prime, add it to the list
primes.append(n)
return primes
This get_primes
function worked by sequentially testing whether each integer was prime.
Exercise: Write a new get_primes
function that utilizes the Sieve of Eratosthenes.
Description of the method:
Hint: It might be helpful to build a list of Boolean values, such that the i
th element of the list is True
if the number i
is prime, and False
otherwise.
We can then "eliminate" multiples of a prime by setting their values to False
.
It may also be helpful to use the continue
command, which will end an iteration early and proceed to the next iteration of a loop.
def get_primes_with_sieve(N):
# Construct a list that will ultimate contain
# True in the ith position if i is a prime number
# and False otherwise
# Initialize all values to True until we can show
# that a number is not prime
is_prime_bools = [True for n in range(N+1)]
# Set 0 and 1 to be not prime
is_prime_bools[0] = False
is_prime_bools[1] = False
for p in range(2,N+1):
# if p is known to not be prime, skip it
if not is_prime_bools[p]:
continue
# otherwise, eliminate all multiples of p
for n in range(2*p, N+1, p):
is_prime_bools[n] = False
# Construct the list of primes based on is_prime_bools
primes = [p for p in range(N+1) if is_prime_bools[p]]
return primes
get_primes_with_sieve(15)
Let's compare the runtime of these two algorithms:
from time import time
N = 1000000
t0 = time()
primes = get_primes(N)
t1 = time()
print(t1-t0)
t0 = time()
primes = get_primes_with_sieve(N)
t1 = time()
print(t1-t0)
Exercise: Rewrite the sieve function to make use of NumPy slicing.
def get_primes_with_numpy_sieve(N):
# Construct a list that will ultimate contain
# True in the ith position if i is a prime number
# and False otherwise
# Initialize all values to True until we can show
# that a number is not prime
is_prime_bools = np.ones(N+1,dtype=bool)
# Set 0 and 1 to be not prime
is_prime_bools[0] = False
is_prime_bools[1] = False
for p in range(2,N+1):
# if p is known to not be prime, skip it
if not is_prime_bools[p]:
continue
# otherwise, eliminate all multiples of p
is_prime_bools[2*p::p] = False
# Construct the list of primes based on is_prime_bools
primes = [p for p in range(N+1) if is_prime_bools[p]]
return primes
N = 1000000
t0 = time()
primes = get_primes(N)
t1 = time()
print(t1-t0)
t0 = time()
primes = get_primes_with_sieve(N)
t1 = time()
print(t1-t0)
t0 = time()
primes = get_primes_with_numpy_sieve(N)
t1 = time()
print(t1-t0)
It will be useful for the project to consider what are called, "primitive Pythagorean triples". We say that a Pythagorean triple $(a,b,c)$ is primitive if the greatest common divisor of $a$, $b$, and $c$ is $1$.
Math exercise: A Pythagorean triple $(a,b,c)$ is primitive if and only if the greatest common divisor of $a$ and $b$ is $1$.
We will call a Pythogrean tuple $(a,b)$ primitive if $a$ and $b$ have greatest common divisor $1$. It will be useful for the project to also plot only the primitive Pythagorean tuples.
To that end, can we calculated greatest common divisors? Let's write our own greatest_common_divisor
function:
def greatest_common_divisor(a,b):
gcd = 1
for d in range(1,min(a,b)+1):
if (a % d == 0) and (b % d == 0):
gcd = d
return gcd
greatest_common_divisor(15,6)
Another way of calculating greatest common divisors is called "the Euclidean algorithm." The algorithm works as follows:
Exercise: Write a function to implement the Euclidean algorithm
def Euclidean_algorithm(a,b):
while True:
# First, make sure that a <= b
if a > b:
a,b = b,a
# Replace b with b - a
b = b - a
# If one of the numbers is zero, stop
if a == 0 or b == 0:
break
# The non-zero number is the GCD
return max(a,b)
Euclidean_algorithm(18,12)
We can apply some thought the make this significantly faster:
def Euclidean_algorithm(a,b):
# First, make sure that a <= b
if a > b:
a,b = b,a
while True:
# Subtract as many multiples of a as we can
a, b = b % a, a
# If one of the numbers is zero, stop
if a == 0:
break
# The non-zero number is the GCD
return b
a = 100000000
b = a//2 + 2
t0 = time()
gcd = greatest_common_divisor(a,b)
t1 = time()
print(t1 - t0)
t0 = time()
gcd = Euclidean_algorithm(a,b)
t1 = time()
print(t1 - t0)
How can we construct primitive triples?
ptriples = [[3, 4, 5],
[4, 3, 5],
[5, 12, 13],
[6, 8, 10],
[8, 6, 10],
[8, 15, 17],
[9, 12, 15],
[12, 5, 13],
[12, 9, 15],
[12, 16, 20],
[15, 8, 17],
[15, 20, 25],
[16, 12, 20],
[20, 15, 25]]
def is_primitive_triple(ptriple):
if Euclidean_algorithm(ptriple[0], ptriple[1]) == 1:
return True
else:
return False
prim_ptriples = [ptriple for ptriple in ptriples if is_primitive_triple(ptriple)]
print(prim_ptriples)
It is often the case that we have some list (or other iterable) of elements which we want to sequentially assign to different variable names.
my_list = ['Justin', 'Emily', 'Steven', 'Jonathan']
first = my_list[0]
second = my_list[1]
third = my_list[2]
fourth = my_list[3]
print(first)
print(third)
There's a shorthand for performing this unpacking by using comma-separated variable names:
first, second, third, fourth = my_list
print(first)
print(third)
This can be very useful when combined with list comprehension, particular for our Pythagorean triples project:
for ptriple in ptriples:
print(ptriple[0])
for (a,b,c) in ptriples:
print(a)