The numpy module

What is the numpy module?

A module is a collection of related functions. One of the most useful modules in Python is called numpy (numerical Python), which contains many functions for numerical programming. It is technically an extension to the core Python functionality we’ve been focussing on so far, but it comes pre-installed with many Python distributions, such as Anaconda.

The numpy module builds on the core functionality and adds features including:

  • It is performant, meaning its operations are well optimised
  • It offers additional numerical computing tools
  • It adds a new object called an n-dimensional array

Numpy arrays vs lists

One thing we can use the numpy module for is to create a new object called a numpy array. This is another data structure, in addition to the in-built Python types we’ve been learning about, and is similar to a list.

Numpy arrays: the numpy module (and arrays) are a Python extension (but often come as standard)

  • Ordered
  • Mutable
  • Less flexible: one data type per array
  • Allows implicit element-wise operations
  • Generally quicker (optimised)
  • More memory efficient

Lists: lists are part of Python in-built functionality

  • Ordered
  • Mutable
  • Very flexible: all types in any list
  • Needs explicit element-wise operations
  • Generally slower performance
  • Less memory efficient

When using these objects, list objects are highly flexible in both content and shape, whereas numpy.array objects are much stricter: every item must be the same type, and they work best when they have a consistent shape (e.g. a 2x3 grid).

Numpy arrays

numpy.array objects are mutable, ordered container objects, but every element must be of the same type, and the array has an n-dimensional shape.

To use the numpy module we first need to import it.
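A minimal sketch of the import, assuming numpy is installed in the current environment. The print call is only there to confirm that the shorthand refers to the numpy module.

```python
# Import the numpy module under the conventional shorthand np
import numpy as np

# Anything in the module is now reached through the np. prefix
print(np.__name__)  # numpy
```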

The as part of this import statement gives us a shorthand to use in the code when we want to access numpy, in this case np. This is the convention most often used for the numpy module. import statements themselves are the way we access additional Python modules such as numpy or matplotlib.

One way to create a numpy.array is from a list, using the np.array function. The np. prefix at the start of the function name tells Python to access the numpy module.
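As a short illustration of creating an array from a list, the sketch below uses made-up values and a hypothetical variable name (values); neither comes from the original material.

```python
import numpy as np

# An ordinary Python list (values chosen purely for illustration)
values = [1, 2, 3, 4, 5]

# np.array builds a numpy array from the list
arr = np.array(values)

print(arr)        # [1 2 3 4 5]
print(type(arr))  # <class 'numpy.ndarray'>
```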

We can also index and slice numpy.arrays in a similar way to other sequence objects (i.e. ordered objects with a length, like lists):
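Indexing and slicing might look like the following sketch. The array contents are illustrative assumptions; the square-bracket syntax is the same as for lists.

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50])  # illustrative values

print(arr[0])    # first element: 10
print(arr[-1])   # last element: 50
print(arr[1:4])  # elements at positions 1, 2 and 3: [20 30 40]
```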

A numpy.array also has additional properties (attributes): dtype, which tells us the type of the elements contained within the array, and shape, which tells us the dimensions of the array.
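The two attributes can be inspected as in this sketch, which assumes two small example arrays (arr_1d and arr_2d are hypothetical names). Note that attributes are accessed without parentheses.

```python
import numpy as np

arr_1d = np.array([1.5, 2.0, 3.5])          # three floating-point values
arr_2d = np.array([[1, 2, 3], [4, 5, 6]])   # a 2x3 grid of integers

print(arr_1d.dtype)  # float64
print(arr_1d.shape)  # (3,)
print(arr_2d.shape)  # (2, 3)
```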

Element-wise operations

The numpy module itself also provides some additional tools and syntax to complete simple operations more succinctly. For instance, we’ve shown before one way to act on every item in a list using a for loop:
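As a reminder of the for loop approach, here is a minimal sketch. The choice of operation (doubling each number) and the variable names are assumptions made for illustration.

```python
numbers = [1, 2, 3, 4]  # illustrative values

# Build a new list by acting on each item in turn
doubled = []
for n in numbers:
    doubled.append(n * 2)

print(doubled)  # [2, 4, 6, 8]
```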

For very simple operations, there is a shorthand for creating a new list using a for loop, called a list comprehension.
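A list comprehension for the same kind of task could be sketched as follows, again assuming the illustrative operation of doubling each number.

```python
numbers = [1, 2, 3, 4]  # illustrative values

# The loop and the append are folded into a single expression
doubled = [n * 2 for n in numbers]

print(doubled)  # [2, 4, 6, 8]
```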

But this is still more complex than using a numpy.array, where the same operation can be performed using an operator directly on the whole array:
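With a numpy array, the equivalent might be written as below. The doubling operation and the values are assumptions carried over for illustration.

```python
import numpy as np

arr = np.array([1, 2, 3, 4])  # illustrative values

# The operator is applied to every element; no loop is written
doubled = arr * 2

print(doubled)  # [2 4 6 8]
```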

Operation speed

For large numbers of elements the time difference between operations using lists and numpy.arrays can start to be measurable. We can quickly check this by importing the time module:

Comparing the two operations we can see that performing this operation with the list takes longer than with a numpy.array (this is highly variable though):
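One way such a comparison could be sketched is shown below. The array size, the doubling operation and the use of time.perf_counter are assumptions, and the printed timings will differ between machines and runs.

```python
import time
import numpy as np

n = 1_000_000  # number of elements; an arbitrary illustrative size

numbers = list(range(n))
arr = np.arange(n)

# Time doubling every element of the list with a list comprehension
start = time.perf_counter()
doubled_list = [x * 2 for x in numbers]
list_seconds = time.perf_counter() - start

# Time the same operation applied directly to the array
start = time.perf_counter()
doubled_arr = arr * 2
array_seconds = time.perf_counter() - start

print(f"list:  {list_seconds:.4f} s")
print(f"array: {array_seconds:.4f} s")
```

A single measurement like this is noisy; repeating it several times gives a more reliable picture.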

You may recall that when we first introduced list and dict objects, we also mentioned other Python objects which were similar but with some differences in functionality (tuple and set objects). In Python, as in many languages, there are often several tools which can be used to complete a task, and it’s up to you to choose the correct tool for the job. Overall, list objects may be more appropriate when you need to store a collection of strings or if you don’t know the number of elements in advance (appending to a list is faster than appending to a numpy.array due to the way the data is stored in memory). numpy.array objects, by contrast, would be more appropriate when performance is a factor or for simple numerical operations.

Working with numpy

To use the numpy module we always need to start by using an import statement. In this case we import the numpy module and use the shorthand np:

We’ve seen that we can apply operators directly to a numpy.array:
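A brief sketch of operators applied to a whole array follows. The contents of arr1 are assumed for illustration; the original values are not given.

```python
import numpy as np

arr1 = np.array([1, 4, 9, 16])  # illustrative values

added = arr1 + 1     # adds 1 to every element
scaled = arr1 * 2    # multiplies every element by 2
squared = arr1 ** 2  # squares every element

print(added)
print(scaled)
print(squared)
```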

Similarly you can use additional functions provided by the numpy module to do something to each element in the array. For example you can apply a square root:
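For instance, assuming an array of perfect squares (the values are illustrative), the square root could be taken like this:

```python
import numpy as np

arr1 = np.array([1, 4, 9, 16])  # illustrative values

# np.sqrt is applied to each element and returns a new array of floats
roots = np.sqrt(arr1)

print(roots)  # [1. 2. 3. 4.]
```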

Or perform a reductive operation such as calculating the mean of all the elements:
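A reduction collapses the array to a single value. The sketch below uses np.mean on assumed example values.

```python
import numpy as np

arr1 = np.array([1, 4, 9, 16])  # illustrative values

# (1 + 4 + 9 + 16) / 4
average = np.mean(arr1)

print(average)  # 7.5
```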

We can also apply mathematical operations over the whole array. For instance we can look at the np.cos function, which applies the cosine function element-wise:

The help for np.cos states that it expects an array-like object, with the input in radians. We can write this as:
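A possible example is sketched here, with assumed angle values. The conversion from degrees with np.deg2rad is an extra illustration and is not part of the original text.

```python
import numpy as np

# help(np.cos) would print the documentation for the function

# Angles in radians (illustrative values)
angles = np.array([0.0, np.pi / 2, np.pi])
cosines = np.cos(angles)
print(cosines)  # approximately 1, 0 and -1

# Angles held in degrees need converting to radians first
degrees = np.array([0, 90, 180])
cosines_from_degrees = np.cos(np.deg2rad(degrees))
print(cosines_from_degrees)
```

The middle value is a tiny number rather than exactly zero because of floating-point rounding.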

If we look at arr1 we can see that it has not been updated by the application of these operations. When using this functionality a new array is returned, which you can choose to re-assign to the original variable name or store in a new variable:
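The behaviour can be checked with a sketch like the one below. The values and the names roots and arr2 are assumptions used for illustration.

```python
import numpy as np

arr1 = np.array([1, 4, 9, 16])  # illustrative values

# Store the result in a new variable; arr1 itself is left alone
roots = np.sqrt(arr1)
print(arr1)   # still the original values
print(roots)  # [1. 2. 3. 4.]

# Alternatively, re-assign the result to the same variable name
arr2 = np.array([1, 4, 9, 16])
arr2 = np.sqrt(arr2)
print(arr2)   # [1. 2. 3. 4.]
```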

Element-wise operations on 1D arrays

Element-wise operations in numpy allow you to perform arithmetic or mathematical functions on each corresponding element of arrays. For example, if you have two arrays of the same length, arr1 and arr2, you can add them directly: arr1 + arr2. This will produce a new array where each element is the sum of the elements at the same position in the original arrays. Similarly, you can use other operators (-, *, /) or numpy functions (np.sqrt(arr1), np.cos(arr1)) to apply operations to each element individually. The arrays must have compatible shapes for these operations.
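The paragraph above can be illustrated with two same-length arrays. The values are assumed; arr1 and arr2 follow the names used in the text.

```python
import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([10, 20, 30])

# Elements at the same position are combined
total = arr1 + arr2
product = arr1 * arr2

print(total)    # [11 22 33]
print(product)  # [10 40 90]
```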

When 1D arrays have different lengths, you need to be careful about the operations you perform. Element-wise operations such as arr1 + arr3 or arr1 * arr3 require the arrays to have the same length or compatible shapes. If the lengths differ and neither array has length 1 (a length-1 array is stretched, or broadcast, to match the other), numpy will raise a ValueError due to the shape mismatch.
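A sketch of the mismatch case follows, assuming arr1 has three elements and arr3 has two. The raised flag is a hypothetical helper that only records that the error occurred.

```python
import numpy as np

arr1 = np.array([1, 2, 3])
arr3 = np.array([10, 20])  # different length from arr1

raised = False
try:
    arr1 + arr3
except ValueError as err:
    # numpy reports that the shapes cannot be combined
    raised = True
    print("ValueError:", err)

# A length-1 array is compatible: its single value is applied to every element
print(arr1 + np.array([100]))  # [101 102 103]
```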

Basic operations on 1D arrays

Summing all elements in a 1D numpy array can be done with np.sum(arr1).

For cumulative summing, use np.cumsum(arr1), which returns an array where each element is the sum of all elements up to and including that position.

Sorting is performed with np.sort(arr1), which returns a sorted copy of the array.

To concatenate two arrays, use np.concatenate([arr1, arr2]). This joins the arrays end-to-end, creating a new array containing all elements from both arrays in order. Concatenation is useful for combining datasets or extending arrays.

To find unique elements, use np.unique(arr1), which returns an array of the distinct values in arr1. These operations are efficient and commonly used for data analysis.
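The operations in this section can be tried together in one short sketch. The array values are assumptions, chosen so that arr1 contains repeats and is unsorted.

```python
import numpy as np

arr1 = np.array([3, 1, 2, 3, 1])  # illustrative values
arr2 = np.array([7, 8])

total = np.sum(arr1)                   # 10
running = np.cumsum(arr1)              # running totals: 3, 4, 6, 9, 10
ordered = np.sort(arr1)                # sorted copy; arr1 keeps its order
joined = np.concatenate([arr1, arr2])  # arr1 followed by arr2
distinct = np.unique(arr1)             # distinct values, in sorted order

print(total)     # 10
print(running)
print(ordered)   # [1 1 2 3 3]
print(joined)    # [3 1 2 3 1 7 8]
print(distinct)  # [1 2 3]
```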