Numpy series

Martin McBride, 2017-05-10
Tags python numpy
Categories python numpy

Numpy is a Python package that allows you to efficiently store and process large arrays of numerical data. Obvious examples of this type of data are sound data and image data, but numpy can also be used anywhere you have large data sets to process.

Part of the attraction of numpy is that it uses simple and familiar Python syntax to perform complex operations on arrays, which simplifies your code. The other attraction is that numpy is highly efficient, both in terms of speed and memory usage. These two factors are not unrelated - numpy provides high level array operations, and these operations are efficient because, under the hood, the entire processing loop is written in C.
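As a rough illustration of that point, the sketch below doubles every value in a small data set twice: once with an explicit Python loop over a list, and once with a single numpy expression. The variable names are invented for this example, and np.array (which builds an array from a list) is used here only for convenience.

```python
import numpy as np

# Plain Python: the loop runs in the interpreter, one element at a time.
values = [1.0, 2.0, 3.0, 4.0]
doubled_list = [v * 2 for v in values]

# numpy: a single expression, with the loop over elements done inside numpy.
arr = np.array(values)
doubled_arr = arr * 2

print(doubled_list)
print(doubled_arr)
```

Both versions produce the same numbers. The difference is in where the loop runs, which starts to matter as the data set grows.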

Some of the main features of using numpy are:

In this article, we will start with a quick tour of some of the things you can do with numpy arrays, and then look at some of the features in a bit more detail. We will only look at one-dimensional arrays. Multi-dimensional arrays will be a topic for another article.

Before you start, you will need to install numpy. The official numpy site will point you at the latest version, with instructions for installing the package.

Numpy arrays quick tour

First, of course, you must import the numpy package. It is common practice to import numpy as np (so that you can use the short name np in your code). You don't have to, but most people who use numpy do, and will recognise the np prefix.

>>> import numpy as np


You can create an array of zeros using the zeros function, supplying the required array length. By default the array will contain floats (we will see how to change that later):

>>> a = np.zeros(5)
>>> print(a)
[ 0.  0.  0.  0.  0.]


You can read or write elements using square brackets, just like a normal list:

>>> a = np.zeros(6)
>>> a[1] = 2
>>> a[3] = 5
>>> print(a)
[ 0.  2.  0.  5.  0.  0.]
>>> print(a[2])
0.0
>>> print(a[3])
5.0


Now here is the interesting part. Maths operators are

Creating a numpy array

There are various ways to create a numpy array.

Arrays filled with zeros

The easiest is probably the zeros function:

>>> import numpy as np
>>> a = np.zeros(5)
>>> print(a)
[ 0.  0.  0.  0.  0.]


This creates an array with 5 elements, each set to 0. The array elements are floating point values. This is the default, but you can create an array with elements of a different type (e.g. integers) using the dtype parameter, described in the Data types section.

Arrays filled with ones

The ones function creates an array with all elements set to 1, which is sometimes useful.

>>> a = np.ones(7)
>>> print(a)
[ 1.  1.  1.  1.  1.  1.  1.]


Empty arrays

The empty function creates an array whose elements haven't been initialised at all. They just contain whatever happened to be in the memory.

>>> a = np.empty(4)
>>> print(a)
[  1.18553402e-311   1.18554384e-311   0.00000000e+000   0.00000000e+000]


You can use this if you are creating an array where the initial values don't matter (if you intend to fill it with other data). empty is slightly faster than zeros because it doesn't need to fill the memory with values.
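A minimal sketch of that pattern, assuming you want an array of square numbers: the array is allocated with empty, and every element is then overwritten before any of them is read. The name squares is made up for this example.

```python
import numpy as np

# Allocate without initialising; the contents are arbitrary at this point.
squares = np.empty(4)

# Overwrite every element before reading any of them.
for i in range(4):
    squares[i] = i * i

print(squares)
```

If there is any chance that some elements will be read before they are written, zeros is the safer choice.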

Accessing array elements

You can read or write elements using square brackets, as with a list:

>>> a = np.zeros(6)
>>> a[1] = 2
>>> a[3] = 5
>>> print(a)
[ 0.  2.  0.  5.  0.  0.]
>>> print(a[2])
0.0
>>> print(a[3])
5.0


You can also use slices to read or write a range of elements.
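As a short sketch of slicing, the example below reuses the array from above. The start:stop notation is the same as for a list, with the stop index excluded. Assigning a single value to a slice, as in the last step, is something numpy arrays allow and plain lists do not.

```python
import numpy as np

a = np.zeros(6)
a[1] = 2
a[3] = 5

# Read a range of elements: indices 1, 2 and 3 (the stop index is excluded).
middle = a[1:4]
print(middle)

# Write to a range of elements: set the last two elements to 9.
a[4:] = 9
print(a)
```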

Data types

Numpy uses homogeneous arrays (all the elements must be the same type). This is different to a Python list, where different elements of the same list can have different types.
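The difference can be seen in a small sketch. The list below holds three values of three different types, while the array converts whatever is assigned to it into its own element type, which the dtype attribute reports. The variable names are illustrative only.

```python
import numpy as np

# A list can hold values of different types side by side.
mixed = [1, 2.5, "three"]
print([type(x).__name__ for x in mixed])

# An array has one element type, reported by its dtype attribute.
a = np.zeros(3)
a[0] = 7  # the integer 7 is stored as the float 7.0
print(a.dtype)
print(a[0])
```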

By default, when you create an array it will contain floating point values. You can choose a different type by using the dtype parameter. The following creates an array of 16-bit integer values:

>>> a = np.ones(4, dtype=np.int16)
>>> print(a)
[1 1 1 1]


dtype can also be used with zeros and empty.
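For example, a brief sketch using two other numpy types, np.int32 and np.float32 (chosen arbitrarily for illustration):

```python
import numpy as np

# An array of 32-bit integer zeros.
z = np.zeros(3, dtype=np.int32)
print(z)
print(z.dtype)

# An uninitialised array of 32-bit floats; only its type and length are defined.
e = np.empty(3, dtype=np.float32)
print(e.dtype)
```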

Arrays filled with a value range

You can use the arange function to fill an array with a range of values:

>>> a = np.arange(5)
>>> print(a)
[0 1 2 3 4]
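
Like the built-in range, arange also accepts a start value and a step. Unlike range, the step may be a float. A short sketch, with arbitrary values:

```python
import numpy as np

# Start and stop: the stop value itself is excluded.
b = np.arange(2, 7)
print(b)

# Start, stop and step.
c = np.arange(0, 10, 3)
print(c)

# A fractional step gives an array of floats.
d = np.arange(0, 1, 0.25)
print(d)
```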