import numpy as np

Advanced Array Iteration

For most of the NumPy code that I write in data science applications, I make use of slicing, indexing and standard operations. However, occasionally there is a need to use a NumPy iteration function called nditer. This is useful when I need to iterate over each element of a 2D, 3D or 4D array without writing multiple nested for loops. The NumPy docs cover nditer extensively. Here we will consider some basic functionality that I have found useful in applied work.
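To make the "no nested loops" point concrete, here is a minimal sketch for a 3D array. The array `b` and the list names `nested` and `flat` are illustrative assumptions and are not used elsewhere in this post.

```python
import numpy as np

# Hypothetical 3D array, used only for illustration
b = np.arange(24).reshape(2, 3, 4)

# Plain Python: one loop level per dimension
nested = []
for plane in b:
    for row in plane:
        for value in row:
            nested.append(int(value))

# nditer: a single loop, however many dimensions the array has
flat = [int(x) for x in np.nditer(b)]

# Both approaches visit the same elements in the same order
print(nested == flat)  # True
```

For a C-contiguous array like this one, nditer visits the elements in the same order as the nested loops, which is why the two lists match.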

A matrix example

We will consider how to iterate over each element in a 2-dimensional array. You can easily do this in standard Python. Here's a simple example array:

a = np.arange(6).reshape(2, 3)
print(a)
[[0 1 2]
 [3 4 5]]

A standard Python implementation to iterate over all elements is as follows. Note the requirement of an inner loop.

def standard_all_element_iteration(a, print_out=True):
    # The outer loop walks the rows; the inner loop walks the values in each row
    for row in a:
        for col in row:
            if print_out: print(col, end=' ')
standard_all_element_iteration(a)
0 1 2 3 4 5 

When we need to iterate over all elements of an array, we can use nditer to eliminate the inner loop.

def nditer_all_element_iteration(a, print_out=True):
    # A single loop visits every element, whatever the number of dimensions
    for element in np.nditer(a):
        if print_out: print(element, end=' ')
nditer_all_element_iteration(a)
0 1 2 3 4 5 

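The result is faster iteration. In the timings below, the nditer version takes roughly half the time of the nested loops on this small array, because nditer handles the traversal of the array's dimensions internally instead of in nested Python loops.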

%timeit standard_all_element_iteration(a, print_out=False)
%timeit nditer_all_element_iteration(a, print_out=False)
1.29 µs ± 5.3 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
640 ns ± 5.14 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
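The %timeit magic is only available in IPython and Jupyter. Outside a notebook, a similar comparison can be sketched with the standard library timeit module. The function names `nested_loops` and `single_nditer_loop` and the repeat count `n` below are illustrative assumptions, and the numbers you see will depend on your machine and NumPy version.

```python
import timeit

import numpy as np

a = np.arange(6).reshape(2, 3)

def nested_loops(arr):
    # Hypothetical stand-in for the nested-loop version, without printing
    for row in arr:
        for value in row:
            pass

def single_nditer_loop(arr):
    # Hypothetical stand-in for the nditer version, without printing
    for element in np.nditer(arr):
        pass

n = 10_000  # number of calls to time; an arbitrary choice
t_nested = timeit.timeit(lambda: nested_loops(a), number=n)
t_nditer = timeit.timeit(lambda: single_nditer_loop(a), number=n)

# Report the average time per call in nanoseconds
print(f"nested loops: {t_nested / n * 1e9:.0f} ns per call")
print(f"nditer:       {t_nditer / n * 1e9:.0f} ns per call")
```

It is worth rerunning a comparison like this on arrays of the size you actually work with, since the gap seen on a 2 x 3 array may not carry over.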

Note that the iteration took place across the rows of the array a. To iterate over all elements column-wise, you can use 'Fortran' ordering by passing the parameter order='F' to np.nditer.

print(a)
[[0 1 2]
 [3 4 5]]
for element in np.nditer(a, order='F'):
    print(element, end=' ')
0 3 1 4 2 5
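As a quick check of the two orderings, the sketch below collects the elements into lists and compares the column-wise result with flattening the array in Fortran order. The names `row_wise` and `col_wise` are illustrative assumptions.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

# Collect the elements visited under each ordering
row_wise = [int(x) for x in np.nditer(a, order='C')]
col_wise = [int(x) for x in np.nditer(a, order='F')]

print(row_wise)  # [0, 1, 2, 3, 4, 5]
print(col_wise)  # [0, 3, 1, 4, 2, 5]

# The column-wise order matches flattening the array in Fortran order
print(col_wise == a.ravel(order='F').tolist())  # True
```

Passing order='C' explicitly requests row-wise iteration, which for this array is the same order nditer uses by default.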