import numpy as np
Advanced Array Iteration
For most of the numpy code that I write in data science applications I make use of slicing, indexing and standard operations. However, occasionally there is a need to use a numpy iterator called nditer. This might be useful when I need to iterate over each element of a 2-, 3- or 4-D array without writing multiple nested for loops. There is extensive documentation for this in the numpy docs. Here we will consider some basic functionality that I have found useful in applied work.
A matrix example
We will consider how to iterate over each element in a 2-dimensional array. You can easily do this in standard Python. Here's a simple example:
a = np.arange(6).reshape(2, 3)
print(a)
[[0 1 2]
[3 4 5]]
A standard Python implementation to iterate over all elements is as follows. Note the requirement of an inner loop.
def standard_all_element_iteration(a, print_out=True):
    for row in a:
        for col in row:
            if print_out: print(col, end=' ')
standard_all_element_iteration(a)
0 1 2 3 4 5
When we need to iterate over all elements of an array, we can use nditer to eliminate the inner loop.
def nditer_all_element_iteration(a, print_out=True):
    for element in np.nditer(a):
        if print_out: print(element, end=' ')
nditer_all_element_iteration(a)
0 1 2 3 4 5
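The same idea carries over to higher dimensions. The following is a minimal sketch, assuming a hypothetical 3-D array b that is not part of the original example. It compares three nested Python loops against a single nditer loop, and uses the 'multi_index' flag of np.nditer to recover the position of each element.

```python
import numpy as np

# Hypothetical 3-D example array (an assumption for illustration).
b = np.arange(24).reshape(2, 3, 4)

# Three nested Python loops, one per dimension.
nested = [int(x) for plane in b for row in plane for x in row]

# A single nditer loop visits the same elements; for this C-contiguous
# array the order matches the nested loops.
flat = [int(x) for x in np.nditer(b)]

# With the 'multi_index' flag the iterator also exposes the index
# of the element currently being visited.
it = np.nditer(b, flags=['multi_index'])
positions = []
for x in it:
    positions.append((it.multi_index, int(x)))

print(nested == flat)
print(positions[:3])
```

Each element yielded by nditer is a zero-dimensional array, which is why the sketch converts it with int() before storing it in a list.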
The result is faster iteration, roughly 2.4 times faster on this small array in the timings below, because nditer handles the traversal of both dimensions in C instead of a nested Python loop.
%timeit standard_all_element_iteration(a, print_out=False)
%timeit nditer_all_element_iteration(a, print_out=False)
2.1 μs ± 21.5 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
877 ns ± 2.87 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
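%timeit is an IPython magic, so it is only available in a notebook or IPython session. As a rough sketch of how the comparison might be reproduced in a plain Python script, the standard library timeit module can be used instead. The helper names nested_loop and nditer_loop are hypothetical stand-ins for the two functions above, and the numbers will vary with the machine, the numpy version and the array size.

```python
import timeit

import numpy as np

a = np.arange(6).reshape(2, 3)


def nested_loop(arr):
    # Two Python-level loops: one over rows, one over the elements of each row.
    for row in arr:
        for col in row:
            pass


def nditer_loop(arr):
    # One Python-level loop; nditer walks both dimensions internally.
    for element in np.nditer(arr):
        pass


n = 10_000  # number of calls to average over (arbitrary choice)
t_nested = timeit.timeit(lambda: nested_loop(a), number=n) / n
t_nditer = timeit.timeit(lambda: nditer_loop(a), number=n) / n

print(f"nested loops: {t_nested * 1e9:.0f} ns per call")
print(f"nditer:       {t_nditer * 1e9:.0f} ns per call")
```

It is worth repeating the measurement on arrays of the size you actually work with, since the relative cost of the two approaches need not stay the same.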
Note that the iteration proceeded row by row through the array a. To iterate over all elements column-wise you can use 'Fortran' ordering by passing the parameter order='F' to np.nditer:
print(a)
[[0 1 2]
[3 4 5]]
for element in np.nditer(a, order='F'):
    print(element, end=' ')
0 3 1 4 2 5
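As a small check of the ordering behaviour, the sketch below compares nditer against a.ravel() for both orders. It also shows one point that is easy to miss: when no order is passed, np.nditer uses order='K', which follows the memory layout of the array. Iterating over the transposed view a.T therefore visits the elements in the same order as iterating over a. The variable names are illustrative only.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

# Explicit row-wise (C) and column-wise (Fortran) traversal.
row_wise = [int(x) for x in np.nditer(a, order='C')]
col_wise = [int(x) for x in np.nditer(a, order='F')]

# Both orders agree with flattening the array in the same order.
assert row_wise == a.ravel(order='C').tolist()
assert col_wise == a.ravel(order='F').tolist()

# The default order 'K' follows memory layout, so the transposed view
# is visited in the same order as the original array.
transposed_default = [int(x) for x in np.nditer(a.T)]

# Forcing C order on the transposed view gives the column-wise sequence of a.
transposed_c = [int(x) for x in np.nditer(a.T, order='C')]

print(row_wise)            # [0, 1, 2, 3, 4, 5]
print(col_wise)            # [0, 3, 1, 4, 2, 5]
print(transposed_default)  # [0, 1, 2, 3, 4, 5]
print(transposed_c)        # [0, 3, 1, 4, 2, 5]
```

If the visiting order matters for your computation, passing order='C' or order='F' explicitly avoids relying on how the array happens to be stored.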