March 2019, Reading time: 3 minutes
In this post I will explain generators and generator expressions. Please read the post on iterables and iterators first. Generators are iterators that produce their values one at a time, on demand. When a function contains a yield statement, calling it returns a generator object. We can then iterate through that object with a for loop or with the built-in next function. Let's see an example.
>>> def func1():
...     print("hello")
...     yield 1
...     print("world")
...     yield 2
...     print("end")
...
>>> func1()
<generator object func1 at 0x0000011547640E58>
>>> gen = func1()
>>> next(gen)
hello
1
>>> next(gen)
world
2
>>> next(gen)
end
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
Note that calling func1 does not execute its body. Instead it returns a generator, so we can think of func1 as a generator factory. I can then iterate over the generator like any other iterator, in this case using next. Each call to next runs the body until it reaches the next yield, returns the yielded value, and pauses there. When the body finishes without reaching another yield, a StopIteration exception is raised. Let's see another example.
>>> def sq(n):
...     for i in range(n):
...         yield i ** 2
...
>>> g = sq(5)
>>> next(g)
0
>>> next(g)
1
>>> next(g)
4
>>> next(g)
9
>>> next(g)
16
>>> next(g)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
The sq function does not precalculate all the squares; it produces each one only when it is requested. Of course I can also do this:
>>> g = sq(5)
>>> list(g)
[0, 1, 4, 9, 16]
>>> list(g)
[]
But note that g is an iterator, so once it is exhausted, calling list on it again returns an empty list.
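To make the "use once" behaviour concrete, here is a minimal sketch. The helper name squares is my own, chosen for illustration. It shows that a generator is its own iterator, that a second pass over it yields nothing, and that calling the generator function again returns a fresh, independent generator.

```python
def squares(n):
    """Yield the squares of 0..n-1 one at a time (illustrative helper)."""
    for i in range(n):
        yield i ** 2

g = squares(5)

# A generator is its own iterator: iter() hands back the same object.
same_object = iter(g) is g

first_pass = list(g)    # consumes every value
second_pass = list(g)   # the generator is exhausted, so nothing is left

# Calling the function again creates a new, independent generator.
fresh = list(squares(5))

print(same_object)   # True
print(first_pass)    # [0, 1, 4, 9, 16]
print(second_pass)   # []
print(fresh)         # [0, 1, 4, 9, 16]
```

So if the values are needed more than once, either call the generator function again or store the results in a list.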
Since generators do not precalculate all the results but compute one value at a time, they consume much less memory. Also, if I do not need all the results, the computation finishes sooner because the remaining values are never computed.
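A rough way to see both effects is sketched below, using sys.getsizeof and itertools.islice from the standard library. The size n is an arbitrary choice for illustration. Note that sys.getsizeof measures only the container object itself, and the exact byte counts vary between Python versions and platforms, so only the comparison is meaningful.

```python
import sys
from itertools import islice

n = 1_000_000  # arbitrary size, chosen only for illustration

as_list = [x ** 2 for x in range(n)]  # all n values are computed and stored
as_gen = (x ** 2 for x in range(n))   # nothing has been computed yet

# getsizeof reports the size of the container object only, which is
# already enough to show the difference in scale.
list_size = sys.getsizeof(as_list)
gen_size = sys.getsizeof(as_gen)
print(gen_size < list_size)  # True

# Take only the first three values; the rest are never computed.
first_three = list(islice(as_gen, 3))
print(first_three)  # [0, 1, 4]
```

The generator stays the same small size no matter how large n is, because it holds only the state needed to produce the next value.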
We have seen list comprehensions:
>>> [x**2 for x in range(11)]
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
If I replace the square brackets with parentheses, I get a generator expression:
>>> (x**2 for x in range(11))
<generator object <genexpr> at 0x0000011547640F48>
Notice that it returns a generator, not a list. I have to iterate over it to get the elements one by one, like this:
>>> g = (x**2 for x in range(11))
>>> for i in g:
...     print(i)
...
0
1
4
9
16
25
36
49
64
81
100
Once it is exhausted I cannot reuse it. A second loop prints nothing:
>>> for i in g:
...     print(i)
...
>>>
Of course I can also pass it to list to convert it to a list, but I cannot do that twice. It also defeats the purpose of a generator, which is to retrieve only the elements I want rather than every element.
>>> g = (x**2 for x in range(11))
>>> list(g)
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
>>> list(g)
[]
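As a small sketch of using a generator expression without converting it to a list, the built-ins sum, next and any all accept one directly. The variable names and the threshold of 50 are my own choices for illustration.

```python
# Sum the squares without ever building an intermediate list.
total = sum(x ** 2 for x in range(11))

# Stop at the first square greater than 50; later squares are never computed.
first_big = next(x ** 2 for x in range(11) if x ** 2 > 50)

# any() also stops as soon as it finds a match.
has_big = any(x ** 2 > 50 for x in range(11))

print(total)      # 385
print(first_big)  # 64
print(has_big)    # True
```

When a generator expression is the only argument to a function, the extra pair of parentheses can be dropped, as in the calls above.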