List comprehensions in Python: a never-ending story. Some consider them a cure for everything, others consider them faster than loops. The latter is something I couldn't confirm experimentally.
So, what are list comprehensions? List comprehensions allow us to create lists from iterable data structures. Suitable data structures in Python include:
- strings
- tuples (immutable)
- lists (mutable).
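As a quick illustration of the three types listed above, the same kind of comprehension works on each of them. The variable names and sample values are made up for this sketch.

```python
# Hypothetical sample data, one comprehension per iterable type listed above.
fromString = [c.upper() for c in "abc"]   # iterating a string yields its characters
fromTuple = [n + 1 for n in (1, 2, 3)]    # tuple elements
fromList = [n * 2 for n in [1, 2, 3]]     # list elements
print(fromString, fromTuple, fromList)
```

In every case the result is a new list; the input is left untouched.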
A little note upfront: we could use lambda functions with map (map(lambda x: f(x), range(999.....))) as well.
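To make the map note concrete, here is a minimal sketch comparing both styles. The function f and the small range are placeholders assumed for this example only.

```python
def f(x):
    # Placeholder function, assumed only for this example.
    return x * x

# In Python 3, map returns a lazy iterator, so we wrap it in list() to compare.
viaMap = list(map(lambda x: f(x), range(5)))
viaComprehension = [f(x) for x in range(5)]
print(viaMap == viaComprehension)
```

Both produce the same list; the comprehension just avoids the extra lambda and the list() call.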
The simplest thing we can do with list comprehensions is to simply access each element in an iterable:
inputList = (1,4,'p','@')
print([x for x in inputList])
[1, 4, 'p', '@']
We can apply functions to each element:
print([str(x) for x in inputList])
['1', '4', 'p', '@']
If we want to do more complex things, we can write and apply custom functions:
def customFunction(x):
    # do something
    return x
y = [customFunction(x) for x in inputList]
We can add conditionals (if/else statements):
print([True for x in inputList if type(x) == int] )
print([True if (type(x) == int) else False for x in inputList])
[True, True]
[True, True, False, False]
NB: the position of the condition has changed. A filtering if comes after the for clause, while an if/else is a conditional expression and comes before it. If we tried print([True for x in inputList if (type(x) == int) else False]) or print([True if (type(x) == int) for x in inputList]), we would get a SyntaxError. To avoid this, we could put the if/else statement into a custom function and apply it as above.
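A minimal sketch of that workaround could look like this. The helper name isInt is hypothetical and not part of the examples above.

```python
inputList = (1, 4, 'p', '@')

def isInt(x):
    # Hypothetical helper: the if/else lives here instead of in the comprehension.
    if type(x) == int:
        return True
    else:
        return False

flags = [isInt(x) for x in inputList]
print(flags)
```

The comprehension itself stays a plain "apply a function to each element" expression, so the ordering question never comes up.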
As with nested for loops, we can use several for clauses in a single list comprehension:
list1 = [0,3,1,3,5,8,5]
list2 = [1,3,2,6,9,7,5]
print([x*y for x in list1 for y in list2])
[0, 0, 0, 0, 0, 0, 0, 3, 9, 6, 18, 27, 21, 15, 1, 3, 2, 6, 9, 7, 5, 3, 9, 6, 18, 27, 21, 15, 5, 15, 10, 30, 45, 35, 25, 8, 24, 16, 48, 72, 56, 40, 5, 15, 10, 30, 45, 35, 25]
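To show what the two for clauses do, here is a sketch with shortened, made-up lists. It writes out the equivalent nested loops and, for contrast, a comprehension inside a comprehension that keeps one row per x.

```python
list1 = [0, 3, 1]
list2 = [1, 3, 2]

# Equivalent nested for loops: the first for clause is the outer loop.
flat = []
for x in list1:
    for y in list2:
        flat.append(x * y)

# A comprehension inside a comprehension builds a list of lists instead.
rows = [[x * y for y in list2] for x in list1]

print(flat == [x * y for x in list1 for y in list2])
print(rows)
```

The flat version reads left to right in the same order as the loops, while the nested version reads inside out.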
Are there differences in computation speed? Not really. I ran the following script in the background while working on other things, so it is not a perfect benchmark: CPU scheduling may have been influenced by whatever else I was doing. Loops were slightly faster at 0.2454 +/- 0.0141 s, compared with 0.2512 +/- 0.0251 s for list comprehensions, a gap that is well within one standard deviation. Basically, there is no difference apart from saving a few seconds by writing a bit less code.
import time
import random
import numpy as np
inputList = [random.randint(-10,10) for i in range(999999)]
def timeIt(inputList, fs):
    tick = time.time()
    fs(inputList)
    return time.time() - tick
def listComprehensions(inputList):
    return [str(i) for i in inputList]
def forLoop(inputList):
    outputList = []
    for i in inputList:
        outputList.append(str(i))
    return outputList
measurements_loops = []
for i in range(1000):
    measurements_loops.append(timeIt(inputList, forLoop))
measurements_listcomprehensions = [timeIt(inputList, listComprehensions) for i in range(1000)]
print(np.mean(measurements_loops))
print(np.mean(measurements_listcomprehensions))
print(np.std(measurements_loops))
print(np.std(measurements_listcomprehensions))
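As an alternative, the same comparison can be sketched with the standard library's timeit module, which is designed for this kind of measurement. The input size, the repetition counts, and the names sampleList, viaLoop and viaComprehension are assumptions chosen so the sketch runs quickly. They are not the settings behind the numbers quoted above.

```python
import random
import statistics
import timeit

# Smaller input than in the script above so the sketch finishes quickly (an assumption).
sampleList = [random.randint(-10, 10) for _ in range(10000)]

def viaComprehension(values):
    return [str(v) for v in values]

def viaLoop(values):
    out = []
    for v in values:
        out.append(str(v))
    return out

# repeat() returns one total time per repetition; each repetition makes `number` calls.
loopTimes = timeit.repeat(lambda: viaLoop(sampleList), number=10, repeat=5)
compTimes = timeit.repeat(lambda: viaComprehension(sampleList), number=10, repeat=5)

print(statistics.mean(loopTimes), statistics.stdev(loopTimes))
print(statistics.mean(compTimes), statistics.stdev(compTimes))
```

The measured times will vary between machines and runs, so the printed values are only meaningful relative to each other.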