In Python 3.9, calling a nested function is surprisingly slower than calling a normal function: around 10% slower in my example.
from timeit import timeit
def f():
    return 0

def factory():
    def g():
        return 0
    return g

g = factory()
print(timeit("f()", globals=globals()))
#> 0.074835498
print(timeit("g()", globals=globals()))
#> 0.08470309999999998
dis.dis shows the same bytecode for both functions, and the only difference I've found is in the functions' internal flags. Indeed, dis.show_code reveals that g has the NESTED flag while f does not.
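The comparison described above can be reproduced programmatically. This is a minimal sketch: the helper name `ops` is mine, and it compares only opcode names and argument representations, ignoring offsets and line numbers.

```python
import dis
import inspect


def f():
    return 0


def factory():
    def g():
        return 0
    return g


g = factory()


def ops(func):
    # Hypothetical helper: reduce a function to its (opcode, argument) pairs.
    return [(ins.opname, ins.argrepr) for ins in dis.get_instructions(func)]


# True when both functions compile to the same instruction sequence.
same_bytecode = ops(f) == ops(g)

# Test the CO_NESTED bit directly on each code object.
f_nested = bool(f.__code__.co_flags & inspect.CO_NESTED)
g_nested = bool(g.__code__.co_flags & inspect.CO_NESTED)

print("same bytecode:", same_bytecode)
print("f nested:", f_nested, "| g nested:", g_nested)
print("g flags:", dis.pretty_flags(g.__code__.co_flags))
```

`dis.pretty_flags` prints the flag names in the same form that `dis.show_code` uses, which makes the extra NESTED entry easy to spot.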
However, the flag can be removed, which makes g as fast as f.
import inspect
# Clear the CO_NESTED bit (& ~ only ever clears it, whereas ^ would set it if it were unset)
g.__code__ = g.__code__.replace(co_flags=g.__code__.co_flags & ~inspect.CO_NESTED)
print(timeit("f()", globals=globals()))
#> 0.07321161100000001
print(timeit("g()", globals=globals()))
#> 0.07439838800000001
I've tried looking at the CPython source to understand how the CO_NESTED flag could affect function execution, but I've found nothing. Is there an explanation for this performance difference related to the CO_NESTED flag?
EDIT: Removing the CO_NESTED flag also seems to have no effect on function execution other than removing the overhead, even when the function captures a variable.
import inspect
global_var = 40
def factory():
    captured_var = 2
    def g():
        return global_var + captured_var
    return g

g = factory()
assert g() == 42
# Clear the CO_NESTED bit on g's code object
g.__code__ = g.__code__.replace(co_flags=g.__code__.co_flags & ~inspect.CO_NESTED)
assert g() == 42  # function still works as expected
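The same experiment can be done without mutating the original function, which makes it easier to compare both versions side by side. This is only a sketch: `without_nested_flag` is a hypothetical helper of mine that builds a new function object around a modified code object while reusing the original closure.

```python
import inspect
import types


def without_nested_flag(func):
    # Hypothetical helper: return a copy of func with CO_NESTED cleared.
    code = func.__code__
    new_code = code.replace(co_flags=code.co_flags & ~inspect.CO_NESTED)
    copy = types.FunctionType(
        new_code,
        func.__globals__,
        func.__name__,
        func.__defaults__,
        func.__closure__,  # reuse the same cells, so captured variables still resolve
    )
    copy.__kwdefaults__ = func.__kwdefaults__
    return copy


global_var = 40


def factory():
    captured_var = 2

    def g():
        return global_var + captured_var

    return g


g = factory()
g_flagless = without_nested_flag(g)

print(g(), g_flagless())
print(bool(g.__code__.co_flags & inspect.CO_NESTED))           # original keeps the flag
print(bool(g_flagless.__code__.co_flags & inspect.CO_NESTED))  # copy has it cleared
```

Both functions share the same closure cells, so the copy sees the captured variable exactly as the original does.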