The timeit module is used to measure how fast our code runs. We can use it to time functions, classes, or just a single line of code, and to compare two pieces of code against each other. So if a problem has several implementations and we want to check which one is the fastest, we can use this module.
Another use: if we have imported a library and have no access to its source code, we cannot assess its performance through algorithm analysis such as big-O. We can still benchmark the library, and maybe compare it against our own implementation or against other libraries.
The timeit module has two main functions that we will use and explain:
- timeit.timeit
- timeit.repeat
timeit.timeit
syntax
timeit.timeit( stmt='pass', setup='pass', timer=<default timer>, number=1000000, globals=None )
- stmt : the statement that we want to measure. It is usually passed as a string, which can span multiple lines; a callable that takes no arguments is also accepted.
- setup : code that runs once before the statement is timed, such as imports or the creation of test data. Its execution time is not included in the result. The default, 'pass', means that your code doesn't require any setup.
- timer : the function that will be used to time your code. It defaults to time.perf_counter, which measures wall-clock time and therefore does include time elapsed during sleep.
- number : the number of times that the statement is executed.
- globals : the namespace in which the statement and the setup are executed. If we want to access variables defined in a certain module, we can use this parameter. We can give it a value of globals(), locals(), or the dictionary of the module whose variables we want to access, using module.__dict__, e.g. random.__dict__
The timeit.timeit function returns a float: the total time, in seconds, that it took to execute our statement number times.
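As a minimal sketch of how these parameters fit together, the call below times a small statement against an explicit namespace and then divides the total by number to get a per-call figure. The names square, namespace, and runs are made up for this illustration.

```python
import timeit


def square(x):
    """Hypothetical function under test."""
    return x * x


# An explicit namespace: stmt and setup can only see the names listed here
# (plus the builtins).
namespace = {'square': square}

runs = 10_000
total_seconds = timeit.timeit(
    stmt='square(value)',
    setup='value = 12',   # executed once, not included in the timing
    number=runs,
    globals=namespace,
)

# timeit.timeit returns the total for all runs, so divide for a per-call figure.
per_call_seconds = total_seconds / runs
print(f'total: {total_seconds:.6f} s, per call: {per_call_seconds:.3e} s')
```

Passing a small dictionary instead of globals() keeps the benchmark from accidentally depending on other names defined in the file.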
An example to illustrate it all
Let us say we want to check whether the sum function is faster than the multiplication function.
import timeit
import time

# define the sum , and multiplication functions
def sum(x, y):
    return x + y

def mul(x, y):
    return x * y

# test how fast the sum function is
sumTimingResult = timeit.timeit(
    globals=globals(),  # access the global variables in this scope
    setup='xData= range(0,100); yData = range(220 , 320)',
    stmt='[ sum(x,yData[index]) for index , x in enumerate(xData)]',
    number=10000,
    timer=time.perf_counter
)
>>> sumTimingResult
0.4553970939999772

# test the mul function
mulTimingResult = timeit.timeit(
    globals=globals(),  # access the global variables in this scope
    setup='xData = range(0,100); yData =range(220 , 320)',
    stmt='[ mul(x,yData[index]) for index , x in enumerate(xData)]',
    number=10000,
    timer=time.perf_counter
)
>>> mulTimingResult
0.5149135840000554

# compare the result
if sumTimingResult < mulTimingResult:
    print(
        f'The sum function {sumTimingResult} is faster than the mul function {mulTimingResult}')
else:
    print(
        f'The sum function {sumTimingResult} is slower than the mul function {mulTimingResult}')

# output
The sum function 0.4553970939999772 is faster than the mul function 0.5149135840000554
After creating the sum and the multiplication functions, we write the code to test them. We used timeit.timeit and passed the following parameters:
- globals : the names that we want our statement to be able to access. Our functions live in the global namespace of the file we are running the test from, so we passed globals(), which returns a dictionary of the globally defined names. This makes sum and mul accessible inside timeit.
- setup : we used this to create our test data. We created two ranges, one to specify the values x can take and the other to specify the values y can take. Note that a semicolon is allowed here to separate the two assignments.
- stmt : we loop over the x and y ranges and call the function under test, sum or mul, on each pair of values.
- number : the number of times the statement is executed. The returned value is the total time for all executions, so a large number smooths out the noise of any single run; divide the result by number to get the average time per execution.
- timer : we explicitly specified the timer that we want the function to use. time.perf_counter is also the default.
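The result of executing the previous code is shown below. The exact timings vary from run to run, which is why these figures differ from the ones in the listing: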
The sum function 0.3349066 is faster than the mul function 0.35314319999999993
timeit.repeat
syntax
timeit.repeat( stmt='pass', setup='pass', timer=<default timer>, repeat=5, number=1000000, globals=None )
timeit.repeat has the same syntax as timeit.timeit; the only difference is that it takes a repeat argument in addition to the arguments passed to timeit.timeit. Calling it is the same as calling timeit.timeit with the given values repeat times. The default is five, so by default it is like running timeit.timeit five times.
The function returns a list containing the total time, in seconds, that each repetition took.
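A short sketch of what the returned list looks like and how it is commonly summarized. The statement being timed (sorting a reversed list) is only a stand-in chosen for this illustration. The Python documentation suggests looking at the minimum of the list, since higher values are usually caused by other processes interfering rather than by the code itself.

```python
import timeit

# Each entry of the result is the total time for `number` executions of stmt.
timings = timeit.repeat(
    stmt='sorted(data)',
    setup='data = list(range(1000, 0, -1))',
    repeat=5,
    number=1_000,
)

print(len(timings))  # one total per repetition

# The lowest value is the least disturbed measurement.
best = min(timings)
print(f'best of {len(timings)}: {best:.6f} s for 1000 runs')
```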
An example to illustrate it all
In this example we have two files: the first one is codeToTest.py, and the second one is codeToTest_timeit.py.
The first file contains our functions, and the second one is where we test their performance.
#codeToTest.py
import re


# raw strings (r'...') keep '\s' from being treated as an invalid escape sequence
def splitUsingRegex(messageToSplit: str, separator: str = r'\s'):
    r'''
    Split a message using the provided separator
    messageToSplit : string
    separator : default '\s' which is ' \t\n\r\x0b\x0c' hence white space
    '''
    patternOfSeparator = f'([^{separator}]+)'
    # pattern will match what is not a separator
    messageWordList = re.findall(patternOfSeparator, messageToSplit)
    return messageWordList


# the default separators are a tuple, which avoids a mutable default argument
def splitUsingLoop(messageToSplit: str, separator=(' ', '\t', '\n', '\r', '\x0b', '\x0c')):
    r'''
    Split a message using only loops
    messageToSplit : string
    separator : default (' ', '\t', '\n', '\r', '\x0b', '\x0c') which is '\s' which is whitespace
    '''
    messageWordList = []
    word = []
    for char in messageToSplit:
        word.append(char)
        for sep in separator:
            if char == sep:
                # the character is a separator : drop it and close the current word
                word.pop()
                if len(word) != 0:
                    messageWordList.append(''.join(word))
                    word = []
                break
    # add the last word if the message did not end with a separator
    if word != []:
        messageWordList.append(''.join(word))
    return messageWordList
codeToTest.py contains two functions. The first one splits a message using a regex. The second one splits a message using only for loops.
We want to check which one performs better, hence we created codeToTest_timeit.py.
# import the modules
import timeit
import time

testData = [
    'this is a test data',
    'the aim is to verify',
    'which of the splitting functions',
    'is going to perform better',
    'the number of iteration here is 235712',
    'and this is why the repeat can be useful',
    'since sometimes just running a test once',
    'we can get a result which is not true'
]

splitUsingRegexTimingResult = timeit.repeat(
    setup='from codeToTest import splitUsingRegex , splitUsingLoop',
    stmt='[splitUsingRegex(data) for data in testData]',
    number=235712,
    repeat=5,
    globals=globals(),
    timer=time.perf_counter
)

splitUsingLoopTimingResult = timeit.repeat(
    setup='from codeToTest import splitUsingRegex , splitUsingLoop',
    stmt='[splitUsingLoop(data) for data in testData]',
    number=235712,
    repeat=5,
    globals=globals(),
    timer=time.perf_counter
)

for _splitUsingLoopTimingResult, _splitUsingRegexTimingResult in zip(splitUsingLoopTimingResult, splitUsingRegexTimingResult):
    if _splitUsingLoopTimingResult < _splitUsingRegexTimingResult:
        print(
            f'splitUsingLoop : {_splitUsingLoopTimingResult} is faster than splitUsingRegex {_splitUsingRegexTimingResult} ')
    else:
        print(
            f'splitUsingLoop : {_splitUsingLoopTimingResult} is slower than splitUsingRegex {_splitUsingRegexTimingResult} ')
Using a separate file has its advantages: it becomes easier to set up our testing.
In the codeToTest_timeit file we first created our testData list, which contains the values that we want to pass to the functions under test.
- We used globals to get access to any variable defined in our testing module, here testData.
- The setup was used to import the functions we wanted to test.
- The stmt is our actual test, which just loops through our test data and calls the function under test on each item.
- We specified a repeat of 5, and the timer is time.perf_counter.
These are the test results. As you can see, the regex version is faster than the nested for loops, by a factor of more than three in every repetition.
splitUsingLoop : 30.1613112 is slower than splitUsingRegex 9.099230499999999
splitUsingLoop : 31.806555499999988 is slower than splitUsingRegex 8.4409013
splitUsingLoop : 30.62968939999999 is slower than splitUsingRegex 9.3228735
splitUsingLoop : 30.689064400000007 is slower than splitUsingRegex 8.864524599999996
splitUsingLoop : 30.370010100000002 is slower than splitUsingRegex 7.826640099999999
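Comparing the repetitions pair by pair works, but the two lists can also be summarized by their best (lowest) values. The sketch below does that with the timings copied from the output above. The variable names are chosen for this illustration, and taking the minimum is one common convention, not the only possible one.

```python
# Timings copied from the output above (seconds per 235712 executions).
loop_timings = [30.1613112, 31.806555499999988, 30.62968939999999,
                30.689064400000007, 30.370010100000002]
regex_timings = [9.099230499999999, 8.4409013, 9.3228735,
                 8.864524599999996, 7.826640099999999]
number = 235712

# Keep the least disturbed measurement of each function.
best_loop = min(loop_timings)
best_regex = min(regex_timings)

# How many times faster the regex version is, based on the best runs.
speedup = best_loop / best_regex

print(f'best splitUsingLoop : {best_loop:.2f} s')
print(f'best splitUsingRegex: {best_regex:.2f} s')
print(f'speed-up            : {speedup:.2f}x')

# One execution of stmt splits all eight test strings.
print(f'per execution (regex): {best_regex / number * 1e6:.1f} microseconds')
```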
Warning
Both timeit.timeit and timeit.repeat turn off garbage collection while they are timing. If your functions allocate and free a lot of memory, the results might not reflect how the code behaves in normal use, so pay attention to this.
You can turn garbage collection back on by putting
gc.enable()
as the first statement in your setup argument. If you pass your own globals, the gc module must be importable from there; writing import gc; gc.enable() works in every case.
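A minimal sketch of the two variants side by side. The statement being timed is a made-up example that allocates many short-lived lists; how much the two figures differ, if at all, depends on the code under test and on the machine.

```python
import timeit

# Hypothetical statement that allocates many short-lived objects.
stmt = '[[] for _ in range(1000)]'

# Default behaviour: garbage collection is disabled while timing.
without_gc = timeit.timeit(stmt=stmt, number=1_000)

# Re-enable the collector as the first statement of setup.
with_gc = timeit.timeit(
    stmt=stmt,
    setup='import gc; gc.enable()',
    number=1_000,
)

print(f'gc disabled: {without_gc:.4f} s')
print(f'gc enabled : {with_gc:.4f} s')
```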