
Speeding up Python code

admin | 9 January 2021

Python is one of the most popular programming languages, and many companies use it in production because it allows rapid delivery. But what about its performance? Here Python falls short: it is slower than many compiled languages such as C++ and Java. So how can we make it faster?

There are several ways to do this. We will discuss them one by one.


For demonstration purposes, I have created a script that finds the nearest neighbors of each data point, ordered by increasing Euclidean distance.
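To make the task concrete, here is a minimal sketch of what such a script might look like. It is not the original script, which is not reproduced in this post; the function names pairwise_distances_loop and nearest_neighbors and the three sample points are assumptions chosen for illustration.

```python
import numpy as np


def pairwise_distances_loop(points):
    """Euclidean distance between every pair of rows, using plain loops."""
    n, dim = points.shape
    dists = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            total = 0.0
            for k in range(dim):
                diff = points[i, k] - points[j, k]
                total += diff * diff
            dists[i, j] = np.sqrt(total)
    return dists


def nearest_neighbors(points):
    """For each point, indices of the other points by increasing distance."""
    dists = pairwise_distances_loop(points)
    # Push each point's distance to itself to the end of its row ...
    np.fill_diagonal(dists, np.inf)
    order = np.argsort(dists, axis=1, kind="stable")
    # ... and drop that last column so only the other points remain.
    return order[:, :-1]


# Hypothetical sample data: three points on a line.
points = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 0.0]])
neighbors = nearest_neighbors(points)
print(neighbors)
```

The triple loop is deliberately naive: it is the kind of code that is easy to write and slow to run in pure Python, which is what the rest of this post tries to speed up.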
Before moving on to optimization, we will first profile the code to see where it spends most of its time. For that we will use line_profiler, a great tool that reports the execution time of each line inside the functions it is applied to.

First, let's import NumPy and line_profiler.

Using the %%writefile cell magic, we write our code to a Python script.

Next, we import our script so that we can execute and profile the code with the %lprun magic command. We save the report to a file named sim_result and display it in the cell.
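Outside a notebook, the same kind of line-by-line report can be produced through line_profiler's Python API. The sketch below is an assumption about how one might do that, not the code used in this post: row_norms_loop is a hypothetical toy workload and profile_by_line is a hypothetical helper. The import is guarded because line_profiler is a third-party package that may not be installed.

```python
import io

import numpy as np

try:
    from line_profiler import LineProfiler
except ImportError:  # line_profiler is not part of the standard library
    LineProfiler = None


def row_norms_loop(points):
    """Toy workload: Euclidean norm of each row, computed with loops."""
    norms = np.zeros(points.shape[0])
    for i in range(points.shape[0]):
        total = 0.0
        for value in points[i]:
            total += value * value
        norms[i] = np.sqrt(total)
    return norms


def profile_by_line(func, *args):
    """Run func under line_profiler and return the report as text.

    Returns None when line_profiler is not installed.
    """
    if LineProfiler is None:
        return None
    profiler = LineProfiler()
    profiler.add_function(func)      # register the function to time
    profiler.runcall(func, *args)    # execute it with profiling enabled
    stream = io.StringIO()
    profiler.print_stats(stream=stream)
    return stream.getvalue()


report = profile_by_line(row_norms_loop, np.random.rand(200, 3))
if report is not None:
    print(report)
```

In a notebook, the usual equivalent is to load the extension with %load_ext line_profiler and then run %lprun -f some_function some_function(args); the -T option of %lprun writes the report to a text file.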


We can see that the function comp_inner_raw() consumes a large share of the computation time, so this is the function we need to optimize.

Raw Python computation time:

CPU times: user 10.6 s, sys: 1.77 s, total: 12.3 s
Wall time: 12.3 s
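The figures above are in the format printed by IPython's %%time magic, which reports CPU time and wall-clock time separately. As a rough sketch of how to take similar measurements in a plain Python script, the standard library's time module can be used. The helper time_call and the workload sum_of_squares are hypothetical names for illustration.

```python
import time


def time_call(func, *args):
    """Return (result, cpu_seconds, wall_seconds) for one call of func."""
    cpu_start = time.process_time()    # CPU time used by this process
    wall_start = time.perf_counter()   # elapsed real time
    result = func(*args)
    wall_seconds = time.perf_counter() - wall_start
    cpu_seconds = time.process_time() - cpu_start
    return result, cpu_seconds, wall_seconds


def sum_of_squares(n):
    """Small stand-in workload with an explicit Python loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


result, cpu_seconds, wall_seconds = time_call(sum_of_squares, 1000)
print(f"CPU time: {cpu_seconds:.6f} s, wall time: {wall_seconds:.6f} s")
```

A single call can be noisy; for small functions, repeating the call several times and taking the best or the average gives a more stable number.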

Optimizing the code using Numba:

The function comp_inner_raw() uses loops, and loops in Python are slow. Numba is a just-in-time compiler for Python that works best on code that uses NumPy arrays, NumPy functions, and loops. It uses the LLVM compiler infrastructure to generate optimized machine code from pure Python code, which speeds up execution.

This is how Numba works:

[Figure omitted. Image courtesy: ContinuumIO]

Simply apply the jit decorator to the comp_inner_raw() function to let Numba compile it.
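As an illustration of the pattern, the sketch below applies Numba to a loop-based distance function. It is not the original comp_inner_raw(), which is not reproduced here: pairwise_distances is a hypothetical stand-in, and it uses njit, Numba's shorthand for jit with nopython mode enabled. The import is guarded so the sketch still runs, without any speedup, when Numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback (assumption for portability): run as plain Python.
    def njit(func):
        return func


@njit
def pairwise_distances(points):
    """Euclidean distance between every pair of rows of a 2-D array."""
    n, dim = points.shape
    dists = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            total = 0.0
            for k in range(dim):
                diff = points[i, k] - points[j, k]
                total += diff * diff
            dist = np.sqrt(total)
            dists[i, j] = dist   # the matrix is symmetric,
            dists[j, i] = dist   # so fill both halves at once
    return dists


points = np.random.rand(200, 3)
dists = pairwise_distances(points)  # first call includes compilation
dists = pairwise_distances(points)  # later calls reuse the compiled code
```

Because compilation happens on the first call, timing only that call overstates the cost; it is usually fairer to time a second call.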

Numba-optimized Python code computation time:

CPU times: user 3.46 s, sys: 1.75 s, total: 5.21 s
Wall time: 5.21 s

Using Numba, we made our code about 2.4x faster than the raw Python code (12.3 s versus 5.21 s wall time).

Optimizing the code using SWIG:

The Simplified Wrapper and Interface Generator (SWIG) makes it possible to wrap C/C++ libraries for use from other languages such as Python, Ruby, and Java.

[Figure omitted. Image courtesy: swig.org]

To create the wrapper, we first create our C++ (.cpp) file.

We then create the interface file, which is the input SWIG uses to generate the wrapper files.

Next, we create our header file.

Generating the wrapper
We write a bash script, build.sh, to generate the wrapper for our code. The script also enables compiler optimizations.

Run the command bash build.sh to generate the wrapper. Then import the SWIG-generated wrapper, myknn.py, in Python.

SWIG-optimized Python code computation time:

CPU times: user 992 ms, sys: 3.04 ms, total: 995 ms
Wall time: 991 ms

Using SWIG, the code became about 12x faster than the raw Python code (12.3 s versus 991 ms wall time).

Conclusion:

The results for the execution time to run the algorithm are summarized in the table below:

Method        Wall time   Speedup vs. raw Python
Raw Python    12.3 s      1x
Numba         5.21 s      about 2.4x
SWIG          991 ms      about 12.4x

Using SWIG sped up our Python code by about 12 times, which also makes it roughly 5 times faster than the Numba-optimized code.
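The speedup factors follow directly from the wall times reported earlier in this post. A short sketch of the arithmetic, with the dictionary name wall_times chosen only for illustration:

```python
# Wall times in seconds, taken from the timing outputs in this post.
wall_times = {"Raw Python": 12.3, "Numba": 5.21, "SWIG": 0.991}

baseline = wall_times["Raw Python"]
speedups = {name: baseline / seconds for name, seconds in wall_times.items()}

for name, factor in speedups.items():
    print(f"{name}: {factor:.1f}x")

# How much faster the SWIG version is than the Numba version.
swig_vs_numba = wall_times["Numba"] / wall_times["SWIG"]
print(f"SWIG vs. Numba: {swig_vs_numba:.1f}x")
```

These numbers come from single runs on one machine, so they are best read as rough indications rather than precise benchmarks.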

References:
1. https://rushter.com/blog/numba-cython-python-optimization/
2. https://ipython-books.github.io/43-profiling-your-code-line-by-line-with-line_profiler/
3. https://jakevdp.github.io/blog/2015/02/24/optimizing-python-with-numpy-and-numba/
4. https://towardsdatascience.com/speed-up-your-algorithms-part-2-numba-293e554c5cc1
