Introducing an efficient Big O analyzer, a premium state-of-the-art AKDSFramework feature to analyze Big O for any function without any human intervention.
As you already know calculating big O is a big part of what we do to approximate the running time of an algorithm in worst cases. But most of it is done by hand tracing step by step manually. I'm going to propose an analytical approach to compute big O with respect to one expanding variable. Let's see what I mean with some examples
- Sorting a sequence of numbers (Big O is \(O(n \log n)\) with respect to the sequence's length (n))
- Finding the maximum element of a given array is \(O(n)\) with respect the length of the array.
- Finding the last element of a singly linked list is \(O(n)\) with respect to the length of the list.
So in the above cases I'm calling the sequence of numbers the "expanding variables" because we are calculating how the algorithm would perform when these "expanding variables" grows towards big sizes.
Now let's create a Big O analyzer.
Designing the Big O analyser system
Our big O analyser system has these following parts
- A function that would make a dictionary and record how much time the function is taking for different size of inputs.
- Another function that would generate different size of inputs that can be fed into the function. Let's call it a generator.
- Another function that would interpret the execution times and fit the times into a definitive time complexity with respect to the expanding variable.
Let's see an example to clarify this thing:
Let's say we are tasked to find the complexity of the python function sorted()
. Now we identify what's our expanding variable?
For the function sorted it sorts a sequence of data, so the expanding variable would be the sequence of numbers. So now let's import an generator that is built into AKDSFramework
from AKDSFramework.applications.complexity_analysis.data_generators import integer_sequence
Now armed with integer_sequence that can generate sequence of integers with length of any number we create an instance of the integer_sequence
like this:
int_generator = lambda n: integer_sequence(lb=1, ub=200000, size=n)
Now this int_generator can create a random sequence of length n
and individual elements are ranging from 1 to 200000.
Now our job is to make a dictionary and record how much time the function is taking for different size of inputs. From the previous piece of code we already know that we can generate different size of inputs. Now it's time to create the dictionary. For that we need to do the followings
from AKDSFramework.applications.complexity_analysis.analyser import runtimedict
Now we feed the function and any other keyword arguments for the function into runtimedict
method.
The run-time dictionary
runtimedict
method takes in a few arguments these are:
func
: The function for which you want to make the execution time dictionary.pumping_lower_bound
: Lowest size of the expanding variable. For example if you have to find the execution time for some size of arrays from where you would start.pumping_upper_bound
: Largest size of the expanding variable. For example if you have to find the execution time for some size of arrays where you would stop.total_measurements
: From lowest size of array to largest size of array how many measurements you want to do?pumping
: Among all the keyword arguments what variable needs to be pumped meaning what arguments is the expanding variable. Put the name of the variable in strings.- **kwargs: All the arguments of the functions.
Let's take the example of sorted
. Sorted function takes in iterable
as the keyword argument for the sequence of numbers so to make the run time dictionary we write this:
- We'll record 200 measurements.
- Our array size will start from 1000
- Our array size will end to 5000
So the code would be
# The integer generator from before
int_generator = lambda n: int_generator(1, 200000, n)
# And the Run time dictionary
rtdc = runtimedict(sorted, 1000, 5000, 200, pumping='iterable', iterable=int_generator)
Now to fit the complexity we need the following lines of code:
from AKDSFramework.applications.complexity_analysis.analyser import run_inference_on_complexity
run_inference_on_complexity(rtdc)
Output
Calculating complexity: 100%|██████████| 7/7 [00:00<00:00, 2618.63it/s]
O(N log N)
As you can see that our analysis of sorted function is \(O(n \log n)\) which is actually true.
Another example
Now let's take another example of bubble sort and insertion sort. Both are order \(O(n^2)\)
from AKDSFramework.applications.sorting import bubblesort, insertionsort
rtdc = runtimedict(insertionsort, 10, 2000, 200, pumping='array', array=int_generator, visualize=False, maintain_iter_dict=False)
run_inference_on_complexity(rtdc)
Output
Processing: 100%|██████████| 200/200 [00:12<00:00, 16.17it/s]
Calculating complexity: 100%|██████████| 7/7 [00:00<00:00, 2285.37it/s]
O(N^2)
Bubble sort
rtdc = runtimedict(bubblesort, 1000, 5000, 200, pumping='array', array=int_generator, visualize=False, maintain_iter_dict=False)
run_inference_on_complexity(rtdc)
Output
Processing: 100%|██████████| 200/200 [03:23<00:00, 1.02s/it]
Calculating complexity: 100%|██████████| 7/7 [00:00<00:00, 3031.82it/s]
O(N^2)
Inner workings
So we can say the big O complexity analysis is working. Let's see the inner workings of this module:
- First we calculate the runtime for different size of array.
- Next we fit the size and time to return the least-squares solution to a linear matrix equation. Now by fitting we mean in separate instances we transform the size to order \(O(n)\) or \(O(n^2)\) or \(O(n^3)\) or \(O(n \log n)\) or \(O(\log n)\) or \(O(c^n)\) then we look at which one is most fitted to a straight line with the time. The most fitted one will be our big O because the time would be in the same order.
- We return the most fitted complexity.
To see which one fits better we use numpy.linalg.lstsq
to return the least-squares solution to a linear matrix equation. Returned residual is minimum means that the equation fits better to a linear equation. More about this method here
Installation
- First download/clone this repo like git clone
https://github.com/theroyakash/AKDSFramework.git
- Now uninstall if any previous version installed
pip3 uninstall AKDSFramework
- Now install fresh on your machine
pip3 install -e AKDSFramework
Alternate installation
This is easier to install but a bit slower in the installation time.
pip3 install https://github.com/theroyakash/AKDPRFramework/tarball/main
First code, Check the version
Now to check whether your installation is completed without error import AKDSFramework
import AKDSFramework
print('AKDSFramework Version is --> ' + AKDSFramework.__version__)
What you contribute is the only resource behind these material. Please support me on gumroad