
Understanding Algorithm Efficiency with Big O Notation


Knowing how to write efficient code is crucial for every programmer. Big O notation helps us measure how well algorithms perform as we give them more data. In this blog post, we'll explore the basics of algorithm efficiency, focusing on concepts like time complexity and Big O notation. By the end, you'll have a clear idea of how to analyze and improve your code's performance, making it faster and better able to handle large amounts of information. Let's dive into the world of algorithm efficiency and make your code run smarter!

What is good code?

Good code is judged on two qualities: readability and scalability.

Readability is about how easily others can understand your code. Scalability is divided into two parts: time and space.

Time complexity

Time complexity describes how long a program takes to complete its execution. When analyzing a program's efficiency, though, time is not about how many seconds, minutes, or hours a specific run takes, because execution time is not the same on every machine.

For example, you can run a specific program in VS Code using Code Runner and get its running time in seconds. But you will not get the same number when you run the program again, and you will get different numbers again on other machines.
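You can see this for yourself with a rough timing sketch like the one below. (This is just an illustration; count_up is a throwaway helper made up for this post, and the exact numbers will differ on your machine.) Running it usually prints two slightly different durations for identical work:

import time

def count_up(n):
    for i in range(1, n + 1):
        pass  # stand-in for the real work

# Timing the same call twice rarely gives the same number.
start = time.perf_counter()
count_up(1_000_000)
print("Run 1:", time.perf_counter() - start, "seconds")

start = time.perf_counter()
count_up(1_000_000)
print("Run 2:", time.perf_counter() - start, "seconds")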

So, how can we measure time then? Let's discuss the standard way to express time complexity.

Time complexity is analyzed using notations like Big O, Omega and Theta.

  • Big O describes the program's worst-case time.
  • Omega describes the best case.
  • Theta describes the average case.

For time complexity we always focus on Big O, because optimizing for the worst-case scenario guarantees the program holds up even when the input is least favorable.

Big O Notation

Let's understand it using an example program.

demo.py
def main(n):
    """
    Print numbers from 1 to n
    """
    for i in range(1, n + 1):
        print(i)

main(5)
 
'''
Output:
1
2
3
4
5
'''
  • The for loop will run n times. If the input is 5, it runs 5 times, so the number of operations depends on the input n.
  • The print function will also run n times, since it is inside the for loop.

Time = n + n -> 2n
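A rough way to check that count, assuming we charge one unit for each loop iteration and one for each print (an assumption made just for this sketch), is to tally the operations ourselves:

def count_ops(n):
    ops = 0
    for i in range(1, n + 1):
        ops += 1  # the loop iteration itself
        ops += 1  # the print call inside the loop
    return ops

print(count_ops(5))    # 10 -> 2 * 5
print(count_ops(100))  # 200 -> 2 * 100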

So the raw operation count for the program is 2n. But saying the time complexity is 2n is not correct.

But why?

There is a set of rules that makes time complexity simple to analyze.

What causes time in a function?

  • Operations (+, -, *, /)
  • Comparisons (<, >, ==)
  • Looping (for, while)
  • Outside function calls (function())
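As a quick illustration, here is a toy function (the names are invented for this post) with each line annotated by the cost source it represents:

def helper(x):
    print(x)

def what_costs_time(a, b, items):
    total = a + b        # operation (+)
    if total > 10:       # comparison (>)
        print("Total is big")
    for item in items:   # looping (for)
        print(item)
    helper(total)        # outside function call

what_costs_time(5, 10, [1, 2, 3])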

Rules to find time complexity

  • Rule 1: Always Worst Case

  • Rule 2: Remove Constants

  • Rule 3: Different inputs should have different variables. Two inputs handled one after another give O(a + b); arrays A and B nested give O(a * b).

    • + for steps in order

    • * for nested steps

  • Rule 4: Drop Non-dominant terms

Let's discuss each rule.

Rule 1: Always Worst Case

When analyzing an algorithm, consider the worst-case scenario. This means you look at the maximum time or space the algorithm will take.

# Finding a specific number in an array. For example, take the number as 7.
def findNum(array):
    for num in array:
        if num == 7:
            print("Number found")
            break
  • Best Case

    • Scenario: The number 7 is the first element in the array.
    • Explanation: The loop will find the number on the first iteration and then break out of the loop.
  • Average Case

    • Scenario: The number 7 is somewhere in the middle of the array.
    • Explanation: On average, the loop will have to iterate through half of the elements in the array to find the number.
  • Worst Case

    • Scenario: The number 7 is the last element in the array or is not present in the array at all.
    • Explanation: The loop will iterate through all elements of the array.

So, considering the worst case, the time complexity is O(n).
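To make the three cases concrete, here are some hypothetical inputs for findNum (any arrays would do; these are made up for illustration):

findNum([7, 2, 9, 4])  # best case: 7 is first, the loop stops after one check
findNum([2, 9, 7, 4])  # average case: 7 is found about halfway through
findNum([2, 9, 4, 7])  # worst case: 7 is last, all n elements are checked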

Rule 2: Remove Constants

In Big O notation, constant factors are ignored because they do not significantly affect the growth rate of the algorithm. For example, O(2n) is simplified to O(n).

demo.py
def loopTwice(array):
    for num in array:   # first pass: n iterations
        print("Loop")

    for num in array:   # second pass: n more iterations
        print("Loop")

As you can see, we loop through the array twice using two separate for loops. Let's take the size of the array as n.

Time = O(n+n) -> O(2n)

Now we have to remove the constants.

Time = O(2n) -> O(n)
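The same rule also covers fractions, not just multiples: a loop over half of an array runs n/2 times, and O(n/2) simplifies to O(n) because 1/2 is just another constant. A minimal sketch (print_first_half is a made-up name for illustration):

def print_first_half(array):
    # Runs len(array) // 2 times: O(n/2), which simplifies to O(n).
    for i in range(len(array) // 2):
        print(array[i])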

Rule 3: Different inputs should have different variables

When an algorithm takes multiple inputs, each input should be represented by a different variable. For instance:

  • Sequential operations: If you perform operations on two separate arrays of sizes a and b, the time complexity is O(a+b).
  • Nested operations: If you have a nested loop where the outer loop runs a times and the inner loop runs b times, the time complexity is O(a*b).
demo.py
def loop(arr1, arr2):
    for num in arr1:  # O(n)
        print(num)

    for num in arr2:  # O(m)
        print(num)

In this example, there are two different inputs, arr1 and arr2. Let's assume the size of arr1 is n and the size of arr2 is m.

Time = O(n + m) -> Sequential operations

def loop(arr1, arr2):
    for i in arr1:      # O(n)
        for j in arr2:  # O(m)
            print(i, j)
    # Nested loops give us O(n * m)

Time = O(n * m) -> Nested operations

If there is only one input but you use a nested loop over it, the time will be O(n * n) -> O(n^2), as in the sketch below.
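A minimal sketch of that single-input nested case (print_pairs is a made-up name for illustration):

def print_pairs(array):
    # The inner loop runs n times for each of the n outer
    # iterations, giving n * n = n^2 prints in total.
    for i in array:
        for j in array:
            print(i, j)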

Rule 4: Drop Non-dominant Terms

When multiple terms are added together, only the term with the largest growth rate is kept. For example, O(n^2 + n) is simplified to O(n^2) because as n grows, n^2 will dominate n.

demo.py
def example_function(n):
    # O(n^2) complexity
    for i in range(n):      # Outer loop: runs 'n' times
        for j in range(n):  # Inner loop: runs 'n' times
            print(i, j)

    # O(n) complexity
    for k in range(n):      # Another loop: runs 'n' times
        print(k)
  • The nested loops for i in range(n) and for j in range(n) have a complexity of O(n^2): for each of the n iterations of the outer loop, the inner loop runs n times, giving n * n operations.
  • The second loop for k in range(n) has a complexity of O(n) because it iterates through n elements once.

The total complexity of the example_function is O(n^2 + n). When n is large, the n^2 term dominates the n term, so the overall complexity is effectively O(n^2).

Various Big O's in analysis

  • O(1) -> Constant: no loops
  • O(log n) -> Logarithmic: usually searching algorithms on sorted data, such as binary search (see the sketch after this list)
  • O(n) -> Linear: for loops or while loops through n items
  • O(n log n) -> Log Linear: usually sorting operations
  • O(n^2) -> Quadratic: every element in a collection needs to be compared to every other element; two nested loops
  • O(2^n) -> Exponential: recursive algorithms that solve a problem of size n
  • O(n!) -> Factorial: you are adding a loop for every element
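As an illustration of the logarithmic case, here is a standard iterative binary search (a textbook sketch, not code from the examples above). Each iteration halves the remaining search range, so a sorted list of n elements needs at most about log2(n) iterations:

def binary_search(sorted_array, target):
    low, high = 0, len(sorted_array) - 1
    while low <= high:
        mid = (low + high) // 2          # check the middle element
        if sorted_array[mid] == target:
            return mid                   # found it
        elif sorted_array[mid] < target:
            low = mid + 1                # discard the left half
        else:
            high = mid - 1               # discard the right half
    return -1  # target is not present

print(binary_search([1, 3, 5, 7, 9, 11], 7))  # 3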

Iterating through half a collection is still O(n)

Two separate collections iterated one after another: O(a + b); nested: O(a * b)

To learn more about the various Big O's with examples, check out Various Big O's in Analysis of Algorithm.