"If a worker wants to do his job well, he must first sharpen his tools." - Confucius, "The Analects of Confucius. Lu Linggong"
Front page > Programming > “Be wary of time complexity pitfalls\"

“Be wary of time complexity pitfalls\"

Published on 2024-08-16
Browse:843

“Be wary of time complexity pitfalls\

Be wary of time complexity pitfalls

Write Here

A bilibili vedio also shows this : [Bilibili Video][https://www.bilibili.com/video/BV16u4m1c7cU/?spm_id_from=333.999.0.0] and I think it's a good vedio, but its language is Chinese.

time complexity

  • What does time complexity mean?
  • Time complexity is a measure of how much time an algorithm takes to run as a function of the size of its input. It is a way to describe the efficiency of an algorithm, and it is used to compare different algorithms and determine which one is the most efficient.

  • How to calculate time complexity?

  • To calculate time complexity, we need to consider the number of operations performed by the algorithm as a function of the size of the input. The most common way to measure the number of operations is by counting the number of times a particular operation is performed.

  • What are some common pitfalls when calculating time complexity?

    1. Not considering the input size: If we only consider the number of operations performed by the algorithm, we may underestimate the time complexity. For example, if we count the number of times a loop runs, but we don't take into account the size of the input, then the time complexity may be too high.
    1. Not considering the algorithm's efficiency: Some algorithms may perform many operations even when the input size is small, which can lead to high time complexity. For example, sorting algorithms like bubble sort and selection sort have quadratic time complexity, even though they may only need to swap two elements in a small array.
    1. Not considering the algorithm's worst-case scenario: Some algorithms have a best-case scenario where they perform fewer operations than the worst-case scenario. For example, searching algorithms like binary search have a best-case scenario where they find the target element in logarithmic time, but they have a worst-case scenario where they need to examine all elements in the array.

Let's see some examples of time complexity

Here is a question:
Find the max 10 integers in a list.

import random
ls = [random.randint(1, 100) for _ in range(n)]

Here is the test function:

import time
def benchmark(func, *args) -> float:
    start = time.perf_counter()
    func(*args)
    end = time.perf_counter()
    return end - start

Solution1: Use heap

Here is the solution which uses the heapq module:

def find_max_n(ls, n):
    import heapq
    return heapq.nlargest(n, ls)

Or we write it in the python way:

def find_largest_n(nums, n):
    if n  max_heap[0]:
            max_heap[0] = num
            _sift_down(max_heap, 0)

    return max_heap

def _sift_down(heap, index):
    left = 2 * index   1
    right = 2 * index   2
    largest = index

    if left  heap[largest]:
        largest = left

    if right  heap[largest]:
        largest = right

    if largest != index:
        heap[index], heap[largest] = heap[largest], heap[index]
        _sift_down(heap, largest)

Solution2: Use sort

Here is the solution which uses the sort function:

def find_max_n(ls, n):
    return sorted(ls, reverse=True)[:n]

Alomost everyone know that The time complexity of the heap, is O(n log k), and the time complexity of the sort function is O(n log n).

It seems that the First solution is better than the second one. But it is not true in python.

Look at the final code:

import time
def benchmark(func, *args) -> float:
    start = time.perf_counter()
    func(*args)
    end = time.perf_counter()
    return end - start

def find_max_n_heapq(ls, n):
    import heapq
    return heapq.nlargest(n, ls)

def find_max_n_python_heap(nums, n):
    if n  max_heap[0]:
            max_heap[0] = num
            _sift_down(max_heap, 0)

    return max_heap

def _sift_down(heap, index):
    left = 2 * index   1
    right = 2 * index   2
    largest = index

    if left  heap[largest]:
        largest = left

    if right  heap[largest]:
        largest = right

    if largest != index:
        heap[index], heap[largest] = heap[largest], heap[index]
        _sift_down(heap, largest)


def find_max_n_sorted(ls, n):
    return sorted(ls, reverse=True)[:n]

# test
import random
for n in range(10, 10000, 100):
    ls = [random.randint(1, 100) for _ in range(n)]
    print(f"n = {n}")
    print(f"Use    Heapq: {benchmark(find_max_n_heapq, ls, n)}")
    print(f"Python Heapq: {benchmark(find_max_n_python_heap, ls, n)}")
    print(f"Sorted      : {benchmark(find_max_n_sorted, ls, n)}")

I run it in Python3.11 VScode, Here is the result:

n is not big

Use Heapq: 0.002430099993944168
Python Heapq: 0.06343129999004304
Sorted : 9.280000813305378e-05
n = 910
Use Heapq: 9.220000356435776e-05
Python Heapq: 0.07715500006452203
Sorted : 9.360001422464848e-05
n = 920
Use Heapq: 9.440002031624317e-05
Python Heapq: 0.06573969998862594
Sorted : 0.00012450001668184996
n = 930
Use Heapq: 9.689992293715477e-05
Python Heapq: 0.06760239996947348
Sorted : 9.66999214142561e-05
n = 940
Use Heapq: 9.600003249943256e-05
Python Heapq: 0.07372559991199523
Sorted : 9.680003859102726e-05
n = 950
Use Heapq: 9.770004544407129e-05
Python Heapq: 0.07306570000946522
Sorted : 0.00011979998089373112
n = 960
Use Heapq: 9.980006143450737e-05
Python Heapq: 0.0771204000338912
Sorted : 0.00022829999215900898
n = 970
Use Heapq: 0.0001601999392732978
Python Heapq: 0.07493270002305508
Sorted : 0.00010840001050382853
n = 980
Use Heapq: 9.949994273483753e-05
Python Heapq: 0.07698320003692061
Sorted : 0.00010300008580088615
n = 990
Use Heapq: 9.979994501918554e-05
Python Heapq: 0.0848745999392122
Sorted : 0.00012620002962648869

if n is big ?

n = 10000
Use Heapq: 0.003642000025138259
Python Heapq: 9.698883199947886
Sorted : 0.00107999995816499
n = 11000
Use Heapq: 0.0014836000045761466
Python Heapq: 10.537632800056599
Sorted : 0.0012236000038683414
n = 12000
Use Heapq: 0.001384599949233234
Python Heapq: 12.328411899972707
Sorted : 0.0013226999435573816
n = 13000
Use Heapq: 0.0020017001079395413
Python Heapq: 15.637207800056785
Sorted : 0.0015075999544933438
n = 14000
Use Heapq: 0.0017026999266818166
Python Heapq: 17.298848500009626
Sorted : 0.0016967999981716275
n = 15000
Use Heapq: 0.0017773000290617347
Python Heapq: 20.780625900020823
Sorted : 0.0017105999868363142

What I find & how to improve it

When n is big, Sorted costs a little time (sometimes even better than use heapq), but Python Heapq costs a lot of time.

  • Why Sorted costs a little time, but Python Heapq costs a lot of time?
  • Because sorted() is a bulit-in function in Python, you can find python offical document about it.

Bulit-in function is faster than heapq because it is written in C, which is a compiled language.

  • How to improve it?
  • You can use the built-in function sorted() instead of heapq.sort() to improve the performance of your code. The sorted() function is a built-in function in Python, which is implemented in C and is therefore much faster than heapq.sort()

Conculsion

When we are dealing with large data, we should use built-in function instead of heapq.sort() to improve the performance of our code. We must be wary of time complexity pitfalls when dealing with large data. Sometimes time complexity pitfalls are unavoidable, but we should try to avoid them as much as possible.

About Me

Hello, I'm mengqinyuan. I'm a student. I love to learn new things.
You can see my github: [MengQinYuan's Github][https://github.com/mengqinyuan]

Release Statement This article is reproduced at: https://dev.to/mengqinyuan/be-wary-of-time-complexity-pitfalls-13jf?1 If there is any infringement, please contact [email protected] to delete it
Latest tutorial More>

Disclaimer: All resources provided are partly from the Internet. If there is any infringement of your copyright or other rights and interests, please explain the detailed reasons and provide proof of copyright or rights and interests and then send it to the email: [email protected] We will handle it for you as soon as possible.

Copyright© 2022 湘ICP备2022001581号-3