A bilibili vedio also shows this : [Bilibili Video][https://www.bilibili.com/video/BV16u4m1c7cU/?spm_id_from=333.999.0.0] and I think it's a good vedio, but its language is Chinese.
Time complexity is a measure of how much time an algorithm takes to run as a function of the size of its input. It is a way to describe the efficiency of an algorithm, and it is used to compare different algorithms and determine which one is the most efficient.
How to calculate time complexity?
To calculate time complexity, we need to consider the number of operations performed by the algorithm as a function of the size of the input. The most common way to measure the number of operations is by counting the number of times a particular operation is performed.
What are some common pitfalls when calculating time complexity?
Here is a question:
Find the max 10 integers in a list.
import random ls = [random.randint(1, 100) for _ in range(n)]
Here is the test function:
import time def benchmark(func, *args) -> float: start = time.perf_counter() func(*args) end = time.perf_counter() return end - start
Here is the solution which uses the heapq module:
def find_max_n(ls, n): import heapq return heapq.nlargest(n, ls)
Or we write it in the python way:
def find_largest_n(nums, n): if n max_heap[0]: max_heap[0] = num _sift_down(max_heap, 0) return max_heap def _sift_down(heap, index): left = 2 * index 1 right = 2 * index 2 largest = index if left heap[largest]: largest = left if right heap[largest]: largest = right if largest != index: heap[index], heap[largest] = heap[largest], heap[index] _sift_down(heap, largest)
Here is the solution which uses the sort function:
def find_max_n(ls, n): return sorted(ls, reverse=True)[:n]
Alomost everyone know that The time complexity of the heap, is O(n log k), and the time complexity of the sort function is O(n log n).
It seems that the First solution is better than the second one. But it is not true in python.
Look at the final code:
import time def benchmark(func, *args) -> float: start = time.perf_counter() func(*args) end = time.perf_counter() return end - start def find_max_n_heapq(ls, n): import heapq return heapq.nlargest(n, ls) def find_max_n_python_heap(nums, n): if n max_heap[0]: max_heap[0] = num _sift_down(max_heap, 0) return max_heap def _sift_down(heap, index): left = 2 * index 1 right = 2 * index 2 largest = index if left heap[largest]: largest = left if right heap[largest]: largest = right if largest != index: heap[index], heap[largest] = heap[largest], heap[index] _sift_down(heap, largest) def find_max_n_sorted(ls, n): return sorted(ls, reverse=True)[:n] # test import random for n in range(10, 10000, 100): ls = [random.randint(1, 100) for _ in range(n)] print(f"n = {n}") print(f"Use Heapq: {benchmark(find_max_n_heapq, ls, n)}") print(f"Python Heapq: {benchmark(find_max_n_python_heap, ls, n)}") print(f"Sorted : {benchmark(find_max_n_sorted, ls, n)}")
I run it in Python3.11 VScode, Here is the result:
Use Heapq: 0.002430099993944168
Python Heapq: 0.06343129999004304
Sorted : 9.280000813305378e-05
n = 910
Use Heapq: 9.220000356435776e-05
Python Heapq: 0.07715500006452203
Sorted : 9.360001422464848e-05
n = 920
Use Heapq: 9.440002031624317e-05
Python Heapq: 0.06573969998862594
Sorted : 0.00012450001668184996
n = 930
Use Heapq: 9.689992293715477e-05
Python Heapq: 0.06760239996947348
Sorted : 9.66999214142561e-05
n = 940
Use Heapq: 9.600003249943256e-05
Python Heapq: 0.07372559991199523
Sorted : 9.680003859102726e-05
n = 950
Use Heapq: 9.770004544407129e-05
Python Heapq: 0.07306570000946522
Sorted : 0.00011979998089373112
n = 960
Use Heapq: 9.980006143450737e-05
Python Heapq: 0.0771204000338912
Sorted : 0.00022829999215900898
n = 970
Use Heapq: 0.0001601999392732978
Python Heapq: 0.07493270002305508
Sorted : 0.00010840001050382853
n = 980
Use Heapq: 9.949994273483753e-05
Python Heapq: 0.07698320003692061
Sorted : 0.00010300008580088615
n = 990
Use Heapq: 9.979994501918554e-05
Python Heapq: 0.0848745999392122
Sorted : 0.00012620002962648869
n = 10000
Use Heapq: 0.003642000025138259
Python Heapq: 9.698883199947886
Sorted : 0.00107999995816499
n = 11000
Use Heapq: 0.0014836000045761466
Python Heapq: 10.537632800056599
Sorted : 0.0012236000038683414
n = 12000
Use Heapq: 0.001384599949233234
Python Heapq: 12.328411899972707
Sorted : 0.0013226999435573816
n = 13000
Use Heapq: 0.0020017001079395413
Python Heapq: 15.637207800056785
Sorted : 0.0015075999544933438
n = 14000
Use Heapq: 0.0017026999266818166
Python Heapq: 17.298848500009626
Sorted : 0.0016967999981716275
n = 15000
Use Heapq: 0.0017773000290617347
Python Heapq: 20.780625900020823
Sorted : 0.0017105999868363142
When n is big, Sorted costs a little time (sometimes even better than use heapq), but Python Heapq costs a lot of time.
Bulit-in function is faster than heapq because it is written in C, which is a compiled language.
When we are dealing with large data, we should use built-in function instead of heapq.sort() to improve the performance of our code. We must be wary of time complexity pitfalls when dealing with large data. Sometimes time complexity pitfalls are unavoidable, but we should try to avoid them as much as possible.
Hello, I'm mengqinyuan. I'm a student. I love to learn new things.
You can see my github: [MengQinYuan's Github][https://github.com/mengqinyuan]
Disclaimer: All resources provided are partly from the Internet. If there is any infringement of your copyright or other rights and interests, please explain the detailed reasons and provide proof of copyright or rights and interests and then send it to the email: [email protected] We will handle it for you as soon as possible.
Copyright© 2022 湘ICP备2022001581号-3