Computing MD5 Hashes for Large Files in Python
Python's hashlib module provides a convenient interface for computing cryptographic hashes. However, for files larger than available memory, the naive approach of reading the entire file into memory and hashing it in a single call is problematic.
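To illustrate the problem, here is a minimal sketch of the one-shot approach (the function name md5_one_shot is mine, for illustration). It is fine for small files, but f.read() loads the entire file into memory at once:

```python
import hashlib

def md5_one_shot(path):
    # f.read() with no size argument loads the WHOLE file into memory
    # before hashing -- fails or thrashes for files larger than RAM.
    with open(path, "rb") as f:
        return hashlib.md5(f.read()).hexdigest()
```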
Solution: Progressive Hashing
To address this issue, we employ progressive hashing by reading the file in manageable chunks. This approach ensures that the entire file content is hashed without consuming excessive memory. Here's a sample Python function that implements this technique:
import hashlib

def md5_for_file(f):
    block_size = 2**20  # read in 1 MiB chunks
    md5 = hashlib.md5()
    while True:
        data = f.read(block_size)
        if not data:  # b"" signals end of file
            break
        md5.update(data)
    return md5.digest()
To calculate the MD5 hash of a large file, you can invoke the function as follows:
with open("filename", "rb") as f:
    md5 = md5_for_file(f)
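As an aside, the same chunked loop can be written more compactly with iter() and functools.partial; this variant (the name md5_for_file_iter is mine, not from the original) behaves identically but returns the hex digest:

```python
import hashlib
from functools import partial

def md5_for_file_iter(f, block_size=2**20):
    md5 = hashlib.md5()
    # iter(callable, sentinel) keeps calling f.read(block_size) until
    # it returns b"" (end of file), yielding one chunk per iteration.
    for chunk in iter(partial(f.read, block_size), b""):
        md5.update(chunk)
    return md5.hexdigest()
```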
Note on File Mode
Ensure that you open the file in binary mode with "rb". In text mode ("r"), Python 3 returns str objects that hashlib refuses to accept, and newline translation or character decoding can silently alter the data, leading to incorrect hashes.
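A small self-contained check (with a made-up payload chosen to include "\r\n") confirms that a binary-mode read reproduces the raw bytes exactly, so the hash matches the hash of the original data:

```python
import hashlib
import os
import tempfile

# Payload deliberately contains \r\n, which text mode could translate.
payload = b"line one\r\nline two\r\n"

with tempfile.NamedTemporaryFile(mode="wb", delete=False) as tmp:
    tmp.write(payload)
    path = tmp.name

# Binary mode returns the file's bytes untouched.
with open(path, "rb") as f:
    binary_hash = hashlib.md5(f.read()).hexdigest()
os.unlink(path)

# The hash of what we read equals the hash of what we wrote.
assert binary_hash == hashlib.md5(payload).hexdigest()
```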
Additional Considerations
For convenience, an improved version of the function is presented below:
import hashlib
import os

def generate_file_md5(rootdir, filename, blocksize=2**20):
    m = hashlib.md5()
    with open(os.path.join(rootdir, filename), "rb") as f:
        # Pass blocksize to read(); calling f.read() with no argument
        # would load the whole file at once and defeat the chunking.
        buf = f.read(blocksize)
        while buf:
            m.update(buf)
            buf = f.read(blocksize)
    return m.hexdigest()
Cross-checking the computed hashes against an external tool such as jacksum is recommended to verify correctness.
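Even without an external tool, you can sanity-check the chunked loop against a one-shot digest of the same in-memory bytes; the two must agree for any chunk size. A minimal sketch:

```python
import hashlib
import io
import os

data = os.urandom(5 * 2**20)  # 5 MiB of random bytes

# Hash the data chunk by chunk, as the progressive approach does.
chunked = hashlib.md5()
f = io.BytesIO(data)
while block := f.read(2**20):
    chunked.update(block)

# The chunked digest must equal the one-shot digest of the same bytes.
assert chunked.hexdigest() == hashlib.md5(data).hexdigest()
```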