"If a worker wants to do his job well, he must first sharpen his tools." - Confucius, "The Analects of Confucius. Lu Linggong"
Front page > Programming > Process a large csv file with parallel processing #eg39

Process a large csv file with parallel processing #eg39

Published on 2024-11-02
Browse:787

A csv file stores a large amount orders data.

Process a large csv file with parallel processing #eg39

Use Java to process this file: Find orders whose amounts are between 3,000 and 5,000, group them by customers, and sum order amounts and count orders.

Process a large csv file with parallel processing #eg39
Write the following SPL statement:

=file("d:/OrdersBig.csv").cursor@mtc(;8).select(Amount>=3000 && Amount

cursor() function parses a large file that cannot fit into the memory; by default, it performs the serial computation. @m option enables multithreaded data retrieval; 8 is the number of parallel threads; @t option enables importing the first line as column titles; and @c option enables using comma as the separator.

Read How to Call a SPL Script in Java to find how to integrate SPL into a Java application.

This is one of the problems on StackOverflow. You can click on it to see that the conventional solution is quite complicated, but the SPL approach is really simple and efficient.

SPL open source address

Release Statement This article is reproduced at: https://dev.to/esproc_spl/process-a-large-csv-file-with-parallel-processing-eg38-40mo?1 If there is any infringement, please contact [email protected] to delete it
Latest tutorial More>

Disclaimer: All resources provided are partly from the Internet. If there is any infringement of your copyright or other rights and interests, please explain the detailed reasons and provide proof of copyright or rights and interests and then send it to the email: [email protected] We will handle it for you as soon as possible.

Copyright© 2022 湘ICP备2022001581号-3