Efficient CSV Read and Write in Go
The task of reading and writing a CSV file efficiently in Go involves optimizing the I/O operations. Consider the following code snippet that reads a CSV file, performs calculations on the data, and writes the results to a new CSV file:
package main
import (
"encoding/csv"
"fmt"
"log"
"os"
"strconv"
)
func ReadRow(r *csv.Reader) (map[string]string, error) {
record, err := r.Read()
if err == io.EOF {
return nil, io.EOF
}
if err != nil {
return nil, err
}
m := make(map[string]string)
for i, v := range record {
m[strconv.Itoa(i)] = v
}
return m, nil
}
func main() {
// load data csv
csvFile, err := os.Open("./path/to/datafile.csv")
if err != nil {
log.Fatal(err)
}
defer csvFile.Close()
// create channel to process rows concurrently
recCh := make(chan map[string]string, 10)
go func() {
defer close(recCh)
r := csv.NewReader(csvFile)
if _, err := r.Read(); err != nil { //read header
log.Fatal(err)
}
for {
rec, err := ReadRow(r)
if err == io.EOF {
return // no more rows to read
}
if err != nil {
log.Fatal(err)
}
recCh <- rec
}
}()
// write results to a new csv
outfile, err := os.Create("./where/to/write/resultsfile.csv"))
if err != nil {
log.Fatal("Unable to open output")
}
defer outfile.Close()
writer := csv.NewWriter(outfile)
for record := range recCh {
time := record["0"]
value := record["1"]
// get float values
floatValue, err := strconv.ParseFloat(value, 64)
if err != nil {
log.Fatal("Record: %v, Error: %v", floatValue, err)
}
// calculate scores; THIS EXTERNAL METHOD CANNOT BE CHANGED
score := calculateStuff(floatValue)
valueString := strconv.FormatFloat(floatValue, 'f', 8, 64)
scoreString := strconv.FormatFloat(prob, 'f', 8, 64)
//fmt.Printf("Result: %v\n", []string{time, valueString, scoreString})
writer.Write([]string{time, valueString, scoreString})
}
writer.Flush()
}
The key improvement in this code is the use of concurrency to process CSV rows one at a time. By using a channel, we can read rows from the input CSV file in a goroutine and write the results to the output CSV file in the main routine concurrently. This approach avoids loading the entire file into memory, which can significantly reduce the memory consumption and improve the performance.
Disclaimer: All resources provided are partly from the Internet. If there is any infringement of your copyright or other rights and interests, please explain the detailed reasons and provide proof of copyright or rights and interests and then send it to the email: [email protected] We will handle it for you as soon as possible.
Copyright© 2022 湘ICP备2022001581号-3