Benchmarks comparison table:

Real test memory consumption comparison table:

The fast method (fetch data using slices) is not the fastest in benchmarks, but it’s fastest and the least gluttonous in real tests.

Analyzing the results, we can make two important conclusions, in my opinion :

Happy coding ?

","image":"http://www.luping.net/uploads/20241014/1728883446670caaf612a5c.jpg","datePublished":"2024-11-04T09:55:50+08:00","dateModified":"2024-11-04T09:55:50+08:00","author":{"@type":"Person","name":"luping.net","url":"https://www.luping.net/articlelist/0_1.html"}}
”工欲善其事,必先利其器。“—孔子《论语.录灵公》
首页 > 编程 > Go 中获取 SQL 数据的三种方式

Go 中获取 SQL 数据的三种方式

发布于2024-11-04
浏览:419

Three ways to fetch SQL data in Go

Introduction

In this article, we explore three different approaches to fetching data from SQL databases in Golang: the standard method using JSON marshaling, a dynamic approach using maps, and an optimized method that avoids unnecessary overhead.

In my practice, there were very frequent cases where I had to retrieve data from a database with an unknown structure beforehand. This often happens in the e-commerce industry.

We will consider 3 methods of retrieving data, along with their advantages and disadvantages.

You can get all the code of the examples from the repository
https://github.com/oleg578/dbf

Data preparing

We will use the database MariaDB, because I just like this database. It has never failed me in 10 years of practice with a load of up to 500 million transactions per day with a data volume of up to 4 terabytes in e-commerce. And it’s really faster than MySQL.

Code for create MariaDB docker instance and seed test data are in https://github.com/oleg578/dbf/tree/main/db

All examples are tested on Ubuntu 24.04.1 LTS (Noble Numbat) with 11th Gen Intel Core i5–1135G7 and 16GiB RAM.

Standard way

The standard way is trivial — we fetch rows from database into array and then Marshal it into JSON struct

rs, errRs := con.QueryContext(ctx, q, numbRows)

...

 for rs.Next() {
  var dmy = Dummy{}
  if err := rs.Scan(
   &dmy.ID,
   &dmy.Product,
   &dmy.Description,
   &dmy.Price,
   &dmy.Qty,
   &dmy.Date); err != nil {
   panic(err)
  }
  result = append(result, dmy)
 }
 if err := rs.Err(); err != nil {
  panic(err)
 }
 msg, errRTJ := json.Marshal(result)
 if errRTJ != nil {
  panic(errRTJ)
 }

...

_, errOut := os.Stdout.Write(msg)
...

What about speed?
Test result:

% ./db2json_struct 10000000 1>/dev/null
Elapsed time: 12631 ms, HeapAlloc = 5400.725 MB, Sys = 7099.447 MB
10000000 records read

Let’s just remember this as a starting point.

Using map way

Next we consider fetching unknown list of columns — like “SELECT * FROM …”.
The sequence of actions is simple.
Each record will be represent as map[string]interface{}, then

// create result slice
// numbRows is number of rows in result
outRows := make([]map[string]interface{}, 0, numbRows) 

We will not serialize each record to save program execution time, and our actions are not complex.
See https://github.com/oleg578/dbf/tree/mapping/example

After fetch rows from database, we will request an slice of columns

columns, err := rs.Columns()

Create slice of values and slice of pointers for data

values := make([]interface{}, len(columns))
valuePointers := make([]interface{}, len(values))
  for i := range values {
    valuePointers[i] = &values[i]
  }

Then for each row we get the map which represent model — https://github.com/oleg578/dbf/blob/mapping/sql2json.go

func Row2Map(columns []string, values []interface{}) (map[string]interface{}, error) {
 rowMap := make(map[string]interface{})
 if len(columns) != len(values) {
  return nil, errors.New("columns and values length not equal")
 }
 for i, col := range columns {
  rowMap[col] = assignCellValue(values[i]) // we will help to typify the value
 }
 return rowMap, nil
}

func assignCellValue(val interface{}) interface{} {
 if b, ok := val.([]byte); ok {
  if floatValue, err := strconv.ParseFloat(string(b), 64); err == nil {
   return floatValue
  }
  return string(b)
 }
 return val
}

Note:
You may want to pay attention to the function assignCellValue — its purpose is to pre-assign a type to column values. Simple trick — this function tells the JSON encoder which values ​​to accept as non-numeric.
Benchmark:
cpu: 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz
BenchmarkRow2Map
BenchmarkRow2Map-8 7376358 159.0 ns/op 336 B/op 2 allocs/op

Finally, execution time of our example — https://github.com/oleg578/dbf/blob/mapping/example/main.go
% ./db2json_map 10000000 1>/dev/null
Elapsed time: 12152 ms, HeapAlloc = 8966.899 MB, Sys = 12248.287 MB
10000000 records read

What can we say about these results?

Benefits — We can obtain structures that are not predefined.

Disadvantages — there are several.
But we pay for this with memory consumption.
It’s easy to see that such an implementation consumes 1.5 times more memory.

The second disadvantage which can give us a headache are data types. We must define what we will access as numeric data, and what we will define as string. In current we use AssignValue auxiliary function:

func assignCellValue(val interface{}) interface{} {
 if b, ok := val.([]byte); ok {
  if floatValue, err := strconv.ParseFloat(string(b), 64); err == nil {
   return floatValue
  }
  return string(b)
 }
 return val
}

If value can be represented as float — then we define it as interface value (json library will define it as numeric or null), else as string.

The third disadvantage is map structure property — we can’t guarantee order of fields in json. But this disadvantage may be unimportant.

We cannot say that the result we got is satisfactory.He devours memory. On small datasets this may be acceptable, but this can negatively affect processing with large amounts of data.
https://github.com/oleg578/dbf/tree/mapping

What can we improve?

Let’s look at the weak points in our algorithm of actions when we use maps.

This a creating map[string]interface{} for each row — this very expensive operation in terms of resources and processor time;
And another too expensive operation — JSON marshaling the final slice which can be very big.

It’s time to think about improvement

Let’s play with data structures

When we get data from a table, the order of columns is always defined by the database. Then we can use 2 coordinated slices — columns and values. Then we can refuse a map type and use slice.

The next trick — we will request data as byte slices and list of columns will fetch as list of ColumnTypes — it will help us in future.

columns, err := rs.ColumnTypes()
 if err != nil {
  log.Fatalf("fault get column types: %v", err)
 }
 values := make([]sql.RawBytes, len(columns))
 valuePointers := make([]interface{}, len(values))
 for i := range values {
  valuePointers[i] = &values[i]
 }

So, we are ready to fetch data in a new way and it’s time to serialize this data.
The JSON library is heavy, but we can make serialization easier.

Columns are simple token-level data usually — we can turn them into JSON strings, then we will just escape special symbols by our escape function -

func escape(in []byte) []byte {
 var out bytes.Buffer
 for _, b := range in {
  switch b {
  case '\n', '\r', '\t', '\b', '\f', '\\', '"':
   out.WriteByte('\\')
   out.WriteByte(b)
  case '/':
   out.WriteByte('\\')
   out.WriteByte(b)
  default:
   if b 


Now let’s think about data types. We can determine the type of column (using sql ColumnType structure sql.ColumnType)

func isDigit(c *sql.ColumnType) bool {
 switch c.DatabaseTypeName() {
 case "TINYINT":
  return true
 case "SMALLINT":
  return true
 case "MEDIUMINT":
  return true
 case "BIGINT":
  return true
 case "INT":
  return true
 case "INT1":
  return true
 case "INT2":
  return true
 case "INT3":
  return true
 case "INT4":
  return true
 case "INT8":
  return true
 case "BOOL":
  return true
 case "BOOLEAN":
  return true
 case "DECIMAL":
  return true
 case "DEC":
  return true
 case "NUMERIC":
  return true
 case "FIXED":
  return true
 case "NUMBER":
  return true
 case "FLOAT":
  return true
 case "DOUBLE":
  return true
 default:
  return false
 }
}

And finally, let’s apply primitive serialization:

func Row2Json(columns []*sql.ColumnType, values []sql.RawBytes) (string, error) {
 if len(values) == 0 {
  return "", errors.New("no data in values")
 }
 if len(columns) != len(values) {
  return "", errors.New("columns and values length not equal")
 }
 var buff strings.Builder
 buff.WriteByte('{')
 for i, val := range values {
  buff.WriteByte('"')
  buff.WriteString(columns[i].Name())
  buff.WriteByte('"')
  buff.WriteByte(':')
  if len(val) > 0 {
   if !isDigit(columns[i]) {
    buff.WriteByte('"')
   }
   buff.Write(escape(val))
   if !isDigit(columns[i]) {
    buff.WriteByte('"')
   }
  } else {
   buff.WriteString("null")
  }
  if i != len(values)-1 {
   buff.WriteByte(',')
  }
 }
 buff.WriteByte('}')

 return buff.String(), nil
}

Benchmark:
cpu: 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz
BenchmarkRow2Json
BenchmarkRow2Json-8 2881545 385.3 ns/op 440 B/op 9 allocs/op

Use UNIX way to output

We will use UNIX way of output of our program — i.e. we will not create an output slice — we will out data into standard output stream instead — then we can use the output as a standard pipe in UNIX:

// create new buffer
 writer := bufio.NewWriter(os.Stdout)
 defer writer.Flush()

 writer.WriteByte('[') //start print array of data
 ...
 msg, errMsg := dbf.Row2Json(columns, values)
 ...
 if _, err := writer.WriteString(msg); err != nil {
   log.Fatalf("fault write row: %v", err)
  } // write serialized row
 ...
 writer.WriteByte(']') // finish serialized slice

In success execution we will get something like:

%  ./db2json_ds 3 2>/dev/null | jq
[
  {
    "id": 1,
    "product": "product_1",
    "description": null,
    "price": 1.23,
    "qty": 10,
    "date": "2021-01-01 00:00:00"
  },
  {
    "id": 2,
    "product": "product_2",
    "description": null,
    "price": 2.23,
    "qty": 20,
    "date": "2021-01-01 00:00:00"
  },
  {
    "id": 3,
    "product": "product_3",
    "description": null,
    "price": 3.23,
    "qty": 30,
    "date": "2021-01-01 00:00:00"
  }
]

It’s a time of the moment of truth — fetch 10 million records:

% ./db2json_ds 10000000 1>/dev/null
Elapsed time: 11894 ms, HeapAlloc = 2.436 MB, Sys = 11.710 MB
10000000 records read
Let’s compare with the starting point:

  • Execution time — 11.647 seconds instead 12.631 seconds;
  • Memory consumption — 11.710 MB instead 7099.447 MB. So, up to 10 percent faster and 600 times less memory consumption.

Conclusion

Let’s examine the broader scope of the tests.
Test comparison table (the length of result json file is 1.2Gb)

Benchmarks comparison table:

Real test memory consumption comparison table:

The fast method (fetch data using slices) is not the fastest in benchmarks, but it’s fastest and the least gluttonous in real tests.

Analyzing the results, we can make two important conclusions, in my opinion :

  • simple solutions always work more effectively;
  • real tests are always more important than synthetic ones.

Happy coding ?

版本声明 本文转载于:https://dev.to/oleg578/three-ways-to-fetch-sql-data-in-go-44pe?1如有侵犯,请联系[email protected]删除
最新教程 更多>
  • 为什么在 Java 中使用相同的种子时会得到相同的随机数?
    为什么在 Java 中使用相同的种子时会得到相同的随机数?
    具有固定种子的Java随机数:为什么输出相同?在您的代码中,您定义了一种使用以下命令生成随机数的方法指定的种子。但是,您会注意到,当您提供相同的种子时,所有 100 个生成的数字都是相同的。此行为是预期的,因为在 Random 构造函数中使用相同的种子会产生可预测的数字序列。种子是初始化随机数生成器...
    编程 发布于2024-11-09
  • jQuery Chaining 如何简化开发并提高代码效率?
    jQuery Chaining 如何简化开发并提高代码效率?
    理解 jQuery 中的对象和方法链接在 jQuery 中,链接允许在单个语句中串联多个 jQuery 方法。这使开发人员能够简化代码并轻松执行复杂的操作。链接的基本原理涉及每个 jQuery 方法的返回值。当调用 jQuery 方法时,它通常返回一个表示所选元素的 jQuery 对象。这允许在同一...
    编程 发布于2024-11-09
  • 除了“if”语句之外:还有哪些地方可以在不进行强制转换的情况下使用具有显式“bool”转换的类型?
    除了“if”语句之外:还有哪些地方可以在不进行强制转换的情况下使用具有显式“bool”转换的类型?
    无需强制转换即可上下文转换为 bool您的类定义了对 bool 的显式转换,使您能够在条件语句中直接使用其实例“t”。然而,这种显式转换提出了一个问题:“t”在哪里可以在不进行强制转换的情况下用作 bool?上下文转换场景C 标准指定了四种值可以根据上下文转换为 bool 的主要场景:语句:if、w...
    编程 发布于2024-11-09
  • Bootstrap 4 Beta 中的列偏移发生了什么?
    Bootstrap 4 Beta 中的列偏移发生了什么?
    Bootstrap 4 Beta:列偏移的删除和恢复Bootstrap 4 在其 Beta 1 版本中引入了重大更改柱子偏移了。然而,随着 Beta 2 的后续发布,这些变化已经逆转。从 offset-md-* 到 ml-auto在 Bootstrap 4 Beta 1 中, offset-md-*...
    编程 发布于2024-11-09
  • Hono.js 基准测试:Node.js、Deno 和 Bun — 哪个最快?
    Hono.js 基准测试:Node.js、Deno 和 Bun — 哪个最快?
    Deno 2.0 刚刚发布,并声称比 Bun 和 Node.js 更快,同样,Bun 也声称更快。这引起了我的兴趣,所以我决定测试它们的性能,看看它们在现实场景中的比较。 为了公平比较,我需要选择一个与所有三种 JavaScript 运行时环境(Node.js、Deno 和 Bun)兼容的框架。这就...
    编程 发布于2024-11-09
  • 大批
    大批
    方法是可以在对象上调用的 fns 数组是对象,因此它们在 JS 中也有方法。 slice(begin):将数组的一部分提取到新数组中,而不改变原始数组。 let arr = ['a','b','c','d','e']; // Usecase: Extract till index p...
    编程 发布于2024-11-09
  • 何时应该使用 Tkinter 的 Entry Get 函数来有效检索输入?
    何时应该使用 Tkinter 的 Entry Get 函数来有效检索输入?
    Tkinter Entry 的 Get 函数:深入探讨其功能和用法在 Tkinter 中,Entry 小部件通常用于收集用户输入以进一步收集用户输入然而,与 Entry 关联的 get() 函数通常无法产生所需的结果,这可能会让开发人员感到困惑。本文深入探讨 get() 的概念,全面了解其执行和应用...
    编程 发布于2024-11-09
  • 在 Go 中使用 WebSocket 进行实时通信
    在 Go 中使用 WebSocket 进行实时通信
    构建需要实时更新的应用程序(例如聊天应用程序、实时通知或协作工具)需要一种比传统 HTTP 更快、更具交互性的通信方法。这就是 WebSockets 发挥作用的地方!今天,我们将探讨如何在 Go 中使用 WebSocket,以便您可以向应用程序添加实时功能。 在这篇文章中,我们将介绍: WebSoc...
    编程 发布于2024-11-09
  • 如何防止 Pandas 在保存 CSV 时添加索引列?
    如何防止 Pandas 在保存 CSV 时添加索引列?
    避免使用 Pandas 保存的 CSV 中的索引列使用 Pandas 进行修改后保存 csv 文件时,默认行为是包含索引列。为了避免这种情况,可以在使用 to_csv() 方法时将索引参数设置为 False。为了详细说明,请考虑以下命令序列:pd.read_csv('C:/Path/to/file....
    编程 发布于2024-11-09
  • 如何克服 PHP 中日期表示的 2038 限制?
    如何克服 PHP 中日期表示的 2038 限制?
    PHP 中的日期表示:克服 2038 年限制虽然 PHP 的原生日期函数在 2038 年有一个截止日期,但还有其他方法处理超出此限制的日期。其中一种方法是仅存储年、月和日,而忽略小时、分钟、秒和毫秒部分。通过丢弃这些附加时间部分,可以显着扩展可表示日期的范围。这是因为这些组件中的每一个都占用了 PH...
    编程 发布于2024-11-09
  • 如何在 Go (Gorilla) 中向特定客户端发送有针对性的 Websocket 更新?
    如何在 Go (Gorilla) 中向特定客户端发送有针对性的 Websocket 更新?
    在 Go (Gorilla) 中向特定客户端发送 Websocket 更新尽管是 Go 新手,但您寻求有关实现 Websocket 通信的指导您的预输入项目。您已尝试利用 Gorilla 的 GitHub 存储库中的示例,但在理解如何识别特定客户端并针对 websocket 更新进行定位方面遇到了挑...
    编程 发布于2024-11-09
  • 使用swoole作为基于ESP6的脚本可编程控制器的云端物联网网关框架
    使用swoole作为基于ESP6的脚本可编程控制器的云端物联网网关框架
    脚本可编程控制器的本地功能已经基本完成,开始实现远程相关功能。 远程系统整体架构如下: 使用ESP8266的SDK实现tcp服务器和tcp客户端。 在tcp服务器的基础上编写http协议解析代码,设计简单的http服务器,处理与浏览器的数据交互,包括内置网页的下载,并使用ajax技术获取状态并保存数...
    编程 发布于2024-11-09
  • 如何修复 macOS 上 Django 中的“配置不正确:加载 MySQLdb 模块时出错”?
    如何修复 macOS 上 Django 中的“配置不正确:加载 MySQLdb 模块时出错”?
    MySQL配置不正确:相对路径的问题在Django中运行python manage.py runserver时,可能会遇到以下错误:ImproperlyConfigured: Error loading MySQLdb module: dlopen(/Library/Python/2.7/site-...
    编程 发布于2024-11-09
  • 为什么在 Java 的 Random 类中设置种子会返回相同的数字?
    为什么在 Java 的 Random 类中设置种子会返回相同的数字?
    Java随机数生成:为什么设置种子返回相同的数字?尽管将Random类的种子设置为特定值,但随机数生成器始终返回相同的数字。让我们探讨一下可能导致此问题的原因。了解 Random 类和种子初始化Java Random 类旨在生成伪随机数。默认情况下,它使用其内部时钟作为种子值,使其生成相对可预测的数...
    编程 发布于2024-11-09
  • 如何克服使用反射设置结构体字段值时 SetCan() 总是返回 False 的问题?
    如何克服使用反射设置结构体字段值时 SetCan() 总是返回 False 的问题?
    使用结构体的 SetString 探索反射反射提供了动态操作 Go 结构的强大工具。在此示例中,我们在尝试使用反射设置结构体字段的值时遇到一个常见问题:CanSet() 始终返回 false。这种障碍阻止了字段修改,使我们陷入困境。识别陷阱提供的代码片段突出显示了两个基本错误:传递值而不是指针: 按...
    编程 发布于2024-11-09

免责声明: 提供的所有资源部分来自互联网,如果有侵犯您的版权或其他权益,请说明详细缘由并提供版权或权益证明然后发到邮箱:[email protected] 我们会第一时间内为您处理。

Copyright© 2022 湘ICP备2022001581号-3