Published 2024-11-04

Three ways to fetch SQL data in Go

Introduction

In this article, we explore three different approaches to fetching data from SQL databases in Golang: the standard method using JSON marshaling, a dynamic approach using maps, and an optimized method that avoids unnecessary overhead.

In my practice, I have very often had to retrieve data from a database whose structure is not known beforehand. This happens frequently in the e-commerce industry.

We will consider three methods of retrieving data, along with their advantages and disadvantages.

You can get all the code of the examples from the repository
https://github.com/oleg578/dbf

Data preparation

We will use MariaDB, simply because I like this database. It has never failed me in 10 years of practice, with loads of up to 500 million transactions per day and data volumes of up to 4 terabytes in e-commerce. And it is really faster than MySQL.

The code for creating the MariaDB Docker instance and seeding the test data is at https://github.com/oleg578/dbf/tree/main/db

All examples are tested on Ubuntu 24.04.1 LTS (Noble Numbat) with 11th Gen Intel Core i5–1135G7 and 16GiB RAM.

Standard way

The standard way is trivial: we fetch rows from the database into a slice of structs and then marshal the slice to JSON.

rs, errRs := con.QueryContext(ctx, q, numbRows)

...

 for rs.Next() {
  var dmy = Dummy{}
  if err := rs.Scan(
   &dmy.ID,
   &dmy.Product,
   &dmy.Description,
   &dmy.Price,
   &dmy.Qty,
   &dmy.Date); err != nil {
   panic(err)
  }
  result = append(result, dmy)
 }
 if err := rs.Err(); err != nil {
  panic(err)
 }
 msg, errRTJ := json.Marshal(result)
 if errRTJ != nil {
  panic(errRTJ)
 }

...

_, errOut := os.Stdout.Write(msg)
...

What about speed?
Test result:

% ./db2json_struct 10000000 1>/dev/null
Elapsed time: 12631 ms, HeapAlloc = 5400.725 MB, Sys = 7099.447 MB
10000000 records read

Let’s just remember this as a starting point.

Using maps

Next we consider fetching an unknown list of columns, as in “SELECT * FROM …”.
The sequence of actions is simple.
Each record will be represented as a map[string]interface{}:

// create result slice
// numbRows is number of rows in result
outRows := make([]map[string]interface{}, 0, numbRows) 

To save execution time we will not serialize each record separately, and the steps are not complex.
See https://github.com/oleg578/dbf/tree/mapping/example

After fetching the rows from the database, we request the slice of column names

columns, err := rs.Columns()

Create a slice of values and a slice of pointers to the data

values := make([]interface{}, len(columns))
valuePointers := make([]interface{}, len(values))
for i := range values {
 valuePointers[i] = &values[i]
}

Then for each row we build the map that represents the record (https://github.com/oleg578/dbf/blob/mapping/sql2json.go):

func Row2Map(columns []string, values []interface{}) (map[string]interface{}, error) {
 rowMap := make(map[string]interface{})
 if len(columns) != len(values) {
  return nil, errors.New("columns and values length not equal")
 }
 for i, col := range columns {
  rowMap[col] = assignCellValue(values[i]) // help the JSON encoder typify the value
 }
 return rowMap, nil
}

func assignCellValue(val interface{}) interface{} {
 if b, ok := val.([]byte); ok {
  if floatValue, err := strconv.ParseFloat(string(b), 64); err == nil {
   return floatValue
  }
  return string(b)
 }
 return val
}

Note:
Pay attention to the function assignCellValue; its purpose is to pre-assign a type to column values. This simple trick tells the JSON encoder which values to encode as numbers and which as strings.
Benchmark:
cpu: 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz
BenchmarkRow2Map
BenchmarkRow2Map-8 7376358 159.0 ns/op 336 B/op 2 allocs/op

Finally, the execution time of our example (https://github.com/oleg578/dbf/blob/mapping/example/main.go):
% ./db2json_map 10000000 1>/dev/null
Elapsed time: 12152 ms, HeapAlloc = 8966.899 MB, Sys = 12248.287 MB
10000000 records read

What can we say about these results?

Benefits: we can handle structures that are not predefined.

Disadvantages: there are several.
The first is memory consumption.
It is easy to see that this implementation consumes about 1.5 times more memory.

The second disadvantage, which can give us a headache, is data types. We must decide which values to expose as numeric and which as strings. Here the assignCellValue helper shown above does this work: if a value can be parsed as a float, we return it as a float (the json package will then encode it as a number or null), otherwise as a string.

The third disadvantage is a property of the map type: we cannot guarantee the order of fields in the JSON. But this disadvantage may be unimportant.

We cannot say that the result we got is satisfactory. It devours memory. On small datasets this may be acceptable, but it can hurt the processing of large amounts of data.
https://github.com/oleg578/dbf/tree/mapping

What can we improve?

Let’s look at the weak points of our algorithm when we use maps:

  • creating a map[string]interface{} for each row is a very expensive operation in terms of memory and processor time;
  • JSON-marshaling the final slice, which can be very big, is also expensive.

It’s time to think about improvements.

Let’s play with data structures

When we get data from a table, the order of columns is always defined by the database, so we can use two coordinated slices: columns and values. This lets us drop the map type and use slices.

The next trick: we request the data as byte slices (sql.RawBytes) and fetch the list of columns as sql.ColumnType values; this will help us later. Note that a sql.RawBytes value is only valid until the next call to rs.Next(), so each row must be serialized before advancing.

columns, err := rs.ColumnTypes()
 if err != nil {
  log.Fatalf("fault get column types: %v", err)
 }
 values := make([]sql.RawBytes, len(columns))
 valuePointers := make([]interface{}, len(values))
 for i := range values {
  valuePointers[i] = &values[i]
 }

So, we are ready to fetch data in the new way, and it’s time to serialize it.
The encoding/json library is heavy, but we can make serialization simpler.

Column values are usually simple token-level data: we can build the JSON strings ourselves, escaping special characters with our own escape function:

func escape(in []byte) []byte {
 var out bytes.Buffer
 for _, b := range in {
  switch b {
  case '\n':
   out.WriteString(`\n`)
  case '\r':
   out.WriteString(`\r`)
  case '\t':
   out.WriteString(`\t`)
  case '\b':
   out.WriteString(`\b`)
  case '\f':
   out.WriteString(`\f`)
  case '\\', '"', '/':
   out.WriteByte('\\')
   out.WriteByte(b)
  default:
   if b < 0x20 { // other control characters
    fmt.Fprintf(&out, `\u%04x`, b)
   } else {
    out.WriteByte(b)
   }
  }
 }
 return out.Bytes()
}


Now let’s think about data types. We can determine the type of a column using the sql.ColumnType structure:

func isDigit(c *sql.ColumnType) bool {
 switch c.DatabaseTypeName() {
 case "TINYINT", "SMALLINT", "MEDIUMINT", "INT", "BIGINT",
  "INT1", "INT2", "INT3", "INT4", "INT8",
  "BOOL", "BOOLEAN",
  "DECIMAL", "DEC", "NUMERIC", "FIXED", "NUMBER",
  "FLOAT", "DOUBLE":
  return true
 default:
  return false
 }
}

And finally, let’s apply primitive serialization:

func Row2Json(columns []*sql.ColumnType, values []sql.RawBytes) (string, error) {
 if len(values) == 0 {
  return "", errors.New("no data in values")
 }
 if len(columns) != len(values) {
  return "", errors.New("columns and values length not equal")
 }
 var buff strings.Builder
 buff.WriteByte('{')
 for i, val := range values {
  buff.WriteByte('"')
  buff.WriteString(columns[i].Name())
  buff.WriteByte('"')
  buff.WriteByte(':')
  if len(val) > 0 {
   if !isDigit(columns[i]) {
    buff.WriteByte('"')
   }
   buff.Write(escape(val))
   if !isDigit(columns[i]) {
    buff.WriteByte('"')
   }
  } else {
   buff.WriteString("null")
  }
  if i != len(values)-1 {
   buff.WriteByte(',')
  }
 }
 buff.WriteByte('}')

 return buff.String(), nil
}

Benchmark:
cpu: 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz
BenchmarkRow2Json
BenchmarkRow2Json-8 2881545 385.3 ns/op 440 B/op 9 allocs/op
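Since isDigit depends only on the column, not on the row, the numeric check can be computed once per column instead of twice per cell. Below is a hypothetical variant (rowToJSON and its precomputed-flags signature are my own names, not the repository's API); note that real code must still run escape over string values:

```go
package main

import (
	"fmt"
	"strings"
)

// rowToJSON is a hypothetical variant of Row2Json that takes precomputed
// column names and numeric flags, so the type check runs once per column.
func rowToJSON(names []string, numeric []bool, values [][]byte) string {
	var buff strings.Builder
	buff.WriteByte('{')
	for i, val := range values {
		if i > 0 {
			buff.WriteByte(',')
		}
		buff.WriteByte('"')
		buff.WriteString(names[i])
		buff.WriteString(`":`)
		switch {
		case len(val) == 0:
			buff.WriteString("null") // empty RawBytes means SQL NULL
		case numeric[i]:
			buff.Write(val) // numbers go in unquoted
		default:
			buff.WriteByte('"')
			buff.Write(val) // real code must escape(val) here
			buff.WriteByte('"')
		}
	}
	buff.WriteByte('}')
	return buff.String()
}

func main() {
	names := []string{"id", "product", "price"}
	numeric := []bool{true, false, true} // precomputed once from isDigit
	values := [][]byte{[]byte("1"), []byte("product_1"), []byte("1.23")}
	fmt.Println(rowToJSON(names, numeric, values))
	// → {"id":1,"product":"product_1","price":1.23}
}
```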

Use the UNIX way for output

We will output in the UNIX way: instead of building an output slice, we write the data to the standard output stream, so the program can be used in a normal UNIX pipe:

// create new buffer
 writer := bufio.NewWriter(os.Stdout)
 defer writer.Flush()

 writer.WriteByte('[') //start print array of data
 ...
 msg, errMsg := dbf.Row2Json(columns, values)
 ...
 if _, err := writer.WriteString(msg); err != nil {
   log.Fatalf("fault write row: %v", err)
  } // write serialized row
 ...
 writer.WriteByte(']') // finish serialized slice

On successful execution we get something like:

%  ./db2json_ds 3 2>/dev/null | jq
[
  {
    "id": 1,
    "product": "product_1",
    "description": null,
    "price": 1.23,
    "qty": 10,
    "date": "2021-01-01 00:00:00"
  },
  {
    "id": 2,
    "product": "product_2",
    "description": null,
    "price": 2.23,
    "qty": 20,
    "date": "2021-01-01 00:00:00"
  },
  {
    "id": 3,
    "product": "product_3",
    "description": null,
    "price": 3.23,
    "qty": 30,
    "date": "2021-01-01 00:00:00"
  }
]

Now the moment of truth: fetch 10 million records.

% ./db2json_ds 10000000 1>/dev/null
Elapsed time: 11894 ms, HeapAlloc = 2.436 MB, Sys = 11.710 MB
10000000 records read
Let’s compare with the starting point:

  • Execution time: 11.894 seconds instead of 12.631 seconds;
  • Memory consumption (Sys): 11.710 MB instead of 7099.447 MB.

So, up to 10 percent faster with about 600 times less memory consumption.

Conclusion

Let’s examine the broader scope of the tests.

Real test comparison table (the resulting JSON file is about 1.2 GB):

Method                       Elapsed time   HeapAlloc     Sys
struct + json.Marshal        12631 ms       5400.725 MB   7099.447 MB
map + json.Marshal           12152 ms       8966.899 MB   12248.287 MB
slices + custom serializer   11894 ms       2.436 MB      11.710 MB

Benchmarks comparison table:

Function    ns/op   B/op   allocs/op
Row2Map     159.0   336    2
Row2Json    385.3   440    9

The third method (fetching data as slices) is not the fastest in the micro-benchmarks, but it is the fastest and the least memory-hungry in the real tests.

Analyzing the results, we can draw two important conclusions, in my opinion:

  • simple solutions always work more effectively;
  • real tests are always more important than synthetic ones.

Happy coding!
