Three ways to fetch SQL data in Go

Published on November 4, 2024

Introduction

In this article, we explore three different approaches to fetching data from SQL databases in Golang: the standard method using JSON marshaling, a dynamic approach using maps, and an optimized method that avoids unnecessary overhead.

In my practice, I have very often had to retrieve data from a database whose structure was not known beforehand. This happens often in the e-commerce industry.

We will consider 3 methods of retrieving data, along with their advantages and disadvantages.

All the example code is available in the repository:
https://github.com/oleg578/dbf

Data preparation

We will use MariaDB, simply because I like this database. In 10 years of practice it has never failed me, under loads of up to 500 million transactions per day and data volumes of up to 4 terabytes in e-commerce. In my experience it is also faster than MySQL.

The code to create the MariaDB Docker instance and seed the test data is in https://github.com/oleg578/dbf/tree/main/db

All examples are tested on Ubuntu 24.04.1 LTS (Noble Numbat) with an 11th Gen Intel Core i5-1135G7 and 16 GiB RAM.

Standard way

The standard way is trivial: we scan the rows from the database into a slice of structs and then marshal that slice to JSON.

rs, errRs := con.QueryContext(ctx, q, numbRows)

...

 for rs.Next() {
  var dmy = Dummy{}
  if err := rs.Scan(
   &dmy.ID,
   &dmy.Product,
   &dmy.Description,
   &dmy.Price,
   &dmy.Qty,
   &dmy.Date); err != nil {
   panic(err)
  }
  result = append(result, dmy)
 }
 if err := rs.Err(); err != nil {
  panic(err)
 }
 msg, errRTJ := json.Marshal(result)
 if errRTJ != nil {
  panic(errRTJ)
 }

...

_, errOut := os.Stdout.Write(msg)
...
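
The fragment above does not show the Dummy type. As a minimal sketch, assuming field types and JSON tags chosen to match the sample output shown later in the article (they are assumptions, not the repository's definition), the struct and the marshaling step might look like this:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Dummy mirrors one row of the test table. The field types and JSON tags
// are assumptions made for this sketch.
type Dummy struct {
	ID          int64   `json:"id"`
	Product     string  `json:"product"`
	Description *string `json:"description"` // a nil pointer is encoded as null
	Price       float64 `json:"price"`
	Qty         int64   `json:"qty"`
	Date        string  `json:"date"`
}

// toJSON marshals the whole result slice in a single call.
func toJSON(rows []Dummy) (string, error) {
	msg, err := json.Marshal(rows)
	return string(msg), err
}

func main() {
	rows := []Dummy{
		{ID: 1, Product: "product_1", Price: 1.23, Qty: 10, Date: "2021-01-01 00:00:00"},
	}
	out, err := toJSON(rows)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

Because the whole slice has to exist in memory before json.Marshal runs, memory use grows with the number of rows.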

What about speed?
Test result:

% ./db2json_struct 10000000 1>/dev/null
Elapsed time: 12631 ms, HeapAlloc = 5400.725 MB, Sys = 7099.447 MB
10000000 records read

Let’s just remember this as a starting point.
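
The article does not show how the "Elapsed time / HeapAlloc / Sys" line is produced. A plausible sketch, assuming the figures come from runtime.MemStats and are divided by 1024*1024 (the helper name report and the unit conversion are assumptions), could be:

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// report formats elapsed time and memory statistics in the same shape as
// the test output above. It is a hypothetical helper for this sketch.
func report(elapsed time.Duration) string {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	const mb = 1024 * 1024
	return fmt.Sprintf("Elapsed time: %d ms, HeapAlloc = %.3f MB, Sys = %.3f MB",
		elapsed.Milliseconds(),
		float64(m.HeapAlloc)/mb, // bytes of allocated heap objects
		float64(m.Sys)/mb)       // total bytes obtained from the OS
}

func main() {
	start := time.Now()
	data := make([][]byte, 0, 1000)
	for i := 0; i < 1000; i++ {
		data = append(data, make([]byte, 1024)) // allocate something to measure
	}
	fmt.Println(report(time.Since(start)))
	fmt.Println(len(data), "buffers allocated")
}
```

HeapAlloc shows what is live on the heap at the moment of the call, while Sys shows how much memory the process has obtained from the operating system, which is why Sys is the larger figure in every run.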

Using maps

Next we consider fetching an unknown list of columns, as in “SELECT * FROM …”.
The sequence of actions is simple.
Each record will be represented as a map[string]interface{}, so the result is a slice of such maps:

// create result slice
// numbRows is number of rows in result
outRows := make([]map[string]interface{}, 0, numbRows)

To save execution time we do not serialize each record separately, and the remaining steps are not complex.
See https://github.com/oleg578/dbf/tree/mapping/example

After fetching the rows from the database, we request the slice of column names:

columns, err := rs.Columns()

Then we create a slice of values and a slice of pointers to those values, which will be passed to Scan:

values := make([]interface{}, len(columns))
valuePointers := make([]interface{}, len(values))
  for i := range values {
    valuePointers[i] = &values[i]
  }
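
To see why the second slice of pointers is needed, here is a small sketch that runs without a database. fakeScan is a hypothetical stand-in for (*sql.Rows).Scan: like Scan, it can only store results through pointers, so every element of the pointer slice has to point at the matching element of the values slice.

```go
package main

import (
	"errors"
	"fmt"
)

// newScanTargets returns a values slice and a slice of pointers into it.
func newScanTargets(n int) (values, pointers []interface{}) {
	values = make([]interface{}, n)
	pointers = make([]interface{}, n)
	for i := range values {
		pointers[i] = &values[i]
	}
	return values, pointers
}

// fakeScan is a hypothetical stand-in for (*sql.Rows).Scan: it stores
// row[i] through the pointer in dest[i].
func fakeScan(row []interface{}, dest ...interface{}) error {
	if len(row) != len(dest) {
		return errors.New("row and dest length not equal")
	}
	for i, d := range dest {
		p, ok := d.(*interface{})
		if !ok {
			return fmt.Errorf("dest[%d] is not *interface{}", i)
		}
		*p = row[i]
	}
	return nil
}

func main() {
	values, pointers := newScanTargets(2)
	row := []interface{}{int64(1), []byte("product_1")}
	if err := fakeScan(row, pointers...); err != nil {
		panic(err)
	}
	// values was filled through the pointers.
	fmt.Printf("%v %s\n", values[0], values[1])
}
```

Both slices are allocated once and reused for every row, so this step adds no per-row allocations of its own.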

Then for each row we build the map that represents the record (https://github.com/oleg578/dbf/blob/mapping/sql2json.go):

func Row2Map(columns []string, values []interface{}) (map[string]interface{}, error) {
 rowMap := make(map[string]interface{})
 if len(columns) != len(values) {
  return nil, errors.New("columns and values length not equal")
 }
 for i, col := range columns {
  rowMap[col] = assignCellValue(values[i]) // we will help to typify the value
 }
 return rowMap, nil
}

func assignCellValue(val interface{}) interface{} {
 if b, ok := val.([]byte); ok {
  if floatValue, err := strconv.ParseFloat(string(b), 64); err == nil {
   return floatValue
  }
  return string(b)
 }
 return val
}

Note:
Pay attention to the function assignCellValue: its purpose is to pre-assign a type to column values. It is a simple trick: a []byte value that parses as a float becomes a float64, which the JSON encoder writes as a number, and any other []byte value becomes a string.
Benchmark:
cpu: 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz
BenchmarkRow2Map
BenchmarkRow2Map-8 7376358 159.0 ns/op 336 B/op 2 allocs/op
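
The benchmark code itself is not shown in the article. One way to reproduce a comparable measurement outside of "go test" is testing.Benchmark from the standard library. The sketch below copies Row2Map and assignCellValue from above; the three-column input row is an assumption, so the figures it prints will differ from those quoted here.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"testing"
)

// Row2Map and assignCellValue are copied from the listing above.
func Row2Map(columns []string, values []interface{}) (map[string]interface{}, error) {
	rowMap := make(map[string]interface{})
	if len(columns) != len(values) {
		return nil, errors.New("columns and values length not equal")
	}
	for i, col := range columns {
		rowMap[col] = assignCellValue(values[i])
	}
	return rowMap, nil
}

func assignCellValue(val interface{}) interface{} {
	if b, ok := val.([]byte); ok {
		if floatValue, err := strconv.ParseFloat(string(b), 64); err == nil {
			return floatValue
		}
		return string(b)
	}
	return val
}

func main() {
	testing.Init() // registers the testing flags when not running under "go test"

	// Hypothetical input row; the repository's benchmark may use other data.
	columns := []string{"id", "product", "price"}
	values := []interface{}{[]byte("1"), []byte("product_1"), []byte("1.23")}

	res := testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			if _, err := Row2Map(columns, values); err != nil {
				b.Fatal(err)
			}
		}
	})
	fmt.Printf("%d ns/op, %d B/op, %d allocs/op\n",
		res.NsPerOp(), res.AllocedBytesPerOp(), res.AllocsPerOp())
}
```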

Finally, the execution time of our example (https://github.com/oleg578/dbf/blob/mapping/example/main.go):
% ./db2json_map 10000000 1>/dev/null
Elapsed time: 12152 ms, HeapAlloc = 8966.899 MB, Sys = 12248.287 MB
10000000 records read

What can we say about these results?

Benefits: we can fetch structures that are not predefined.

Disadvantages: there are several.
First, we pay for this flexibility with memory consumption.
Comparing the two runs, this implementation consumes about 1.7 times as much memory (HeapAlloc 8966.899 MB versus 5400.725 MB, Sys 12248.287 MB versus 7099.447 MB).

The second disadvantage, which can give us a headache, is data types. We must decide which values are treated as numeric and which as strings. Currently we use the auxiliary function assignCellValue:

func assignCellValue(val interface{}) interface{} {
 if b, ok := val.([]byte); ok {
  if floatValue, err := strconv.ParseFloat(string(b), 64); err == nil {
   return floatValue
  }
  return string(b)
 }
 return val
}

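If the value is a []byte that can be parsed as a float, we store it as a float64, which the json library encodes as a number; otherwise we store it as a string. Any other value, including nil for SQL NULL, passes through unchanged, and the json library encodes nil as null.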
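
The following sketch shows this rule in action, including a case that illustrates the headache. The "zip" key is a hypothetical text column added for illustration: because its content parses as a float, it loses its leading zeros and becomes a number.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// assignCellValue is copied from the listing above.
func assignCellValue(val interface{}) interface{} {
	if b, ok := val.([]byte); ok {
		if floatValue, err := strconv.ParseFloat(string(b), 64); err == nil {
			return floatValue
		}
		return string(b)
	}
	return val
}

func main() {
	row := map[string]interface{}{
		"price":   assignCellValue([]byte("1.23")),      // parses as float: JSON number
		"product": assignCellValue([]byte("product_1")), // does not parse: JSON string
		"zip":     assignCellValue([]byte("00123")),     // hypothetical text column, becomes 123
		"note":    assignCellValue(nil),                 // NULL stays nil: JSON null
	}
	out, err := json.Marshal(row)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
	// prints {"note":null,"price":1.23,"product":"product_1","zip":123}
}
```

Guessing the type from the content is therefore only safe when text columns never hold values that look numeric.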

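The third disadvantage comes from the map type itself: a map does not remember the order in which the columns were returned. The json library writes map keys in sorted order, so the fields in the JSON do not follow the column order of the query. This disadvantage may be unimportant.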
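
A short sketch of the ordering effect: the keys are inserted in the order id, product, date, but encoding/json sorts map keys when it marshals, so the output is alphabetical. The helper name marshalRow is hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// marshalRow encodes one map-based row to JSON.
func marshalRow(row map[string]interface{}) (string, error) {
	out, err := json.Marshal(row)
	return string(out), err
}

func main() {
	// Keys are added in "column order": id, product, date.
	row := map[string]interface{}{}
	row["id"] = 1
	row["product"] = "product_1"
	row["date"] = "2021-01-01 00:00:00"

	out, err := marshalRow(row)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
	// prints {"date":"2021-01-01 00:00:00","id":1,"product":"product_1"}
}
```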

We cannot call the result satisfactory: it devours memory. On small datasets this may be acceptable, but it can hurt processing of large amounts of data.
https://github.com/oleg578/dbf/tree/mapping

What can we improve?

Let’s look at the weak points of our algorithm when we use maps:

creating a map[string]interface{} for each row is a very expensive operation in terms of memory and processor time;
JSON marshaling of the final slice, which can be very big, is another expensive operation.

It’s time to think about improvements.

Let’s play with data structures

When we get data from a table, the order of columns is always defined by the database. So we can use two coordinated slices, columns and values, and drop the map type in favor of slices.

The next trick: we request the data as raw byte slices (sql.RawBytes) and fetch the list of columns as a list of ColumnTypes, which will help us later.

columns, err := rs.ColumnTypes()
 if err != nil {
  log.Fatalf("fault get column types: %v", err)
 }
 values := make([]sql.RawBytes, len(columns))
 valuePointers := make([]interface{}, len(values))
 for i := range values {
  valuePointers[i] = &values[i]
 }
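
One property of sql.RawBytes matters for everything that follows: according to the database/sql documentation, a RawBytes value refers to memory owned by the database driver and is only valid until the next call to Next, Scan or Close. Each row therefore has to be serialized (or copied) before the loop advances. The sketch below simulates this without a database; cloneRaw and demo are hypothetical helpers.

```go
package main

import (
	"database/sql"
	"fmt"
)

// cloneRaw copies a RawBytes value so that the copy stays valid after the
// next call to Next, Scan or Close. A nil (NULL) value stays nil.
func cloneRaw(rb sql.RawBytes) []byte {
	if rb == nil {
		return nil
	}
	out := make([]byte, len(rb))
	copy(out, rb)
	return out
}

// demo simulates a driver that reuses its buffer between rows.
func demo() (aliased, copied string) {
	buf := []byte("row_1")   // memory owned by the "driver"
	raw := sql.RawBytes(buf) // what Scan would hand to us
	kept := cloneRaw(raw)    // our own copy of the first row
	copy(buf, "row_2")       // the "driver" overwrites its buffer for the next row
	return string(raw), string(kept)
}

func main() {
	aliased, copied := demo()
	fmt.Println("raw view after next row:", aliased) // row_2
	fmt.Println("copy made in time:      ", copied)  // row_1
}
```

The approach in this article avoids the copy altogether by turning each row into JSON before the next call to Next.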

So, we are ready to fetch data in a new way, and it’s time to serialize this data.
The JSON library is heavy, but we can make serialization lighter.

Column values are usually simple token-level data, so we can turn them into JSON strings by just escaping special symbols with our own escape function:

func escape(in []byte) []byte {
 var out bytes.Buffer
 for _, b := range in {
  switch b {
  case '\n', '\r', '\t', '\b', '\f', '\\', '"':
   out.WriteByte('\\')
   out.WriteByte(b)
  case '/':
   out.WriteByte('\\')
   out.WriteByte(b)
  default:
   if b 
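
The listing above breaks off inside the default branch; the full function is in the repository. As a self-contained sketch of the same idea (named escapeJSON here to make clear it is not the author's code), the version below follows the JSON string rules: quote and backslash get a backslash prefix, the common control characters use their letter forms (a backslash followed by the raw control byte is not a valid JSON escape), and the remaining control characters are assumed to be written as \u00XX. Escaping '/' is optional in JSON and is left out.

```go
package main

import (
	"bytes"
	"fmt"
)

const hexDigits = "0123456789abcdef"

// escapeJSON escapes raw bytes so they can be placed between double quotes
// in a JSON document. Bytes above 0x7f are passed through unchanged, which
// assumes the database returns valid UTF-8.
func escapeJSON(in []byte) []byte {
	var out bytes.Buffer
	out.Grow(len(in))
	for _, b := range in {
		switch b {
		case '"', '\\':
			out.WriteByte('\\')
			out.WriteByte(b)
		case '\n':
			out.WriteString(`\n`)
		case '\r':
			out.WriteString(`\r`)
		case '\t':
			out.WriteString(`\t`)
		case '\b':
			out.WriteString(`\b`)
		case '\f':
			out.WriteString(`\f`)
		default:
			if b < 0x20 {
				// remaining control characters use the \u00XX form
				out.WriteString(`\u00`)
				out.WriteByte(hexDigits[b>>4])
				out.WriteByte(hexDigits[b&0x0f])
			} else {
				out.WriteByte(b)
			}
		}
	}
	return out.Bytes()
}

func main() {
	fmt.Printf("\"%s\"\n", escapeJSON([]byte("line1\nline2 \"quoted\"")))
}
```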


Now let’s think about data types. We can determine the type of a column using the sql.ColumnType structure:

func isDigit(c *sql.ColumnType) bool {
 switch c.DatabaseTypeName() {
 case "TINYINT", "SMALLINT", "MEDIUMINT", "BIGINT", "INT",
  "INT1", "INT2", "INT3", "INT4", "INT8",
  "BOOL", "BOOLEAN",
  "DECIMAL", "DEC", "NUMERIC", "FIXED", "NUMBER",
  "FLOAT", "DOUBLE":
  return true
 default:
  return false
 }
}
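
A *sql.ColumnType cannot be constructed without a live query, which makes isDigit awkward to try in isolation. A possible variation, sketched below, works on the type name string and keeps the names in a set. The function name isNumericTypeName is hypothetical, and the handling of an "UNSIGNED " prefix is an assumption about how some driver versions report unsigned columns, so check what your driver actually returns.

```go
package main

import (
	"fmt"
	"strings"
)

// numericTypes holds the database type names treated as numeric by isDigit.
var numericTypes = map[string]struct{}{
	"TINYINT": {}, "SMALLINT": {}, "MEDIUMINT": {}, "BIGINT": {}, "INT": {},
	"INT1": {}, "INT2": {}, "INT3": {}, "INT4": {}, "INT8": {},
	"BOOL": {}, "BOOLEAN": {},
	"DECIMAL": {}, "DEC": {}, "NUMERIC": {}, "FIXED": {}, "NUMBER": {},
	"FLOAT": {}, "DOUBLE": {},
}

// isNumericTypeName reports whether a database type name denotes a numeric
// column. Stripping an "UNSIGNED " prefix is an assumption, not documented
// behavior of any particular driver.
func isNumericTypeName(name string) bool {
	name = strings.TrimPrefix(strings.ToUpper(name), "UNSIGNED ")
	_, ok := numericTypes[name]
	return ok
}

func main() {
	for _, name := range []string{"INT", "DECIMAL", "VARCHAR", "DATETIME"} {
		fmt.Printf("%-8s numeric=%v\n", name, isNumericTypeName(name))
	}
}
```

With a real result set the call would be isNumericTypeName(c.DatabaseTypeName()) for each column c, ideally evaluated once per column rather than once per row.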



And finally, let’s apply primitive serialization:

// Row2Json serializes one row into a JSON object, keeping the column order.
// Numeric columns are written as is; all other values are escaped and quoted.
func Row2Json(columns []*sql.ColumnType, values []sql.RawBytes) (string, error) {
 if len(values) == 0 {
  return "", errors.New("no data in values")
 }
 if len(columns) != len(values) {
  return "", errors.New("columns and values length not equal")
 }
 var buff strings.Builder
 buff.WriteByte('{')
 for i, val := range values {
  // column name as the JSON key
  buff.WriteByte('"')
  buff.WriteString(columns[i].Name())
  buff.WriteByte('"')
  buff.WriteByte(':')
  if len(val) > 0 {
   quoted := !isDigit(columns[i]) // only non-numeric values need quotes
   if quoted {
    buff.WriteByte('"')
   }
   buff.Write(escape(val))
   if quoted {
    buff.WriteByte('"')
   }
  } else {
   // NULL and empty values are both written as null
   buff.WriteString("null")
  }
  if i != len(values)-1 {
   buff.WriteByte(',')
  }
 }
 buff.WriteByte('}')

 return buff.String(), nil
}
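
To try the idea without a database, here is a self-contained variant. The column type is a hypothetical stand-in for *sql.ColumnType, and writeQuoted is a simplified escaper written for this sketch. Unlike Row2Json, this variant writes null only for a nil value, so an empty string stays an empty string; whether a given driver reports NULL as nil and an empty string as an empty slice is an assumption worth verifying.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// column is a hypothetical stand-in for *sql.ColumnType.
type column struct {
	name    string
	numeric bool
}

// writeQuoted writes s as a JSON string literal.
func writeQuoted(sb *strings.Builder, s []byte) {
	sb.WriteByte('"')
	for _, b := range s {
		switch {
		case b == '"' || b == '\\':
			sb.WriteByte('\\')
			sb.WriteByte(b)
		case b < 0x20:
			fmt.Fprintf(sb, `\u%04x`, b) // control characters as \u00XX
		default:
			sb.WriteByte(b)
		}
	}
	sb.WriteByte('"')
}

// rowToJSON serializes one row into a JSON object, keeping the column order.
func rowToJSON(cols []column, values [][]byte) (string, error) {
	if len(cols) != len(values) {
		return "", errors.New("columns and values length not equal")
	}
	var sb strings.Builder
	sb.WriteByte('{')
	for i, val := range values {
		if i > 0 {
			sb.WriteByte(',')
		}
		writeQuoted(&sb, []byte(cols[i].name))
		sb.WriteByte(':')
		switch {
		case val == nil:
			sb.WriteString("null") // SQL NULL
		case cols[i].numeric && len(val) > 0:
			sb.Write(val) // numbers are copied without quotes
		default:
			writeQuoted(&sb, val)
		}
	}
	sb.WriteByte('}')
	return sb.String(), nil
}

func main() {
	cols := []column{{"id", true}, {"product", false}, {"description", false}}
	out, err := rowToJSON(cols, [][]byte{[]byte("1"), []byte("product_1"), nil})
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // {"id":1,"product":"product_1","description":null}
}
```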



Benchmark:

cpu: 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz

BenchmarkRow2Json

BenchmarkRow2Json-8     2881545        385.3 ns/op      440 B/op        9 allocs/op

  
  
Use the UNIX way to output

We will output data the UNIX way: instead of building an output slice, we write each serialized row straight to the standard output stream, so the program’s output can be piped into other UNIX tools:

// create new buffer
 writer := bufio.NewWriter(os.Stdout)
 defer writer.Flush()

 writer.WriteByte('[') //start print array of data
 ...
 msg, errMsg := dbf.Row2Json(columns, values)
 ...
 if _, err := writer.WriteString(msg); err != nil {
   log.Fatalf("fault write row: %v", err)
  } // write serialized row
 ...
 writer.WriteByte(']') // finish serialized slice
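
The fragment above elides the loop and the separators between rows. A minimal sketch of the full streaming pattern, assuming the rows arrive as already serialized strings (in the real program they come from the rows iterator; writeJSONArray and render are hypothetical names), might look like this:

```go
package main

import (
	"bufio"
	"io"
	"os"
	"strings"
)

// writeJSONArray streams already serialized rows to w as one JSON array.
// Only the buffer of the bufio.Writer is held in memory, not the result.
func writeJSONArray(w io.Writer, rows []string) error {
	bw := bufio.NewWriter(w)
	if err := bw.WriteByte('['); err != nil {
		return err
	}
	for i, row := range rows {
		if i > 0 {
			if err := bw.WriteByte(','); err != nil {
				return err
			}
		}
		if _, err := bw.WriteString(row); err != nil {
			return err
		}
	}
	if err := bw.WriteByte(']'); err != nil {
		return err
	}
	return bw.Flush() // report a failed final write to the caller
}

// render is a small helper for trying the function without a terminal.
func render(rows []string) (string, error) {
	var sb strings.Builder
	err := writeJSONArray(&sb, rows)
	return sb.String(), err
}

func main() {
	rows := []string{`{"id":1}`, `{"id":2}`}
	if err := writeJSONArray(os.Stdout, rows); err != nil {
		panic(err)
	}
}
```

The sketch calls Flush explicitly and returns its error. With a deferred Flush, any exit through log.Fatalf skips the deferred call, because log.Fatalf ends the program with os.Exit.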



On successful execution we get something like:

%  ./db2json_ds 3 2>/dev/null | jq
[
  {
    "id": 1,
    "product": "product_1",
    "description": null,
    "price": 1.23,
    "qty": 10,
    "date": "2021-01-01 00:00:00"
  },
  {
    "id": 2,
    "product": "product_2",
    "description": null,
    "price": 2.23,
    "qty": 20,
    "date": "2021-01-01 00:00:00"
  },
  {
    "id": 3,
    "product": "product_3",
    "description": null,
    "price": 3.23,
    "qty": 30,
    "date": "2021-01-01 00:00:00"
  }
]



Now for the moment of truth: fetching 10 million records:

% ./db2json_ds 10000000 1>/dev/null

Elapsed time: 11894 ms, HeapAlloc = 2.436 MB, Sys = 11.710 MB

10000000 records read

Let’s compare with the starting point:

Execution time: 11.894 seconds instead of 12.631 seconds;
Memory consumption (Sys): 11.710 MB instead of 7099.447 MB. So, about 6 percent faster and roughly 600 times less memory consumption.
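
The comparison can be recomputed from the figures printed by the two runs above (12631 ms and 7099.447 MB for the struct version, 11894 ms and 11.710 MB for the slice version). The helper names are hypothetical.

```go
package main

import "fmt"

// speedupPercent returns by how many percent cur is faster than base.
func speedupPercent(base, cur float64) float64 {
	return (base - cur) / base * 100
}

// ratio returns how many times smaller cur is than base.
func ratio(base, cur float64) float64 {
	return base / cur
}

func main() {
	fmt.Printf("%.1f%% faster, %.0fx less memory\n",
		speedupPercent(12631, 11894), // elapsed time in ms
		ratio(7099.447, 11.710))      // Sys in MB
	// prints 5.8% faster, 606x less memory
}
```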

Conclusion

Let’s examine the broader scope of the tests.

[Table: test comparison (the resulting JSON file is 1.2 GB)]

[Table: benchmark comparison]

[Table: memory consumption comparison in real tests]

The fast method (fetching data using slices) is not the fastest in benchmarks (385.3 ns/op for Row2Json versus 159.0 ns/op for Row2Map), but it is the fastest and by far the least memory-hungry in real tests.

Analyzing the results, we can draw two important conclusions, in my opinion:

simple solutions always work more effectively;
real tests are always more important than synthetic ones.

Happy coding!

Source: this article is republished from https://dev.to/oleg578/three-ways-to-fetch-sql-data-in-go-44pe?1
