I tried out Granite 3.0

Published on November 8, 2024

Granite 3.0

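Granite 3.0 is an open-source family of lightweight generative language models designed for a range of enterprise-level tasks. It natively supports multilingual use, coding, reasoning, and tool use, which makes it well suited to enterprise environments.

I ran the model and tested it to see what kinds of tasks it can handle.

Environment setup

I set up the Granite 3.0 environment in Google Colab and installed the required libraries with the following commands.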

!pip install torch torchvision torchaudio
!pip install accelerate
!pip install -U transformers

Running the models

I tested both the 2B and 8B models of Granite 3.0.

2B model

I ran the 2B model first. Here is the code sample for the 2B model:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "auto"
model_path = "ibm-granite/granite-3.0-2b-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, device_map=device)
model.eval()

chat = [
    { "role": "user", "content": "Please list one IBM Research laboratory located in the United States. You should only output its name and location." },
]
chat = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
input_tokens = tokenizer(chat, return_tensors="pt").to("cuda")
output = model.generate(**input_tokens, max_new_tokens=100)
output = tokenizer.batch_decode(output)
print(output[0])

Output

userPlease list one IBM Research laboratory located in the United States. You should only output its name and location.
assistant1. IBM Research - Austin, Texas

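8B model

The 8B model can be used by replacing 2b with 8b in the model path. The following 8B sample decodes only the newly generated tokens, so the printed output omits the role markers and the user input: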

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "auto"
model_path = "ibm-granite/granite-3.0-8b-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, device_map=device)
model.eval()

chat = [
    { "role": "user", "content": "Please list one IBM Research laboratory located in the United States. You should only output its name and location." },
]
chat = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)

input_tokens = tokenizer(chat, add_special_tokens=False, return_tensors="pt").to("cuda")
output = model.generate(**input_tokens, max_new_tokens=100)
generated_text = tokenizer.decode(output[0][input_tokens["input_ids"].shape[1]:], skip_special_tokens=True)
print(generated_text)

Output

1. IBM Almaden Research Center - San Jose, California
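The difference between the two outputs comes from the slice applied before decoding. As a minimal sketch, plain Python lists stand in for the token-id tensors here, and the id values are made up for illustration. For a causal language model, generate() returns the prompt tokens followed by the new tokens, so dropping the first prompt-length ids leaves only the reply.

```python
# Plain lists stand in for token-id tensors; the id values are invented.
prompt_ids = [101, 7, 8, 9]               # tokens of the templated prompt
output_ids = prompt_ids + [42, 43, 44]    # prompt followed by newly generated tokens

# Corresponds to input_tokens["input_ids"].shape[1] in the 8B sample.
prompt_length = len(prompt_ids)

# Keep only what the model added after the prompt.
new_token_ids = output_ids[prompt_length:]
print(new_token_ids)
```

Decoding only this slice with skip_special_tokens=True is what removes both the echoed user message and the role markers from the printed text.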

Function calling

I explored the function calling feature and tested it with a dummy function. Here, get_current_weather is defined to return mock weather data.

Dummy function

import json

def get_current_weather(location: str) -> dict:
    """
    Retrieves current weather information for the specified location.
    Args:
        location (str): Name of the city to retrieve weather data for.
    Returns:
        dict: Dictionary containing weather information (temperature, description, humidity).
    """
    print(f"Getting current weather for {location}")

    try:
        weather_description = "sample"
        temperature = "20.0"
        humidity = "80.0"

        return {
            "description": weather_description,
            "temperature": temperature,
            "humidity": humidity
        }
    except Exception as e:
        print(f"Error fetching weather data: {e}")
        return {"weather": "NA"}

Creating the prompt

I created a prompt for calling the function:

functions = [
    {
        "name": "get_current_weather",
        "description": "Get the current weather",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The city and country code, e.g. San Francisco, US",
                }
            },
            "required": ["location"],
        },
    },
]
query = "What's the weather like in Boston?"
payload = {
    "functions_str": [json.dumps(x) for x in functions]
}
chat = [
    {"role":"system","content": f"You are a helpful assistant with access to the following function calls. Your task is to produce a sequence of function calls necessary to generate response to the user utterance. Use the following function calls as required.{payload}"},
    {"role": "user", "content": query }
]
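It can help to see what the model actually receives in the system message. The sketch below uses a shortened function definition and a shortened system sentence (both are simplifications, not the full prompt above) to show that interpolating the payload dict into an f-string produces a Python dict representation whose list items are JSON strings.

```python
import json

# Shortened function schema, for illustration only.
functions = [
    {
        "name": "get_current_weather",
        "description": "Get the current weather",
        "parameters": {"type": "object", "properties": {}, "required": []},
    },
]

# Same construction as in the prompt code: each schema becomes a JSON string.
payload = {"functions_str": [json.dumps(x) for x in functions]}

# Interpolating a dict into an f-string uses its Python repr.
system_prompt = f"Use the following function calls as required.{payload}"
print(system_prompt)

# Each list item is still valid JSON and can be parsed back.
first_function = json.loads(payload["functions_str"][0])
print(first_function["name"])
```

The outer braces and single quotes come from the dict repr, while the inner double-quoted text is the JSON produced by json.dumps.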

Generating the response

I generated the response with the following code:

instruction_1 = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
input_tokens = tokenizer(instruction_1, return_tensors="pt").to("cuda")
output = model.generate(**input_tokens, max_new_tokens=1024)
generated_text = tokenizer.decode(output[0][input_tokens["input_ids"].shape[1]:], skip_special_tokens=True)
print(generated_text)

Output

{'name': 'get_current_weather', 'arguments': {'location': 'Boston'}}

This confirmed that the model can generate the correct function call for the city named in the query.
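The article stops at the generated call, so the following is only a sketch of one way the output could be turned into an actual function invocation. The AVAILABLE_FUNCTIONS registry and the dispatch_call helper are hypothetical names, and the weather function is a trimmed stand-in for the dummy function above. Because the output shown uses single quotes, it is a Python literal rather than strict JSON, so the sketch parses it with ast.literal_eval instead of json.loads.

```python
import ast

def get_current_weather(location: str) -> dict:
    # Trimmed stand-in for the dummy function defined earlier.
    return {"description": "sample", "temperature": "20.0", "humidity": "80.0"}

# Hypothetical registry mapping function names to Python callables.
AVAILABLE_FUNCTIONS = {"get_current_weather": get_current_weather}

def dispatch_call(generated_text: str) -> dict:
    """Parse a generated call and invoke the matching registered function."""
    # literal_eval only evaluates literals, so it does not execute code.
    call = ast.literal_eval(generated_text.strip())
    func = AVAILABLE_FUNCTIONS[call["name"]]
    return func(**call["arguments"])

generated_text = "{'name': 'get_current_weather', 'arguments': {'location': 'Boston'}}"
result = dispatch_call(generated_text)
print(result)
```

Looking the name up in a registry, rather than evaluating it, keeps the model from calling anything that was not explicitly exposed.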

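Format specification for an enhanced interaction flow

Granite 3.0 accepts format specifications that make it easier to respond in a structured format. This section uses [UTTERANCE] for spoken responses and [THINK] for inner thoughts.

Function calls, on the other hand, are output as plain text, so you may need to implement a separate mechanism to distinguish function calls from regular text responses.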
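One possible mechanism, sketched here under the assumption that a function call always arrives as a single dict literal with name and arguments keys (as in the output shown earlier), is to try parsing the text and fall back to treating it as a normal reply. The extract_function_call helper is a hypothetical name, not part of Granite or Transformers.

```python
import ast
from typing import Optional

def extract_function_call(generated_text: str) -> Optional[dict]:
    """Return the parsed call if the text looks like one, otherwise None."""
    try:
        parsed = ast.literal_eval(generated_text.strip())
    except (ValueError, SyntaxError, TypeError):
        # Ordinary prose or tagged text is not a valid Python literal.
        return None
    if isinstance(parsed, dict) and "name" in parsed and "arguments" in parsed:
        return parsed
    return None

call = extract_function_call(
    "{'name': 'get_current_weather', 'arguments': {'location': 'Boston'}}"
)
reply = extract_function_call("[UTTERANCE]Hello! How can I assist you today?")
print(call)
print(reply)
```

A None result means the text should be shown to the user as a regular response. Outputs that mix prose and a call in the same string would need a more tolerant parser than this.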

Specifying the output format

Here is a sample prompt for guiding the AI's output:

prompt = """You are a conversational AI assistant that deepens interactions by alternating between responses and inner thoughts.

* Record spoken responses after the [UTTERANCE] tag and inner thoughts after the [THINK] tag.
* Use [UTTERANCE] as a start marker to begin outputting an utterance.
* After [THINK], describe your internal reasoning or strategy for the next response. This may include insights on the user's reaction, adjustments to improve interaction, or further goals to deepen the conversation.
* Important: **Use [UTTERANCE] and [THINK] as a start signal without needing a closing tag.**


Follow these instructions, alternating between [UTTERANCE] and [THINK] formats for responses.

<example>
example1:
  [UTTERANCE]Hello! How can I assist you today?[THINK]I’ll start with a neutral tone to understand their needs. Preparing to offer specific suggestions based on their response.[UTTERANCE]Thank you! In that case, I have a few methods I can suggest![THINK]Since I now know what they’re looking for, I'll move on to specific suggestions, maintaining a friendly and approachable tone.
...
</example>

Please respond to the following user_input.

Hello! What can you do?

"""

Execution code

Code for generating the response:

chat = [
    { "role": "user", "content": prompt },
]
chat = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)

input_tokens = tokenizer(chat, return_tensors="pt").to("cuda")
output = model.generate(**input_tokens, max_new_tokens=1024)
generated_text = tokenizer.decode(output[0][input_tokens["input_ids"].shape[1]:], skip_special_tokens=True)
print(generated_text)

Example output

The output is as follows:

[UTTERANCE]Hello! I'm here to provide information, answer questions, and assist with various tasks. I can help with a wide range of topics, from general knowledge to specific queries. How can I assist you today?
[THINK]I've introduced my capabilities and offered assistance, setting the stage for the user to share their needs or ask questions.

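The [UTTERANCE] and [THINK] tags were used correctly, which gives a workable structured response format.

Depending on the prompt, closing tags such as [/UTTERANCE] or [/THINK] may appear in the output, but overall the output format can usually be specified successfully.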
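To consume this format downstream, the tagged text has to be split back into utterances and thoughts. The sketch below is one way to do that with the standard re module; parse_tagged_output is a hypothetical helper, and it assumes only the two tags used in this article. It also drops stray closing tags, since those can appear depending on the prompt.

```python
import re
from typing import List, Tuple

TAG_PATTERN = re.compile(r"\[(UTTERANCE|THINK)\]")
CLOSING_TAG_PATTERN = re.compile(r"\[/(?:UTTERANCE|THINK)\]")

def parse_tagged_output(text: str) -> List[Tuple[str, str]]:
    """Split generated text into (tag, body) pairs in order of appearance."""
    # Remove optional closing tags so both output styles parse the same way.
    text = CLOSING_TAG_PATTERN.sub("", text)
    # With one capturing group, split() returns [preamble, tag, body, tag, body, ...].
    parts = TAG_PATTERN.split(text)
    segments = []
    for tag, body in zip(parts[1::2], parts[2::2]):
        body = body.strip()
        if body:
            segments.append((tag, body))
    return segments

sample = (
    "[UTTERANCE]Hello! How can I assist you today?[/UTTERANCE]\n"
    "[THINK]Offer help and wait for details."
)
segments = parse_tagged_output(sample)
for tag, body in segments:
    print(tag, "->", body)
```

An application could then show only the UTTERANCE segments to the user and keep the THINK segments for logging.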

Streaming code example

Let's also look at how to output a streaming response.

The following code uses the asyncio and threading libraries to stream responses from Granite 3.0 asynchronously.

import asyncio
from threading import Thread
from typing import AsyncIterator
from transformers import (
    AutoTokenizer,
    AutoModelForCausalLM,
    TextIteratorStreamer,
)

device = "auto"
model_path = "ibm-granite/granite-3.0-8b-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, device_map=device)
model.eval()

async def generate(chat) -> AsyncIterator[str]:
    # Apply chat template and tokenize input
    chat = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
    input_tokens = tokenizer(chat, add_special_tokens=False, return_tensors="pt").to("cuda")

    # Set up the streamer
    streamer = TextIteratorStreamer(
        tokenizer,
        skip_prompt=True,
        skip_special_tokens=True,
    )
    generation_kwargs = dict(
        **input_tokens,
        streamer=streamer,
        max_new_tokens=1024,
    )
    # Generate response in a separate thread
    thread = Thread(target=model.generate, kwargs=generation_kwargs)
    thread.start()

    for output in streamer:
        if not output:
            continue
        await asyncio.sleep(0)
        yield output

# Execute asynchronous generation in the main function
async def main():
    chat = [
        { "role": "user", "content": "Please list one IBM Research laboratory located in the United States. You should only output its name and location." },
    ]
    generator = generate(chat)
    async for output in generator:  # Use async for to retrieve responses sequentially
        print(output, end="|")

await main()

Example output

Running the code above produces a streamed response in the following form:

1. |IBM |Almaden |Research |Center |- |San |Jose, |California|

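This example shows streaming working successfully. Each chunk of text is yielded as it is generated and displayed in order, so the user can follow the generation process in real time.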
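In the code above, the plain for loop over the streamer waits on the event loop's thread while the next chunk is produced. If other coroutines need to keep running in the meantime, one variation (a sketch, not the article's method) is to fetch each chunk in a worker thread. The iterate_in_thread helper and the fake_streamer generator are hypothetical; fake_streamer stands in for TextIteratorStreamer so the example runs without a model.

```python
import asyncio
from typing import AsyncIterator, Iterator, List

_SENTINEL = object()  # marks the end of the blocking iterator

async def iterate_in_thread(blocking_iter: Iterator[str]) -> AsyncIterator[str]:
    """Yield items from a blocking iterator without blocking the event loop."""
    loop = asyncio.get_running_loop()
    while True:
        # next() may block until a chunk is ready, so run it in the default
        # thread pool; the sentinel avoids raising StopIteration in a future.
        chunk = await loop.run_in_executor(None, next, blocking_iter, _SENTINEL)
        if chunk is _SENTINEL:
            break
        if chunk:  # skip empty chunks, as the original loop does
            yield chunk

def fake_streamer() -> Iterator[str]:
    # Stand-in for TextIteratorStreamer, with invented chunks.
    yield from ["1. ", "IBM ", "", "Almaden ", "Research ", "Center"]

async def collect() -> List[str]:
    return [chunk async for chunk in iterate_in_thread(fake_streamer())]

# In a notebook with a running event loop, use `await collect()` instead.
chunks = asyncio.run(collect())
print("|".join(chunks))
```

With a real streamer, the same helper could wrap iter(streamer) after the generation thread has been started.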

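Summary

Granite 3.0 gives reasonably strong responses even with the 8B model. The function calling and format specification features also work quite well, which suggests potential for a wide range of applications.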

Release statement: This article is reproduced from https://dev.to/m_sea_bass/i-tried-out-granite-30-53lm?1
