Exploring the Async Deepgram API: Speech-to-Text Using Python

Published on 2024-11-07

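Today we will explore the Deepgram API for converting speech to text (transcription). Whether you are building a voice assistant, transcribing meetings, or creating a voice-controlled application, Deepgram makes it easy to get started.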

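What is Deepgram?

Deepgram is a speech recognition platform that uses machine learning models to transcribe audio in real time. It provides an easy-to-use API that developers can integrate into their applications for tasks such as transcribing phone calls, converting meetings to text, or analyzing customer interactions.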

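Why use Deepgram?

Accuracy: Deepgram achieves high accuracy by using deep learning models trained on large datasets.
Real-time transcription: Results arrive as you speak, which suits live applications.
Multiple languages: It supports many languages and accents, making it suitable for global applications.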

Getting started with the Deepgram API

Installation - pip install httpx

Import the required libraries

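import httpx
import asyncio
import logging
import traceback

# Module-level logger used by transcribe_audio below
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)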

Define the async function

# recording_url: The URL of the audio file to be transcribed.
# callback_url: The URL to which Deepgram will send the transcription results (optional for the API, but required by this function's signature).
# api_key: Your Deepgram API key.

async def transcribe_audio(recording_url: str, callback_url: str, api_key: str):
    url = "https://api.deepgram.com/v1/listen"

    # Define headers
    headers = {
        "Authorization": f"Token {api_key}"
    }

    # Define query parameters
    query_params = {
        "callback_method": "post",
        "callback": callback_url
    }

    # Define body parameters
    body_params = {
        "url": recording_url
    }
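
To make the query and body parameters above more concrete, here is a minimal sketch that previews the request URL and JSON body using only the standard library. The helper name build_request_preview and the example.com URLs are placeholders invented for illustration; they are not part of httpx or Deepgram. httpx does its own encoding when it receives params= and json=, so the exact escaping it produces may differ slightly from this preview.

```python
import json
from urllib.parse import urlencode

DEEPGRAM_LISTEN_URL = "https://api.deepgram.com/v1/listen"


def build_request_preview(recording_url: str, callback_url: str) -> tuple[str, str]:
    """Return the request URL (with query string) and the JSON body as text."""
    # Same query and body parameters as in transcribe_audio
    query_params = {"callback_method": "post", "callback": callback_url}
    body_params = {"url": recording_url}
    full_url = f"{DEEPGRAM_LISTEN_URL}?{urlencode(query_params)}"
    return full_url, json.dumps(body_params)


full_url, body = build_request_preview(
    "https://example.com/audio.wav",      # placeholder recording URL
    "https://example.com/deepgram-hook",  # placeholder callback URL
)
print(full_url)
print(body)
```

The callback URL travels in the query string, while the audio location travels in the JSON body, which is why the function keeps the two dictionaries separate.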

Send the async request

    # Log the request without the headers, which contain the API key
    logger.info(f"Sending request to {url} with query: {query_params}, body: {body_params}")

    async with httpx.AsyncClient(timeout=60.0) as client:
        try:
            # Make a POST request with query parameters and body
            response = await client.post(url, headers=headers, params=query_params, json=body_params)
            response.raise_for_status()  # Raise an error for HTTP error responses
            result = response.json()
            logger.info(f"Response received: {result}")

            return result

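We create an httpx.AsyncClient instance with a 60-second timeout. Using async with ensures the client is closed properly once the block exits, even if an error is raised inside it.
If the request succeeds, we parse the JSON response, log it, and return the result.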
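
The cleanup behaviour of async with can be illustrated without touching the network. The sketch below uses DemoClient, a hypothetical stand-in written for this example (it is not httpx.AsyncClient and has none of its features), to show that the exit hook runs both when the block finishes normally and when the request raises.

```python
import asyncio


class DemoClient:
    """Hypothetical stand-in for an async HTTP client; it only tracks open/closed state."""

    def __init__(self) -> None:
        self.is_closed = False

    async def __aenter__(self) -> "DemoClient":
        return self

    async def __aexit__(self, exc_type, exc, tb) -> bool:
        # Called on normal exit and also when the block raises
        self.is_closed = True
        return False  # do not swallow the exception

    async def post(self, fail: bool) -> dict:
        await asyncio.sleep(0)  # yield to the event loop, as a real request would
        if fail:
            raise RuntimeError("simulated request failure")
        return {"status": "ok"}


async def run_once(fail: bool) -> tuple[bool, str]:
    client = DemoClient()
    try:
        async with client:
            await client.post(fail)
        outcome = "success"
    except RuntimeError:
        outcome = "error"
    return client.is_closed, outcome


ok_result = asyncio.run(run_once(fail=False))
error_result = asyncio.run(run_once(fail=True))
print(ok_result, error_result)
```

In both runs the client ends up closed, which is the guarantee the async with block provides around the real request.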