Build a demo with Gradio

Now that we've fine-tuned a Whisper model for Dhivehi speech recognition, let's build a Gradio demo to showcase it to the community!

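The first step is to load the fine-tuned checkpoint with the pipeline() function, which should be familiar from the section on pre-trained models. You can change model_id to the namespace of your own fine-tuned model on the Hugging Face Hub, or to one of the pre-trained Whisper models to perform zero-shot speech recognition: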

from transformers import pipeline

model_id = "sanchit-gandhi/whisper-small-dv"  # update with your model id
pipe = pipeline("automatic-speech-recognition", model=model_id)
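Because the same script can serve either a fine-tuned checkpoint or a pre-trained one, it can be convenient to pick the model outside the code. The following is a minimal sketch of one way to do that, not part of the course code: the environment variable name ASR_MODEL_ID and the helper resolve_model_id are hypothetical, and openai/whisper-small is used only as an example of a pre-trained fallback.

```python
import os

# Hypothetical variable name, chosen for this sketch only.
MODEL_ID_ENV_VAR = "ASR_MODEL_ID"
# A pre-trained Whisper checkpoint, used as a zero-shot fallback.
FALLBACK_MODEL_ID = "openai/whisper-small"


def resolve_model_id(environ=os.environ):
    """Return the checkpoint name from the environment, or the fallback."""
    value = environ.get(MODEL_ID_ENV_VAR, "").strip()
    return value or FALLBACK_MODEL_ID


model_id = resolve_model_id()
```

The resulting model_id can then be passed to pipeline() exactly as above.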

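Next, we define a function that takes the filepath of an audio input and passes it to the pipeline. The pipeline takes care of loading the audio file, resampling it to the correct sampling rate, and running inference with the model. We then simply return the transcribed text as the function's output. To ensure the model can handle audio inputs of arbitrary length, we enable chunking as described in the section on pre-trained models: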

def transcribe_speech(filepath):
    output = pipe(
        filepath,
        max_new_tokens=256,
        generate_kwargs={
            "task": "transcribe",
            # Whisper has no Dhivehi language token, so the closely related
            # Sinhalese is used; update with the language you've fine-tuned on
            "language": "sinhalese",
        },
        chunk_length_s=30,
        batch_size=8,
    )
    return output["text"]
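To build intuition for what chunk_length_s=30 does, the sketch below lists the windows a long recording would be cut into. It is a simplified illustration rather than the pipeline's actual implementation: the helper chunk_windows and the 5-second overlap are assumptions made for this example, and the real pipeline also merges the overlapping transcriptions into one text.

```python
def chunk_windows(total_s, chunk_length_s=30.0, stride_s=5.0):
    """Return (start, end) times in seconds of overlapping windows over the audio.

    Consecutive windows overlap by 2 * stride_s seconds (assumed value).
    """
    if chunk_length_s <= 2 * stride_s:
        raise ValueError("chunk_length_s must be larger than twice the stride")
    step = chunk_length_s - 2 * stride_s  # how far each new window advances
    windows = []
    start = 0.0
    while True:
        end = min(start + chunk_length_s, total_s)
        windows.append((start, end))
        if end >= total_s:  # the last window reaches the end of the audio
            break
        start += step
    return windows


# A 70-second recording is covered by three overlapping 30-second windows.
print(chunk_windows(70.0))
# [(0.0, 30.0), (20.0, 50.0), (40.0, 70.0)]
```

Each window fits within the 30 seconds Whisper processes at a time, and batch_size controls how many windows are decoded together.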

We'll use Gradio's Blocks feature to launch two tabs in our demo: one for microphone transcription and the other for file upload.

import gradio as gr

demo = gr.Blocks()

mic_transcribe = gr.Interface(
    fn=transcribe_speech,
    inputs=gr.Audio(sources="microphone", type="filepath"),
    outputs=gr.components.Textbox(),
)

file_transcribe = gr.Interface(
    fn=transcribe_speech,
    inputs=gr.Audio(sources="upload", type="filepath"),
    outputs=gr.components.Textbox(),
)
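When a user submits without recording or uploading anything, Gradio passes None to the function, and handing that to the pipeline would raise an error. The sketch below shows one possible guard. The names make_transcriber and fake_pipe are hypothetical and introduced only for illustration; fake_pipe stands in for the real pipeline so the example runs without downloading a model.

```python
def make_transcriber(asr_pipe, language="sinhalese"):
    """Wrap a pipeline-like callable in a function that tolerates empty input."""

    def transcribe(filepath):
        # Nothing was recorded or uploaded: return a hint instead of raising.
        if filepath is None:
            return "No audio received. Please record or upload a clip first."
        output = asr_pipe(
            filepath,
            generate_kwargs={"task": "transcribe", "language": language},
            chunk_length_s=30,
        )
        return output["text"]

    return transcribe


# Stand-in for the real pipeline (assumption for this sketch only).
def fake_pipe(filepath, **kwargs):
    return {"text": f"(transcript of {filepath})"}


safe_transcribe = make_transcriber(fake_pipe)
```

In the demo itself, make_transcriber(pipe) would produce a function that can be passed as fn to both interfaces.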

Finally, we launch the Gradio demo using the two interfaces we've just defined:

with demo:
    gr.TabbedInterface(
        [mic_transcribe, file_transcribe],
        ["Transcribe Microphone", "Transcribe Audio File"],
    )

demo.launch(debug=True)
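If the demo misbehaves, it can help to call the transcription function directly on a known file, outside Gradio, to check whether the problem is in the model or in the interface. The sketch below writes a short synthetic WAV file using only the standard library. The helper write_test_tone is hypothetical, and a pure tone will not yield a meaningful transcript; it only exercises the file-loading and inference path.

```python
import math
import struct
import wave


def write_test_tone(path, seconds=1.0, sample_rate=16000, freq_hz=440.0):
    """Write a mono 16-bit PCM sine tone to `path` and return the path."""
    n_samples = int(seconds * sample_rate)
    frames = bytearray()
    for i in range(n_samples):
        # Moderate amplitude (0.3 of full scale) to stay clear of clipping.
        sample = 0.3 * math.sin(2 * math.pi * freq_hz * i / sample_rate)
        frames += struct.pack("<h", int(sample * 32767))
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)  # mono
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))
    return path


# Example use, assuming transcribe_speech from above is defined:
# print(transcribe_speech(write_test_tone("tone.wav")))
```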

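This will launch a Gradio demo similar to the one running on the Hugging Face Space.

Should you wish to host your demo on the Hugging Face Hub, you can use this Space as a template for your fine-tuned model.

Click the link to duplicate the template demo to your account: https://huggingface.co/spaces/course-demos/whisper-small?duplicate=true

We recommend giving your Space a name similar to your fine-tuned model (e.g. whisper-small-dv-demo) and setting the visibility to "Public".

Once you've duplicated the Space to your account, click "Files and versions" -> "app.py" -> "edit". Then change the model identifier to your fine-tuned model (line 6). Scroll to the bottom of the page and click "Commit changes to main". The demo will restart, this time using your fine-tuned model. You can share this demo with your friends and family so that they can use the model you've trained!

Check out our video tutorial for a better understanding of how to duplicate the Space 👉️ YouTube video

We look forward to seeing your demos on the Hub!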
