MCP 課程文件

使用AMD NPU和iGPU加速的本地微型代理

Hugging Face's logo
加入 Hugging Face 社群

並獲得增強的文件體驗

開始使用

使用AMD NPU和iGPU加速的本地微型代理

在本節中,我們將向您展示如何使用AMD神經網路處理單元(NPU)和整合GPU(iGPU)加速我們的端到端微型代理應用程式。然後,我們將透過提供本地檔案訪問和建立用於本地處理敏感資訊的助手來增強我們的端到端應用程式,以確保最大程度的隱私。

為此,我們將使用Lemonade Server,這是一個利用NPU和iGPU加速在本地執行模型的工具。

設定

設定Lemonade伺服器

您可以在Windows和Linux上安裝Lemonade Server。更多文件請參閱lemonade-server.ai

Windows
Linux

要在Windows上安裝Lemonade Server,只需下載並執行此處的最新安裝程式。

Lemonade Server支援所有平臺上的CPU推理,以及Windows x86/x64上的所有引擎。GPU加速透過使用Vulkan的llamacpp引擎實現,主要針對AMD Ryzen™ AI 7000/8000/300系列和AMD Radeon™ 7000/9000系列。對於NPU加速,ONNX Runtime GenAI (OGA) 引擎支援AMD Ryzen™ AI 300系列裝置。

安裝Lemonade Server後,您可以單擊新增到桌面的Lemonade圖示來啟動它。

微型代理和NPX設定

本課程的這一部分假設您已安裝npx微型代理。如果尚未安裝,請參閱課程的微型代理部分。請務必使用huggingface_hub[mcp]==0.33.2

使用AMD NPU和iGPU執行您的微型代理應用程式

要使用AMD NPU和iGPU執行您的微型代理應用程式,只需將我們在上一節中建立的MCP伺服器指向Lemonade Server,如下所示

Windows
Linux
{
  "model": "Qwen3-8B-GGUF",
  "endpointUrl": "https://:8000/api/",
  "servers": [
    {
      "type": "stdio",
      "command": "C:\\Program Files\\nodejs\\npx.cmd",
      "args": [
        "mcp-remote",
        "https://:7860/gradio_api/mcp/sse"
      ]
    }
  ]
}

然後,您可以選擇各種模型在本地機器上執行。例如,我們使用了Qwen3-8B-GGUF模型,該模型透過Vulkan加速在AMD GPU上高效執行。您可以透過訪問https://:8000/#model-management查詢支援的模型列表,甚至匯入自己的模型。

建立一個助手來本地處理敏感資訊

Lemonade Server Interface

現在,讓我們透過啟用對本地檔案的訪問並引入一個完全在裝置上處理敏感資訊的助手來增強我們的端到端應用程式。具體來說,這個助手將幫助我們評估候選簡歷並支援招聘過程中的決策制定——所有這些都可以在保證資料隱私和安全的前提下進行。

為此,我們將使用Desktop Commander MCP伺服器,它允許您在本地機器上執行命令,並提供全面的檔案系統訪問、終端控制和程式碼編輯功能。

讓我們用一個基本的微型代理來設定一個專案。

mkdir file-assistant
cd file-assistant

然後,我們將在`file-assistant`資料夾中建立一個新的`agent.json`檔案。

Windows
Linux
{
  "model": "user.jan-nano",
  "endpointUrl": "https://:8000/api/",
  "servers": [
    {
      "type": "stdio",
      "command": "C:\\Program Files\\nodejs\\npx.cmd",
      "args": [
        "-y",
        "@wonderwhy-er/desktop-commander"
      ]
    }
  ]
}

最後,我們需要下載 `Jan Nano` 模型。您可以透過訪問 https://:8000/#model-management,點選 `Add a Model` 並提供以下資訊來完成此操作

Model Name: user.jan-nano
Checkpoint: Menlo/Jan-nano-gguf:jan-nano-4b-Q4_0.gguf
Recipe: llamacpp

Custom Model

全部完成!現在讓我們試一試。

試用

recording

我們的目標是建立一個能夠幫助我們本地處理敏感資訊的助手。為此,我們首先將為助手建立一個職位描述檔案。

在`file-assistant`資料夾中建立一個名為`job_description.md`的檔案。

# Senior Food Technology Engineer

## About the Role
We're seeking a culinary innovator to transform cooking processes into precise algorithms and AI systems.

## What You'll Do
- Convert cooking instructions into measurable algorithms
- Develop AI-powered kitchen tools
- Create food quality assessment systems
- Build recipe-following AI models

## Requirements
- MS in Computer Science (food-related thesis preferred)
- Python and PyTorch expertise
- Proven experience combining food science with ML
- Strong communication skills using culinary metaphors

## Perks
- Access to experimental kitchen
- Continuous taste-testing opportunities
- Collaborative tech-foodie team environment

*Note: Must attend conferences and publish on algorithmic cooking optimization.*

現在,讓我們在`file-assistant`資料夾中建立一個`candidates`資料夾,併為我們的助手新增一個示例簡歷檔案以供使用。

mkdir candidates
touch candidates/john_resume.md

新增以下示例簡歷或包含您自己的簡歷。

# John Doe

**Contact Information**
- Email: email@example.com
- Phone: (+1) 123-456-7890
- Location: 1234 Abc Street, Example, EX 01234
- GitHub: github.com/example
- LinkedIn: linkedin.com/in/example
- Website: example.com

## Experience

**Machine Learning Engineer Intern** | Slow Feet Technology | Jul 2021 - Present
- Developed food-agnostic formulation for cross-ingredient meal cooking
- Created competitive cream of mushroom soup recipe, published in NeurIPS 2099
- Built specialized pan for meal cooking research

**Research Intern** | Paddling University | Aug 2020 - Present
- Designed efficient mapo tofu quality estimation method using thermometer
- Proposed fast stir frying algorithm for tofu cooking, published in CVPR 2077
- Outperformed SOTA methods with improved efficiency

**Research Assistant** | Huangdu Institute of Technology | Mar 2020 - Jun 2020
- Developed novel framework using spoon and chopsticks for eating mapo tofu
- Designed tofu filtering strategy inspired by beans grinding method
- Created evaluation criteria for eating plan novelty and diversity

**Research Intern** | Paddling University | Jul 2018 - Aug 2018
- Designed dual sandwiches using traditional burger ingredients
- Utilized structure duality to boost cooking speed for shared ingredients
- Outperformed baselines on QWE'15 and ASDF'14 datasets

## Education

**M.S. in Computer Science** | University of Charles River | Sep 2021 - Jan 2023
- Location: Boston, MA

**B.Eng. in Software Engineering** | Huangdu Institute of Technology | Sep 2016 - Jul 2020
- Location: Shanghai, China

## Skills

**Programming Languages:** Python, JavaScript/TypeScript, HTML/CSS, Java
**Tools and Frameworks:** Git, PyTorch, Keras, scikit-learn, Linux, Vue, React, Django, LaTeX
**Languages:** English (proficient), Indonesia (native)

## Awards and Honors

- **Gold**, International Collegiate Catching Fish Contest (ICCFC) | 2018
- **First Prize**, China National Scholarship for Outstanding Culinary Skills | 2017, 2018

## Publications

**Eating is All You Need** | NeurIPS 2099
- Authors: Haha Ha, San Zhang

**You Only Cook Once: Unified, Real-Time Mapo Tofu Recipe** | CVPR 2077 (Best Paper Honorable Mention)
- Authors: Haha Ha, San Zhang, Si Li, Wu Wang

然後我們可以用以下命令執行代理

tiny-agents run agent.json

您應該會看到以下輸出

Agent loaded with 18 tools:
 • get_config
 • set_config_value
 • read_file
 • read_multiple_files
 • write_file
 • create_directory
 • list_directory
 • move_file
 • search_files
 • search_code
 • get_file_info
 • edit_block
 • execute_command
 • read_output
 • force_terminate
 • list_sessions
 • list_processes
 • kill_process
 »

現在讓我們為助手提供一些資訊以開始。

» Read the contents of C:\Users\your_username\file-assistant\job_description.md

您應該會看到類似以下的輸出

<Tool iNtxGmOuXHqZVBWmKnfxsc61xsJbsoAM>read_file {"path":"C:\\Users\\your_username\\file-assistant\\job_description.md","length":23}

Tool iNtxGmOuXHqZVBWmKnfxsc61xsJbsoAM
[Reading 23 lines from start]

(...)

The job description for the Senior Food Technology Engineer position emphasizes the need for a candidate who can bridge the gap between food science and artificial intelligence (...). Candidates are also expected to attend conferences and publish research on algorithmic cooking optimization.

我們使用的是預設系統提示,這可能會導致助手多次呼叫某些工具。要建立一個更自信的助手,您可以在與`agent.json`相同的目錄中提供一個自定義的`PROMPT.md`檔案。

太棒了!現在讓我們閱讀候選人的簡歷。

» Inside the same folder you can find a candidates folder. Check for john_resume.md and let me know if he is a good fit for the job.

您應該會看到類似以下的輸出

<Tool ll2oWo73YeGIft5VbOIpF9GNf0kevjEy>read_file {"path":"C:\\Users\\your_username\\file-assistant\\candidates\\john_resume.md"}

Tool ll2oWo73YeGIft5VbOIpF9GNf0kevjEy
[Reading 58 lines from start]

(...)
John Wayne is a **strong fit** for the Senior Food Technology Engineer role. His technical expertise in AI and machine learning, combined with his experience in food-related research and publications, makes him an excellent candidate. He also has the soft skills and cultural fit needed to thrive in a collaborative, innovative environment.

太棒了!現在我們可以繼續邀請候選人參加面試。

» Create a file called "invitation.md" in the "file-assistant" folder and write a short invitation to John to come in for an interview.

您應該會看到類似以下內容被寫入`invitation.md`檔案

# Interview Invitation

Dear John,

We would like to invite you for an interview for the Senior Food Technology Engineer position. The interview will be held on [insert date and time] at [insert location or virtual meeting details].

Please confirm your availability and let us know if you need any additional information.

Best regards,
[Your Name]
[Your Contact Information]

太棒了!我們成功建立了一個可以幫助我們本地處理敏感資訊的助手。

探索其他模型和加速選項

在上面的示例中,Jan-Nano模型利用Vulkan加速,在AMD GPU上高效進行本地LLM推理。您還可以透過訪問https://:8000/#model-management或檢視模型文件來嘗試其他模型和加速選項。

對於需要簡潔上下文並能受益於NPU + iGPU加速的Windows應用程式,您可以嘗試Lemonade Server提供的混合模型——針對AMD Ryzen AI 300系列PC進行了最佳化。諸如`Llama-xLAM-2-8b-fc-r-Hybrid`等模型經過專門微調,以實現工具呼叫,並提供快速、響應靈敏的效能!

結論

在本單元中,我們展示瞭如何使用AMD NPU和iGPU加速我們的端到端微型代理應用程式。我們還展示瞭如何建立一個助手來本地處理敏感資訊。

既然您已經瞭解瞭如何利用Lemonade Server進行本地模型加速和隱私保護應用程式,您可以在Lemonade GitHub儲存庫中探索更多示例和功能。該儲存庫包含額外的文件、示例實現,並由社群積極維護。

< > 在 GitHub 上更新

© . This site is unofficial and not affiliated with Hugging Face, Inc.