【书生大模型实战营】Lagent 自定义你的 Agent 智能体——调用通义万向API

Lagent 自定义你的 Agent 智能体

lzl2040

1257人浏览 · 2024-08-14 17:48:01

lzl2040 · 2024-08-14 17:48:01 发布

【书生大模型实战营】Lagent 自定义你的 Agent 智能体

【书生大模型实战营】Lagent 自定义你的 Agent 智能体

【书生大模型实战营】Lagent 自定义你的 Agent 智能体

任务

使用 Lagent 自定义一个智能体，并使用 Lagent Web Demo 成功部署与调用。

环境创建

cuda 12.2，使用如下命令创建环境：

# 创建环境
conda create -n agent_camp3 python=3.10 -y
# 激活环境
conda activate agent_camp3
# 安装 torch
conda install pytorch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 pytorch-cuda=12.1 -c pytorch -c nvidia -y
# 安装其他依赖包
pip install termcolor==2.4.0
pip install lmdeploy==0.5.2

创建agent_camp3文件夹，将lagent仓库克隆在这里，即：

git clone https://github.com/InternLM/lagent.git

然后以源码的方式安装lagent：

cd lagent && git checkout 81e7ace && pip install -e . && cd ..

Lagent

推荐在VScode命令行运行，它会自动进行端口转发。使用如下命令部署InternLM2.5-7B-Chat，并启动一个 API Server：

lmdeploy serve api_server /share/new_models/Shanghai_AI_Laboratory/internlm2_5-7b-chat --model-name internlm2_5-7b-chat

然后在另一个窗口中启动 Lagent 的 Web Demo：

streamlit run examples/internlm2_agent_web_demo.py

浏览器输入http://localhost:8501/有如下界面：
在这里插入图片描述
并修改模型名称一栏为 internlm2_5-7b-chat，修改模型 ip一栏为127.0.0.1:23333（因为23333对应的是之前运行的internlm2_5-7b-chat这个模型），并选择插件为：ArxivSearch，具体如下：

然后输入指令：帮我搜索一下 MindSearch 论文，结果如下：
在这里插入图片描述

基于 Lagent 自定义智能体

使用 Lagent 自定义工具主要分为以下几步：

继承 BaseAction 类
实现简单工具的 run 方法；或者实现工具包内每个子工具的功能
简单工具的 run 方法可选被 tool_api 装饰；工具包内每个子工具的功能都需要被 tool_api 装饰

下面以一个文生图的agent为例。首先在lagent/lagent/actions创建一个magicmaker.py文件，然后复制如下内容：

import json
import requests

from lagent.actions.base_action import BaseAction, tool_api
from lagent.actions.parser import BaseParser, JsonParser
from lagent.schema import ActionReturn, ActionStatusCode


class MagicMaker(BaseAction):
    styles_option = [
        'dongman',  # 动漫
        'guofeng',  # 国风
        'xieshi',   # 写实
        'youhua',   # 油画
        'manghe',   # 盲盒
    ]
    aspect_ratio_options = [
        '16:9', '4:3', '3:2', '1:1',
        '2:3', '3:4', '9:16'
    ]

    def __init__(self,
                 style='guofeng',
                 aspect_ratio='4:3'):
        super().__init__()
        if style in self.styles_option:
            self.style = style
        else:
            raise ValueError(f'The style must be one of {self.styles_option}')
        
        if aspect_ratio in self.aspect_ratio_options:
            self.aspect_ratio = aspect_ratio
        else:
            raise ValueError(f'The aspect ratio must be one of {aspect_ratio}')
    
    @tool_api
    def generate_image(self, keywords: str) -> dict:
        """Run magicmaker and get the generated image according to the keywords.

        Args:
            keywords (:class:`str`): the keywords to generate image

        Returns:
            :class:`dict`: the generated image
                * image (str): path to the generated image
        """
        try:
            response = requests.post(
                url='https://magicmaker.openxlab.org.cn/gw/edit-anything/api/v1/bff/sd/generate',
                data=json.dumps({
                    "official": True,
                    "prompt": keywords,
                    "style": self.style,
                    "poseT": False,
                    "aspectRatio": self.aspect_ratio
                }),
                headers={'content-type': 'application/json'}
            )
        except Exception as exc:
            return ActionReturn(
                errmsg=f'MagicMaker exception: {exc}',
                state=ActionStatusCode.HTTP_ERROR)
        image_url = response.json()['data']['imgUrl']
        return {'image': image_url}

从generate_image函数中可以看出它是通过调用API来实现的。

然后修改修改 lagent/examples/internlm2_agent_web_demo.py来适配我们的自定义工具。

首先导入创建的MagicMaker，然后将七加入到action_list变量，如下：

from lagent.actions.magicmaker import MagicMaker
action_list = [
            ArxivSearch(),
			MagicMaker(),
 ]

然后重新启动上面两个demo，在agent界面可以选择两个插件，功能也没有问题。首先输入请生成一只鸡在打篮球，生成的结果如下：

在这里插入图片描述
感觉这个agent对动作不敏感。

然后让它找半监督视频目标分割的论文STCN，也是可以的：
在这里插入图片描述
不过它应该只是根据摘要的标题来找的。

自定义Agent

使用通义万相作为自己的agent，详细使用见阿里云官网：

export DASHSCOPE_API_KEY="你的API Key"

然后创建在latent/actions文件夹下创建qwen_wanxiang.py，内容为：

# internlm2_5-7b-chat, 127.0.0.1:23333
import json

from lagent.actions.base_action import BaseAction, tool_api
from lagent.actions.parser import BaseParser, JsonParser
from lagent.schema import ActionReturn, ActionStatusCode
from dashscope import ImageSynthesis
import dashscope
from http import HTTPStatus
from urllib.parse import urlparse, unquote
from pathlib import PurePosixPath
import requests


class QWen_Wanxiang(BaseAction):

    def __init__(self):
        super().__init__()
    
    @tool_api
    def generate_image(self, keywords: str) -> dict:
        """Run magicmaker and get the generated image according to the keywords.

        Args:
            keywords (:class:`str`): the keywords to generate image

        Returns:
            :class:`dict`: the generated image
                * image (str): path to the generated image
        """
        rsp = dashscope.ImageSynthesis.call(model=dashscope.ImageSynthesis.Models.wanx_v1,
                              prompt=keywords,
                              n=4,
                              size='1024*1024')
        img_url = ""
        if rsp.status_code == HTTPStatus.OK:
            for result in rsp.output.results:
                img_url = result.url
                break
        else:
            print('Failed, status_code: %s, code: %s, message: %s' %
                (rsp.status_code, rsp.code, rsp.message))
        return {'image': img_url}

然后更改internlm2_agent_web_demo.py中的action_list，最后执行命令streamlit run examples/internlm2_agent_web_demo.py，其他跟之前的步骤一样。

然后让模型生成狗在打篮球的图片，结果如下：
在这里插入图片描述
对动作的理解也不行。

天启AI社区

GitCode 天启AI是一款由 GitCode 团队打造的智能助手，基于先进的LLM（大语言模型）与多智能体 Agent 技术构建，致力于为用户提供高效、智能、多模态的创作与开发支持。它不仅支持自然语言对话，还具备处理文件、生成 PPT、撰写分析报告、开发 Web 应用等多项能力，真正做到“一句话，让 Al帮你完成复杂任务”。

更多推荐