MinerU 2.0 Deployment
A deployment guide for MinerU 2.0 accelerated with sglang. This article covers two sglang-based deployment approaches for MinerU 2.0. It provides an optimized Dockerfile that uses a build cache to speed up rebuilds and trims the model download, explains the docker run command and docker-compose configuration options (GPU settings, port mappings, and related parameters), and ends with a Python test script showing how to call the sglang service to analyze PDF documents.
Introduction
MinerU 2.0 uses sglang for acceleration and differs significantly from earlier versions, so it is recommended to start it from the official Docker image.
Docker Image
Dockerfile
This is the official Dockerfile:
# Use the official sglang image
FROM lmsysorg/sglang:v0.4.7-cu124
# install mineru latest
RUN python3 -m pip install -U 'mineru[core]' -i https://mirrors.aliyun.com/pypi/simple --break-system-packages
# Download models and update the configuration file
RUN /bin/bash -c "mineru-models-download -s modelscope -m all"
# Set the entry point to activate the virtual environment and run the command line tool
ENTRYPOINT ["/bin/bash", "-c", "export MINERU_MODEL_SOURCE=local && exec \"$@\"", "--"]
It is recommended to use the Dockerfile below instead. Compared with the official one, it adds a build cache (speeding up subsequent builds) and downloads only the VLM model (the official one also downloads the pipeline models). Because BuildKit cache mounts are not persisted into the final image, the downloaded models are copied out of the cache to /tmp and then moved back into /root/.cache in a separate layer.
# Use the official sglang image
FROM lmsysorg/sglang:v0.4.7-cu124
# install mineru latest
RUN --mount=type=cache,id=mineru_cache,target=/root/.cache,sharing=locked \
python3 -m pip install -U 'mineru[core]' -i https://mirrors.aliyun.com/pypi/simple --break-system-packages
# Download models and update the configuration file
RUN --mount=type=cache,id=mineru_cache,target=/root/.cache,sharing=locked \
mineru-models-download -s modelscope -m vlm && \
cp -r /root/.cache/modelscope /tmp/modelscope
RUN mkdir -p /root/.cache && \
mv /tmp/modelscope /root/.cache/modelscope
# Set the entry point to activate the virtual environment and run the command line tool
ENTRYPOINT ["/bin/bash", "-c", "export MINERU_MODEL_SOURCE=local && exec \"$@\"", "--"]
Build the Docker image
docker build -t mineru-sglang:latest -f Dockerfile .
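Note that the --mount=type=cache instructions in the optimized Dockerfile require Docker BuildKit; on older Docker versions it may need to be enabled explicitly (a minimal sketch, assuming Docker 18.09 or newer; recent releases enable BuildKit by default):
# Enable BuildKit explicitly if your daemon does not do so by default
DOCKER_BUILDKIT=1 docker build -t mineru-sglang:latest -f Dockerfile .
# Or build through buildx, which always runs with BuildKit
docker buildx build -t mineru-sglang:latest -f Dockerfile .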
Startup
Docker
# --gpus all
docker run -e MINERU_MODEL_SOURCE=local --gpus '"device=0,1"' \
--shm-size 100g \
-p 80:80 \
--ipc=host \
mineru-sglang:latest \
mineru-sglang-server --host 0.0.0.0 --port 80 --enable-torch-compile --tp 2
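Once the container is up, you can verify that the server is responding before pointing clients at it (a quick sketch; it assumes the /health endpoint used by the compose healthcheck below and the port 80 mapping from the command above):
# The server may take a while to load the model on first start
curl -f http://localhost:80/health
# If the check fails, follow the container logs
docker logs -f <container_id>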
Docker compose
services:
  mineru-sglang:
    image: mineru-sglang:latest
    container_name: mineru-sglang
    restart: always
    ports:
      - 30000:30000
    environment:
      MINERU_MODEL_SOURCE: local
    entrypoint: mineru-sglang-server
    command:
      --host 0.0.0.0
      --port 30000
      # --enable-torch-compile # You can also enable torch.compile to accelerate inference speed by approximately 15%
      # --dp 2 # If you have more than two GPUs with 24GB VRAM or above, you can use sglang's multi-GPU parallel mode to increase throughput
      # --tp 2 # If you have two GPUs with 12GB or 16GB VRAM, you can use the Tensor Parallel (TP) mode
      # --mem-fraction-static 0.7 # If you have two GPUs with 11GB VRAM, in addition to Tensor Parallel mode, you need to reduce the KV cache size
    ulimits:
      memlock: -1
      stack: 67108864
    ipc: host
    healthcheck:
      test: ["CMD-SHELL", "curl -f http://localhost:30000/health || exit 1"]
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ["0"]
              capabilities: [gpu]
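With this configuration saved as docker-compose.yml, the service can be started and checked as follows (a sketch, assuming Docker Compose v2):
# Start the service in the background
docker compose up -d
# Follow the logs until the model has finished loading
docker compose logs -f mineru-sglang
# The healthcheck target can also be queried by hand
curl -f http://localhost:30000/health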
Test
"""
pip install -U mineru -i https://mirrors.aliyun.com/pypi/simple
"""
import json
import os
import time
from mineru.backend.vlm.vlm_analyze import doc_analyze as vlm_doc_analyze
from mineru.backend.vlm.vlm_middle_json_mkcontent import union_make as vlm_union_make
from mineru.cli.common import convert_pdf_bytes_to_bytes_by_pypdfium2, prepare_env
from mineru.data.data_reader_writer import FileBasedDataWriter
from mineru.utils.enum_class import MakeMode
def process_pdf(
file_path:str,
):
output_dir = 'output'
server_url = 'http://<mineru_sglang_ip>:<port>'
f_make_md_mode = MakeMode.MM_MD
f_dump_md = True
f_dump_content_list = True
f_dump_middle_json = True
f_dump_model_output = True
start = time.time()
parts = os.path.splitext(os.path.basename(file_path))
pdf_file_name = parts[0]
with open(file_path, 'rb') as f:
pdf_bytes = f.read()
pdf_bytes = convert_pdf_bytes_to_bytes_by_pypdfium2(pdf_bytes, 0, None)
local_image_dir, local_md_dir = prepare_env(output_dir, pdf_file_name, 'auto')
image_writer, md_writer = FileBasedDataWriter(local_image_dir), FileBasedDataWriter(local_md_dir)
end1 = time.time()
print(f'start to call sglang, cost, {end1 - start}')
middle_json, infer_result = vlm_doc_analyze(pdf_bytes, image_writer=image_writer, backend='sglang-client',
server_url=server_url)
end2 = time.time()
print(f'end to call sglang, cost, {end2 - end1}')
pdf_info = middle_json["pdf_info"]
# draw_layout_bbox(pdf_info, pdf_bytes, local_md_dir, f"{pdf_file_name}_layout.pdf")
# draw_span_bbox(pdf_info, pdf_bytes, local_md_dir, f"{pdf_file_name}_span.pdf")
if f_dump_md:
image_dir = str(os.path.basename(local_image_dir))
md_content_str = vlm_union_make(pdf_info, f_make_md_mode, image_dir)
md_writer.write_string(
f"{pdf_file_name}.md",
md_content_str,
)
end3 = time.time()
print(f'end to gen md, cost, {end3 - end2}')
if f_dump_content_list:
image_dir = str(os.path.basename(local_image_dir))
content_list = vlm_union_make(pdf_info, MakeMode.CONTENT_LIST, image_dir)
md_writer.write_string(
f"{pdf_file_name}_content_list.json",
json.dumps(content_list, ensure_ascii=False, indent=4),
)
if f_dump_middle_json:
md_writer.write_string(
f"{pdf_file_name}_middle.json",
json.dumps(middle_json, ensure_ascii=False, indent=4),
)
if f_dump_model_output:
model_output = ("\n" + "-" * 50 + "\n").join(infer_result)
md_writer.write_string(
f"{pdf_file_name}_model_output.txt",
model_output,
)
print(f"local output dir is {local_md_dir}")
if __name__ == '__main__':
file = 'demo.pdf'
process_pdf(file)
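The same server can also be exercised without writing any Python, by pointing the mineru command-line tool at it (a sketch, assuming the CLI flags described in the MinerU 2.0 README; run mineru --help to confirm them for your installed version):
# Parse demo.pdf via the remote sglang server using the vlm-sglang-client backend
mineru -p demo.pdf -o output -b vlm-sglang-client -u http://<mineru_sglang_ip>:<port>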
