Fully Offline Deployment of a PPT Generation System on a Single Hygon K100_AI Card
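I. Introduction

With the rapid progress of artificial intelligence, the convergence of large language models and multimodal generation is reshaping how content is created across industries. Intelligent presentation (PPT) generation, an important direction in AI office automation, is undergoing a fundamental shift from "template filling" to "autonomous agent authoring". Traditional PPT production relies on manual work for material collection, content organization, layout design, and text-and-image typesetting, and a high-quality deck often takes several days of careful polishing. As large-model technology matures, AI systems can now automate the full chain from requirement understanding and content research to visual design.

This article deploys a PPT generation system in a fully offline environment on a single K100_AI card. The slide agent model is DeepPresenter-9B, and the text-to-image model is ERNIE-Image-Turbo.

II. Solution Design

The solution uses a two-model collaborative architecture: the design agent DeepPresenter-9B plus the text-to-image model ERNIE-Image-Turbo. Each model has its own role, and the two cooperate through standardized APIs:

1. DeepPresenter-9B serves as the Design Agent and also as the Research Agent and Long Context Model. It handles in-depth content retrieval and information organization, slide HTML code generation, and design-constraint management.
2. ERNIE-Image-Turbo serves as the T2I Model and generates high-quality illustrations on demand.

III. Implementation and Code

3.1 Hardware Environment

The server hardware used in this solution is configured as follows:

Component    Specification
GPU          Hygon DCU K100_AI, 64 GB

Only one K100_AI card is needed.

3.2 Software Stack

The software stack is built on Docker containers and uses vLLM inference images adapted for the Hygon DCU.

Image 1 (DeepPresenter-9B):
42.228.13.241:5000/jenkins/model_test_env/vllm:0.21.0-ubuntu22.04-dtk2604-py3.10-20260702-0235
This image contains the vLLM 0.21 inference framework, DTK 26.04, and a Python 3.10 environment. Deployed on a single card, DeepPresenter-9B generates 31 tokens per second on average.

Image 2 (ERNIE-Image-Turbo):
harbor.sourcefind.cn:5443/dcu/admin/base/custom:vllm0.15.1-ubuntu22.04-dtk26.04-0130-py3.10-20260220
This image is based on the vLLM 0.15.1 inference framework, DTK 26.04, and a Python 3.10 environment. With ERNIE-Image-Turbo and DeepPresenter-9B deployed on the same K100_AI card, one high-resolution image takes 12 seconds on average.

Software projects

1. ERNIE-Image-Turbo. Set up the environment by following the steps at https://developer.sourcefind.cn/codes/modelzoo/ernie-image_pytorch. The project download link is https://developer.sourcefind.cn/codes/modelzoo/ernie-image_pytorch/-/archive/main/ernie-image_pytorch-main.zip. However, an OpenAI-compatible API for the project has to be written by hand. The custom code is as follows:

    import os
    # Select the card before torch is imported
    os.environ["HIP_VISIBLE_DEVICES"] = "7"
    import torch
    import base64
    import io
    from flask import Flask, request, jsonify
    from flask_cors import CORS
    from modelscope import ErnieImagePipeline

    # Load the model only once, globally
    torch.set_float32_matmul_precision("high")
    torch.backends.cudnn.benchmark = True
    torch.backends.cuda.enable_mem_efficient_sdp(True)
    print("Loading the ERNIE-Image-Turbo model...")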
    pipe = ErnieImagePipeline.from_pretrained(
        "/home/models/ERNIE-Image-Turbo",
        torch_dtype=torch.bfloat16,
    ).to("cuda")
    print("Model loaded.")

    # Warm-up: run one attention call so the kernels are initialized
    with torch.no_grad():
        dummy_q = torch.randn(1, 8, 40, 64, device="cuda", dtype=torch.bfloat16)
        dummy_k = torch.randn(1, 8, 40, 64, device="cuda", dtype=torch.bfloat16)
        dummy_v = torch.randn(1, 8, 40, 64, device="cuda", dtype=torch.bfloat16)
        torch.nn.functional.scaled_dot_product_attention(dummy_q, dummy_k, dummy_v)
    print("Model warm-up complete.")

    # Generation function
    def generate_image(prompt, height, width, steps, guidance_scale, use_pe):
        with torch.no_grad():
            image = pipe(
                prompt=prompt,
                height=height,
                width=width,
                num_inference_steps=steps,
                guidance_scale=guidance_scale,
                use_pe=use_pe,
            ).images[0]
        return image

    # Flask API service
    app = Flask(__name__)
    CORS(app)

    @app.route("/v1/images/generations", methods=["POST"])
    def images_generations():
        data = request.get_json()
        print(f"[DEBUG] Received data: {data}")
        if not data:
            return jsonify({"error": "Missing request body"}), 400
        prompt = data.get("prompt")
        if not prompt:
            return jsonify({"error": "Missing prompt field"}), 400

        # Prefer the size field (format such as 1920x1088); otherwise use width and height
        size_str = data.get("size")
        if size_str and isinstance(size_str, str) and "x" in size_str:
            try:
                w_str, h_str = size_str.split("x")
                width = int(w_str)
                height = int(h_str)
                print(f"[DEBUG] Parsed from size: width={width}, height={height}")
            except ValueError:
                return jsonify({"error": f"Invalid size format: {size_str}"}), 400
        else:
            width = data.get("width", 848)
            height = data.get("height", 1264)
            print(f"[DEBUG] Using width/height: width={width}, height={height}")

        # Ensure integers
        try:
            width = int(width)
            height = int(height)
        except (TypeError, ValueError):
            return jsonify({"error": f"Invalid width/height: {width}, {height}"}), 400

        # Performance: cap the maximum resolution
        max_dim = 1024  # can be changed to 768 or 1280 to trade speed against quality
        if width > max_dim or height > max_dim:
            scale = max_dim / max(width, height)
            width = int(width * scale)
            height = int(height * scale)
            print(f"[DEBUG] Scaled down to max dim {max_dim}: {width}x{height}")

        # Align to a multiple of 16 (model requirement)
        orig_w, orig_h = width, height
        width = (width // 16) * 16
        height = (height // 16) * 16
        if width < 16:
            width = 16
        if height < 16:
            height = 16
        if (orig_w, orig_h) != (width, height):
            print(f"[DEBUG] Aligned dimensions: ({orig_w},{orig_h}) -> ({width},{height})")
        else:
            print(f"[DEBUG] Dimensions already multiple of 16: {width}x{height}")

        # Performance: fewer steps, prompt enhancement disabled
        steps = data.get("steps", 4)  # default step count changed from 8 to 4
        if "num_inference_steps" in data:
            steps = data["num_inference_steps"]
        guidance_scale = data.get("guidance_scale", 1.0)
        use_pe = data.get("use_pe", False)  # prompt enhancement off by default to speed up inference

        try:
            pil_image = generate_image(prompt, height, width, steps, guidance_scale, use_pe)
            buffered = io.BytesIO()
            pil_image.save(buffered, format="PNG")
            img_base64 = base64.b64encode(buffered.getvalue()).decode("utf-8")
            response = {
                "data": [
                    {
                        "b64_json": img_base64,
                        "url": f"data:image/png;base64,{img_base64}",
                    }
                ]
            }
            return jsonify(response)
        except Exception as e:
            print(f"[ERROR] {e}")
            return jsonify({"error": str(e)}), 500

    @app.route("/v1/models", methods=["GET"])
    def list_models():
        return jsonify({
            "data": [
                {"id": "ERNIE-Image-Turbo", "object": "model"}
            ]
        })

    @app.route("/health", methods=["GET"])
    def health():
        return jsonify({"status": "ok"})

    if __name__ == "__main__":
        app.run(host="0.0.0.0", port=5000, threaded=False)

2. PPTAgent. The download link is https://github.com/icip-cas/PPTAgent/archive/refs/heads/main.zip. After installing the project's dependencies, modify its configuration file as follows:

    /opt/models/PPTAgent-main# cat deeppresenter/config.yaml
    context_folding: true
    design_agent: &id001
      api_key: 123456
      max_tool_calls: 20
      base_url: http://192.168.222.65:8083/v1
      model: DeepPresenter-9B
      design_constraints:
        max_bullets_per_slide: 6
        min_bottom_margin: 0.6
      system_prompt_extra: |
        # ⛔ FATAL RULES – VIOLATION = SLIDE REJECTION ⛔
        You are generating HTML slides that will be converted to PowerPoint via a strict validator. **Any rule violation makes the entire slide fail.** Follow every rule exactly.

        ## ABSOLUTELY FORBIDDEN – WILL CAUSE IMMEDIATE REJECTION:
        1.
           **Headers (h1 to h6)** MUST NEVER have any of these CSS properties:
           - border (including border-*, outline)
           - background (including background-color, background-image)
           - box-shadow
           - margin (use padding on the parent container instead)
           - padding (use padding on the parent container instead)
           → **If you need visual decoration (border, background, shadow) around a heading, wrap the heading in a div and apply those styles to that div only.**
           → **Example (WRONG):** <h2 style="border-bottom: 2px solid red;">Title</h2> ❌
           → **Example (CORRECT):** <div style="border-bottom: 2px solid red;"><h2>Title</h2></div> ✅
        2. **Naked text** inside div, section, or any block container is NOT allowed. All text must be enclosed in p, h1–h6, ul, or ol.
        3. **Inline elements** (b, strong, span, a) must NOT have margin, padding, or border. Spacing must be applied to the outer block-level parent.
        4. **Manual bullet symbols** (e.g., •, -, *) are forbidden. Always use <ul><li>...</li></ul>.
        5. **Overflow is prohibited** – both horizontally and vertically. The slide container has a fixed height (100% of 7.5 inches). All content must fit within with at least **0.5 inch (≈36pt) of bottom padding**.
        6. **Image generation** – width and height must be multiples of 16 (e.g., 1360×768, 1376×768). Never use odd numbers like 1366.

        ## MANDATORY LAYOUT RULES (to prevent overflow):
        - Root container: <div class="slide"> with style="display:flex; flex-direction:column; height:100%; width:100%;".
        - Main content area: use flex:1; and set padding-bottom: 36pt; (or larger) to ensure bottom margin.
        - Limit bullet points to **6 per slide** and keep text concise. If content is dense, reduce font size (e.g., use font-size: 14pt instead of 18pt) or line-height.
        - Use relative units (%, vw, em) for widths and margins to avoid horizontal overflow. Avoid fixed pixel widths that may exceed container.
        - For multi-column or grid layouts, use display:flex; flex-wrap:wrap; with appropriate gap and ensure total width ≤ 100%.

        ## SELF-CHECK (run mentally before submitting each slide):
        1. **Scan every h1–h6** – confirm they have **zero** border, background, box-shadow, margin, padding. If any exist, remove them immediately and move styling to a wrapping div.
        2. Check every div – ensure no raw text is left unwrapped.
        3. Verify that the bottom-most element is at least 36pt above the slide bottom (use padding-bottom on the flex child).
        4. Temporarily set overflow: visible on the slide to visually check for any content spilling outside; if it spills, reduce content or font sizes.
        5. Ensure no list is manually bulleted – all lists use ul.

        ## REMINDER – Existing guidelines still apply:
        - All styles must be inline or inside <style>.
        - Use flex/grid for layout.
        - The validator is unforgiving – **double-check every rule** before finalizing a slide.
    long_context_model: *id001
    research_agent: *id001
    t2i_model:
      base_url: http://192.168.222.65:5000/v1
      model: ERNIE-Image-Turbo
      api_key: not-needed
    offline_mode: true
    vision_model: *id001

3.3 Model Download and Preparation

1. DeepPresenter-9B (slide design model)
   Download link: https://modelscope.cn/models/forceless/DeepPresenter-9B/files
   The model has been supervised fine-tuned for slide generation.
2. ERNIE-Image-Turbo (text-to-image model)
   Download link: https://modelscope.cn/models/PaddlePaddle/ERNIE-Image-Turbo/files
   8B parameters; supports fast generation with 8-step inference.

3.4 Model Startup Parameters

    /opt/models# cat dmx_DeepPresenter-9B.sh
    #!/bin/bash
    export VLLM_SPEC_DECODE_EAGER=1
    export VLLM_MLA_DISABLE=0
    export VLLM_USE_FLASH_MLA=1
    export VLLM_RPC_TIMEOUT=1800000
    export HIP_VISIBLE_DEVICES=7
    export ALLREDUCE_STREAM_WITH_COMPUTE=1
    # Hygon CPU core binding; use hy-smi --showtopo to look up the NUMA nodes
    export VLLM_NUMA_BIND=1
    export VLLM_RANK0_NUMA=0
    export VLLM_RANK1_NUMA=0
    export VLLM_RANK2_NUMA=0
    export VLLM_RANK3_NUMA=0
    export VLLM_RANK4_NUMA=0
    export VLLM_RANK5_NUMA=0
    export VLLM_RANK6_NUMA=0
    export VLLM_RANK7_NUMA=0
    export NCCL_MAX_NCHANNELS=16
    export NCCL_MIN_NCHANNELS=16
    #export TRITON_HIP_CLANG_PATH=/opt/dtk-26.04-DCC2602-0317/llvm/bin/clang
    #export ROCM_PATH=/opt/dtk-26.04-DCC2602-0317
    export PATH=$ROCM_PATH/llvm/bin:$PATH
    vllm serve /home/models/DeepPresenter-9B \
        --gpu-memory-utilization 0.4 \
        --port 8083 \
        --max-model-len 32768 \
        --max-num-seqs 32 \
        --served-model-name DeepPresenter-9B \
        --tensor-parallel-size 1 \
        --enable-auto-tool-choice \
        --tool-call-parser qwen3_xml \
        --trust-remote-code \
        --enable-prefix-caching \
        --enable-chunked-prefill

Be sure to set --tool-call-parser qwen3_xml in the startup parameters; together with --enable-auto-tool-choice, it is what lets vLLM parse the agent's tool calls.

IV. Run and Test

1. Start the ERNIE-Image-Turbo model. Load it with the custom OpenAI-compatible API program written earlier:

    root:/opt/models# docker exec -it erinie-image-new bash
    root:/workspace# cd /home/models/ernie-image_pytorch-main
    root:/home/models/ernie-image_pytorch-main# nohup python run-web-turbo-OpenAI.py &
    [1] 57
    root:/home/models/ernie-image_pytorch-main# nohup: ignoring input and appending output to nohup.out

2. Start DeepPresenter-9B:

    root:/opt/models# docker exec -it vllm0.21-new bash
    root:/# cd /home/models/
    root:/home/models# nohup ./dmx_DeepPresenter-9B.sh &
    [1] 232
    root:/home/models# nohup: ignoring input and appending output to nohup.out

3. Start the PPTAgent project:

    root:/opt/models# docker exec -it erinie-image-new bash
    root:/workspace# cd /home/models/PPTAgent-main
    root:/home/models/PPTAgent-main# source .venv/bin/activate
    (pptagent) root:/home/models/PPTAgent-main# python webui.py
    /opt/models/PPTAgent-main/.venv/lib/python3.11/site-packages/requests/__init__.py:113: RequestsDependencyWarning: urllib3 (2.6.3) or chardet (7.0.1)/charset_normalizer (3.4.4) doesn't match a supported version!
      warnings.warn(
    /home/models/PPTAgent-main/webui.py:107: DeprecationWarning: The theme parameter in the Blocks constructor will be removed in Gradio 6.0. You will need to pass theme to Blocks.launch() instead.
      with gr.Blocks(
    /home/models/PPTAgent-main/webui.py:107: DeprecationWarning: The css parameter in the Blocks constructor will be removed in Gradio 6.0. You will need to pass css to Blocks.launch() instead.
      with gr.Blocks(
    /home/models/PPTAgent-main/webui.py:119: DeprecationWarning: The default value of allow_tags in gr.Chatbot will be changed from False to True in Gradio 6.0.
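    You will need to explicitly set allow_tags=False if you want to disable tags in your chatbot.
      chatbot = gr.Chatbot(
    Please visit http://localhost:7861
    * Running on local URL: http://0.0.0.0:7861
    * To create a public link, set share=True in launch().

4. Open the PPT generation system in a browser at the address printed above (port 7861).
   (Figure: web UI of the PPT generation system)

5. Generate a PPT and review the result.
   (Figure: the generated PPT)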
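
As an optional extra check alongside the steps above, the image service can be exercised on its own, without going through PPTAgent. The following is a minimal sketch and not part of the original deployment. It assumes the Flask service from section 3.2 is reachable at the t2i_model base_url from config.yaml (http://192.168.222.65:5000/v1). The helper names expected_size, build_payload, decode_first_image, and request_image are hypothetical. expected_size repeats the service's clamp-and-align arithmetic so that the client can predict the resolution it will actually get back.

```python
import base64
import json
import urllib.request


def expected_size(width, height, max_dim=1024):
    """Mirror the service's size handling: clamp to max_dim, then align down to 16."""
    if width > max_dim or height > max_dim:
        scale = max_dim / max(width, height)
        width, height = int(width * scale), int(height * scale)
    # Round down to a multiple of 16, but never below 16
    width = max((width // 16) * 16, 16)
    height = max((height // 16) * 16, 16)
    return width, height


def build_payload(prompt, width, height, steps=4):
    """Build the JSON body accepted by POST /v1/images/generations."""
    return {
        "prompt": prompt,
        "size": f"{width}x{height}",
        "num_inference_steps": steps,
    }


def decode_first_image(response_json):
    """Return the PNG bytes carried in data[0].b64_json."""
    return base64.b64decode(response_json["data"][0]["b64_json"])


def request_image(base_url, prompt, width, height, timeout=120):
    """Send one generation request and return the PNG bytes (performs network I/O)."""
    body = json.dumps(build_payload(prompt, width, height)).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/images/generations",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return decode_first_image(json.load(resp))
```

For example, request_image("http://192.168.222.65:5000/v1", "a blue abstract background", 2048, 1024) would return PNG bytes that can be written to a file. expected_size(2048, 1024) gives (1024, 512), which is the resolution the service would render for that request with its default max_dim of 1024.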