CANN Windowed PID Residual Diagnostics Benchmark
# PidWindowedResidualDiagnostics Benchmark Note

mat-chem-sim-pred targets industrial applications and focuses on two core scenarios, computational simulation and prediction. It builds a domain computing layer for the process industry that is driven by both mechanism models and data, and promotes the in-depth application of AI for Science in materials chemistry. Project URL: https://gitcode.com/cann/mat-chem-sim-pred

## Local prototype validation

Environment:

- OS: Windows
- Python: 3.11
- Data type: float32

Test commands:

    python -m pytest prediction/ProcessControl/PIDModelFit/pid_windowed_residual_diagnostics/tests/test_pid_windowed_residual_diagnostics.py -q
    python prediction/ProcessControl/PIDModelFit/pid_windowed_residual_diagnostics/tests/benchmark_pid_windowed_residual_diagnostics.py

Result:

    7 passed

## CPU reference performance

| B | N | windows | window | stride | lag | work items | loop ms | vectorized ms | vectorized speedup |
|---|---|---|---|---|---|---|---|---|---|
| 64 | 2048 | 15 | 256 | 128 | 16 | 3.93M | 176.546 | 18.141 | 9.73x |
| 128 | 4096 | 15 | 512 | 256 | 32 | 31.46M | 614.194 | 117.308 | 5.24x |
| 256 | 4096 | 15 | 512 | 256 | 32 | 62.91M | 1219.490 | 242.392 | 5.03x |
| 512 | 8192 | 15 | 1024 | 512 | 64 | 503.32M | 4660.783 | 1804.271 | 2.58x |

Preliminary assessment:

This direction is worth continuing as an Ascend C prototype, but the Python sliding-window materialization approach should not be reused directly. The NPU version should partition tasks by `(batch, window)`, read `actual`/`predicted` window by window inside the kernel, and fuse the statistics, the lag autocorrelation, and the Ljung-Box reduction, so that the `[B, W, window_size]` intermediate is never explicitly expanded.

Compared with `PidTuningRuleBatch` and scalar rollout, this direction has three advantages:

- The workload grows with `B * W * window_size * max_lag`, which leaves ample room for throughput gains.
- The outputs are only `metrics[B,W,8]` and `autocorr[B,W,L]`, so D2H pressure is manageable.
- The existing `PidResidualDiagnostics` has already shown on node202 that reductions and correlation scans of the same kind can achieve an e2e speedup.

## Ascend C prototype validation

Environment:

- NPU environment: node202
- SOC: Ascend910B3
- CANN: /usr/local/Ascend/ascend-toolkit
- device: 3
- CPU comparison thread count: 64

Build commands:

    source /usr/local/Ascend/ascend-toolkit/set_env.sh
    cd prediction/ProcessControl/PIDModelFit/pid_windowed_residual_diagnostics
    cmake -S . -B build -DCMAKE_BUILD_TYPE=Release -DSOC_VERSION=Ascend910B3
    cmake --build build -j 2

Build outputs:

- libpid_windowed_residual_diagnostics_kernel_lib.so
- libpid_windowed_residual_diagnostics_host.so
- test_aclnn_pid_windowed_residual_diagnostics
- benchmark_pid_windowed_residual_diagnostics

Smoke:

    PidWindowedResidualDiagnostics smoke windows=2 w0_mean=0 w0_mae=0.5 w0_rmse=0.707107 w0_dw=1.5 w0_autocorr=[0, -0.5] PASSED

Benchmark commands:

    ./build/benchmark_pid_windowed_residual_diagnostics 3 64 2048 256 128 16 5 64
    ./build/benchmark_pid_windowed_residual_diagnostics 3 128 4096 512 256 32 3 64

Results:

    B=64 N=2048 windows=15 window=256 stride=128 lag=16 cpu_64T_ms=4.69239 npu_kernel_ms=0.0470184 npu_resident_e2e_ms=0.0727348 kernel_speedup=99.7991 resident_e2e_speedup=64.5137 metric_max_abs=7.62939e-06 metric_max_rel=1.19104e-07 autocorr_max_abs=0 autocorr_max_rel=0
    B=128 N=4096 windows=15 window=512 stride=256 lag=32 cpu_64T_ms=6.07111 npu_kernel_ms=0.0818743 npu_resident_e2e_ms=0.195025 kernel_speedup=74.1515 resident_e2e_speedup=31.1298 metric_max_abs=7.62939e-06 metric_max_rel=1.19131e-07 autocorr_max_abs=0 autocorr_max_rel=0

Notes:

- The current `resident_e2e` measurement assumes that `actual`/`predicted` are already on the Device. It covers only the kernel plus the D2H transfer of `metrics`/`autocorr`.
- The cold-start measurement has not been taken yet. If the inputs come from the Host, the full H2D transfer of `actual`/`predicted` will reduce the end-to-end gain.
- Very small windows are outside the scope of the current performance assessment. If small-window inspection is to be supported later, numerical stability and scheduling overhead tests should be added separately.

## Ascend C prototype reproduction commands

The current directory already contains the host, kernel, ACL smoke, and ACL benchmark entry points. They can be verified on node202 with the following commands:

    source /usr/local/Ascend/ascend-toolkit/set_env.sh
    cd prediction/ProcessControl/PIDModelFit/pid_windowed_residual_diagnostics
    cmake -S . -B build -DCMAKE_BUILD_TYPE=Release -DSOC_VERSION=Ascend910B3
    cmake --build build -j 2
    export LD_LIBRARY_PATH=$PWD/build:$PWD/build/lib:/usr/local/Ascend/ascend-toolkit/latest/lib64:${LD_LIBRARY_PATH:-}
    ./build/test_aclnn_pid_windowed_residual_diagnostics 3
    ./build/benchmark_pid_windowed_residual_diagnostics 3 128 4096 512 256 32 5 64
    ./build/benchmark_pid_windowed_residual_diagnostics 3 256 4096 512 256 32 3 64
    ./build/benchmark_pid_windowed_residual_diagnostics 3 512 8192 1024 512 64 2 64

Benchmark output fields:

- `cpu_64T_ms`: multi-threaded CPU reference.
- `npu_kernel_ms`: inputs already resident on the Device; measures only launch + stream sync.
- `npu_resident_e2e_ms`: inputs already resident on the Device; measures compute + output D2H.

Disclosure: parts of this article were generated with AI assistance (AIGC) and are provided for reference only.
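
The `(batch, window)` task split recommended in the preliminary assessment can be illustrated with a small NumPy sketch. This is not the repository's reference implementation. The function names (`num_windows`, `window_diagnostics`, `windowed_diagnostics`), the choice of five metric columns (mean, MAE, RMSE, Durbin-Watson, Ljung-Box Q), and the exact formulas are assumptions made for illustration; the note only states that the real output is `metrics[B,W,8]` and `autocorr[B,W,L]`. Each task reads its slice of `actual`/`predicted` directly and computes all statistics in one pass, so no `[B, W, window_size]` array is built.

```python
import numpy as np


def num_windows(n, window, stride):
    """Number of full windows that fit into a series of length n."""
    return (n - window) // stride + 1


def window_diagnostics(actual, predicted, start, window, max_lag):
    """Diagnostics for one (batch, window) task (hypothetical metric set).

    Returns (metrics, autocorr) where metrics holds
    [mean, MAE, RMSE, Durbin-Watson, Ljung-Box Q] and autocorr holds
    the autocorrelation at lags 1..max_lag.
    """
    if not 0 < max_lag < window:
        raise ValueError("max_lag must satisfy 0 < max_lag < window")
    # Residual for this window only; float64 accumulation for the reference.
    e = (actual[start:start + window].astype(np.float64)
         - predicted[start:start + window].astype(np.float64))
    mean = e.mean()
    mae = np.abs(e).mean()
    rmse = np.sqrt(np.mean(e * e))
    sse = float(np.dot(e, e))
    # Durbin-Watson: squared first differences over the squared residuals.
    dw = float(np.sum(np.diff(e) ** 2) / sse) if sse > 0 else 0.0
    # Autocorrelation of the mean-centered residual.
    c = e - mean
    denom = float(np.dot(c, c))
    autocorr = np.zeros(max_lag)
    if denom > 0:
        for k in range(1, max_lag + 1):
            autocorr[k - 1] = np.dot(c[k:], c[:-k]) / denom
    # Ljung-Box Q = n (n + 2) * sum_k rho_k^2 / (n - k).
    lags = np.arange(1, max_lag + 1)
    lb = window * (window + 2) * float(np.sum(autocorr ** 2 / (window - lags)))
    return np.array([mean, mae, rmse, dw, lb]), autocorr


def windowed_diagnostics(actual, predicted, window, stride, max_lag):
    """Loop over (batch, window) tasks, mirroring the proposed kernel split."""
    b, n = actual.shape
    w = num_windows(n, window, stride)
    metrics = np.zeros((b, w, 5))
    autocorr = np.zeros((b, w, max_lag))
    for i in range(b):
        for j in range(w):
            metrics[i, j], autocorr[i, j] = window_diagnostics(
                actual[i], predicted[i], j * stride, window, max_lag)
    return metrics, autocorr


# Tiny demo: one series, two overlapping windows of length 4.
actual = np.array([[1, 0, -1, 0, 1, 0]], dtype=np.float32)
predicted = np.zeros_like(actual)
metrics, autocorr = windowed_diagnostics(actual, predicted,
                                         window=4, stride=2, max_lag=2)
print(metrics[0, 0], autocorr[0, 0])
```

For the demo input, the first window has residual `[1, 0, -1, 0]`, which gives mean 0, MAE 0.5, RMSE of about 0.7071, Durbin-Watson 1.5, and autocorrelation `[0, -0.5]`. These values are consistent with the smoke output in this note, although the actual smoke input is not shown here. `num_windows` also reproduces the `windows=15` value of every benchmark configuration, for example `num_windows(2048, 256, 128)`.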