如何在服务器上免费使用claude code进行YOLO的改进
在服务器 Linux 环境下利用 Claude Code 将C3k2_RCSOSA等模块无缝集成到 Ultralytics YOLO 代码库中并修改配置文件完成注册。第一阶段初始化与环境检查首先在你的服务器终端中启动claude假设你已按上一篇教程配置好 GitHub Models 免费额度或本地 Ollama 模型。1. 启动 Claude Codecd /path/to/your/ultralytics # 进入你的 YOLO 项目根目录 claude2. 告诉 Claude Code 背景User (你):我正在做一个 YOLO 改进实验。我有一个包含C3k2_RCSOSA、RCSOSA、C2PSA_RCSOSA、RepVGG、SEBlock等模块的 Python 代码。任务请阅读当前项目的ultralytics/nn/modules/block.py文件了解模块的存放格式。将我提供的代码片段整理并追加到block.py文件的末尾。请确保必要的依赖如 Conv, Bottleneck能够被正确引用如果 block.py 中已存在则删除我代码中的重复定义只保留新增类。确保代码中的__all__列表被正确更新或合并。Claude Code 的执行逻辑它会先cat ultralytics/nn/modules/block.py。然后分析你提供的代码发现Conv和Bottleneck在block.py中通常已经存在。它会自动剔除你代码中的Conv和Bottleneck定义避免冲突然后将RepVGG、SEBlock、SR、RCSOSA、PSABlock_RCSOSA、C3k2_RCSOSA、C2PSA_RCSOSA等类追加写入文件。它会处理__all__的合并。第二阶段注册模块 (修改 tasks.py)模块写好了必须让 YOLO 的解析器认识它。User (你):接下来请修改ultralytics/nn/tasks.py文件。操作找到parse_model函数中用于映射字典字符串到类的部分通常是一个大的 if-elif 块或者字典。将以下模块名注册进去C3k2_RCSOSA,RCSOSA,C2PSA_RCSOSA,RepVGG。确保它们能够从当前文件或block.py中正确导入。Claude Code 的执行逻辑它会打开ultralytics/nn/tasks.py。定位到类似m self.module_map[n]或者具体的if n C2f: ...逻辑区域。插入注册代码例如elif m in {C3k2_RCSOSA, RCSOSA, C2PSA_RCSOSA, RepVGG}: m getattr(nn, m) if hasattr(nn, m) else globals()[m] # 或者更常见的 ultralytics 写法 # from ultralytics.nn.modules.block import *Claude Code 会自动处理import语句的补充。第三阶段修改配置文件 (yaml)这是最关键的一步我们需要替换 YOLO backbone 中的 C3k2 模块。User (你):现在我们需要创建一个新的配置文件。读取ultralytics/cfg/models/11/yolo11.yaml(或者是你正在用的 v8/v10 yaml)。将该文件复制为yolo11_rcsosa.yaml。在新文件中将所有的C3k2替换为C3k2_RCSOSA。检查C3k2_RCSOSA的参数列表。根据我之前提供的代码它支持attnTrue参数。请修改 yaml 中对应的 args例如将[-1, 1, C3k2, [True, ...]]修改为[-1, 1, C3k2_RCSOSA, [True, ..., True]]确保最后一个参数开启attn。保存文件。Claude Code 的执行逻辑使用cp命令复制文件。使用正则表达式或文本替换功能精准替换模块名称。分析C3k2_RCSOSA的__init__定义推断参数位置并修改 yaml 中的列表参数。第四阶段验证与调试 (Run Fix)做完修改后必须跑通代码。User (你):配置已完成。现在请执行验证步骤运行命令python yolo predict modelyolo11_rcsosa.yaml sourcehttps://ultralytics.com/images/bus.jpg。如果遇到KeyError或ModuleNotFoundError请分析报错信息自动修改tasks.py或__init__.py中的导入路径直到错误解决。如果遇到shape mismatch或维度错误请分析block.py中C3k2_RCSOSA的前向传播逻辑并修正 yaml 中的通道数配置。Claude Code 的执行逻辑它会运行命令。假设报错KeyError: C3k2_RCSOSA。Claude 会意识到注册失败回到tasks.py检查发现可能是字典更新的位置不对然后修正代码。假设报错AttributeError: module ultralytics.nn.modules has no attribute C3k2_RCSOSA。Claude 会检查ultralytics/nn/modules/__init__.py确保from .block import *包含了你的新模块或者显式添加它们。循环修正Claude Code 的优势在于它可以一直重复“修改-运行”的过程直到成功。第五阶段完整代码检查 (Fusion)因为你的代码中包含RepVGG这通常涉及“重参数化”。User (你):代码运行起来了但我注意到 RepVGG 有fuse_repvgg_block或类似的重参数化方法。请检查ultralytics/nn/modules/block.py中我刚才添加的RepVGG类。确认get_equivalent_kernel_bias方法是否存在。在ultralytics/utils/ops.py或模型导出相关的工具函数中添加对RepVGG层的 fuse 支持如果尚未支持。运行一次yolo export modelyolo11_rcsosa.yaml formatonnx测试导出是否成功。总结 Claude Code 的交互剧本为了方便你直接在终端操作这里有一段浓缩的Prompt 剧本你可以直接发给 Claude Code我需要修改本地的 Ultralytics YOLO 代码库以集成一个新的注意力模块。第一步读取 ul-tralytics/nn/modules/block.py。第二步我会提供一段包含 C3k2_RCSOSA, RCSOSA, C2PSA_RCSOSA, RepVGG, SEBlock 等类的代码。请将这些类添加到 block.py 中。注意如果 block.py 中已有 Conv 或 Bottleneck 定义请删除我提供代码中的重复定义只保留新增的类。第三步修改 ul-tralytics/nn/tasks.py确保 C3k2_RCSOSA, RCSOSA, C2PSA_RCSOSA 能被正确解析实例化。第四步复制当前目录下的 yaml 配置文件为 yolo11_custom.yaml并将其中的 C3k2 替换为 C3k2_RCSOSA记得根据代码定义添加 attnTrue 参数。第五步运行 yolo predict modelyolo11_custom.yaml。如果有报错请自动修复代码直到运行成功。-------------------------------------------------------------------------------------------------------------------------------每一步效果如下第二步骤代码如下def conv_bn(in_channels, out_channels, kernel_size, stride, padding, groups1): result nn.Sequential() result.add_module(conv, nn.Conv2d(in_channelsin_channels, out_channelsout_channels, kernel_sizekernel_size, stridestride, paddingpadding, groupsgroups, biasFalse)) result.add_module(bn, nn.BatchNorm2d(num_featuresout_channels)) return result class SEBlock(nn.Module): def __init__(self, input_channels): super(SEBlock, self).__init__() internal_neurons input_channels // 8 self.down nn.Conv2d(in_channelsinput_channels, out_channelsinternal_neurons, kernel_size1, stride1, biasTrue) self.up nn.Conv2d(in_channelsinternal_neurons, out_channelsinput_channels, kernel_size1, stride1, biasTrue) self.input_channels input_channels def forward(self, inputs): x F.avg_pool2d(inputs, kernel_sizeinputs.size(3)) x self.down(x) x F.relu(x) x self.up(x) x torch.sigmoid(x) x x.view(-1, self.input_channels, 1, 1) return inputs * x class RepVGG(nn.Module): def __init__(self, in_channels, out_channels, kernel_size3, stride1, padding1, dilation1, groups1, padding_modezeros, deployFalse, use_seFalse): super(RepVGG, self).__init__() self.deploy deploy self.groups groups self.in_channels in_channels padding_11 padding - kernel_size // 2 self.nonlinearity nn.SiLU() # self.nonlinearity nn.ReLU() if use_se: self.se SEBlock(out_channels) else: self.se nn.Identity() if deploy: self.rbr_reparam nn.Conv2d(in_channelsin_channels, out_channelsout_channels, kernel_sizekernel_size, stridestride, paddingpadding, dilationdilation, groupsgroups, biasTrue, padding_modepadding_mode) else: self.rbr_identity nn.BatchNorm2d( num_featuresin_channels) if out_channels in_channels and stride 1 else None self.rbr_dense conv_bn(in_channelsin_channels, out_channelsout_channels, kernel_sizekernel_size, stridestride, paddingpadding, groupsgroups) self.rbr_1x1 conv_bn(in_channelsin_channels, out_channelsout_channels, kernel_size1, stridestride, paddingpadding_11, groupsgroups) def get_equivalent_kernel_bias(self): kernel3x3, bias3x3 self._fuse_bn_tensor(self.rbr_dense) kernel1x1, bias1x1 self._fuse_bn_tensor(self.rbr_1x1) kernelid, biasid self._fuse_bn_tensor(self.rbr_identity) return kernel3x3 self._pad_1x1_to_3x3_tensor(kernel1x1) kernelid, bias3x3 bias1x1 biasid def _pad_1x1_to_3x3_tensor(self, kernel1x1): if kernel1x1 is None: return 0 else: return torch.nn.functional.pad(kernel1x1, [1, 1, 1, 1]) def _fuse_bn_tensor(self, branch): if branch is None: return 0, 0 if isinstance(branch, nn.Sequential): kernel branch.conv.weight running_mean branch.bn.running_mean running_var branch.bn.running_var gamma branch.bn.weight beta branch.bn.bias eps branch.bn.eps else: assert isinstance(branch, nn.BatchNorm2d) if not hasattr(self, id_tensor): input_dim self.in_channels // self.groups kernel_value np.zeros((self.in_channels, input_dim, 3, 3), dtypenp.float32) for i in range(self.in_channels): kernel_value[i, i % input_dim, 1, 1] 1 self.id_tensor torch.from_numpy(kernel_value).to(branch.weight.device) kernel self.id_tensor running_mean branch.running_mean running_var branch.running_var gamma branch.weight beta branch.bias eps branch.eps std (running_var eps).sqrt() t (gamma / std).reshape(-1, 1, 1, 1) return kernel * t, beta - running_mean * gamma / std def forward(self, inputs): if hasattr(self, rbr_reparam): return self.nonlinearity(self.se(self.rbr_reparam(inputs))) if self.rbr_identity is None: id_out 0 else: id_out self.rbr_identity(inputs) return self.nonlinearity(self.se(self.rbr_dense(inputs) self.rbr_1x1(inputs) id_out)) def fusevggforward(self, x): return self.nonlinearity(self.rbr_dense(x)) class SR(nn.Module): def __init__(self, c1, c2): super().__init__() c1_ int(c1 // 2) c2_ int(c2 // 2) self.repconv RepVGG(c1_, c2_) def forward(self, x): x1, x2 x.chunk(2, dim1) out torch.cat((x1, self.repconv(x2)), dim1) out self.channel_shuffle(out, 2) return out def channel_shuffle(self, x, groups): batchsize, num_channels, height, width x.data.size() channels_per_group num_channels // groups x x.view(batchsize, groups, channels_per_group, height, width) x torch.transpose(x, 1, 2).contiguous() x x.view(batchsize, -1, height, width) return x def make_divisible(x, divisor): if isinstance(divisor, torch.Tensor): divisor int(divisor.max()) # to int return math.ceil(x / divisor) * divisor class RCSOSA(nn.Module): def __init__(self, c1, c2, n1, seFalse, e0.5, head8): super().__init__() n_ n // 2 c_ make_divisible(int(c1 * e), head) # self.conv1 Conv(c1, c_) self.conv1 RepVGG(c1, c_) self.conv3 RepVGG(int(c_ * 3), c2) self.sr1 nn.Sequential(*[SR(c_, c_) for _ in range(n_)]) self.sr2 nn.Sequential(*[SR(c_, c_) for _ in range(n_)]) self.se None if se: self.se SEBlock(c2) def forward(self, x): x1 self.conv1(x) x2 self.sr1(x1) x3 self.sr2(x2) x torch.cat((x1, x2, x3), 1) return self.conv3(x) if self.se is None else self.se(self.conv3(x)) class Bottleneck(nn.Module): def __init__( self, c1: int, c2: int, shortcut: bool True, g: int 1, k: tuple[int, int] (3, 3), e: float 0.5 ): super().__init__() c_ int(c2 * e) # hidden channels self.cv1 Conv(c1, c_, k[0], 1) self.cv2 Conv(c_, c2, k[1], 1, gg) self.add shortcut and c1 c2 def forward(self, x: torch.Tensor) - torch.Tensor: return x self.cv2(self.cv1(x)) if self.add else self.cv2(self.cv1(x)) def autopad(k, pNone, d1): # kernel, padding, dilation if d 1: k d * (k - 1) 1 if isinstance(k, int) else [d * (x - 1) 1 for x in k] # actual kernel-size if p is None: p k // 2 if isinstance(k, int) else [x // 2 for x in k] # auto-pad return p class Conv(nn.Module): default_act nn.SiLU() # default activation def __init__(self, c1, c2, k1, s1, pNone, g1, d1, actTrue): super().__init__() self.conv nn.Conv2d(c1, c2, k, s, autopad(k, p, d), groupsg, dilationd, biasFalse) self.bn nn.BatchNorm2d(c2) self.act self.default_act if act is True else act if isinstance(act, nn.Module) else nn.Identity() def forward(self, x): return self.act(self.bn(self.conv(x))) def forward_fuse(self, x): return self.act(self.conv(x)) class C2f(nn.Module): def __init__(self, c1, c2, n1, shortcutFalse, g1, e0.5): super().__init__() self.c int(c2 * e) # hidden channels self.cv1 Conv(c1, 2 * self.c, 1, 1) self.cv2 Conv((2 n) * self.c, c2, 1) # optional actFReLU(c2) self.m nn.ModuleList(Bottleneck(self.c, self.c, shortcut, g, k((3, 3), (3, 3)), e1.0) for _ in range(n)) def forward(self, x): y list(self.cv1(x).chunk(2, 1)) y.extend(m(y[-1]) for m in self.m) return self.cv2(torch.cat(y, 1)) def forward_split(self, x): y list(self.cv1(x).split((self.c, self.c), 1)) y.extend(m(y[-1]) for m in self.m) return self.cv2(torch.cat(y, 1)) class C3(nn.Module): def __init__(self, c1, c2, n1, shortcutTrue, g1, e0.5): super().__init__() c_ int(c2 * e) # hidden channels self.cv1 Conv(c1, c_, 1, 1) self.cv2 Conv(c1, c_, 1, 1) self.cv3 Conv(2 * c_, c2, 1) # optional actFReLU(c2) self.m nn.Sequential(*(Bottleneck(c_, c_, shortcut, g, k((1, 1), (3, 3)), e1.0) for _ in range(n))) def forward(self, x): return self.cv3(torch.cat((self.m(self.cv1(x)), self.cv2(x)), 1)) class PSABlock_RCSOSA(nn.Module): def __init__(self, c: int, attn_ratio: float 0.5, num_heads: int 4, shortcut: bool True) - None: super().__init__() self.attn RCSOSA(c, c, seFalse, eattn_ratio, headnum_heads) self.ffn nn.Sequential(Conv(c, c * 2, 1), Conv(c * 2, c, 1, actFalse)) self.add shortcut def forward(self, x: torch.Tensor) - torch.Tensor: x x self.attn(x) if self.add else self.attn(x) x x self.ffn(x) if self.add else self.ffn(x) return x class C3k(C3): def __init__(self, c1: int, c2: int, n: int 1, shortcut: bool True, g: int 1, e: float 0.5, k: int 3): super().__init__(c1, c2, n, shortcut, g, e) c_ int(c2 * e) # hidden channels self.m nn.Sequential(*(Bottleneck(c_, c_, shortcut, g, k(k, k), e1.0) for _ in range(n))) class C3k2_RCSOSA(C2f): def __init__( self, c1: int, c2: int, n: int 1, c3k: bool False, e: float 0.5, attn: bool False, g: int 1, shortcut: bool True, ): super().__init__(c1, c2, n, shortcut, g, e) self.m nn.ModuleList( nn.Sequential( Bottleneck(self.c, self.c, shortcut, g), PSABlock_RCSOSA(self.c, attn_ratio0.5, num_headsmax(self.c // 64, 1)), ) if attn else C3k(self.c, self.c, 2, shortcut, g) if c3k else Bottleneck(self.c, self.c, shortcut, g) for _ in range(n) ) class C2PSA_RCSOSA(nn.Module): def __init__(self, c1: int, c2: int, n: int 1, e: float 0.5): super().__init__() assert c1 c2 self.c int(c1 * e) self.cv1 Conv(c1, 2 * self.c, 1, 1) self.cv2 Conv(2 * self.c, c1, 1) self.m nn.Sequential(*(PSABlock_RCSOSA(self.c, attn_ratio0.5, num_headsself.c // 64) for _ in range(n))) def forward(self, x: torch.Tensor) - torch.Tensor: a, b self.cv1(x).split((self.c, self.c), dim1) b self.m(b) return self.cv2(torch.cat((a, b), 1))