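# Transfer Learning in Practice: Tips for Fine-Tuning Pretrained Models

## 1. Transfer Learning Principles

```
Transfer learning strategies
├── Feature extraction: freeze the pretrained layers and train only the new classification head
├── Fine-tuning: unfreeze some of the pretrained layers and train with a low learning rate
└── Full fine-tuning: unfreeze all layers and train with a very low learning rate
```

## 2. Fine-Tuning for Image Classification

```python
import torch
import torch.nn as nn
from torchvision import models

# Load the pretrained model
# (torchvision >= 0.13; older versions use models.resnet50(pretrained=True))
model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)

# Stage 1: feature extraction. Freeze all layers
for param in model.parameters():
    param.requires_grad = False

# Replace the classification head (new layers are trainable by default)
num_classes = 10
model.fc = nn.Linear(model.fc.in_features, num_classes)

# Train only the classification head
optimizer = torch.optim.Adam(model.fc.parameters(), lr=0.001)

# Stage 2: fine-tuning. Unfreeze the last residual stage
for param in model.layer4.parameters():
    param.requires_grad = True

# Lower learning rate for the pretrained layers than for the new head
optimizer = torch.optim.Adam([
    {"params": model.layer4.parameters(), "lr": 0.0001},
    {"params": model.fc.parameters(), "lr": 0.001},
])
```

## 3. Fine-Tuning for NLP (BERT)

```python
from transformers import BertTokenizer, BertForSequenceClassification
from transformers import Trainer, TrainingArguments

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
model = BertForSequenceClassification.from_pretrained("bert-base-chinese", num_labels=2)

# Data processing
def tokenize(examples):
    return tokenizer(examples["text"], padding="max_length", truncation=True, max_length=128)

# Training
training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=3,
    per_device_train_batch_size=16,
    learning_rate=2e-5,
    warmup_steps=500,
    evaluation_strategy="epoch",  # renamed to eval_strategy in recent transformers releases
)

# train_dataset and eval_dataset are assumed to be datasets that have already
# been processed with tokenize(); evaluating every epoch requires an eval_dataset.
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
)
trainer.train()
```

## Summary

| Strategy | Data size | Learning rate | When to use |
| --- | --- | --- | --- |
| Feature extraction | Small | High | Very little data |
| Fine-tuning | Medium | Medium | Most common case |
| Full fine-tuning | Large | Low | Ample data |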
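
The BERT example sets `learning_rate=2e-5` together with `warmup_steps=500`, so the optimizer does not start at the full learning rate. The sketch below shows roughly how the rate evolves, assuming a linear warmup followed by a linear decay to zero, which is, as far as I know, the default schedule in `Trainer`. The function name `linear_warmup_lr` and the value `total_steps=3000` are hypothetical, chosen only for illustration (for example, 3 epochs of 1000 optimizer steps each); the real total depends on the dataset size and batch size.

```python
def linear_warmup_lr(step, base_lr=2e-5, warmup_steps=500, total_steps=3000):
    """Learning rate at a given optimizer step: linear warmup, then linear decay.

    total_steps is a hypothetical value; in practice it is
    (steps per epoch) * (number of epochs).
    """
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr over the warmup steps
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Print the learning rate at a few points in training
for step in (0, 250, 500, 1750, 3000):
    print(step, linear_warmup_lr(step))
```

Under these assumptions the rate reaches its peak of 2e-5 at step 500 and is back to half of that at step 1750. If the training run has fewer total steps than `warmup_steps`, the model never trains at the configured learning rate, so it is worth checking the warmup length against the dataset size.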