Model List

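Note: The model list for the analysis and migration tool is for reference only. Line numbers mentioned in the Remarks column are approximate; use the actual line numbers in your copy of the code.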

Table 1 PyTorch model list
No.	Model	Reference link to original training project code	Remarks
1	3D-Transformer-tr_spe	https://github.com/smiles724/Molformer/tree/f5cad25e037b0a63c7370c068a9c477f4004c5ea	-
2	3D-Transformer-tr_cpe
3	3D-Transformer-tr_full
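4	AFM	https://github.com/shenweichen/DeepCTR-Torch/tree/b4d8181e86c2165722fa9331bc16185832596232	Because none of the models except DIN has a training script, before migration copy ./examples/run_din.py, name the copy run_<model name>.py, and modify it as follows. Import the model structure, for example from deepctr_torch.models.ccpm import CCPM. Initialize the network with the arguments that the model structure requires, for example model = CCPM(feature_columns, feature_columns, device=device). Adjust the network inputs depending on whether the network supports dense_feature.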
5	AutoInt
6	CCPM
7	DCN
8	DeepFM
9	DIN
10	FiBiNET
11	MLR
12	NFM
13	ONN
14	PNN
15	WDL
16	xDeepFM
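17	BERT	https://github.com/codertimo/BERT-pytorch/tree/d10dc4f9d5a6f2ca74380f62039526eb7277c671	After migration, the project must be installed before it can be used. Installation steps: remove the torch entry from requirements.txt, then run python3 setup.py install. See the repository README for usage details.
18	BEiT	https://github.com/microsoft/unilm/tree/9cbfb3e40eedad33a8d2f1f15c4a1e26fa50a5b1	Before migration, do the following. After downloading the model source code, keep only the beit folder. Download version 0.3.3 of the open-source pytorch-image-models library and move its timm folder into the beit folder. After migration, because PyTorch model weights cannot be migrated to MindSpore model weights, comment out lines 550 and 560 of utils.py.
19	BiT-M-R101x1	https://github.com/google-research/big_transfer/tree/140de6e704fd8d61f3e5ea20ffde130b7d5fd065	If the cifar-10-bin dataset is used, it is available for download (https://www.cs.toronto.edu/~kriz/cifar-10-binary.tar.gz). If the cifar-100-bin dataset is used, it is available for download (https://www.cs.toronto.edu/~kriz/cifar-100-binary.tar.gz).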
20	BiT-M-R101x3
21	BiT-M-R152x2
22	BiT-M-R152x4
23	BiT-M-R50x1
24	BiT-M-R50x3
25	BiT-S-R101x1
26	BiT-S-R101x3
27	BiT-S-R152x2
28	BiT-S-R152x4
29	BiT-S-R50x1
30	BiT-S-R50x3
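31	CenterNet-ResNet50	https://github.com/bubbliiiing/centernet-pytorch/tree/91b63b9d0fef2e249fbddee8266c79377f0c7946	After migration, prepare the dataset as described in the repository README. Because no trained MindSpore model weights are available, set model_path in train.py to an empty value.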
32	CenterNet-HourglassNet
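33	Conformer-tiny	https://github.com/pengzhiliang/Conformer/tree/815aaad3ef5dbdfcf1e11368891416c2d7478cb1	Before migration, place the timm library (version 0.3.2 recommended) in the root directory of the original code. Due to framework limitations, --repeated-aug is currently not supported, so train with the --no-repeated-aug argument.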
34	Conformer-small
35	Conformer-base
36	DeiT-tiny
37	DeiT-small
38	DeiT-base
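39	CvT-13	https://github.com/microsoft/CvT/tree/f851e681966390779b71380d2600b52360ff4fe1	Before migration, place the timm library (version 0.3.2 recommended) and the einops library in the root directory of the original code. Also before migration, edit ./run.sh: change the training launch command in train() (lines 4 to 10) to python3 tools/train.py ${EXTRA_ARGS}, and change the test launch command in test() (lines 15 to 21) to python3 tools/test.py ${EXTRA_ARGS}.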
40	CvT-21
41	CvT-W24
42	albert-base-v1	https://github.com/huggingface/transformers/tree/49cd736a288a315d741e5c337790effa4c9fa689	Before migration, move the template files out of the original repository; these files are not really Python files even though they use the .py suffix. mv templates ../ After migration, make the following changes. To avoid a list out of range error, remove the index from the statement d["torch_dtype"] = x2ms_adapter.tensor_api.split(str(d["torch_dtype"]), ".")[1] in src/transformers/configuration_utils.py. After: d["torch_dtype"] = x2ms_adapter.tensor_api.split(str(d["torch_dtype"]), ".") Likewise remove the index from the statement model_to_save.config.torch_dtype = x2ms_adapter.tensor_api.split(str(dtype), ".")[1] in src/transformers/modeling_utils.py. After: model_to_save.config.torch_dtype = x2ms_adapter.tensor_api.split(str(dtype), ".") In ./src/transformers/utils/import_utils.py, change the return value of is_torch_available() to True so that the original PyTorch code path is taken. Before: def is_torch_available(): return _torch_available After: def is_torch_available(): return True
43	albert-large-v1
44	albert-xlarge-v1
45	albert-xxlarge-v1
46	albert-Text classification
47	albert-TokenClassification
48	albert-QA
49	albert-MultipleChoice
50	bert-base-uncased
51	bert-large-uncased
52	bert-base-QA
53	bert-base-Text classification
54	bert-base-Multiple Choice
55	bert-base-token-classification
56	distilbert-base-uncased
57	distilbert-base-QA
58	distilbert-base-Text classification
59	roberta-base
60	roberta-large
61	roberta-base-Multiple Choice
62	roberta-base-Text classification
63	roberta-base-token-classification
64	roberta-base-QA
65	xlm-mlm-en-2048
66	xlm-mlm-ende-1024
67	xlm-mlm-enro-1024
68	xlm-clm-enfr-1024
69	xlm-Text classification
70	xlm-Roberta-base
71	xlm-roberta-large
72	xlm-roberta-Text classification
73	xlm-roberta-token-classification
74	xlm-roberta-QA
75	xlnet-base-cased
76	xlnet-large-cased
77	XLNet-base-Text classification
78	XLNet-base-token-classification
79	XLNet-base-Multiple Choice
80	XLNet-base-QA
81	DistilRoBERTa		After migration, modify the definition of is_torch_available() in ./src/transformers/utils/import_utils.py. Before: def is_torch_available(): return _torch_available After: def is_torch_available(): return True
82	Transformer-XL		After migration, make the following changes. Modify the definition of is_torch_available() in ./src/transformers/utils/import_utils.py. Before: def is_torch_available(): return _torch_available After: def is_torch_available(): return True A MindSpore dtype converted to a string has a different structure from a torch dtype, so modify ./src/transformers/modeling_utils.py as follows. Before: model_to_save.config.torch_dtype = x2ms_adapter.tensor_api.split(str(dtype), ".")[1] After: model_to_save.config.torch_dtype = str(dtype)
83	EfficientNet-B0	https://github.com/lukemelas/EfficientNet-PyTorch/tree/7e8b0d312162f335785fb5dcfa1df29a75a1783a	-
84	EfficientNet-B1
85	EfficientNet-B2
86	EfficientNet-B3
87	EfficientNet-B4
88	EfficientNet-B5
89	EfficientNet-B6
90	EfficientNet-B7
91	EfficientNet-B8
92	egfr-att	https://github.com/lehgtrung/egfr-att/tree/0666ee90532b1b1a7a2a179f8fbf10af1fdf862f	-
93	FasterRCNN	https://github.com/AlphaJia/pytorch-faster-rcnn/tree/943ef668facaacf77a4822fe79331343a6ebca2d	Supported backbone networks: mobilenet, resnet-fpn, vgg16, HRNet. Before migration, make the following changes. Because the MultiScaleRoIAlign operator from torchvision 0.9.0 is used, copy the file that contains it, torchvision/ops/poolers.py, to the root directory, and change the places in ./utils/train_utils.py and ./utils/faster_rcnn_utils.py that use the operator to the following import. from poolers import MultiScaleRoIAlign Because MindSpore has no API corresponding to torch.utils.data.Subset, comment out the code in ./utils/coco_utils.py that involves this API, for example: # if isinstance(dataset, torch.utils.data.Subset): # dataset = dataset.dataset After migration, make the following change. Because the BitwiseOr operator in MindSpore does not support UINT8 input, modify the following expression in ./utils/roi_header_util.py. Before: pos_inds_img | neg_inds_img After: pos_inds_img.astype(mindspore.int32) | neg_inds_img.astype(mindspore.int32)
94	FCOS-ResNet50	https://github.com/zhenghao977/FCOS-PyTorch-37.2AP/tree/2bfa4b6ca57358f52f7bc7b44f506608e99894e6	After migration, make the following changes. The VOC dataset is used; change the dataset path on line 39 of ./train_voc.py to the actual path. Because MindSpore has no corresponding scatter operator, modify ./model/loss.py as follows. Replace lines 125 and 126 with the following code: min_indices = mindspore.ops.ArgMinWithValue(-1)(areas.reshape(-1, areas.shape[-1])) tmp = np.arange(0, batch_size * h_mul_w).astype(np.int32) indices = mindspore.ops.Concat(-1)((mindspore.ops.ExpandDims()(mindspore.Tensor(tmp), -1), mindspore.ops.ExpandDims()(min_indices[0], -1))) reg_targets = mindspore.ops.GatherNd()(ltrb_off.reshape(-1, m, 4), indices) Replace line 130 with the following code: cls_targets = mindspore.ops.GatherNd()(classes.reshape(-1, m, 1), indices) Import the required package on line 7 of the file: import numpy as np Because no MindSpore pretrained model is available, set pretrained, freeze_stage_1, and freeze_bn in ./model/config.py to False.
95	FCOS-ResNet101
96	MGN-strong	https://git.openi.org.cn/Smart_City_Model_Zoo/mgn-strong	Before migration, make the following changes. The model depends on torchvision, so copy the models/ folder from the torchvision/ directory to ./mgn-strong/model/, and change the content of ./mgn-strong/model/models/__init__.py to: from .resnet import * Modify the import statement on line 7 of ./mgn-strong/model/mgn.py. Before: from torchvision.models.resnet import resnet50, Bottleneck, resnet101 After: from .models.resnet import resnet50, Bottleneck, resnet101 Modify the addmm_ call on line 83 of ./mgn-strong/loss/triplet.py. Before: dist.addmm_(1, -2, inputs, inputs.t()) After: dist.addmm_(inputs, inputs.t(), beta=1, alpha=-2) The model must be run on MindSpore 1.7.0.
97	MobileNetV1 SSD	https://github.com/qfgaohao/pytorch-ssd/tree/f61ab424d09bf3d4bb3925693579ac0a92541b0d	MindSpore does not yet support using Tensors during dataset loading or slicing a ModuleList inside a model. Therefore, before migration, modify ./vision/ssd/ssd.py in the original project folder as follows. Change the for loop on line 57 to the following loop. for idx in range(start_layer_index, end_layer_index): layer = self.base_net[idx] Insert "center_form_priors = center_form_priors.asnumpy()" before the statement "self.center_form_priors = center_form_priors" on line 143.
98	MobileNetV1 SSD-Lite
99	MobileNetV2 SSD-Lite
100	MobileNetV3-Large SSD-Lite
101	MobileNetV3-Small SSD-Lite
102	SqueezeNet SSD-Lite
103	VGG16 SSD
104	SqueezeNet	https://github.com/weiaicunzai/pytorch-cifar100/tree/2149cb57f517c6e5fa7262f958652227225d125b	The cifar-100-bin dataset is used (download: https://www.cs.toronto.edu/~kriz/cifar-100-binary.tar.gz). Change the dataset path in ./utils.py to the actual path.
105	InceptionV3
106	InceptionV4
107	InceptionResNetV2
108	Xception
109	Attention56
110	StochasticDepth18
111	StochasticDepth34
112	StochasticDepth50
113	StochasticDepth101
114	VGG11
115	VGG13
116	VGG16
117	DenseNet161
118	DenseNet169
119	DenseNet201
120	PreActResNet34
121	PreActResNet50
122	PreActResNet101
123	PreActResNet152
124	ResNeXt152
125	SEResNet34
126	SEResNet50
127	SEResNet101
128	VGG19	https://github.com/kuangliu/pytorch-cifar/tree/49b7aa97b0c12fe0d4054e670403a16b6b834ddd	The cifar-10-bin dataset is required (download: https://www.cs.toronto.edu/~kriz/cifar-10-binary.tar.gz). To switch between networks, change the network name used in ./main.py.
129	PreActResNet18
130	DenseNet121
131	ResNeXt29_2x64d
132	MobileNet
133	MobileNetV2
134	SENet18
135	ShuffleNetG2
136	GoogleNet
137	DPN92
138	RetinaNet-ResNet34	https://github.com/yhenon/pytorch-retinanet/tree/0348a9d57b279e3b5b235461b472d37da5feec3d	Because the original repository contains code that checks the torch version and loads torch models, modify the original project script ./train.py before migration: in the backbone model selection on lines 77 to 88, set every pretrained argument to False, and delete line 18, assert torch.__version__.split('.')[0] == '1' After migration, because of MindSpore limitations on backpropagation and dataset loading, replace lines 25 and 26 of ./retinanet/losses.py with the following code. def print_grad_fn(cell_id, grad_input, grad_output): pass class FocalLoss(mindspore.nn.Cell): def __init__(self): super(FocalLoss, self).__init__() self.register_backward_hook(print_grad_fn)
139	RetinaNet-ResNet50
140	Res2Net	https://github.com/Res2Net/Res2Net-ImageNet-Training/tree/d77c16ff111522c64e918900f100699acc62f706	Migration of torchvision.models interfaces is not yet supported, so modify the original project as follows. Create the directory ./res2net_pami/models. In ./res2net_pami/main.py, change import torchvision.models as models to import models.
141	ResNet18	https://github.com/pytorch/examples/tree/41b035f2f8faede544174cfd82960b7b407723eb/imagenet	Migration of torchvision.models interfaces is not yet supported, so modify the original project as follows. Create the directory ./imagenet/models. Copy torchvision/models/resnet.py from the torchvision library (version 0.6.0) to ./imagenet/models and delete the statement from .utils import load_state_dict_from_url. Create the file ./imagenet/models/__init__.py with the content: from .resnet import * In ./main.py, change import torchvision.models as models to import models.
142	ResNet34
143	ResNet50
144	ResNet101
145	ResNet152
146	ResNeXt-50 (32x4d)
147	ResNeXt-101 (32x8d)
148	Wide ResNet-50-2
149	Wide ResNet-101-2
150	sparse_rcnnv1-resnet50	https://github.com/liangheming/sparse_rcnnv1/tree/65f54808f43c34639085b01f7ebc839a3335a386	After migration, make the following changes. In ./x2ms_adapter/nn.py, manually set the values of batch_size, src_seq_length, and tgt_seq_length in the initialization function of the MultiheadAttention class. In ./nets/common.py, change the statement if x.requires_grad: to if True:. In ./losses/sparse_rcnn_loss.py, convert the item[i] argument of the linear_sum_assignment function to numpy. indices = linear_sum_assignment(item[i].asnumpy()) Make two changes to ./datasets/coco.py. In the __getitem__ definition, change the return statement return box_info to return box_info.img,box_info.labels,box_info.boxes. Modify the for loop in the collect_fn definition. image,labels,boxes = item # added img = (image[:, :, ::-1] / 255.0 - np.array(rgb_mean)) / np.array(rgb_std) # item.img changed to image target = x2ms_np.concatenate([labels[:, None], boxes], axis=-1) # item.labels changed to labels, item.boxes changed to boxes
151	sparse_rcnnv1-resnet101
152	ShuffleNetV2	https://github.com/megvii-model/ShuffleNet-Series/tree/aa91feb71b01f28d0b8da3533d20a3edb11b1810	-
153	ShuffleNetV2+		-
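154	SMSD	https://git.openi.org.cn/PCLNLP/Sarcasm_Detection/src/commit/54bae1f2306a4d730551b4508ef502cfdbe79918	Before migration, do the following. Create a state_dict folder in the ./SMSD/ directory. Add the following statement to ./SMSD/models/__init__.py to import the SMSD_bi model. from models.SMSD_bi import SMSD_bi When running the migrated code, the --repeat argument controls the number of training repetitions (SMSD_bi model as an example): python3 train.py --model_name SMSD_bi --repeat 1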
155	SMSD_bi
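156	Swin-Transformer	https://github.com/microsoft/Swin-Transformer/tree/5d2aede42b4b12cb0e7a2448b58820aeda604426	Before migration, place the timm library code in the root directory of the original code; timm version 0.4.12 is recommended. The --cfg argument currently supports only the following four configuration files: swin_tiny_patch4_window7_224.yaml swin_tiny_c24_patch4_window8_256.yaml swin_small_patch4_window7_224.yaml swin_base_patch4_window7_224.yaml
157	Transformer	https://github.com/SamLynnEvans/Transformer/tree/e06ae2810f119c75aa34585442872026875e6462	The torchtext library that the scripts in this repository depend on must also be migrated. Note the following. Copy the migrated torchtext_x2ms into the script folder. Rename torchtext_x2ms to torchtext to ensure that the migrated torchtext is the one being called. torchtext version 0.6.0 is recommended.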
158	UNet	https://github.com/milesial/Pytorch-UNet/tree/e1a69e7c6ce18edd47271b01e4aabc03b436753d	-
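159	RCNN-Unet	https://github.com/bigmb/Unet-Segmentation-Pytorch-Nest-of-Unets/tree/c050f5eab6778cba6dcd8f8a68b74c9e62a698c8	Before migration, do the following. Because MindSpore differentiation imposes syntax restrictions, re-indent the comments on lines 249 and 252 of ./pytorch_run.py so that they align to a multiple of 4 spaces. The model requires input image sizes to be multiples of 16, so if the dataset images do not meet this requirement, uncomment lines 121, 122, 505, and 506 of ./pytorch_run.py to resize and crop the images to multiples of 16. If the dataset label images have 1 channel, append .convert('RGB') to the end of line 293 of ./pytorch_run.py to convert them to 3 channels. Because using ModuleList in MindSpore changes the weight names of sub-layers, change torch.nn.ModuleList on line 350 of ./pytorch_run.py to list, so that a saved checkpoint file can be reloaded.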
160	Attention Unet
161	RCNN-Attention Unet
162	Nested Unet
163	ViT-B_16	https://github.com/jeonsworld/ViT-pytorch/tree/460a162767de1722a014ed2261463dbbc01196b6	The cifar-10-bin dataset is required (download: https://www.cs.toronto.edu/~kriz/cifar-10-binary.tar.gz).
164	ViT-B_32
165	ViT-L_16
166	ViT-L_32
167	ViT-H_14
168	R50-ViT-B_16
169	YOLOv3	https://github.com/ultralytics/yolov3/tree/ae37b2daa74c599d640a7b9698eeafd64265f999	After migration, make the following changes. In ./models/yolo.py: class Detect(mindspore.nn.Cell): stride = None # delete this line onnx_dynamic = False def __init__(self, …): … self.stride = None # add this line In the build_targets function of ./utils/loss.py, before: gij = x2ms_adapter.tensor_api.long((…)) gi, gj = gij.T … tbox.append(…) After: gij = x2ms_adapter.tensor_api.long((…)).T gi, gj = gij … gij = gij.T tbox.append(…) In the run function of ./val.py: delete "path, shape = Path(paths[si]), shapes[si][0]", delete the calls to the scale_coords function, and delete "callbacks.run('on_val_image_end', …)". In the model configuration file {model_name}.yaml under ./models/, replace every nn. with x2ms_adapter.nn. throughout. For multi-device runs, change the rect argument of "val_loader = create_dataloader(…)" in train.py to False.
170	YOLOv3-Tiny
171	YOLOv3-SPP
172	YOLOv4	https://github.com/WongKinYiu/PyTorch_YOLOv4/tree/eb5f1663ed0743660b8aa749a43f35f505baa325	After migration, make the following changes. Modify the create_module function in ./model/models.py. Before: module_list[j][0].bias = mindspore.Parameter(bias_, …) After: module_list[j][0].bias = mindspore.Parameter(bias.reshape(bias_.shape), …) Modify ./utils/datasets.py. Before: if os.path.isfile(cache_path): After: if False: In the build_targets function of ./utils/loss.py, before: gij = x2ms_adapter.tensor_api.long((…)) gi, gj = gij.T … tbox.append(…) After: gij = x2ms_adapter.tensor_api.long((…)).T gi, gj = gij … gij = gij.T tbox.append(…) In ./train.py, change if '.bias' in k: to if '.bias' in k or '.beta' in k: and change the string 'Conv2d.weight' to '.weight'. For multi-device runs, change the rect argument of "testloader = create_dataloader(…)" in ./train.py to False.
173	YOLOv4-tiny
174	YOLOv4-pacsp
175	YOLOv4-paspp
176	YOLOv4-csp-leaky
177	YOLOv5l	https://github.com/ultralytics/yolov5/tree/8c420c4c1fb3b83ef0e60749d46bcc2ec9967fc5	After migration, make the following changes. In ./models/yolo.py: class Detect(mindspore.nn.Cell): stride = None # delete this line … def __init__(self, …): … self.stride = None # add this line In the build_targets function of ./utils/loss.py, before: gij = x2ms_adapter.tensor_api.long((…)) gi, gj = gij.T … tbox.append(…) After: gij = x2ms_adapter.tensor_api.long((…)).T gi, gj = gij … gij = gij.T tbox.append(…) In the run function of ./val.py: delete "path, shape = Path(paths[si]), shapes[si][0]", delete the calls to the scale_coords function, and delete "callbacks.run('on_val_image_end', …)". In the model configuration file {model_name}.yaml under ./models/, replace every nn. with x2ms_adapter.nn. throughout. For multi-device runs, change the rect argument of "val_loader = create_dataloader(…)" in ./train.py to False.
178	YOLOv5m
179	YOLOv5n
180	YOLOv5s
181	YOLOv5x
182	YOLOX	https://github.com/bubbliiiing/yolox-pytorch/tree/1448e849ac6cdd7d1cec395e30410f49a83d44ec	After migration, make the following changes. Comment out line 341 of ./train.py. #'adam' : optim_register.adam(pg0, Init_lr_fit, betas = (momentum, 0.999)) Before training, run the following command to prevent HCCL timeouts. export HCCL_CONNECT_TIMEOUT=3000
183	AAGCN-ABSA	https://git.openi.org.cn/PCLNLP/SentimentAnalysisNLP/src/commit/7cf38449dad742363053c4cc380ebfe33292184d	-
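184	CAER-ABSA		Depends on the third-party library pytorch-pretrained-bert; download it and copy its subdirectory pytorch_pretrained_bert into the SentimentAnalysisNLP/ directory. Move the BertLayerNorm class definition on line 158 of ./SentimentAnalysisNLP/pytorch_pretrained_bert/modeling.py outside the try-except block.
185	GIN-ABSA		Depends on the third-party library pytorch-pretrained-bert; download it and copy its subdirectory pytorch_pretrained_bert into the SentimentAnalysisNLP/ directory. Because MindSpore does not support creating Tensors during data processing, remove the tensor creation from dataset initialization in ./GIN-ABSA/data_utils.py, namely the torch.tensor() operations on lines 147 and 234.
186	Scon-ABSA		Depends on the pretrained weights of bert-base-uncased from huggingface: download pytorch_model.bin, convert it to the MindSpore-format pytorch_model.ckpt, and load the converted model weights in the script. Also depends on the third-party library pytorch-pretrained-bert; download it and copy its subdirectory pytorch_pretrained_bert into the SentimentAnalysisNLP/ directory. Move the BertLayerNorm class definition on line 158 of ./pytorch_pretrained_bert/modeling.py outside the try-except block.
187	Trans-ECE		Depends on the pretrained weights of bert-base-chinese from huggingface; download them to bert-base-chinese under the current directory and convert the model weights pytorch_model.bin to the MindSpore-format pytorch_model.ckpt. Because of defects in the original code, wrap the filter calls on lines 48 and 49 of ./Trans-ECE/Run.py in list, and delete the redundant trans_optimizer argument on line 54. Because custom optimizers are currently not supported, change BertAdam on line 55 of ./Trans-ECE/Run.py to the optim.Adam optimizer.
188	PyramidNet 101	https://github.com/dyhan0920/PyramidNet-PyTorch/tree/5a0b32f43d79024a0d9cd2d1851f07e6355daea2	Before migration, make the following changes. Because the original repository code is restricted to specific Python and PyTorch versions, adapt it according to https://github.com/dyhan0920/PyramidNet-PyTorch/issues/5 first. Because the migrated code does not need the torchvision module, comment out lines 23 to 25 of train.py. #model_names = sorted(name for name in models.__dict__ # if name.islower() and not name.startswith("__") # and callable(models.__dict__[name]))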
189	PyramidNet 164 bottleneck
190	PyramidNet 200 bottleneck
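
Several remarks in Table 1 (rows 42 to 82) drop the [1] index after splitting the dtype string, to avoid a list out of range error. The following is a minimal sketch of why that index can fail. It uses plain strings instead of real dtype objects: "Float32" is only an assumed stand-in for a dtype string that has no "." separator, and dtype_suffix and index_after_split are hypothetical helpers for illustration, not part of x2ms_adapter.

```python
# Plain-string illustration; no PyTorch or MindSpore import is needed.
torch_style = "torch.float32"  # how str(torch.float32) looks in PyTorch
ms_style = "Float32"           # ASSUMPTION: a dtype string without a "." prefix


def index_after_split(dtype_str):
    """Mimics the original pattern split(str(dtype), ".")[1]."""
    return dtype_str.split(".")[1]


def dtype_suffix(dtype_str):
    """Hypothetical helper: text after the first ".", or the whole string if none."""
    parts = dtype_str.split(".")
    return parts[1] if len(parts) > 1 else parts[0]


print(index_after_split(torch_style))  # dotted form: a second element exists
try:
    index_after_split(ms_style)
except IndexError as err:
    print("IndexError:", err)          # undotted form: only one element
print(dtype_suffix(torch_style), dtype_suffix(ms_style))
```

The fixes listed in the table either drop the index or store str(dtype) directly; the tolerant helper above is just one alternative that accepts both string forms.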

Table 2 TensorFlow 2 model list
No.	Model	Reference link to original training project code	Remarks
1	ALBERT_base_v2	https://github.com/huggingface/transformers/tree/49cd736a28	Before migration, move the template files out of the original repository; these files are not really Python files even though they use the .py suffix. mv templates ../ After migration, make the following changes. In ./examples/tensorflow/language-modeling/run_mlm.py, add the following import statement: from x2ms_adapter.keras.losses import SparseCategoricalCrossentropy Change the value of the return_tensors argument of DataCollatorForLanguageModeling from "tf" to "np". Before: data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=data_args.mlm_probability, return_tensors="tf") After: data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=data_args.mlm_probability, return_tensors="np") Modify the arguments of the model.compile call. Before: model.compile(optimizer=optimizer) After: model.compile(optimizer=optimizer, loss=SparseCategoricalCrossentropy(True)) Modify the dtype_byte_size method in ./src/transformers/modeling_tf_utils.py. Before: bit_search = re.search("[^\d](\d+)$", dtype.name) After: bit_search = re.search("[^\d](\d+)$", str(dtype))
2	ALBERT_large_v2
3	ALBERT_xlarge_v2
4	ALBERT_xxlarge_v2
5	ALBERT_base_v1
6	ALBERT_large_v1
7	ALBERT_xlarge_v1
8	ALBERT_xxlarge_v1
9	roberta-base
10	roberta-large
11	RBT6
12	RBT4
13	RBTL3
14	RBT3
15	DenseNet_121	https://github.com/calmisential/Basic_CNNs_TensorFlow2/tree/f063c84451f12e904f9c91c51278be52afccb0c2	Configure epoch, batch_size, the dataset path, and so on in ./configuration.py as needed. Before migration, comment out the regnet.RegNet line in ./models/__init__.py. #regnet.RegNet()
16	DenseNet_169
17	EfficientNet_B0
18	EfficientNet_B1
19	Inception_V4
20	MobileNet_V1
21	MobileNet_V2
22	MobileNet_V3_Large
23	MobileNet_V3_Small
24	ResNet_101
25	ResNet_152
26	ResNet_18
27	ResNet_34
28	ResNet_50
29	ResNext_101
30	ResNext_50
31	Shufflenet_V2_x0_5
32	Shufflenet_V2_x1_0
33	AFM	https://github.com/ZiyaoGeng/Recommender-System-with-TF2.0/tree/1d2aa5bf551873d5626539c196705db46d55c7b6	Every network folder depends on the ./data_process/ directory. Either migrate the Recommender-System-with-TF2.0/ directory directly, or copy ./data_process/ into the network folder before migrating.
34	Caser
35	DCN
36	Deep_Crossing
37	DeepFM
38	DNN
39	FFM
40	FM
41	MF
42	NFM
43	PNN
44	WDL
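45	BiLSTM-CRF	https://github.com/kangyishuai/BiLSTM-CRF-NER/tree/84bde29105b13cd8128bb0ae5d043c4712a756cb	Training must be run on MindSpore 1.7. Download the complete dataset as described in the README.md of the original training project, decompress it, and copy its files to ./data. After migration, depending on how training behaves, reduce the batch_size, hidden_num, and embedding_size values in ./main.py as appropriate, for example: params = { "maxlen": 128, "batch_size": 140, "hidden_num": 64, "embedding_size": 64, "lr": 1e-3, "epochs": 10 }
46	FCN	https://github.com/YunYang1994/TensorFlow2.0-Examples/tree/299fd6689f242d0f647a96b8844e86325e9fcb46/5-Image_Segmentation/FCN	The scipy.misc.imread method used in ./parser_voc.py is a legacy API from versions before scipy 1.2.0, whereas the lowest scipy version compatible with MindSpore is 1.5.2. Use imageio.imread instead, as recommended in scipy's official deprecation warning.
47	GoogleNet	https://github.com/marload/ConvNets-TensorFlow2/tree/29411e941c4aa72309bdb53c67a6a2fb8db57589	For the migrated load_data() interface, either specify the dataset path through the data_dir argument or place the dataset in the default path ~/x2ms_datasets/cifar10/cifar-10-batches-py.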
48	SqueezeNet
49	Unet	https://github.com/YunYang1994/TensorFlow2.0-Examples/tree/299fd6689f242d0f647a96b8844e86325e9fcb46/5-Image_Segmentation/Unet	Use the Membrane dataset, which can be obtained through the README.md of the training project.
50	Vit	https://github.com/tuvovan/Vision_Transformer_Keras/tree/6a1b0959a2f5923b1741335aca5bc2f8dcc7c1f9	For the migrated load() interface, either specify the dataset path through the data_dir argument or place the dataset in the default path ~/x2ms_datasets/cifar10/cifar-10-batches-bin. Delete the trailing comma in "early_stop = tf.keras.callbacks.EarlyStopping(patience=10)," in train.py so that the callback object is a single instance rather than a tuple.
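
The remark for row 50 depends on a Python detail: a trailing comma after an expression turns the assigned value into a one-element tuple. The following small sketch shows the effect; the EarlyStopping class here is a hypothetical stand-in defined only for illustration, not the real tf.keras.callbacks.EarlyStopping.

```python
class EarlyStopping:
    """Hypothetical stand-in for a callback class (illustration only)."""

    def __init__(self, patience):
        self.patience = patience


with_comma = EarlyStopping(patience=10),    # trailing comma: a 1-element tuple
without_comma = EarlyStopping(patience=10)  # no comma: the instance itself

print(type(with_comma).__name__)     # tuple
print(type(without_comma).__name__)  # EarlyStopping
```

Code that expects a single callback object would therefore receive a tuple when the comma is left in place, which is why the remark asks for it to be removed.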

Table 3 TensorFlow 1 model list
No.	Model	Reference link to original training project code	Remarks
1	ALBERT-base-v2	https://github.com/google-research/ALBERT/tree/a36e095d3066934a30c7e2a816b2eeb3480e9b87	Before migration, make the following changes. In ./classifier_utils.py, change the statement if t.dtype == tf.int64: to if t.dtype == "int64": In ./optimization.py, change optimizer = AdamWeightDecayOptimizer( to optimizer = tf.keras.optimizers.Adam( and change train_op = tf.group(train_op, [global_step.assign(new_global_step)]) to train_op = tf.group(train_op, global_step) When the Glue-MNLI dataset is used, generate the record dataset yourself by following the README.
2	ALBERT-large-v2
3	ALBERT-xlarge-v2
4	ALBERT-xxlarge-v2
5	Attention-Based Bidirectional RNN	https://github.com/dongjun-Lee/text-classification-models-tf/tree/768ea13547104f56786c52f0c6eb99912c816a09	Because the dropout operator in MindSpore already handles the training argument, set the self.keep_prob attribute in the model definition file directly to 0.5; the where-based check is not needed.
6	Character-level CNN
7	RCNN
8	Very Deep CNN
9	Word-level Bidirectional RNN
10	Word-level CNN
11	BERT-Tiny	https://github.com/google-research/bert/tree/eedf5716ce1268e56f0a50264a88cafad334ac61	Before migration, make the following changes. In ./run_classifier.py: delete ".value" from hidden_size = output_layer.shape[-1].value (line 592 of the source code). hidden_size = output_layer.shape[-1] Comment out the following code (lines 869 and 870 of the source code): #file_based_convert_examples_to_features( # train_examples, label_list, FLAGS.max_seq_length, tokenizer, train_file) In the _decode_record function, change line 529 of the source code from if t.dtype == tf.int64: to if t.dtype == 'int64': In ./optimization.py: replace the instantiation of AdamWeightDecayOptimizer with optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate) #optimizer = AdamWeightDecayOptimizer( #… #) optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate) Before: train_op = tf.group(train_op, [global_step.assign(new_global_step)]) After: train_op = tf.group(train_op, global_step)
12	BERT-Mini
13	BERT-Small
14	BERT-Medium
15	BERT-Base
16	RBT6	https://github.com/bojone/bert4keras/tree/9c1c916def4d515a046c414	Before migration, make the following changes. In ./examples/task_language_model.py, set the values of checkpoint_path, config_path, dict_path, the input training data, and batch_size, and change txt = open(txt, encoding='gbk').read() to txt = open(txt, encoding='utf8').read() In ./bert4keras/layers.py, add the following import statement after the from keras.layers import statement: from keras.layers import Input, Dropout, Lambda, Add, Dense, Activation In ./bert4keras/models.py, add the following import statement after the from bert4keras.layers import statement: from bert4keras.layers import Input, Dropout, Lambda, Add, K, Dense, Activation
17	RBT4
18	RBTL3
19	RBT3
20	RoBERTa-wwm-ext-large
21	RoBERTa-wwm-ext
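22	Bi-LSTM-CRF	https://github.com/fzschornack/bi-lstm-crf-tensorflow/tree/5181106	Before migration, create the file ./bi-lstm-crf-tensorflow.py and copy the code from bi-lstm-crf-tensorflow.ipynb into it. After migration, depending on how training behaves, change the value assigned to the num_units variable in ./bi-lstm-crf-tensorflow.py. For example, to change it to 64: #num_units = 128 num_units = 64
23	CNN-LSTM-CTC	https://github.com/watsonyanghx/CNN_LSTM_CTC_Tensorflow/tree/6999cd19285e7896cfe77d50097b0d96fb4e53e8	Before migration, comment out line 43 of utils.py. #tf.app.flags.DEFINE_string('log_dir', './log', 'the logging dir') After migration, reduce the value of validation_steps in ./utils.py as appropriate so that model convergence can be observed quickly during training. For example, to change it to 50: x2ms_FLAGS.define_integer('validation_steps', 50, 'the step to validation') In the project root directory, create the ./imgs/train and ./imgs/val folders and place the training and test data in them.
24	LeNet	https://github.com/Jackpopc/aiLearnNotes/tree/7069a705bbcbea1ac24	After migration, download the MNIST dataset yourself and place it in the directory where the command is run. Decompress all .gz files in MNIST and delete the original .gz files. If the network converges poorly, try reducing the learning rate LR and increasing the number of training epochs EPOCHS.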
25	AlexNet
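26	ResNet-18	https://github.com/taki0112/ResNet-Tensorflow/tree/f395de3a53d	Install the jedi dependency before migration. After migration, make the following adaptations. For the migrated load() interface, either specify the dataset path through the data_dir argument or place the required dataset in its default path: ~/x2ms_datasets/cifar100/cifar-100-python ~/x2ms_datasets/cifar10/cifar-10-batches-py ~/x2ms_datasets/mnist.npz ~/x2ms_datasets/fashion-mnist Because MindSpore does not support testing on a large amount of data in a single pass, modify the code yourself to run the test set in batches, and change the first dimension of the test-set placeholder shape to 1.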
27	ResNet-34
28	ResNet-50
29	ResNet-101
30	ResNet-152
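
Row 22 of Table 3 asks for the code in a notebook to be copied into a new Python file. One possible way to do that without copying by hand is sketched below, using only the standard json module. It assumes the usual notebook JSON layout (a top-level "cells" list whose entries have "cell_type" and "source"); the function name notebook_to_script is hypothetical, and commenting out lines that start with % or ! is an extra choice made here, not something the table requires.

```python
import json


def notebook_to_script(nb_path, py_path):
    """Write the code cells of a notebook into a plain .py file.

    Returns the number of code cells written.
    """
    with open(nb_path, encoding="utf-8") as f:
        notebook = json.load(f)

    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        source = cell.get("source", "")
        if isinstance(source, list):  # "source" may be a list of lines
            source = "".join(source)
        lines = []
        for line in source.splitlines():
            # Notebook magics and shell escapes are not valid Python.
            if line.lstrip().startswith(("%", "!")):
                line = "# " + line
            lines.append(line)
        chunks.append("\n".join(lines))

    with open(py_path, "w", encoding="utf-8") as f:
        f.write("\n\n".join(chunks) + "\n")
    return len(chunks)
```

For the row in question, a call such as notebook_to_script("bi-lstm-crf-tensorflow.ipynb", "bi-lstm-crf-tensorflow.py") would produce the file named in the remark; the result should still be reviewed before migration.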
