Indices should be either on cpu or on the same device as the indexed tensor

4 answers, 9146 views, 2022-11-09

I downloaded a dataset prepared for YoloV7, and I cloned the yoloV7 repo.

I want to train a model on the downloaded dataset, so I ran this command:

python train.py --workers 8 --device 0 --batch-size 16 --data data.yaml --img 640 640 --cfg cfg/training/yolov7.yaml --weights yolov7x.pt --name yolov7 --hyp data/hyp.scratch.p5.yaml

I got this RuntimeError:

autoanchor: Analyzing anchors... anchors/target = 5.50, Best Possible Recall (BPR) = 1.0000
Image sizes 640 train, 640 test
Using 8 dataloader workers
Logging results to runs\train\yolov74
Starting training for 300 epochs...

     Epoch   gpu_mem       box       obj       cls     total    labels  img_size
  0%|                                                                                                                                                                                                               | 0/372 [00:03<?, ?it/s]
Traceback (most recent call last):
  File "D:\projects\yolov7\train.py", line 618, in <module>
    train(hyp, opt, device, tb_writer)
  File "D:\projects\yolov7\train.py", line 363, in train
    loss, loss_items = compute_loss_ota(pred, targets.to(device), imgs)  # loss scaled by batch_size
  File "D:\projects\yolov7\utils\loss.py", line 585, in __call__
    bs, as_, gjs, gis, targets, anchors = self.build_targets(p, targets, imgs)
  File "D:\projects\yolov7\utils\loss.py", line 759, in build_targets
    from_which_layer = from_which_layer[fg_mask_inboxes]
RuntimeError: indices should be either on cpu or on the same device as the indexed tensor (cpu)

My system has one CPU and one CUDA GPU (it is an ordinary gaming PC).

Tristate asked 2022-11-09

See also this open issue: github.com/WongKinYiu/yolov7/issues/1045 - Valentin_Ștefan 2022-11-19

4 answers

Answer #1 (accepted)

Votes: 25

I believe this is a bug in the current implementation. You can fix it by changing line 685 of utils/loss.py to:

from_which_layer.append((torch.ones(size=(len(b),)) * i).to('cuda'))

and adding a line after line 756 that moves fg_mask_inboxes to your CUDA device:

fg_mask_inboxes = fg_mask_inboxes.to(torch.device('cuda'))
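The two lines above hard-code 'cuda'. As a rough sketch of a device-agnostic variant, the layer-index tensor can be created directly on the device of a tensor that is already in the right place. The helper name layer_ids_like is hypothetical, and the sketch assumes that b (the index tensor whose length is used on line 685) already lives on the training device; check that against your copy of utils/loss.py before relying on it.

```python
import torch


def layer_ids_like(b: torch.Tensor, i: int) -> torch.Tensor:
    """Hypothetical helper: float tensor of length len(b), filled with i,
    allocated on the same device as b (no hard-coded 'cuda')."""
    return torch.full((len(b),), float(i), device=b.device)


# Falls back to CPU so the sketch also runs on machines without a GPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

b = torch.tensor([0, 0, 1, 1, 1], device=device)  # stand-in for the real index tensor
ids = layer_ids_like(b, 2)

print(ids.device == b.device)
```

This produces the same values as (torch.ones(size=(len(b),)) * i).to(...) but avoids naming a device explicitly, so it should also work for CPU-only runs or a non-default GPU.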

anactualtoaster answered 2022-11-10

It worked!! After two days of searching online.... - Oscar Rangel 2022-11-11

It looks like you need to use train_aux.py when training with a p6 model, but it gives a similar error. Do you know how to get it running? ------ yolov7/utils/loss.py", line 1559, in build_targets2 from_which_layer = from_which_layer[fg_mask_inboxes] RuntimeError: indices should be either on cpu or on the same device as the indexed tensor (cpu) - Oscar Rangel 2022-11-11

@OscarRangel Try: from_which_layer.append((torch.ones(size=(len(b),))* i).to('cuda')) at line 658 - Perry45 2022-12-14

Thanks, it worked. What I don't understand is that it used to work without this change. - Walid Ahmed 2022-12-16

I also had to change "cpu" to 'cuda' on line 742 of utils/loss.py. - Cubimon 2022-12-31

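Answer #2

Votes: 2

If you want to train a p6 model with yolov7, you need to use train_aux.py instead of train.py, and you also need to change a couple of lines in utils/loss.py: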

1336 -- from_which_layer.append((torch.ones(size=(len(b),)) * i).to('cuda'))

1407 -- fg_mask_inboxes = fg_mask_inboxes.to(torch.device('cuda'))

Oscar Rangel answered 2022-11-11

Oscar Rangel edited 2022-11-13

Answer #3

Votes: 1

https://github.com/WongKinYiu/yolov7/blob/main/utils/loss.py

Try changing line 742 to:

matching_matrix = torch.zeros_like(cost, device="cpu")

This worked for me.
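For anyone wondering why that one-liner can help: torch.zeros_like accepts a device argument, so the new tensor keeps the shape and dtype of cost but is allocated on the CPU. Any mask computed from it is then a CPU tensor as well and can index another CPU tensor. Below is a small sketch of that idea. The toy values and the name layer_ids are made up for illustration, and it assumes the mask in loss.py is derived from matching_matrix, which you should confirm in your own copy.

```python
import torch

# Falls back to CPU so the sketch also runs on machines without a GPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

cost = torch.rand(3, 4, device=device)  # stand-in for the real cost matrix

# Same shape and dtype as cost, but explicitly placed on the CPU.
matching_matrix = torch.zeros_like(cost, device="cpu")
matching_matrix[0, 1] = 1.0
matching_matrix[2, 3] = 1.0

# A mask derived from a CPU tensor is itself a CPU tensor.
mask = matching_matrix.sum(0) > 0.0

layer_ids = torch.tensor([0.0, 0.0, 1.0, 2.0])  # CPU tensor (illustrative values)
selected = layer_ids[mask]  # both operands on the CPU, so indexing is allowed

print(matching_matrix.device, mask.device, selected)
```

The trade-off is that the matching step then runs on the CPU, whereas the accepted answer moves everything to the GPU instead.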

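Manpreet Kour answered 2022-12-08

Your answer could be improved with additional supporting information. Please edit to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers in the help center. - Community 2022-12-17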

Answer #4

Votes: 0

You can fix this by modifying utils/loss.py. Replace line 759, that is,

from_which_layer = from_which_layer[fg_mask_inboxes]

with

from_which_layer = from_which_layer.to(fg_mask_inboxes.device)[fg_mask_inboxes]

The main idea of this change is to put the two variables from_which_layer and fg_mask_inboxes on the same device.
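Here is a minimal, standalone sketch of that idea outside of YOLOv7. The tensor values are invented for illustration; only the variable names come from loss.py. On a machine with a CUDA GPU the plain indexing may raise the RuntimeError from the question, depending on the PyTorch version. On a CPU-only machine both tensors end up on the CPU and nothing fails.

```python
import torch

# Falls back to CPU so the sketch also runs on machines without a GPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

from_which_layer = torch.tensor([0.0, 0.0, 1.0, 2.0, 2.0])  # created on the CPU
fg_mask_inboxes = torch.tensor([True, False, True, False, True], device=device)

# Plain indexing: can fail when the mask is on the GPU and the data is on the CPU.
try:
    from_which_layer[fg_mask_inboxes]
    mismatch_raised = False
except RuntimeError as err:
    mismatch_raised = True
    print("device mismatch:", err)

# The fix from this answer: move the indexed tensor to the mask's device first.
aligned = from_which_layer.to(fg_mask_inboxes.device)[fg_mask_inboxes]

print(mismatch_raised, aligned)
```

Because the target device is read from fg_mask_inboxes.device, this variant does not depend on a hard-coded 'cuda' string.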

npn answered 2022-12-21

Tags

yolo