
I've cloned my GitHub repo into Google Colab and I'm trying to load data using PyTorch's DataLoader.

import torch
from torchvision import datasets, transforms

global gpu, device
if torch.cuda.is_available():
    gpu = True
    device = 'cuda:0'
    torch.set_default_tensor_type('torch.cuda.FloatTensor')
    print("Using GPU")
else:
    gpu = False
    device = 'cpu'
    print("Using CPU")

# Normalize with per-channel mean and std computed on the training set
data_transforms = transforms.Compose([
    #transforms.Resize(224),
    transforms.ToTensor(),
    transforms.Normalize([0.3112, 0.2636, 0.2047], [0.2419, 0.1972, 0.1554])
    ])

train_path = '/content/convLSTM/code/data/train/'
val_path = '/content/convLSTM/code/data/val/'
test_path = '/content/convLSTM/code/data/test/'

train_data = datasets.ImageFolder(root=train_path, transform=data_transforms)
val_data = datasets.ImageFolder(root=val_path, transform=data_transforms)
test_data = datasets.ImageFolder(root=test_path, transform=data_transforms)

train_loader = torch.utils.data.DataLoader(
    train_data,
    batch_size=18,
    num_workers=4,
    shuffle=False,
    pin_memory=True
    )

val_loader = torch.utils.data.DataLoader(
    val_data,
    batch_size=18,
    shuffle=False,
    num_workers=4,
    pin_memory=True
    )

test_loader = torch.utils.data.DataLoader(
    test_data,
    batch_size=18,
    shuffle=False,
    num_workers=4,
    pin_memory=True
    )

# Sanity check: iterate over a few batches
for batch_idx, (data, target) in enumerate(train_loader):
    print(batch_idx)
    if batch_idx == 3:
        break

I'm getting the following error when I run the last for loop:

RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method

I tried num_workers = 1 instead of 4, but the error persists. I'm not explicitly using any multiprocessing.

I also tried removing the torch.set_default_tensor_type('torch.cuda.FloatTensor') call, but the error persists.

Python: 3.6.8 | PyTorch: 1.3.1

What seems to be the problem?

  • Instead of iterating over the whole dataloader, I just tried test = next(iter(train_loader)) and I get the exact same error. Commented Nov 28, 2019 at 4:11

2 Answers


Not sure if you've fixed it already, but in case someone else reads this: setting num_workers to n > 0 activates PyTorch's multiprocessing for data loading. To disable it you need the default number of workers, which is 0, not 1.

Try setting num_workers to 0, or use the torch.multiprocessing submodule.
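
For example, a minimal sketch reusing train_data from the question (the multiprocessing_context argument is my assumption that your PyTorch build exposes it; it appeared around the 1.3 release):

import torch
from torch.utils.data import DataLoader

# Option 1: no worker processes, so nothing CUDA-related runs in a forked child
train_loader = DataLoader(train_data, batch_size=18, shuffle=False,
                          num_workers=0, pin_memory=True)

# Option 2: keep the workers but start them with 'spawn' instead of fork,
# as the error message suggests
train_loader = DataLoader(train_data, batch_size=18, shuffle=False,
                          num_workers=4, pin_memory=True,
                          multiprocessing_context='spawn')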




Just try setting num_workers=0:

dataloader = DataLoader(dataset, batch_size=batch_size, shuffle=True, num_workers=0)

This solved the problem for me in a Kaggle notebook.
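
If you also drop the torch.set_default_tensor_type('torch.cuda.FloatTensor') call, a common pattern (just a sketch, not part of the original answer) is to keep the dataset on the CPU and move each batch to the GPU inside the loop:

import torch

device = 'cuda:0' if torch.cuda.is_available() else 'cpu'

for batch_idx, (data, target) in enumerate(dataloader):
    # batches come off the DataLoader as CPU tensors; move them to the device here
    data, target = data.to(device), target.to(device)
    print(batch_idx)
    if batch_idx == 3:
        break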

