Newest 'pytorch-dataloader' Questions

0 votes

1 answer

138 views

Using DataLoader for efficient model prediction

I'm trying to understand the role/utility of batch_size in torch beyond model training. I already have a trained model, where the batch_size was optimized as a hyperparameter. I want to use the model ...

pgaluzio

190

asked Jul 18 at 8:59

0 votes

0 answers

69 views

Batching temporal graphs with Pytorch geometric data loader

I'm conducting research with temporal graph data using Pytorch-geometric. I'm facing some issues of memory usage when making PyG data in dense format (with to_dense_batch() and to_dense_adj()). I have ...

Vincent Tsai

11

asked Jun 11 at 9:57

0 votes

0 answers

32 views

Significant overhead when calling DataLoader for a dataset within FastAPI endpoint using multiple processing

I am calling a machine learning model for a dataset that I have loaded using torch DataLoader: class FilesDataset(): def __init__(self, path): file_paths = glob.glob(os.path.join(path, "*....

Iva

367

asked May 23 at 13:42

0 votes

0 answers

32 views

PyTorch DataLoader gradually slowing down as training progresses

I noticed my dataset iteration gradually slows down as training progresses. I'm using an A100 Google Colab instance. I removed the model and all the training stuff to try to debug the dataset. With ...

Joe C.

501

asked May 18 at 14:22

1 vote

0 answers

42 views

Image Tensors Return As Zero When num_workers > 0

I am facing an issue with multiprocessing. I am trying to load my .pt data as dataloaders. Everything works fine when I set the num_workers = 0. But when I set it to a value greater than 0, the tensor ...

jobayer

11

asked May 5 at 5:15

1 vote

0 answers

71 views

Error When Using Batch Size Greater Than 1 in PyTorch

I'm building a neural network to predict how an image will be partitioned during compression using VVC (Versatile Video Coding). The model takes a single Y-frame from a YUV420 image as input and uses ...

조동건

11

asked Mar 19 at 7:28

1 vote

1 answer

356 views

RuntimeError: Given groups=1, weight of size [64, 3, 3, 7, 7], expected input[1, 8, 3, 112, 112] to have 3 channels, but got 8 channels instead

import os import shutil import random import torch import torchvision.transforms as transforms import cv2 import numpy as np from torch.utils.data import Dataset, DataLoader import torch.nn as nn ...

Can Gürcüoğlu

11

asked Mar 4 at 8:51

0 votes

0 answers

66 views

Pytorch DataLoader loops are slower than expected

I created a training loop with pytorch's TensorDataset and DataLoader classes, but encounter an interesting behavior. The progress intermittently halts every 10-15 batches with seemingly no reason. I ...

Lev_Descartski

126

asked Feb 19 at 7:26

1 vote

0 answers

52 views

How to investigate memory consumption of pytorch_geometric data

I am working on a framework that uses pytorch_geometric graph data stored in the usual way in data.x and data.edge_index. Additionally, the data loading process appends multiple other keys to that data ...

Knowledge seeker

11

asked Feb 13 at 13:17

0 votes

1 answer

40 views

How to apply min-max scaling on an IterableDataset?

I'm using an IterableDataset because I have massive amounts of data. Since an IterableDataset does not store all data in memory, we cannot directly compute min/max on the entire dataset before ...

Saffy

13

asked Feb 5 at 18:19

0 votes

0 answers

177 views

Training stuck with num_workers > 0, but CPU is used instead of GPU with num_workers=0

I'm facing an issue with num_workers while training my model using PyTorch. If I set num_workers = 0, the training starts, but the model is utilizing the CPU instead of the GPU. Although CUDA is ...

Kamal Basha

1

asked Jan 25 at 10:39

0 votes

0 answers

27 views

How to see what file Dask is working with at any time for stateful dataloader

Problem: I am training an LLM for which my dataloader makes use of Dask to read in data. During LLM training, sometimes something breaks and you need to start again from the last checkpoint. Ideally ...

d-gg

864

asked Jan 9 at 17:38

2 votes

0 answers

332 views

Why does my PyTorch DataLoader only use one CPU core despite setting num_workers>1?

I am trying to fine-tune BERT for a multi-label classification task (Jigsaw toxic comments). I created a custom dataset and DataLoader as follows: class CustomDataSet(Dataset): def __init__(...

Hyppolite

67

asked Dec 26, 2024 at 13:21

0 votes

0 answers

32 views

How to modify batch data to reflect changes in original data in dataloader/pytorch?

I recently created a dataset class and am having trouble modifying the data in a batch so that the changes are reflected in future batches and in the original data. I have the following dataset class: class ...

rajan subramanian

1

asked Nov 30, 2024 at 21:02

-3 votes

1 answer

66 views

When testing a dataset using a DataLoader, should we set shuffle=True, or does it not matter?

I have a custom dataset (images of pizza, sushi and steak). I'm using the torch DataLoader for it. When writing the custom test dataloader, should we set shuffle=True, or does it not matter? I ...

YoussefYoussef2121

1

asked Nov 21, 2024 at 19:33

0 votes

0 answers

92 views

NVIDIA Jetson Orin FastAI2 model optimization with TensorRT and Torch2TRT incorrect Batch size

I have a Jetson Orin with the latest version of Jetpack 6.0 with CUDA 12 running on Ubuntu 22.04. I have installed PyTorch and it has CUDA support installed: Python 3.10.12 (main, Sep 11 2024, 15:47:...

PhilBot

368

asked Oct 19, 2024 at 20:33

0 votes

1 answer

156 views

Speeding up Dataset.getitems

I have a model with a forward function that receives optional parameters, like this: class MyModel(nn.Module): ... def forward(self, interactions: torch.Tensor, user_features: Optional[torch....

David Davó

812

asked Oct 10, 2024 at 7:23

1 vote

0 answers

1k views

Error loading image [SSL] record layer failure (_ssl.c:2578)

I am trying to use dataloaders in my code. I am implementing my code in aws sagemaker but for some reason when I use more than 0 num_workers for my dataloaders I get the error loading image [SSL] ...

Kasra Sadatsharifi

11

asked Oct 6, 2024 at 9:51

-2 votes

1 answer

155 views

error in PyTorch dataloader with num_workers>0 in VSC under WSL

I want to utilize my GPU by adjusting the workers number, but I have a problem with the number of workers > 0. test_loader = DataLoader(test_dataset, batch_size=32, shuffle=False, num_workers=0) - ...

Kalin Stoyanov

581

asked Sep 18, 2024 at 11:11

0 votes

0 answers

57 views

The DataLoader in PyTorch modifies my data

I am trying to train my deep network on the MNIST dataset. When I try to upload the dataset to the dataloader and get the batched data through the iterator, I get modified data that differs from the ...

Kilka

1

asked Sep 1, 2024 at 10:38

2 votes

1 answer

209 views

Resume training in Pytorch using previous persistent_workers state

How to save and restore persistent_workers state of DataLoader in order to resume training from a saved checkpoint. First, I followed the steps in this discussion, so that the results are reproducible ...

HATEM EL-AZAB

371

asked Aug 25, 2024 at 14:10

0 votes

0 answers

49 views

TimeSeriesDataSet/TemporalFusionTransformer - found class 'NoneType' error, I think it is in the 'target_scale' array

I have tried with and without a normalizer. I am doing this with intraday financial data on one financial product, so only one group for now. The target data was already "normalized", in ...

PolarVortex8

59

asked Aug 15, 2024 at 16:58

0 votes

1 answer

31 views

What parameter do I need to change for it to match requirements?

I am trying to train a model based on a modified MNIST dataset so it classifies random images with label 10. I am constantly getting a TypeError. transform = transforms.Compose([ transforms....

Jacky02

3

asked Jul 9, 2024 at 13:43

1 vote

1 answer

111 views

How do I implement training a neural network when generating the data on the spot?

I would like to construct a surrogate model of a physics simulation. Thus I am able to generate the data by myself. The data itself is very big, so it makes sense to generate a few data samples (e.g. ...

9hihowareyou9

33

asked Jun 27, 2024 at 13:31

0 votes

1 answer

213 views

PyTorch model training with DataLoader is too slow

I'm training a very small NN using the HAM10000 dataset. For loading the data I'm using the DataLoader that ships with PyTorch: class CocoDetectionWithFilenames(CocoDetection): def __init__(self, ...

Marek M.

3,940

asked Jun 25, 2024 at 6:12
