23,953 questions
0 votes, 2 answers, 52 views
PyTorch module B = A, A.to('cpu'), but the tensor in B is still on the GPU. Why?
After moving module A to the CPU, the original parameter tensor still stays on the GPU. When is it released? Is it wrong to reuse the parameter?
My code:
import torch.nn as nn
class A(nn.Module):
...
1 vote, 1 answer, 77 views
PyTorch fails on Windows Server 2019: “Error loading c10.dll” (works fine on Windows 10)
I'm trying to deploy a Python project on Windows Server 2019, but PyTorch fails to import with a DLL loading error.
On my local machine (Windows 10, same Python version), everything works perfectly.
...
2 votes, 0 answers, 1k views
RuntimeError: Default process group has not been initialized, please make sure to call init_process_group
I am using CoDi, a multimodal latent diffusion model. I am trying to remove the image and video modules from CoDi and fine-tune it on text-music pair data.
The training script (train.py) ...
1 vote, 3 answers, 11k views
RuntimeError: unexpected EOF, expected 3302200 more bytes. The file might be corrupted
I am trying to implement the pretrained model from the following repository. I need your assistance to rectify the error.
RuntimeError: unexpected EOF, expected 3302200 more bytes. The file might be corrupted.
...
0 votes, 1 answer, 61 views
Is passing ray resources as options when calling the function equivalent to setting them in the function's decorator?
Is
@ray.remote
def run_experiment(...):
    (...)
if __name__ == '__main__':
    ray.init()
    exp_config = sys.argv[1]
    params_tuples, num_cpus, num_gpus = load_exp_config(exp_config)
    ray.get(...
0 votes, 0 answers, 48 views
Given groups=1, weight of size [64, 1024, 1, 1], expected input[1, 256, 1, 1] to have 1024 channels, but got 256 channels instead
I ran into this issue and searched the forums, but I couldn't solve it. How can I solve this problem?
I tried to add a CBAM module to YOLOv12 to improve accuracy on my custom dataset. I ...
3 votes, 1 answer, 10k views
Unable to create a tensor using torch.Tensor
I was trying to create a tensor as below.
import torch
t = torch.tensor(2,3)
I got the following error.
TypeError Traceback (most recent call last) in ()
----> ...
3 votes, 1 answer, 1k views
PyTorch Lightning DDP Error with SLURM: GPU MisconfigurationException and Devices Mismatch
I'm running a PyTorch Lightning training script on a SLURM-managed cluster using Distributed Data Parallel (DDP). My setup involves 1 node with 4 GPUs. However, I'm encountering issues with the ...
0 votes, 1 answer, 71 views
Unable to step into torch.nn.functional.linear using VS Code debugging
I want to step into the linear function using VS Code's step-in, but the debugger skips over it when I click "step into". Could anyone help me with this?
I used DEBUG=1 when compiling PyTorch.
...
0 votes, 0 answers, 29 views
Where is EXECUTORCH_LIBRARY defined in ExecuTorch v1.0?
I’m trying to register a custom operator for ExecuTorch (v1.0, built from the PyTorch 2.5 source tree).
My goal is to create a shared library that defines a few quantum operators and runs them from a ....
0 votes, 0 answers, 42 views
Unclear formulation in Temporal Fusion Transformer paper
I am currently trying to implement the Temporal Fusion Transformer using PyTorch.
This paper (https://arxiv.org/pdf/1912.09363) is my reference.
Currently I am stuck with the variable selection ...
12 votes, 3 answers, 13k views
ImportError: Failed to load PyTorch C extensions
I have a problem importing torch into my project. When I try to import it, I get the following error:
ImportError: Failed to load PyTorch C extensions:
It appears that PyTorch has loaded the `...
0 votes, 0 answers, 266 views
Installation error while installing GroundingDino
I am trying to install GroundingDino as instructed in the README file of the official GitHub repo, but I get the error below:
Obtaining file:///home/kgupta/workspace/Synthetic_Data_gen/...
3 votes, 1 answer, 15k views
RuntimeError: For unbatched 2-D input, hx and cx should also be 2-D but got (3-D, 3-D) tensors
Hey, I have some problems with my LSTM. I have 6 features and I'm sending all my data (29002 rows) into the LSTM at once (is this a good idea?).
My Input is of size:
Training Shape torch.Size([290002, 1, ...
0 votes, 1 answer, 26 views
How can I get torch.set_grad_enabled(True) to work in ComfyUI?
I just spent hours figuring out that the following code fails when included in a ComfyUI custom node, but works perfectly fine outside (using the same Python venv). I finally found out that someone ...
1 vote, 0 answers, 66 views
Should I use torch.inference_mode() in a prediction method even when using model.eval()? [duplicate]
I'm following the book "Deep Learning with PyTorch Step By Step" and I have a question about the predict method in the StepByStep class (from this repository: GitHub).
The current ...
0 votes, 5 answers, 10k views
ModuleNotFoundError: No module named 'torchvision.models.feature_extraction'
I want to extract features from ResNet101; however, I have trouble importing torchvision.models.feature_extraction.
Here is my code:
from torchvision import models
from torchvision.models....
0 votes, 1 answer, 73 views
DQN model either doesn't work or it is extremely slow in training
I'm trying to build a DQN model for my PhD, and before I apply it to the real data, I want to try it on dummy data.
Using the same process with simple Q-learning, the approach was ...
12 votes, 3 answers, 6k views
Resume Optuna study from most recent checkpoints
Is there a way to pause or kill an Optuna study, then resume it either by rerunning the incomplete trials from the beginning or by resuming them from the latest checkpoint?
study ...
32 votes, 12 answers, 97k views
Cannot import PyTorch [WinError 126] The specified module could not be found
I'm trying to do a basic install and import of PyTorch/torchvision on Windows 10. I installed Anaconda and created a new virtual environment named photo. I opened the Anaconda prompt, activated the ...
3 votes, 0 answers, 93 views
I get the error "ImportError: libcudnn.so.9: cannot open shared object file: No such file or directory" when I try to use torch in a virtual env
I have installed CUDA 13 on Fedora 42.
When I use PyTorch locally, torch works fine, but when I create a virtualenv, PyTorch can't find the libcudnn files.
I get the error
ImportError: libcudnn.so.9: ...
1 vote, 2 answers, 567 views
groupby aggregate product in PyTorch
I have the same problem as groupby aggregate mean in pytorch. However, I want to create the product of my tensors inside each group (or labels). Unfortunately, I couldn't find a native PyTorch ...
3 votes, 4 answers, 30k views
F.cross_entropy raised "RuntimeError: CUDA error: device-side assert triggered. Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions."
I hit this error in the loss function. An example is below:
import torch
device = "cuda" if torch.cuda.is_available() else "cpu"
a = torch.Tensor([[-10.3353, -28.4371, 2.0768, -4....
2 votes, 0 answers, 54 views
Having problems computing PDE Residuals
I'm computing PDE residuals for The_Well datasets (e.g. turbulent_radiative_layer_2D and shear_flow) using finite differences, but the residuals are much larger than I expect. The data are generated ...
1 vote, 1 answer, 98 views
Can uv integrate with e.g. pytorch prebuilt docker env?
So, PyTorch requires a rather large bundle of packages. The prebuilt Docker PyTorch GPU images (https://docs.nvidia.com/deeplearning/frameworks/pytorch-release-notes/running.html) are quite helpful in ...