1 vote
1 answer
131 views

I am trying to install JAX with GPU support on a powerful, dedicated Linux server, but I am stuck in what feels like a Catch-22 where every official installation method fails in a different way, ...
PowerPoint Trenton's user avatar
1 vote
1 answer
57 views

I installed NVIDIA Nsight Visual Studio Edition 2025.01 in Visual Studio 2022. I want to debug code, but I can't step over (F10): the debugger always stops at a location without a breakpoint....
Imagination Youth's user avatar
1 vote
0 answers
132 views

I’m using the PyTorch profiler to analyze sglang, and I noticed that in the CUDA timeline, some kernels show “Command Buffer Full”. This causes the cudaLaunchKernel time to become very long, as shown ...
plznobug's user avatar
  • 143
2 votes
0 answers
215 views

I am using WSL2 on Windows 10 with an NVIDIA graphics card. I recently installed GPU JAX using the command pip install -U "jax[cuda12]". This completed successfully, but when I run any jax ...
DrMittal's user avatar
0 votes
1 answer
99 views

I am trying to implement the producer-consumer problem between GPU and CPU, which is required for another project. The GPU requests some data from the CPU via unified memory. The CPU copies that data to a specific location in global ...
Chinmaya Bhat K K's user avatar
0 votes
0 answers
150 views

I'm converting a PWC-Net optical flow model to run on Jetson NX DLA using the iSLAM framework, but the TensorRT engine build fails during DLA optimization. Environment Hardware: NVIDIA Jetson NX ...
Unknown's user avatar
  • 705
0 votes
1 answer
134 views

I want to quantitatively measure the memory bandwidth utilization and SM utilization of a CUDA program for performance analysis and regression testing. My approach so far: Compute the theoretical ...
plznobug's user avatar
  • 143
1 vote
1 answer
283 views

I am writing PTX assembly code in CUDA C++ for research. This is my setup: I downloaded the latest CUDA C++ toolkit (13.0) yesterday on WSL Linux. The local compilation environment does not ...
Junhao Liu's user avatar
2 votes
1 answer
67 views

I am trying to pass a float4 as an argument to my CUDA kernel (by value) using PyCUDA’s make_float4(). But there seems to be some misalignment when the data is transferred to the kernel. If I read the ...
Dodilei's user avatar
  • 308
2 votes
0 answers
40 views

https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#warp-shuffle-functions https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#data-movement-and-conversion-instructions-shfl-...
Tom Huntington's user avatar
1 vote
1 answer
190 views

My code: from transformers import AutoTokenizer, AutoModel model_name = "NVIDIA/nv-embed-v2" tokenizer = AutoTokenizer.from_pretrained(model_name) model = AutoModel.from_pretrained(...
6zL's user avatar
  • 21
0 votes
0 answers
50 views

Recently, I upgraded my Ubuntu from 22.04 to 24.04 and found that the performance of my trained deep network, written in torch, degraded. After debugging, I found the problem is copying data from one gpu to ...
wangwei's user avatar
  • 49
-4 votes
1 answer
42 views

While analyzing a large project with many kernel files, I wanted to find a specific kernel in the file produced by the nsys analysis. How should I do this?
rongtao zhou's user avatar
2 votes
0 answers
81 views

Is there an officially sanctioned way to reuse shared data between global functions? Consider the following code https://cuda.godbolt.org/z/KMj9EKKbf: #include <cuda.h> #include <stdio.h> ...
Johan's user avatar
  • 77.4k
-4 votes
1 answer
275 views

In a talk on The C++ Execution Model, from the cppunderthesea 2024 conference, at around 44:50, NVIDIA's Bryce Adelstein Lelbach claims that non-NVIDIA GPUs give no guarantee of threads progressing (&...
einpoklum's user avatar
  • 137k
0 votes
0 answers
260 views

I am trying to run Docker Compose and it is failing. Here is a minimal reproducible example. First with this docker-compose.yml services: hello-app: image: python:3.10-slim command: python ...
KansaiRobot's user avatar
  • 10.6k
0 votes
0 answers
87 views

I'm encountering an issue while developing a DPDK-based program using a dual-port ConnectX-6 NIC on Ubuntu 24.04. Despite following the setup instructions, my program fails to detect the NIC ports. ...
Mohammad P's user avatar
0 votes
0 answers
58 views

I am facing an issue with a process that holds GPU memory even after I have terminated it. Here's a detailed breakdown of the situation: The process (a CUDA application) is running and occupies GPU ...
Elspeth Gilbert's user avatar
2 votes
2 answers
97 views

I am writing some code in C in which I want to add the optional ability to have certain sections of the code accelerated using OpenMP, and with an additional optional ability to have them accelerated ...
Matthew G.'s user avatar
0 votes
1 answer
74 views

I'm trying to use TensorFlow with my YOLOv8 project, but for some reason it is not recognizing my GPU. I had originally installed it using pip, but I was told I should use conda instead, so I switched ...
James Pelham-Burn's user avatar
0 votes
0 answers
120 views

Environment: OS: Windows Operating System TensorRT Version: TensorRT-10.3.0.26 NVIDIA CUDA Version: 12.6 cuDNN Version: 9.8 GPU: RTX 3050ti laptop GPU Issue Description: I am encountering an "...
B.Uluer's user avatar
  • 11
1 vote
0 answers
50 views

I have a laptop with an integrated Intel graphics card and an NVIDIA T1000 graphics card. I set the NVIDIA card as the preferred graphics processor under Manage 3D settings in the NVIDIA Control Panel. However, ...
Martin121233's user avatar
2 votes
0 answers
252 views

I am using Ubuntu 22.04. I have the nvidia-570 driver installed along with CUDA 12.4 on my host machine. However, I am not able to access the GPU in my container. This is my docker-compose file: version: '3.8' ...
prarthana sigedar's user avatar
1 vote
0 answers
134 views

I'm trying to build a Docker image for realtime-whiper. The build process finishes successfully, but at the end it gives this error: Error response from daemon: could not select device driver "nvidia&...
Ali Zekai Deveci's user avatar
0 votes
0 answers
152 views

I am trying to visualize something on Isaac Gym, but when I run gym.draw_viewer(viewer, sim, True) I get a segmentation fault. I think it has something to do with Vulkan because when I run vulkaninfo ...
Kai McClennen's user avatar
