
In my code, I want to replace values in a tensor wherever the values at certain indices of another tensor are zero, for example:

target_mac_out[avail_actions[:, 1:] == 0] = -9999999 

But it raises a CUDA out-of-memory error:

RuntimeError: CUDA out of memory. Tried to allocate 166.00 MiB (GPU 0; 10.76 GiB total capacity; 9.45 GiB already allocated; 4.75 MiB free; 9.71 GiB reserved in total by PyTorch)

I thought there would be no memory allocation, because it just visits the tensor target_mac_out, checks the values, and replaces them in place at some indices.

Am I understanding right?

  • Are your tensors requiring grad? Commented Jan 17, 2021 at 10:20
  • @Ivan target_mac_out requires grad while avail_actions not. Commented Jan 17, 2021 at 11:46
  • Can you display a code sample that shows how these tensors were created and initialized? Commented Jan 17, 2021 at 20:35
  • @trialNerror the code is complex, avail_actions is a tensor in the GPU and target_mac_out is a tensor returned from the network. Commented Jan 18, 2021 at 4:56

2 Answers


It's hard to guess, since we do not even know the sizes of the involved tensors, but your indexing expression avail_actions[:, 1:] == 0 creates a temporary boolean tensor, which does require a memory allocation.
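To see this concretely, here is a minimal sketch with hypothetical small shapes (the real tensors are presumably much larger). It shows that the comparison materializes a new boolean tensor on the same device, and that an in-place masked_fill_ on a precomputed mask avoids some of the temporaries that boolean-indexing assignment can create:

```python
import torch

# Hypothetical small shapes, for illustration only.
target_mac_out = torch.randn(4, 5)
avail_actions = torch.randint(0, 2, (4, 6))

# The comparison allocates a brand-new boolean tensor
# (one byte per element) on the same device as avail_actions:
mask = avail_actions[:, 1:] == 0
print(mask.dtype, mask.shape)  # torch.bool torch.Size([4, 5])

# Filling in place through the precomputed mask modifies
# target_mac_out without building an extra result tensor:
target_mac_out.masked_fill_(mask, -9999999)
```

On a GPU that is already nearly full, even these small temporaries can push the allocator over the edge, which is consistent with the "Tried to allocate 166.00 MiB" in the error message.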




avail_actions[:, 1:] == 0 creates a new tensor, and the assignment itself likely creates further temporaries before the old storage is freed once the operation finishes.

If speed is not a problem, then you can just use a for loop, like:

for i in range(avail_actions.size(0)):
    for j in range(avail_actions.size(1) - 1):
        # the mask comes from avail_actions, shifted by one column
        if avail_actions[i, j + 1] == 0:
            target_mac_out[i, j] = -9999999

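As a sanity check, a loop of this form should agree with the vectorized one-liner from the question. A quick CPU comparison with toy, hypothetical shapes:

```python
import torch

# Toy shapes: avail_actions[:, 1:] must match target_mac_out's shape.
avail_actions = torch.randint(0, 2, (3, 5))
looped = torch.randn(3, 4)
vectorized = looped.clone()

# Element-wise loop over the availability mask:
for i in range(avail_actions.size(0)):
    for j in range(avail_actions.size(1) - 1):
        if avail_actions[i, j + 1] == 0:
            looped[i, j] = -9999999

# Vectorized boolean-mask assignment from the question:
vectorized[avail_actions[:, 1:] == 0] = -9999999

print(torch.equal(looped, vectorized))  # True
```

The loop trades the temporary mask allocation for a Python-level traversal, so it is far slower but uses almost no extra GPU memory.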
