Comments (3)
With an example to demonstrate the issue:
import torch
import eagerpy
a = torch.tensor([0.], requires_grad=True)
torch.norm(a, p=2).backward()
print(a.grad)
eagerpy.astensor(a).norms.l2().raw.backward()
print(a.grad)
Output:
tensor([0.])
tensor([nan])
from eagerpy.
Hi @mglisse, thanks for the request and the example code.
That makes a lot of sense and I think this might be doable.
May I ask how you use EagerPy? Do you just use it as an alternative API for PyTorch, without needing the ability to run the same code using different frameworks? If not, why is this only a problem with PyTorch?
Hi, thanks for the reply. I use eagerpy so I can write the code only once and let it work with several frameworks. It is true that currently I mostly experiment with pytorch though.
The problem isn't limited to pytorch. The first time I hit this NaN issue with pytorch, jax was giving good numbers, so I assumed they were doing something different. I didn't keep the exact code, and now that I try to reproduce it, I seem to get NaN from jax and pytorch in the same cases. So I don't know if my experiment at the time was bogus, or hit a very special case...
A good thing is that all frameworks seem to provide a norm function (at least for p not 0?). A bad thing is that the one in jax (I did not check tensorflow) does not seem to have a special (sub)gradient implementation: it also gives a NaN gradient for jax.numpy.linalg.norm(x,2) at 0. But I could go ask them about that. Another bad thing is that they don't share the same definition. On the matrix [[1,2],[3,4]] with p=1, numpy/jax return 6 while torch/tensorflow return 10, which complicates things a bit...
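To make the mismatch concrete, here is a minimal pure-Python sketch of the two definitions that appear to be in play. The helper names `induced_1_norm` and `entrywise_1_norm` are made up for illustration and belong to no framework. The assumption is that the 6 comes from the induced matrix 1-norm (largest absolute column sum) and the 10 from summing the absolute values of all entries, as if the matrix were a flat vector.

```python
def induced_1_norm(matrix):
    """Largest absolute column sum (operator 1-norm of a matrix)."""
    # zip(*matrix) iterates over the columns of a list-of-rows matrix.
    return max(sum(abs(v) for v in col) for col in zip(*matrix))


def entrywise_1_norm(matrix):
    """Sum of absolute values of all entries (matrix treated as a flat vector)."""
    return sum(abs(v) for row in matrix for v in row)


m = [[1, 2], [3, 4]]
print(induced_1_norm(m))    # 6  (column sums are 4 and 6)
print(entrywise_1_norm(m))  # 10 (1 + 2 + 3 + 4)
```

A framework-agnostic wrapper would probably have to pick one of these definitions explicitly, for example by always flattening or always passing the axes.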
Of course there are workarounds. I could compute norms manually and add tiny (working through the various dtype/finfo to get it) before taking the square root. Or I can let eagerpy compute the norm and, if the result is 0, use result=result.from_numpy(0.) to replace it with a constant (or actually some better formulation to get the right dtype; also, with pytorch this constant does not have requires_grad, so if I call .raw.backward() directly on it without combining it with other numbers, it fails).
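The first workaround can be sketched without any framework. This is only an illustration under stated assumptions: `safe_l2_norm_and_grad` is a hypothetical helper, the gradient is written out by hand instead of coming from autograd, and `sys.float_info.min` (the smallest positive normal float64) stands in for the per-dtype `tiny` that one would look up through each framework's finfo.

```python
import math
import sys


def safe_l2_norm_and_grad(x, tiny=sys.float_info.min):
    """Return (norm, gradient) of sqrt(sum(x_i**2) + tiny) for a list of floats."""
    squared_sum = sum(v * v for v in x)
    # Adding tiny keeps the argument of sqrt strictly positive,
    # so the division below never hits 0/0.
    norm = math.sqrt(squared_sum + tiny)
    # Chain rule: d norm / d x_i = x_i / norm.
    grad = [v / norm for v in x]
    return norm, grad


norm, grad = safe_l2_norm_and_grad([0.0])
print(grad)  # [0.0] instead of NaN
```

The price is that the norm of the zero vector becomes sqrt(tiny) (about 1.5e-154 in float64) rather than exactly 0, which may or may not matter depending on how the value is used afterwards.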