Comments (25)
The scattering parameter arrays that enter into this loss function (S) correspond to the complex-valued scattering parameters returned by the `ceviche-challenges` model instance. This means that |S|^2 are the power scattering parameters, i.e. a value of 1.0 corresponds to full transmission and 0.5 corresponds to half power transmission (-3.0 dB on a log scale). We used the linear-scale power quantities in the loss function for the optimizations in the paper, not dB / log-scale quantities.
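The linear-to-dB relation for the power quantities |S|^2 described above can be sanity-checked in a couple of lines (plain NumPy, just for illustration):

```python
import numpy as np

# |S|^2 is the power transmission; convert to dB with 10*log10.
full_power = 1.0   # full transmission
half_power = 0.5   # half power transmission

print(10 * np.log10(full_power))  # 0.0 dB
print(10 * np.log10(half_power))  # ~ -3.01 dB
```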
from inverse_design.
👍🏻 Perfect. Then it should be exactly as implemented. We only needed the dB conversion to convert the target values to linear scale.
from inverse_design.
If it helps, the designs for the mode converter problem (as CSV files) are available here, under the `designs/` folder. Designs from steps 134 and 159 of the optimizations in the original paper are available.
from inverse_design.
The culprit is definitely here; this is what I pointed out in #10. I am also investigating by testing in notebooks to find where the issue is.
If you take a look at your implementation, it is not the exact one. It should be:
`L = jnp.linalg.norm(jax.nn.softplus(g*(s-target)/w_min)**2)**2`
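A minimal sketch of that expression, assuming `s` is the array of simulated |S| values, `target` the target values, `g` a ±1 sign array and `w_min` a scalar width (names taken from the snippet above, not from a verified implementation):

```python
import jax
import jax.numpy as jnp

def loss_fn(s, target, g, w_min):
    # L = || softplus(g * (s - target) / w_min) ** 2 || ** 2
    x = jax.nn.softplus(g * (s - target) / w_min) ** 2
    return jnp.linalg.norm(x) ** 2
```

Since `jnp.linalg.norm` with no `ord`/`axis` is the two-norm, the outer `**2` cancels its square root, so this reduces to a plain sum of `softplus(...)**4` terms.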
from inverse_design.
`norm` without `axis` or `ord` returns the two-norm, correct? So the inner `**2` should already be contained in the norm?
In my case I have the inner `**2` because I calculate the norm manually.
Therefore I think the equation and my code do the same thing, am I wrong?
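The equivalence discussed here is easy to check: `jnp.linalg.norm` with default `ord`/`axis` is the two-norm, so squaring it gives the manual sum of squares:

```python
import jax.numpy as jnp

x = jnp.array([1.0, 2.0, 3.0])

a = jnp.linalg.norm(x) ** 2  # two-norm, squared
b = jnp.sum(x ** 2)          # "manual" norm-squared
print(jnp.allclose(a, b))    # True
```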
from inverse_design.
Your equation is not wrong! (As you pointed out, you directly cancelled the square root contained in the L2-norm.)
However, I have no idea whether these additional functions can add some small errors, and whether those small errors could push us closer to the "paper solution".
Disclaimer: I don't think they will... Hence, I think your solution is better!
from inverse_design.
I tried the following: running the optimization loop with the loss from above for the mode converter, but ignoring the generator.
When comparing that to the paper, I have a hard time believing that the loss is reduced more strongly with the fabrication constraints than without.
What do you think @flaport, @Dj1312?
from inverse_design.
Hmm, interesting... I indeed expect the loss with fabrication constraints to be a lot higher. Maybe it's a dumb normalization factor or something?
from inverse_design.
I thought about a `**2`.
from inverse_design.
I have been quite desperately trying to recreate the graph with the generator. But it is always much worse in terms of normalized loss :/
from inverse_design.
Ah, it might also be related to the wavelength bands. I had disabled those in favor of a single wavelength to speed up the simulation. I'll try the unconstrained optimization with bands to check...
from inverse_design.
@Jan-David-Black
I think there is a subtlety! You use a definition of the dB conversion with a factor of 10. But in our case the `s_params` are defined as a ratio of powers, so a factor of 10 would be more relevant.
I plotted the two graphs in my notebook without the generator (as you suggested), using `t_sij = 10**(-x/20)` and `t_sij = 10**(-x/10)` in the loss function. And indeed, the loss is reduced more than with the fabrication constraints! :)
PS: I use the following parameters to generate the initial latent: `bias=0.95, r=1, r_scale=1e-3`
from inverse_design.
> @Jan-David-Black I think there is a subtlety! You use a definition of the dB conversion with a factor of 10 (I think you meant 20 here?). But in our case the `s_params` are defined as a ratio of powers, so a factor of 10 would be more relevant. I plotted the two graphs in my notebook without the generator (as you suggested), using `t_sij = 10**(-x/20)` and `t_sij = 10**(-x/10)` in the loss function. And indeed, the loss is reduced more than with the fabrication constraints! :) PS: I use the following parameters to generate the initial latent: `bias=0.95, r=1, r_scale=1e-3`
Hm, but we do square the target s-params to get their "power representation" in this (somewhat ugly) line:
`target = jnp.stack((jnp.ones_like(s11)*(t_s11**2), jnp.ones_like(s21)*(t_s21**2)))`
So dividing by 20 should be correct, no?
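A quick check of that reasoning: converting a dB target to amplitude with the factor of 20 and then squaring it (as in the target-stacking line above) lands on the same number as converting directly to power with the factor of 10:

```python
import numpy as np

x = 3.0                      # target in dB
amplitude = 10 ** (-x / 20)  # amplitude target (factor 20)
power = amplitude ** 2       # squared, as in the target stacking
print(np.isclose(power, 10 ** (-x / 10)))  # True
```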
Or am I completely missing the point, and `ceviche-challenges` returns power `s_params`? That would be quite unconventional, no?
from inverse_design.
> Or am I completely missing the point, and `ceviche-challenges` returns power `s_params`? That would be quite unconventional, no?
We want to maximize the power going through the output compared to the reflected power.
If you take a look at the `ceviche-challenges` source code, the `s_params` are defined using the `overlap` between an E and an H field, so a form of power. (Disclaimer: I am not really familiar with S-parameters in electronic circuits.)
To my understanding, if we worked only with a simple field, we would use the factor of 20. In our case, I would tend towards a factor of 10.
But like you, TBH, when I look at the equation and the square, I am more doubtful.
The only thing making me lean towards the 10 (power factor) is the following sentence: "For example, a minimum transmission amplitude cutoff of 0.5 (-3 dB in power transmission) would have a value of 0.5", plus the fact that Table I of the article gives the power scattering parameters.
from inverse_design.
> For example, a minimum transmission amplitude cutoff of 0.5 (-3 dB in power transmission) would have a value of 0.5, plus the fact that Table I of the article gives the power scattering parameters.

Well, maybe I just made the wrong assumptions.
I am going to have a deeper look.
from inverse_design.
Here at least they use 20:
https://github.com/google/ceviche-challenges/blob/6352656f902dabacea88e123c89dde13dd8a3160/ceviche_challenges/scattering_test.py#L43-L44
from inverse_design.
Yes, I agree it is confusing...
from inverse_design.
Thanks @ianwilliamson!
Now it all makes sense.
from inverse_design.
But that means that this still holds, right?

> I tried the following: running the optimization loop with the loss from above for the mode converter, but ignoring the generator. When comparing that to the paper, I have a hard time believing that the loss is reduced more strongly with the fabrication constraints than without. What do you think @flaport, @Dj1312?
from inverse_design.
> But that means that this still holds, right?
>
> I have a hard time believing that the loss is reduced more strongly with the fabrication constraints than without.

Yep... this issue remains unresolved.
from inverse_design.
@Jan-David-Black some thoughts on the loss issue. I used Fig. 5 of the paper to extract the following values:

- at step 1, $|S_{11}|^2 \approx -18\,\text{dB}$ and $|S_{21}|^2 \approx -27\,\text{dB}$, which leads to $L \approx 250$
- at step 122, $|S_{11}|^2 \approx -50\,\text{dB}$ and $|S_{21}|^2 \approx -0.1\,\text{dB}$, which leads to $L \approx 0.4$
- if we normalize the two values, we obtain a minimum value of $L \approx 1.6 \times 10^{-3}$

The obtained result is above the red curve, so it seems there is a problem around the loss function.

If I use the results I obtain for a simple optimization (without the generator):

- at step 1, I get $L \approx 254.7$
- at step 150, I get $L \approx 0.39$
- if we normalize the two values, we obtain a minimum value of $L \approx 1.53 \times 10^{-3}$

As you pointed out, the loss value of the binarized design is very close to the non-binarized one.
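Assuming "normalize" here means dividing the final loss by the initial one (my reading, not stated explicitly above), the quoted minima follow directly:

```python
paper_min = 0.4 / 250    # ~ 1.6e-3, from the Fig. 5 read-off
my_min = 0.39 / 254.7    # ~ 1.53e-3, from the run without the generator
print(paper_min, my_min)
```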
from inverse_design.
Could it be something with the softplus? I just used the one available in JAX. The one in PyTorch seems to have an additional `beta` parameter.
Other than that, I see little room for error in the equation...
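For what it's worth, JAX's `jax.nn.softplus` is the plain `log(1 + exp(x))`, which can be checked directly (at PyTorch's default `beta=1` the two definitions coincide, so the choice of library shouldn't change the loss):

```python
import jax
import jax.numpy as jnp

x = jnp.linspace(-5.0, 5.0, 11)
manual = jnp.log1p(jnp.exp(x))   # log(1 + exp(x))
print(jnp.allclose(jax.nn.softplus(x), manual))  # True
```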
from inverse_design.
It may be the softplus, as you suggested, but I think the paper implemented the original one. Moreover, they would have mentioned it in the description. (Maybe step 0 is one with a random binarized design and step 1 is one with a fully solid design? Although I don't give a lot of weight to this possibility...)
If the binarized design loss follows the trend of the non-generated one, at least, I think we can assume we are safe.
from inverse_design.
Thanks for the tip. I am going to take a look at it!
from inverse_design.
I think we can close this one, as the loss function seems to be on track.
from inverse_design.