flaport / inverse_design

Home Page: https://flaport.github.io/inverse_design

License: Apache License 2.0

Makefile 0.46% Jupyter Notebook 52.76% Python 30.53% Dockerfile 0.04% Rust 15.98% CSS 0.23%

inverse_design's People

lucasgrjn jan-david-fischbach joamatab fryslie kojaama

inverse_design's Issues

Fix GH Action

The GitHub Action seems to fail because it runs out of memory in notebook 10.
Reducing the number of iterations performed there should do the trick.

Slow execution of the generator

Hey @flaport,
Quite an interesting project! My configuration is probably not quite right, but when I try running the inverse design, executing the generator takes about 15 minutes (I am on an M1 MacBook Pro), which makes this quite infeasible.
Do you experience similar execution times, or is there something wrong with my setup?

Migrate to nbdev v2?

Is there a particular reason not to migrate?
It seems like the GitHub Actions are currently failing because the nbdev command is not found.

Rust Initial Touches

To handle the border between the design region and the waveguide nicely, it would be great if we could pass initial touches to the generator to initialize the design with. This is already possible for the local_generator written in Python. It would be great to have a similar possibility with the Rust-based generator.

`value_and_grad` on the straight thru estimator

I seem to run into the problem that, if I use jax.value_and_grad, I do not actually go through the forward pass of the generator.
I think this is related to these lines:

inverse_design/inverse_design/conditional_generator.py

Lines 188 to 190 in 86486d6

    @generate_feasible_design_mask.defjvp
    def generate_feasible_design_mask_jvp(primals, tangents):
        return primals[0], tangents[0]  # identity function for first argument: latent_t

I believe one could solve that by passing the primals through the generator in the custom JVP rule. This, however, leads to unnecessary computational cost if I only want the gradients, i.e. use the jax.grad function. Any idea how to get it to work efficiently in both cases?
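A minimal sketch of the fix proposed above: the custom JVP rule returns the generator output as the primal and passes the tangent through unchanged. `toy_generator` is a hypothetical stand-in (a simple sign threshold), not the repository's conditional generator, and the names `feasible_design` and `loss` are illustrative assumptions as well.

```python
import jax
import jax.numpy as jnp


def toy_generator(latent_t):
    # Hypothetical stand-in for the real generator: binarize to -1 / +1.
    return jnp.where(latent_t > 0, 1.0, -1.0).astype(latent_t.dtype)


@jax.custom_jvp
def feasible_design(latent_t):
    return toy_generator(latent_t)


@feasible_design.defjvp
def feasible_design_jvp(primals, tangents):
    (latent_t,) = primals
    (latent_dot,) = tangents
    # Primal output: actually run the generator (the forward pass).
    # Tangent output: identity, i.e. the straight-through estimator.
    return feasible_design(latent_t), latent_dot


def loss(latent_t):
    # Toy loss: distance of the design from a fully solid (+1) design.
    return jnp.sum((feasible_design(latent_t) - 1.0) ** 2)


latent = jnp.array([-0.5, 0.2, 1.5])
value, grad = jax.value_and_grad(loss)(latent)
print(value, grad)
```

With this rule, `value` is computed from the generated design rather than from the raw latent. Note that jax.grad evaluates the forward pass as well, since the gradient of a downstream loss depends on the design, so the generator would run once per step in either case.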

What happened to `environment.yml`

Was it removed intentionally?

MemoryError in 10_inverse_design_local.ipynb

Currently I get a MemoryError when trying to run 10_inverse_design_local.ipynb

@Jan-David-Black, any idea why that is? Or does my computer just not have enough RAM?

Convergence behavior sub-par

Hey @ianwilliamson, @Dj1312,
sorry to bother you. We do not seem to be able to match the incredible pace at which the optimizations converge in the paper. Possible problems under investigation include:

errors in the loss function (#18 -> doesn't seem to be the case)
problems with the way the boundaries of the design region are handled. (the convergence behavior seems to persist even when disabling that.)
selecting the hyperparameters: What should the value of $\beta$ be? Do we have to schedule changes to $\beta$ during the course of the optimization? (Maybe even use it as an optimized parameter, which the original paper does not do as far as I can see -> https://github.com/flaport/inverse_design/issues/10#issuecomment-1426191124)
the initialization: The convergence behavior is highly dependent on the bias and random spread of the initial latent space. How do we select appropriate values here? The original paper only states:

The latent design is randomly initialized with a bias so that the first feasible design is fully solid

the gradients: to test the rest of the optimization loop it was run without the generator:

Running the optimization loop including the loss from above for the modeconverter, but ignoring the generator.

When comparing that to the paper:

I have a hard time believing that the loss is reduced more strongly including the fabrication constraints than without.

Originally posted by @Jan-David-Black in #18 (comment)
the settings for the Adam optimizer are as in the paper: adam(0.01, b1=0.667, b2=0.9)

@Dj1312 suggested that the discrepancy might be related to a differing way to translate the scattering parameters to dB quantities. It is correct to assume that ceviche-challenges returns the S-parameters as field quantities, right?
coarser mesh: Another source of discrepancy is the fact that I generally use a coarser mesh (sometimes I try with 10 nm resolution, but it slows down the workflow quite dramatically).
As described in the paper, different brushes and design region sizes lead to a more or less challenging optimization problem. Increasing the design region and decreasing the brush size doesn't seem to improve the convergence significantly.
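To make the dB question above concrete, here is a small sketch of the two conversions. It assumes the S-parameters are complex field amplitudes, which is exactly the point the issue asks to confirm; the function names are illustrative and not part of ceviche-challenges.

```python
import jax.numpy as jnp


def amplitude_to_db(s):
    # Field (amplitude) quantity: 20 * log10(|S|).
    return 20.0 * jnp.log10(jnp.abs(s))


def power_to_db(p):
    # Power quantity |S|^2: 10 * log10(p).
    return 10.0 * jnp.log10(p)


s21 = jnp.array(0.5 + 0.5j)              # example complex amplitude
db_from_amplitude = amplitude_to_db(s21)
db_from_power = power_to_db(jnp.abs(s21) ** 2)
print(db_from_amplitude, db_from_power)
```

Both routes give the same value. Applying the factor 10 to an amplitude (or the factor 20 to a power) would halve (or double) every dB figure, which is the kind of discrepancy suggested above.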

Ideas for further improvement:

use AdamW to avoid stagnation because of low gradients for saturated latent space.
scale the transform by some factor 1<s<2 to make the transformed latent space even more similar to the generator output. From the paper:

Thus, the estimator can be seen as a differentiable approximation of the conditional generator. The success of this estimator is consistent with the finding in binary neural networks that estimators which approximate their forward-pass counterpart outperform simpler functions

Am I missing something?
Regards JD

Loss function

Hey @flaport, @Dj1312,
I am currently trying to get the loss function right.
Unfortunately, my convergence behavior is still very different from the paper and I suspect the culprit lies here:

s11 = jnp.abs(s_params[:, 0, 0])**2
s21 = jnp.abs(s_params[:, 0, 1])**2

s = jnp.stack((s11,s21))
g = jnp.stack((jnp.ones_like(s11),-jnp.ones_like(s21)))

t_s21 = 10**(-0.5/20)
t_s11 = 10**(-20/20)

target = jnp.stack((jnp.ones_like(s11)*(t_s11**2),jnp.ones_like(s21)*(t_s21**2)))
w_min = min(1-t_s21, t_s11)
L = jnp.sum( jax.nn.softplus(g*(s-target)/w_min)**2 )
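For experimenting with the expression above in isolation, it can be wrapped into a self-contained function. This is only a sketch: the function name and the dummy S-parameter arrays are assumptions, and the array shape (wavelengths, 1, 2) is inferred from the indexing in the snippet.

```python
import jax
import jax.numpy as jnp


def mode_converter_loss(s_params):
    # Power reflection and transmission per wavelength.
    s11 = jnp.abs(s_params[:, 0, 0]) ** 2
    s21 = jnp.abs(s_params[:, 0, 1]) ** 2

    s = jnp.stack((s11, s21))
    # Sign convention: penalize s11 above and s21 below its target.
    g = jnp.stack((jnp.ones_like(s11), -jnp.ones_like(s21)))

    t_s21 = 10 ** (-0.5 / 20)  # -0.5 dB amplitude threshold
    t_s11 = 10 ** (-20 / 20)   # -20 dB amplitude threshold

    target = jnp.stack((jnp.full_like(s11, t_s11**2),
                        jnp.full_like(s21, t_s21**2)))
    w_min = min(1 - t_s21, t_s11)
    return jnp.sum(jax.nn.softplus(g * (s - target) / w_min) ** 2)


# Dummy data (assumption): one wavelength, ideal vs. fully reflecting device.
good = jnp.array([[[0.0, 1.0]]], dtype=jnp.complex64)
bad = jnp.array([[[1.0, 0.0]]], dtype=jnp.complex64)
print(mode_converter_loss(good), mode_converter_loss(bad))
```

Because softplus is strictly positive, the loss stays above zero even for the ideal device; only its relative size is meaningful.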

Use `ceviche-challenges` for inverse design examples

Very cool to see this repository!

I see that the inverse design examples in the documentation are using the much older inverse design workshop code that I created when I was back at Stanford. Just as an FYI, along with the Inverse Design of Photonic Devices with Strict Foundry Fabrication Constraints paper, we did release a codebase that enables one to easily set up and call all of the design challenges we present in the paper. It's really easy to just import one of the prefab devices and throw it into a loss function. That codebase lives in google/ceviche-challenges and is also pip-installable as a package.

The only limitation of ceviche_challenges is that it uses HIPS autograd (like base Ceviche) and doesn't speak "JAX." The nice javiche package can be used to plug things into JAX. (See also jan-david-fischbach/javiche#1).

Fiddle with `setup.py` to make installable with a single command?

I'd love to be able to install this repo in a single command similar to pip install inverse-design. I have prepared a branch that tries to achieve that. However, I had to modify setup.py to get the local rust "sub-package" to install:

    install_requires = requirements + 
         [f"inverse_design_rs @ file://localhost/{current_directory}/rust#egg=inverse_design_rs"],

This doesn't seem to be the way it is supposed to be done...
Another option would be to separate the two into different git repos (and distribute both on pypi). However, that seems like a lot of effort to me.
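As a possible variation on the snippet above, the file URL can be built with pathlib instead of string formatting. This is a sketch under assumptions: the Rust sub-package is taken to live in a `rust` directory next to setup.py, and `"numpy"` is only a placeholder for the real requirements list.

```python
from pathlib import Path

# Assumption: setup.py is run from the repository root, which contains ./rust.
rust_dir = (Path.cwd() / "rust").resolve()

# PEP 508 direct reference of the form "name @ url".
rust_requirement = f"inverse_design_rs @ {rust_dir.as_uri()}"

requirements = ["numpy"]  # placeholder for the project's actual requirements
install_requires = requirements + [rust_requirement]
print(install_requires)
```

Whether pip accepts such a local direct reference in every install mode is not verified here, so the concern raised above still applies.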

My current install command

pip install git+https://github.com/Jan-David-Black/inverse_design_strict_fabrication.git@easy_install

It assumes rustc and cargo are installed on the system.

Using this command, we can also run our notebooks on Colab:
https://colab.research.google.com/github/Jan-David-Black/inverse_design_strict_fabrication/blob/easy_install/notebooks/11_ceviche_challenges.ipynb

Strict Symmetry

@flaport suggested in #1 that a symmetry constraint can be put on the design by adding the transformed latent design to its mirror image. As far as I can see, this only leads to an almost symmetric design (counterexample below). I believe this might arise from the fact that the touches are selected one after the other.
Any idea how to mitigate this effect?
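The symmetrization described above can be sketched as follows. The function name and the choice of mirror axis are assumptions; the actual symmetry axis depends on the device layout.

```python
import jax.numpy as jnp


def symmetrize(latent_t, axis=0):
    # Add the transformed latent design to its mirror image along `axis`.
    return latent_t + jnp.flip(latent_t, axis=axis)


latent_t = jnp.arange(12.0).reshape(4, 3)  # dummy latent design
sym = symmetrize(latent_t)

# The symmetrized latent is exactly mirror-symmetric ...
is_symmetric = bool(jnp.array_equal(sym, jnp.flip(sym, axis=0)))
print(is_symmetric)
```

This only guarantees symmetry of the generator input. The same `jnp.array_equal` check can be applied to the generator output to quantify how far the final design deviates from symmetry.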

PATH modifications

We still have quite a lot of

import sys; sys.path.insert(0, '..')

in the notebooks. With maturin in place now, these should no longer be required, as it installs the PyO3 bindings directly into the environment.

Suggestion:
get rid of the PATH manipulations :)

Potential features: Tidy3D backend? gdsfactory export?

Have you seen the Adjoint plugin in Tidy3D?
Seems quite interesting to me as an alternative to ceviche that also does 3D...
@flaport do you think it might make sense to look into that?
@joamatab should we also think about a way to easily export the resulting optimized geometry and simulation results (e.g. scattering parameters) to gdsfactory? Would you rather export the pixelized shape or a marching squares representation?
