Comments (37)
I went a little bit crazy today and re-wrote the generator in rust (using arrayfire). It should be a drop-in replacement (after compilation). It's about 10 times faster for a 30x30 grid but gets progressively faster (relatively speaking) the larger the grid becomes.
code-wise it's still highly unoptimized I think (I basically just translated my crappy python code), but it's pretty cool to see an actual substantial performance increase. Feel free to check it out... :)
from inverse_design.
Hi!
I took a look at the inverse_design repo this summer after @flaport shared it. Unfortunately, I was very busy with some nanofabrication.
Now, I am more available and determined to make it work. So @Jan-David-Black I hope to be able to help you.
I just finished setting up my env to have Jax-GPU enabled on my desktop. If you want to do some testing, I can already do it. I may also have an idea to speed up the convolutions, I'll do some tests tomorrow.
Regards,
Lucas :)
from inverse_design.
one way to enforce symmetry is to add your transformed latent matrix with a symmetric version of itself.
I would assume that adding it to its transpose then leads to symmetry along a diagonal?
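A quick sketch with a random placeholder latent (the variable names are illustrative, not the repo's actual code) confirms both variants:

```python
import numpy as np

rng = np.random.default_rng(0)
latent = rng.normal(size=(8, 8))  # placeholder for the transformed latent

# Adding the transpose enforces symmetry across the main diagonal:
diag_sym = latent + latent.T

# Adding the up-down flip instead enforces mirror symmetry about the
# horizontal axis:
mirror_sym = latent + latent[::-1]
```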
from inverse_design.
I was able to improve the rust algorithm performance even more:
It's nice to see a linear behavior between # pixels and computation time :)
from inverse_design.
Yeah, for me it took about 10 seconds, but let's be fair... that's still too long.
That's why today I chose to completely rewrite the generator from scratch (still in Rust, but without arrayfire). The new version is 'blazingly fast': 128x128 takes about 200ms.
It's a new implementation which only works in the vicinity of the brush and does not use any convolutions, and therefore scales a lot better with the size of the grid.
Moreover, I no longer use arrayfire and hence it should be easier to compile for you :)
The new version was force-pushed to master (sorry if you pulled already!); the old version is still in the rust-v1 branch.
from inverse_design.
Yeah, install rust and cargo first with a tool like https://rustup.rs. Then you can build the repo with a 'cargo build' when you're inside the rust folder. That build command creates the .so file needed for importing in Python too (it should appear somewhere in the 'target' folder created by cargo).
But once you confirmed cargo works, you can just run a 'make lib' in the root of the repo. That will move the .so file to a more convenient location.
That said, on Windows the file Rust builds might have a different extension, so we might have to make the make commands more cross-platform.
from inverse_design.
I did it exactly the same way 👍
from inverse_design.
Here is an initial performance graph
I don't like how the algorithm scales with brush size at all, but I'm sure there are tons of optimizations possible still. Not to mention that everything works on a single thread currently. Maybe there are parallelizations possible too.
from inverse_design.
For reference, I end up with 7 seconds for 128x128 when using a naive numpy and scipy.ndimage.convolve implementation.
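For context, a naive convolution-based dilation of that kind might look roughly like this (the brush helper and names are my guesses, not the notebook's actual code):

```python
import numpy as np
from scipy.ndimage import convolve

def circular_brush(diameter):
    # Boolean disk-shaped brush of the given pixel diameter.
    r = diameter / 2
    y, x = np.mgrid[:diameter, :diameter]
    return (x - r + 0.5) ** 2 + (y - r + 0.5) ** 2 < r ** 2

def dilate(touches, brush):
    # Global dilation: every touched pixel is expanded by the brush.
    # One full-grid convolution per call is what makes this slow.
    return convolve(touches.astype(np.uint8), brush.astype(np.uint8),
                    mode="constant") > 0

touches = np.zeros((128, 128), dtype=bool)
touches[64, 64] = True
dilated = dilate(touches, circular_brush(9))
```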
from inverse_design.
Hey @Jan-David-Black , that's pretty awesome! It's been a while since I worked on this so I don't fully remember the details, but I do remember that the generator took too long (although I don't think it took 15 min for me). I was gonna give it another shot another day but clearly that didn't happen...
Would you mind if I replace my implementation with yours in this repo? If you agree, maybe you can add the notebook you linked as a PR for proper attribution?
Anyway, thanks for pointing to a better solution :)
from inverse_design.
@flaport I don't quite understand your code in 07_inverse_design.ipynb. You seem to define a forward function, but in the optimization scheme it is not used. When I try to naively "plug it in" where it seems to belong, I run into issues with autograd and jax not playing nice.
from inverse_design.
Yep, this repository is one of my many unfinished projects 😅
from inverse_design.
I seem to have gotten it to work: https://jan-david-black.github.io/inverse_design_strict_fabrication/notebooks/inverse_design_local.html It is still quite rough around the edges: because the outer regions are just masked away, the fabrication constraints are violated in some parts around the design region border. I mostly struggled with differing versions of nbdev :/
The generator speed is still impacting the overall runtime quite heavily, especially as it scales worse than linearly with pixel count at the moment (most parts already scale linearly, just some bits are missing).
from inverse_design.
Yeah, this repository uses nbdev<2. The more recent version is incompatible with the old one...
Thanks for getting it to work! I'll check it out soon :)
Are you using your generator or mine? Maybe if I ever find some time I might rewrite the generator in some faster language.
from inverse_design.
I am running my generator, which I have extended to use local dilations wherever easily possible. I believe Jax might be used to speed up the local dilations using GPU acceleration. I cannot test that at the moment, however, as I do not have a suitable GPU at hand.
I am still struggling a bit with finding resolving touches after free touches have been applied (I think it makes sense to just calculate them once after all free touches, but I still have to implement finding the extent for the local dilation, as it depends on the positions of all free touches). Apart from that, I still have a crappy implementation of the heatmap for selecting the next touch (currently involving a convolution in every step). Here is some timing analysis of the current implementation:
The lower graph is summarized by its sum as "linear_ops" in the upper one. "Resolving" in the upper graph stands for calculating the resolving touches after free touches. It coincides almost perfectly with the time needed for global dilations. A brush of diameter 9 is used.
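For what it's worth, the local-dilation idea can be sketched like this: apply the brush only in a window around the touch position instead of convolving the full grid (all names here are illustrative, not my actual implementation):

```python
import numpy as np

def circular_brush(diameter):
    # Boolean disk-shaped brush of the given pixel diameter.
    r = diameter / 2
    y, x = np.mgrid[:diameter, :diameter]
    return (x - r + 0.5) ** 2 + (y - r + 0.5) ** 2 < r ** 2

def local_dilate(grid, brush, i, j):
    # Stamp the brush onto `grid` around touch (i, j), clipped to the
    # grid borders. Cost is O(brush area) per touch, independent of
    # the grid size.
    d = brush.shape[0]
    r = d // 2
    i0, j0 = max(i - r, 0), max(j - r, 0)
    i1 = min(i - r + d, grid.shape[0])
    j1 = min(j - r + d, grid.shape[1])
    # Matching slice of the brush for touches near the border:
    bi0, bj0 = i0 - (i - r), j0 - (j - r)
    grid[i0:i1, j0:j1] |= brush[bi0:bi0 + (i1 - i0), bj0:bj0 + (j1 - j0)]
    return grid
```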
from inverse_design.
I have published a small package here: https://pypi.org/project/javiche/ that provides a wrapper decorator to easily use ceviche autograd differentiable functions with jax. I would like to make it a dependency going forward, to beautify/clean up the interaction between jax and ceviche in the inverse design notebooks a bit. Would that be ok with you @flaport ?
from inverse_design.
Sure! Go ahead :)
from inverse_design.
Wow really cool!! Still need to try it out. As far as I can see you still use global dilations, correct? So we might speed it up further using local dilations?
from inverse_design.
I have been trying to get your rust implementation to work on my M1 mac. Somehow I can only get the arrayfire binaries to work from within an x86-compiled program. Therefore I have to run it behind Rosetta, which I believe tanks the performance in this case: for an exemplary 128x128 map the rust generator takes approximately 77s. Could you maybe report the time it takes on your system?
from inverse_design.
Awesome!
I actually have never used Rust. To give it a try, do I simply need to run a make build within the Rust part of the repo?
from inverse_design.
How about using maturin?
from inverse_design.
I took some time to really dig deeply into your code @flaport.
I just have a question about the Python part which can be a bottleneck:
inverse_design/inverse_design/design.py, lines 97 to 109 in 5abf24d
Why do you not use a bool mask instead of this function?
I am asking this just in case, but since you were able to cut the time so drastically, I don't think it will be useful (except if you use a similar function in Rust!).
(I also took a look at the Rust part, but I am not really familiar with this language, so I prefer not to comment on it.)
:)
from inverse_design.
How about using maturin?
Cool! I will have to look into this :)
Why do you not use a bool mask instead of this function?
Yeah... that function is horrible. I even forgot what I was doing there...
I think (apart from that function) the python code is an ok implementation if you want to understand what's going on but the implementation is pretty horrible. Too many global convolutions (dilations) and re-calculations.
The new implementation is much better... it does not re-compute anything, and when it needs to dilate anything it stays close to the brush touch (I think that's what @Jan-David-Black meant by local dilations?).
That said, I understand it's a lot more difficult to understand what's going on in an unfamiliar language, but rust is pretty cool... re-writing parts of existing python code in Rust is my preferred way to learn it better :)
from inverse_design.
The new implementation is much better... it does not re-compute anything, and when it needs to dilate anything it stays close to the brush touch (I think that's what @Jan-David-Black meant by local dilations?).
Yep, exactly!
It is also what I had started in my "local generator" in python, which takes ~2s for 128x128 even though I still use global dilations in some places...
I got your new rust-generator to work much more easily (no more arrayfire 🎉). I'll open a pull request to share my maturin configuration, which also makes adding paths and copying around files obsolete :)
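For reference, the core of a maturin setup is small; a minimal sketch (crate name, module name, and paths here are placeholders, not necessarily what the PR will use):

```toml
# pyproject.toml (placeholder names; adapt to the repo layout)
[build-system]
requires = ["maturin>=1.0"]
build-backend = "maturin"

[project]
name = "inverse_design_rs"
requires-python = ">=3.8"

[tool.maturin]
# Path to the crate's Cargo.toml if it lives in a subfolder:
manifest-path = "rust/Cargo.toml"
```

With that in place, `maturin develop` builds the extension and installs it straight into the active environment, so no manual copying of `.so`/`.pyd` files is needed.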
from inverse_design.
I got your new rust-generator to work much more easily (no more arrayfire 🎉). I'll open a pull request to share my maturin configuration, which also makes adding paths and copying around files obsolete :)
Awesome!
One caveat is that the current version of my algorithm scales pretty badly with brush size. But at least it's better than scaling with the grid size I think.
from inverse_design.
Maybe we could make a small benchmark to see the evolution as a function of size (grid and brush?).
TBH we really need speed improvements! You rock :)
It took me less than 20s for a latent design generation of Figure 6 size (640px with a circular_brush of 10px).
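A tiny benchmark harness along those lines could look like this (the `generate(latent, brush_diameter)` signature is an assumption; plug in the actual generator):

```python
import time
import numpy as np

def benchmark(generate, sizes=(64, 128, 256), brush_diameters=(5, 9, 13)):
    """Time `generate(latent, brush_diameter)` over grid and brush sizes.

    The generator signature is assumed; adapt it to the real function.
    """
    rng = np.random.default_rng(0)
    results = {}
    for n in sizes:
        for d in brush_diameters:
            latent = rng.normal(size=(n, n))
            t0 = time.perf_counter()
            generate(latent, d)
            results[(n, d)] = time.perf_counter() - t0
    return results
```

Plotting `results` against the pixel count for each brush diameter would show both scalings at once.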
That said, I understand it's a lot more difficult to understand what's going on in an unfamiliar language, but rust is pretty cool... re-writing parts of existing python code in Rust is my preferred way to learn it better :)
Learning Rust was on my TODO list, but you just showed me how cool it is!
from inverse_design.
I have also added a notebook that uses ceviche_challenges in the optimization loop here.
So far the optimization is quite bad...
Two things we might add to the generator:
- Initial touches: I use a set of solid and void touches at the start of the generation process to ensure the fabrication constraints are also obeyed at the borders and waveguide ports (e.g. here)
- Symmetry constraints: I still need to figure out how to do this best
from inverse_design.
@flaport if you have some spare time: do you plan to add some comments to your Rust code to explain the main idea of the different parts? (I think you use a big brush and then a very big brush on the voxels of the big brush.)
-> It is more about understanding the algorithm and its possible caveats rather than the syntax.
from inverse_design.
Yes, I'll try to add some comments soon.
To answer your question related to the brushes:
- the brush is used to dilate touches
- the big brush (the brush convolved with itself) is used to find invalid regions
- the very big brush (ideally the big brush convolved with the brush) is used as a search area for free and resolving touches.
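If I read that right, the hierarchy can be sketched with plain convolutions (the construction in the Rust code may differ; the brush helper is illustrative):

```python
import numpy as np
from scipy.signal import convolve2d

def circular_brush(diameter):
    # Boolean disk-shaped brush of the given pixel diameter.
    r = diameter / 2
    y, x = np.mgrid[:diameter, :diameter]
    return (x - r + 0.5) ** 2 + (y - r + 0.5) ** 2 < r ** 2

brush = circular_brush(5)  # dilates touches

# "Big brush": the brush convolved with itself (finds invalid regions).
big_brush = convolve2d(
    brush.astype(int), brush.astype(int), mode="full") > 0

# "Very big brush": the big brush convolved with the brush
# (search area for free and resolving touches).
very_big_brush = convolve2d(
    big_brush.astype(int), brush.astype(int), mode="full") > 0
```

Each convolution with `mode="full"` grows the support by `diameter - 1` pixels per side, so a 5px brush yields a 9px big brush and a 13px very big brush.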
from inverse_design.
I have also added a notebook that uses ceviche_challenges in the optimization loop here. So far the optimization is quite bad... Two things we might add to the generator:
I will try to make an example tomorrow also using ceviche_challenges.
* Initial touches: I use a set of solid and void touches at the start of the generation process to ensure the fabrication constraints are also obeyed at the borders and waveguide ports ([e.g. here](https://jan-david-black.github.io/inverse_design_strict_fabrication/notebooks/inverse_design_local.html))
* Symmetry constraints: I still need to figure out how to do this best
Cool to see you already did what you explained to me on Friday!
For the symmetries, it won't be easy...
- The idea of forcing touches on the symmetry boundary won't work, since it will avoid or force one type...
- With a symmetry, a half brush size on the boundary will be allowed, but it won't be allowed in a half-domain optimization...
from inverse_design.
Thanks for the explanations!
* the very big brush (ideally the big brush convolved with the brush) is used as a search area for free and resolving touches.
This is the point where I think it may be possible to make use of a mask; I will try to investigate.
from inverse_design.
I currently have the following problem. Maybe you know a quick fix:
My fork has diverged from the current state of this repo quite a bit. I would like to make a pull request for some small changes to incorporate maturin. Unfortunately I cannot just make a second clean fork... What do I do?
from inverse_design.
Yeah, that was my mistake. This is a public repo and I treated it as one of my private ones by force pushing something.
Maybe just make the PR anyway and tell me which files you're interested in. I might try to solve it that way.
from inverse_design.
Symmetry constraints: I still need to figure out how to do this best
one way to enforce symmetry is to add your transformed latent matrix with a symmetric version of itself.
For example, using this
latent_t = latent_t + latent_t[::-1]
yields the following symmetric design:
from inverse_design.
Symmetry constraints: I still need to figure out how to do this best
one way to enforce symmetry is to add your transformed latent matrix with a symmetric version of itself.
I fully agree, but in this case we won't gain any time, unfortunately.
from inverse_design.
As we have a quite fast generator implementation by now, I will close this issue.
Feel free to open other issues for the other suggestions made here.
from inverse_design.
Very nice improvements!
from inverse_design.