henry2004y / batsrus.jl Goto Github PK
View Code? Open in Web Editor NEWBATSRUS/SWMF Data Processor
Home Page: https://henry2004y.github.io/Batsrus.jl/dev/
License: MIT License
BATSRUS/SWMF Data Processor
Home Page: https://henry2004y.github.io/Batsrus.jl/dev/
License: MIT License
matplotlib.tri.LinearTriInterpolator
is pretty slow. However, when I tried to replace that with either ScatteredInterpolations.jl or Dierckx.jl Spline2D
, I got OOM errors...
spl = Spline2D(vec(X), vec(Y), vec(W))
#Wi = [evaluate(spl, i, j) for j in yi, i in xi]
ERROR: OutOfMemoryError()
Stacktrace:
[1] Array
@ .\boot.jl:475 [inlined]
[2] Spline2D(x::Base.ReshapedArray{…}, y::Base.ReshapedArray{…}, z::Base.ReshapedArray{…}; w::Vector{…}, kx::Int64, ky::Int64, s::Float64)
@ Dierckx C:\Users\hyzho\.julia\packages\Dierckx\TDOyl\src\Dierckx.jl:775
[3] Spline2D(x::Base.ReshapedArray{…}, y::Base.ReshapedArray{…}, z::Base.ReshapedArray{…})
@ Dierckx C:\Users\hyzho\.julia\packages\Dierckx\TDOyl\src\Dierckx.jl:733
[4] top-level scope
@ REPL[31]:1
Related to #14
For some outputs, drawing streamlines may end up with an error in Matplotlib
ValueError("'x' values must be equally spaced")
File "C:\Users\hyzho\AppData\Local\Programs\Python\Python311\Lib\site-packages\matplotlib\__init__.py", line 1461,
where the actual check in Matplotlib says
if not np.allclose(np.diff(x), self.width / (self.nx - 1)):
raise ValueError("'x' values must be equally spaced")
A similar check in Julia would be
all(i -> isapprox(xrange[i+1] - xrange[i], (xrange[end] - xrange[1])/(length(xrange)-1)), eachindex(xrange)[1:end-1])
This is clearly a mismatch between Julia and Numpy, probably for Float32 numbers. Not a big deal, but the error messages are annoying. A quick workaround would be to set the plotinterval
value together with plotrange
.
Batsrus.jl/src/plot/utility.jl
Line 11 in a540abc
Currently getdata2d
depends on matplotlib to do interpolations. In the future we really need to figure out how to do it native Julia. Related to #14.
Similar issue as I had previously in Vlaisator.jl.
A while ago, the wrapper over Matplotlib makes it difficult to modify the plots afterwards, which especially causes problems when dealing with time series snapshots. The colorbar is so hard to fix. The solution is, instead of using level
, provide a range of points.
The current support of animation in Matplotlib is not good enough, especially for interactive plotting and scanning through multiple snapshots. Based on what I've seen, Makie does a better job in this category, so I make a ref to #21.
More recent versions improve on the wrapper dramatically. Animations with fixed colorbars may become feasible, just as I do in Vlasiator.jl where you manually create a colorbar with fixed range. I will work on this if new simulation results require analyzing, and until then we will decide which way to go
Data
!mmap
to do lazy data loading.meshgrid
if possible.We (@GaryQ-physics) are looking for a way to convert *.out files to VTK files with the appropriate connectivity.
Does this library do this?
In WriteVTK v0.14.3, JuliaVTK/WriteVTK.jl#106 the connectivity and offsets for unstructured grids have been changed from Int32
to Int64
. This causes test failure because we compare sha256 for the generated VTU files.
I am asking if they want to support the old behavior, or just keep the current behavior.
Tecplot and VTK unstructured data formats have the same connectivity ordering for hexahedron, but different ordering for voxel (in VTK). A function swaprows
is implemented to switch the orderings.
I encountered a very bad problem of corrupting binary *.vtu files in the early days. It turned out that the issue is the starting position of data is wrong because of the way I skip the header AUXDATA part. Sometimes the binary numbers may contain newline character that confuses the reader. It is now fixed.
Later on the reading of the header part was completely rewritten to provide better support for a variety of Tecplot ASCII headers.
All the AXUDATA information is now stored into global VTK data.
So far there is no single use case where I need to read multiple files together. If you want to do so, just feed readdata
with one string of filename at a time. By eliminating the support for multiple files, the function's internal structure is greatly simplified.
In BATSRUS,
dxSavePlot = -1
=> unstructured grid, cuts are saved point-wise with no grid informationdxSavePlot = 0
=> structured grid at the finest cell level. For AMR 2D cuts, this will introduce many duplicate points such that the output file size is about 3 times larger than the unstructured equivalent.In our plotting routines, it takes significantly longer time (~20x for ~100 MB data) to reconstruct the mesh given unstructured outputs. Streamline plotting is working, though.
Streamline plotting for AMR structured grid is not working.
The VTK files does not have timestep information. To allow for further time series processing in Paraview, a script create_pvd.jl
is provided for generating the pvd
container that stores the filenames and timestamps calculated from the filenames. This then becomes the new function create_pvd
in vtk.jl
.
After dicussing with the author of WriteVTK.jl, the timestamp information is now added to each file through a scalar that is not related to grid in VTK, i.e. "global data" in the VTK world.
Currently, multi-block (VTM), rectilinear (VTR), and unstructured (VTU) conversions are supported.
By default the file size will be reduced with compression level 6, but the actual compression ratio depends on the original data.
Currently, BATSRUS output contains only cell center coordinates and cell center values (referring as tcp
in PARAM.in
), or node coordinates and interpolated nodal values (referring as tec
). It is recommended to save in the tcp
format to avoid interpolation. In principle VTK also supports the combination of node coordinates and cell center values, but it is not necessary here.
The simple multiblock VTK format conversion can only deal with data within a block, but it cannot figure out the connectivities between neighboring blocks. To fill the gap between blocks, we have to retrieve the tree data stored in .tree
files. This requires a in-depth understanding of the grid tree structure in BATSRUS, and it took me a whole week to think, write and debug the code!
Information contained in iTree_IA
:
Basically, it requires several steps:
Several issues worth noticing:
ModWriteTecplot
implementation,
dxPlot⩾0
) will generate duplicate points for 2D runs after postprocessing. I don't fully understand why.dxPlot<0
) can generate correct point coordinates, but the output points in postprocessing are sorted based on x+e*y+e^2*z
to allow removing duplicate points. This is mostly useful for 2D cuts of 3D simulations where duplicate points are possible to exist, and should be merged/averaged. This also means that it is literally impossible to figure out the original cell ordering, and impossible to convert to 2D VTU format (unless you construct some new connectivities based on triangulation)! However, it guarantees that the cut outputs from parallel runs are identical.ModWriteTecplot
that will generate duplicate voxel grid. For example, imagine a coarse block which is surrounded by refined blocks all around. On edge 8, the two refined face neighbor will both think they are in charge of writing the connectivity for the cells around the edge. This does not cause any serious issues as of now, so I don't fix it. Speaking of fixing that, there are two ways:
fillCellNeighbors!
not only deals with coarsened neighbors, but also the refined neighbors. This means that literally we need to implement the message_pass_cell
completely. We still do a two-round process, write everything without checking, and in the end before IO remove the duplicate rows. This may be even faster then implementing some complicated writing rules to avoid duplicating.ParaView allows for custom Python reader. Examples can be found in Chapter 12.4 in the official manual, and an example of full Python plugins can be found at Kiware's gitlab page.
The XML package not only provide writing into XML files, but also reading XML structures. Therefore, if you want you can create a VTK reader.
However, it would be even better to write VTK directly from BATSRUS! In the future versions of BATSRUS, we should be able to output VTK files directly with VTKFortran. I won't do it now.
When I want to add a test for Data
type IO, the returned string is always empty. This must be a bug related to how I handle files, but I have no clue what it is...
Right now the derived quantity plots are not supported. In order to achieve this, I may need:
getvar(data::Data, var::String)
returning the derived variableThe first one is achieved by a trick I found on discourse, which basically identifies symbols as names to members in a struct.
This test feature is not ideal and will be dropped in later versions. This looks like the Python Calculator in ParaView.
I don't know how ParaView achieve this, but in Python it is pretty easy to parse a string and execute it as some new codes using exec
function, as Yuxi did in his interface to the yt library. Julia has equivalent metaprogramming capabilities, but there are many restrictions to apply it in practice so it is generally adviced to avoid evaluating expressions inside functions during runtime.
Another option is to create functions for derived variables. This is actually pretty good both in terms of performance and conciseness. The idea is to create a dictionary of derived variable names as keys and anonymous functions as values, and if the input string is found in the dictionary, call the corresponding function to obtain the value.
This has been successfully tested in my new scripts for processing Vlasiator outputs, and can be directly ported here for BATSRUS.
Makie is still very hard to use with limited examples and resources as of April 2021. I will only put more effort on this if they keep improving the APIs. Currently I only provide a minimal working example without using recipes.
I encountered yet another issue with Makie compatibility....
BATSRUS support HDF5 outputs, so we should also support reading those files.
There is a extremely powerful user recipe in Plots.
gr()
backend. The GR backend contour plot only accept vector x,y!u"km/s"
notation, we just need to use bu"amucc"
. Now this part becomes a submodule within Batsrus.jl, mainly because a registered package cannot depend on an unregistered package.I was enlightened by the way SpacePy handles files. Instead of a separate header and data array, it may be better to build a more contained struct.
Also, the header could use NamedTuple
instead of Dict
.
These are all implemented in newer versions.
Demos are provided for calling Matlab/Python directly from Julia for debugging and testing. This part will later be separated out for potential Python and Matlab users. Currently the plotting and interpolation needed during plotting are done in Python. For instance, the 3D scatterred interpolation is done via Interpolate
in Scipy. Hopefully these additional dependencies will be cut down.
In the current version of PyCall
and PyJulia
, there is already direct support for accessing Julia struct objects (noted as jlwrap
).
To overcome the dependency issue on Python, I take advantage of Requires.jl
. The idea is that the plotting related functions are only loaded when PyPlot
is imported.
Time to use artifacts for test data to reduce the package size for end users.
We can refactor the IO part by taking advantage of FortranFiles.jl. Several potential issues:
mmap
.Currently the 2D plot for unstructured data is done through triangulation and interpolation provided by Matplotlib: https://matplotlib.org/stable/api/tri_api.html#matplotlib.tri.TriInterpolator. I want to find a substitution of triangulation and interpolation in Julia.
There are several options in Julia, but I haven't found a good solution yet. I've looked at
These two are pretty similar, but neither of them support linear scatter interpolations, and they both suffer from the OOM issue.
Triangulate.jl is a wrapper over another famous triangle mesh generator. I don't know if it supports linear scatter interpolations.
GridInterpolation.jl is another option, but the so-called simplex grid implementation suffers from performance issues.
This is the first time I use Julia for reading general ascii/binary files. It was a pain at first due to the lack of examples and documents using any basic function like read/read!, but fortunately I figured them out myself. One trick in reading binary array data is the usage of view, or subarrays, in Julia. In order to achieve that, I have to implement my own read!
function in addition to the base ones.
Before v0.5.1 (the tag being used when the package was not registered), readdata
function in MATLAB for large data is 2 times faster than that in Julia. The reason is simply using read
or unsafe_read
in the lower level. The latter one is much faster. After the fix, Julia version performs 5 times faster than the Matlab version in reading binary data.
After Julia 1.4, reading views of an array is supported by the base read
function, so there is no need to dispatch for a customized version. This part has been removed from io.jl
since I don't expect people use an older version of Julia anymore.
The yt Project in Python is a much more mature postprocessing package than mine. That's a good place to learn.
Off-axis slices: very similar to ParaView and other 3D visualization tools provide, make a 2D cut at any direction. This requires a robust interpolation scheme.
Projection: it is like an integral form of a quantity in a certain direction, and can be weighted or unweighted. It is not a necessary feature, as one can easily implement this once he or she knows how to do make off-axis slices.
Same interface for both structured and unstructure data: this is important for a larger user base.
Mesh plotting: they provide a method for adding mesh on top of plots. I haven't had a good implementation of this yet.
Center: it would be good to give the option of choosing the center of plots.
Units: they build a unit system using dictionary themselves. I don't want to reinvent the wheel, so I will just keep an eye on Unitful.jl
and yt. What is inherited from IDL in Batsrus.jl
needs rewritten.
Property control: I don't like the way yt handles figure properties like colorbar, axes, etc. Matplotlib already teaches you how to set these things, and yet again in yt they wrap everything with names they come up with. This is a more of a burden instead of benefit. They do certain things right like set_cmap
, but not all of them. The section Further Customization via Matplotlib
is all I want to see. Using popular libraries generally means that new comers won't take long to get used to the interfaces, and developers can have a eaiser time developing new features.
Log scale: symlog for data containing both positive and negative data, and switch to linear scale for small values.
There is a unit package in Julia unitful for handling units.
Take a look at that one if you really want to solve the unit problem.
On top of that, UnitfulRecipes.jl provides integrity with Plots.jl. The ideas of how to make abstractions is more important than the unit conversion itself.
Eventually, I want this to replace the old way of utilizing units. This is now partially complete, but not fully tested.
Vector naming is messed up if you are using Tecplot VTK reader. For example, "B [nT]" --> "B [nT]_X", "B [nT]_Y", "B [nT]_Z". Not a big issue, but annoying.
Related to Issue #20, we may also want a true user recipe in Makie.
MakieLayout is a nice extension built on top of Makie to create publication quality figures and interactive plots. It basically includes all the funcationalities I want, so definitely worth a try.
Now in recent version this is merged into the main package, and they are working on unified APIs. Things will improve, but at this point I feel that many things are so hard to make them right without the main developers' help. Clearly they are lack of documentations and examples. Let's see how it goes.
As for the GUI development, GTK seems to be an ideal candidate. However, the GTK interface in Julia lacks full support for the toolkit, which makes it a little bit hard to use. I have only played with it for half a day. You can design the appearance of your window interactively, and save your in an HTML-like file.
Makie is actually good at this, with the underlying OpenGL support.
At this point GUI is not necessarily needed, if it does not speed up my own workflow.
@JuliaRegistrator register()
Something went wrong.
When doing processing in batch mode on a cluster, there's usually no need to render the plots on screen. There exists such a backend for this purpose:
using PyPlot
PyPlot.matplotlib.use("Agg")
However, notice that currently Agg backend does not support draw_artist. For example, you cannot add an anchored text to your figure.
Right now I use this in tests.
I have made several rounds of mistakes related to array storage ordering, specifically meshgrid
vs ndgrid
. Even though some Matplotlib functions like contourf
support 1D grid arrays, it is very easy to mess it up when other conditions are handling 2D ndarrays from the interpolation package Dierckx.
Therefore I decided to eliminate passing 1D arrays and always use 2D grid arrays. The list comprehension version of meshgrid
is
X = [x for _ in y, x in x]
Y = [y for y in y, _ in x]
while the matrix multiplication version is
X = x' .* ones(length(y))
Y = ones(length(x))' .* y
Based on my test results, the list comprehension version is consistently faster and consumes less memory than the matrix multiplication version. I have now replaced all the occurrences with a function called meshgrid
.
This issue is used to trigger TagBot; feel free to unsubscribe.
If you haven't already, you should update your TagBot.yml
to include issue comment triggers.
Please see this post on Discourse for instructions and more details.
If you'd like for me to do this for you, comment TagBot fix
on this issue.
I'll open a PR within a few hours, please be patient!
See the discussion in MakieOrg/Makie.jl#3157. Currently streamplot
in Makie only support function input.
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.