bnn-upc / ignnition
Framework for fast prototyping of Graph Neural Networks
License: Apache License 2.0
If the model_description references a custom normalization function that is not actually defined, the program does not raise any error and simply finishes.
2020-11-19 15:44:22.543203: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1402] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 2985 MB memory) -> physical GPU (device: 0, name: GeForce GTX 970, pci bus id: 0000:01:00.0, compute capability: 5.2)
2020-11-19 15:44:22.545986: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x1f956720cb0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-11-19 15:44:22.546151: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): GeForce GTX 970, Compute Capability 5.2
Starting the training and evaluation process...
---------------------------------------------------------------------------
Number of devices: 1
Process finished with exit code 1
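For reference, a minimal check (assuming TensorFlow 2.x, independent of ignnition) that lists the devices TensorFlow can see, matching what the device log above reports:

import tensorflow as tf

# Assuming TensorFlow 2.x: print the GPUs and CPUs visible to TensorFlow.
print(tf.config.list_physical_devices('GPU'))
print(tf.config.list_physical_devices('CPU'))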
Thanks for sharing such a helpful platform that frees us from the complexity of implementation.
Can I use two readout functions for two different nodes and get a loss from these two functions? For example, in RouteNet, can I read link and path states with two readout functions?
Thanks a lot!
Yang
I am getting the following error at the start of a new epoch, even though training runs without TensorFlow errors:
Epoch 3/3
There was an unexpected error:
The entity "link" was used in the model_description.json file but was not defined in the dataset. A list should be defined with the names (string) of each node of type link.
E.g., "link": ["n1", "n2", "n3" ...]
Please make sure that all the names used for the definition of the model are defined in your dataset. For instance, you should define a list for:
1) A list for each of the entities defined with all its nodes of the graph
2) Each of the features used to define an entity
3) Additional lists/values used for the definition
4) The label aimed to predict
---------------------------------------------------------
100/100 [==============================] - 1s 9ms/step - val_loss: 0.5586 - val_mean_absolute_error: 0.3851 - val_mean_absolute_percentage_error: 144.1184 - sample_num: 200.0000
The documentation for the epoch_size training parameter says that if it is left blank, the entire dataset is treated as one epoch.
https://ignnition.org/doc/train_and_evaluate.html#epoch-size
However, leaving it blank produces the following error:
ValueError: When providing an infinite dataset, you must specify the number of steps to run (if you did not intend to create an infinite dataset, make sure to not call repeat() on the dataset)
This seems to be a data-generator-related issue; see the TF docs: https://www.tensorflow.org/api_docs/python/tf/data/Dataset#repeat
Looking at the source code, it seems that the data generator's repeat is hardcoded to True:
ignnition/ignnition/ignnition_model.py
Line 761 in 905e4aa
Right now, leaving epoch_size blank does not give the desired behaviour when training.
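For reference, a minimal sketch in plain TensorFlow (independent of ignnition) that reproduces the behaviour:

import tensorflow as tf

# A dataset with repeat() is infinite, so Keras needs steps_per_epoch to know
# when an epoch ends; omitting it raises the ValueError quoted above.
ds = tf.data.Dataset.from_tensor_slices(
    (tf.random.normal([100, 4]), tf.random.normal([100, 1]))).batch(10).repeat()

model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
model.compile(optimizer='adam', loss='mse')

model.fit(ds, epochs=2, steps_per_epoch=10)  # works
model.fit(ds, epochs=2)                      # raises the ValueError quoted above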
Update this part:
wget 'https://github.com/BNN-UPC/ignnition'
pip install -r requirements.txt
python setup.py install
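For comparison, a hedged sketch of what a working from-source install might look like (an assumption based on the commands quoted above, not the official instructions):

# Clone the repository instead of fetching the page with wget, then install.
git clone https://github.com/BNN-UPC/ignnition
cd ignnition
pip install -r requirements.txt
python setup.py install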
Hello,
I'm facing an issue when running ignnition with the aggregation type parameter "ordered" (it works fine with min, max, etc.). I'm receiving this message:
ValueError: Input tensor 'ignnition_model/states_creation/actions/build_state0/concat_1:0' enters the loop with shape (1, 32), but has shape (None, 32) after one iteration. To allow the shape to vary across iterations, use the shape_invariants
argument of tf.while_loop to specify a less-specific shape.
Where can I change this parameter?
Thanks!
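For context, a minimal TensorFlow sketch (independent of ignnition) of what the error message refers to: a tf.while_loop whose loop variable changes shape across iterations needs shape_invariants.

import tensorflow as tf

# A while_loop whose loop variable grows across iterations fails unless
# shape_invariants declares the varying dimension as unknown (None).
x0 = tf.zeros([1, 32])
cond = lambda i, x: i < 3
body = lambda i, x: (i + 1, tf.concat([x, tf.zeros([1, 32])], axis=0))
i, x = tf.while_loop(
    cond, body, [tf.constant(0), x0],
    shape_invariants=[tf.TensorShape([]), tf.TensorShape([None, 32])])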
Hello,
Is there currently a way to use the GPU instead of the CPU?
I started training a GNN using the following example Git project: https://github.com/BNN-UPC/GNN-NIDS
using the function model.train_and_validate()
and receive the following output:
Hello,
I am currently working with ignnition models, and I am interested in performing hyperparameter optimization for them. I was wondering if it is possible to do so and if there are any examples available that could help me get started.
Thank you for your time, and any help you can provide would be greatly appreciated!
Ognjen
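For what it's worth, a hedged sketch of a brute-force search built around ignnition's public API (ignnition.create_model / model.train_and_validate, used elsewhere in this thread). The "learning_rate" key is a hypothetical placeholder; substitute whichever options your train_options.yaml actually exposes.

import yaml
import ignnition

# Hypothetical grid search: rewrite the training options, rebuild the model,
# and train once per candidate value.
for lr in [1e-3, 1e-4, 1e-5]:
    with open('train_options.yaml') as f:
        options = yaml.safe_load(f)
    options['learning_rate'] = lr  # hypothetical key, adjust to your config
    with open('train_options.yaml', 'w') as f:
        yaml.safe_dump(options, f)
    model = ignnition.create_model(model_dir='./')
    model.train_and_validate()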
After training correctly, I am receiving the following error:
NotImplementedError: Saving the model to HDF5 format requires the model to be a Functional model or a Sequential model. It does not work for subclassed models, because such models are defined via the body of a Python method, which isn't safely serializable. Consider saving to the Tensorflow SavedModel format (by setting save_format="tf") or using save_weights.
The full traceback is the following:
Traceback (most recent call last):
File "main_ignnition.py", line 7, in <module>
model.train_and_validate()
File "[...]/.pyenv/versions/miniconda3-latest/envs/py38/lib/python3.8/site-packages/ignnition/ignnition_model.py", line 678, in train_and_validate
self.gnn_model.fit(train_dataset,
File "[...]/.pyenv/versions/miniconda3-latest/envs/py38/lib/python3.8/site-packages/tensorflow/python/keras/engine/training.py", line 1229, in fit
callbacks.on_epoch_end(epoch, epoch_logs)
File "[...]/.pyenv/versions/miniconda3-latest/envs/py38/lib/python3.8/site-packages/tensorflow/python/keras/callbacks.py", line 435, in on_epoch_end
callback.on_epoch_end(epoch, logs)
File "[...]/.pyenv/versions/miniconda3-latest/envs/py38/lib/python3.8/site-packages/tensorflow/python/keras/callbacks.py", line 1369, in on_epoch_end
self._save_model(epoch=epoch, logs=logs)
File "[...]/.pyenv/versions/miniconda3-latest/envs/py38/lib/python3.8/site-packages/tensorflow/python/keras/callbacks.py", line 1433, in _save_model
self.model.save(filepath, overwrite=True, options=self._options)
File "[...]/.pyenv/versions/miniconda3-latest/envs/py38/lib/python3.8/site-packages/tensorflow/python/keras/engine/training.py", line 2111, in save
save.save_model(self, filepath, overwrite, include_optimizer, save_format,
File "[...]/.pyenv/versions/miniconda3-latest/envs/py38/lib/python3.8/site-packages/tensorflow/python/keras/saving/save.py", line 139, in save_model
raise NotImplementedError(
NotImplementedError: Saving the model to HDF5 format requires the model to be a Functional model or a Sequential model. It does not work for subclassed models, because such models are defined via the body of a Python method, which isn't safely serializable. Consider saving to the Tensorflow SavedModel format (by setting save_format="tf") or using save_weights.
Is there a parameter that must be defined to solve this issue?
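For what it's worth, a minimal sketch (plain Keras, not ignnition's internal model object) of the two workarounds the error message itself suggests:

import tensorflow as tf

# A generic subclassed Keras model, standing in for the subclassed model that
# ignnition builds internally (assumption: not ignnition's actual class).
class Tiny(tf.keras.Model):
    def __init__(self):
        super().__init__()
        self.dense = tf.keras.layers.Dense(1)

    def call(self, x):
        return self.dense(x)

model = Tiny()
model.compile(optimizer='adam', loss='mse')
model.fit(tf.random.normal([8, 4]), tf.random.normal([8, 1]), epochs=1, verbose=0)

# Workaround 1: the TensorFlow SavedModel format supports subclassed models.
model.save('tiny_savedmodel', save_format='tf')
# Workaround 2: save only the weights (ModelCheckpoint with
# save_weights_only=True achieves the same during training).
model.save_weights('tiny_weights.ckpt')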
Hello,
Is it possible to have multiple types of edges, and/or is it possible to distinguish edge types during the message-passing phase (in the model_description)?
Thanks in advance.
Training with graphs containing 1 edge generates the following error:
File "[...]/.pyenv/versions/miniconda3-latest/envs/py38/lib/python3.8/site-packages/tensorflow/python/eager/execute.py", line 59, in quick_execute
tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.InvalidArgumentError: ConcatOp : Expected concatenating dimensions in the range [-1, 1), but got 1
[[{{node gnn_model/StatefulPartitionedCall/ignnition_model/message_passing/iteration_0/stage_0/MP_to_AS/message_phase/AS_to_AS/create_message_AS_to_AS/apply_nn_0/concat_2}}]] [Op:__inference_train_function_8025]
TensorFlow version is 2.5.0
and Python version is 3.8.5
Hello!
When I try to use convolution as an aggregation function I get the error listed at the bottom of this issue. I'm not sure if there are prerequisites that should be satisfied before using convolutions. The message-passing stage looks like:
- stage_message_passings:
  - destination_entity: variableNode
    source_entities:
      - name: factorNode
        message:
          - type: direct_assignment
    aggregation:
      - type: convolution
    update:
      type: neural_network
      nn_name: recurrent1
You can find the whole code here, if needed:
code.zip
Best regards,
Ognjen
Epoch 1/1000
100/100 [==============================] - 8s 40ms/step - loss: 0.0266 - mean_absolute_error: 0.1001 - val_loss: 0.0027 - val_mean_absolute_error: 0.0316
Traceback (most recent call last):
File "main.py", line 38, in
main()
File "main.py", line 11, in main
model.train_and_validate()
File "C:\Users\OgnjenKundacina\miniconda3\envs\gnn_env\lib\site-packages\ignnition\ignnition_model.py", line 751, in train_and_validate
verbose=1)
File "C:\Users\OgnjenKundacina\miniconda3\envs\gnn_env\lib\site-packages\tensorflow\python\keras\engine\training.py", line 1145, in fit
callbacks.on_epoch_end(epoch, epoch_logs)
File "C:\Users\OgnjenKundacina\miniconda3\envs\gnn_env\lib\site-packages\tensorflow\python\keras\callbacks.py", line 428, in on_epoch_end
callback.on_epoch_end(epoch, logs)
File "C:\Users\OgnjenKundacina\miniconda3\envs\gnn_env\lib\site-packages\tensorflow\python\keras\callbacks.py", line 1344, in on_epoch_end
self._save_model(epoch=epoch, logs=logs)
File "C:\Users\OgnjenKundacina\miniconda3\envs\gnn_env\lib\site-packages\tensorflow\python\keras\callbacks.py", line 1406, in _save_model
filepath, overwrite=True, options=self._options)
File "C:\Users\OgnjenKundacina\miniconda3\envs\gnn_env\lib\site-packages\tensorflow\python\keras\engine\training.py", line 2124, in save_weights
self._trackable_saver.save(filepath, session=session, options=options)
File "C:\Users\OgnjenKundacina\miniconda3\envs\gnn_env\lib\site-packages\tensorflow\python\training\tracking\util.py", line 1217, in save
file_prefix_tensor, object_graph_tensor, options)
File "C:\Users\OgnjenKundacina\miniconda3\envs\gnn_env\lib\site-packages\tensorflow\python\training\tracking\util.py", line 1154, in _save_cached_when_graph_building
object_graph_tensor=object_graph_tensor)
File "C:\Users\OgnjenKundacina\miniconda3\envs\gnn_env\lib\site-packages\tensorflow\python\training\tracking\util.py", line 1120, in _gather_saveables
feed_additions) = self._graph_view.serialize_object_graph()
File "C:\Users\OgnjenKundacina\miniconda3\envs\gnn_env\lib\site-packages\tensorflow\python\training\tracking\graph_view.py", line 408, in serialize_object_graph
trackable_objects, path_to_root)
File "C:\Users\OgnjenKundacina\miniconda3\envs\gnn_env\lib\site-packages\tensorflow\python\training\tracking\graph_view.py", line 363, in _serialize_gathered_objects
object_names[obj] = _object_prefix_from_path(path)
File "C:\Users\OgnjenKundacina\miniconda3\envs\gnn_env\lib\site-packages\tensorflow\python\training\tracking\graph_view.py", line 64, in _object_prefix_from_path
for trackable in path_to_root))
File "C:\Users\OgnjenKundacina\miniconda3\envs\gnn_env\lib\site-packages\tensorflow\python\training\tracking\graph_view.py", line 64, in
for trackable in path_to_root))
File "C:\Users\OgnjenKundacina\miniconda3\envs\gnn_env\lib\site-packages\tensorflow\python\training\tracking\graph_view.py", line 57, in _escape_local_name
return (name.replace(_ESCAPE_CHAR, _ESCAPE_CHAR + _ESCAPE_CHAR)
AttributeError: 'NoneType' object has no attribute 'replace'
Hi ignnition team!
Briefly, the problem is that the model.predict() function doesn't return good results when it is called without the model.train_and_validate() function.
I've trained a GNN model successfully (training and validation losses were converging to zero), and predictions from the predict() method were aligned with the labels in the prediction set. I've used the following code in the main() method:
model = ignnition.create_model(model_dir='./')
model.computational_graph()
model.train_and_validate()
predictions = model.predict(num_predictions = 1)
Part of the train_options.yaml file:
train_dataset: ./data/train
validation_dataset: ./data/test
predict_dataset: ./data/test
load_model_path: ./weights.1000-0.00R.hdf5
additional_functions_file: ./main.py
output_path: ./
I copied the trained model parameters "weights.1000-0.00R.hdf5" from the CheckPoint directory into the root directory and called model.predict() in the following way:
model = ignnition.create_model(model_dir='./')
predictions = model.predict(num_predictions = 1)
and in this way also:
model = ignnition.create_model(model_dir='./')
model.computational_graph()
predictions = model.predict(num_predictions = 1)
but the predictions were not fitting the labels in the predict set well. I also tried calling the model.train_and_validate() function with epochs and epoch_size set to 0, but it gave the same results.
I would also note that there is no "weights.1000-0.00R.hdf5" in any location other than the root directory, so the correct trained parameters should be the ones loaded:
Console logs:
Processing the described model...
Creating the GNN model...
Restoring from ./weights.1000-0.00.hdf5
Starting to make the predictions...
You can find the code attached in code.zip, as well as predicted vs label plots for both working and non-working examples.
Kind regards!
Ognjen
This would allow the user to train, for example, using multiple topologies. Check https://ignnition.net/doc/train_and_evaluate/#definition-of-the-paths for more details.
I am a new user of the ignnition framework and I would like to ask the following two questions:
Looking forward to hearing from the community, thanks a lot!
Dear ignnition team,
In our problem we are training a GNN for a regression task - a subset of nodes is labeled by a float value and those labels are learned using a neural network as a readout model. The whole GNN model is trained based on the MSE loss between the labels and the predictions for the mentioned subset of nodes and works well!
We would like to create a new loss function that incorporates some physical laws related to our problem. In each training step, after the predictions are generated for all of the labeled nodes, we would like to add an additional term to the MSE between the labels and the predictions. That additional term would multiply all of the predictions by some coefficients (different for every node) and sum all of the obtained values. So the goal would be to minimize that sum along with the MSE.
Is something like this possible to implement in the ignnition framework? I'm not even sure whether it is consistent with the logic for creating the mini-batches - I guess the requirement here would be to have all of the nodes from one training sample in the same mini-batch.
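For what it's worth, a minimal sketch of the loss described above, written in plain TensorFlow outside of ignnition (the per-node coefficients tensor is a hypothetical placeholder):

import tensorflow as tf

# MSE over the labeled nodes plus a term that multiplies every prediction by a
# per-node coefficient and sums the results, as described in the post.
# "coefficients" is a hypothetical placeholder for the physical coefficients.
def physics_informed_loss(labels, predictions, coefficients):
    mse = tf.reduce_mean(tf.square(labels - predictions))
    physics_term = tf.reduce_sum(coefficients * predictions)
    return mse + physics_term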
Thanks!
Ognjen
Currently, when the specified hidden state size for a certain entity is smaller than the number of node features, model creation fails with the following error:
Errors may have originated from an input operation.
Input Source operations connected to node
ignnition_model/hidden_states/hidden_state_atom/add_zeros_to_atom/zeros:
ignnition_model/hidden_states/hidden_state_atom/stack
(defined at /****/ignnition/ignnition/auxilary_classes.py:155)
Function call stack:
call
This is probably because the framework tries to pad the node features up to the hidden state size. Support for smaller hidden state sizes should be added.
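A minimal sketch (an assumption about the mechanism, not ignnition's actual code) of why such padding fails when the hidden state is smaller than the feature vector:

import tensorflow as tf

# Padding node features with zeros up to the hidden state size breaks when the
# hidden state is smaller, because the padding dimension becomes negative.
num_nodes, num_features, hidden_state_size = 5, 8, 4
features = tf.random.normal([num_nodes, num_features])
padding = tf.zeros([num_nodes, hidden_state_size - num_features])  # [5, -4] -> error
hidden_states = tf.concat([features, padding], axis=1)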
Dear ignnition team,
In our problem we use a GNN to learn over graphs with two types of nodes. One type of node is intended to be used for inputting data into the GNN, and the other type is used for generating predictions (no node is intended for both predictions and data input). Furthermore, these graphs are bipartite. So in the ideal case we would like to label only the second type of nodes and calculate the loss function using their labels and predictions.
Is it possible to specify a subset of nodes in a graph (type of nodes in our case) from which the loss function will be calculated?
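For illustration, a hedged sketch (independent of ignnition's model_description syntax) of computing the loss only over the prediction-type nodes with a boolean mask:

import tensorflow as tf

# Keep only the nodes that carry labels (the "prediction" node type) when
# computing the loss; the values and the mask below are illustrative.
predictions = tf.random.normal([6])                      # one prediction per node
labels = tf.constant([0.0, 0.0, 1.2, 0.0, 0.7, 0.0])
is_prediction_node = tf.constant([False, False, True, False, True, False])
loss = tf.reduce_mean(tf.square(
    tf.boolean_mask(labels, is_prediction_node)
    - tf.boolean_mask(predictions, is_prediction_node)))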
Thanks!
Ognjen
This error appears when I try to train my GNN on graph datasets in order to optimize the makespan of the RCPSP problem.
The error disappears when I clear the dataset and keep only a few graphs (4-5, not more).
I tried changing every parameter in train_options and model_description, but nothing solved my problem.
An image with the full error message is attached.
When trying to access the library with examples of GNNs already implemented, the page does not seem to exist: