princetonlips / SketchGraphs
A dataset of 15 million CAD sketches with geometric constraint graphs.
Home Page: https://princetonlips.github.io/SketchGraphs/
License: MIT License
To provide a bit more context, I cloned a fresh repo and downloaded the data as well as the metadata. I ran
python -m sketchgraphs_models.graph.train --dataset_train sg_t16_train.npy
This produced a bus error.
The caveat is that I did not install the Python package via pip install . or python setup.py install, as I'm currently still trying to work out some issues with nvcc. I did install all the relevant packages in requirements.txt. Is the bus error an expected consequence of not installing the CUDA extension?
Pardon my vagueness, as I don't yet know how to articulate this properly, but please try to meet me halfway.
I'm using your dataset for a kind of cognitive experiment, i.e. I just want a sample of cool shapes that look different from one another.
I downloaded the validation set, which contained more than enough interesting shapes. However, I noticed that most of the shapes in it are duplicates in the sense that they're conceptually the same: a single circle, albeit of different sizes; a single rectangle, albeit of different widths/heights; or some combination thereof.
What would you suggest to get a set of "interesting" shapes? I thought about generating a kind of "fingerprint" for each shape, such as the number of lines and so on, and keeping one shape per fingerprint, but that does seem a bit restrictive. I just want some idea that I can play with.
The "gold standard" would be to use the constraint solver and check whether some shapes can be made isomorphic to others by modifying their parameters while still satisfying the constraints, but there's no way I'm doing that, lol. Something quick and dirty would be preferable.
thanks in advance
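Something quick and dirty along the fingerprint lines could look like the sketch below. Note that fingerprint and deduplicate are hypothetical helpers operating on plain lists of entity-type names, not SketchGraphs API; with the real dataset the type names would come from each entity's label.

```python
# A minimal sketch of the "fingerprint" idea (not SketchGraphs code): hash the
# multiset of entity-type names in each sketch and keep one sketch per hash.
from collections import Counter

def fingerprint(entity_types):
    """Order-independent fingerprint: sorted (type, count) pairs."""
    return tuple(sorted(Counter(entity_types).items()))

def deduplicate(sketches):
    """Keep the first sketch seen for each fingerprint."""
    seen, kept = set(), []
    for entities in sketches:
        fp = fingerprint(entities)
        if fp not in seen:
            seen.add(fp)
            kept.append(entities)
    return kept

sketches = [
    ["Circle"],                        # a lone circle
    ["Circle"],                        # another lone circle -> duplicate
    ["Line", "Line", "Line", "Line"],  # a rectangle-like sketch
    ["Circle", "Line", "Line", "Line", "Line"],
]
unique = deduplicate(sketches)  # 3 fingerprint classes survive
```

Extending the fingerprint with quantized size ratios or constraint counts would make it less coarse, at the cost of fewer merges.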
Is anybody else experiencing issues making OnShape calls? I'm currently getting
status_code:500
An internal error has occurred; support code xxxxx
I was previously able to make OnShape calls with the same code. Has there been a change to the FeatureScript microversion?
Hello SketchGraphs Team,
I have been trying to look at the output of your generative model and hit a problem. I'm not sure if perhaps I did something wrong at the training or sampling stage.
I tried the generative model as follows
python -m sketchgraphs_models.graph.train --dataset_train /path/to/data/sg_t16_train.npy
This generated the following files
0219/time_104135
0219/time_104135/args.txt
0219/time_104135/model_state_20.pt
0219/time_104135/model_state_40.pt
0219/time_104135/model_state_30.pt
0219/time_104135/model_state_10.pt
0219/time_104135/eval
0219/time_104135/eval/events.out.tfevents.1613731298.lamboujdevbox.10253.1
0219/time_104135/model_state_50.pt
0219/time_104135/events.out.tfevents.1613731298.lamboujdevbox.10253.0
I then tried to run sketchgraphs_models/graph/sample.py to extract some generated examples from the model. I did this using
python -m sketchgraphs_models.graph.sample --output_path /path/to/output/sampled_data.pkl --model_state path/to/output/0219/time_104135/model_state_50.pt
I hit an error
Exception has occurred: KeyError
'xCenter'
File "xxxx/sketchgraphs/pipeline/graph_model/quantization.py", line 258, in _numerical_features
feature[i + offset] = int(np.searchsorted(edges, params[param_name]))
Digging into what is going on, I see that the params dictionary is generated by the following code:
sketchgraphs/pipeline/graph_model/quantization.py#L314-L315
for i, (param_name, centers) in enumerate(self._bin_centers.get(target, {}).items()):
    features[param_name] = centers[index[i + offset]]
Here target has type <TargetType.NodeCircle: 8>, but the keys of the self._bin_centers dictionary have type <EntityType.Circle: 2>. Hence the features dictionary isn't built up correctly.
I found that adding the code

_entity_label_from_target_type_dict = {
    TargetType.NodeArc: datalib.EntityType.Arc,
    TargetType.NodeCircle: datalib.EntityType.Circle,
    TargetType.NodeLine: datalib.EntityType.Line,
    TargetType.NodePoint: datalib.EntityType.Point,
}

target_entity = _entity_label_from_target_type_dict[target]
for i, (param_name, centers) in enumerate(self._bin_centers.get(target_entity, {}).items()):
    features[param_name] = centers[index[i + offset]]

fixed the problem.
Could you let me know whether I did something wrong at the training or sampling stage? If this fix would be useful to you, I'm happy to create a PR to submit it back to the main repo.
Thank you for your help.
I don't quite understand some schemata, e.g. the Horizontal constraint. A Horizontal constraint with just one reference to local0 makes sense, but in what context would a Horizontal constraint that refers to both local0 and local1 make sense?
Not quite sure I get that.
I have a sequence of sketch entities, i.e. arcs, lines, points, etc.
How do I find the bounding box of these entities put together? Is there a line of code somewhere in the library that does this?
Many thanks
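Absent a library helper, a quick way to do this is to collect bounding points per entity and take the min/max over all of them. The sketch below is hypothetical code: points_of and the dict-based entities are illustrations, not the SketchGraphs entity API.

```python
# A minimal sketch (not SketchGraphs API): compute the joint axis-aligned
# bounding box of a set of entities, assuming each entity can report the 2-D
# points that bound it (line endpoints, circle center +/- radius, etc.).

def points_of(entity):
    """Return a list of (x, y) points that bound the entity."""
    kind = entity["type"]
    if kind == "Line":
        return [entity["start"], entity["end"]]
    if kind == "Circle":
        (cx, cy), r = entity["center"], entity["radius"]
        return [(cx - r, cy - r), (cx + r, cy + r)]
    if kind == "Point":
        return [(entity["x"], entity["y"])]
    raise ValueError(f"unhandled entity type: {kind}")

def bounding_box(entities):
    """Axis-aligned bounding box (min_x, min_y, max_x, max_y) of all entities."""
    pts = [p for e in entities for p in points_of(e)]
    xs, ys = [p[0] for p in pts], [p[1] for p in pts]
    return min(xs), min(ys), max(xs), max(ys)

entities = [
    {"type": "Line", "start": (0.0, 0.0), "end": (2.0, 1.0)},
    {"type": "Circle", "center": (3.0, 0.0), "radius": 1.0},
]
box = bounding_box(entities)  # (0.0, -1.0, 4.0, 1.0)
```

For arcs, a tight box would require sampling points along the arc or checking which axis extremes of the circle lie within the arc's angular span.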
I tried to train the autoconstraint model, but I don't have a GPU. I fell back to an M1 (MPS), where it takes 30 hours per epoch, and Colab is paid.
If possible, I would appreciate it if you could upload a pre-trained autoconstraint model.
I'm running the autoconstraint task, and I noticed that the default batch size in the released training code is 2048, while the paper states that a batch size of 8192 was used.
I'm currently playing with the code using a batch size of 128 so that things fit on my local desktop GPU, and everything runs smoothly so far. But I was wondering whether the original choice of batch size was mainly to speed up training given a fixed number of epochs, or whether it affected performance somehow (in the sense that it may affect test log-likelihood/precision/recall).
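For what it's worth, if the concern is fitting the paper's larger batch size onto a small GPU, gradient accumulation is a common workaround. The toy below (plain Python, not SketchGraphs code) illustrates that for a loss which is a mean over examples, weighted accumulation over micro-batches reproduces the full-batch gradient, so a smaller per-step batch mainly trades off throughput.

```python
# Toy illustration: accumulating appropriately weighted micro-batch gradients
# reproduces the full-batch gradient for a mean-over-examples loss.

def grad_mse(w, xs, ys):
    """Gradient of mean((w*x - y)^2) with respect to w."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.1, 5.9, 8.2]
w = 0.5

full = grad_mse(w, xs, ys)

# Two micro-batches of size 2, each weighted by its share of the full batch.
micro = 0.0
for lo in range(0, len(xs), 2):
    micro += grad_mse(w, xs[lo:lo + 2], ys[lo:lo + 2]) * (2 / len(xs))
```

Whether the batch size itself changes final test metrics (via the optimization dynamics, learning-rate interaction, or batch statistics) is a separate empirical question that the authors would need to answer.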
Notice that seq[-1].label is <EntityType.Stop: 8> even if the last element of the sequence is a stop node. To actually get the string label we'd write seq[-1].label.name, which indeed is 'stop'. Is the desired behavior to have two stop nodes, maybe each communicating something different? Otherwise, should we change this line to check that seq[-1].label.name != 'stop' instead?
There's also something interesting going on here, in that the string being compared against is 'Stop' (capital S) while the node appended to the sequence has label 'stop' (lowercase s).
Again, maybe this is the desired behavior; I'm just looking for some clarity on what's going on.
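The pitfall described above can be reproduced in isolation. NodeLabel below is a hypothetical stand-in for the real enum; the point is that comparing an enum member to a string silently fails, and that string comparisons against .name are case-sensitive, so comparing enum members directly avoids both problems.

```python
# Toy illustration (not SketchGraphs code) of enum-vs-string comparison.
from enum import Enum

class NodeLabel(Enum):
    stop = 8

last = NodeLabel.stop

assert last != 'stop'           # an enum member never equals a plain string
assert last.name == 'stop'      # .name yields the string form
assert last.name != 'Stop'      # and string comparison is case-sensitive
assert last == NodeLabel.stop   # comparing members sidesteps both pitfalls
```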
Hi SketchGraphs experts,
I'm trying to train on a subset of the dataset sg_t16_train.npy with some "geometric" duplicates removed. It was a very trivial change to the GraphDataset in sketchgraphs_models/graph/dataset/__init__.py to make the data loader choose only sequences from my de-duplicated list.
When training on the subset, I find that some NaNs are reported in the kappa edge statistics:
Kappa Edges
EdgeAngle: aligned (nan); clockwise (nan); angle (nan)
EdgeLength: direction (1.000); length (-0.043)
EdgeDistance: direction (0.000); halfSpace0 (0.019); halfSpace1 (0.371); length (0.000)
EdgeDiameter: length (-0.013)
EdgeRadius: length (0.033)
Debugging, I see that the NaNs are generated in the cohen_kappa() function in sketchgraphs_models/nn/summary.py. For the EdgeAngle TargetType, the value of self.recorded is [0, 0, 0, 0].
pm = self.prediction_matrix.float()
N = self.recorded.sum().float() <------ N == 0
p_observed = pm.diag().sum() / N <---- NaNs in here
p_expected = torch.dot(pm.sum(dim=0), pm.sum(dim=1)) / (N * N) <---- and here
My understanding of the problem is that the data subset doesn't contain any EdgeAngle constraints, and consequently I can fix this with

if N == 0:
    return 1

Does this sound like a valid solution, or are there other parts of the code I would need to change to work with a subset of the data?
Thank you for your help with this!
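The guard described above can be sketched in isolation. This is a toy re-implementation of Cohen's kappa from a confusion matrix, not the code in sketchgraphs_models/nn/summary.py; returning 1.0 when no examples were recorded is the convention being proposed, not an established one.

```python
# Cohen's kappa with an N == 0 guard: when no examples of a target type were
# recorded, kappa is undefined (0/0), so we return 1.0 instead of NaN.

def cohen_kappa(confusion):
    """confusion[i][j] = count of (true class i, predicted class j)."""
    n = sum(sum(row) for row in confusion)
    if n == 0:
        return 1.0  # no recorded examples: avoid 0/0 -> NaN
    k = len(confusion)
    p_observed = sum(confusion[i][i] for i in range(k)) / n
    row_totals = [sum(confusion[i]) for i in range(k)]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    p_expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)
    if p_expected == 1.0:
        return 1.0  # degenerate: all mass in one class
    return (p_observed - p_expected) / (1.0 - p_expected)

assert cohen_kappa([[0, 0], [0, 0]]) == 1.0  # empty matrix: guarded
assert cohen_kappa([[5, 0], [0, 5]]) == 1.0  # perfect agreement
```

An alternative convention is to skip reporting kappa for absent constraint types entirely rather than report 1.0, which avoids inflating the summary statistics.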
Hey,
I was trying to work through your code, and this strikes me as a bug.
Two entities (an arc and a circle, say) are concentric if they have the same center, but the code tests whether the center and the start point are the same.
I don't think a Circle even has a start_point. Does this sound right?
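For reference, a minimal version of the test being suggested could look like this; the dict-based entities are illustrations, not the SketchGraphs entity classes.

```python
# A minimal sketch (not SketchGraphs code): two circular entities are
# concentric iff their centers coincide, up to a numerical tolerance.
import math

def concentric(a, b, tol=1e-9):
    """True if the two entities share a center (within tol)."""
    return math.dist(a["center"], b["center"]) <= tol

arc = {"type": "Arc", "center": (1.0, 2.0), "radius": 3.0}
circle = {"type": "Circle", "center": (1.0, 2.0), "radius": 0.5}
other = {"type": "Circle", "center": (4.0, 2.0), "radius": 0.5}

assert concentric(arc, circle)
assert not concentric(arc, other)
```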
Hello,
Thank you for sharing an interesting dataset.
I am currently exploring this dataset of sketches. How can I derive a unique label from the sequence information, so that it could be used for a classification or object detection task?
Also, what is the distribution of graphs/sketches? Are any graphs repeated across the train, validation, and test sets?
Thank you,
Anshu
Hi, thanks for the amazing work!
This is a really minor issue, but it would be nice to be able to install the requirements via pip install -r requirements.txt. This currently breaks due to the pytorch>=1.5 and python>=3.7 lines.
I can't find the auxiliary dataset. How do I get the auxiliary dataset used by make_quantization_statistics.py?
When interacting with Onshape's API following the demo, I find that some sketches in the dataset produce the warning "Some constraints are not applicable to the current external references and have not been solved."
Will this affect Onshape's solver? What causes the warning, and how should I deal with it?
Code below:
from sketchgraphs.data import flat_array
import sketchgraphs.data as datalib
import sketchgraphs.onshape.call as onshape_call
dataset = 'validation'
url = R'https://cad.onshape.com/documents/xxxxxx' # onshape document url
seq_data = flat_array.load_dictionary_flat('sequence_data/sg_t16_validation.npy')
seq = seq_data['sequences'][100746]
sketch = datalib.sketch_from_sequence(seq)
onshape_call.add_feature(url, sketch.to_dict(), 'my sketch')
Hello,
I was trying to compile the package with setup.py under Ubuntu 18.04 with PyTorch 1.7.0, but I cannot compile the extensions. I get the following error:
/usr/include/c++/6/tuple: In instantiation of ‘static constexpr bool std::_TC<, _Elements>::_MoveConstructibleTuple() [with _UElements = {std::tuple<at::Tensor, at::Tensor, at::Tensor>}; bool = true; _Elements = {at::Tensor, at::Tensor, at::Tensor}]’:
/usr/include/c++/6/tuple:626:248: required by substitution of ‘template<class ... _UElements, typename std::enable_if<(((std::_TC<(sizeof... (_UElements) == 1), at::Tensor, at::Tensor, at::Tensor>::_NotSameTuple<_UElements ...>() && std::_TC<(1ul == sizeof... (_UElements)), at::Tensor, at::Tensor, at::Tensor>::_MoveConstructibleTuple<_UElements ...>()) && std::_TC<(1ul == sizeof... (_UElements)), at::Tensor, at::Tensor, at::Tensor>::_ImplicitlyMoveConvertibleTuple<_UElements ...>()) && (3ul >= 1)), bool>::type > constexpr std::tuple< >::tuple(_UElements&& ...) [with _UElements = {std::tuple<at::Tensor, at::Tensor, at::Tensor>}; typename std::enable_if<(((std::_TC<(sizeof... (_UElements) == 1), at::Tensor, at::Tensor, at::Tensor>::_NotSameTuple<_UElements ...>() && std::_TC<(1ul == sizeof... (_UElements)), at::Tensor, at::Tensor, at::Tensor>::_MoveConstructibleTuple<_UElements ...>()) && std::_TC<(1ul == sizeof... (_UElements)), at::Tensor, at::Tensor, at::Tensor>::_ImplicitlyMoveConvertibleTuple<_UElements ...>()) && (3ul >= 1)), bool>::type = ]’
/home/parawr/anaconda3/lib/python3.7/site-packages/torch/include/ATen/core/TensorMethods.h:5613:173: required from here
/usr/include/c++/6/tuple:483:67: error: mismatched argument pack lengths while expanding ‘std::is_constructible<_Elements, _UElements&&>’
return _and<is_constructible<_Elements, _UElements&&>...>::value;
Could you help me with this?
Hi SketchGraphs Team,
I'm running the sketchgraphs generative model like this
python -m sketchgraphs_models.graph.train \
--dataset_train /data/sg_t16_train.npy \
--dataset_test /data/sg_t16_test.npy
I'm seeing a worrying warning about a non-writeable NumPy array:
UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at /opt/conda/conda-bld/pytorch_1607370156314/work/torch/csrc/utils/tensor_numpy.cpp:141.)
self._offsets = torch.as_tensor(self._offsets).share_memory_()
I'm using Ubuntu 18.04.5 with a Quadro RTX 6000 GPU
Python 3.7.7
Pytorch 1.7.1
Cuda 11.0.221
Numpy 1.19.2
Full output from conda list is at the bottom of this message.
I find I can fix the problem, following the PyTorch thread here, like this in flat_array.py:
def __init__(self, offsets, pickle_data):
"""
pickle_data : array_like
an array of bytes representing the serialized data for all objects.
"""
- self._offsets = offsets
- self._pickle_data = pickle_data
+ self._offsets = np.copy(offsets)
+ self._pickle_data = np.copy(pickle_data)
This gets rid of the warning, but I'm worried it might be having side effects, since I'm seeing some very odd behavior. Could you let me know if you think this fix is sensible? I'm happy to make a PR to submit the change if it looks useful.
Also please let me know if you need any other details of my setup.
Full output from conda list
# packages in environment at /home/lambouj/anaconda3/envs/sketchgraphs_fresh:
#
# Name Version Build Channel
_libgcc_mutex 0.1 main
absl-py 0.12.0 pyhd8ed1ab_0 conda-forge
blas 1.0 mkl
blinker 1.4 py_1 conda-forge
brotlipy 0.7.0 py37hb5d75c8_1001 conda-forge
c-ares 1.17.1 h36c2ea0_0 conda-forge
ca-certificates 2020.10.14 0 anaconda
cachetools 4.2.1 pyhd8ed1ab_0 conda-forge
cairo 1.14.12 h8948797_3
certifi 2020.6.20 py37_0 anaconda
cffi 1.14.4 py37h11fe52a_0 conda-forge
chardet 4.0.0 py37h89c1867_1 conda-forge
click 7.1.2 pyh9f0ad1d_0 conda-forge
cryptography 3.2.1 py37hc72a4ac_0 conda-forge
cudatoolkit 11.0.221 h6bb024c_0
cycler 0.10.0 py_2 conda-forge
dbus 1.13.6 he372182_0 conda-forge
expat 2.2.10 he6710b0_2
fontconfig 2.13.1 h6c09931_0
freetype 2.10.4 h5ab3b9f_0
fribidi 1.0.10 h7b6447c_0
glib 2.63.1 h5a9c865_0
google-auth 1.21.3 py_0 conda-forge
google-auth-oauthlib 0.4.1 py_2 conda-forge
graphite2 1.3.14 h23475e2_0
graphviz 2.40.1 h21bd128_2
grpcio 1.33.2 py37haffed2e_2 conda-forge
gst-plugins-base 1.14.5 h0935bb2_2 conda-forge
gstreamer 1.14.5 h36ae1b5_2 conda-forge
harfbuzz 1.8.8 hffaf4a1_0
icu 58.2 he6710b0_3
idna 2.10 pyh9f0ad1d_0 conda-forge
importlib-metadata 3.7.3 py37h89c1867_0 conda-forge
intel-openmp 2020.2 254
jpeg 9b h024ee3a_2
kiwisolver 1.3.1 py37hc928c03_0 conda-forge
lcms2 2.11 h396b838_0
ld_impl_linux-64 2.33.1 h53a641e_7
libffi 3.2.1 hf484d3e_1007
libgcc-ng 9.1.0 hdf63c60_0
libpng 1.6.37 hbc83047_0
libprotobuf 3.13.0.1 h8b12597_0 conda-forge
libstdcxx-ng 9.1.0 hdf63c60_0
libtiff 4.2.0 h3942068_0
libuuid 1.0.3 h1bed415_2
libuv 1.40.0 h7b6447c_0
libwebp-base 1.2.0 h27cfd23_0
libxcb 1.14 h7b6447c_0
libxml2 2.9.10 hb55368b_3
lz4 3.1.0 py37h7b6447c_0 anaconda
lz4-c 1.9.3 h2531618_0
markdown 3.3.4 pyhd8ed1ab_0 conda-forge
matplotlib 3.3.4 py37h89c1867_0 conda-forge
matplotlib-base 3.3.4 py37h62a2d02_0
mkl 2020.2 256
mkl-service 2.3.0 py37he8ac12f_0
mkl_fft 1.3.0 py37h54f3939_0
mkl_random 1.1.1 py37h0573a6f_0
ncurses 6.2 he6710b0_1
ninja 1.10.2 py37hff7bd54_0
numpy 1.19.2 py37h54aff64_0
numpy-base 1.19.2 py37hfa32c7d_0
oauthlib 3.0.1 py_0 conda-forge
olefile 0.46 py37_0
openssl 1.1.1h h7b6447c_0 anaconda
pango 1.42.4 h049681c_0
pcre 8.44 he6710b0_0
pillow 8.1.2 py37he98fc37_0
pip 21.0.1 py37h06a4308_0
pixman 0.40.0 h7b6447c_0
protobuf 3.13.0.1 py37h745909e_1 conda-forge
pyasn1 0.4.8 py_0 conda-forge
pyasn1-modules 0.2.7 py_0 conda-forge
pycparser 2.20 pyh9f0ad1d_2 conda-forge
pygraphviz 1.3 py37h14c3975_1
pyjwt 2.0.1 pyhd8ed1ab_0 conda-forge
pyopenssl 20.0.1 pyhd8ed1ab_0 conda-forge
pyparsing 2.4.7 pyh9f0ad1d_0 conda-forge
pyqt 5.9.2 py37hcca6a23_4 conda-forge
pysocks 1.7.1 py37h89c1867_3 conda-forge
python 3.7.7 hcf32534_0_cpython
python-dateutil 2.8.1 py_0 conda-forge
python_abi 3.7 1_cp37m conda-forge
pytorch 1.7.1 py3.7_cuda11.0.221_cudnn8.0.5_0 pytorch
qt 5.9.7 h5867ecd_1
readline 8.1 h27cfd23_0
requests 2.25.1 pyhd3deb0d_0 conda-forge
requests-oauthlib 1.3.0 pyh9f0ad1d_0 conda-forge
rsa 4.7.2 pyh44b312d_0 conda-forge
setuptools 52.0.0 py37h06a4308_0
sip 4.19.8 py37hf484d3e_1000 conda-forge
six 1.15.0 py37h06a4308_0
sqlite 3.35.2 hdfb4753_0
tensorboard 2.4.1 pyhd8ed1ab_0 conda-forge
tensorboard-plugin-wit 1.8.0 pyh44b312d_0 conda-forge
tk 8.6.10 hbc83047_0
torchaudio 0.7.2 py37 pytorch
torchvision 0.8.2 py37_cu110 pytorch
tornado 6.1 py37h4abf009_0 conda-forge
typing_extensions 3.7.4.3 pyha847dfd_0
urllib3 1.26.4 pyhd8ed1ab_0 conda-forge
werkzeug 1.0.1 pyh9f0ad1d_0 conda-forge
wheel 0.36.2 pyhd3eb1b0_0
xz 5.2.5 h7b6447c_0
zipp 3.4.1 pyhd8ed1ab_0 conda-forge
zlib 1.2.11 h7b6447c_3
zstd 1.4.5 h9ceee32_0
Hey there,
I am trying to parse the dataset. I don't completely grok the external node, especially the fact that there are certain points that are coincident with it.
What is it used for? Is it expected by the Onshape API? What would happen if one submitted a sequence to Onshape without (a) the external node, or (b) the constraints on the external node? I am guessing those constraints are mostly coincidences.
Thanks!
In the documentation, there is a mention that training can be greatly accelerated using the native extensions, but no comment is given on how to build them.
Here is where the documentation says there should be native extensions.