
text2shape's Introduction

text2shape

Project is based on this paper.


Setup

Dataset

The dataset can be found here. Download the following and add them to the data folder:

  • Text Descriptions
  • Solid Voxelizations: 32 Resolution

Dependencies

  • pynrrd 0.4.2:
pip install pynrrd
  • spacy 2.3.0:
pip install spacy
python3 -m spacy download en_core_web_sm
  • language-tool-python 2.2.3:
pip install language-tool-python

Getting started

Preprocessing for descriptions

  • remove descriptions with more than max_length words (default: 96)
  • preprocess each description so that every word or symbol is separated by a space
  • fill the vocabulary with words that appear more than twice
python3 preprocessing/run_preprocessing.py data/captions.tablechair.csv data/full_preprocessed.captions.csv data/full_voc.csv
python3 preprocessing/run_preprocessing_primitives.py data/primitives.v2/ "shape" data/vic_primitives primitives_voc.csv
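
The three preprocessing steps listed above can be sketched in plain Python. This is only an illustration, not the code in preprocessing/run_preprocessing.py: the function names, the regular expression used for tokenizing, and the choice to count symbols toward max_length and to build the vocabulary after filtering are all assumptions.

```python
import re
from collections import Counter

MAX_LENGTH = 96  # default limit named in the list above
MIN_COUNT = 3    # "more than twice" means at least three occurrences


def tokenize(description):
    """Split a description so that every word or symbol is its own token."""
    return re.findall(r"\w+|[^\w\s]", description)


def preprocess(descriptions, max_length=MAX_LENGTH, min_count=MIN_COUNT):
    """Return (space-separated descriptions, sorted vocabulary).

    Assumption: symbols count as tokens for the length limit, and the
    vocabulary is built only from descriptions that survive the filter.
    """
    tokenized = [tokenize(d) for d in descriptions]
    # Drop descriptions with more than max_length tokens.
    kept = [tokens for tokens in tokenized if len(tokens) <= max_length]
    counts = Counter(token for tokens in kept for token in tokens)
    vocabulary = sorted(t for t, c in counts.items() if c >= min_count)
    return [" ".join(tokens) for tokens in kept], vocabulary
```

Calling preprocess on a list of raw captions returns the cleaned captions and the vocabulary; writing them to the CSV files named in the commands above is left out of the sketch.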

Learning embeddings

  • set configuration in config/cfg.yaml
python3 train.py config/cfg.yaml

Retrievals

  • define which retrievals to run, and further settings, in config/cfg_retrieval.yaml
  • possible retrievals:
    • text to text (t2t)
    • text to shape (t2s)
    • shape to text (s2t)
    • shape to shape (s2s)
python3 retrieval.py config/cfg_retrieval.yaml
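
All four retrieval types come down to the same operation once text and shapes are embedded in one space: rank the candidates by similarity to the query. The sketch below assumes cosine similarity and a dictionary from an ID to an embedding; both are illustrative choices and not necessarily what retrieval.py does.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equally long vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query, candidates, k=5):
    """Return the IDs of the k candidates most similar to the query.

    candidates maps a (hypothetical) ID to its embedding. For t2s the
    query is a text embedding and the candidates are shape embeddings,
    for s2t the roles are swapped, and so on.
    """
    ranked = sorted(
        candidates,
        key=lambda cid: cosine_similarity(query, candidates[cid]),
        reverse=True,
    )
    return ranked[:k]
```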

Run t-SNE

  • set configuration in config/cfg_tsne.yaml
python3 t-SNE.py config/cfg_tsne.yaml
  • the result is saved in results as tsne.png

text2shape's People

Contributors

codacy-badger, kynesto, maxim0815, mh415-f

text2shape's Issues

Text preprocessing

  • filter descriptions by maximum length
  • save processed data
  • save vocabulary
  • reload processed data and vocabulary --> move to dataloader

RenderVoxels crashes in some cases

Currently the exception is caught and the image is dropped, so that drawing continues and the program stays alive.
The offending line:

ax.voxels(data_crop, facecolors=colors_reshape, edgecolors=edges_reshape)

Error:

'#91378ff' is neither a valid single color nor a color sequence consisting of single character color specifiers such as 'rgb'. Note also that the latter is deprecated.
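
The rejected string has seven hex digits, while the long hex forms have six (RRGGBB) or eight (RRGGBBAA). One way such a string can arise is formatting a channel without zero padding: the values 9, 19, 120, 255 written with plain "%x" give exactly "#91378ff". Whether that is the cause here is not confirmed; the sketch below only shows a padded formatter and a check that could run before plotting. Both function names are hypothetical.

```python
import re

# Long hex forms only: RRGGBB or RRGGBBAA.
HEX_COLOR = re.compile(r"#(?:[0-9a-fA-F]{6}|[0-9a-fA-F]{8})")


def is_valid_hex_color(color):
    """True if color is a six- or eight-digit hex string with a leading '#'."""
    return bool(HEX_COLOR.fullmatch(color))


def rgba_to_hex(r, g, b, a=255):
    """Format 0-255 channel values with two zero-padded hex digits each."""
    return "#{:02x}{:02x}{:02x}{:02x}".format(r, g, b, a)
```

Filtering the color array with is_valid_hex_color before the ax.voxels call would make the bad entries visible instead of dropping the whole image.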

model_ID

It would be easier to also return the model ID, for example to show the shape that corresponds to a description during text2shape verification.

dataloader

  • split data
  • generate batches (shuffled)
  • use preprocessed data
  • desc2vec and vice versa
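
A minimal sketch of the four tasks above, using only the standard library. The names split_data, shuffled_batches, Vocabulary and vec2desc, the 80/20 split, and the "<unk>" token at index 0 are assumptions made for illustration.

```python
import random


def split_data(samples, train_fraction=0.8, seed=0):
    """Shuffle a copy of samples and split it into train and test parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def shuffled_batches(samples, batch_size, seed=None):
    """Yield batches in a fresh random order; the last one may be smaller."""
    order = list(samples)
    random.Random(seed).shuffle(order)
    for start in range(0, len(order), batch_size):
        yield order[start:start + batch_size]


class Vocabulary:
    """Maps preprocessed, space-separated descriptions to index lists and back."""

    def __init__(self, words, unknown="<unk>"):
        # Index 0 is reserved for words that are not in the vocabulary.
        self.index_to_word = [unknown] + list(words)
        self.word_to_index = {w: i for i, w in enumerate(self.index_to_word)}

    def desc2vec(self, description):
        return [self.word_to_index.get(w, 0) for w in description.split()]

    def vec2desc(self, vector):
        return " ".join(self.index_to_word[i] for i in vector)
```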

generate for triplet loss:

  • Anchor
  • Positive as vector
  • Negative as vector
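
Triplet generation could look roughly like this, assuming the captions are grouped by the model ID of the shape they describe. The anchor here is the shape, the positive is one of its own descriptions, and the negative is a description of a different shape. The function name and the uniform random choice of negatives are assumptions; turning the strings into vectors is left to the data loader.

```python
import random


def make_triplets(captions_by_model, seed=0):
    """Return a list of (anchor, positive, negative) triplets.

    captions_by_model maps a model ID to the list of its descriptions.
    """
    rng = random.Random(seed)
    # Only shapes that have at least one description can supply negatives.
    usable = [m for m, caps in captions_by_model.items() if caps]
    triplets = []
    for anchor in usable:
        others = [m for m in usable if m != anchor]
        if not others:
            continue
        for positive in captions_by_model[anchor]:
            negative = rng.choice(captions_by_model[rng.choice(others)])
            triplets.append((anchor, positive, negative))
    return triplets
```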

encoder-combination

combine:

  • txt encoder
  • shape encoder

handles:

  • triplets to tensor
  • loss function
  • update routine

Multimodal Loss

Implement whole loss function:

  • Cross-Modal Associations
  • Instance-Level Associations
  • Multimodal Metric Learning
  • Full Multimodal Loss
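
The four components above are defined in the paper and are not reproduced here. As a reference point only, the sketch below shows a generic triplet margin loss on plain Python lists, which is the building block that the triplets from the data loader feed into. It is not the paper's full multimodal loss; the Euclidean distance and the margin of 1.0 are assumptions.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equally long vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Zero once the negative is at least `margin` farther away than the positive."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)
```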

visualize voxel

Set up a way to visualize voxels.
This will be useful at the end of the project.

verify txt2txt

Verify if training was successful.
In this case, only check the text encoder (NNs?).

Improve memory consumption for retrievals

So far, all data is stored as tensors in the RetriavelLoader class.
Memory consumption for the whole dataset (shapes and captions) is about 9 GB.

The TripletLoader used during training (a list of triplet class instances) takes about 4 GB.
Easiest solution:

  • adapt NNs to use the triplets themselves instead of a list of tensors
  • this also makes retrievals available as a metric during training

config parser

Use a config file to set hyperparameters, directories, ...

t-SNE

Reduce the dimension of the embeddings and plot the results.

adapt pipeline for t2s triplets

So far only s2t triplets are generated. Adapt the whole pipeline so that t2s triplets are also possible.

  • adapt the data loader
  • generate triplets with a description as anchor and matching/non-matching shapes as positive/negative
  • adapt the loss function to detect the kind of triplet
  • decide how to randomly generate different batches
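
One possible shape for this change is to tag every triplet with its kind, so that the loss can pick the right encoder for the anchor and for the positive/negative pair. The Triplet dataclass, the function names, and the use of model IDs as stand-ins for shapes are hypothetical and only meant to illustrate the idea.

```python
import random
from dataclasses import dataclass


@dataclass
class Triplet:
    kind: str      # "s2t" (shape anchor) or "t2s" (description anchor)
    anchor: str
    positive: str
    negative: str


def make_t2s_triplets(captions_by_model, seed=0):
    """Description as anchor, its own shape as positive, another shape as negative."""
    rng = random.Random(seed)
    model_ids = list(captions_by_model)
    triplets = []
    for model_id, captions in captions_by_model.items():
        others = [m for m in model_ids if m != model_id]
        if not others:
            continue
        for caption in captions:
            triplets.append(Triplet("t2s", caption, model_id, rng.choice(others)))
    return triplets


def encoders_for(triplet):
    """Return (anchor encoder, positive/negative encoder) for a triplet."""
    if triplet.kind == "t2s":
        return "text", "shape"
    if triplet.kind == "s2t":
        return "shape", "text"
    raise ValueError("unknown triplet kind: " + triplet.kind)
```

Mixed batches could then be drawn by shuffling one list that holds both kinds, or by picking a kind at random per batch; which of the two trains better is an open question here.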
