maxim0815 / text2shape

Learn embedding for text and shape.

text2shape

Project is based on this paper.

Setup

Dataset

The dataset can be found here. Download the following files and add them to the data folder:

Text Descriptions
Solid Voxelizations: 32 Resolution

Dependencies

pynrrd 0.4.2 :

pip install pynrrd

spacy-2.3.0 :

pip install spacy
python3 -m spacy download en_core_web_sm

language-tool-python-2.2.3 :

pip install language-tool-python

Getting started

Preprocessing for descriptions

remove descriptions with more than max_length words (default 96 words)
preprocess each description (each word/symbol is separated by a space)
vocabulary gets filled with words that appear more than twice

python3 preprocessing/run_preprocessing.py data/captions.tablechair.csv data/full_preprocessed.captions.csv data/full_voc.csv

python3 preprocessing/run_preprocessing_primitives.py data/primitives.v2/ "shape" data/vic_primitives primitives_voc.csv
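
The three preprocessing steps listed above can be sketched roughly as follows. This is a minimal illustration and not the code in preprocessing/run_preprocessing.py: the function names, the regex tokenizer, and counting symbols toward max_length are assumptions made for the example (the project itself lists spaCy as a dependency).

```python
import re
from collections import Counter


def tokenize(description):
    """Split a description so that each word/symbol becomes its own token."""
    return re.findall(r"\w+|[^\w\s]", description)


def preprocess(descriptions, max_length=96, min_count=3):
    """Drop long descriptions, space-separate tokens, and build a vocabulary.

    min_count=3 keeps only words that appear more than twice.
    """
    tokenized = [tokenize(d) for d in descriptions]
    # Assumption: symbols count toward the length limit as well.
    kept = [toks for toks in tokenized if len(toks) <= max_length]
    counts = Counter(tok for toks in kept for tok in toks)
    vocabulary = sorted(w for w, c in counts.items() if c >= min_count)
    processed = [" ".join(toks) for toks in kept]
    return processed, vocabulary
```

Calling `preprocess(list_of_captions)` returns the space-separated descriptions together with the filtered vocabulary, which could then be written to the two CSV files named in the command above.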

Learning embeddings

set configuration in config/cfg.yaml

python3 train.py config/cfg.yaml

Retrievals

define which retrievals to run and further settings in config/cfg_retrieval.yaml
possible retrievals:
text 2 text (t2t)
text 2 shape (t2s)
shape 2 text (s2t)
shape 2 shape (s2s)

python3 retrieval.py config/cfg_retrieval.yaml

run T-SNE

set configuration in config/cfg_tsne.yaml

python3 t-SNE.py config/cfg_tsne.yaml

the result is saved in results as tsne.png

text2shape's Issues

adapt for use of primitives

adapt dataloader to also run on primitives dataset.

Text preprocessing

Filter for max length of descriptor
save processed data
save vocabulary
reload processed data and vocabulary --> move to dataloader

RenderVoxels crashes in some cases

Currently the exception is caught and the image is dropped, so that drawing continues and the program stays alive.
Problematic line:

ax.voxels(data_crop, facecolors=colors_reshape, edgecolors=edges_reshape)

Error:

'#91378ff' is neither a valid single color nor a color sequence consisting of single character color specifiers such as 'rgb'. Note also that the latter is deprecated.
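
The rejected string has seven hex digits, whereas the `#rrggbb` and `#rrggbbaa` forms have six or eight. One possible cause, which the issue does not confirm, is that a channel value below 16 was formatted without zero padding. A sketch of a padded conversion, using a hypothetical helper name:

```python
def rgba_to_hex(r, g, b, a=255):
    """Format 0-255 channel values as '#rrggbbaa' with zero padding."""
    channels = (r, g, b, a)
    if any(not 0 <= c <= 255 for c in channels):
        raise ValueError(f"channel out of range: {channels}")
    # ':02x' pads single-digit hex values, so 8 becomes '08' rather than '8'.
    return "#" + "".join(f"{int(c):02x}" for c in channels)
```

With unpadded formatting, the channels (145, 55, 8, 255) would yield exactly the seven-digit string from the error message, while the padded version yields a valid eight-digit color.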

model_ID

It would be easier to also return the model ID, for example to show the shape that corresponds to a text during text2shape verification.

dataloader

split data
generate batches (shuffled)
use preprocessed data
desc2vec and vice versa
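
The dataloader tasks above could look roughly like the following sketch. All function names, the 80/20 split, and the `<unk>` token are assumptions made for illustration; the repository's dataloader may be organized differently.

```python
import random


def split_data(items, train_fraction=0.8, seed=0):
    """Shuffle once and split into a training and a validation part."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def shuffled_batches(items, batch_size, seed=None):
    """Yield batches in a new random order; the last batch may be smaller."""
    rng = random.Random(seed)
    order = list(items)
    rng.shuffle(order)
    for start in range(0, len(order), batch_size):
        yield order[start:start + batch_size]


def desc2vec(description, word2idx, unk_idx=0):
    """Map a space-separated description to a list of vocabulary indices."""
    return [word2idx.get(word, unk_idx) for word in description.split()]


def vec2desc(vector, idx2word, unk_token="<unk>"):
    """Map a list of vocabulary indices back to a description string."""
    return " ".join(idx2word.get(idx, unk_token) for idx in vector)
```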

generate for triplet loss:

Anchor
Positive as vector
Negative as vector
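
A minimal sketch of how such triplets might be assembled, assuming a shape is the anchor and captions serve as positive and negative (the s2t case). The input layout (a dict from model ID to captions) and the function name are hypothetical.

```python
import random


def make_s2t_triplets(captions_by_model, seed=0):
    """Build one (anchor_model_id, positive_caption, negative_caption) per model.

    captions_by_model: dict mapping a model ID to its list of captions.
    The negative caption is drawn from a different, randomly chosen model.
    """
    rng = random.Random(seed)
    model_ids = sorted(captions_by_model)
    if len(model_ids) < 2:
        raise ValueError("need at least two models to draw a negative")
    triplets = []
    for anchor_id in model_ids:
        positive = rng.choice(captions_by_model[anchor_id])
        other_id = rng.choice([m for m in model_ids if m != anchor_id])
        negative = rng.choice(captions_by_model[other_id])
        triplets.append((anchor_id, positive, negative))
    return triplets
```

The captions would then be converted to index vectors and the anchor ID resolved to its voxel grid before the triplet is handed to the encoders.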

encoder-combination

combine:

txt encoder
shape encoder

handles:

triplets to tensor
loss function
update routine

Multimodal Loss

Implement whole loss function:

Cross-Modal Associations
Instance-Level Associations
Multimodal Metric Learning
Full Multimodal Loss

visualize voxel

Set up a way to visualize voxels.
Useful at the end of project.

verify txt2txt

Verify if training was successful.
In this case just check for the text encoder (NNs?).

Improve memory consumption for retrievals

So far all data are stored within RetriavelLoader class as tensors.
Memory consumption for whole data set (shapes and captions) is about 9GB.

TripletLoader during training (list of class triplets) is about 4 GB.
Easiest solution:

adapt the NNs to use the triplets themselves instead of a list of tensors
this also makes it possible to use retrievals as a metric during training

Shape Encoder

Implement architecture for shape encoder

Cross-modal Retrieval

query text and find nearest shape neighbors

add evaluation metric to training
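
A text-to-shape query of this kind can be sketched as a nearest-neighbor search in the shared embedding space. The sketch below assumes cosine similarity and plain Python lists as embeddings; the similarity measure used in the project is not stated here, and the function names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_shapes(text_embedding, shape_embeddings, k=5):
    """Return the IDs of the k shapes most similar to the text embedding.

    shape_embeddings: dict mapping a model ID to its embedding vector.
    """
    ranked = sorted(
        shape_embeddings.items(),
        key=lambda item: cosine_similarity(text_embedding, item[1]),
        reverse=True,
    )
    return [model_id for model_id, _ in ranked[:k]]
```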

config parser

use config file to set up hyperparameter, directories ...

saving new best model

The current solution is not very good.
Focusing on cross modal retrievals?

Text Encoder

Implement architecture for text encoder.

Triplet Loss

Metric learning approach

smoothed triplet-based similarity loss
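
The issue does not spell out the formula. As a point of reference, the sketch below shows the standard margin-based triplet loss and a softplus relaxation of it, which is one common way to smooth the hinge; the loss actually used in the paper or in this repository may differ. Function names are hypothetical.

```python
import math


def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Hinge form: max(0, d(a, p) - d(a, n) + margin)."""
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, gap + margin)


def smoothed_triplet_loss(anchor, positive, negative):
    """Softplus form: log(1 + exp(d(a, p) - d(a, n))), computed stably."""
    x = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    # Equivalent to log(1 + exp(x)) without overflow for large |x|.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))
```

Unlike the hinge, the softplus form never reaches exactly zero, so well-separated triplets still contribute a small gradient.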

verify shape2shape

check which nearest neighbors are found

t-SNE

reduce dimension of embedding and plot results

smart batches

How could we select "smart" batches.

adapt pipeline for t2s triplets

So far only s2t triplets are generated. Adapt the whole pipeline so that t2s triplets are also possible.

adapt data loader
generate triplets with description as anchor and pos/neg matching shapes
adapt loss function to figure out what kind of triplet
how to randomly generate different batch

Evaluation for retrievals

maybe use normalized discounted cumulative gain
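
For reference, a minimal sketch of normalized discounted cumulative gain with linear gain and a log2 rank discount. How relevance would be defined for retrievals (for example 1 if the retrieved item belongs to the same model, else 0) is an assumption and is not fixed by the issue.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain: sum of rel_i / log2(i + 1), ranks from 1."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))


def ndcg(relevances, k=None):
    """DCG of the given ranking divided by the DCG of the ideal ranking.

    relevances: relevance scores in retrieved order; k truncates the list.
    """
    relevances = list(relevances)
    ideal_dcg = dcg(sorted(relevances, reverse=True)[:k])
    return dcg(relevances[:k]) / ideal_dcg if ideal_dcg > 0 else 0.0
```

A perfect ranking scores 1.0, and the score drops as relevant items appear further down the retrieved list.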
