LXMERT: Learning Cross-Modality Encoder Representations from Transformers

Our servers broke again :(. I have updated the links so that they should work fine now. Sorry for the inconvenience. Please let me know about any further issues. Thanks! --Hao, Dec 03

Introduction

PyTorch code for the EMNLP 2019 paper "LXMERT: Learning Cross-Modality Encoder Representations from Transformers". Slides of our EMNLP 2019 talk are available here.

  • To analyze the output of the pre-trained model (instead of fine-tuning on downstream tasks), please load the weights from https://nlp.cs.unc.edu/data/github_pretrain/lxmert20/Epoch20_LXRT.pth, which were trained as described in the Pre-training section. The default weights here were trained with a slightly different protocol from this code.

Results (with this Github version)

Split              VQA      GQA      NLVR2
Local Validation   69.90%   59.80%   74.95%
Test-Dev           72.42%   60.00%   74.45% (Test-P)
Test-Standard      72.54%   60.33%   76.18% (Test-U)

All the results in the table were produced exactly with this code base. Since the VQA and GQA test servers only allow a limited number of 'Test-Standard' submissions, we used our remaining submission entries from the VQA/GQA challenges 2019 to get these results. For NLVR2, we only tested once on the unpublished test set (test-U).

We used this code (with model ensemble) to participate in the VQA 2019 and GQA 2019 challenges in May 2019. We were the only team ranked in the top 3 in both challenges.

Pre-trained models

The pre-trained model (870 MB) is available at https://nlp.cs.unc.edu/data/model_LXRT.pth and can be downloaded with:

mkdir -p snap/pretrained 
wget https://nlp.cs.unc.edu/data/model_LXRT.pth -P snap/pretrained

If the download is slower than expected, the pre-trained model can also be downloaded from other sources. Please put the downloaded file at snap/pretrained/model_LXRT.pth.

We also provide data and commands to pre-train the model in Pre-training. The default setup needs 4 GPUs and takes around a week to finish. The pre-trained weights produced with this code base can be downloaded from https://nlp.cs.unc.edu/data/github_pretrain/lxmert/EpochXX_LXRT.pth, with XX from 01 to 12. This model is pre-trained for 12 epochs (instead of 20 as in the EMNLP paper), so the fine-tuned results are about 0.3% lower on each dataset.
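The XX in the checkpoint URL is a zero-padded epoch number. As a minimal sketch (the helper name `epoch_checkpoint_urls` is ours for illustration and is not part of the repo), the twelve URLs could be generated like this:

```python
# Base URL copied from the paragraph above.
BASE_URL = "https://nlp.cs.unc.edu/data/github_pretrain/lxmert"


def epoch_checkpoint_urls(first=1, last=12):
    """Build the EpochXX_LXRT.pth URLs for epochs first..last (inclusive)."""
    return [f"{BASE_URL}/Epoch{epoch:02d}_LXRT.pth" for epoch in range(first, last + 1)]


urls = epoch_checkpoint_urls()
print(urls[0])   # first epoch
print(urls[-1])  # last epoch
```

Each URL can then be passed to wget in the same way as the default model above.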

Fine-tune on Vision-and-Language Tasks

We fine-tune our LXMERT pre-trained model on each task with the following hyper-parameters:

Dataset   Batch Size   Learning Rate   Epochs   Load Answers
VQA       32           5e-5            4        Yes
GQA       32           1e-5            4        Yes
NLVR2     32           5e-5            4        No

Although the fine-tuning processes are almost the same except for different hyper-parameters, we provide descriptions for each dataset to take care of all details.

General

The code requires Python 3. Please install the Python dependencies with the command:

pip install -r requirements.txt

A Python 3 virtual environment can be set up and activated with:

virtualenv name_of_environment -p python3
source name_of_environment/bin/activate

VQA

Fine-tuning

  1. Please make sure the LXMERT pre-trained model is either downloaded or pre-trained.

  2. Download the re-distributed json files for the VQA 2.0 dataset. The raw VQA 2.0 dataset can be downloaded from the official website.

    mkdir -p data/vqa
    wget https://nlp.cs.unc.edu/data/lxmert_data/vqa/train.json -P data/vqa/
    wget https://nlp.cs.unc.edu/data/lxmert_data/vqa/nominival.json -P data/vqa/
    wget https://nlp.cs.unc.edu/data/lxmert_data/vqa/minival.json -P data/vqa/
  3. Download Faster R-CNN features for MS COCO train2014 (17 GB) and val2014 (8 GB) images (VQA 2.0 is collected on the MS COCO dataset). The image features are also available on Google Drive and Baidu Drive (see Alternative Download for details).

    mkdir -p data/mscoco_imgfeat
    wget https://nlp.cs.unc.edu/data/lxmert_data/mscoco_imgfeat/train2014_obj36.zip -P data/mscoco_imgfeat
    unzip data/mscoco_imgfeat/train2014_obj36.zip -d data/mscoco_imgfeat && rm data/mscoco_imgfeat/train2014_obj36.zip
    wget https://nlp.cs.unc.edu/data/lxmert_data/mscoco_imgfeat/val2014_obj36.zip -P data/mscoco_imgfeat
    unzip data/mscoco_imgfeat/val2014_obj36.zip -d data && rm data/mscoco_imgfeat/val2014_obj36.zip
  4. Before fine-tuning on the whole VQA 2.0 training set, verifying the script and model on a small training set (512 images) is recommended. The first argument 0 is the GPU id. The second argument vqa_lxr955_tiny is the name of this experiment.

    bash run/vqa_finetune.bash 0 vqa_lxr955_tiny --tiny
  5. If no bugs come up, the model is ready to be trained on the whole VQA corpus:

    bash run/vqa_finetune.bash 0 vqa_lxr955

It takes around 8 hours (2 hours per epoch * 4 epochs) to converge. The logs and model snapshots will be saved under folder snap/vqa/vqa_lxr955. The validation result after training will be around 69.7% to 70.2%.

Local Validation

The results on the validation set (our minival set) are printed while training. The validation result is also saved to snap/vqa/[experiment-name]/log.log. If the log file was accidentally deleted, the validation result in training is also reproducible from the model snapshot:

bash run/vqa_test.bash 0 vqa_lxr955_results --test minival --load snap/vqa/vqa_lxr955/BEST

Submitting to the VQA test server

  1. Download our re-distributed json file containing VQA 2.0 test data.
    wget https://nlp.cs.unc.edu/data/lxmert_data/vqa/test.json -P data/vqa/
  2. Download the Faster R-CNN features for the MS COCO test2015 split (16 GB).
    wget https://nlp.cs.unc.edu/data/lxmert_data/mscoco_imgfeat/test2015_obj36.zip -P data/mscoco_imgfeat
    unzip data/mscoco_imgfeat/test2015_obj36.zip -d data && rm data/mscoco_imgfeat/test2015_obj36.zip
  3. Since the VQA submission system requires submitting the whole test data, we need to run inference over all test splits (i.e., test dev, test standard, test challenge, and test held-out). It takes around 10 to 15 minutes to run test inference (448K instances).
    bash run/vqa_test.bash 0 vqa_lxr955_results --test test --load snap/vqa/vqa_lxr955/BEST

The test results will be saved in snap/vqa/vqa_lxr955_results/test_predict.json. The VQA 2.0 challenge for this year is hosted on EvalAI at https://evalai.cloudcv.org/web/challenges/challenge-page/163/overview. It still allows submissions after the challenge has ended. Please check the official website of the VQA Challenge for detailed information and follow the instructions on EvalAI to submit. In general, after registration, the only thing remaining is to upload the test_predict.json file and wait for the result.
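Before uploading, a quick sanity check of the prediction file may save a submission. The sketch below assumes that test_predict.json is a JSON list of objects with "question_id" and "answer" keys; that layout is an assumption based on the usual VQA result format, so please verify it against your own file. The helper name `summarize_predictions` is hypothetical.

```python
import json


def summarize_predictions(path):
    """Load a prediction file and return (entry count, number of malformed entries).

    Assumes (not confirmed by this README) a JSON list of
    {"question_id": ..., "answer": ...} objects.
    """
    with open(path) as f:
        predictions = json.load(f)
    if not isinstance(predictions, list):
        raise ValueError("expected a JSON list of predictions")
    malformed = sum(
        1
        for entry in predictions
        if not (isinstance(entry, dict) and "question_id" in entry and "answer" in entry)
    )
    return len(predictions), malformed
```

Calling it on the saved test_predict.json should report no malformed entries and a count on the order of the 448K test instances mentioned above.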

The testing accuracy with exactly this code is 72.42% for test-dev and 72.54% for test-standard. The results with the code base are also publicly shown on the VQA 2.0 leaderboard with entry LXMERT github version.

GQA

Fine-tuning

  1. Please make sure the LXMERT pre-trained model is either downloaded or pre-trained.

  2. Download the re-distributed json files for the balanced version of the GQA dataset. The original GQA dataset is available in the Download section of its website, and the script to preprocess these datasets is under data/gqa/process_raw_data_scripts.

    mkdir -p data/gqa
    wget https://nlp.cs.unc.edu/data/lxmert_data/gqa/train.json -P data/gqa/
    wget https://nlp.cs.unc.edu/data/lxmert_data/gqa/valid.json -P data/gqa/
    wget https://nlp.cs.unc.edu/data/lxmert_data/gqa/testdev.json -P data/gqa/
  3. Download Faster R-CNN features for Visual Genome and GQA testing images (30 GB). GQA's training and validation data are collected from Visual Genome. Its testing images come from the MS COCO test set (I have verified this with one of the GQA authors, Drew A. Hudson). The image features are also available on Google Drive and Baidu Drive (see Alternative Download for details).

    mkdir -p data/vg_gqa_imgfeat
    wget https://nlp.cs.unc.edu/data/lxmert_data/vg_gqa_imgfeat/vg_gqa_obj36.zip -P data/vg_gqa_imgfeat
    unzip data/vg_gqa_imgfeat/vg_gqa_obj36.zip -d data && rm data/vg_gqa_imgfeat/vg_gqa_obj36.zip
    wget https://nlp.cs.unc.edu/data/lxmert_data/vg_gqa_imgfeat/gqa_testdev_obj36.zip -P data/vg_gqa_imgfeat
    unzip data/vg_gqa_imgfeat/gqa_testdev_obj36.zip -d data && rm data/vg_gqa_imgfeat/gqa_testdev_obj36.zip
  4. Before fine-tuning on the whole GQA training+validation set, verifying the script and model on a small training set (512 images) is recommended. The first argument 0 is the GPU id. The second argument gqa_lxr955_tiny is the name of this experiment.

    bash run/gqa_finetune.bash 0 gqa_lxr955_tiny --tiny
  5. If no bugs come up, the model is ready to be trained on the whole GQA corpus (train + validation) and validated on the testdev set:

    bash run/gqa_finetune.bash 0 gqa_lxr955

It takes around 16 hours (4 hours per epoch * 4 epochs) to converge. The logs and model snapshots will be saved under folder snap/gqa/gqa_lxr955. The validation result after training will be around 59.8% to 60.1%.

Local Validation

The results on testdev are printed out while training and saved in snap/gqa/gqa_lxr955/log.log. They can also be re-calculated with the command:

bash run/gqa_test.bash 0 gqa_lxr955_results --load snap/gqa/gqa_lxr955/BEST --test testdev --batchSize 1024

Note: Our local testdev result is usually 0.1% to 0.5% lower than the submitted testdev result. The reason is that the test server uses a more advanced evaluation system, while our local evaluator only calculates exact matching. Please use the official evaluator (784 MB) if you want the exact number without submitting.
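To make "exact matching" concrete, here is a minimal sketch of such a metric: a prediction counts as correct only if its string equals the gold answer. This illustrates the idea only; it is not the repo's evaluator, and the function name and the dictionary inputs are assumptions.

```python
def exact_match_accuracy(predictions, answers):
    """Percentage of questions whose predicted answer string equals the gold answer.

    Both arguments map a question id to an answer string.
    A question without a prediction counts as wrong.
    """
    if not answers:
        raise ValueError("no gold answers given")
    correct = sum(1 for qid, gold in answers.items() if predictions.get(qid) == gold)
    return 100.0 * correct / len(answers)


gold = {"q1": "yes", "q2": "red", "q3": "table", "q4": "no"}
pred = {"q1": "yes", "q2": "blue", "q3": "table", "q4": "no"}
print(exact_match_accuracy(pred, gold))  # 75.0
```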

Submitting to the GQA test server

  1. Download our re-distributed json file containing GQA test data.

    wget https://nlp.cs.unc.edu/data/lxmert_data/gqa/submit.json -P data/gqa/
  2. Since the GQA submission system requires submitting the whole test data, we need to run inference over all test splits. It takes around 30 to 60 minutes to run test inference (4.2M instances).

    bash run/gqa_test.bash 0 gqa_lxr955_results --load snap/gqa/gqa_lxr955/BEST --test submit --batchSize 1024
  3. After running the test script, a json file submit_predict.json under snap/gqa/gqa_lxr955_results will contain all the prediction results and is ready to be submitted. The GQA challenge 2019 is hosted by EvalAI at https://evalai.cloudcv.org/web/challenges/challenge-page/225/overview. After registering an account, uploading submit_predict.json and waiting for the results are the only things remaining. Please also check the GQA official website in case the test server has changed.

The testing accuracy with exactly this code is 60.00% for test-dev and 60.33% for test-standard. The results with the code base are also publicly shown on the GQA leaderboard with entry LXMERT github version.

NLVR2

Fine-tuning

  1. Download the NLVR2 data from the official GitHub repo.

    git submodule update --init
  2. Process the NLVR2 data to json files.

    bash -c "cd data/nlvr2/process_raw_data_scripts && python process_dataset.py"
  3. Download the NLVR2 image features for the train (21 GB) and valid (1.6 GB) splits. The image features are also available on Google Drive and Baidu Drive (see Alternative Download for details). To access the original images, please follow the instructions on the NLVR2 official GitHub. The images can either be downloaded from the URLs or obtained by signing an agreement form for data usage. The features can then be extracted as described in the feature extraction section.

    mkdir -p data/nlvr2_imgfeat
    wget https://nlp.cs.unc.edu/data/lxmert_data/nlvr2_imgfeat/train_obj36.zip -P data/nlvr2_imgfeat
    unzip data/nlvr2_imgfeat/train_obj36.zip -d data && rm data/nlvr2_imgfeat/train_obj36.zip
    wget https://nlp.cs.unc.edu/data/lxmert_data/nlvr2_imgfeat/valid_obj36.zip -P data/nlvr2_imgfeat
    unzip data/nlvr2_imgfeat/valid_obj36.zip -d data && rm data/nlvr2_imgfeat/valid_obj36.zip
  4. Before fine-tuning on the whole NLVR2 training set, verifying the script and model on a small training set (512 images) is recommended. The first argument 0 is the GPU id. The second argument nlvr2_lxr955_tiny is the name of this experiment. Do not worry if the result is low (50~55) on this tiny split; the whole training data will bring the performance back.

    bash run/nlvr2_finetune.bash 0 nlvr2_lxr955_tiny --tiny
  5. If no bugs are popping up from the previous step, it means that the code, the data, and image features are ready. Please use this command to train on the full training set. The result on NLVR2 validation (dev) set would be around 74.0 to 74.5.

    bash run/nlvr2_finetune.bash 0 nlvr2_lxr955

Inference on Public Test Split

  1. Download NLVR2 image features for the public test split (1.6 GB).

    wget https://nlp.cs.unc.edu/data/lxmert_data/nlvr2_imgfeat/test_obj36.zip -P data/nlvr2_imgfeat
    unzip data/nlvr2_imgfeat/test_obj36.zip -d data/nlvr2_imgfeat && rm data/nlvr2_imgfeat/test_obj36.zip
  2. Test on the public test set (corresponding to 'test-P' on NLVR2 leaderboard) with:

    bash run/nlvr2_test.bash 0 nlvr2_lxr955_results --load snap/nlvr2/nlvr2_lxr955/BEST --test test --batchSize 1024
  3. The test accuracy will be shown on the screen after around 5 to 10 minutes. The predictions are also saved in the file test_predict.csv under snap/nlvr2/nlvr2_lxr955_results, which is compatible with the NLVR2 official evaluation script. The official eval script also calculates consistency ('Cons') besides accuracy. We can use this official script to verify the results by running:

    python data/nlvr2/nlvr/nlvr2/eval/metrics.py snap/nlvr2/nlvr2_lxr955_results/test_predict.csv data/nlvr2/nlvr/nlvr2/data/test1.json

The accuracy on the public test ('test-P') set should be almost the same as on the validation set ('dev'), which is around 74.0% to 74.5%.
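For readers unfamiliar with the consistency ('Cons') number, the sketch below shows one way to compute it next to accuracy: a sentence counts as consistent only if every example that uses it is predicted correctly. This is our reading of the metric, not the official implementation; the official metrics.py remains authoritative, and the input format (pairs of a sentence key and a correctness flag) is an assumption made for illustration.

```python
from collections import defaultdict


def accuracy_and_consistency(results):
    """results: iterable of (sentence_key, is_correct) pairs, one per example.

    Accuracy is the share of correct examples; consistency is the share of
    sentence keys whose examples are all correct.
    """
    results = [(key, bool(ok)) for key, ok in results]
    if not results:
        raise ValueError("no results given")
    by_sentence = defaultdict(list)
    for sentence_key, ok in results:
        by_sentence[sentence_key].append(ok)
    accuracy = sum(ok for _, ok in results) / len(results)
    consistency = sum(all(flags) for flags in by_sentence.values()) / len(by_sentence)
    return accuracy, consistency


demo = [("s1", True), ("s1", True), ("s2", True), ("s2", False)]
print(accuracy_and_consistency(demo))  # (0.75, 0.5)
```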

Unreleased Test Sets

To be tested on the unreleased held-out test set (test-U on the leaderboard), the code needs to be sent. Please check the NLVR2 official GitHub and the NLVR project website for details.

General Debugging Options

Since it takes a few minutes to load the features, the code has an option to prototype with a small amount of training data.

# Training with 512 images (the second argument is the experiment name):
bash run/vqa_finetune.bash 0 vqa_lxr955_tiny --tiny
# Training with 4096 images:
bash run/vqa_finetune.bash 0 vqa_lxr955_fast --fast

Pre-training

  1. Download our aggregated LXMERT dataset from MS COCO, Visual Genome, VQA, and GQA (around 700MB in total). The joint answer labels are saved in data/lxmert/all_ans.json.

    mkdir -p data/lxmert
    wget https://nlp.cs.unc.edu/data/lxmert_data/lxmert/mscoco_train.json -P data/lxmert/
    wget https://nlp.cs.unc.edu/data/lxmert_data/lxmert/mscoco_nominival.json -P data/lxmert/
    wget https://nlp.cs.unc.edu/data/lxmert_data/lxmert/vgnococo.json -P data/lxmert/
    wget https://nlp.cs.unc.edu/data/lxmert_data/lxmert/mscoco_minival.json -P data/lxmert/
  2. [Skip this if you have run VQA fine-tuning.] Download the detection features for MS COCO images.

    mkdir -p data/mscoco_imgfeat
    wget https://nlp.cs.unc.edu/data/lxmert_data/mscoco_imgfeat/train2014_obj36.zip -P data/mscoco_imgfeat
    unzip data/mscoco_imgfeat/train2014_obj36.zip -d data/mscoco_imgfeat && rm data/mscoco_imgfeat/train2014_obj36.zip
    wget https://nlp.cs.unc.edu/data/lxmert_data/mscoco_imgfeat/val2014_obj36.zip -P data/mscoco_imgfeat
    unzip data/mscoco_imgfeat/val2014_obj36.zip -d data && rm data/mscoco_imgfeat/val2014_obj36.zip
  3. [Skip this if you have run GQA fine-tuning.] Download the detection features for Visual Genome images.

    mkdir -p data/vg_gqa_imgfeat
    wget https://nlp.cs.unc.edu/data/lxmert_data/vg_gqa_imgfeat/vg_gqa_obj36.zip -P data/vg_gqa_imgfeat
    unzip data/vg_gqa_imgfeat/vg_gqa_obj36.zip -d data && rm data/vg_gqa_imgfeat/vg_gqa_obj36.zip
  4. Test on a small split of the MS COCO + Visual Genome datasets:

    bash run/lxmert_pretrain.bash 0,1,2,3 --multiGPU --tiny
  5. Run on the whole MS COCO and Visual Genome related datasets (i.e., VQA, GQA, COCO caption, VG Caption, VG QA). Here, we take a simple single-stage pre-training strategy (20 epochs with all pre-training tasks) rather than the two-stage strategy in our paper (10 epochs without image QA and 10 epochs with image QA). The pre-training finishes in 8.5 days on 4 GPUs. By the way, I hope that my experience in this project would help anyone with limited computational resources.

    bash run/lxmert_pretrain.bash 0,1,2,3 --multiGPU

    Multiple GPUs: Argument 0,1,2,3 indicates taking 4 GPUs to pre-train LXMERT. If the server does not have 4 GPUs (I am sorry to hear that), please consider halving the batch size or using the NVIDIA/apex library to support half-precision computation. The code uses the default data parallelism in PyTorch and is thus extensible to fewer or more GPUs. The Python main thread takes charge of the data loading. On 4 GPUs, we do not find that data loading becomes a bottleneck (around 5% overhead).

    GPU Types: We find that Titan XP, GTX 2080, and Titan V can all support this pre-training. However, the GTX 1080, with its 11G memory, is a little bit small, so please change the batch_size to 224 (instead of 256).

  6. I have verified these pre-training commands with 12 epochs. The pre-trained weights from this process can be downloaded from https://nlp.cs.unc.edu/data/github_pretrain/lxmert/EpochXX_LXRT.pth, with XX from 01 to 12. The results are roughly the same (around 0.3% lower in downstream tasks because of fewer epochs).

  7. Explanation of arguments in the pre-training script run/lxmert_pretrain.bash:

    python src/pretrain/lxmert_pretrain_new.py \
        # The pre-training tasks
        --taskMaskLM --taskObjPredict --taskMatched --taskQA \
        
        # Vision subtasks
        # obj / attr: detected object/attribute label prediction.
        # feat: RoI feature regression.
        --visualLosses obj,attr,feat \
        
        # Mask rate for words and objects
        --wordMaskRate 0.15 --objMaskRate 0.15 \
        
        # Training and validation sets
        # mscoco_nominival + mscoco_minival = mscoco_val2014
        # visual genome - mscoco = vgnococo
        --train mscoco_train,mscoco_nominival,vgnococo --valid mscoco_minival \
        
        # Number of layers in each encoder
        --llayers 9 --xlayers 5 --rlayers 5 \
        
        # Train from scratch (using initialized weights) instead of loading BERT weights.
        --fromScratch \
    
        # Hyper parameters
        --batchSize 256 --optim bert --lr 1e-4 --epochs 20 \
        --tqdm --output $output ${@:2}
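To give a feel for what a mask rate of 0.15 means, here is a deliberately simplified sketch in which each word is replaced by a mask token independently with the given probability. It is not the repo's masking code, which may follow a more detailed recipe; the function name, the "[MASK]" token, and the label convention are assumptions made for illustration.

```python
import random


def mask_tokens(tokens, mask_rate=0.15, mask_token="[MASK]", rng=None):
    """Independently replace each token with mask_token with probability mask_rate.

    Returns (masked_tokens, labels), where labels holds the original token at
    masked positions and None elsewhere.
    """
    rng = rng or random.Random()
    masked, labels = [], []
    for token in tokens:
        if rng.random() < mask_rate:
            masked.append(mask_token)
            labels.append(token)
        else:
            masked.append(token)
            labels.append(None)
    return masked, labels
```

With the default rate, roughly 15 out of every 100 words are hidden on average, and the model is trained to recover them from the remaining words and the image features.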

Alternative Dataset and Features Download Links

All default download links are served from our servers in the UNC CS department, under our NLP group website, but the network bandwidth might be limited. We thus provide a few other options via Google Drive and Baidu Drive.

The files in online drives are almost structured in the same way as our repo but have a few differences due to specific policies. After downloading the data and features from the drives, please re-organize them under data/ folder according to the following example:

REPO ROOT
 |
 |-- data                  
 |    |-- vqa
 |    |    |-- train.json
 |    |    |-- minival.json
 |    |    |-- nominival.json
 |    |    |-- test.json
 |    |
 |    |-- mscoco_imgfeat
 |    |    |-- train2014_obj36.tsv
 |    |    |-- val2014_obj36.tsv
 |    |    |-- test2015_obj36.tsv
 |    |
 |    |-- vg_gqa_imgfeat -- *.tsv
 |    |-- gqa -- *.json
 |    |-- nlvr2_imgfeat -- *.tsv
 |    |-- nlvr2 -- *.json
 |    |-- lxmert -- *.json          # Pre-training data
 | 
 |-- snap
 |-- src

Please also kindly contact us if anything is missing!
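After re-organizing the files, a small check like the following sketch can report what is still missing. The list only covers the files spelled out explicitly in the example layout above; the helper name `missing_files` is hypothetical, and entries for the other folders would need to be added by hand.

```python
from pathlib import Path

# Relative paths copied from the example layout above; extend as needed.
EXPECTED_FILES = [
    "data/vqa/train.json",
    "data/vqa/minival.json",
    "data/vqa/nominival.json",
    "data/vqa/test.json",
    "data/mscoco_imgfeat/train2014_obj36.tsv",
    "data/mscoco_imgfeat/val2014_obj36.tsv",
    "data/mscoco_imgfeat/test2015_obj36.tsv",
]


def missing_files(repo_root, expected=EXPECTED_FILES):
    """Return the expected relative paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [rel for rel in expected if not (root / rel).is_file()]
```

Running `missing_files(".")` from the repo root returns an empty list once all listed files are in place.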

Google Drive

As an alternative to downloading features from our UNC server, you can also download them from Google Drive via the link https://drive.google.com/drive/folders/1Gq1uLUk6NdD0CcJOptXjxE6ssY5XAuat?usp=sharing. The structure of the folders on the drive is:

Google Drive Root
 |-- data                  # The raw data and image features without compression
 |    |-- vqa
 |    |-- gqa
 |    |-- mscoco_imgfeat
 |    |-- ......
 |
 |-- image_feature_zips    # The image-feature zip files (Around 45% compressed)
 |    |-- mscoco_imgfeat.zip
 |    |-- nlvr2_imgfeat.zip
 |    |-- vg_gqa_imgfeat.zip
 |
 |-- snap -- pretrained -- model_LXRT.pth # The pytorch pre-trained model weights.

Note: the image features in the zip files (e.g., mscoco_imgfeat.zip) are the same as those in data/ (e.g., data/mscoco_imgfeat). If you want to save network bandwidth, please download the feature zips and skip downloading the *_imgfeat folders under data/.

Baidu Drive

Since Google Drive is not officially available across the world, we also created a mirror on Baidu Drive (i.e., Baidu PAN). The dataset and features can be downloaded with the shared link https://pan.baidu.com/s/1m0mUVsq30rO6F1slxPZNHA and access code wwma.

Baidu Drive Root
 |
 |-- vqa
 |    |-- train.json
 |    |-- minival.json
 |    |-- nominival.json
 |    |-- test.json
 |
 |-- mscoco_imgfeat
 |    |-- train2014_obj36.zip
 |    |-- val2014_obj36.zip
 |    |-- test2015_obj36.zip
 |
 |-- vg_gqa_imgfeat -- *.zip.*  # Please read README.txt under this folder
 |-- gqa -- *.json
 |-- nlvr2_imgfeat -- *.zip.*   # Please read README.txt under this folder
 |-- nlvr2 -- *.json
 |-- lxmert -- *.json
 | 
 |-- pretrained -- model_LXRT.pth

Since Baidu Drive does not support extremely large files, we split a few feature zips into multiple smaller files. Please follow the README.txt under baidu_drive/vg_gqa_imgfeat and baidu_drive/nlvr2_imgfeat to concatenate them back into the feature zips with the cat command.
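On systems without cat, the same byte-wise concatenation can be done in Python. This is a minimal sketch; the function name is hypothetical, and the actual part names and their order must be taken from the README.txt in each folder.

```python
import shutil


def concatenate_parts(part_paths, output_path):
    """Append the files in part_paths, in the given order, into output_path."""
    with open(output_path, "wb") as out:
        for part in part_paths:
            with open(part, "rb") as src:
                # Stream in chunks so large parts never need to fit in memory.
                shutil.copyfileobj(src, out)
```

The order of `part_paths` matters: the parts must be listed in the same order in which the cat command from the README.txt would read them.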

Code and Project Explanation

  • All code is in the folder src. The basics are in lxrt. The Python files related to pre-training and fine-tuning are saved in src/pretrain and src/tasks respectively.
  • I kept the folders containing image features (e.g., mscoco_imgfeat) separate from the vision-and-language datasets (e.g., vqa, lxmert) because multiple vision-and-language datasets share common images.
  • We use the name lxmert for our framework and the name lxrt (Language, Cross-Modality, and object-Relationship Transformers) to refer to our models.
  • To be consistent with the name lxrt, we use lxrXXX to denote the number of layers. E.g., lxr955 (used in the current pre-trained model) indicates a model with 9 Language layers, 5 cross-modality layers, and 5 object-Relationship layers. If we consider a single-modality layer as half of a cross-modality layer, the total number of layers is (9 + 5) / 2 + 5 = 12, which is the same as BERT_BASE.
  • We share the weights between the two cross-modality attention sub-layers. Please check the visual_attention variable, which is used to compute both lang->visn attention and visn->lang attention. (I am sorry that the name visual_attention is misleading because I deleted the lang_attention there.) Sharing weights is mostly used to save computational resources, and it also (intuitively) helps force the features from visn/lang into a joint subspace.
  • The box coordinates are not normalized from [0, 1] to [-1, 1], which looks like a typo but actually is not ;). Normalizing the coordinates would not affect the output of the box encoder (mathematically and almost numerically). (Hint: consider the LayerNorm in positional encoding)
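The lxrXXX naming and the layer-count arithmetic from the list above can be sketched as follows. The helper names are ours for illustration, and the parser assumes one digit per encoder, as in lxr955.

```python
def parse_lxr_name(name):
    """Split a name like 'lxr955' into (language, cross-modality, relationship) layer counts.

    Assumes exactly one digit per encoder, in the order l, x, r.
    """
    if not (name.startswith("lxr") and len(name) == 6 and name[3:].isdigit()):
        raise ValueError(f"expected a name like 'lxr955', got {name!r}")
    l_layers, x_layers, r_layers = (int(ch) for ch in name[3:])
    return l_layers, x_layers, r_layers


def bert_equivalent_layers(l_layers, x_layers, r_layers):
    """Count each single-modality layer as half of a cross-modality layer."""
    return (l_layers + r_layers) / 2 + x_layers


print(parse_lxr_name("lxr955"))                           # (9, 5, 5)
print(bert_equivalent_layers(*parse_lxr_name("lxr955")))  # 12.0
```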

Faster R-CNN Feature Extraction

We use the Faster R-CNN feature extractor demonstrated in "Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering", CVPR 2018 and its released code at Bottom-Up-Attention github repo. It was trained on Visual Genome dataset and implemented based on a specific Caffe version.

To extract features with this Caffe Faster R-CNN, we publicly release a docker image airsplay/bottom-up-attention on Docker Hub that takes care of all the dependencies and library installation. Instructions and examples are demonstrated below. You could also follow the installation instructions in the bottom-up attention GitHub repo to set up the tool: https://github.com/peteanderson80/bottom-up-attention.

The BUTD feature extractor is widely used in many other projects. If you want to reproduce the results from their paper, feel free to use our docker as a tool.

Feature Extraction with Docker

Docker is an easy-to-use virtualization tool which allows you to plug and play without installing libraries.

The built docker image for bottom-up-attention is released on Docker Hub and can be downloaded with the command:

sudo docker pull airsplay/bottom-up-attention

The Dockerfile can be downloaded here, which allows using other CUDA versions.

After pulling the docker, you could test running the docker container with command:

docker run --gpus all --rm -it airsplay/bottom-up-attention bash

If errors about --gpus all pop up, please read the next section.

Docker GPU Access

Note that the purpose of the argument --gpus all is to expose GPU devices to the docker container, and it requires Docker >= 19.03 along with nvidia-container-toolkit:

  1. Docker CE 19.03
  2. nvidia-container-toolkit

For running Docker with an older version, either update it to 19.03 or use the flag --runtime=nvidia instead of --gpus all.
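The version rule above can be written as a tiny helper. This is a sketch only: the function name is hypothetical, and it assumes the version string starts with "major.minor" as in "19.03.5".

```python
def gpu_flag(docker_version):
    """Pick the GPU flag for `docker run` from a version string such as '19.03.5'."""
    major, minor = (int(part) for part in docker_version.split(".")[:2])
    # Docker >= 19.03 understands --gpus all; older versions use --runtime=nvidia.
    return "--gpus all" if (major, minor) >= (19, 3) else "--runtime=nvidia"


print(gpu_flag("19.03.5"))  # --gpus all
print(gpu_flag("18.09.1"))  # --runtime=nvidia
```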

An Example: Feature Extraction for NLVR2

We demonstrate how to extract Faster R-CNN features of NLVR2 images.

  1. Please first follow the instructions on the NLVR2 official repo to get the images.

  2. Download the pre-trained Faster R-CNN model. Instead of using the default pre-trained model (trained with 10 to 100 boxes), we use the 'alternative pretrained model' which was trained with 36 boxes.

    wget 'https://www.dropbox.com/s/2h4hmgcvpaewizu/resnet101_faster_rcnn_final_iter_320000.caffemodel?dl=1' -O data/nlvr2_imgfeat/resnet101_faster_rcnn_final_iter_320000.caffemodel
  3. Run docker container with command:

    docker run --gpus all -v /path/to/nlvr2/images:/workspace/images:ro -v /path/to/lxrt_public/data/nlvr2_imgfeat:/workspace/features --rm -it airsplay/bottom-up-attention bash

    -v mounts the folders on host os to the docker image container.

    Note0: If it says something about 'privilege', add sudo before the command.

    Note1: If it says something about '--gpus all', it means that the GPU options are not correctly set. Please read Docker GPU Access for the instructions to allow GPU access.

    Note2: /path/to/nlvr2/images would contain subfolders train, dev, test1 and test2.

    Note3: Both paths '/path/to/nlvr2/images/' and '/path/to/lxrt_public' must be absolute paths.

  4. Extract the features inside the docker container. The extraction script is copied from butd/tools/generate_tsv.py and modified by Jie Lei and me.

    cd /workspace/features
    CUDA_VISIBLE_DEVICES=0 python extract_nlvr2_image.py --split train 
    CUDA_VISIBLE_DEVICES=0 python extract_nlvr2_image.py --split valid
    CUDA_VISIBLE_DEVICES=0 python extract_nlvr2_image.py --split test
  5. It takes around 5 to 6 hours for the training split and 1 to 2 hours for the valid and test splits. Since it is slow, I recommend running them in parallel if there are multiple GPUs. This can be achieved by changing the gpu_id in CUDA_VISIBLE_DEVICES=$gpu_id.

The features will be saved in train.tsv, valid.tsv, and test.tsv under the directory data/nlvr2_imgfeat, outside the docker container. I have verified that the extracted image features are the same as the ones I provided in NLVR2 fine-tuning.
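The parallel run suggested in step 5 amounts to giving each split its own GPU id. As a minimal sketch (the helper name is hypothetical, and it only builds the command strings rather than launching them), the assignment could look like this:

```python
def extraction_commands(splits, gpu_ids, script="extract_nlvr2_image.py"):
    """Assign splits to GPUs round-robin and return one shell command per split."""
    if not gpu_ids:
        raise ValueError("need at least one GPU id")
    commands = []
    for index, split in enumerate(splits):
        gpu_id = gpu_ids[index % len(gpu_ids)]
        commands.append(f"CUDA_VISIBLE_DEVICES={gpu_id} python {script} --split {split}")
    return commands


for command in extraction_commands(["train", "valid", "test"], [0, 1, 2]):
    print(command)
```

Each printed command can then be started in its own shell inside the container so that the three splits run at the same time.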

Yet Another Example: Feature Extraction for MS COCO Images

  1. Download the MS COCO train2014, val2014, and test2015 images from MS COCO official website.

  2. Download the pre-trained Faster R-CNN model.

    mkdir -p data/mscoco_imgfeat
    wget 'https://www.dropbox.com/s/2h4hmgcvpaewizu/resnet101_faster_rcnn_final_iter_320000.caffemodel?dl=1' -O data/mscoco_imgfeat/resnet101_faster_rcnn_final_iter_320000.caffemodel
  3. Run the docker container with the command:

    docker run --gpus all -v /path/to/mscoco/images:/workspace/images:ro -v $(pwd)/data/mscoco_imgfeat:/workspace/features --rm -it airsplay/bottom-up-attention bash

    Note: Option -v mounts the folders outside container to the paths inside the container.

    Note1: Please use the absolute path to the MS COCO images folder images. The images folder should contain the train2014, val2014, and test2015 sub-folders. (It's the standard way to save MS COCO images.)

  4. Extract the features inside the docker container.

    cd /workspace/features
    CUDA_VISIBLE_DEVICES=0 python extract_coco_image.py --split train 
    CUDA_VISIBLE_DEVICES=0 python extract_coco_image.py --split valid
    CUDA_VISIBLE_DEVICES=0 python extract_coco_image.py --split test
  5. Exit from the docker container (by executing exit command in bash). The extracted features would be saved under folder data/mscoco_imgfeat.

Reference

If you find this project helpful, please cite our paper :)

@inproceedings{tan2019lxmert,
  title={LXMERT: Learning Cross-Modality Encoder Representations from Transformers},
  author={Tan, Hao and Bansal, Mohit},
  booktitle={Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing},
  year={2019}
}

Acknowledgement

We thank the funding support from ARO-YIP Award #W911NF-18-1-0336, & awards from Google, Facebook, Salesforce, and Adobe.

We thank Peter Anderson for providing the Faster R-CNN code and pre-trained models under the Bottom-Up-Attention GitHub repo. We thank Hengyuan Hu for his PyTorch VQA implementation; our VQA implementation borrows its pre-processed answers. We thank Hugging Face for releasing the excellent PyTorch code PyTorch Transformers.

We thank Drew A. Hudson for answering all our questions about the GQA specification. We thank Alane Suhr for helping test LXMERT on the NLVR2 unreleased test split and providing a detailed analysis.

We thank all the authors and annotators of the vision-and-language datasets (i.e., MS COCO, Visual Genome, VQA, GQA, NLVR2), which allow us to develop a pre-trained model for vision-and-language tasks.

We thank Jie Lei and Licheng Yu for their helpful discussions. I also want to thank Shaoqing Ren for teaching me vision knowledge when I was at MSRA. We also thank you for looking into our code. Please kindly contact us if you find any issue. Comments are always welcome.

LXRThanks.

lxmert's Issues

VQA finetuning training time

In this section of the readme, it says that fine-tuning should only take 2 hours per epoch:

'''
If no bug came out, then the model is ready to be trained on the whole VQA corpus:

bash run/vqa_finetune.bash 0 vqa_lxr955
It takes around 8 hours (2 hours per epoch * 4 epochs) to converge. The logs and model snapshots will be saved under folder snap/vqa/vqa_lxr955. The validation result after training will be around 69.7% to 70.2%.
'''

Is this with 4 GPUs? Because on my system with a single Titan XP, it is reporting 300+ hours/epoch

Is this an issue with my system or with the code? Because it seems even with 4 GPUs, we would still need 75 hours/epoch

Thanks

Pre-training doesn't work

Hello,

I am trying to run the pre-training of the model again. When I run the command:
bash run/lxmert_pretrain.bash 1,2 --multiGPU --tiny

I get the following output:

Load 174866 data from mscoco_train,mscoco_nominival,vgnococo
Load an answer table of size 9500.
Start to load Faster-RCNN detected objects from data/mscoco_imgfeat/train2014_obj36.tsv
Loaded 500 images in file data/mscoco_imgfeat/train2014_obj36.tsv in 2 seconds.
Start to load Faster-RCNN detected objects from data/mscoco_imgfeat/val2014_obj36.tsv
Loaded 500 images in file data/mscoco_imgfeat/val2014_obj36.tsv in 2 seconds.
Start to load Faster-RCNN detected objects from data/vg_gqa_imgfeat/vg_gqa_obj36.tsv
Loaded 500 images in file data/vg_gqa_imgfeat/vg_gqa_obj36.tsv in 2 seconds.
Use 33226 data in torch dataset

Load 5000 data from mscoco_minival
Load an answer table of size 9500.
Start to load Faster-RCNN detected objects from data/mscoco_imgfeat/val2014_obj36.tsv
Loaded 500 images in file data/mscoco_imgfeat/val2014_obj36.tsv in 2 seconds.
Use 20707 data in torch dataset

LXRT encoder with 9 l_layers, 5 x_layers, and 5 r_layers.
Train from Scratch: re-initialize all BERT weights.
Batch per epoch: 129
Total Iters: 2580
Warm up Iters: 129
  0%|                                                                                                                                | 0/129 [00:00<?, ?it/s]/mnt/8tera/claudio.greco/bert_foil/lxmert/venv_lxmert/lib/python3.6/site-packages/torch/nn/parallel/_functions.py:61: UserWarning: Was asked to gather along dimension 0, but all input tensors were scalars; will instead unsqueeze and return a vector.
  warnings.warn('Was asked to gather along dimension 0, but all '

and nothing else happens.

I guess I should see a progress bar or some intermediate information, right? Do you know how I could try to fix this issue?

Thanks,
Claudio

AttributeError: 'NoneType' object has no attribute 'init_bert_weights'

Traceback (most recent call last):
  File "src/tasks/vqa.py", line 178, in <module>
    vqa = VQA()
  File "src/tasks/vqa.py", line 48, in __init__
    self.model = VQAModel(self.train_tuple.dataset.num_answers)
  File "/home/shaohuan/lxmert/src/tasks/vqa_model.py", line 32, in __init__
    self.logit_fc.apply(self.lxrt_encoder.model.init_bert_weights)
AttributeError: 'NoneType' object has no attribute 'init_bert_weights'

issues when finetuning the pretrained models for GQA

Your code sets a random seed, which is initially 9595. I found that the parameters loaded from the pretrained model are unchanged if I change the random seed, for example to 2. But the final results on the validation set are worse, though by less than 0.5 points. So the initial parameters are the same and the only difference is the random seed. Why does this cause such different results?

FP16 for training acceleration

Thanks for your fantastic work. The code is clear and the operation manual is detailed. There is still one thing I want to know: do you plan to make LXMERT support fp16? I am reproducing your work on a 2080Ti (or other Tesla architecture GPU), for which fp16 is suggested to speed up training. Unfortunately, I find that the current version does not support fp16. I would appreciate it if you (or anyone else) could provide a version that supports fp16.

pretrained only on GQA dataset

Hi, have you tried pretraining only on the GQA dataset and using the pretrained model to test on the GQA task? The purpose would be to test how much the extra data helps for a specific task such as GQA, and also how "powerful" your model is at extracting the relationship between the two modalities.

Visual genome features

Are the visual genome visual features extracted from the fine-tuned bottom-up model on the vg dataset or the pretrained one from bottom-up repo?

Pretrained model for GQA

Could you please upload a pretrained model for GQA and put the link in the README? Thank you very much!

When I use your docker image to extract BUTD features, do I need to make pycaffe or build Caffe?

I downloaded your docker image and built my docker container. Inside the container, I want to run 'bottom-up-attention/tools/generate_tsv.py'.
But when I run the imports in generate_tsv.py as follows:
import _init_paths
from fast_rcnn.config import cfg, cfg_from_file
from fast_rcnn.test import im_detect,_get_blobs
from fast_rcnn.nms_wrapper import nms
from utils.timer import Timer

import caffe
import argparse
import pprint
import time, os, sys
import base64
import numpy as np
import cv2
import csv
from multiprocessing import Process
import random
import json

an error occurred:
ImportError                               Traceback (most recent call last)
in <module>()
      1 import _init_paths
      2 from fast_rcnn.config import cfg, cfg_from_file
----> 3 from fast_rcnn.test import im_detect,_get_blobs
      4 from fast_rcnn.nms_wrapper import nms
      5 from utils.timer import Timer

/workspace/bottom-up-attention/lib/fast_rcnn/test.py in <module>()
     14 import numpy as np
     15 import cv2
---> 16 import caffe
     17 from fast_rcnn.nms_wrapper import nms, soft_nms
     18 import cPickle

/workspace/bottom-up-attention/caffe/python/caffe/__init__.py in <module>()
----> 1 from .pycaffe import Net, SGDSolver, NesterovSolver, AdaGradSolver, RMSPropSolver, AdaDeltaSolver, AdamSolver, NCCL, Timer
      2 from ._caffe import init_log, log, set_mode_cpu, set_mode_gpu, set_device, Layer, get_solver, layer_type_list, set_random_seed, solver_count, set_solver_count, solver_rank, set_solver_rank, set_multiprocess, Layer, get_solver
      3 from ._caffe import __version__
      4 from .proto.caffe_pb2 import TRAIN, TEST
      5 from .classifier import Classifier

/workspace/bottom-up-attention/caffe/python/caffe/pycaffe.py in <module>()
     11 import numpy as np
     12
---> 13 from ._caffe import Net, SGDSolver, NesterovSolver, AdaGradSolver, \
     14         RMSPropSolver, AdaDeltaSolver, AdamSolver, NCCL, Timer
     15 import caffe.io

ImportError: No module named _caffe

thank you in advance!

Empty file in pretraining VQA

Hi,

I'm following the steps in the README file.
Now when I run:
bash run/vqa_finetune.bash 0 vqa_lxr955_tiny --tiny

I get the following error:

Load 632117 data from split(s) train,nominival.
Start to load Faster-RCNN detected objects from data/mscoco_imgfeat/train2014_obj36.tsv
Loaded 512 images in file data/mscoco_imgfeat/train2014_obj36.tsv in 2 seconds.
Start to load Faster-RCNN detected objects from data/mscoco_imgfeat/val2014_obj36.tsv
Loaded 512 images in file data/mscoco_imgfeat/val2014_obj36.tsv in 2 seconds.
Use 2888 data in torch dataset

Load 25994 data from split(s) minival.
Start to load Faster-RCNN detected objects from data/mscoco_imgfeat/val2014_obj36.tsv
Loaded 512 images in file data/mscoco_imgfeat/val2014_obj36.tsv in 2 seconds.
Use 2618 data in torch dataset

Traceback (most recent call last):
  File "/StudentData/lxmert-master/name_of_environment/lib/python3.5/tarfile.py", line 2281, in next
    tarinfo = self.tarinfo.fromtarfile(self)
  File "/StudentData/lxmert-master/name_of_environment/lib/python3.5/tarfile.py", line 1083, in fromtarfile
    obj = cls.frombuf(buf, tarfile.encoding, tarfile.errors)
  File "/StudentData/lxmert-master/name_of_environment/lib/python3.5/tarfile.py", line 1019, in frombuf
    raise EmptyHeaderError("empty header")
tarfile.EmptyHeaderError: empty header

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "src/tasks/vqa.py", line 178, in <module>
    vqa = VQA()
  File "src/tasks/vqa.py", line 48, in __init__
    self.model = VQAModel(self.train_tuple.dataset.num_answers)
  File "/StudentData/lxmert-master/src/tasks/vqa_model.py", line 21, in __init__
    max_seq_length=MAX_VQA_LENGTH
  File "/StudentData/lxmert-master/src/lxrt/entry.py", line 95, in __init__
    mode=mode
  File "/StudentData/lxmert-master/src/lxrt/modeling.py", line 769, in from_pretrained
    with tarfile.open(resolved_archive_file, 'r:gz') as archive:
  File "/StudentData/lxmert-master/name_of_environment/lib/python3.5/tarfile.py", line 1577, in open
    return func(name, filemode, fileobj, **kwargs)
  File "/StudentData/lxmert-master/name_of_environment/lib/python3.5/tarfile.py", line 1631, in gzopen
    t = cls.taropen(name, mode, fileobj, **kwargs)
  File "/StudentData/lxmert-master/name_of_environment/lib/python3.5/tarfile.py", line 1607, in taropen
    return cls(name, mode, fileobj, **kwargs)
  File "/StudentData/lxmert-master/name_of_environment/lib/python3.5/tarfile.py", line 1470, in __init__
    self.firstmember = self.next()
  File "/StudentData/lxmert-master/name_of_environment/lib/python3.5/tarfile.py", line 2296, in next
    raise ReadError("empty file")
tarfile.ReadError: empty file

Thanks!

cudaSuccess (9 vs. 0)

Hi,
Thanks for sharing.
I was trying to extract object features on another dataset (images), but I hit this error.
I just updated PyTorch to 1.4, but the same error still occurs.
How can I solve this?

objects_id and attrs_id

Hi, for the "*.tsv" files, is there a dictionary that maps objects_id and attrs_id to their names?
Thank you!

GPU out of memory error

Hi,

I just ran the docker for feature extractor on COCO, but I hit an error saying out of memory:

/opt/butd//tools/../lib/rpn/proposal_layer.py:27: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  layer_params = yaml.load(self.param_str)
  0%| | 0/82783 [00:00<?, ?it/s]WARNING: Logging before InitGoogleLogging() is written to STDERR
F0829 23:53:49.776543 146 syncedmem.cpp:71] Check failed: error == cudaSuccess (2 vs. 0) out of memory
*** Check failure stack trace: ***
Aborted
root@eb7c95126dbc:/workspace/features#

It's weird because I'm watching my GPU and it only uses about 2 GB (out of 8 GB), yet it still throws an out-of-memory error. It even happens when I set BATCH_SIZE to 2 in the config file. Any thoughts?

CUDA error: invalid configuration argument on large

Hi,

I'm trying to reproduce the pretraining results with multi-GPU training, and encountered this error: RuntimeError: CUDA error: invalid configuration argument during validation, but only when the validation batch size is large, e.g. 2048, not when the validation batch size is, say, 512. Have you encountered this before, or know how to fix it?

Thank you.

Pretraining attribute and object loss don't match the ones in the log file

I tried to load the provided pretrained model with the latest version of the code. But when I printed the attribute and object losses, I found a discrepancy between the values in the shared log file for different epochs and the ones I got.
For training I got:
obj loss: 2.2379
attr loss: 2.6806

while the ones reported in the log file are: Obj: 0.2028 Attr: 0.1267
I see the same trend in the eval losses as well.

QA data used in image-QA ablation study

Hi,

When you did ablation experiments about image-QA loss (1, 2 in table 4),
did you just use COCO-Cap and VG-Cap or did you still include VQA/GQA/VG-QA questions for other pretraining tasks?

CPU memory usage is too high and other queries

Thanks for sharing this code. When I fine-tune on VQA, my RAM usage blows up. With num_workers set to 4, it requires 207 GB. I've also tried different batch sizes. The script runs successfully with the --tiny flag, but when I load both train and nominival, the memory usage blows up and I get a "memory can't be allocated" error. Do you know a workaround for this? I think this is because all the Faster R-CNN features are stored in RAM?
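
One generic workaround for this kind of problem, given here only as a minimal sketch and not as the repository's loader, is to index the feature file once and decode rows on demand instead of keeping every decoded row in RAM. The sketch assumes a tab-separated file whose first column is the image id and whose last column is a base64-encoded buffer. The helper names `build_offset_index` and `read_row` are hypothetical, and the demo file is fake data.

```python
import base64
import os
import tempfile


def build_offset_index(tsv_path):
    """Map each image id (first column) to the byte offset of its row."""
    index = {}
    with open(tsv_path, "rb") as f:
        while True:
            offset = f.tell()
            line = f.readline()
            if not line:
                break
            img_id = line.split(b"\t", 1)[0].decode("utf-8")
            index[img_id] = offset
    return index


def read_row(tsv_path, index, img_id):
    """Seek to one row and return (text columns, decoded last column)."""
    with open(tsv_path, "rb") as f:
        f.seek(index[img_id])
        columns = f.readline().rstrip(b"\n").split(b"\t")
    # Assumption: the final column holds a base64-encoded feature buffer.
    features = base64.b64decode(columns[-1])
    return [c.decode("utf-8") for c in columns[:-1]], features


# Tiny demonstration on a throwaway file with two fake rows.
with tempfile.NamedTemporaryFile("wb", suffix=".tsv", delete=False) as tmp:
    tmp.write(b"img_1\t36\t" + base64.b64encode(b"\x00\x01") + b"\n")
    tmp.write(b"img_2\t36\t" + base64.b64encode(b"\x02\x03") + b"\n")
demo_index = build_offset_index(tmp.name)
demo_meta, demo_feat = read_row(tmp.name, demo_index, "img_2")
os.remove(tmp.name)
```

Only the offset dictionary stays resident with this approach. The trade-off is one seek and one decode per sample, so disk speed becomes the bottleneck instead of memory.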

Extracting Image Features

When trying to run feature extraction using the docker image, I am running
CUDA_VISIBLE_DEVICES=4 python extract_nlvr2_image.py --split train
and I get the error
python: can't open file 'extract_nlvr2_image.py': [Errno 2] No such file or directory
I have installed the docker image using sudo pull airsplay/bottom-up-attention. Please let me know how I can fix this. Thank you!

Model size

Hi, do you know roughly how large your model is?

Cannot use multiple GPUs for bottom-up feature extraction

Hi, thanks to the authors for the great docker environment for BU feature extraction.
However, I find that it cannot use multiple GPUs when I set CUDA_VISIBLE_DEVICES=0,1,2,3 in docker.
I checked the extract.py file, and it seems that it does not support multi-GPU extraction?

Or am I missing anything?
Thanks~
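
If the extraction script does turn out to be single-GPU, a common generic workaround is to start one process per GPU, each with its own CUDA_VISIBLE_DEVICES value and a disjoint slice of the image list. The sketch below only shows the slicing. The `shard` helper and the placeholder file names are hypothetical, and how the extraction script would accept a shard is not specified in this thread.

```python
def shard(items, num_shards, shard_id):
    """Return every num_shards-th item, starting at shard_id (round-robin)."""
    if not 0 <= shard_id < num_shards:
        raise ValueError("shard_id must be in [0, num_shards)")
    return items[shard_id::num_shards]


# Placeholder image names; a real run would list the dataset directory.
image_paths = ["img_%03d.jpg" % i for i in range(10)]

# One shard per GPU; process k would handle shards[k] only.
shards = [shard(image_paths, 4, gpu) for gpu in range(4)]
```

Round-robin slicing keeps the shards within one item of each other in size, and every image lands in exactly one shard, so the per-GPU outputs can be concatenated afterwards.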

"pre-training" section in the readme

Just want to confirm, when you talk about "pre-training" in the readme (https://github.com/airsplay/lxmert#pre-training) you mean training the entire LXMERT model from scratch?

If we just want to use a trained LXMERT model (and stick on a classification or LSTM layer at the end), we can just use the pre-trained model link you provided: http://nlp.cs.unc.edu/data/model_LXRT.pth, load your model, freeze the weights and then finetune with our specific task, right?

Thanks

Bad performance on NLVR2

Hi, thanks for releasing your code! I'm not able to reproduce your fine-tuning result on NLVR2. I followed your instructions by downloading the pre-trained model, downloading the image features, pre-processing the nlvr2 JSON files, and running the nlvr2_finetune.bash script as is. However, I get the following results, which are much lower than the result you reported. Do you know why this might be happening?

Epoch 0: Train 52.32
Epoch 0: Valid 50.86
Epoch 0: Best 50.86

Epoch 1: Train 50.50
Epoch 1: Valid 49.14
Epoch 1: Best 50.86

Epoch 2: Train 50.56
Epoch 2: Valid 49.31
Epoch 2: Best 50.86

Epoch 3: Train 54.83
Epoch 3: Valid 51.65
Epoch 3: Best 51.65

How to build 'all_ans.json'

Hi, and thank you for sharing your code in such a usable way!
The pretraining instructions mention a file named 'all_ans.json', which is required for launching the pretraining command, but I couldn't find out how to download or build it, and when I run
bash run/lxmert_pretrain.bash 0,1,2,3 --multiGPU --tiny
I get
FileNotFoundError: [Errno 2] No such file or directory: 'data/lxmert/all_ans.json'
Could anyone tell me how to get all_ans.json?
Many thanks

parallel warning

/usr/local/lib/python3.5/dist-packages/torch/nn/parallel/_functions.py:61: UserWarning: Was asked to gather along dimension 0, but all input tensors were scalars; will instead unsqueeze and return a vector.
warnings.warn('Was asked to gather along dimension 0, but all '

Do you also see this warning? I am using torch=1.0.1. Does it affect the final result?

Low training speed

Thanks for your excellent work. When I tried to run the code you provided, I found that each epoch takes 28 hours on eight 2080Ti GPUs. However, you mentioned in a previous issue that it takes only 19 hours per epoch on 4 Titan XP GPUs. Do you know of any reason why my training is so slow?

Why are the object and attribute losses not masked?

In lxmert_pretrain.py
obj_labels={ 'obj': (obj_labels, obj_confs), 'attr': (attr_labels, attr_confs), 'feat': (feat, feat_mask), },
It seems that the obj and attr losses are computed on all objects, not masked by feat_mask. It's odd to predict an object's class while its feature is unmasked. Am I correct?
This is also mentioned in #19.
Thanks in advance.

Dev-set splits in paper

Hi,

Could you please specify which dev-set splits you used to obtain the dev-set scores in the paper?
For example, is the VQA dev-set accuracy from the VQA test-dev split, or from MSCOCO minival? And which splits are used for GQA and NLVR?

Why is object classification loss multiplied with the Faster R-CNN confidence score?

During training, mask_conf is multiplied into the feature regression and object classification losses, which are defined here.
It is reasonable to mask the feature regression loss on masked regions, but I don't understand the reason for multiplying the object classification loss (which is a cross-entropy loss) by the Faster R-CNN confidence score (the top object probability).
Is this a sort of knowledge distillation? This is not mentioned in the EMNLP paper.
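
To make the arithmetic in question concrete, here is a minimal pure-Python sketch of a cross-entropy loss whose per-region terms are scaled by a detector confidence. It is not the repository's code: the function names are hypothetical, and reducing with a plain mean is an assumption of this sketch. Whatever the authors' motivation, the effect is that a region with a low-confidence label contributes proportionally less to the loss.

```python
import math


def cross_entropy(logits, label):
    """Cross-entropy of one region: -log softmax(logits)[label]."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[label]


def confidence_weighted_loss(all_logits, labels, confidences):
    """Mean of per-region cross-entropy, each term scaled by its confidence."""
    terms = [
        conf * cross_entropy(logits, label)
        for logits, label, conf in zip(all_logits, labels, confidences)
    ]
    return sum(terms) / len(terms)


# Two regions with uniform logits: each unweighted term equals log(2).
demo_loss = confidence_weighted_loss(
    all_logits=[[0.0, 0.0], [0.0, 0.0]],
    labels=[0, 1],
    confidences=[0.5, 1.0],
)
```

In the demo the first region's term is halved by its confidence of 0.5, so the result is 0.75 * log(2) rather than log(2).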

masked_lm_loss being optimised even when label is not matched ?

In this code snippet from lxrt/modeling.py:

lxmert/src/lxrt/modeling.py

Lines 940 to 953 in 9b8f0ff

if masked_lm_labels is not None and self.task_mask_lm:
    masked_lm_loss = loss_fct(
        lang_prediction_scores.view(-1, self.config.vocab_size),
        masked_lm_labels.view(-1)
    )
    total_loss += masked_lm_loss
    losses += (masked_lm_loss.detach(),)
if matched_label is not None and self.task_matched:
    matched_loss = loss_fct(
        cross_relationship_score.view(-1, 2),
        matched_label.view(-1)
    )
    total_loss += matched_loss
    losses += (matched_loss.detach(),)

masked_lm_loss is still being optimized when the label is not matched i.e. even when the image is not related to 'caption/text' since there is no check/mask multiplication.

This might not be exactly a bug considering the masked word can still be recovered using the language stream only but was it intended? Or am I missing some other part of the code?
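
For illustration, an explicit check of the kind described could be applied to the labels before the loss is computed. The sketch below is not taken from the repository: it assumes the loss function skips labels equal to -1, assumes a matched label of 1 means the sentence and image belong together, and uses plain lists instead of tensors. The helper name is hypothetical.

```python
IGNORE_INDEX = -1  # assumption: the loss function skips labels with this value


def drop_lm_labels_for_unmatched(masked_lm_labels, matched_labels):
    """Replace LM labels with IGNORE_INDEX for pairs whose matched label is not 1."""
    return [
        list(labels) if matched == 1 else [IGNORE_INDEX] * len(labels)
        for labels, matched in zip(masked_lm_labels, matched_labels)
    ]


# Two sentences of three tokens; only the first is paired with its own image.
demo_labels = drop_lm_labels_for_unmatched(
    masked_lm_labels=[[-1, 42, -1], [7, -1, -1]],
    matched_labels=[1, 0],
)
```

With labels filtered this way, masked tokens in mismatched pairs would no longer contribute to masked_lm_loss, while the matching loss itself stays unchanged.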

problem in feature extraction

Hi,
Thanks a lot for sharing such a useful repo. We are trying to apply it to a new dataset, but we ran into trouble during feature extraction.
The link to the pretrained Faster-RCNN model seems to be unavailable. Could you please share it on Google Drive or Baidu Drive?
Thank you.

GQA submission

I generated submit_predict.json and submitted it to the GQA evaluation server. However, I got an accuracy of 0 in the test phase, while the result in the dev phase makes sense. Is it possible that I predicted every answer wrong on the test split?

What is wrong with the submission file?
