This repository is a fork of google-research/long-range-arena used to benchmark neural attention memory (NAM).
- A CUDA 11-compatible GPU with more than 9 GB of VRAM (tested on an RTX 3080 10 GB).
- A Linux system with the NVIDIA driver and nvidia-docker installed.
- Pull the TensorFlow 2.7.0 image: `docker pull tensorflow/tensorflow:2.7.0-gpu`.
- Download the dataset from https://storage.googleapis.com/long-range-arena/lra_release.gz (this is actually a tar.gz file; rename it to extract it properly).
- Start the Docker container, mounting this repository at `/lra` and the dataset directory at `/dataset`, like below: `docker run -it --name=myflax --gpus all -v [this repository]:/lra -v [lra_release dataset directory]:/dataset tensorflow/tensorflow:2.7.0-gpu`
- Go to `/lra` and install the requirements: `pip install -r requirements.txt`.
- Install jaxlib by running `jaxlibinstall.sh`.
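The dataset download step above can be sketched as follows. The archive is a tar.gz file despite its `.gz` name, so it must be renamed before extraction; the commands assume the download sits in the current directory and are guarded so they are no-ops otherwise:

```shell
# The downloaded lra_release.gz is really a tar.gz archive; rename it so
# the filename matches its format, then extract it.
# (Guarded so the commands do nothing if the file is not present.)
if [ -f lra_release.gz ]; then
  mv lra_release.gz lra_release.tar.gz
fi
if [ -f lra_release.tar.gz ]; then
  tar -xzf lra_release.tar.gz
fi
```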
The tasks can be run without any modification using the pre-defined scripts:
- Ex1) Image classification with Transformer: `./cifar10.sh transformer`
- Ex2) ListOps with NAM Transformer: `./listops.sh nam_transformer`
- Ex3) Text classification with Linear Transformer: `./text_classification.sh linear_transformer`
Only these three transformer variants have been tested, but the scripts should also run with the other models.
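The three tested invocations above can be run back to back with a small wrapper. `run_if_present` is a hypothetical helper written for this sketch, not part of the repository; it simply skips any script missing from the current directory:

```shell
# Run each tested (task script, model name) pair from the repository root,
# skipping scripts that are not present or not executable.
run_if_present() {
  script="$1"; model="$2"
  if [ -x "$script" ]; then
    "$script" "$model"
  else
    echo "skip: $script not found"
  fi
}
run_if_present ./cifar10.sh transformer
run_if_present ./listops.sh nam_transformer
run_if_present ./text_classification.sh linear_transformer
```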
Debugging is in progress: `document.sh` and `pathfinder.sh` do not work properly yet.
The TensorBoard logs are recorded under the `/tmp` directory; view them with `tensorboard --logdir /tmp`.