textsnake.pytorch's Introduction

TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes

A PyTorch implementation of TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes (ECCV 2018) by Megvii

Paper

Comparison of different representations for text instances. (a) Axis-aligned rectangle. (b) Rotated rectangle. (c) Quadrangle. (d) TextSnake. Obviously, the proposed TextSnake representation is able to effectively and precisely describe the geometric properties, such as location, scale, and bending of curved text with perspective distortion, while the other representations (axis-aligned rectangle, rotated rectangle or quadrangle) struggle with giving accurate predictions in such cases.

TextSnake elements:

  • center point
  • tangent line
  • text region
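
As a rough illustration of how these elements fit together, the sketch below models a single disk of the snake: a center point, a radius, and the angle of the tangent line. The two points where the disk touches the text region boundary lie along the normal to the tangent. The `Disk` class, its field names, and `boundary_points` are illustrative assumptions and are not taken from this repository.

```python
import math
from dataclasses import dataclass


@dataclass
class Disk:
    """One element of a TextSnake-style representation (illustrative only)."""
    cx: float      # center point, x
    cy: float      # center point, y
    radius: float  # distance from the center line to the text boundary
    theta: float   # angle of the tangent line, in radians

    def boundary_points(self):
        """Return the two points at distance `radius` from the center, along the normal."""
        # The normal is the tangent direction rotated by 90 degrees.
        nx, ny = -math.sin(self.theta), math.cos(self.theta)
        side_a = (self.cx + self.radius * nx, self.cy + self.radius * ny)
        side_b = (self.cx - self.radius * nx, self.cy - self.radius * ny)
        return side_a, side_b


# A horizontal piece of text centered at the origin, 4 pixels tall.
disk = Disk(cx=0.0, cy=0.0, radius=2.0, theta=0.0)
print(disk.boundary_points())
```

Chaining such disks along the text center line and joining their boundary points gives a polygon for the text region.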

Description

In general, this code has the following features:

  1. includes complete training and inference code
  2. pure Python version, with no extra compilation step
  3. compatible with the latest PyTorch version (written with PyTorch 0.4.0)
  4. supports the TotalText and SynthText datasets

Getting Started

This repo includes the training code and an inference demo of TextSnake. Training and inference can be run with just a few commands.

Prerequisites

To run this repo successfully, the following setup is highly recommended:

  • Linux (Ubuntu 16.04)
  • Python3.6
  • Anaconda3
  • NVIDIA GPU(with 8G or larger GPU memory for training, 2G for inference)

(I haven't tested it on other Python versions.)

  1. Clone this repository:
$ git clone https://github.com/princewang1994/TextSnake.pytorch.git
  2. Install the Python packages with pip:
$ cd $TEXTSNAKE_ROOT
$ pip install -r requirements.txt

Data preparation

Pretraining with SynthText

$ CUDA_VISIBLE_DEVICES=$GPUID python train.py synthtext_pretrain --dataset synth-text --viz --max_epoch 1 --batch_size 8

Training

Train a model under a given experiment name $EXPNAME.

Training from scratch:

$ EXPNAME=example
$ CUDA_VISIBLE_DEVICES=$GPUID python train.py $EXPNAME --viz

Training with the pretrained model (improves performance considerably):

$ EXPNAME=example
$ CUDA_VISIBLE_DEVICES=$GPUID python train.py example --viz --batch_size 8 --resume save/synthtext_pretrain/textsnake_vgg_0.pth

Options:

  • exp_name: experiment name, used to identify different training runs
  • --viz: visualization toggle; output pictures are saved to ./vis by default

Other options can be shown by running python train.py -h

Running tests

Running the following command generates a demo on the TotalText dataset (300 pictures). The results are saved to ./vis by default.

$ EXPNAME=example
$ CUDA_VISIBLE_DEVICES=$GPUID python eval_textsnake.py $EXPNAME --checkepoch 190

options:

  • exp_name: experiment name, used to identify different training process

Other options can be shown by running python train.py -h

Evaluation

The Total-Text metric is included in dataset/total_text/Evaluation_Protocol/Python_scripts/Deteval.py. First modify input_dir in Deteval.py, then run the following command to compute DetEval:

$ python dataset/total_text/Evaluation_Protocol/Python_scripts/Deteval.py $EXPNAME --tr 0.8 --tp 0.4

or

$ python dataset/total_text/Evaluation_Protocol/Python_scripts/Deteval.py $EXPNAME --tr 0.7 --tp 0.6

The script then prints a metrics report.

Pretrained Models

Download from the links above and place the .pth file at the corresponding path (save/XXX/textsnake_vgg_XX.pth).

Performance

DetEval reporting

The following table reports DetEval metrics with VGG as the backbone (reproducible with the pretrained model from the Pretrained Models section):

                       tr=0.7 / tp=0.6 (P | R | F1)   tr=0.8 / tp=0.4 (P | R | F1)   FPS (single 1080Ti)
expand / no merge      0.652 | 0.549 | 0.596          0.874 | 0.711 | 0.784          12.07
expand / merge         0.698 | 0.578 | 0.633          0.859 | 0.660 | 0.746          8.38
no expand / no merge   0.753 | 0.693 | 0.722          0.695 | 0.628 | 0.660          9.94
no expand / merge      0.747 | 0.677 | 0.710          0.691 | 0.602 | 0.643          11.05
reported in paper      -                              0.827 | 0.745 | 0.784

* expand denotes expanding the radius by 0.3 times during post-processing

* merge denotes merging overlapping instances during post-processing
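
For reference, the F1 column is the harmonic mean of precision (P) and recall (R). The short sketch below recomputes it for the "expand / no merge" row; the function name `f1_score` is only an illustrative choice.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# "expand / no merge" row of the table
print(round(f1_score(0.652, 0.549), 3))  # tr=0.7 / tp=0.6
print(round(f1_score(0.874, 0.711), 3))  # tr=0.8 / tp=0.4
```

Because P and R in the table are themselves rounded, recomputing F1 this way may differ from a listed value in the last digit for some rows.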

Pure Inference

You can also run prediction on your own dataset without annotations:

  1. Download the pretrained model and place the .pth file at save/pretrained/textsnake_vgg_180.pth
  2. Run the pure inference script as follows:
$ EXPNAME=pretrained
$ CUDA_VISIBLE_DEVICES=$GPUID python demo.py $EXPNAME --checkepoch 180 --img_root /path/to/image

Predicted results are saved in output/$EXPNAME and visualizations in vis/${EXPNAME}_deploy

Qualitative results

  • left: prediction/ground truth
  • middle: text region(TR)
  • right: text center line(TCL)

What is coming

  • Pretraining with SynthText
  • Metric computing
  • Pretrained model upload
  • Pure inference script
  • More dataset support: [ICDAR15, CTW1500]
  • Various backbone experiments

License

This project is licensed under the MIT License - see the LICENSE.md file for details

Acknowledgement

textsnake.pytorch's People

Contributors

princewang1994

textsnake.pytorch's Issues

Pytorch 0.4.0 issue

Sharing this issue, which I hit on PyTorch 0.4.0:

Traceback (most recent call last):
File "train_textsnake.py", line 241, in
main()
File "train_textsnake.py", line 226, in main
train(model, train_loader, criterion, scheduler, optimizer, epoch, logger)
File "train_textsnake.py", line 66, in train
for i, (img, train_mask, tr_mask, tcl_mask, radius_map, sin_map, cos_map, meta) in enumerate(train_loader):
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 451, in iter
return _DataLoaderIter(self)
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 247, in init
self._put_indices()
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 295, in _put_indices
indices = next(self.sample_iter, None)
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/sampler.py", line 138, in iter
for idx in self.sampler:
File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/sampler.py", line 51, in iter
return iter(torch.randperm(len(self.data_source)).tolist())
RuntimeError: randperm is only implemented for CPU

=======================================================
In my case, I had to install a different version of PyTorch to solve this issue.

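Affine transformation of curved images

Thanks for open-sourcing this. Have you implemented the authors' post-processing, i.e. the perspective transformation of curved text images, to make it easier to plug into OCR?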

find_bottom logic understanding issue

Hi @princewang1994, I found that the logic for finding the snake head is a little different from the paper, though I have to say this case is really difficult to handle. Instead of selecting the 2 edges with M nearest to -1, you use a proposal-based method that selects all possible edges with M < -0.7.

    if len(pts) > 4:
        e = np.concatenate([pts, pts[:3]])
        candidate = []
        for i in range(1, len(pts) + 1):
            v_prev = e[i] - e[i - 1]
            v_next = e[i + 2] - e[i + 1]
            if cos(v_prev, v_next) < -0.7:
                candidate.append((i % len(pts), (i + 1) % len(pts), norm2(e[i] - e[i + 1])))

The next refining step is somewhat confusing.
If this condition is True:

candidate[0][0] == candidate[1][1] or candidate[0][1] == candidate[1][0]

you are visiting the same edge from 2 different directions, which seems impossible to me.
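
To make the discussion concrete, here is a self-contained sketch of the candidate-selection loop quoted above, rewritten with plain tuples instead of NumPy arrays. The helpers `cos` and `sub` are assumed stand-ins for the repository's own utilities, and `candidate_end_edges` is a hypothetical name; only the loop body mirrors the quoted snippet.

```python
import math


def cos(v1, v2):
    """Cosine of the angle between two non-zero 2-D vectors (assumed helper)."""
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    return dot / (math.hypot(*v1) * math.hypot(*v2))


def sub(a, b):
    """Component-wise difference a - b."""
    return (a[0] - b[0], a[1] - b[1])


def candidate_end_edges(pts, threshold=-0.7):
    """Collect edges whose two neighbouring edges point in roughly opposite directions."""
    n = len(pts)
    e = list(pts) + list(pts[:3])  # wrap around, as in the quoted snippet
    candidates = []
    for i in range(1, n + 1):
        v_prev = sub(e[i], e[i - 1])
        v_next = sub(e[i + 2], e[i + 1])
        if cos(v_prev, v_next) < threshold:
            length = math.hypot(*sub(e[i], e[i + 1]))
            candidates.append((i % n, (i + 1) % n, length))
    return candidates


# A straight, elongated hexagon: three points along the top, three along the bottom.
polygon = [(0, 0), (10, 0), (20, 0), (20, 5), (10, 5), (0, 5)]
print(candidate_end_edges(polygon))
```

For this polygon the two short end edges, (2, 3) and (5, 0), are the only candidates. As written, the quoted condition compares the start index of one candidate with the end index of the other, so it is satisfied when two candidate edges share an endpoint index.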

too many "torch.uint8 Warning"

C:\w\1\s\windows\pytorch\aten\src\ATen/native/IndexingUtils.h:20: UserWarning: indexing with dtype torch.uint8 is now deprecated, please use a dtype torch.bool instead.

Warning: indexing with dtype torch.uint8 is now deprecated, please use a dtype torch.bool instead. (expandTensors at C:/w/1/s/windows/pytorch/aten/src\ATen/native/IndexingUtils.h:20)

RuntimeError: merge_sort: failed to synchronize: an illegal memory access was encountered

Hello Author, I get "loss=inf" when I train on my own data. It happens after several epochs, and the details are here:
..........
(650 / 2000) - Loss: 1.3488 - tr_loss: 0.2789 - tcl_loss: 0.4046 - sin_loss: 0.1108 - cos_loss: 0.0985 - radii_loss: 0.4560
(700 / 2000) - Loss: 0.9112 - tr_loss: 0.2656 - tcl_loss: 0.3378 - sin_loss: 0.1111 - cos_loss: 0.0474 - radii_loss: 0.1492
(750 / 2000) - Loss: 1.3603 - tr_loss: 0.2677 - tcl_loss: 0.3537 - sin_loss: 0.0388 - cos_loss: 0.0105 - radii_loss: 0.6896
(800 / 2000) - Loss: 1.5277 - tr_loss: 0.2856 - tcl_loss: 0.3284 - sin_loss: 0.1668 - cos_loss: 0.0420 - radii_loss: 0.7048
/home/hj/smbshare/fffan/Detector/TextSnake/TextSnake.pytorch-master/util/misc.py:85: RuntimeWarning: invalid value encountered in double_scalars
return v[1] / l
/home/hj/smbshare/fffan/Detector/TextSnake/TextSnake.pytorch-master/util/misc.py:91: RuntimeWarning: invalid value encountered in double_scalars
return v[0] / l

(850 / 2000) - Loss: 2.1355 - tr_loss: 0.4698 - tcl_loss: 0.4413 - sin_loss: 0.1460 - cos_loss: 0.1237 - radii_loss: 0.9547
(900 / 2000) - Loss: 1.9340 - tr_loss: 0.4183 - tcl_loss: 0.3667 - sin_loss: 0.0760 - cos_loss: 0.0620 - radii_loss: 1.0110
(950 / 2000) - Loss: 1.2544 - tr_loss: 0.2943 - tcl_loss: 0.3723 - sin_loss: 0.0869 - cos_loss: 0.0938 - radii_loss: 0.4072
(1000 / 2000) - Loss: 1.2810 - tr_loss: 0.3134 - tcl_loss: 0.4176 - sin_loss: 0.1045 - cos_loss: 0.0622 - radii_loss: 0.3833
(1050 / 2000) - Loss: 2.0816 - tr_loss: 0.2053 - tcl_loss: 0.3180 - sin_loss: 0.0600 - cos_loss: 0.0517 - radii_loss: 1.4467
(1100 / 2000) - Loss: 1.7673 - tr_loss: 0.2696 - tcl_loss: 0.4861 - sin_loss: 0.1194 - cos_loss: 0.0957 - radii_loss: 0.7965
(1150 / 2000) - Loss: inf - tr_loss: 0.3392 - tcl_loss: 0.4790 - sin_loss: inf - cos_loss: 0.1815 - radii_loss: 0.1718

Traceback (most recent call last):
File "train_textsnake.py", line 239, in
main()
File "train_textsnake.py", line 224, in main
train(model, train_loader, criterion, scheduler, optimizer, epoch, logger)
File "train_textsnake.py", line 73, in train
criterion(output, tr_mask, tcl_mask, sin_map, cos_map, radius_map, train_mask)
File "/home/hj/.pyenv/versions/fffan-env/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in call
result = self.forward(*input, **kwargs)
File "/home/hj/smbshare/fffan/Detector/TextSnake/TextSnake.pytorch-master/network/loss.py", line 61, in forward
loss_tr = self.ohem(tr_pred, tr_mask.long(), train_mask.long())
File "/home/hj/smbshare/fffan/Detector/TextSnake/TextSnake.pytorch-master/network/loss.py", line 24, in ohem
loss_neg, _ = torch.topk(loss_neg, n_neg)
RuntimeError: merge_sort: failed to synchronize: an illegal memory access was encountered

I also always get this warning during training: "util/misc.py:91: RuntimeWarning: invalid value encountered in double_scalars"
Can you solve these problems?
PS: My data is ICPR
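
The warnings in the log point at `return v[1] / l` and `return v[0] / l` in util/misc.py, which produce an invalid value when the vector length `l` is zero. One plausible trigger is a degenerate annotation, such as two identical consecutive points. The sketch below shows a guarded variant. The function name `safe_sin_cos`, the `eps` threshold, and the choice of returning zeros are all assumptions, and whether this also explains the `inf` in sin_loss is not confirmed by the log.

```python
import math


def safe_sin_cos(v, eps=1e-8):
    """Return (sin, cos) of the direction of 2-D vector v.

    Falls back to (0.0, 0.0) when v has (near) zero length instead of dividing by zero.
    """
    length = math.hypot(v[0], v[1])
    if length < eps:
        return 0.0, 0.0
    return v[1] / length, v[0] / length


print(safe_sin_cos((3, 4)))  # a regular direction vector
print(safe_sin_cos((0, 0)))  # degenerate case that would otherwise divide by zero
```

Filtering out polygons with repeated points before training would be an alternative to guarding the division.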

About CTW1500 and ICDAR2015 dataset

Hi, thank you for your open source contributions.
I followed total_text.py to write the data loading part for CTW1500 and encountered the following problem:
Traceback (most recent call last):
  File "CTW1500_debug.py", line 86, in <module>
    img, train_mask, tr_mask, tcl_mask, radius_map, sin_map, cos_map, meta = trainset[idx]
  File "CTW1500_debug.py", line 55, in __getitem__
    return self.get_training_data(image, polygons, image_id=image_id, image_path=image_path)
  File "/work3/liuchao/TextSnake.pytorch/dataset/dataload.py", line 179, in get_training_data
    sideline1, sideline2, center_points, radius = polygon.disk_cover(n_disk=cfg.n_disk)
  File "/work3/liuchao/TextSnake.pytorch/dataset/dataload.py", line 53, in disk_cover
    inner_points2 = split_edge_seqence(self.points, self.e2, n_disk)
  File "/work3/liuchao/TextSnake.pytorch/util/misc.py", line 195, in split_edge_seqence
    while(cur_end > point_cumsum[cur_node + 1]):
IndexError: index 1 is out of bounds for axis 0 with size 1
Some of the samples trigger the error above. I can understand the reason for the error, but I don't know how to correct it. Can you help me fix the bug or share your CTW1500 and ICDAR 2015 dataset files? Thank you very much.

what is the format of dataset ?

Could you tell me what is the format of the dataset?

Could the ICDAR 2015 dataset work?

The data format looks like this:

x1, y1, x2, y2, x3, y3, x4, y4 hello

Thank you very much.
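
As a starting point for writing a loader for this kind of annotation, the sketch below parses one ground-truth line into four corner points and a transcription. It assumes every field, including the transcription, is separated by a comma, and that a byte-order mark may precede the first value. The function name `parse_icdar15_line` is hypothetical and not part of this repository.

```python
def parse_icdar15_line(line):
    """Parse 'x1,y1,x2,y2,x3,y3,x4,y4,text' into (points, text)."""
    # Drop surrounding whitespace and a possible byte-order mark at the start.
    parts = line.strip().lstrip("\ufeff").split(",", 8)
    if len(parts) < 9:
        raise ValueError(f"expected 8 coordinates and a transcription, got: {line!r}")
    coords = [int(p.strip()) for p in parts[:8]]
    points = list(zip(coords[0::2], coords[1::2]))
    text = parts[8].strip()  # may itself contain commas
    return points, text


points, text = parse_icdar15_line("377,117,463,117,465,130,378,130,Genaxis Theatre\n")
print(points, text)
```

The four points could then be wrapped in whatever polygon type the training pipeline expects.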

About the result of demo.py

I managed to run demo.py, but I only got the .txt results. How should I convert them into the corresponding image display? I am sorry to trouble you again.

training on my own dataset

Hi Dear Wang,
If I understand correctly:

  1. The "Pretraining with SynthText" section trains the model on the SynthText dataset.
  2. In the Training section, we have 2 choices: a) training from scratch, which trains the model on the TotalText dataset only, and b) training with the pretrained model.

Is this correct?

My question is: I have 100 images. How can I prepare them for training?
Thanks

result of textsnake

I have trained this model, but I only got an F1 measure of nearly 58%. Do you have any ideas for improving the results? Thank you. By the way, could you tell me your result?

validation loss doesn't decrease below 0.45

Hi dear,

Thank you for your code.
Both my training and validation loss don't decrease below 0.45 after 400 epochs. Could you help me with further configuration, such as the learning rate?

regards

merge contour issue

Hi Dear. I want to use the detector.merge_contours method to merge overlapping contours, but it gives the error "ValueError: too many values to unpack (expected 2)" at this line: cont_i, disk_i = all_contours[i]. Could you give some insight into how merge contour works? What are cont_i and disk_i? Thanks for your help.

RuntimeError: Cannot re-initialize CUDA in forked subprocess

Epoch: 0 : LR = 0.0001
Traceback (most recent call last):
  File "train_textsnake.py", line 238, in <module>
    main()
  File "train_textsnake.py", line 223, in main
    train(model, train_loader, criterion, scheduler, optimizer, epoch, logger)
  File "train_textsnake.py", line 63, in train
    for i, (img, train_mask, tr_mask, tcl_mask, radius_map, sin_map, cos_map, meta) in enumerate(train_loader):
  File "/home/yt/anaconda3/envs/pytorch_zqs/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 345, in __next__
    data = self._next_data()
  File "/home/yt/anaconda3/envs/pytorch_zqs/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 856, in _next_data
    return self._process_data(data)
  File "/home/yt/anaconda3/envs/pytorch_zqs/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 881, in _process_data
    data.reraise()
  File "/home/yt/anaconda3/envs/pytorch_zqs/lib/python3.6/site-packages/torch/_utils.py", line 395, in reraise
    raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/yt/anaconda3/envs/pytorch_zqs/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/yt/anaconda3/envs/pytorch_zqs/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
    return self.collate_fn(data)
  File "/home/yt/anaconda3/envs/pytorch_zqs/lib/python3.6/site-packages/torch/utils/data/_utils/collate.py", line 79, in default_collate
    return [default_collate(samples) for samples in transposed]
  File "/home/yt/anaconda3/envs/pytorch_zqs/lib/python3.6/site-packages/torch/utils/data/_utils/collate.py", line 79, in <listcomp>
    return [default_collate(samples) for samples in transposed]
  File "/home/yt/anaconda3/envs/pytorch_zqs/lib/python3.6/site-packages/torch/utils/data/_utils/collate.py", line 64, in default_collate
    return default_collate([torch.as_tensor(b) for b in batch])
  File "/home/yt/anaconda3/envs/pytorch_zqs/lib/python3.6/site-packages/torch/utils/data/_utils/collate.py", line 64, in <listcomp>
    return default_collate([torch.as_tensor(b) for b in batch])
  File "/home/yt/anaconda3/envs/pytorch_zqs/lib/python3.6/site-packages/torch/cuda/__init__.py", line 148, in _lazy_init
    "Cannot re-initialize CUDA in forked subprocess. " + msg)
RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method
Does anyone know how to solve it? Thanks.
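
The error message itself suggests switching worker processes to the 'spawn' start method. A minimal sketch with the standard-library `multiprocessing` module is shown below; `torch.multiprocessing` exposes the same call. The wrapper name `use_spawn_start_method` is hypothetical, and this is one possible workaround rather than a confirmed fix for this repository.

```python
import multiprocessing as mp


def use_spawn_start_method():
    """Switch worker processes from 'fork' to 'spawn', as the error message suggests."""
    # force=True allows overriding a start method that was already set.
    mp.set_start_method("spawn", force=True)
    return mp.get_start_method()


if __name__ == "__main__":
    # Call this once, before any DataLoader with worker processes is created.
    print(use_spawn_start_method())
```

Other options that may be worth trying are setting num_workers to 0 so that no worker processes are forked, or making sure the dataset only returns CPU data and moving batches to the GPU inside the training loop.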

How to introduce resnet50 backbone?

@princewang1994 Thanks for your excellent implementation.

I have tried the vgg16 backbone. However, I suppose a resnet50 backbone would be better than vgg16 in this case. How can resnet50 be supported as a backbone in textnet.py?

Thanks

How to get image_list.txt?

Hi, I downloaded the dataset from here, then ran python dataset/synth-text/make_list.py, and I couldn't find image_list.txt.

Is it necessary to download the dataset from this link?

How does mask_to_tcl work?

I have two questions about function mask_to_tcl in util/detection.py.

  1. Why not change result.append(np.array([x_shift, y_shift, radii])) to result.append(np.array([x_c, y_c, radii]))? I tried this and the F score rose from 0.657 to 0.663, and it makes more sense.
  2. What is the purpose of x_shift_pos and x_shift_neg?

How can I test trained model for my test images?

Thanks for your great work and codes.

I followed your commands:

  1. I ran the pretraining phase.
  2. I used the Total-Text dataset to further train the model.
  3. I tested on the Total-Text dataset to evaluate the model.

Now I want to evaluate the model on my own sample images. I ran demo.py, but it needs a .mat file for each of my sample images, and I don't want to provide .mat files. I want to test on images for which I don't have ground-truth files.

Regards, John

Unable to run inference

I am getting the error below when I try to test the pretrained model on my image:

=============End=============
Loading from ./save/pretrained/textsnake_vgg_180.pth
Start testing TextSnake.
Traceback (most recent call last):
File "demo.py", line 110, in
main()
File "demo.py", line 92, in main
inference(detector, test_loader, output_dir)
File "demo.py", line 46, in inference
contours, output = detector.detect(image)
File "/home/ec2-user/vinayak/TextSnake.pytorch/util/detection.py", line 239, in detect
tr_pred = output[0, 0:2].softmax(dim=0).data.cpu().numpy()
AttributeError: 'Tensor' object has no attribute 'softmax'

Error using torch.load to load textsnake_vgg_0.pth downloaded from Google Drive using Google Colab (to get free GPU)

Hello,

torch.load fails to load textsnake_vgg_0.pth (downloaded from Google Drive) in Google Colab, which is linked to my Drive. Here is the command I ran:

! python3.6 train_textsnake.py example --viz --batch_size 8 --resume /content/gdrive/My\ Drive/Colab\ Notebooks/TextSnake.pytorch/save/synthtext_pretrain/textsnake_vgg_0.pth

Here is the error I get (straight after it printed all the configs):

Loading from /content/gdrive/My Drive/Colab Notebooks/TextSnake.pytorch/save/synthtext_pretrain/textsnake_vgg_0.pth
Traceback (most recent call last):
File "train_textsnake.py", line 238, in
main()
File "train_textsnake.py", line 210, in main
load_model(model, cfg.resume)
File "train_textsnake.py", line 46, in load_model
state_dict = torch.load(model_path)['state_dict']
File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 303, in load
return _load(f, map_location, pickle_module)
File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 454, in _load
return legacy_load(f)
File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 380, in legacy_load
with closing(tarfile.open(fileobj=f, mode='r:', format=tarfile.PAX_FORMAT)) as tar,
File "/usr/lib/python3.6/tarfile.py", line 1589, in open
return func(name, filemode, fileobj, **kwargs)
File "/usr/lib/python3.6/tarfile.py", line 1619, in taropen
return cls(name, mode, fileobj, **kwargs)
File "/usr/lib/python3.6/tarfile.py", line 1482, in init
self.firstmember = self.next()
File "/usr/lib/python3.6/tarfile.py", line 2297, in next
tarinfo = self.tarinfo.fromtarfile(self)
File "/usr/lib/python3.6/tarfile.py", line 1092, in fromtarfile
buf = tarfile.fileobj.read(BLOCKSIZE)
OSError: [Errno 5] Input/output error

I assume part of the error is that PyTorch is trying to open the file as a tar archive although it is not one.

I would really appreciate your help.

Best Wishes.
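
The final `OSError: [Errno 5] Input/output error` is raised while reading the file, which suggests the mounted Drive path could not be read at that moment. One thing that may be worth trying is to copy the checkpoint to local disk first and load it from there. The sketch below uses only the standard library; the function name `copy_checkpoint_locally` and the default destination directory are assumptions.

```python
import os
import shutil


def copy_checkpoint_locally(src_path, dst_dir="/tmp/checkpoints"):
    """Copy a checkpoint off a network or FUSE mount and return the local path."""
    os.makedirs(dst_dir, exist_ok=True)
    dst_path = os.path.join(dst_dir, os.path.basename(src_path))
    shutil.copyfile(src_path, dst_path)
    # An empty file usually means the download or sync did not complete.
    if os.path.getsize(dst_path) == 0:
        raise OSError(f"{dst_path} is empty; the download may be incomplete")
    return dst_path
```

The returned path can then be passed to --resume in place of the path under the Drive mount. If the copy itself fails with the same I/O error, the file on Drive is probably not fully synced or is damaged.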

problematic point removal

The point removal in TextInstance removes all qualifying points independently, which does not work well in some cases.

before [[  15    0]
 [1216    0]
 [1219    0]
 [  13   16]]

after [[15  0]
 [13 16]]

[Screenshot: oops_screenshot_05 03 2019]
The image is from the latest ArT dataset.
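
One way to avoid removing too many points at once is to drop a single point per pass and re-evaluate the remaining ones afterwards. The sketch below does this with a triangle-area criterion; the criterion, the `ratio` threshold, the `min_points` floor, and all function names are hypothetical, since the repository's actual rule is not shown in this issue.

```python
def triangle_area(a, b, c):
    """Area of the triangle a-b-c (half the absolute cross product)."""
    return abs((b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])) / 2.0


def polygon_area(pts):
    """Polygon area by the shoelace formula."""
    n = len(pts)
    s = sum(pts[i][0] * pts[(i + 1) % n][1] - pts[(i + 1) % n][0] * pts[i][1]
            for i in range(n))
    return abs(s) / 2.0


def remove_redundant_points(pts, ratio=0.01, min_points=4):
    """Drop one near-collinear point at a time, re-evaluating after every removal."""
    pts = list(pts)
    while len(pts) > min_points:
        n = len(pts)
        total = polygon_area(pts)
        # Area each point spans with its two current neighbours.
        areas = [triangle_area(pts[i - 1], pts[i], pts[(i + 1) % n]) for i in range(n)]
        idx = min(range(n), key=areas.__getitem__)
        if areas[idx] > ratio * total:
            break  # no remaining point is redundant enough
        del pts[idx]
    return pts


quad = [(15, 0), (1216, 0), (1219, 0), (13, 16)]
print(remove_redundant_points(quad))                # kept as is: already at min_points
print(remove_redundant_points(quad, min_points=3))  # only the collinear point is dropped
```

With the polygon from this issue, the greedy version never collapses the instance to two points, because each removal is checked against the polygon that remains.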

How to get the Square or Rectangle by the test detect results?

First, I wish you a Happy New Year.

I have a problem that maybe you can help me with. Thank you very much.

I trained the model on my own dataset. The detector now outputs polygons described by three
elements (center point, tangent line, text region). How can I use these results to get a rectangle or square?

Thank you very much.
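
For an axis-aligned rectangle, taking the minimum and maximum coordinates of the polygon points is enough. The helper below is a minimal sketch with a hypothetical name, `bounding_box`, and assumes each polygon is a sequence of (x, y) pairs.

```python
def bounding_box(points):
    """Axis-aligned box (x_min, y_min, x_max, y_max) enclosing a polygon."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return min(xs), min(ys), max(xs), max(ys)


print(bounding_box([(1, 2), (5, 1), (4, 6)]))
```

If a rotated rectangle is preferred, OpenCV's cv2.minAreaRect followed by cv2.boxPoints can be applied to the same points.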

RuntimeError: CUDA error: initialization error (what is going on here?)

Running your training script reports the following error:
Traceback (most recent call last):
File "train_textsnake.py", line 239, in
main()
File "train_textsnake.py", line 224, in main
train(model, train_loader, criterion, scheduler, optimizer, epoch, logger)
File "train_textsnake.py", line 63, in train
for i, (img, train_mask, tr_mask, tcl_mask, radius_map, sin_map, cos_map, meta) in enumerate(train_loader):
File "/home/hj/anaconda3/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 582, in next
return self._process_next_batch(batch)
File "/home/hj/anaconda3/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 608, in _process_next_batch
raise batch.exc_type(batch.exc_msg)
RuntimeError: Traceback (most recent call last):
File "/home/hj/anaconda3/lib/python3.5/site-packages/torch/utils/data/_utils/worker.py", line 99, in _worker_loop
samples = collate_fn([dataset[i] for i in batch_indices])
File "/home/hj/anaconda3/lib/python3.5/site-packages/torch/utils/data/_utils/collate.py", line 68, in default_collate
return [default_collate(samples) for samples in transposed]
File "/home/hj/anaconda3/lib/python3.5/site-packages/torch/utils/data/_utils/collate.py", line 68, in
return [default_collate(samples) for samples in transposed]
File "/home/hj/anaconda3/lib/python3.5/site-packages/torch/utils/data/_utils/collate.py", line 63, in default_collate
return {key: default_collate([d[key] for d in batch]) for key in batch[0]}
File "/home/hj/anaconda3/lib/python3.5/site-packages/torch/utils/data/_utils/collate.py", line 63, in
return {key: default_collate([d[key] for d in batch]) for key in batch[0]}
File "/home/hj/anaconda3/lib/python3.5/site-packages/torch/utils/data/_utils/collate.py", line 59, in default_collate
return torch.tensor(batch)
RuntimeError: CUDA error: initialization error

The relevant configuration is:
==========Options============
input_channel: 1
log_freq: 100
num_workers: 8
optim: SGD
max_annotation: 200
gamma: 0.1
val_freq: 100
mgpu: False
viz: False
lr: 0.0001
display_freq: 50
save_freq: 10
loss: CrossEntropyLoss
lr_adjust: fix
exp_name: ICPR
pretrain: False
viz_freq: 50
cuda: True
rescale: 255.0
stepvalues: []
weight_decay: 0.0
momentum: 0.9
max_points: 20
resume: None
start_iter: 0
input_size: 512
batch_size: 4
img_root: /home/hj/smbshare/fffan/Data/ICPR_text_train_20180313/train_image_8000
means: [0.485, 0.456, 0.406]
tr_thresh: 0.6
start_epoch: 0
use_hard: True
save_dir: ./save/
max_epoch: 200
checkepoch: -1
log_dir: ./logs/
net: vgg
verbose: True
output_dir: output
device: cuda
post_process_merge: False
dataset: synth-text
post_process_expand: 0.3
n_disk: 15
tcl_thresh: 0.4
stds: [0.229, 0.224, 0.225]
vis_dir: ./vis/

RuntimeError: CUDA error: initialization error

Hello,

When trying to run
CUDA_VISIBLE_DEVICES=0 python train.py train_testy
I get the following error:

Traceback (most recent call last):
  File "train.py", line 157, in <module>
    main()
  File "train.py", line 143, in main
    train(model, train_loader, criterion, scheduler, optimizer, epoch)
  File "train.py", line 45, in train
    for i, (img, train_mask, tr_mask, tcl_mask, radius_map, sin_map, cos_map, meta) in enumerate(train_loader):
  File "/home/anaconda3/envs/text_snake_env/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 513, in __next__
    return self._process_next_batch(batch)
  File "/home/anaconda3/envs/text_snake_env/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 534, in _process_next_batch
    raise batch.exc_type(batch.exc_msg)
RuntimeError: Traceback (most recent call last):
  File "/home/anaconda3/envs/text_snake_env/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 99, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/anaconda3/envs/text_snake_env/lib/python3.6/site-packages/torch/utils/data/_utils/collate.py", line 66, in default_collate
    return [default_collate(samples) for samples in transposed]
  File "/home/anaconda3/envs/text_snake_env/lib/python3.6/site-packages/torch/utils/data/_utils/collate.py", line 66, in <listcomp>
    return [default_collate(samples) for samples in transposed]
  File "/home/anaconda3/envs/text_snake_env/lib/python3.6/site-packages/torch/utils/data/_utils/collate.py", line 63, in default_collate
    return {key: default_collate([d[key] for d in batch]) for key in batch[0]}
  File "/home/anaconda3/envs/text_snake_env/lib/python3.6/site-packages/torch/utils/data/_utils/collate.py", line 63, in <dictcomp>
    return {key: default_collate([d[key] for d in batch]) for key in batch[0]}
  File "/home/anaconda3/envs/text_snake_env/lib/python3.6/site-packages/torch/utils/data/_utils/collate.py", line 59, in default_collate
    return torch.tensor(batch)
RuntimeError: CUDA error: initialization error

It looks like it has something to do with enumerate(train_loader).

Can't run demo.py

Trying to run the pure inference part, I downgraded PyTorch to 1.5 (torch==1.5.0+cu101 torchvision==0.6.0+cu101), then ran CUDA_VISIBLE_DEVICES=0 python demo.py $EXPNAME --checkepoch 180 --img_root "path to image", but I get the following error in dataset/dataload.py:
[Screenshot of the error]

RuntimeError: cuda runtime error (11) : invalid argument at /opt/conda/conda-bld/pytorch_1532579805626/work/aten/src/THC/THCGeneral.cpp:663

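I ran into some problems and would like to ask the author. First, bash download.sh failed, so I downloaded the Total-Text dataset from its GitHub page myself. However, the ground truth of Total-Text on GitHub is annotated in txt files. So I first converted the txt files to csv, then replaced the parse_mat() function in total-text.py with the following code:
'''
import csv
import re

import numpy as np


def parse_csv(self, csv_path):
    """Parse a Total-Text ground-truth CSV file into TextInstance polygons."""
    with open(csv_path, 'rt', encoding='UTF-8') as row_data:
        data = list(csv.reader(row_data, delimiter=','))

    polygons = []
    for cell in data:
        # Columns 0 and 1 hold the x and y coordinates as runs of digits.
        x = [int(num) for num in re.findall(r'\d+', cell[0].strip())]
        y = [int(num) for num in re.findall(r'\d+', cell[1].strip())]

        # Columns 2 and 3 hold the quoted orientation and transcription;
        # fall back to the defaults when no quoted value is found.
        text_match = re.findall(r"'([^>]+?)'", cell[3])
        ori_match = re.findall(r"'([^>]+?)'", cell[2])
        text = text_match[0] if text_match else '#'
        ori = ori_match[0] if ori_match else 'c'

        if len(x) < 4:  # too few points
            continue
        pts = np.stack([x, y]).T.astype(np.int32)
        # TextInstance is the polygon class from this repository.
        polygons.append(TextInstance(pts, ori, text))

    return polygons

'''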

Finally, this is my output:
'''
Traceback (most recent call last):
File "eval_textsnake.py", line 123, in
main()
File "eval_textsnake.py", line 102, in main
inference(detector, test_loader, output_dir)
File "eval_textsnake.py", line 48, in inference
contours, output = detector.detect(image)
File "/root/fsy_SceneTextRec/Docker-pytorch0.4.1/TextSnake.pytorch/util/detection.py", line 237, in detect
output = self.model(image)
File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in call
result = self.forward(*input, **kwargs)
File "/root/fsy_SceneTextRec/Docker-pytorch0.4.1/TextSnake.pytorch/network/textnet.py", line 47, in forward
C1, C2, C3, C4, C5 = self.backbone(x)
File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in call
result = self.forward(*input, **kwargs)
File "/root/fsy_SceneTextRec/Docker-pytorch0.4.1/TextSnake.pytorch/network/vgg.py", line 92, in forward
C1 = self.stage1(x)
File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in call
result = self.forward(*input, **kwargs)
File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/container.py", line 91, in forward
input = module(input)
File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in call
result = self.forward(*input, **kwargs)
File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 301, in forward
self.padding, self.dilation, self.groups)
RuntimeError: cuda runtime error (11) : invalid argument at /opt/conda/conda-bld/pytorch_1532579805626/work/aten/src/THC/THCGeneral.cpp:663
'''
The environment is an RTX 2070, the official PyTorch 0.4.1 Docker image, and CUDA 9.0.

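Reproduced accuracy

Hello, have you measured whether the reproduced accuracy reaches or comes close to that reported by the original authors?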

the error in running demo.py

Loading from ./save/example/textsnake_vgg_10.pth
Start testing TextSnake.
detect 0 / 300 images: img650.jpg.
Traceback (most recent call last):
  File "demo.py", line 133, in <module>
    main()
  File "demo.py", line 119, in main
    inference(model, detector, test_loader)
  File "demo.py", line 77, in inference
    batch_result = detector.detect(tr_pred, tcl_pred, sin_pred, cos_pred, radii_pred)  # (n_tcl, 3)
  File "/home/crypa/TextSnake.pytorch/util/detection.py", line 204, in detect
    detect_result = self.build_tcl(tcl, sin_pred, cos_pred, radii_pred)
  File "/home/crypa/TextSnake.pytorch/util/detection.py", line 164, in build_tcl
    _, conts, _ = cv2.findContours(mask.astype(np.uint8), cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
ValueError: not enough values to unpack (expected 3, got 2)

Due to limited time, I did not train for the 200 epochs that the training instructions call for.
I tried to test the code with the checkpoint from epoch 10,
so I ran CUDA_VISIBLE_DEVICES=0 python demo.py --checkepoch 10 example
but got the above error. Sorry, I am a newbie. Could you help me?
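
The ValueError matches a known difference between OpenCV versions: in OpenCV 3.x cv2.findContours returns three values (image, contours, hierarchy), while in OpenCV 4.x it returns two (contours, hierarchy). A small version-agnostic helper is sketched below; the name `extract_contours` is hypothetical and not part of this repository.

```python
def extract_contours(find_contours_result):
    """Return the contour list from cv2.findContours output for OpenCV 3.x or 4.x."""
    # OpenCV 3.x returns (image, contours, hierarchy); 4.x returns (contours, hierarchy).
    # In both cases the contours are the second-to-last element.
    return find_contours_result[-2]


# Stand-ins for the two return shapes, using placeholder values instead of real arrays.
print(extract_contours(("image", ["c1", "c2"], "hierarchy")))  # 3.x style
print(extract_contours((["c1", "c2"], "hierarchy")))           # 4.x style
```

In util/detection.py the call could then be written as conts = extract_contours(cv2.findContours(mask.astype(np.uint8), cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)). Installing an OpenCV 3.x build would be another way around the error.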
