summerlvsong / aggregation-cross-entropy
Aggregation Cross-Entropy for Sequence Recognition. CVPR 2019.
I recreated your project and found that the input ground truth is converted into a bag of characters, so its order is lost, and the prediction only gives character counts. The order can only barely be judged from the positions in the network's output 2D matrix. How do you accurately predict the character order of a word?
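For reference, one way to read an order out of the output matrix is a CTC-style greedy pass: argmax each step, then collapse repeats and blanks. This is only my own sketch, not the paper's decoding procedure; the output shape (W, C) and blank index 0 are assumptions:

    import torch

    def greedy_readout(probs):
        # probs: (W, C) per-step class scores; class 0 assumed to be blank
        ids = probs.argmax(dim=1).tolist()
        decoded, prev = [], None
        for k in ids:
            if k != prev and k != 0:  # drop repeated steps and blanks
                decoded.append(k)
            prev = k
        return decoded  # ordered class indices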
I pretrained a model with CTC loss and it works well. Then I loaded the weights and continued training with the ACE loss. The loss seemed to be coming down, but the test results were terrible, almost all wrong.
Here is my implementation of ACELoss.
    import torch
    import torch.nn as nn

    # cfg is the project config (defines TRAIN.GPU_ID and ARCH.NUM_CLASS)
    device = torch.device("cuda:" + cfg.TRAIN.GPU_ID if torch.cuda.is_available() else "cpu")

    class ACELoss(nn.Module):
        def __init__(self):
            super().__init__()

        def forward(self, input_, target, target_lens):
            # input_: (W, B, C) softmax probabilities over W steps
            # target: 1-D tensor of concatenated labels; target_lens: per-sample lengths
            w, bs, num_class = input_.size()
            aggregations = torch.zeros(bs, cfg.ARCH.NUM_CLASS)
            idx = 0  # running offset into the flat, concatenated target tensor
            for i in range(bs):
                for _ in range(target_lens[i]):
                    aggregations[i][target[idx]] += 1
                    idx += 1
                aggregations[i][0] = w - target_lens[i]  # remaining steps count as blank
            target = aggregations.to(device)
            input_ = input_ + 1e-10            # keep log() away from exact zero
            input_ = torch.sum(input_, 0) / w  # per-class aggregate score, normalized
            target = target / w
            loss = (-torch.sum(torch.log(input_) * target)) / bs
            return loss
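A quick smoke test for the loss above; the toy shapes and labels are my assumptions, and cfg must be defined as in the snippet:

    # W=35 steps, batch of 2, cfg.ARCH.NUM_CLASS classes (class 0 = blank)
    probs = torch.softmax(torch.randn(35, 2, cfg.ARCH.NUM_CLASS), dim=2)
    flat_target = torch.tensor([1, 2, 3, 2, 4])  # labels of both samples, concatenated
    lens = torch.tensor([3, 2])                  # per-sample label lengths
    print(ACELoss()(probs.to(device), flat_target, lens))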
In the paper (https://arxiv.org/pdf/1904.08364.pdf), Sec. 3.2 mentions:
"We borrow the concept of cross-entropy from information theory, which is designed to measure the “distance” between two probability
distributions."
Won't KL divergence be a better way to measure the distance between the two probability distributions?
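For what it's worth, the two objectives coincide here by the standard identity (my note, not from the paper):

    H(p, q) = H(p) + D_KL(p || q)

Since the target distribution p (the normalized character counts) is fixed for a given label, H(p) is a constant, so minimizing the cross-entropy and minimizing the KL divergence give the same minimizer and the same gradients with respect to the prediction q.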
Hi, thanks for the amazing work.
I was wondering whether you could provide the tool you used to generate these toy datasets?
Thank you.
What does -u stand for? I couldn't find it anywhere.
For this line: torch.log(input)
The 'input' is the softmax score (0-1).
If the k-th class does not appear in an input, the accumulated softmax score over all time steps for the k-th class is very likely to be 0, and then torch.log(input) = nan.
How do you make sure that 'input' does not equal 0 in 'torch.log(input)'?
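One common guard is to floor the aggregated probabilities before the log; the epsilon below is an assumed value, matching the 1e-10 used in the snippet earlier in this thread:

    # probs: (B, C) aggregated softmax scores; target: (B, C) normalized counts
    probs = probs.clamp(min=1e-10)  # avoid log(0) -> -inf / nan
    loss = -(torch.log(probs) * target).sum() / probs.size(0)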
Table 3 (comparison with previous methods for HCTR) lists ACE (1D) at 91.68 / 91.25 / 96.70 / 96.22.
Will the code for this experiment be made public or not?
3C-FCRN+B_SLD+SLD (with the proposed residual LSTM) gets ICDAR CR 97.15 and AR 96.50, but ACE gets only CR 96.70 and AR 96.22.
So does the improvement on HCTR come from ACE, or from the residual LSTM?
Training with a fixed input size (32 × 280) and a fixed number of characters (10) gives good results only on short text.
I trained a CRNN model on the Synth90k dataset; although the loss declines step by step, the accuracy stays near 0 the whole time. What causes this problem?
Since the model can only recognize which characters occur and how many times, what is the accuracy criterion for 2D prediction in the paper?
In https://github.com/summerlvsong/Aggregation-Cross-Entropy/blob/master/source/models/seq_module.py#L65, should it be pred_string_set = [pred_string[i:i+self.w] for i in xrange(0, len(pred_string), self.w)] instead of self.w*2?
Please verify. Thanks.
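To illustrate the difference between the two step sizes (toy values; self.w = 3 is an assumption):

    pred_string = 'abcdefghi'
    w = 3
    print([pred_string[i:i+w] for i in range(0, len(pred_string), w)])      # ['abc', 'def', 'ghi']
    print([pred_string[i:i+w] for i in range(0, len(pred_string), w * 2)])  # ['abc', 'ghi'] -- every other row is skipped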
I use the ACE loss for English handwriting recognition. When I train the model, it does not converge, but with CTC it converges well. How can this happen?
Hello, excellent work on your "Aggregation Cross-Entropy for Sequence Recognition" paper. I just want to check whether you will release the code or not. Thanks!
I reproduced CRNN+CTC and tested it on the IIIT5K + SVT + IC03 + IC13 test databases, getting WER 0.153, the same as the result reported in the paper.
I also reproduced CRNN+ACE loss, but only got WER 0.205 on the same test databases. Any advice?
My environment:
pytorch 1.2.0
batch size 60
trained only on the 8-million synthetic images released by Jaderberg
1000k iterations
Adadelta, rho 0.9
Hello, I am opening this issue to ask whether ACE will be useful for speech recognition tasks. I am going to test your ACE loss on my acoustic model and hope it produces comparable performance. I will post the result later.
Nice work!
I found this network architecture in the paper for Handwritten Chinese Text Recognition:
Input(126, 576) − 8C3 − MP2 − 32C3 − MP2 − 128C3 − MP2 − 5×256C3 − MP2 − 512C3 − 512C3 − MP2 − 512C2 − 3×512ResLSTM − 7357FC − Output
(where kCn presumably denotes a convolution with k feature maps and an n×n kernel, MP2 a 2×2 max-pool, and FC a fully connected layer).
Is it necessary for Chinese text recognition?
The result is an n × 77 × 26 matrix; how do I match it to a word?
Hi, have you experimented on HWDB 2.0-2.2? Could you share your results for ACE? Thanks.
Why can the general loss function be approximated by Equation (2) in Section 3? Isn't each probability term in the approximation much larger than the corresponding term in the general loss function?
@summerlvsong @whang94 Thank you for your hard work.
Could you post a script that uses the trained model to run prediction?
Very nice work!
Can this method be combined with CTC in the 1D case to improve performance further? Does it conflict with CTC during training?
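One plausible way to combine the two losses is a weighted sum; this is my own sketch (the mixing weight lambda_ace and all variable names are assumptions, not something the paper prescribes):

    import torch.nn as nn

    ctc_criterion = nn.CTCLoss(blank=0)
    lambda_ace = 0.1  # assumed mixing weight, to be tuned

    # log_probs: (W, B, C) log-softmax output; probs: (W, B, C) softmax output
    ctc = ctc_criterion(log_probs, targets, input_lens, target_lens)
    ace = ACELoss()(probs, flat_targets, target_lens)
    loss = ctc + lambda_ace * ace
    loss.backward()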