Comments (22)
@scut-salmon I plan to publish a runnable version.
from captcha_solver.
@yeguixin Thank you very much!
In the article, you use a grid search method to search for the optimal parameters.
I am still confused about MADS.
Q1: Before using the traditional captcha generator, are these parameters initialized first?
Q2: Can you give us a more detailed introduction, or some links?
Q3: How does MADS determine whether these parameters are optimal? (If by comparing with real captchas, what exactly is compared?)
Best wishes!
from captcha_solver.
I have the same confusion. I would like to know how to use the parameters to control the data generator. For example, what if I want images with a noisy background? How does the discriminator generate images with noise? Is it because the real data have labels (noisy background, occluding lines, distortion)?
from captcha_solver.
I used an image generator that pastes each character onto a white image. Here every character is also a small image. All parameters, such as rotation angle, occluding lines, etc., are trained at this step.
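A minimal sketch of the pasting step described above, assuming each character is a small grayscale bitmap stored as a list of rows (0 = ink, 255 = white). The names `blank_canvas`, `paste_glyph`, `draw_line`, and `compose_captcha` are hypothetical and not taken from the repository. Rotation and distortion are left out, and a horizontal line stands in for arbitrary occluding lines.

```python
import random

WHITE, INK = 255, 0

def blank_canvas(width, height):
    """Return a white image as a list of rows."""
    return [[WHITE] * width for _ in range(height)]

def paste_glyph(canvas, glyph, left, top):
    """Copy the non-white pixels of a small glyph image onto the canvas."""
    for dy, row in enumerate(glyph):
        for dx, value in enumerate(row):
            y, x = top + dy, left + dx
            if value != WHITE and 0 <= y < len(canvas) and 0 <= x < len(canvas[0]):
                canvas[y][x] = value

def draw_line(canvas, y):
    """Draw one horizontal occluding line (a stand-in for arbitrary lines)."""
    if 0 <= y < len(canvas):
        canvas[y] = [INK] * len(canvas[0])

def compose_captcha(glyphs, width, height, num_lines, gap, rng):
    """Paste glyphs left to right, then add `num_lines` occluding lines."""
    canvas = blank_canvas(width, height)
    x = 0
    for glyph in glyphs:
        top = (height - len(glyph)) // 2  # centre each glyph vertically
        paste_glyph(canvas, glyph, x, top)
        x += len(glyph[0]) + gap
    for _ in range(num_lines):
        draw_line(canvas, rng.randrange(height))
    return canvas

# Two toy 3x3 "characters": a solid block and a vertical bar.
block = [[INK] * 3 for _ in range(3)]
bar = [[WHITE, INK, WHITE] for _ in range(3)]
image = compose_captcha([block, bar], width=12, height=5, num_lines=1, gap=2,
                        rng=random.Random(0))
```

In this sketch `num_lines` and `gap` play the role of the synthesis parameters. A real generator would render glyphs from fonts and apply rotation before pasting.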
We also have another generator, which is part of the GAN; it modifies the generated captcha at the pixel level to make sure that the generated captchas are similar to the real ones.
from captcha_solver.
Sorry, I reopened this issue.
from captcha_solver.
@yeguixin You use only 500 real captchas. Would the Captcha Image Generator generate 500 captchas, or more?
from captcha_solver.
If each character is a small image, then how do you handle the distance between them? Do you assume the distances between all characters in a real image are equal?
And the most confusing thing is that you train all the parameters. Do you mean they are trained by a neural network, or do you just set these parameters manually?
Looking forward to your kind reply, thanks!
from captcha_solver.
@awsssix We can synthesize as many captchas as we want, since we generate them using a traditional captcha generator. The 500 real captchas and the synthetic captchas are used to train the GANs. Here is a traditional captcha generator that may help in understanding how to generate captchas.
@scut-salmon The distance between two characters is random within a certain range, such as [0, 20] pixels. Our synthesizer can automatically tune the boundary of the range. Note that some parameters, such as the background and the number of occluding lines, are fixed according to the captcha scheme. In our initial experiments we found that manually setting the synthesis parameters can also perform well.
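As a rough illustration of the random spacing described above, the sketch below draws each gap uniformly from an inclusive range and returns the left x-coordinate of every character. The function name `sample_offsets` and the glyph widths are made up for the example; the uniform distribution is also an assumption, since the reply only says the distance is "random within a certain range".

```python
import random

def sample_offsets(glyph_widths, gap_range, rng):
    """Return the left x-coordinate of each character.

    The gap before every character after the first is drawn uniformly
    from gap_range (inclusive), e.g. (0, 20) pixels.
    """
    low, high = gap_range
    offsets, x = [], 0
    for i, width in enumerate(glyph_widths):
        if i > 0:
            x += rng.randint(low, high)
        offsets.append(x)
        x += width
    return offsets

widths = [18, 20, 16, 22]
offsets = sample_offsets(widths, gap_range=(0, 20), rng=random.Random(42))
```

Tuning the boundary of the range then amounts to changing the upper value of `gap_range` between synthesis rounds.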
from captcha_solver.
The parameters are initialized first. After that, the captcha generator synthesizes captchas, and then the generator of the GAN tries to refine the synthetic captchas. Finally, the discriminator tries to distinguish the generated captchas from the real ones.
If the discriminator successfully identifies the synthetic captchas (a threshold determines its discriminating ability), the parameter values are tuned. The values increase iteratively because we set the initial values relatively small.
To determine the optimal parameters, there are many training tricks, such as tuning the parameters every 10 iterations. If the discriminator successfully identifies the synthetic captchas most of the time, the parameters are tuned.
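The tuning loop described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: `tune_parameter`, `detect_fake`, the step size, and the 0.5 threshold are all hypothetical, and the discriminator is replaced by a toy function so the example runs on its own.

```python
def tune_parameter(detect_fake, initial, step, max_value,
                   check_every=10, threshold=0.5, max_iters=1000):
    """Increase one synthesizer parameter while the discriminator keeps winning.

    detect_fake(value) -> bool stands in for "synthesize a captcha with this
    parameter value, refine it with the GAN generator, and ask the
    discriminator whether it looks fake".
    """
    value = initial
    detections = 0
    for iteration in range(1, max_iters + 1):
        detections += detect_fake(value)
        if iteration % check_every == 0:
            rate = detections / check_every
            detections = 0
            if rate <= threshold:
                return value  # the discriminator is fooled often enough
            value = min(value + step, max_value)
    return value

# Toy discriminator: flags a sample as fake whenever the parameter
# (say, a maximum rotation angle) is below a hidden "real" value of 30.
def toy_detect_fake(value, real_value=30):
    return value < real_value

best = tune_parameter(toy_detect_fake, initial=5, step=5, max_value=60)
```

Starting from a small initial value and only ever increasing it mirrors the description above; a real setup would tune several parameters and retrain the GAN between checks.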
In fact, the discriminator decides real or fake by comparing image patches of the real captchas and the synthetic captchas. For this you can refer to PatchGAN.
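To make the patch idea concrete, here is a simplified sketch that scores non-overlapping patches and averages the results. It is only an analogy: an actual PatchGAN discriminator is a convolutional network whose output units each cover an overlapping receptive field, and the helper names and the toy scorer `has_ink` below are invented for the example.

```python
def split_into_patches(image, patch_size):
    """Yield non-overlapping patch_size x patch_size patches (row-major)."""
    height, width = len(image), len(image[0])
    for top in range(0, height - patch_size + 1, patch_size):
        for left in range(0, width - patch_size + 1, patch_size):
            yield [row[left:left + patch_size]
                   for row in image[top:top + patch_size]]

def patch_score(image, patch_size, score_patch):
    """Average the per-patch 'realness' scores into one image-level score."""
    scores = [score_patch(p) for p in split_into_patches(image, patch_size)]
    return sum(scores) / len(scores)

# Toy scorer: a patch counts as "real" (1.0) if it contains any ink (0).
def has_ink(patch):
    return 1.0 if any(v == 0 for row in patch for v in row) else 0.0

image = [
    [0, 255, 255, 255],
    [255, 255, 255, 255],
    [255, 255, 255, 255],
    [255, 255, 255, 0],
]
score = patch_score(image, patch_size=2, score_patch=has_ink)
```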
Hope that the above will be helpful.
from captcha_solver.
Excuse me, could you please offer some samples of real captchas and the corresponding synthetic captchas generated by the captcha generator (without security features)?
from captcha_solver.
This repo should be very helpful to you. @scut-salmon @awsssix
from captcha_solver.
I used 1500+ real captchas whose backgrounds contain noise. After the model was trained, the recognition accuracy when predicting new captchas is zero.
from captcha_solver.
@yeguixin When is the runnable version scheduled for release?
from captcha_solver.
When is the runnable version scheduled for release? @yeguixin
from captcha_solver.
In the first step, the captcha generator generates color captchas from the input characters. Can I assume that if all characters have the same color, it is a captcha without security features, and that characters with different colors count as a security feature? @yeguixin
from captcha_solver.
In general, a single color can easily be removed using image preprocessing methods. Different character colors increase the difficulty of the preprocessing. In our paper, we just summarized and categorized six kinds of security features for better description.
from captcha_solver.
So if I want to generate synthetic captchas with different colors using GANs, I can input some captchas with a single color and set the different colors (included in the fifth security feature) as a training parameter. The characters in my real captchas all have different colors. Am I right?
from captcha_solver.
When generating captchas, you can randomly set a different color for each character by changing its RGB value.
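For instance, per-character colors could be drawn like this. The function name `random_colors` is hypothetical, and sampling each channel uniformly from 0 to 255 is an assumption; a real generator would probably restrict the range so characters stay visible against the background.

```python
import random

def random_colors(text, rng):
    """Assign an independent random (R, G, B) tuple to each character."""
    return [(ch, tuple(rng.randint(0, 255) for _ in range(3))) for ch in text]

colored = random_colors("A7kQ", random.Random(1))
```

Each `(character, color)` pair can then be handed to whatever routine renders the character image.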
from captcha_solver.
Do you mean to set the color parameters in the traditional generator or in the GANs? According to your previous reply, multiple colors should be considered a security feature, while the traditional generator should pass relatively clean captchas (single color) to the generator in the GANs. Thank you for your reply.
from captcha_solver.
Yes, the parameters of the security features are set in the traditional captcha image generator. Once trained, the traditional captcha image generator can generate captchas with and without security features for a targeted captcha scheme.
from captcha_solver.
Thank you very much for your reply; it has helped me a lot. But I may have to bother you again. How many data pairs did you use to train the Pix2Pix model in the preprocessing step? Did you combine two pictures into one?
from captcha_solver.
We use 20K pairs of synthesized captcha images to train the preprocessing model. The data format strictly follows the Pix2Pix model. To do so, we first resize each captcha image to 256×256 pixels and then combine the two images into one, as in the following example.
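A small sketch of the resize-and-combine step, written in plain Python so it runs without an image library. It assumes the two images are joined side by side into one 512×256 image, with the captcha carrying security features on the left and the clean one on the right; the left/right order and the helper names `resize_nearest` and `make_pair` are assumptions, not taken from the repository.

```python
def resize_nearest(image, new_width, new_height):
    """Nearest-neighbour resize of an image stored as a list of rows."""
    old_height, old_width = len(image), len(image[0])
    return [
        [image[y * old_height // new_height][x * old_width // new_width]
         for x in range(new_width)]
        for y in range(new_height)
    ]

def make_pair(image_a, image_b, size=256):
    """Resize both images to size x size and join them side by side."""
    a = resize_nearest(image_a, size, size)
    b = resize_nearest(image_b, size, size)
    return [row_a + row_b for row_a, row_b in zip(a, b)]

# Toy inputs: a captcha with security features and its clean counterpart.
noisy = [[0, 255], [255, 0]]
clean = [[255, 255], [255, 0]]
pair = make_pair(noisy, clean)
```

In practice an image library would handle loading, interpolation, and saving; only the layout of the combined image matters here.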
from captcha_solver.