Comments (6)
We first use a generator to synthesize the captchas. The generator is actually a traditional captcha generator. After synthesizing the captchas, we use a discriminator to distinguish the synthetic captchas from the real ones. To make sure the style of the synthetic captchas is similar to that of the real ones, we use 500 real captchas to tune the generator parameters. For details, you can refer to my paper: http://delivery.acm.org/10.1145/3250000/3243754/p332-ye.pdf?ip=148.88.244.92&id=3243754&acc=ACTIVE%20SERVICE&key=BF07A2EE685417C5%2EF52F20EBE5138950%2E4D4702B0C3E38B35%2E4D4702B0C3E38B35&__acm__=1544006378_04ab2419edba35c7875b03e4147dd03d
from captcha_solver.
Thanks for your reply. You mentioned using 500 real captchas to tune the generator parameters, so the training set is 500 images. It usually takes a lot of data to train a network well.
I'm confused about how the network can be trained well using just 500 pictures.
from captcha_solver.
Actually, the training set is not only the 500 real captchas. It includes both real captchas and synthetic captchas generated by the captcha generator. Note that the captcha generator can produce captchas from initial parameters such as rotation, distortion, waving and so on. In fact, the more real captchas are used, the better the model will be. In our work, to launch the attack quickly, we used 500 real captchas, which we found performs well.
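To illustrate the idea of a parameter-driven generator whose output is mixed with real samples, here is a minimal sketch. It is not the code from the paper or the repository: `GeneratorParams`, `sample_params`, `apply_wave`, `build_training_set` and the parameter ranges are all hypothetical names and values chosen for illustration, and only the "waving" effect is implemented (rotation and distortion would be further parameters of the same kind).

```python
import math
import random
from dataclasses import dataclass


@dataclass
class GeneratorParams:
    # Hypothetical parameters; the real generator has more (rotation, distortion, ...).
    wave_amplitude: float  # maximum vertical shift in pixels
    wave_period: float     # horizontal period of the sine wave in pixels


def sample_params(rng, max_amplitude=2.0):
    """Draw one random parameter setting (ranges are assumptions)."""
    return GeneratorParams(
        wave_amplitude=rng.uniform(0.0, max_amplitude),
        wave_period=rng.uniform(4.0, 16.0),
    )


def apply_wave(image, params):
    """Shift each column vertically by a sine offset, a simple 'waving' effect."""
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for x in range(width):
        shift = round(params.wave_amplitude * math.sin(2 * math.pi * x / params.wave_period))
        for y in range(height):
            src = y - shift
            if 0 <= src < height:  # pixels shifted in from outside stay background (0)
                out[y][x] = image[src][x]
    return out


def build_training_set(real_images, template, n_synthetic, rng):
    """Combine real images with synthetic ones rendered from random parameters."""
    synthetic = [apply_wave(template, sample_params(rng)) for _ in range(n_synthetic)]
    return [(img, "real") for img in real_images] + [(img, "synthetic") for img in synthetic]


# A tiny 6x8 binary "image" with one horizontal stroke stands in for a captcha.
template = [[1 if y == 3 else 0 for _ in range(8)] for y in range(6)]
dataset = build_training_set([template], template, n_synthetic=5, rng=random.Random(0))
```

In this sketch the small real set only needs to be large enough to check that the sampled parameter ranges give realistic-looking output, while the bulk of the training data comes from the generator.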
from captcha_solver.
Sorry to trouble you again. I wonder why the captcha generator uses only 500 captchas, and how, with such a small dataset, it can generate captchas similar to the real ones.
from captcha_solver.
@yeguixin Do you use transfer learning? Looking forward to your reply!
from captcha_solver.
Hi, in our initial experiments, we used 500, 1000, 2000 and 5000 real captchas respectively. We found little difference between their results. But with 500 real captchas the model overfits more easily. To prevent overfitting, we set the dropout lower than 0.5 according to the complexity of the captcha scheme. Also, we apply multi-scale transformations to the captcha images.
I used transfer learning to fine-tune the base solver, which is trained on synthetic captchas, because the synthetic captchas are not exactly the same as the real ones.
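The two-stage training described above (train a base model on synthetic data, then fine-tune it on a small real set) can be sketched with a toy example. This is only an illustration under strong assumptions: a three-weight logistic regression stands in for the CNN solver, 2-D points stand in for captcha images, the "real" data differs from the "synthetic" data by a shifted decision boundary, and all names, sizes and learning rates are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, x):
    # weights = [bias, w0, w1]
    return sigmoid(weights[0] + weights[1] * x[0] + weights[2] * x[1])


def mean_loss(weights, data):
    """Average binary cross-entropy over (features, label) pairs."""
    eps = 1e-12
    total = 0.0
    for x, y in data:
        p = predict(weights, x)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(data)


def train(weights, data, lr, epochs):
    """Full-batch gradient descent starting from the given weights."""
    w = list(weights)
    for _ in range(epochs):
        grad = [0.0, 0.0, 0.0]
        for x, y in data:
            err = predict(w, x) - y
            grad[0] += err
            grad[1] += err * x[0]
            grad[2] += err * x[1]
        w = [wi - lr * g / len(data) for wi, g in zip(w, grad)]
    return w


def make_data(rng, n, offset):
    """Points in [-1, 1]^2 labelled by which side of x0 + x1 = offset they fall on."""
    data = []
    for _ in range(n):
        x = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        data.append((x, 1 if x[0] + x[1] > offset else 0))
    return data


rng = random.Random(0)
synthetic = make_data(rng, 2000, offset=0.0)  # plentiful, slightly "off" data
real = make_data(rng, 50, offset=0.4)         # scarce data from the true distribution

# Stage 1: train the base model on synthetic data only.
base = train([0.0, 0.0, 0.0], synthetic, lr=0.5, epochs=200)
# Stage 2: fine-tune from the base weights on the small real set, with a smaller step.
tuned = train(base, real, lr=0.1, epochs=200)
```

Comparing `mean_loss(base, real)` with `mean_loss(tuned, real)` shows how much the fine-tuning stage closes the gap between the synthetic and real distributions in this toy setting.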
from captcha_solver.