captchaharvester's Introduction

captchaharvester's People

Contributors

1fge, cosmo3904, dependabot[bot], noahcardoza, verybadsoldier

captchaharvester's Issues

Add more info to README

Clarify some things about the API mostly.

Also, mention that stale cookies could be an issue as per #9

Sir hello

I have one question: can you integrate it into a userscript? Please 🙏

Issue: Missing browser executable location for Windows

Starting from line 29 in browser.py:

browser_command = []
if system == 'Darwin':
    browser_command.append(f"open -a '{app}' -n --args")
    if not restart:
        browser_command.append(
            f'--user-data-dir=/tmp/havester/{str(uuid4())}')
    else:
        os.system(f'killall "{app}"')
elif system == 'Windows':
    if not restart:
        browser_command.append(
            f"--user-data-dir={os.path.join(os.environ['TEMP'], 'harvester', str(uuid4()))}")
    else:
        os.system(f'TASKKILL /IM {browser}.exe /F')
else:
    raise RuntimeError(
        'Automatic broswer functinality only avalible on MacOS and Windows for now')

The Windows branch is missing the initial argument that references the browser executable location.
Adding the line below right under elif system == 'Windows': was a quick fix (written as a raw string so the backslashes are not treated as escape sequences).

browser_command.append(r"C:\Program Files (x86)\Google\Chrome\Application\chrome.exe")
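A hard-coded path only works when Chrome is installed in that exact location. As a rough sketch, and not the project's own code, the executable could be looked up from a short list of candidate paths. The helper name find_browser_executable and the first candidate path (the 64-bit Program Files directory) are assumptions added for illustration; only the second path comes from the quick fix above.

```python
import os

# Candidate install locations. The first entry is an assumption (64-bit
# install directory); the second is the path from the quick fix above.
CANDIDATE_PATHS = (
    r"C:\Program Files\Google\Chrome\Application\chrome.exe",
    r"C:\Program Files (x86)\Google\Chrome\Application\chrome.exe",
)


def find_browser_executable(candidates=CANDIDATE_PATHS):
    """Return the first candidate path that exists as a file, else None."""
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


if __name__ == "__main__":
    executable = find_browser_executable()
    if executable is None:
        print("No browser executable found in the candidate paths")
    else:
        # Quote the path because it contains spaces.
        print('Would start the browser with: "{}"'.format(executable))
```

The quoted result would take the place of the hard-coded string as the first element of browser_command.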

datadome

hello,
can you adjust this to also work with DataDome captchas?

Accessing the website

Hi,
Thank you for this script, it's excellent!

I am new to web scraping and have a pretty basic question. I used your example and was able to solve the captcha manually and retrieve the token. But how do I actually post it back to the website if I want to access it? I am planning to scrape the website using requests. It's hcaptcha by Cloudflare and I am using Python. I would be very grateful if you could point me in the right direction.
Thank you!

question about usage

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from python_anticaptcha import AnticaptchaClient, NoCaptchaTaskProxylessTask
import time
from harvester import fetch

server_address = ('127.0.0.1', 5000)
token = fetch.token(server_address)
url = "https://emailondeck.com"
driver = webdriver.Firefox()
driver.implicitly_wait(30)
driver.get(url)

wait = WebDriverWait(driver, 30)
api_key = token
site_key = '6LeQLyEUAAAAAKTwLC-xVC0wGDFIqPg1q3Ofam5M'  # grab from site

client = AnticaptchaClient(api_key)
task = NoCaptchaTaskProxylessTask(url, site_key)
job = client.createTask(task)
print("Waiting to solution by Anticaptcha workers")
print('Token:', token)
job.join()
# Receive response
response = job.get_solution_response()
print("Received solution", response)

# Inject response in webpage
driver.execute_script('document.getElementById("g-recaptcha-response").innerHTML = "%s"' % response)

# Wait a moment to execute the script (just in case).
time.sleep(1)

# Press submit button
driver.find_element_by_xpath('//*[@id="get_email_btn"]').click()
/home/azamet/Belgeler/projelerim/venv/bin/python /home/azamet/Belgeler/projelerim/remover.py
Traceback (most recent call last):
  File "/home/azamet/Belgeler/projelerim/remover.py", line 20, in <module>
    job = client.createTask(task)
  File "/home/azamet/.local/lib/python3.8/site-packages/python_anticaptcha/base.py", line 128, in createTask
    self._check_response(response)
  File "/home/azamet/.local/lib/python3.8/site-packages/python_anticaptcha/base.py", line 112, in _check_response
    raise AnticaptchaException(
python_anticaptcha.exceptions.AnticaptchaException: [ERROR_KEY_DOES_NOT_EXIST:1]Account authorization key not found in the system

Process finished with exit code 1

After running the code above, the token is removed from the stored tokens, but I get this error every time. What am I doing wrong here?
And thanks again for this library.

Error thrown when trying to install the library.

After running the pip install captcha-harvester command recommended by the README.md file, I am presented with this error:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/private/var/folders/gb/5xlqkbdn70d5gsptxj6m2p4h0000gp/T/pip-install-PDB4vf/captcha-harvester/setup.py", line 3, in <module>
    import harvester
  File "harvester/__init__.py", line 1, in <module>
    from harvester import browser, fetch, server
  File "harvester/browser.py", line 31
    def read_osx_defults(plist: str, binary: str) -> str:
                              ^
SyntaxError: invalid syntax

Any help figuring this issue out would be greatly appreciated. Thanks!

Note: To clarify the error output, the ^ in the last lines of the traceback points to the : after plist. I wasn't sure whether that was clear from the original comment, because the formatting changed after I posted my question.
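A SyntaxError at the colon of a parameter annotation is what a Python 2 interpreter reports, since function annotations are Python 3 syntax. A likely cause, though not confirmed in this thread, is that pip was run by Python 2. The check below is a small sketch for confirming which interpreter is in use. The (3, 6) floor is an assumption, based only on the f-strings in the browser.py excerpt quoted in another issue, and the function name is made up for illustration. It is written without annotations or f-strings so that it also runs on Python 2.

```python
import sys

# Assumed minimum: f-strings, as used in the quoted browser.py, need 3.6.
MINIMUM_VERSION = (3, 6)


def interpreter_is_supported(version_info=None, minimum=MINIMUM_VERSION):
    """Return True when the (major, minor) version meets the minimum."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    current = "{}.{}".format(sys.version_info[0], sys.version_info[1])
    if interpreter_is_supported():
        print("Python " + current + " can parse annotated function signatures")
    else:
        print("Python " + current + " is too old; try python3 -m pip install ...")
```

Running the script with the same interpreter that pip uses shows whether the install is happening under Python 2.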

Recaptcha v3

Can you paste a reCAPTCHA v3 example and, if possible, tell me how you actually solved reCAPTCHA v3?

Captcha never gets solved

Hi, somehow the captcha never gets solved.

pystyle.Write.Print("\t[*] Solving captcha... please be patient!\n", pystyle.Colors.yellow, interval=0)
harvester = Harvester('0.0.0.0', 7777)
captchatokens = harvester.intercept_hcaptcha(domain='discord.com', sitekey=SITE_KEY)
server_thread = threading.Thread(target=harvester.serve, daemon=True)
server_thread.start()
while True:
    print("1")
    captchatoken = captchatokens.get()
    print(captchatoken)

With the code above, it prints the number 1 and then waits indefinitely.
I hope there is a solution for it.

Regards,
FuckingToasters

Info about server

Hi,
I'm trying to configure CaptchaHarvester with FlareSolver from ngosang, but when I run harvester I get the error "harvester: error: the following arguments are required: type, -k/--site-key, -d/--domain"

I'm not an expert, so the only thing I can do is follow the instructions.

Is there any way to run the server and interact with it from FlareSolver?

Thanks

Jo

Does V3 harvesting work?

Hi, I'm planning to implement this with Puppeteer in Node.js and wanted to know whether I can solve a v3 captcha beforehand with this.

Hcaptcha Not Displaying

@NoahCardoza I took your script and edited it a little. I removed the Google sign-in feature in the harvester.py file. I also changed the main.py script so that I can execute it all via PyCharm. The code is below:

from harvester import Harvest, load_html_template

def getTokens():
    html_template = load_html_template(
        'hcaptcha', '33f96e6a-38cd-421b-bb68-7806e1764460', 'localhost:5000')

    s = Harvest('http://www.sneakersnstuff.com', html_template)

    while True:
        s.solve()


def startServer():
    server.start('5000')


if __name__ == '__main__':
    getTokens()
    startServer()

I am only interested in getting into SNS right now. When I run main.py, I can't get the h-captcha box to display. Can you please help?

Another important thing: when I run the script on a VPN, I am able to pull up the h-captcha box, but only while the VPN is on. What is the reason for this?

Help Getting started

I am using your CloudProxy project but I often get this message:

2020-09-02T10:50:00.240Z INFO REQ-294 Captcha detected but no automatic solver is configured.

So I thought I should take a look at this project. But I cannot get it to work. As far as I know I should start it just by typing harvester but then I get this:

(cloudproxy) C:\Users\User>harvester
usage: harvester [-h] [-a DATA_ACTION] -k SITE_KEY -d DOMAIN [-H HOST]
                 [-p PORT] [-b BROWSER] [-B] [-r] [-e LOAD_EXTENSION] [-v]
                 {recaptcha-v2,recaptcha-v3,hcaptcha}
harvester: error: the following arguments are required: type, -k/--site-key, -d/--domain

So it seems I need 3 arguments. type seems to specify the captcha type. Well sorry, but how do I know?
And what is a site-key and where to get it?

Sorry for these (probably nooby) questions... Thanks for help.
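The usage message above can be read as a command-line definition: one positional type with three allowed values, plus two required options. The sketch below mirrors that shape with argparse so the required pieces are easier to see. It is not the project's actual parser. It covers only some of the flags, and the host and port defaults are assumptions taken from the "server running on http://127.0.0.1:5000" line quoted in another issue.

```python
import argparse

# The three values listed in braces in the usage message.
CAPTCHA_TYPES = ("recaptcha-v2", "recaptcha-v3", "hcaptcha")


def build_parser():
    """Build a parser shaped like the usage text (illustrative subset)."""
    parser = argparse.ArgumentParser(prog="harvester")
    parser.add_argument("type", choices=CAPTCHA_TYPES,
                        help="which captcha widget the target site uses")
    parser.add_argument("-k", "--site-key", required=True,
                        help="site key of the captcha on the target site")
    parser.add_argument("-d", "--domain", required=True,
                        help="domain the captcha is served for")
    # Defaults below are assumptions, not confirmed by the usage text.
    parser.add_argument("-H", dest="host", default="127.0.0.1")
    parser.add_argument("-p", dest="port", type=int, default=5000)
    return parser


if __name__ == "__main__":
    example = ["hcaptcha", "-k", "example-site-key", "-d", "example.com"]
    args = build_parser().parse_args(example)
    print(args.type, args.site_key, args.domain, args.host, args.port)
```

Leaving out the type, -k, or -d makes argparse exit with the same kind of "the following arguments are required" message shown above.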

Error on Linux

Traceback (most recent call last):
  File "/root/tools/pythonscripts/main.py", line 1206, in run_normal
    run_token(token, token_raw, headers_tls, token_proxy)
  File "/root/tools/pythonscripts/main.py", line 1135, in run_token
    join_data = join_captcha(response, token, server_task, total_servers, invite, token_proxy, current_ua, headers_tls, thread_id=thread_id)
  File "/root/tools/pythonscripts/main.py", line 910, in join_captcha
    captcha_key = captcha.get_captcha_key(current_ua, sitekey, rqdata, token_proxy)
  File "/root/tools/pythonscripts/captcha.py", line 443, in get_captcha_key
    from harvester import Harvester
  File "/usr/local/lib/python3.9/dist-packages/harvester/__init__.py", line 1, in <module>
    from harvester import browser, fetch, server
  File "/usr/local/lib/python3.9/dist-packages/harvester/server/__init__.py", line 54, in <module>
    class MITMRecord:
  File "/usr/lib/python3.9/dataclasses.py", line 1021, in dataclass
    return wrap(cls)
  File "/usr/lib/python3.9/dataclasses.py", line 1013, in wrap
    return _process_class(cls, init, repr, eq, order, unsafe_hash, frozen)
  File "/usr/lib/python3.9/dataclasses.py", line 927, in _process_class
    _init_fn(flds,
  File "/usr/lib/python3.9/dataclasses.py", line 531, in _init_fn
    return _create_fn('__init__',
  File "/usr/lib/python3.9/dataclasses.py", line 400, in _create_fn
    exec(txt, globals, ns)
  File "/root/tools/pythonscripts/main.py", line 538, in lg
    raise SystemExit
SystemExit

Buster Integration with Harvester?

I wanted to integrate Buster (https://github.com/dessant/buster), an automatic captcha solver, with Harvester. Since Harvester doesn't use Selenium, there's no way to add buster.crx (for Chromedriver) to a Harvester instance. Maybe a plugin/extension option in the future would help? Or are there any alternative ways to achieve this?

ReCaptcha not rendering for https://www.google.com/recaptcha/api2/demo

Steps to reproduce:

visit url "https://www.google.com/recaptcha/api2/demo"

domain = www.google.com
sitekey = 6Le-wvkSAAAAAPBMRTvw0Q4Muexq9bi0DJwx_mJ-
self.harvester.intercept_recaptcha_v2(domain=domain, sitekey=sitekey)

Running the method:

self.harvester = Harvester()
server_thread = Thread(target=self.harvester.serve, daemon=True)
server_thread.start()

# launch a browser instance where we can solve the captchas
print("Launching browser...")
self.harvester.launch_browser()

The resulting error is that the harvester browser launches and everything renders except for the recaptcha. When checking network tab there is an ssl error.

image

Hey

So for me the example code works with the sneaker site. But when I try to use it on discord I get this error

image

discord

hey im trying to add you on discord,

MacHacker#7322 is not found, lmk :)

Problems with recaptcha v3

I have studied recaptcha v3 and used the 2captcha service (It doesn't work very well for my purposes). As far as I could see the recaptcha depends on the site of origin to pass the test. I thought that just changing the HTTP referer header can solve this. But I don't know how to fix this in your code.

Could you help me find a way to define the source URL of the recaptcha?

I tried to force the full site URL with "harvester.intercept_recaptcha_v3(domain=complete_url)", but that didn't work.

Can't find ssl cert files when packaging with pyarmor/pyinstaller

Traceback (most recent call last):
  File "threading.py", line 926, in _bootstrap_inner
  File "threading.py", line 870, in run
  File "site-packages\harvester\server\__init__.py", line 149, in start
  File "site-packages\harvester\server\__init__.py", line 135, in setup
  File "ssl.py", line 1232, in wrap_socket
FileNotFoundError: [Errno 2] No such file or directory
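A FileNotFoundError raised from ssl wrap_socket in a packaged build usually means the certificate and key files were not copied into the bundle. The sketch below shows one common way to resolve data-file paths in both normal and PyInstaller runs, using the sys._MEIPASS attribute that PyInstaller sets at run time. The function names and the file names server.crt and server.key are hypothetical, and whether the harvester package resolves its certificates this way is not stated in this thread. The files also have to be added to the bundle as data files at build time.

```python
import os
import sys


def bundled_path(relative_path, source_dir):
    """Resolve a data file against the PyInstaller bundle when frozen,
    otherwise against the given source directory."""
    base_dir = getattr(sys, "_MEIPASS", None) or source_dir
    return os.path.join(base_dir, relative_path)


def missing_files(paths):
    """Return the subset of paths that do not exist on disk."""
    return [path for path in paths if not os.path.isfile(path)]


if __name__ == "__main__":
    here = os.path.dirname(os.path.abspath(__file__))
    # Hypothetical file names, used only for illustration.
    wanted = [bundled_path(name, here) for name in ("server.crt", "server.key")]
    for path in missing_files(wanted):
        print("Not found in this build:", path)
```

Printing the missing paths before starting the server makes it clearer which file the packager left out than the bare Errno 2 does.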

Ways to improve Captcha V3 score

First of all thanks for this great package! Works like a charm 😄

I'm trying to bypass a webservice call that uses a captcha V3, I'm able to successfully "complete" the V3 captcha, however my score is being considered too low, meaning it identified my script as a bot 😢 (means the captcha is working great though haha).

Are there ways to improve the captcha score so that it thinks I'm a human? I tried signing in to a Google account and browsing some sites, but that didn't seem to change much. I'm using it in a Python script that is pretty close to your example, since I didn't spend much time on it while writing the rest of the project 😄

Or would it be better to try and bypass the V3 captcha with a V2 Captcha?

Integrate with selenium

I made an API to solve hCaptcha, reCAPTCHA and others.
It is working perfectly, but now I want to run a server that solves captchas for me so that my virtual machine does not have to load the models every time.

I want to use Harvester with Selenium: when I open Harvester, I need the captcha to be solved with Selenium.

Issue with HSTS

When used with a website that has HSTS enabled, the Harvester does not work:

harvester(ERROR) [08/Dec/2020 17:31:59] [127.0.0.1] code 400, message Bad request version ('P\x19ê®')
harvester(ERROR) [08/Dec/2020 17:31:59] [127.0.0.1] code 400, message Bad request version ('Ô\x95ã}\x11½\x9bGÙ¨iBá\x0f\x91k\x1b`}÷&b\x86\x1a:±.\x05Å°ý\x00"ªª\x13\x03\x13\x01\x13\x02̨̩À+À/À,À0À\x13À\x14\x00\x9c\x00\x9d\x00/\x005\x00')
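"Bad request version" is the message Python's http.server logs when the first line it receives is not a valid HTTP request line. One plausible reading of the binary data in these log lines, offered as an assumption and not a confirmed diagnosis, is that the browser upgraded the request to https because of HSTS and sent a TLS handshake to a listener that expected plain HTTP. A TLS handshake record starts with the byte 0x16 followed by a 0x03 major version byte, so the first bytes of a connection can be checked as in this sketch (the function name is made up for illustration):

```python
TLS_HANDSHAKE_CONTENT_TYPE = 0x16  # first byte of a TLS handshake record
TLS_MAJOR_VERSION = 0x03           # major version byte used by TLS 1.x


def looks_like_tls_handshake(first_bytes):
    """Return True when the bytes start like a TLS handshake record."""
    return (
        len(first_bytes) >= 3
        and first_bytes[0] == TLS_HANDSHAKE_CONTENT_TYPE
        and first_bytes[1] == TLS_MAJOR_VERSION
    )


if __name__ == "__main__":
    samples = {
        "plain HTTP": b"GET / HTTP/1.1\r\n",
        "TLS ClientHello prefix": b"\x16\x03\x01\x02\x00\x01",
    }
    for label, data in samples.items():
        print(label, "->", looks_like_tls_handshake(data))
```

If the bytes arriving on the plain-HTTP port look like a handshake, the browser is speaking https to that port.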

H-captcha box not displaying

When I try running your script, the h-captcha box itself does not display (I put your h-captcha HTML script where it says self.htmlcode).

Opening website instead of Captcha Harvester

Whenever I run harvester on a website, e.g. the Supreme checkout, it loads the website itself and not the captcha box from it. Can you please check why this is happening?

Here are two of my tries with different links from the shop:

C:\CaptchaHarvester>harvester -d www.supremenewyork.com/checkout.json -k 6LeWwRkUAAAAAOBsau7KpuC9AV-6J8mhw4AjC3Xz -b chrome recaptcha
server running on http://127.0.0.1:5000
['start chrome', '--user-data-dir=C:\Users\ankur\AppData\Local\Temp\harvester\6346a89c-f6dc-49f3-9bdb-0858893f7d51', '--no-default-browser-check', '--proxy-pac-url="http://127.0.0.1:5000/www.supremenewyork.com/checkout.json.pac"', '--window-size=400,580', '--app="http://www.supremenewyork.com/checkout.json"']

========================================================

C:\CaptchaHarvester>harvester -d www.supremenewyork.com/mobile/#checkout -k 6LeWwRkUAAAAAOBsau7KpuC9AV-6J8mhw4AjC3Xz -b chrome recaptcha
server running on http://127.0.0.1:5000
['start chrome', '--user-data-dir=C:\Users\ankur\AppData\Local\Temp\harvester\8dff42b2-bf2e-4e16-9509-71b6c616f0ce', '--no-default-browser-check', '--proxy-pac-url="http://127.0.0.1:5000/www.supremenewyork.com/mobile/#checkout.pac"', '--window-size=400,580', '--app="http://www.supremenewyork.com/mobile/#checkout"']

============================================================
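Both commands pass a host plus a path (and, in the second, a fragment) to -d, and the logged --app and --proxy-pac-url values are built from that whole string. If the option expects a bare domain, which is an assumption here, the value can be reduced to its hostname first. The helper name hostname_of is made up for illustration.

```python
from urllib.parse import urlsplit


def hostname_of(value):
    """Return the hostname of a URL or of a scheme-less host/path string."""
    if "://" not in value:
        # Without a scheme, urlsplit treats everything as a path, so
        # prefix "//" to mark the start of the network location.
        value = "//" + value
    host = urlsplit(value).hostname
    if not host:
        raise ValueError("no hostname found in {!r}".format(value))
    return host


if __name__ == "__main__":
    for raw in ("www.supremenewyork.com/checkout.json",
                "www.supremenewyork.com/mobile/#checkout"):
        print(raw, "->", hostname_of(raw))
```

Both inputs from the commands above reduce to www.supremenewyork.com.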
