Comments (8)
PS i got autoscab on pypi lmao lets do this
https://pypi.org/project/autoscab/
https://github.com/sneakers-the-rat/autoscab
from kelloggbot.
We could take a dictionary of {xpath: action}, where xpath is a string and action is an instance of an action class. There could be a base Action class:
class Action:
    def __init__(self, fun):
        '''
        Create Action from function/lambda.
        It will be passed the element at xpath, so it must accept one argument.
        '''
        # Stored on the instance, so it is called as action.fun(element) with no implicit self.
        self.fun = fun
Then have two standard ones:
import random

class Click(Action):
    def __init__(self):
        # Click needs no configuration; it just clicks the element it is given.
        self.fun = lambda element: element.click()

class Input(Action):
    def __init__(self, inputs):
        '''
        inputs is either a string, a list of possible strings to choose from, or a function/lambda that returns the string to use
        '''
        if isinstance(inputs, list):
            self.fun = lambda element: element.send_keys(random.choice(inputs))
        elif callable(inputs):
            self.fun = lambda element: element.send_keys(inputs())
        else:  # Assume it’s either a string or can be cast to a string
            self.fun = lambda element: element.send_keys(str(inputs))
Then have a function that takes the dictionary as input and carries out the actions, perhaps even in a loop:
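from selenium.webdriver.common.by import By

def autoscab(driver, actions, times=0):
    '''
    driver is a Selenium WebDriver.
    actions is a dictionary:
    {
        xpath (string): action (an Action or an instance of a class that inherits from it)
    }
    times is the number of times to run this, 0 for infinite
    '''
    if times == 0:
        iterate = iter(int, 1)  # int() always returns 0, never 1, so this never ends
    else:
        iterate = range(times)
    for _ in iterate:
        for xpath, action in actions.items():
            # find_element_by_xpath was removed in Selenium 4
            element = driver.find_element(By.XPATH, xpath)
            action.fun(element)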
This is a rough draft. It probably needs some more error handling, but it’s probably best to let the user handle errors. It should work for most applications, and to update it you mainly just have to change a dictionary.
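To make the dictionary idea concrete, here is a minimal sketch of how it could be used. It redefines small versions of the classes so it runs on its own, and FakeDriver, FakeElement, and run_actions are hypothetical stand-ins (not part of Selenium or of this project) so nothing needs a real browser. The xpaths are made up.

```python
import random


class Action:
    """Wrap a one-argument callable that receives the located element."""

    def __init__(self, fun):
        self.fun = fun


class Click(Action):
    def __init__(self):
        super().__init__(lambda element: element.click())


class Input(Action):
    def __init__(self, inputs):
        if isinstance(inputs, list):
            super().__init__(lambda el: el.send_keys(random.choice(inputs)))
        elif callable(inputs):
            super().__init__(lambda el: el.send_keys(inputs()))
        else:
            super().__init__(lambda el: el.send_keys(str(inputs)))


class FakeElement:
    """Hypothetical stand-in for a WebElement; it only records calls."""

    def __init__(self, xpath, log):
        self.xpath, self.log = xpath, log

    def click(self):
        self.log.append(("click", self.xpath))

    def send_keys(self, text):
        self.log.append(("send_keys", self.xpath, text))


class FakeDriver:
    """Hypothetical stand-in for a WebDriver with a find_element method."""

    def __init__(self):
        self.log = []

    def find_element(self, by, value):
        return FakeElement(value, self.log)


def run_actions(driver, actions, times=1):
    """Look up each xpath in insertion order and apply its action."""
    for _ in range(times):
        for xpath, action in actions.items():
            action.fun(driver.find_element("xpath", xpath))


driver = FakeDriver()
actions = {
    "//input[@name='q']": Input("hello"),
    "//button[@type='submit']": Click(),
}
run_actions(driver, actions, times=2)
```

One thing this shows: because dictionaries keep insertion order, the order of the entries is the order of the steps, so a form that needs the same xpath twice would need a list of pairs instead.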
from kelloggbot.
I like the idea, but I'd be worried about that being limiting.
Fair enough, but if you just want to make a quick bot that's probably a good place to start, until people have time to make actual bots. Technically it could also work for purposes other than this but this is the main focus.
It might be better to componentize the project into a set of tools - resume_generator, captcha_solver, email_verifier, etc. - which can be imported into a selenium project and used in-place.
I think this would be a good idea. It also helps people quickly make bots, and it would also work with things other than selenium, if selenium is a bit too heavy for the particular application (~1mb last time I checked).
from kelloggbot.
yes! this is what i am doing ^^ will post here when i get a draft. Splitting into a bot that can take a set of selectors, an identity class that can do all the faking, and the resume generator with hooks for the identity class to use.
from kelloggbot.
I also think that trying to abstract the process further would take a bit of development time, i'm thinking of a programming interface that would be v familiar/usable by nonprogrammers (click thing, wait, type thing, wait, switch tabs, wait) and then we can do further abstraction depending on patterns that emerge
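A rough sketch of what that kind of step-list interface might look like, just to make the idea concrete. Everything here is hypothetical: run_steps, RecordingPage, and the verb names are made up for illustration and are not from autoscab or Selenium. A real version would put a page object backed by a browser where RecordingPage is.

```python
import time


def run_steps(page, steps, sleep=time.sleep):
    """Run (verb, *args) steps in order against a page-like object."""
    handlers = {
        "click": page.click,
        "type": page.type_text,
        "switch_tab": page.switch_tab,
        "wait": sleep,
    }
    for verb, *args in steps:
        if verb not in handlers:
            raise ValueError(f"unknown step: {verb!r}")
        handlers[verb](*args)


class RecordingPage:
    """Hypothetical page object that only records what it was asked to do."""

    def __init__(self):
        self.calls = []

    def click(self, target):
        self.calls.append(("click", target))

    def type_text(self, target, text):
        self.calls.append(("type", target, text))

    def switch_tab(self, index):
        self.calls.append(("switch_tab", index))


# The script a non-programmer would write: one step per line.
steps = [
    ("click", "#start"),
    ("wait", 0.5),
    ("type", "#name", "example"),
    ("wait", 0.5),
    ("switch_tab", 1),
]

page = RecordingPage()
waits = []
run_steps(page, steps, sleep=waits.append)  # record waits instead of sleeping
```

Keeping the steps as plain data means patterns that show up across scripts can later be folded into new verbs without changing how existing scripts are written.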
from kelloggbot.
Being able to extract the work done here into either a generalized program or build out some of the subroutines into external libraries would be extremely beneficial. Is there a specific way we'd want to do this?
from kelloggbot.
Being able to extract the work done here into either a generalized program or build out some of the subroutines into external libraries would be extremely beneficial. Is there a specific way we'd want to do this?
Am working on a draft over here, though will need another day to get a full version, sorry to be cryptic: https://github.com/sneakers-the-rat/autoscab
edit: have also fixed up the packaging and it's pypi-ready.
from kelloggbot.
OK autoscab 0.2.0 is up now. I'm totally fuzzyheaded right now, but basic organization:
postbot:PostBot is the main driver, it spawns the chromedriver and has some syntactical sugar to let people interact with the page by just using self.<element> calls, as well as a logger, etc. It takes...
- a starting url,
- a locator_dict -- a dictionary of {'name': (By.<SELECTOR_TYPE>, '<SELECTOR>')}, you can see an example in constants/locators (that empty class just gets turned into a dict like dict(LocatorClass.__dict__))
- an Identity (or it makes one if none is provided, see below)
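For anyone unfamiliar with the class-to-dict trick mentioned for locator_dict, here is a small sketch. ExampleLocators, its entries, and locators_to_dict are made up for illustration. The plain strings "id" and "xpath" stand in for Selenium's By.ID and By.XPATH so the snippet runs without Selenium installed. Filtering out the double-underscore entries is an addition of this sketch, since a bare dict(cls.__dict__) also contains things like __module__ and __doc__.

```python
class ExampleLocators:
    """Hypothetical locator class; names and selectors are invented."""

    first_name = ("id", "firstName")
    submit = ("xpath", "//button[@type='submit']")


def locators_to_dict(cls):
    """Return only the user-defined attributes of a locator class."""
    return {
        name: value
        for name, value in vars(cls).items()
        if not name.startswith("__")
    }


locator_dict = locators_to_dict(ExampleLocators)
```

Each value is a (selector type, selector) pair, which is the shape Selenium's find_element(by, value) accepts when unpacked.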
Identity class creates all the identity elements, including resume et al. Also added in some browser fingerprinting randomization.
The basic pattern is to subclass PostBot with a series of actions to take to fill the form, using the locations in Locator, and then put them in the PostBot.apply method -- you can see an example in deployments/fredmeyer. It's a little awkward right now, but trying to get it out to the people in time for it to be useful on my end.
A Deployment consists of a name, a list of starting URLs, a Locator dictionary, and a subclassed PostBot. Any Deployment is picked up by the metaclass, so then the calling syntax is just
autoscab <DEPLOYMENT_NAME>
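In case "picked up by the metaclass" is unclear, this is roughly how a registering metaclass can work. It is a generic sketch of the pattern, not autoscab's actual code: DeploymentMeta, the attribute names, ExampleDeployment, and get_deployment are all assumptions made for illustration.

```python
class DeploymentMeta(type):
    """Metaclass that records every Deployment subclass under its name."""

    registry = {}

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        if bases:  # skip the abstract base class itself
            DeploymentMeta.registry[cls.name] = cls


class Deployment(metaclass=DeploymentMeta):
    name = None
    urls = []
    locators = {}


class ExampleDeployment(Deployment):
    """Hypothetical deployment; the URL and locator are placeholders."""

    name = "example"
    urls = ["https://example.com/apply"]
    locators = {"submit": ("id", "submit")}


def get_deployment(name):
    """Look up a deployment class by the name given on the command line."""
    try:
        return DeploymentMeta.registry[name]
    except KeyError:
        raise KeyError(f"no deployment named {name!r}") from None


selected = get_deployment("example")
```

The point of the pattern is that defining the subclass is enough to register it, so a command-line entry point only needs the name string to find the class.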
I tried to leave in place a lot of what was here, but like i said am trying to get this out ASAP and figured we could cohere later. Also haven't pulled in any of y'all work.
usage: APPLY FOR MANY OF THE SAME JOB [-h] [-n N] [--relentless] [--list] [--noheadless] [--leaveopen]
                                      [deployment]

positional arguments:
  deployment    Which deployment to run

optional arguments:
  -h, --help    show this help message and exit
  -n N          Apply for n jobs (default: 1)
  --relentless  Keep applying forever
  --list        List all available deployments and exit
  --noheadless  Show the chromium driver as it fills in the application
  --leaveopen   Try to leave the browser open after an application is completed
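For reference, help text of this shape is what Python's argparse produces. A parser along these lines would accept the same flags. This is a reconstruction from the help output only, not the project's actual source: build_parser is a hypothetical name, and the argument types and defaults beyond what the help text states are assumptions.

```python
import argparse


def build_parser():
    """Build a parser matching the flags shown in the help text."""
    parser = argparse.ArgumentParser(prog="APPLY FOR MANY OF THE SAME JOB")
    # nargs="?" makes the positional optional, matching "[deployment]".
    parser.add_argument("deployment", nargs="?", help="Which deployment to run")
    parser.add_argument("-n", type=int, default=1,
                        help="Apply for n jobs (default: 1)")
    parser.add_argument("--relentless", action="store_true",
                        help="Keep applying forever")
    parser.add_argument("--list", action="store_true",
                        help="List all available deployments and exit")
    parser.add_argument("--noheadless", action="store_true",
                        help="Show the chromium driver as it fills in the application")
    parser.add_argument("--leaveopen", action="store_true",
                        help="Try to leave the browser open after an application is completed")
    return parser


# "example" is a placeholder deployment name.
args = build_parser().parse_args(["-n", "3", "--noheadless", "example"])
```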
IF THEY WANT SCABS, WE'LL GIVE EM SCABS
from kelloggbot.