carsonyl / pypac
Find and use proxy auto-config (PAC) files with Python and Requests.
Home Page: https://pypac.readthedocs.io
License: Apache License 2.0
In download_pac, when an error occurs during an HTTP request, the error is ignored and the next URL is attempted. If all URLs fail with an error, the function returns None. From download_pac, a None return implies that an error occurred: if the list of candidate URLs contained at least one URL, a response should have been expected. But the same cannot be inferred from get_pac, since a None there may simply mean PAC is not configured at all. Hence, either:
I am working on Px, a proxy server that does NTLM/Kerberos authentication and I recently added pypac to support PAC files.
Before pypac, Px was a 6MB executable built with PyInstaller, using around 10MB of RAM per process. After adding pypac, it's at 14MB and around 140MB of RAM per process. This is all on Windows.
I know pypac itself only uses around 10-20MB of RAM, but the dependency on js2py accounts for 120MB. And this adds up quickly, since Px can run multiple processes to improve scale.
Long story short: at least on Windows, JScript is free and available to execute the JS functions, so depending on js2py seems relatively expensive. Using JScript along with the PAC JS code from here, you have a simpler mechanism to process the PAC file.
No doubt it might be slower (I plan on testing it) to do the check for each URL, since you need to spawn a cscript.exe process for each call, but RAM- and size-wise it might be worth it.
I'm curious what your thoughts are on this matter.
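The cscript.exe idea above could be sketched roughly as follows: write the PAC source plus a tiny driver to a temporary .js file and build a Windows Script Host command line for it. This is an untested illustration; cscript, //Nologo, //E:JScript, WScript.Arguments, and WScript.Echo are standard WSH facilities, while build_cscript_command and DRIVER are hypothetical names.

```python
import tempfile

# Hypothetical driver appended to the PAC source: WSH exposes script
# arguments via WScript.Arguments and prints with WScript.Echo.
DRIVER = "WScript.Echo(FindProxyForURL(WScript.Arguments(0), WScript.Arguments(1)));"


def build_cscript_command(pac_js, url, host):
    """Return the argv for evaluating FindProxyForURL via Windows Script Host.

    The caller would run this with subprocess.run(..., capture_output=True)
    and read the PAC result string (e.g. "PROXY host:port") from stdout.
    Untested sketch; only meaningful on Windows with WSH available.
    """
    with tempfile.NamedTemporaryFile("w", suffix=".js", delete=False) as f:
        f.write(pac_js + "\n" + DRIVER)
        script_path = f.name
    return ["cscript", "//Nologo", "//E:JScript", script_path, url, host]
```

The trade-off the report describes remains: one process spawn per lookup, in exchange for dropping the js2py dependency entirely.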
I created a PACSession with a pac url but it seems to be incorrectly setting the proxy as "PROXY proxyIP:port" instead of just using "proxyIP:port".
I've created 2 test cases to illustrate my problem:
from pypac import PACSession, get_pac, download_pac
import requests
url = "https://www.turbotax.intuit.com"
pac_url="https://www.somepacurl.com/pac"
pac = get_pac(url=pac_url, allowed_content_types=['text/plain'])
"""
This works
"""
proxy = pac.find_proxy_for_url("https://www.turbotax.intuit.com", "https://www.turbotax.intuit.com")
proxy = proxy.split(" ")[1]
proxies = { 'https': proxy, 'http': proxy }
s = requests.Session()
s.proxies = proxies
r = s.get(url)
print(r.text)
"""
This doesn't work
"""
s = PACSession(pac)
r = s.get(url)
print(r.text)
I just realized the check I uncommented in #68 breaks my application in more ways. For some reason the same PAC script, when subjected to the check, takes an exceedingly long time to complete, crippling performance. I suppose the problem is not in pypac but rather in the script; still, given that it appears to work fine on real URLs, I'd like to make the validation optional.
My VPN adds an AutoConfigURL to the registry in file:/// format. Since os.path.isfile() doesn't recognize this as a file, pypac fails in api.py and doesn't load the PAC file.
It would be great if pypac could identify such file:/// URLs and still load them.
Further, my VPN's PAC path is file:///Users\username..., so it even skips the drive letter on Windows; in that case the file:/// prefix should be replaced with \.
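A best-effort normalization for the two malformed variants seen in this tracker (file:///Users\username\... with no drive letter, and file://e:\proxy.pac with backslashes) might look like this; file_url_to_path is a hypothetical helper, not part of pypac:

```python
import re


def file_url_to_path(url):
    """Best-effort conversion of a registry AutoConfigURL in file:
    format to a local Windows path. Sketch only; it targets the two
    malformed variants described in this report."""
    path = url[len("file:"):].lstrip("/")  # e.g. 'e:\proxy.pac' or 'Users\username\proxy.pac'
    path = path.replace("/", "\\")         # normalize any forward slashes
    if not re.match(r"[A-Za-z]:", path):
        # No drive letter present: root the path at the current drive.
        path = "\\" + path
    return path
```

os.path.isfile() would then accept the result on Windows, which is what the api.py check expects.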
Assuming there are no objections. Travis is currently failing with Python 3.4 due to some issue with requirements / pipenv.
Hi,
I need to use the Google API to get a JSON info file, and I must use a proxy in order to connect to the internet. On my computer I use a .pac file to connect to the internet, as described here.
The script I used is:
from urllib2 import urlopen
import json

def get_jsonparsed_data(url):
    response = urlopen(url)
    data = response.read().decode("utf-8")
    return json.loads(data)

url = ("http://maps.googleapis.com/maps/api/geocode/json?"
       "address=googleplex&sensor=false")
print(get_jsonparsed_data(url))
When we want to use a proxy with the urllib.urlopen method, the syntax is:
urllib.urlopen(url[, data[, proxies[, context]]])
as described in the documentation.
How can I insert pypac into my script in order to use my proxy (PAC file)?
Thanks a lot for your help.
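pypac's documented answer to this pattern is its pac_context_for_url context manager, which resolves the proxy for a URL and exports it via the HTTP_PROXY/HTTPS_PROXY environment variables that urllib already honors. A stdlib-only sketch of that mechanism (proxy_env is a hypothetical stand-in, not pypac's API, and assumes the proxy string was already resolved from the PAC file):

```python
import os
from contextlib import contextmanager


@contextmanager
def proxy_env(proxy):
    """Export a proxy via the environment variables urllib honors,
    then restore the previous state on exit. Minimal stand-in for
    pypac's pac_context_for_url."""
    saved = {k: os.environ.get(k) for k in ("HTTP_PROXY", "HTTPS_PROXY")}
    try:
        os.environ["HTTP_PROXY"] = proxy
        os.environ["HTTPS_PROXY"] = proxy
        yield
    finally:
        for key, value in saved.items():
            if value is None:
                os.environ.pop(key, None)
            else:
                os.environ[key] = value
```

With pypac installed, the real thing is simply: with pac_context_for_url(url): urlopen(url).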
Hi Carsonyl,
from pypac import PACSession
s = PACSession()
r = s.request('GET', url, None, stream=True,
              verify=self.get_verification_condition(event_data),
              cookies=cookies)
Above is the piece of code I am running through my .pkg file while building an app using py2app.
I am getting errors like SSL verify failed on macOS, even though verify is False.
If I run the Install Certificates.command and Update Shell Profile.command scripts, then it works fine; but that makes every system depend on a Python installation to run those files.
Thanks in advance,
Shashanka
First attempt: I let pypac find the system's proxy settings itself. Even though I really do have a proxy set and can visit the target site, it doesn't work.
Second attempt: I set the PAC URL in code with get_pac(). I don't know the reason, but the function result is None.
Third attempt: I downloaded the PAC file and passed its contents to PACSession, and got the exception shown in the picture below. After these failed attempts, I give up.
If there is a complicated redirect chain (e.g. HTTP -> HTTPS -> DIRECT ) PyPAC will not appropriately set the proxy server for each URL.
The following PoC provides an example fix:
https://gist.github.com/brad-anton/4e27b15df76e6eb2390d2b4c4e7e930d
In order to test our proxy PAC, we would like to be able to set the IP address that myIpAddress returns.
Would it be possible to have a config key that overrides this value?
Thanks!
Hi:
I added the pypac module, and ran the following in ipython console of anaconda(spyder):
from pypac import PACSession
session = PACSession()
session.get('http://example.org')
Out[554]: <Response [407]>
What do I do next if I need to "pip install implicit"?
I'm still not clear on how to go about downloading modules in Anaconda by getting past PACs. Normally when I am web surfing, I always have to enter a username and password for the proxy.
Is pypac the right way?
Thank you so much for your help. This is very frustrating! :-)
Best regards,
Sagar
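The 407 status above means the proxy itself is demanding credentials. pypac's documented option for this is PACSession(proxy_auth=requests.auth.HTTPProxyAuth(username, password)). What that does on the wire is send a Proxy-Authorization header, which for Basic auth can be sketched with the stdlib alone (proxy_auth_header is an illustrative helper, not pypac's API):

```python
import base64


def proxy_auth_header(username, password):
    """Build the Basic credentials sent in the Proxy-Authorization
    header in response to a 407 Proxy Authentication Required."""
    token = base64.b64encode("{}:{}".format(username, password).encode()).decode()
    return {"Proxy-Authorization": "Basic " + token}
```

With PACSession, passing proxy_auth lets the session attach these credentials to every proxied request automatically.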
https://pypi.org/project/tld/ dropped support for Python 2.7 without bumping major version - I think pypac should act accordingly.
I would gladly submit a pull request to fix that - but I'm not sure what's the preferred way to fix this:
Hello, after some testing I found that dnsDomainIs does not follow the common behavior.
Currently it works as follows:
def dnsDomainIs(host, domain):
    if domain.startswith('.'):
        domain = '*' + domain
    return shExpMatch(host, domain)
Therefore, in a case like this one:
host = "subdomain.domain.com"
domain = "domain.com" (note the missing '.' in front of the domain)
dnsDomainIs doesn't match. However, when trying this case in multiple browsers (Chrome, Firefox, IE), it does match.
This makes me think it would be a lot more accurate with this kind of implementation:
def dnsDomainIs(host, domain):
    return host.endswith(domain)
Let me know what you think about it.
Best regards,
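To make the divergence concrete, here is the report's failing case run against both variants. Since pypac's shExpMatch is fnmatch-based (as the traceback elsewhere in this tracker shows), fnmatch stands in for it here; the function names are illustrative, not pypac's:

```python
from fnmatch import fnmatch


def dns_domain_is_current(host, domain):
    # Behaviour described in the report (pypac's current implementation).
    if domain.startswith('.'):
        domain = '*' + domain
    return fnmatch(host.lower(), domain.lower())


def dns_domain_is_proposed(host, domain):
    # Browser-consistent suffix match proposed in the report.
    return host.endswith(domain)


print(dns_domain_is_current("subdomain.domain.com", "domain.com"))   # False
print(dns_domain_is_proposed("subdomain.domain.com", "domain.com"))  # True
```

Note the proposed version also matches "notdomain.com" against "domain.com"; browsers avoid that by additionally requiring a dot boundary, which a real fix would want to keep in mind.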
Hi Carson,
Thanks for this library! This is great. It does mostly what we need it to do.
There's only one request here. Would it be possible to make some of the dependencies optional? We're trying to use it in the context of an async application, and I'm trying to limit the number of dependencies (because of supply-chain attack surface, follow-up on security incidents, etc.). Given that we have an async application, there is zero need for things like requests or requests-file in our dependency tree. I think tldextract is also not needed.
We can fetch the pacfile from a URL using httpx. The only thing we'd like to use pypac for is to parse the pacfile. So, pure I/O work. Having to add 5 additional dependencies to our dependency tree feels like overkill for resolving a proxy URL.
Is that something you would consider? If you'd like, maybe I can find somebody to prepare a PR.
I am creating a PACSession from the pypac module and calling the request method on the PACSession object, and it does not give any response back when I create an executable using the cx_Freeze module. The same code works fine from the Python console.
I tried debugging the built-in files and found there is some issue with the method autoconfig_url_from_preferences(), at this line: config = SystemConfiguration.SCDynamicStoreCopyProxies(None),
which is in /pypac/os_settings.py. If I run it as a subprocess it works fine, but a subprocess returns output as bytes or a string, whereas I want the output as a response object. Below is my code snippet:
try:
    from pypac import PACSession
    s = PACSession()
    get_logger().debug("Getting Internet details from PACSession")
    r = s.request('GET', url, None, stream=True,
                  verify=self.get_verification_condition(event_data),
                  cookies=cookies)
except requests.exceptions.ProxyError as e:
    get_logger().debug(e)
    get_logger().debug("proxy failed trying direct connection")
I expect the output as a response object, but it does not return anything, i.e. it shows no error or exception.
The PAC file contents include the "let" keyword declaring an integer variable:
let test_pac = 0;
Traceback (most recent call last):
File "/Users/shaahidahams/Desktop/automation/gvenv/lib/python3.8/site-packages/pypac/parser.py", line 60, in __init__
self._context.evaljs(pac_js)
File "/Users/shaahidahams/Desktop/automation/gvenv/lib/python3.8/site-packages/dukpy/evaljs.py", line 57, in evaljs
res = _dukpy.eval_string(self, jscode, jsvars)
_dukpy.JSRuntimeError: SyntaxError: unterminated statement (line 1)
at [anon] (eval:1) internal
at [anon] (duk_js_compiler.c:6826) internal
During handling of the above exception, another exception occurred:
Hi,
I opened an issue on pyjsparser right here: PiotrDabkowski/pyjsparser#16
I originally stumbled on this when using pypac. I just want to notify you of this.
If there is no progress in pyjsparser considering this problem, then I would probably reconsider the usage of that package.
Rgds
Nils
Hello,
I was testing the PAC handler on Windows. When the file is retrieved from the registry and it is a file:// URL, this will not work: file://e:\\proxy.pac. To make it work, we must convert it to file://e:/proxy.pac.
My question is: is it always in the good format in the registry? Your tests do not cover this form. Note that there is no issue here; I was just asking, to prevent any regression in our product that uses your module :)
This is due to the new dukpy library.
An alternate dukpy fixes this; the merge request is here: #31
Thank you for pypac.
I have used it successfully to make API requests through my PAC.
This is the code I used:
def proxyconnect():
    from pypac import PACSession, get_pac
    return PACSession(get_pac(url='the url of my pac'))
then
def myfunction(variable):
    session = proxyconnect()
    api = 'the api url'
    return session.get(api + variable).json()
This was basically from your usage examples and I found it very helpful for the particular api I needed to get data from
But I'm also trying to query Google Analytics. My code is fine when it doesn't need to go through my PAC.
I'm wondering if you are familiar with the Python code provided by the Google Analytics API v4, and whether pypac can be made to work with it? I am not skilled enough to know what to do.
This is the code I have working (when not needing to go through the proxy):
from oauth2client.service_account import ServiceAccountCredentials
from googleapiclient.discovery import build
def initialize_analyticsreporting():
    credentials = ServiceAccountCredentials.from_json_keyfile_name(
        KEY_FILE_LOCATION, SCOPES)
    return build('analyticsreporting', 'v4', credentials=credentials)

def get_report(analytics, query):
    return analytics.reports().batchGet(body=query).execute()
I'm really stuck. Is it possible to set Python (I'm using Spyder through Anaconda) to always go through a PAC when trying to access the internet? (This PAC has no user login or password.)
Sorry if these questions are too ignorant.
My PAC file is served without a Content-Type header. pypac checks for this header and doesn't accept the PAC file if the header is missing.
Currently, the following keywords are recognized in a PAC result: DIRECT, PROXY, SOCKS.
Mozilla's PAC file doc says Firefox supports more keywords: HTTP, HTTPS, SOCKS4, SOCKS5.
Proposed interpretation:
HTTP host:port -> http://host:port (synonym for PROXY)
HTTPS host:port -> https://host:port
SOCKS4 host:port -> socks4://host:port
SOCKS5 host:port -> socks5://host:port
The SOCKS keyword is already assumed by default to be socks5://.
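The proposal above amounts to a small keyword-to-scheme mapping; a hypothetical sketch (names are illustrative, not pypac's API):

```python
# Illustrative mapping of PAC result keywords to requests-style proxy URL schemes.
PROXY_SCHEMES = {
    "PROXY": "http",
    "HTTP": "http",      # proposed synonym for PROXY
    "HTTPS": "https",    # proposed
    "SOCKS": "socks5",   # pypac's existing default for SOCKS
    "SOCKS4": "socks4",  # proposed
    "SOCKS5": "socks5",  # proposed
}


def proxy_url_for(pac_value):
    """Translate a single PAC result entry like 'HTTPS host:port'
    into a proxy URL; None means a direct connection."""
    keyword, _, hostport = pac_value.partition(" ")
    if keyword == "DIRECT":
        return None
    return "{}://{}".format(PROXY_SCHEMES[keyword], hostport)
```

A full implementation would also split PAC return values on ";" first, since a PAC file may list several fallback entries.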
Hi, could you please help me understand what I am doing wrong in my code?
I can't understand why the PAC is not fetched.
I'm new to Python and have some trouble understanding the exception mechanics.
Trying to guess how to check for an exception, I used these variants:
try:
    _pac = get_pac(url='https://antizapret.prostovpn.org/proxy.pac')
except MalformedPacError as e:
    print(e.msg)
except Exception as e:
    print(e.__class__)
except:
    print('Unknown exception')
finally:
    if not _pac:
        print('pac is null')
    else:
        print('ok!')
None of the exceptions occur, but I get the 'pac is null' message as a result.
Is the problem in the specified PAC file's code (it's pretty complicated)? But then PyPAC should tell me about it.
And I wonder whether I have to check for exceptions at all; aren't they shown automatically by the Python interpreter with default settings? I get plenty of exceptions in my other Python exercises.
I'm confused and feel stupid.
Sorry if it's a lame question, and for my bad English too.
PS: Using Python 3.7.3 on Windows, if it matters.
TLDExtract performs an HTTP query to fetch the list of valid top-level domains. This is fine, except that this library will mostly be run within networks where a proxy is enforced.
Enterprises that enforce proxying are also likely to block requests that are not dispatched per policy. For this reason, it doesn't make sense to dispatch an HTTP request for the purpose of evaluating the proxy, because the proxy URL is more likely than not to be needed to dispatch that very request.
Accordingly, the base case for the library should be that TLDExtract cannot dispatch this request and falls back to its bundled file of TLDs.
Hence this library should:
Some of the options offered by TLDExtract are not bad at all; however, they cannot accommodate all cases. For example, within a PyInstaller executable, the package directory itself is the location of the executable. In cases where the application is distributed at scale, the application may choose to dedicate a specific directory for such uses. Applying one of these recommendations would therefore be meaningful, and would save the implementer a deep dive into foreign code.
I think there is a bug in https://github.com/carsonyl/pypac/blob/master/pypac/resolver.py#L51 and https://github.com/carsonyl/pypac/blob/master/pypac/resolver.py#L55:
https://docs.python.org/3/library/urllib.parse.html#urllib.parse.urlparse is used to parse the URL. Then the .netloc part is used as the host when calling FindProxyForURL to determine the proxy.
BUT the .netloc part also contains the ':' and the port number, not just the pure host name.
In contrast, http://findproxyforurl.com/netscape-documentation/ states:
host: the hostname extracted from the URL. This is only for convenience; it is the exact same string as between :// and the first : or / after that. The port number is not included in this parameter. It can be extracted from the URL when necessary.
So splitting before the ':' would resolve the correct host when there is a port number in the URL.
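The difference is easy to demonstrate with the stdlib: urlparse already exposes a port-free .hostname attribute alongside .netloc, so no manual splitting is strictly needed.

```python
from urllib.parse import urlparse

parsed = urlparse("http://example.com:8080/path")
print(parsed.netloc)    # 'example.com:8080' -- what resolver.py currently passes as host
print(parsed.hostname)  # 'example.com'      -- what FindProxyForURL expects
```

Using .hostname also normalizes the host to lowercase, which matches how PAC host comparisons are usually written.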
Hi folks.
The library requires tld for the sole purpose of parsing the TLD out of the host in one place in the code.
This makes using pypac a bit harder for commercial usage. There is a much more popular library called tldextract, which is distributed under the BSD 3-clause license - a much more permissive one. I'd be happy to open a PR replacing tld with tldextract.
Running v0.1.0 downloaded from PyPI:
In [6]: session = PACSession('http://proxy.dataeng.mycompany.net/user/dataeng/proxy.pac')
In [7]: session.get("http://ip-100-74-44-105.ec2.internal:20888/proxy/application_1478638756790_1206426/")
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-7-586997e4bdf7> in <module>()
----> 1 session.get("http://ip-100-74-44-105.ec2.internal:20888/proxy/application_1478638756790_1206426/")
/usr/local/lib/python2.7/site-packages/requests/sessions.pyc in get(self, url, **kwargs)
485
486 kwargs.setdefault('allow_redirects', True)
--> 487 return self.request('GET', url, **kwargs)
488
489 def options(self, url, **kwargs):
/usr/local/lib/python2.7/site-packages/pypac/api.pyc in request(self, method, url, proxies, **kwargs)
155
156 if using_pac:
--> 157 proxies = self._proxy_resolver.get_proxy_for_requests(url)
158
159 while True:
/usr/local/lib/python2.7/site-packages/pypac/resolver.pyc in get_proxy_for_requests(self, url)
72 and 'DIRECT' is not configured as a fallback.
73 """
---> 74 proxy = self.get_proxy(url)
75 if not proxy:
76 raise ProxyConfigExhaustedError(url)
/usr/local/lib/python2.7/site-packages/pypac/resolver.pyc in get_proxy(self, url)
57 :rtype: str|None
58 """
---> 59 proxies = self.get_proxies(url)
60 for proxy in proxies:
61 if proxy == 'DIRECT' or proxy not in self._offline_proxies:
/usr/local/lib/python2.7/site-packages/pypac/resolver.pyc in get_proxies(self, url)
35 :rtype: list[str]
36 """
---> 37 value_from_js_func = self.pac.find_proxy_for_url(url, urlparse(url).netloc)
38 if value_from_js_func in self._cache:
39 return self._cache[value_from_js_func]
AttributeError: 'str' object has no attribute 'find_proxy_for_url'
For reference, the PAC file contains:
function regExpMatch(url, pattern) {
try { return new RegExp(pattern,"i").test(url); } catch(ex) { return false; }
}
function FindProxyForURL(url, host) {
if (regExpMatch(host, "^h2td\.master\.dataeng\.mycompany\.net$") ||
regExpMatch(host, "^bdp_h2td_[^\.]+\.master\.dataeng\.mycompany\.net$") ||
regExpMatch(host, "^100\.74\.([0-9]|[0-9][0-9]|1[01][0-9]|12[0-7])\.[0-9]+$") ||
regExpMatch(host, "^ip-100-74-([0-9]|[0-9][0-9]|1[01][0-9]|12[0-7])-[0-9]+(\.ec2\.internal)?$")
) {
return "SOCKS5 proxy.dataeng.mycompany.net:7778";
}
if (regExpMatch(host, "^10\.20[0-2]\.[0-9]+\.[0-9]+$") ||
regExpMatch(host, "^ip-10-20[0-2]-[0-9]+-[0-9]+(\.ec2\.internal)?$") ||
regExpMatch(host, "^100\.(6[4-9]|[7-9][0-9]|1[01][0-9]|12[0-7])\.[0-9]+\.[0-9]+$") ||
regExpMatch(host, "^ip-100-(6[4-9]|[7-9][0-9]|1[01][0-9]|12[0-7])-[0-9]+-[0-9]+(\.ec2\.internal)?$")
) {
return "DIRECT";
}
if (regExpMatch(host, "^.*\.master\.dataeng\.mycompany\.net$") ||
regExpMatch(host, "^.*\.internal$") ||
regExpMatch(host, "^10\.[0-9]+\.[0-9]+\.[0-9]+$") ||
regExpMatch(host, "^ip-10-[0-9]+-[0-9]+-[0-9]+$")
) {
return "SOCKS5 proxy.dataeng.mycompany.net:7777";
}
return "DIRECT";
}
pypac doesn't handle redirects in the PAC server response. Additionally, it should consider both the original URL's response and the target URL's Content-Type for validation. I've discovered certain servers serve them with different content types, so if at least one response's Content-Type satisfies the check, it should be considered valid.
Hi, thank you for creating pypac. I'm not certain whether the use case below has been addressed before; I was not successful in finding it. Please help.
After loading the file I was able to use find_proxy_for_url to test hosts, and I called the parser_functions individually for testing. But how can I run the parser_functions against the same loaded file, so I can pass a new URL/host to the existing file "f" for testing? Something like below.
And yes, I used find_proxy_for_url; it returns the proxy redirect, which doesn't help in my use case.
proxy.pac:
if (shExpMatch(host, "*.abc")) return proxy_general;
with open('proxy.pac') as f:
    pac = PACFile(f.read())
session = PACSession(pac)
shExp = parser_functions.shExpMatch(bc, pac)
or
shExp = parser_functions.shExpMatch(bc, f)
This returns the error below; how can I achieve this?
shExpMatch
    return fnmatch(host.lower(), pattern.lower())
AttributeError: 'PACFile' object has no attribute 'lower'
The current implementation of dnsResolve() returns None if the host cannot be resolved.
Because we now use dukpy, that value is propagated back to the JS engine whenever we depend on the result. For example, code such as:
isInNet(dnsResolve('bad-host'), "10.1.1.0", "255.255.255.0");
will result in a duktape crash.
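One possible Python-side mitigation (a sketch, not pypac's actual fix) is to have the bridged resolver return an empty string instead of None, so dukpy never has to marshal a Python None back into the JS engine:

```python
import socket


def dns_resolve(host):
    """dukpy-safe stand-in for pypac's dnsResolve: on failure, return
    a falsy JS string rather than None, which is what triggers the
    duktape crash described above."""
    try:
        return socket.gethostbyname(host)
    except socket.gaierror:
        return ""
```

PAC scripts that guard the result (if (addr) ...) then behave as in browsers, since '' is falsy in JavaScript; isInNet would still need its own guard against the empty string.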
It seems that the pattern matching done by dnsDomainIs (in file parser_functions.py) is not correct.
For example, dnsDomainIs('www.example.com', 'example.com') returns False while it should return True.
pypac version: 0.9.0 (retrieved from pip)
Python version: Python 2.7.15 (v2.7.15:ca079a3ea3, Apr 30 2018, 16:30:26) [MSC v.1500 64 bit (AMD64)] on win32
When the configured pac file returns "DIRECT" for a URL then the function will fail with the following error:
File "file.py", line 27, in get_job_info
with pac_context_for_url('http://' + self.host):
File "C:\Python27\lib\contextlib.py", line 17, in __enter__
return self.gen.next()
File "C:\Python27\lib\site-packages\pypac\api.py", line 305, in pac_context_for_url
os.environ['HTTP_PROXY'] = proxies.get('http')
File "C:\Python27\lib\os.py", line 422, in __setitem__
putenv(key, item)
TypeError: putenv() argument 2 must be string, not None
In this case the PAC file returns "DIRECT" for the URL. The code reaches line 303:
proxies = resolver.get_proxy_for_requests(url)
which leads to pypac/resolver.py:133, where proxy_parameter_for_requests("DIRECT") is called. This sets proxy_url_or_direct to None, which is then used for the 'http' and 'https' values in the returned dictionary. These None values are used in:
os.environ['HTTP_PROXY'] = proxies.get('http')
os.environ['HTTPS_PROXY'] = proxies.get('https')
resulting in the error putenv() argument 2 must be string, not None.
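A minimal guard (a sketch of one possible fix, not the project's actual patch) would skip the environment variables entirely when the resolver answers DIRECT:

```python
import os


def set_proxy_env(proxies):
    """Export resolved proxies via environment variables, tolerating
    DIRECT results. Only sets a variable when a proxy URL was actually
    resolved; for None values it removes any stale proxy variable."""
    for var, key in (("HTTP_PROXY", "http"), ("HTTPS_PROXY", "https")):
        value = proxies.get(key)
        if value is not None:
            os.environ[var] = value
        else:
            os.environ.pop(var, None)
```

Popping the variables on DIRECT also avoids the opposite failure mode, where a proxy left over in the environment would be used for a URL the PAC says should go direct.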
I installed pypac using pip on a fresh Python 3.6.0 install and I get the following error when trying to run the three lines sample:
Python 3.6.0 (v3.6.0:41df79263a11, Dec 23 2016, 07:18:10) [MSC v.1900 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.`
>>> from pypac import PACSession
>>> session = PACSession()
>>> session.get('http://www.google.com')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\jfroy3\AppData\Local\Programs\Python\Python36-32\lib\site-packages\requests\sessions.py", line 501, in get
return self.request('GET', url, **kwargs)
File "C:\Users\jfroy3\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pypac\api.py", line 150, in request
self.get_pac()
File "C:\Users\jfroy3\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pypac\api.py", line 215, in get_pac
pac = get_pac()
File "C:\Users\jfroy3\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pypac\api.py", line 41, in get_pac
return PACFile(downloaded_pac)
File "C:\Users\jfroy3\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pypac\parser.py", line 20, in __init__
orig_pyimport_meth = js2py.translators.pyjsparser.PyJsParser.parsePyimportStatement
AttributeError: module 'js2py.translators' has no attribute 'pyjsparser'
I'm not sure why and what the correct fix is, but I managed to patch it by removing 'pyjsparser.' from parser.py on lines 20, 21 and 33, like this:
20: orig_pyimport_meth = js2py.translators.PyJsParser.parsePyimportStatement
21: js2py.translators.PyJsParser.parsePyimportStatement = _raise_pyimport_error
...
33: js2py.translators.PyJsParser.parsePyimportStatement = orig_pyimport_meth
Cannot install a built wheel on Ubuntu (tested with Ubuntu 16.04/ pypac 0.10.1 / Python 3.5.4)
git clone https://github.com/carsonyl/pypac.git
cd pypac
python setup.py bdist_wheel
cd dist
pip install pypac-0.10.1-py2.py3-none-any.whl
FileNotFoundError: [Errno 2] No such file or directory: '/System/Library/CoreServices/SystemVersion.plist'
This is probably related to the line
pyobjc-framework-SystemConfiguration >= 3.2.1; sys.platform=="darwin"
in setup.py; but changing to 'linux' and rebuilding gives the same error.
When the environment variables HTTPS_PROXY or HTTP_PROXY are defined and the PACSession detects that no proxy should be used, self.proxies is None; since self.trust_env is not set to False, the proxies defined in the environment variables are used instead of no proxy at all.
Adding self.trust_env = False in PACSession.__init__ fixes the issue.
Hello, I work at a company where the security team uses an IP address in the proxy PAC URL instead of a domain; the justification is that a well-known domain is easier to track.
Here I came across a problem: when pypac looks for the domain and does not find one, it generates an error. I prepared a workaround to be used when the PAC URL contains an IP. It would be good to embed this in pypac itself; I am creating some libraries for working safely in Python and would like to leave my contribution here, noting that I will also use this function in my other projects on GitHub.
This causes an error:
pac = get_pac(url='http://10.1.1.1/fileName.pac')
This works:
pac = get_pac(url='http://domain.com/fileName.pac')
Workaround for the error:
url_domain_pac = exchange_ip_by_domain('http://10.1.1.1/fileName.pac')
pac = get_pac(url=url_domain_pac)
The workaround function is described below:
def exchange_ip_by_domain(proxy_url):
    """
    Receives the PAC URL containing an IP address and, after a
    reverse DNS lookup, translates it to a hostname with domain.
    """
    import socket
    import re
    proxy_ip = None
    pattern1 = re.compile(r"(http|https)://(.*?)/")
    match = pattern1.match(proxy_url)
    url_ip = match.group()
    pattern2 = re.compile(r"(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})")
    try:
        proxy_ip = pattern2.search(url_ip).group(0)
    except (TypeError, AttributeError):
        proxy_ip = None
    proxy_name = None
    try:
        data = socket.gethostbyaddr(proxy_ip)
        proxy_name = str(data[0])
    except (socket.gaierror, socket.herror):
        proxy_name = None
    # print("host_name", proxy_name)
    if proxy_ip is not None and proxy_name is not None:
        proxy_url = proxy_url.replace(proxy_ip, proxy_name)
    return proxy_url
Hi,
Can you tell me how I can test this with multiple source IP addresses? Something I tried:
from requests_toolbelt.adapters import source

session = PACSession(pac)
list_source_ip_and_url = [("10.129.xx.yy", "https://google.com"),
                          ("10.189.yy.xx", "https://random.com/Surveyor.Web/")]
responses = []
try:
    for source_ip, url in list_source_ip_and_url:
        new_source = source.SourceAddressAdapter(source_ip)
        session.mount('http://', new_source)
        session.mount('https://', new_source)
        responses.append(session.get(url))
    print(responses)
except Exception as e:
    print(e)
I am getting error like:
HTTPSConnectionPool(host='google.com', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x033D5547>: Failed to establish a new connection: [WinError 10049] The requested address is not valid in its context'))
Looks like I'm hitting some requests module error.
Hi,
I just installed pypac with pip and replaced my Session with a PACSession, and I get:
Traceback (most recent call last):
File "D:\TimecardAutomation\absReaderWitPAC.py", line 41, in <module>
login_page = s.get(login_url)
File "C:\Python27\lib\site-packages\requests\sessions.py", line 521, in get
return self.request('GET', url, **kwargs)
File "C:\Python27\lib\site-packages\pypac\api.py", line 184, in request
self.get_pac()
File "C:\Python27\lib\site-packages\pypac\api.py", line 249, in get_pac
pac = get_pac(recursion_limit=self._recursion_limit)
File "C:\Python27\lib\site-packages\pypac\api.py", line 57, in get_pac
return PACFile(downloaded_pac, **kwargs)
File "C:\Python27\lib\site-packages\pypac\parser.py", line 60, in __init__
context.execute(pac_js)
File "C:\Python27\lib\site-packages\js2py\evaljs.py", line 176, in execute
compiled = cache[hashkey] = compile(code, '<EvalJS snippet>', 'exec')
File "<EvalJS snippet>", line 221
def PyJs_LONG_87_(var=var):
^
IndentationError: too many levels of indentation
The code is the following:

from pypac import PACSession
from bs4 import BeautifulSoup
import os

username_field_name = "ctl00$ContentPlaceHolder1$Login1$UserName"
password_field_name = "ctl00$ContentPlaceHolder1$Login1$Password"
form_action_name = "***"
username = "***"
password = "***"
login_url = "***"
abscence_history_url = "***"

payload = {
    username_field_name: username,
    password_field_name: password,
    "ctl00$ContentPlaceHolder1$HiddenUrlPage": "***",
    "ctl00$ContentPlaceHolder1$Login1$LoginButton": "***",
    "__EVENTTARGET": "",
    "__EVENTARGUMENT": ""
}

with PACSession() as s:
    print "getting page 1"
    login_page = s.get(login_url)
    login_soup = BeautifulSoup(login_page.content)
    payload["__VIEWSTATE"] = login_soup.select_one("#__VIEWSTATE")["value"]
    payload["__VIEWSTATEGENERATOR"] = login_soup.select_one("#__VIEWSTATEGENERATOR")["value"]
    first_rep = s.post(login_url, data=payload)
    response = s.get(abscence_history_url)
It raises _dukpy.JSRuntimeError: EvalError: Error while calling Python Function: TypeError('inet_aton() argument 1 must be str, not bool') when testing self.find_proxy_for_url("/", "0.0.0.0"). Removing the check locally allows the file to be correctly consumed by my code. Note that my other issue needs to be resolved before you could use the URL as is, or manually follow the redirect and use the final URL when working with the PAC file.