Comments (23)
For anyone suffering from this issue: take a look at PR #350 and apply the patch.
from safaribooks.
It seems that the login function has a problem. I needed to use cookies instead: #150 (comment)
from safaribooks.
[-] Logging into Safari Books Online...
[*] Sending request to https://learning.oreilly.com/login/unified/?next=/home/:
This felt weird, so I looked into the code a bit today. It sent the request to https://learning.oreilly.com/profile/ and then straight to https://learning.oreilly.com/api/v1/book/XXXXXXXX/. It then proceeded with downloading the book.
Not sure what's wrong, but here are a few pointers to check:
- Run the program like this (without the --cred argument):
python3 safaribooks.py XXXXXXXX
- The cookies.json content is not a 1:1 copy of Firefox's 'Copy All' output. Only put the Request Cookies value there, i.e.:
{
"_abck":"XXXXX",
"_dd_s":"XXXXX",
"_evga_5802":"XXXXX",
"_ga":"XXXXX",
"_ga_4WZYL59WMV":"XXXXX",
"_gat_UA-112091926-1":"XXXXX",
"_gid":"XXXXX",
"_sfid_472e":"XXXXX",
"ak_bmsc":"XXXXX",
"akaalb_LearningALB":"XXXXX",
"AMP_49f7a68a85":"XXXXX",
"AMP_MKTG_49f7a68a85":"XXXXX",
"bm_sv":"XXXXX",
"bm_sz":"XXXXX",
"groot_sessionid":"XXXXX",
"OptanonAlertBoxClosed":"XXXXX",
"OptanonConsent":"XXXXX",
"orm-jwt":"XXXXX",
"orm-rt":"XXXXX"
}
from safaribooks.
O'Reilly implemented Akamai Bot Manager (https://www.akamai.com/products/bot-manager) on their site. The cookies needed for authentication after login are orm-jwt and orm-rt, and I think also groot_sessionid.
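A minimal sketch for checking a cookies.json file against the cookie names mentioned above. It assumes the file is a flat name-to-value JSON object; the function names are hypothetical, and treating groot_sessionid as merely "recommended" reflects the commenter's uncertainty rather than anything confirmed.

```python
import json

# Cookie names taken from the comment above. Whether groot_sessionid is
# strictly required is a guess, so it is only listed as recommended here.
REQUIRED = ("orm-jwt", "orm-rt")
RECOMMENDED = ("groot_sessionid",)


def missing_cookies(cookies, names):
    """Return the names from `names` that are absent or empty in `cookies`."""
    return [name for name in names if not cookies.get(name)]


def check_cookie_file(path="cookies.json"):
    """Load a flat name -> value cookie file and report missing cookies."""
    with open(path, "r", encoding="utf-8") as fh:
        cookies = json.load(fh)
    return missing_cookies(cookies, REQUIRED), missing_cookies(cookies, RECOMMENDED)


if __name__ == "__main__":
    sample = {"orm-jwt": "XXXXX", "_ga": "XXXXX"}
    print("missing required:", missing_cookies(sample, REQUIRED))
    print("missing recommended:", missing_cookies(sample, RECOMMENDED))
```

If the required list comes back non-empty, the cookies were probably copied from a session that was not logged in.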
from safaribooks.
I pulled the cookies with Chrome: no dice. I pulled them with Firefox and now I'm getting this error:
[24/Aug/2023 14:23:44] ** Welcome to SafariBooks! **
[24/Aug/2023 14:23:45] Authentication issue: unable to access profile page.
[24/Aug/2023 14:23:45] Last request done:
URL: https://learning.oreilly.com/profile/
DATA: None
OTHERS: {}
----> cookie set info here <-----
Found. Redirecting to https://www.oreilly.com/accounts/login-academic-check/?next=https%3A%2F%2Flearning.oreilly.com%2Fprofile%2F
I started over completely, pulled the latest repo, and get this error using the cookies.json file.
from safaribooks.
@lijie-jiang I ran into the same problem. After comparing the cookie I got from #150 with the previous one I had, I found the actual content is wrapped inside another layer.
Basically, you need to remove this content from the beginning:
{
"Request Cookies":
and also the extra
}
at the end.
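The unwrapping described here can also be done programmatically. This is a small sketch, assuming the copied JSON has a single top-level "Request Cookies" key as shown above; the function name is hypothetical.

```python
import json


def unwrap_request_cookies(data):
    """Return the inner mapping if `data` is wrapped in "Request Cookies".

    Data that is already flat is returned unchanged.
    """
    if isinstance(data, dict) and isinstance(data.get("Request Cookies"), dict):
        return data["Request Cookies"]
    return data


if __name__ == "__main__":
    copied = json.loads(
        '{"Request Cookies": {"orm-jwt": "XXXXX", "orm-rt": "XXXXX"}}'
    )
    # Print the flat object that would go into cookies.json.
    print(json.dumps(unwrap_request_cookies(copied), indent=2))
```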
from safaribooks.
Me too. It looks like the website's login page has changed?
[19/Aug/2023 08:41:00] Logging into Safari Books Online...
[19/Aug/2023 08:43:29] ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
[19/Aug/2023 08:43:29] Login: unable to perform auth to Safari Books Online.
Try again...
[19/Aug/2023 08:43:29] Last request done:
URL: https://learning.oreilly.com/login/unified/?next=/home/
DATA: None
OTHERS: {}
404
Server: istio-envoy
Content-Type: text/html; charset=utf-8
Content-Length: 40538
from safaribooks.
As a workaround, you can still use cookie auth.
from safaribooks.
I'm getting the same issues, and for some reason the instructions on the #150 aren't working for me in chrome or firefox.
from safaribooks.
I'm getting the same issues, and for some reason the instructions on the #150 aren't working for me in chrome or firefox.
Can you elaborate on what issues you are running into?
I was able to download a book with the #150 just fine, grabbing cookies using Firefox.
from safaribooks.
I was following #150 but got the error below using Chrome:
[#] Unhandled Exception: too many values to unpack (expected 2) (type: ValueError)
[!] Aborting...
from safaribooks.
Hitting the same issue, so added some output to see what's going on. Looks like it's hitting a 404 when trying to log in:
The first request (to LOGIN_ENTRY_URL) returns a 404, and then the LOGIN_URL request just hangs forever.
[-] Logging into Safari Books Online...
[*] Sending request to https://learning.oreilly.com/login/unified/?next=/home/:
[*] Method: (<requests.sessions.Session object at 0x10b5faaf0>, 'get')
[*] Data: None
[*] kwargs: {}
[*] Response: <Response [404]>
[*] Sending request to https://www.oreilly.com/member/auth/login/:
[*] Method: (<requests.sessions.Session object at 0x10b5faaf0>, 'post')
[*] Data: None
[*] kwargs: {'json': {'email': 'REDACTED', 'password': 'REDACTED', 'redirect_uri': 'https://api.oreilly.com%2Fhome%2F'}}
Getting browser cookies as suggested in #150 doesn't help in this case. I have downloaded a number of books before this started happening, but I would expect to get something like Too Many Requests if I have exceeded some limits 🤷
from safaribooks.
- execute a program like
python3 safaribooks.py XXXXXXXX
(without the --cred argument)
This worked for me too, thank you @lukasvavrek! It bypasses the login request and plows on.
Should the default behaviour be to use the existing cookies file? Could also add a --refresh-cookies flag to force a fresh login?
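A rough sketch of how that proposed default could be decided. Everything here is hypothetical: --refresh-cookies is only the flag suggested in this comment, not an existing option, and the parser below is a stand-in rather than the program's actual argument handling.

```python
import argparse


def build_parser():
    """Stand-in parser; --refresh-cookies is the flag proposed above."""
    parser = argparse.ArgumentParser(prog="safaribooks-sketch")
    parser.add_argument("--cred", metavar="EMAIL:PASSWORD", default=None)
    parser.add_argument("--refresh-cookies", action="store_true")
    parser.add_argument("bookid")
    return parser


def choose_auth(args, cookies_exist):
    """Prefer an existing cookies file unless a fresh login is forced."""
    if cookies_exist and not args.refresh_cookies:
        return "cookies"
    if args.cred:
        return "login"
    return "error"  # no usable cookies and no credentials to log in with


if __name__ == "__main__":
    args = build_parser().parse_args(["--cred", "user@example.com:secret", "XXXXXXXX"])
    print(choose_auth(args, cookies_exist=True))
```

With this ordering, passing --cred alongside an existing cookies file would no longer trigger the failing login request unless --refresh-cookies is given.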
from safaribooks.
I pulled the cookies using firefox, put them in a cookies.json, and when I do python3 safaribooks.py I get this:
[#] Unhandled Exception: too many values to unpack (expected 2) (type: ValueError)
[!] Aborting...
I'm glad it's working for some people, just not sure what I'm doing wrong at this point. I verified I have the latest clone of the repo as well.
from safaribooks.
I pulled the cookies using firefox, put them in a cookies.json
@digitalw00t make sure that you modify your cookies.json file. As I mentioned above, it is not a 1:1 copy (see this).
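To tell which kind of file you have, a quick shape check may help. This sketch only distinguishes the layouts mentioned in this thread (a flat name-to-value object, an object wrapped in "Request Cookies", and a list-style browser export); the function name and labels are hypothetical.

```python
import json


def cookie_file_shape(data):
    """Classify parsed cookies.json content by its top-level layout."""
    if isinstance(data, list):
        return "list-export"  # e.g. a list of per-cookie entries
    if isinstance(data, dict):
        if "Request Cookies" in data:
            return "wrapped"  # needs the outer layer removed
        if all(isinstance(value, str) for value in data.values()):
            return "flat"  # the name -> value layout shown earlier
    return "unknown"


if __name__ == "__main__":
    with open("cookies.json", "r", encoding="utf-8") as fh:
        print(cookie_file_shape(json.load(fh)))
```

Only the "flat" result matches the example layout given earlier in the thread.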
from safaribooks.
For the lazy, here's a quick Python converter from the JSON export:
#!/usr/bin/env python3
import json

# Define the input file name
input_file = "sample.json"  # Replace with the actual file name

# Define a dictionary to store the converted data
converted_data = {}

# Read data from the input JSON file
try:
    with open(input_file, 'r') as file:
        data = json.load(file)
except FileNotFoundError:
    print(f"Error: File '{input_file}' not found.")
    exit(1)
except json.JSONDecodeError:
    print(f"Error: Invalid JSON format in '{input_file}'.")
    exit(1)

# Iterate through each JSON object in the data
for entry in data:
    name_raw = entry.get("Name raw", "")
    content_raw = entry.get("Content raw", "")
    # Check if both name_raw and content_raw exist
    if name_raw and content_raw:
        converted_data[name_raw] = content_raw

# Display the converted data to the screen
print(json.dumps(converted_data, indent=2))
from safaribooks.
Just put the contents in sample.json or whatever you wanna change the filename to, and bam. It's working now btw.
from safaribooks.
Out of curiosity, with all the cookies from the site, how do you know those cookies specifically are the ones that are required?
from safaribooks.
Hello folks, thanks everyone for the support. I really love this community ❤️
Let's do some work:
- Do you think we can replicate the new login mechanism into the safaribooks program?
- Do you think it can be useful to add a complete guide explaining how to export cookies from a browser session?
from safaribooks.
The second one, at the beginning ;-)
from safaribooks.
For the first one, we'd have to understand how the login mechanism even works, which I honestly don't.
from safaribooks.
Here's a Python script to convert the Cookie header value to JSON:
import sys
import json
from http.cookies import SimpleCookie, CookieError


def parse_cookie_string(cookie_str):
    cookie = SimpleCookie()
    parsed_cookies = {}
    # Try to load the entire cookie string first.
    try:
        cookie.load(cookie_str)
        parsed_cookies.update({k: v.value for k, v in cookie.items()})
    except CookieError:
        # If there's an error, split the string and try each key-value pair individually.
        for kv in cookie_str.split(';'):
            try:
                cookie.load(kv.strip())
                parsed_cookies.update({k: v.value for k, v in cookie.items()})
            except CookieError:
                pass  # Skip illegal key-values.
    return parsed_cookies


def main():
    rawdata = sys.stdin.read().strip()
    parsed_cookies = parse_cookie_string(rawdata)
    print(json.dumps(parsed_cookies, indent=4))


if __name__ == "__main__":
    main()
usage:
echo '<COOKIE_VALUE_HERE>' | python3 convert_cookies > cookies.json
from safaribooks.