isi's Issues

Support ProQuest SSO in ezproxy.py

ProQuest has a different login flow than most academic publishers. The ezproxy config for it is a bit complicated. It has "patron login" and "SSO login". I was able to try out the SSO mode and discovered that after a couple rounds of redirects between it and Ebook Central, it constructs this URL:

f"https://ebookcentral.proquest.com/lib/{partner}/SignInPartnerUser?ebrary_username={partner}_{username}&partner_key={key}"

partner and key seem to be an institutional login to ProQuest's database, partner being a codeword for the relaying site and key being the corresponding password; username seems to be arbitrary -- it's there to pass to Ebook Central the human username for logging, but is otherwise ignored. If partner and key are good, ebookcentral generates guest session cookies (JSESSIONID, EBSESSIONID and EBUQUSER; and the latter two are always equal) and gives them back to the calling user. After that point, the calling user talks to ebookcentral.proquest.com directly, without going through the proxy.

For bibliotecavirtual.uis.edu.co the codeword is partner="bibliouissp", and for librarylogin-um.suagm.edu it is partner="ebooksumet-ebooks", for example.

Unlike OAuth or Kerberos, there is no public key scheme or backchannel communication from the auth server ezproxy.whatever.net to the target server ebookcentral.proquest.com. The authorization step happens entirely by ezproxy passing a key in an HTTP Location redirect.

Since this situation is special-cased in ezproxy, we need to special-case it in ezproxy.py.
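To make the special case concrete, here is a minimal sketch of building and recognizing the sign-in URL described above. The function names build_sso_url and parse_sso_url are hypothetical (nothing like them exists in ezproxy.py yet), and the URL shape is only what was observed in one SSO session, so treat it as an assumption rather than a documented ProQuest interface.

```python
from urllib.parse import parse_qs, quote, unquote, urlencode, urlsplit

EBOOK_CENTRAL_HOST = "ebookcentral.proquest.com"


def build_sso_url(partner, username, key):
    """Build the sign-in URL observed at the end of the SSO redirect chain.

    Hypothetical helper: the URL layout is an observation, not a documented API.
    """
    query = urlencode({
        "ebrary_username": f"{partner}_{username}",
        "partner_key": key,
    })
    return (f"https://{EBOOK_CENTRAL_HOST}/lib/{quote(partner, safe='')}"
            f"/SignInPartnerUser?{query}")


def parse_sso_url(url):
    """Return (partner, username, key) if url looks like the SSO URL, else None."""
    parts = urlsplit(url)
    # Expected path segments: ['', 'lib', partner, 'SignInPartnerUser']
    segments = parts.path.split("/")
    if parts.hostname != EBOOK_CENTRAL_HOST or len(segments) != 4:
        return None
    if segments[1] != "lib" or segments[3] != "SignInPartnerUser":
        return None
    partner = unquote(segments[2])
    query = parse_qs(parts.query)
    ebrary_username = query.get("ebrary_username", [""])[0]
    key = query.get("partner_key", [None])[0]
    prefix = partner + "_"
    if key is None or not ebrary_username.startswith(prefix):
        return None
    return partner, ebrary_username[len(prefix):], key
```

A client could run something like parse_sso_url() on each Location header while following the login redirects, and switch to talking to ebookcentral.proquest.com directly once it matches.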


Further work would be:

  • "patron login", whatever that means
  • supporting ProQuest's legacy "ebrary" software; here's the ezproxy config for it; perhaps it uses the same SSO scheme, since the magic SSO URL includes "ebrary" in it?

Support Knovel

Similar to ProQuest (!8), http://app.knovel.com/web/ uses some sort of SSO process which leaves the user connecting to that site directly but with a login cookie authorized by the relaying ezproxy.

Reverse engineer enough of this to support it in ezproxy.py.

Support port-based ezproxies

ezproxy can run in two modes: by port or by hostname. In port mode, https://ezproxy.example.com:$port proxies to https://paywalled-site.net. In hostname mode, https://paywalled-site.ezproxy.example.com proxies to paywalled-site.net. See https://help.oclc.org/Library_Management/EZproxy/Get_started/Evaluate_proxy_by_port_versus_proxy_by_hostname.
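As a rough illustration of the two modes, the sketch below rewrites a target URL each way. Both function names and the port_map argument are hypothetical: in by-port mode the port numbers are assigned by the ezproxy server, so a client would need some way to discover them, and this sketch simply assumes a lookup table is available. It also keeps the target URL's scheme, which may not match a given deployment.

```python
from urllib.parse import urlsplit, urlunsplit


def proxify_by_hostname(url, proxy_host):
    """Hostname mode: append the proxy's domain to the target hostname."""
    parts = urlsplit(url)
    if parts.hostname.endswith(proxy_host):
        return url  # already proxified; avoid "target.proxy.proxy"
    return urlunsplit(parts._replace(netloc=f"{parts.hostname}.{proxy_host}"))


def proxify_by_port(url, proxy_host, port_map):
    """Port mode: replace the target host with the proxy host plus its port.

    port_map is a hypothetical {target_hostname: port} table; a missing
    target raises KeyError.
    """
    parts = urlsplit(url)
    port = port_map[parts.hostname]
    return urlunsplit(parts._replace(netloc=f"{proxy_host}:{port}"))
```

For example, proxify_by_port("https://paywalled-site.net/a", "ezproxy.example.com", {"paywalled-site.net": 2048}) would give a URL on ezproxy.example.com:2048 with the path unchanged.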

ezproxy.py is currently only compatible with by-hostname mode, and I'm not 100% convinced it even does that right, because it assumes the login page is always at https://login.ezproxy.example.com/login.

The relevant lines are around

isi/ezproxy.py

Lines 151 to 160 in 73a3577

def proxify(self, url):
    scheme, host, path, params, query, fragment = urlparse(url)
    # Response objects returned from this Session should have all references
    # to the proxy in headers, cookies, and html silently stripped, so that following a link
    # received from this class with this class doesn't end up going to "http://target.proxy.proxy"
    # But that problem is unbounded in general. Instead we detect this common case and cancel it
    if not host.endswith(self.address):
        host = host + "." + self.address #construct the fully
    return urlunparse((scheme, host, path, params, query, fragment))

and

isi/ezproxy.py

Lines 102 to 105 in 73a3577

r = super().request('POST',
                    self.proxify("https://login/login"), #this funny looking URL will be fixed by proxify() munges it
                    data=params)

Support non-proxy accounts

ISI does have an internal accounts system--they don't just sell themselves through library proxies.

I have no way to test this. If anyone does actually have an ISI account not via a research institution, please get in touch.

Use composition, not inheritance, for ezproxy

I've fallen out of love with inheritance trees and mixins. I would rather make the ezproxy instances I create return a fresh object, with no parent class, that fits the requests.Session API but is not a Session.

I think the easiest way to do this is to override __getattribute__ to proxy calls to requests.Session, except for those we explicitly define.
... but maybe that's just reimplementing Python's inheritance lookup rules all over again? I'm not sure. Anyway, it'll be a good exercise to try.
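One possible shape for this, as a sketch only: the class name ProxiedSession is hypothetical, and it uses __getattr__ instead of __getattribute__, since __getattr__ only runs when normal lookup fails and so leaves explicitly defined methods alone. The wrapped session is passed in (in real use, a requests.Session()), so the class has no parent and no hard dependency on requests.

```python
from urllib.parse import urlparse, urlunparse


class ProxiedSession:
    """Wraps a session-like object instead of subclassing it (hypothetical sketch)."""

    def __init__(self, session, address):
        self._session = session  # e.g. requests.Session() in real use
        self.address = address   # e.g. "ezproxy.example.com"

    def proxify(self, url):
        parts = urlparse(url)
        host = parts.netloc
        if not host.endswith(self.address):
            host = host + "." + self.address
        return urlunparse(parts._replace(netloc=host))

    def request(self, method, url, **kwargs):
        # Explicitly defined: rewrite the URL, then hand off to the wrapped session.
        return self._session.request(method, self.proxify(url), **kwargs)

    def get(self, url, **kwargs):
        # Must be defined here so it routes through *our* request().
        return self.request("GET", url, **kwargs)

    # post(), head(), etc. are NOT defined here, so they fall through to the
    # wrapped session below and are sent unproxied.

    def __getattr__(self, name):
        # Only called when normal lookup fails. Refuse private names so a
        # missing _session cannot trigger infinite recursion.
        if name.startswith("_"):
            raise AttributeError(name)
        return getattr(self._session, name)
```

This is where delegation differs from inheritance. requests.Session.get() calls self.request() internally, and for a delegated method that self is the wrapped session, not the wrapper. Any helper that should be proxified (post, head, and so on) therefore has to be defined explicitly on the wrapper, which is part of the cost of this approach.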
