Comments (9)
Update: For what it's worth, on onion.link we simply put a Varnish cache in front of tor2web and we cache contents using the various Varnish policies. This was the best solution we found and it's compatible with CDNs like Fastly (if someone else wanted to do that).
I strongly discourage tor2web from implementing its own caching system. It would be a lot of work, as well as reinventing the wheel. Just use Varnish/Squid/whatever.
For the curious, onion.link actually caches pretty aggressively. It caches anything that doesn't explicitly say "don't cache me" via one of the various HTTP headers. I chose this because (1) most sites don't set the cache headers and (2) if their site breaks, it will encourage the onion-site maintainer to start using the appropriate cache headers.
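As a rough illustration of the "cache unless told otherwise" policy described above, here is a minimal Python sketch. It is not onion.link's actual Varnish configuration; the function name `is_cacheable` and the exact set of opt-out directives are assumptions made for the example. Strictly speaking, `no-cache` allows storing a response as long as it is revalidated, but the sketch treats it as an opt-out for simplicity.

```python
def is_cacheable(headers):
    """Return True unless the response explicitly opts out of caching.

    `headers` is a plain dict of response header names to values.
    Hypothetical helper for illustration only.
    """
    # Header names are case-insensitive, so normalize names and values.
    normalized = {name.lower(): value.lower() for name, value in headers.items()}

    # Split Cache-Control into directive names, dropping any "=value" part.
    directives = {
        part.strip().split("=", 1)[0]
        for part in normalized.get("cache-control", "").split(",")
        if part.strip()
    }

    # Any of these directives is treated as "don't cache me" (an assumption).
    if directives & {"no-store", "no-cache", "private"}:
        return False

    # Legacy HTTP/1.0 opt-out.
    if "no-cache" in normalized.get("pragma", ""):
        return False

    # No explicit opt-out was found, so the response is cached by default.
    return True
```

A response with no cache headers at all is cached here, which is exactly the behavior that pushes onion-site maintainers to set the headers themselves.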
from tor2web.
@virgil on a side note, thank you for the wonderful onion.link. With its 'aggressive caching' feature it is the best Tor2Web implementation that exists, and I use it on a daily basis. Most Tor sites are either slow or broken at times, and this caching largely keeps those problems from getting in the way of accessing the content.
Also, I agree with the caching policies you've set up. In my opinion, any tor2web setup should have the same caching policies for the reasons you stated.
from tor2web.
Some considerations on caching:
- Caching breaks dynamic websites
An internet forum whose HTML pages are cached for 1h will be unusable, because its pages will effectively refresh only once per hour.
Caching must be as non-invasive as possible.
- Caching may bring additional responsibilities to a Tor2web node administrator
It means keeping unwanted files on the Tor2web operator's server filesystem.
So the only good caching is "in-memory" caching, which limits how much content we can cache.
- Caching does not specifically address latency issues
Providing a fully offline cache without controlling the backend content is almost impossible without breaking it.
For that reason caching can only act on specific kinds of content (to avoid breaking t2w websites).
For that reason the "latency improvements" provided by caching will be relatively low.
- Caching can provide bandwidth savings for high-traffic websites
This is not currently an issue for Tor2web.
It may become an issue for high-traffic websites, whose requests may overload the Tor Hidden Services infrastructure (for example the rendezvous point / introduction point).
To mitigate this "possible" Tor limitation and improve performance, we may apply a specific caching strategy to "high-traffic resources" in order to save Tor bandwidth.
My proposal is to apply caching only given the following conditions:
- Only in-memory caching (NO DISK WRITE)
- Only high-traffic resources caching (cache only when it's relevant to cache due to resource constraint)
- Only static-resources caching (only cache resources that do not include dynamic content)
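The three conditions above could be sketched roughly as follows. This is only an illustration, not tor2web code: the class name `HotStaticCache`, the hit threshold, the TTL, and the list of "static" content types are all assumptions chosen for the example.

```python
import time

# Content-type prefixes treated as static (an assumption for this sketch).
STATIC_TYPES = ("image/", "text/css", "application/javascript", "font/")


class HotStaticCache:
    """In-memory cache that only stores frequently requested static resources."""

    def __init__(self, min_hits=3, ttl_seconds=3600, max_entries=1000):
        self.min_hits = min_hits          # requests needed before caching
        self.ttl_seconds = ttl_seconds    # how long an entry stays valid
        self.max_entries = max_entries    # crude bound on memory use
        self._hits = {}                   # url -> request count (unbounded here)
        self._entries = {}                # url -> (expires_at, body)

    def get(self, url, now=None):
        """Count the request and return the cached body, or None on a miss."""
        now = time.time() if now is None else now
        self._hits[url] = self._hits.get(url, 0) + 1
        entry = self._entries.get(url)
        if entry is None:
            return None
        expires_at, body = entry
        if now >= expires_at:
            # Expired entries are dropped; nothing is ever written to disk.
            del self._entries[url]
            return None
        return body

    def offer(self, url, content_type, body, now=None):
        """Store a fetched response if it is static and requested often enough."""
        now = time.time() if now is None else now
        if not content_type.lower().startswith(STATIC_TYPES):
            return False  # dynamic content is never cached
        if self._hits.get(url, 0) < self.min_hits:
            return False  # not a high-traffic resource yet
        if len(self._entries) >= self.max_entries and url not in self._entries:
            return False  # cache is full; skip rather than evict
        self._entries[url] = (now + self.ttl_seconds, body)
        return True
```

The proxy would call `get()` before contacting the hidden service and `offer()` after a successful fetch. A real implementation would also need an eviction policy and a bound on the hit counters.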
from tor2web.
Caching seems to be supported by the Twisted Web Client: http://twistedmatrix.com/trac/ticket/5126 .
We could enable this for specific static objects, for the top-accessed websites detected by #13, and/or for all websites while keeping those objects cached only for a short time, for example a few hours (it's a memory-only cache)?
from tor2web.
Hellais brought up the idea that the TorHS server should also be able to "influence" the caching behaviour of the Tor2web proxy by setting specific caching-related headers.
That way, for example, a TorHS node could use appropriate caching headers to have Tor2web serve a static cached copy of /index.html and the resources required to quickly display the website, without waiting to connect to the TorHS.
Such a cache should have a maximum expiry time.
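One way to picture a header-driven cache with a maximum expiry time is the small Python sketch below. It assumes the hidden service signals its wishes through the standard `Cache-Control: max-age` directive; the function name `effective_ttl` and the six-hour ceiling are hypothetical values picked for the example.

```python
import re

# Hypothetical ceiling imposed by the proxy, regardless of what the backend asks.
MAX_TTL_SECONDS = 6 * 3600


def effective_ttl(cache_control, max_ttl=MAX_TTL_SECONDS):
    """Return how many seconds to cache a response, capped at max_ttl.

    `cache_control` is the raw Cache-Control header value sent by the
    hidden service. A missing max-age yields 0, meaning "do not cache".
    """
    match = re.search(r"\bmax-age\s*=\s*(\d+)", cache_control.lower())
    if match is None:
        return 0
    # The backend's request is honored only up to the proxy's ceiling.
    return min(int(match.group(1)), max_ttl)
```

With this shape, the hidden service decides what may be cached and for how long, while the proxy operator keeps the final say on the upper bound.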
from tor2web.
/cc @virgil
from tor2web.
This is already done. The TorHS can dictate caching behavior with the "Cache-Control:" HTTP headers.
Bluntly, caching is complicated. In my work I've simply used Fastly.com, which does a DNS man-in-the-middle and caches the content as your backend responds. This is the same model that CloudFlare uses.
TL;DR: there are already specific caching-related headers. Incorporating caching support into tor2web directly would be welcome, but given the existing solutions I argue other issues (e.g., client-side rewriting, generalizing to .i2p, etc.) have higher priority.
from tor2web.
@virgil with the Varnish setup on onion.link, have you disabled paging? Do you have a config file you are willing to share?
I vote we close this topic. There are tools that solve this problem.
from tor2web.
I think using an external caching solution has many benefits. For example, having an already complete and stable platform instead of reinventing the wheel is obviously a major advantage.
However, I also think that if such a solution is going to be used, there should be proper documentation and example configuration files, so it can be used easily in each setup. Tor2Web should offer easy integration with a Varnish/Squid setup.
@virgil, I use Varnish on a regular basis, but as @NSkelsey has pointed out, sharing an example config file would be much appreciated.
from tor2web.