tor2web / tor2web

Tor2web is HTTP proxy software that enables access to Tor Hidden Services by means of common web browsers.

Home Page: https://www.tor2web.org

License: GNU Affero General Public License v3.0

Python 75.73% Shell 7.42% HTML 0.97% CSS 0.91% JavaScript 5.87% Smarty 9.10%
twisted python proxy tor https socks5 streaming anonimous-proxies transparency digital-human-rights

tor2web's Introduction

The software was originally developed by Aaron Swartz and Virgil Griffith and is now maintained by Giovanni Pellerano as part of the GlobaLeaks project.

Documentation

Donate

To support the Tor2web project you can help us with donations, which will go entirely to software development.

Help us by sending us a small donation!

Contacts

License

This software is released under the AGPLv3 license. See the LICENSE file for more information.

Copyright (c) 2011-2022 - Hermes Center for Transparency and Digital Human Rights

tor2web's People

Contributors

alexlauerman, evilaliv3, fpietrosanti, gianlucagilardi, hellais, ileiva, ilv, mmaker, nskelsey, syd, vecna, virgil, wtf

tor2web's Issues

Caching support

Caching support for Tor2web has been discussed many times on the Tor2web mailing lists, and different approaches have been proposed.

Reach a consensus on the caching strategy, analyze the implementation requirements, and implement it.

--- Want to back this issue? **[Post a bounty on it!](https://www.bountysource.com/issues/7768577-caching-support?utm_campaign=plugin&utm_content=tracker%2F318575&utm_medium=issues&utm_source=github)** We accept bounties via [Bountysource](https://www.bountysource.com/?utm_campaign=plugin&utm_content=tracker%2F318575&utm_medium=issues&utm_source=github).

Nym / Pet Name / Usable Hostname / Support

Tor2web translates TorHS addresses transparently, as-is.

Tor HS .onion URLs are not very usable.

This ticket describes Nym support, a feature that would allow a Tor HS site to automatically configure its hostname until it exists.

This feature has been described by Arturo and Jacob on Tor Git Repository: https://gitweb.torproject.org/torspec.git/blob/HEAD:/proposals/ideas/xxx-onion-nyms.txt

Before this feature can be implemented, the following aspects must be analyzed and documented:

  • How to handle nym registration/deregistration/auto-expiration
  • How to reduce the responsibility of the node running free pet registration

Tor2web startup from init.d scripts

The Tor2web daemon must support startup from init.d scripts so that it can be loaded when a Linux server boots.

Additionally, it must write a "pid" file in /var/run, as per the Debian/Ubuntu convention, so that a software watchdog based on monit can restart the service after an unexpected outage (as already done for tor2web 2.0, which is based on Apache/glype).
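
As a minimal sketch of the pid file requirement, the daemon could write its process ID at startup and remove the file on exit. The function name write_pid_file and the example path are assumptions made for illustration, not part of the Tor2web code.

```python
import atexit
import os


def write_pid_file(path):
    """Write the current process ID to `path` and remove the file on exit."""
    with open(path, "w") as handle:
        handle.write("%d\n" % os.getpid())

    def _cleanup():
        # Ignore the error if the file has already been removed.
        try:
            os.remove(path)
        except OSError:
            pass

    atexit.register(_cleanup)
    return path


# Hypothetical usage at daemon startup:
# write_pid_file("/var/run/tor2web.pid")
```

A watchdog such as monit can then read the file to find the process it has to supervise.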

Opendata of Tor2web visited websites

Each node will publish, in real time and in a standard format, a list of the Tor2web URLs accessed during the previous 30 days, at a given URL on the Tor2web node.

This will allow batch processes to index the TorHS web space and to produce statistics and analyses of the public content of Tor Hidden Services (for example, to build a search engine).

TorHS Does Not Exist vs Not Reachable

It has been identified that the current version of Tor does not make it possible to know whether a Tor HS exists, and this is reflected in the bug described in #31.

So Tor2web is not able to distinguish between the following two error conditions:

  • Host not found (the TorHS does not exist)
  • Host unreachable (the Tor HS exists but is not reachable)

A ticket for fixing this in Tor has been filed: https://trac.torproject.org/projects/tor/ticket/6031

The fix for https://trac.torproject.org/projects/tor/ticket/6031 has been implemented by hellais.

The patch can be installed as follows:

wget -O patch_hs.patch 'https://gitweb.torproject.org/user/art/tor.git/patch/f6d3dc3d9e0e70f2c553ce254b49630bd98910e9?hp=ca525db02dbb026bda4305881476dada754c3ca3'

patch -p1 < patch_hs.patch

The procedure to apply the patch has been documented at https://github.com/globaleaks/Tor2web-3.0/wiki/Getting-started-with-tor2web .

The Twisted SOCKS client must be improved to support the new return code (to be used for error handling), and Tor2web must use it.
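
To illustrate how a return code could drive error handling, here is a sketch that maps the standard SOCKS5 reply codes (RFC 1928, section 6) to an HTTP status and an error template. The ticket does not say which code the patched Tor returns for a non-existing hidden service, so treating 0x04 as "not found" is purely an assumption, as are the template names and the function name.

```python
# SOCKS5 reply codes as defined in RFC 1928, section 6.
SOCKS5_REPLIES = {
    0x00: "succeeded",
    0x01: "general SOCKS server failure",
    0x02: "connection not allowed by ruleset",
    0x03: "network unreachable",
    0x04: "host unreachable",
    0x05: "connection refused",
    0x06: "TTL expired",
    0x07: "command not supported",
    0x08: "address type not supported",
}


def classify_socks_reply(code):
    """Return an (http_status, template_name) pair for a SOCKS5 reply code.

    Which code stands for which hidden-service condition is an assumption
    of this sketch; the template names are hypothetical.
    """
    if code == 0x00:
        return 200, None
    if code == 0x04:
        return 404, "error_hs_not_found.tpl"
    if code in (0x03, 0x05, 0x06):
        return 504, "error_hs_unreachable.tpl"
    return 502, "error_generic.tpl"
```

Keeping the mapping in one function means only this table has to change once the actual return code is known.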

Part of this ticket is to cross-document the implementation on Tor Project's Trac ticket 6031 as well.

Blocklist comments

When someone manages a blacklist, it may be useful to annotate specific items of the blacklist.

It would be useful to be able to add notes to the blacklist like:

5b225270bb26ac71ba43d64e3f6ebddd # Added on 1-Apr-2012 - CP
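
A parser for this annotated format could look like the sketch below. It assumes one entry per line with everything after the first "#" treated as the note; the function name parse_blocklist is hypothetical.

```python
def parse_blocklist(lines):
    """Parse blocklist lines of the form '<entry> # optional note'."""
    entries = {}
    for raw in lines:
        entry, _, note = raw.partition("#")
        entry = entry.strip()
        if not entry:
            continue  # skip blank lines and comment-only lines
        entries[entry] = note.strip()
    return entries
```

Existing blocklists without notes stay valid, since an entry with no "#" simply gets an empty note.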

Issue with gzip compressed page

It seems that t2w is not working properly for compressed pages.

TEST URL: https://g6lfrbqd3krju3ek.tor2web.org/ .

KO: Does not work with a browser (Safari/Firefox).

KO: Does not work with curl when compression is enabled:
curl -v -v --compressed https://g6lfrbqd3krju3ek.tor2web.org/

But it works with the default curl/wget parameters (no compression by default):
curl -v -v https://g6lfrbqd3krju3ek.tor2web.org/

So the problem seems to be related to gzip compression.
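
The report does not identify the cause. One plausible failure mode for a rewriting proxy is modifying a gzip-compressed body as if it were plain text, or forwarding a stale Content-Length. The sketch below shows the round trip a proxy would need in that case; the function name and the lower-case header dict are assumptions for illustration.

```python
import gzip


def rewrite_body(headers, body, transform):
    """Apply `transform` to a response body, handling gzip transparently.

    `headers` is a dict with lower-case header names.
    """
    encoded = headers.get("content-encoding", "").lower() == "gzip"
    if encoded:
        body = gzip.decompress(body)
    body = transform(body)
    if encoded:
        body = gzip.compress(body)
    new_headers = dict(headers)
    # The length changes after rewriting, so it must be recomputed.
    new_headers["content-length"] = str(len(body))
    return new_headers, body
```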

Connection Level Optimization

This ticket is to research, inspect and collect ideas on connection-level (TCP-level) optimizations that can be done to speed up Tor2web.

Some possible areas of improvement are:

  • Connection / Socket Reuse
    Ex: Privoxy supports "connection-sharing", as documented at http://www.privoxy.org/user-manual/config.html, which allows an existing TCP connection to be reused for other client requests.
  • Connection / Socket long keep-alive
    Ex: TorHS have a big issue with high connection setup latency.
    Keeping connections established for a long time, and eventually reusing them, could dramatically reduce the setup time to a TorHS.

Tor2web Web admin interface

Make a Tor2web web administration interface capable of handling the most essential operations, analyzing statistics and managing blocklists.

This feature must be specified before implementation.

This application needs to be implemented as an APAF application, like the current Tor2web 3.0.

Proxy through static URL (e.g. http://x.tor2web.org)

Issue:

  • In some corporate environments, a new SSL Certification Authority is installed in the client browser. This permits SSL MITM by the corporate firewall, which works like a proxy even on HTTPS connections.
  • hiddenservice.tor2web.org leads to a DNS leak by the client

Proposed solution:

  • tor2web may support a special hostname, x.tor2web.org, and receive the hidden service request and the accessed URL via POST, avoiding both SSL proxy recording and DNS leaking.

Feature description:

When the "x." subdomain is contacted, the destination host and all the parameters normally expected via GET are expected via POST.

Security and scalability:

  • This would not provide complete security against this kind of threat, because a compromised SSL CA allows complete traffic interception, but it would be a nice way to avoid proxy logging (and a start in supporting special security triggers selected by hostname)
  • Hypothetically, x.tor2web.org would support this feature, and in the future y.tor2web.org another one, then k.tor2web.org, and so on. This is out of scope for this release, but developing "x" with this mindset would help future extensions

Uniquely Random Access URL

Following the draft specification for "networked tor2web", a work in progress on #24 to distribute responsibility between nodes that provide access to the content and nodes that provide links to access the content (on another server), this feature implements the Uniquely Random Access URL logic.

In a typical scenario, a user will click on a blahblah.tor2web.org link, will be presented with the Access Disclaimer as defined in #15 and, after acceptance, will be redirected to a Uniquely Random Access URL, specific to that client on that server for that period.

This feature represents the foundation for the future "networked tor2web".

This ticket must be better specified following the #24 draft.

Circuit Level Optimization

This ticket is to research, inspect and collect ideas on circuit-level (Tor-level) optimizations that can be done to speed up Tor2web.

Areas of interest may be:

  • Timeout / keep-alive on circuits, to reduce the time required to create a circuit by reusing existing ones
  • Selection of entry nodes based on performance metrics, using the single tor2web hop as a high-performance Tor relay


Tor2web user can fill in an abuse request

In order to simplify abuse handling, the Tor2web user must be able to fill in an abuse request (as in the current tor2web 2.0 based on glype) from:

  • The injected disclaimer/abuse header
  • The acceptance disclaimer (to be done)

The abuse page contains text that can be loaded from an external template.

The abuse page pre-fills the Hostname field with the hostname of the HS the user is accessing (and for which they are presumably reporting the abuse).

The abuse page must send an email to an operator through an email account configured in the t2w configuration file.

It must be possible to configure an email account (such as a Gmail account) with SMTPS/SMTPTLS/username/password.

Remote Blacklists

As the number of tor2web networks and sites grows, sharing blacklists and keeping them aligned will become an issue.

While each node would be able to handle its own blacklist, it may be desirable for an administrator of multiple tor2web nodes to centralize the handling of blacklists.

In that case, two basic features:

  • Remotely exposing the list of blacklisted hosts/URLs over a web page
  • Remotely importing blacklists from a specific URL

would allow easier maintenance and handling of the blacklists required to keep the tor2web network running without major issues.

Socks reuse and client isolation

While implementing Socks4a reuse and keep-alive, I have been thinking about the problem of client isolation. These are my considerations:

  1. A SOCKS connection is naturally reused on client HTTP "connection: keep-alive" connections (this seems a trivial consideration but it's not)
  2. A SOCKS connection MUST NOT be reused for different clients due to possible communication hijacking issues; in fact, a reused SOCKS connection could have residual data left to read because a client disconnected before the HS response.
  3. Given point 1, we could reuse SOCKS connections for client "connection: close" connections by forcing keep-alive on the HTTP request forwarded to the HS and providing a cookie to the client to validate the SOCKS reuse.

These links could help in understanding the issue:
http://www.privoxy.org/user-manual/config.html#CONNECTION-SHARING
https://www.google.com/search?q=stream+isolation+tor

Exception via email

Any unhandled exception from Python should be sent via email.

This feature must be configurable, but for early alpha-beta testing of Tor2web it may be very important to facilitate detection of bugs.
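
One way to capture unhandled exceptions is a custom sys.excepthook that formats the traceback and hands it to a reporting callable. In the sketch below, make_excepthook and send_report are hypothetical names; the actual mail delivery is left to the caller. Errors raised inside the Twisted reactor are normally logged rather than passed to sys.excepthook, so a log observer would likely be needed as well.

```python
import sys
import traceback


def make_excepthook(send_report):
    """Build a sys.excepthook that forwards the traceback to `send_report`."""
    def hook(exc_type, exc_value, exc_tb):
        report = "".join(
            traceback.format_exception(exc_type, exc_value, exc_tb))
        try:
            send_report(report)
        finally:
            # Always fall back to the default behaviour (print to stderr).
            sys.__excepthook__(exc_type, exc_value, exc_tb)
    return hook


# Hypothetical usage, enabled only when configured:
# sys.excepthook = make_excepthook(mail_to_admin)
```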

Tor2web does not report non-existing TorHS

When a connection to a TorHS fails, Tor2web should tell the user, with an appropriate message, whether the Tor Hidden Service is unreachable or does not exist.

Tor2web currently returns a 502 error for non-existing Tor Hidden Services, while it should provide an appropriate message with its own template saying that the TorHS likely does not exist.

For example, if we try the TorHS https://duskgytldkxiuqc6.tor2web.org/, it connects and works fine.

If we change the last character of the hostname, giving https://duskgytldkxiuqc5.tor2web.org/ (most probably a non-existing TorHS), it reports 502.

Tor2web email notifications end up in spam

Emails sent by the Tor2web notification system are missing several characteristics, which drives them into spam folders:

  • Missing "From" description field (it provides only the email address, without a display name)
  • Missing "To:" field: the recipient of the email is not specified
  • Wrongly formatted To field, with "TO" in capital letters while it should be "To:"
  • Wrongly formatted Subject field, with "SUBJECT" in capital letters while it should be "Subject:"
  • Missing MIME encoding (with text/plain)

Example message:
"
From: [email protected]
TO:
SUBJECT: Tor2web node 194.150.168.70: notification for https://XXXX.tor2web.org/
"

This problem was already an issue with GlobaLeaks 0.1, causing messages to end up in spam folders.
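
The header problems listed above can be avoided by building the message with the standard library instead of concatenating strings. This is a sketch, not the Tor2web notification code; the function name and the example addresses in the usage note are placeholders.

```python
from email.message import EmailMessage
from email.utils import formataddr, formatdate, make_msgid


def build_notification(sender_name, sender_addr, recipient, subject, body):
    """Build a plain-text notification with well-formed headers."""
    msg = EmailMessage()
    # Display name plus address, e.g. "Name <user@host>".
    msg["From"] = formataddr((sender_name, sender_addr))
    msg["To"] = recipient
    msg["Subject"] = subject
    msg["Date"] = formatdate(localtime=True)
    msg["Message-ID"] = make_msgid(domain=sender_addr.rpartition("@")[2])
    # Sets a text/plain content type with an explicit charset.
    msg.set_content(body)
    return msg
```

The resulting object can be passed to smtplib.SMTP.send_message, which takes the envelope addresses from the From and To headers.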

Web API to check if using Tor

Certain client-side JavaScript applications on the internet need to know whether they are using Tor.

Currently, Tor2web is aware of the existing Tor exit relays thanks to feature #10.

This feature is to implement a publicly available Web API that allows a third-party JavaScript client to ask whether its request is coming from Tor.

The API must be available for queries via CORS, so the appropriate headers to allow cross-domain queries must be provided.
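
A framework-independent sketch of such an endpoint follows. The JSON field name, the function name and the wildcard CORS policy are assumptions; the ticket only requires that cross-domain queries be allowed.

```python
import json


def tor_check_response(client_ip, exit_addresses):
    """Return (status, headers, body) for an "am I using Tor?" API call.

    `exit_addresses` is a set of known Tor exit relay IP addresses.
    """
    body = json.dumps({"ip_is_tor_exit": client_ip in exit_addresses})
    headers = {
        "Content-Type": "application/json",
        # Allow any origin to read the answer from client-side JavaScript.
        "Access-Control-Allow-Origin": "*",
    }
    return 200, headers, body
```

The response deliberately contains only a boolean, so the API does not echo the client address back to third-party scripts.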

Cc @hellais, @evilaliv3, @vecna .

Error and Blocked templates

In case of error, a nice and user-friendly page reporting the error must be provided.

All the templates must be loadable from an external file.

Particular care must be given to:

  • Unreachable (the TorHS is not reachable)
  • Blocked (the URL has been blocked)
  • Invalid format (for example, requesting a hostname that is not in TorHS format, such as https://yyyyy.tor2web.org/)
  • Not found error (to be transparently proxied)

The text, CSS and graphics must be taken from the old tor2web 2.0 site.

It would be interesting to research whether it's possible to differentiate between the following error conditions:

  • Tor HS does not exist
  • Tor HS is not reachable

flag to disable censorship

I saw that you are blocking/censoring some hosting providers, even though there are many good things on them.
Now that you have added the frame declaring that you aren't hosting any content, you can disable this censorship.

If you want to keep the censorship, another solution could be to add a flag, for example:

With censorship
http://4eiruntyxxbgfv7o.onion/imgzapr/

Without
http://4eiruntyxxbgfv7o.onion/imgzapr/?censorship=off
This would change the cookie, so that going back to http://4eiruntyxxbgfv7o.onion/imgzapr/ shows it unblocked.

Software Watchdog

Implement a software watchdog to verify the continuity of the service and alert the tor2web site administrator.

Create a Tor Hidden Service mapped to this Tor2web instance and make short-circuited connections through Tor to verify that the service is working properly, or whether there is a Tor outage.

In that case, evaluate whether to restart Tor and retry and/or send an alert email.


leaving links banner feature

Is the leaving link banner a feature that is really needed?

If yes, it needs research and a specification because of the following problems:

  • we can't provide a leaving link for exiting resources other than HTML (for example CSS, JS, images...)
  • the resource name alone may not indicate that the resource type is HTML

Here are the only matching patterns I've identified that would permit the leaving links banner feature:
tag is 'a' and attribute is 'href'
tag is 'iframe' and attribute is 'src'

(By the way, if the feature is not really needed for tor2web, as I think, I suggest avoiding such overhead for such a basic and restricted implementation.)
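
The two matching patterns can be sketched with the standard library HTML parser. The class name, the proxy_suffix parameter and the rule that a link "leaves" when its host does not end with the proxy domain are assumptions made for this illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

# (tag, attribute) pairs named in the issue as candidates for the banner.
LEAVING_PATTERNS = {("a", "href"), ("iframe", "src")}


class LeavingLinkFinder(HTMLParser):
    """Collect URLs from <a href> and <iframe src> that leave the proxy."""

    def __init__(self, proxy_suffix):
        super().__init__()
        self.proxy_suffix = proxy_suffix
        self.leaving = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if (tag, name) in LEAVING_PATTERNS and value:
                # Relative URLs have no hostname and stay on the proxy.
                host = urlsplit(value).hostname
                if host and not host.endswith(self.proxy_suffix):
                    self.leaving.append(value)
```

Other tags such as img or script are ignored, which matches the observation that a banner cannot be shown for non-HTML resources.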

Tor2web Directory

This ticket is about implementing a feature that addresses a basic need for allowing the Tor2web network to be networked, namely having a directory of all tor2web nodes and their characteristics (this ticket does not yet cover the overall networking mechanism, just how the Tor directory could be used by tor2web).

In order to do so, we will use an existing directory related to tor2web's strict software dependency, which is Tor.

Tor2web nodes will announce themselves through the Tor Directory Authority itself, by making every Tor2web node also become a Tor Relay.

The metadata required for the tor2web nodes will be pushed through standard existing entries such as:

  • Name
    • Used to identify that it's a tor2web node (ex: "Tor2web-MyOwnTor2webNode01")
  • Contact
    • Used to show the Tor Hidden Service hostname of the tor2web node. (ex: "blahblahblahblah.onion")
    • Used to show the fingerprint of its SSL certificate and root CA certificate

The Tor2web nodes acting as relays must not hurt or create issues for the Tor network.
This topic has to be discussed within the Tor community, to understand the right configuration parameters for the Tor Relay.

Each tor2web node exposes extended information through an HTTP interface on a dedicated Tor Hidden Service.

From a software perspective, this ticket would result in a set of APIs to:

  • Get the list of all Tor2web nodes and the metadata associated with them
    This will be done through proper parsing of the consensus downloaded via onionoo, as already available in #10
    In order to identify the tor2web nodes among all Tor relays, their "Name" in the consensus will begin with the "tor2web" string.
    In order to download the tor2web metadata from all tor2web nodes:
    • an HTTP connection will be made to each Tor2web node's Tor Hidden Service (listed in the Tor Directory "Contact" field)
    • the Web Service REST API will be queried to download JSON-encoded extended node information
  • Publish / Update an entry in the Tor2web directory
    This API will publish Tor2web to the Tor Directory, create the Tor Hidden Service, and configure Tor to act as a Relay, writing the Name (tor2web-Tor2WebNodeNickName) and Contact (Tor Hidden Service) fields.
  • Web Service (REST/JSON) API serving Tor2web extended node information.
    This API exposes via web all the extended information about tor2web nodes.
    This contains the most important and extended node information.
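
The first API in the list can be sketched as a filter over an already downloaded and parsed Onionoo details document. The sketch assumes the document is a dict with a "relays" list whose items carry "nickname" and "contact" fields; the function name and the case-insensitive prefix match are choices made for illustration.

```python
def find_tor2web_nodes(details_document, prefix="tor2web"):
    """Return (nickname, contact) pairs for relays named like Tor2web nodes.

    `details_document` is a parsed Onionoo details document, i.e. a dict
    with a "relays" list whose items carry "nickname" and "contact".
    """
    nodes = []
    for relay in details_document.get("relays", []):
        nickname = relay.get("nickname", "")
        if nickname.lower().startswith(prefix):
            nodes.append((nickname, relay.get("contact")))
    return nodes
```

The contact value of each match would then be used to reach the node's Tor Hidden Service and download its extended information.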

As a security check:

  • each tor2web node will verify that the SSL fingerprints of the certificate and of the root CA match the ones defined in the Tor Directory Contact field, by connecting over SSL both to the internet host and to the TorHS host.
  • each tor2web node will be tested for its functionality by retrieving "our own Tor HS URL" through its tor2web URL.

The implementation of the Tor2web directory would enable further logic such as #24, by enabling automatic joining and leaving of a network.

TODO:

  • Define the data-format for the information to be provided in Extended node information

Prevent google crawling

Tor2web represents access to a Darknet.

In a Darknet there is no Google search engine.

We should implement a default robots.txt to prevent Google from indexing both web pages and images on Tor2web sites.

https://www.google.com/support/webmasters/bin/answer.py?answer=156449

The option to provide a static robots.txt that prevents the site from being crawled must be configurable.

By default, robots.txt must be returned to crawlers with:
User-agent: *
Disallow: /
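
The effect of this default can be checked with the standard library robots.txt parser. The constant and function names below are assumptions made for this sketch.

```python
from urllib.robotparser import RobotFileParser

# Default robots.txt body from the issue: deny everything to every crawler.
DEFAULT_ROBOTS_TXT = "User-agent: *\nDisallow: /\n"


def crawler_allowed(user_agent, path, robots_txt=DEFAULT_ROBOTS_TXT):
    """Check whether `user_agent` may fetch `path` under `robots_txt`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)
```

A node that disables the feature in its configuration would serve a permissive file instead, and the same check would then return True.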

Serve static content on "node hostname"

This feature describes the ability of a Tor2web node to serve static content over its own specific IP address or hostname.

A tor2web node has to be associated with a unique identifier, either an IP address or a hostname.

Over that unique identifier, the Tor2web node must serve static files over HTTP and HTTPS.

This feature is useful for a tor2web node that wants to:

  • Provide a standard abuse disclaimer and/or general information (an index.html)
  • Publish system-collected information (like that generated by mrtg)

By default, the feature should be enabled in the configuration file and point, as a webroot, to the directory /static/tor2website/ with a default, standard index.html file.

By default, the configuration should use the external IP address of the host, but it should be possible to optionally configure another hostname as well (for example mytor2webnode.mydomain.org).

Tor2web Access disclaimer

As an additional improvement, to reduce the risks for Tor2web operators and better distribute tor2web nodes, we should implement an Access Disclaimer.

The Access Disclaimer is shown to every user who has not yet accepted it.
Upon acceptance of the Access Disclaimer, the user is given a temporary cookie.

Any request without the appropriate acceptance of the Access Disclaimer will result in a redirection to the Access Disclaimer page.

That way we can prevent internet forums from embedding possibly illegal content, and crawlers from fetching it, directly from Tor2web resources.

The Access Disclaimer must be loaded from an external template file.

The Access Disclaimer feature must be configurable, as there may be custom Tor2web usage scenarios where it's not required.

The acceptance of the disclaimer may mitigate the "white page effect": the user will immediately be presented with some content (the disclaimer), and when they click to accept the disclaimer and see the TorHS website, a JavaScript comfort loader may be provided.
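
The temporary cookie could be implemented without server-side state by signing its expiry time. The ticket does not specify a cookie format, so the "expiry:signature" layout, the HMAC-SHA256 signature and the function names below are all assumptions of this sketch.

```python
import hashlib
import hmac
import time


def issue_cookie(secret, ttl, now=None):
    """Create an 'expiry:signature' cookie value valid for `ttl` seconds."""
    start = now if now is not None else time.time()
    expiry = str(int(start + ttl))
    sig = hmac.new(secret, expiry.encode(), hashlib.sha256).hexdigest()
    return expiry + ":" + sig


def cookie_is_valid(secret, value, now=None):
    """Return True if the cookie is correctly signed and not yet expired."""
    expiry, _, sig = (value or "").partition(":")
    expected = hmac.new(secret, expiry.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison of the received and expected signatures.
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return False
    current = now if now is not None else time.time()
    return int(expiry) > current
```

A request handler would redirect to the disclaimer page whenever cookie_is_valid returns False, and call issue_cookie when the user accepts.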

Enable Secure Computing Mode

Following the implementation of #43 to properly ensure uid/gid privilege dropping and chroot of Tor2web, this ticket is to further improve the security of the daemon by enabling Linux's Secure Computing Mode.

Secure Computing Mode

The Linux kernel introduced Secure Computing Mode.
After secure computing mode has been enabled, the only system calls that the thread is permitted to make are read(), write(), _exit(), and sigreturn(). Other system calls result in the delivery of a SIGKILL signal.

It can be enabled with prctl.set_seccomp(mode), from the python-prctl package available at http://packages.python.org/python-prctl/


TorHS cannot be accessed due to a Content Encoding Error

Some web pages are no longer accessible with the new Tor2web 3.0-alpha code.

In particular, https://duskgytldkxiuqc6.tor2web.org/ is no longer accessible, with the browser giving the error "Content Encoding Error".

This problem is present on:

  • Firefox 14
  • Chrome
  • Safari

As far as I remember, this problem has occurred before and was fixed by @hellais in the past, but it has probably been reintroduced during the code reorganization.

That's a critical bug, because the first link on tor2web.org is https://duskgytldkxiuqc6.tor2web.org/ and it produces a browser error.

We should also consider how to detect this kind of error automatically.
Is there a way to define the "normal behavior" of Tor2web and detect when there are "problems" in general?

Content Encoding Error from Firefox/Safari

When visiting the website https://dsyghxm2xtmffaxx.tor2web.org/, the browsers (Firefox/Safari) don't show the web page, but show the error "Content Encoding Error".

With wget everything works:

wget --no-check-certificate https://dsyghxm2xtmffaxx.tor2web.org/
--2012-04-01 19:11:40-- https://dsyghxm2xtmffaxx.tor2web.org/
Resolving dsyghxm2xtmffaxx.tor2web.org (dsyghxm2xtmffaxx.tor2web.org)... 194.150.168.70
Connecting to dsyghxm2xtmffaxx.tor2web.org (dsyghxm2xtmffaxx.tor2web.org)|194.150.168.70|:443... connected.
WARNING: cannot verify dsyghxm2xtmffaxx.tor2web.org's certificate, issued by `/O=AlphaSSL/CN=AlphaSSL CA - G2':
  Unable to locally verify the issuer's authority.
HTTP request sent, awaiting response... 200 OK
Length: 177 [text/html]
Saving to: `index.html.1'

100%[=========================================================================================================================================================================>] 177 --.-K/s in 0s

2012-04-01 19:11:43 (169 MB/s) - `index.html.1' saved [177/177]

Tor2web support statistics

Statistics are very important for understanding and improving the usage and functionality of Tor2web, and they must be collected without breaking user privacy.

That means that we should record (possibly in a dedicated file or database) statistics on its operations:

  • number of requests over time (the amount of access over time)
  • requests that succeeded vs requests that timed out (to know how t2w is behaving)
  • request response times (to compute minimum, maximum and average response time)
  • geo-IP of accessing users (without recording the IP address, but doing a geo-IP lookup for each client to provide statistics on country of usage)
  • daily bandwidth usage
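
A minimal in-memory aggregator for some of these counters might look as follows. The class and field names are hypothetical, and geo-IP lookup and persistence are left out of the sketch.

```python
class RequestStats:
    """Aggregate request counters without storing client IP addresses."""

    def __init__(self):
        self.succeeded = 0
        self.timed_out = 0
        self.bytes_sent = 0
        self.response_times = []

    def record(self, elapsed, size, timed_out=False):
        """Record one request: response time in seconds and bytes sent."""
        if timed_out:
            self.timed_out += 1
            return
        self.succeeded += 1
        self.bytes_sent += size
        self.response_times.append(elapsed)

    def summary(self):
        """Return totals plus minimum, maximum and average response time."""
        times = self.response_times
        return {
            "requests": self.succeeded + self.timed_out,
            "timeouts": self.timed_out,
            "bytes_sent": self.bytes_sent,
            "min": min(times) if times else None,
            "max": max(times) if times else None,
            "avg": sum(times) / len(times) if times else None,
        }
```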

case study on granularity of tor2web content filtering

the basic blocklist implementation provided in the alpha Tor2web release permits:

  • filtering out an entire HS.
  • filtering out a specific HS URL.

but does not permit:

  • filtering out an HS pattern (a generic initial pattern of the URL path)

We need to evaluate, with a case study, the need for this additional granularity and to estimate the additional computational cost.

As suggested by stefanw, we could also evaluate the possible use of a high-end technique like Bloom filtering.
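
To make the cost question concrete, here is a sketch of the three granularities using plain sets and a per-host prefix list. The data structures and the function name are assumptions; the actual blocklist format is not described in this ticket.

```python
def is_blocked(host, path, blocked_hosts, blocked_urls, blocked_prefixes):
    """Check a request against host, exact-URL and path-prefix blocklists.

    `blocked_hosts` is a set of hosts, `blocked_urls` a set of
    (host, path) pairs and `blocked_prefixes` a dict mapping a host to a
    list of blocked path prefixes.
    """
    if host in blocked_hosts:
        return True
    if (host, path) in blocked_urls:
        return True
    # Linear scan: cost grows with the number of prefixes for this host.
    return any(path.startswith(prefix)
               for prefix in blocked_prefixes.get(host, ()))
```

The first two checks are constant-time set lookups; only the prefix check adds work per request, which is the part a case study would need to measure.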

Public Terms of Service

Any tor2web node, in order to survive, must sometimes block access through tor2web to specific URLs.

The TOS (Terms of Service) of a specific Tor2web node regarding its blocking policy must be known.

This feature is to implement a configurable TOS file and make it visible from everywhere.

The terms of service must be clearly visible and accessible from:

  • Injected banner
  • Access Disclaimer
  • Blocked page template
  • Abuse reporting page

The TOS file must be configurable from the configuration file.

A default TOS, provided by the Tor2web project, must be defined. (We may need to engage a lawyer to do that properly.)

Bandwidth Quota

In order to easily run a tor2web node on a server with limited bandwidth, it would be useful for the system administrator to be able to configure monthly Bandwidth Quota limits.

How to act when the quota limit is reached is still to be decided. This feature must be simple.
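
In keeping with the requirement that the feature be simple, a monthly counter could be sketched as below. The class name is hypothetical, and what to do once exceeded() returns True is left open, as in the ticket.

```python
import datetime


class MonthlyQuota:
    """Track bytes transferred in the current calendar month."""

    def __init__(self, limit_bytes):
        self.limit_bytes = limit_bytes
        self.month = None
        self.used = 0

    def add(self, nbytes, today=None):
        """Account for `nbytes` transferred on `today` (default: now)."""
        today = today or datetime.date.today()
        month = (today.year, today.month)
        if month != self.month:
            # A new month has started: reset the counter.
            self.month = month
            self.used = 0
        self.used += nbytes

    def exceeded(self):
        """Return True once the monthly limit has been reached."""
        return self.used >= self.limit_bytes
```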


Large file upload over Tor2web

Previous experience with Tor2web 1.0/2.0 showed some issues in the interaction with file upload functionality.

In particular, the main issue that arose was related to file buffering: the Tor2web proxy was acquiring the file data at "internet speed" by buffering it, and then sending it to the TorHS at "Tor speed".

This created an issue with the file progress bar on the internet client, which reached 99% in a few seconds and then stalled until the Tor2web->TorHS transfer was finished.

As a possible additional issue, we should consider that file resume support requires Chunked Encoding, which is not supported by the Twisted Web Client http://twistedmatrix.com/trac/wiki/TwistedWebClient .

A test plan must be defined to understand the impact of, and possible mitigation strategies for, big file uploads through Tor2web, especially with suspend/resume support.

Blocklist support for URL

The blocklist should support URLs, in order to allow more granularity in applying the filters needed to keep the tor2web network running.

Federated multi-domain/multi-server support

To increase the resiliency of Tor2web, different people would need to run different tor2web networks over different domain names.

The best way to implement a distributed tor2web network is still to be researched and defined on a consensus basis.

Uniquely random Access is described on #33
Torweb Directory #41


Tor2web does not load Certificate Chain resulting in SSL warning

Tor2web does not load certificate chain files properly. This results in an error when using a CA wildcard certificate that requires the delivery of a certificate chain.

A good cert chain for the current tor2web.org certificate is as follows (verified with openssl s_client -connect host:443):

Certificate chain
 0 s:/C=DE/OU=Domain Control Validated/CN=*.tor2web.org
   i:/O=AlphaSSL/CN=AlphaSSL CA - G2
 1 s:/O=AlphaSSL/CN=AlphaSSL CA - G2
   i:/C=BE/O=GlobalSign nv-sa/OU=Root CA/CN=GlobalSign Root CA

The current cert chain is:

Certificate chain
 0 s:/C=DE/OU=Domain Control Validated/CN=*.tor2web.org
   i:/O=AlphaSSL/CN=AlphaSSL CA - G2

An analysis of the problem and a proposed solution are described at:
http://twistedmatrix.com/pipermail/twisted-python/2010-July/022597.html
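As an illustration of what "delivering the chain" means, the sketch below uses Python's standard `ssl` module instead of the Twisted/pyOpenSSL stack that tor2web runs on, so it is an analogy and not the proposed fix. The helper names are hypothetical. The point is that the certificate file handed to the TLS context should contain the leaf certificate followed by the intermediate certificates; pyOpenSSL offers a comparable `Context.use_certificate_chain_file` call.

```python
import ssl


def build_chain_pem(leaf_pem, intermediate_pems):
    """Concatenate PEM blocks: leaf first, then each intermediate in order."""
    blocks = [leaf_pem] + list(intermediate_pems)
    return "".join(block.strip() + "\n" for block in blocks)


def make_server_context(chain_path, key_path):
    """Create a server TLS context from a chain file and a private key."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # certfile may hold the leaf certificate followed by the CA certificates
    # needed to establish its authenticity (here: the intermediate).
    context.load_cert_chain(certfile=chain_path, keyfile=key_path)
    return context
```

After such a change, `openssl s_client -connect host:443` should list both entry 0 and entry 1 as in the good chain above.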

Direct access to images from outside must be blocked

BUG: Block Directly accessed URL for image resources from outside

Direct requests to images served through tor2web must be blocked, with an appropriate error returned, to prevent internet forums from linking to images hosted on tor2web.

On Apache it has been implemented this way:
# Block Direct Access to URL
# Prevent people from linking, on external forums, directly to Tor2web hosted images
RewriteCond %{REQUEST_URI} \.(gif|jpg|png)$
RewriteCond %{HTTP_REFERER} !.tor2web\.org.
RewriteRule (.*) - [G,L]

It may be valuable to implement it as tor2web logic and not nginx logic.
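A minimal sketch of what the same check could look like inside tor2web, assuming the proxy has the request path and the Referer header available. The function name `is_hotlinked_image` and the `base_domain` parameter are hypothetical. Like the Apache rules above, it treats a missing Referer as a direct access.

```python
from urllib.parse import urlsplit

IMAGE_EXTENSIONS = (".gif", ".jpg", ".png")


def is_hotlinked_image(path, referer, base_domain="tor2web.org"):
    """Return True for image requests whose Referer is not the tor2web domain."""
    if not path.lower().endswith(IMAGE_EXTENSIONS):
        return False
    # An absent or unparsable Referer yields an empty host and is blocked.
    host = (urlsplit(referer or "").hostname or "").lower()
    return not (host == base_domain or host.endswith("." + base_domain))
```

When the function returns true, the proxy would answer with an error page (the Apache `[G]` flag answers with 410 Gone) and would not fetch the image.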

Add Tor2web http header

It's reasonable to give the Tor HS a way to know that a specific user is coming from Tor2web.

This is done by setting an X-tor2web HTTP header, following the specification described in globaleaks/GlobaLeaks#99 and currently implemented on Apache with:
RequestHeader set X-tor2web "encrypted"

This must be added following globaleaks/GlobaLeaks#99
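In the Python proxy this amounts to setting the header on the outgoing request. The sketch below works on a plain dict of headers; the function name `add_tor2web_header` is hypothetical, and dropping any client-supplied copy of the header is an assumption about the desired behaviour, not something stated in the specification.

```python
def add_tor2web_header(headers, value="encrypted"):
    """Return a copy of the request headers with X-tor2web set."""
    # Drop any X-tor2web sent by the client so it cannot be spoofed or duplicated.
    forwarded = {k: v for k, v in headers.items() if k.lower() != "x-tor2web"}
    forwarded["X-tor2web"] = value
    return forwarded
```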

Security Enforcement of Daemon (uid/gid+chroot)

To properly enforce the security of the tor2web proxy, it must run with a dedicated uid/gid and automatically chroot into its own directory.

Implementing this kind of feature requires taking care of:

  • fixing installation procedures
  • handling the location and permissions of configuration files, digital certificates and log files

Twisted supports chroot by default from the command line; it must be evaluated whether it's better to chroot via the twistd command line or from within the application.

Twisted supports the following command-line switches http://linux.die.net/man/1/twistd :

  • --chroot
    Chroot to a supplied directory before running (default: don't chroot). Chrooting is done before changing the current directory.
  • -u, --uid
    The uid to run as. (default: don't change)
  • -g, --gid
    The gid to run as. (default: don't change)

Some good info on this is available at http://www.tsheffler.com/blog/?p=526
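For the command-line variant, a launcher or init script could assemble the twistd invocation from the switches listed above. This is only a sketch: the function name `twistd_argv` is hypothetical, and the `-y` switch (run a Python application file), which is not mentioned above, is assumed as the way the application is started.

```python
def twistd_argv(app_file, chroot_dir, uid, gid):
    """Build a twistd command line that chroots and drops privileges."""
    return [
        "twistd",
        "--chroot", chroot_dir,  # chroot before changing the current directory
        "-u", str(uid),          # uid to run as
        "-g", str(gid),          # gid to run as
        "-y", app_file,          # application file (assumed launch method)
    ]
```

The resulting list can be passed to `subprocess.run()` by a process started as root. Configuration files, certificates and logs then have to be reachable from inside the chroot directory.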
