
tproxy's Introduction

tproxy

tproxy is a simple TCP routing proxy (layer 7) built on Gevent that lets you configure the routing logic in Python. It is heavily inspired by proxy machine but has some unique features, like the pre-fork worker model borrowed from Gunicorn.

Installation

tproxy requires Python 2.x >= 2.5. Python 3.x support is planned.

$ pip install gevent
$ pip install tproxy

To install from source:

$ git clone git://github.com/benoitc/tproxy.git
$ cd tproxy
$ pip install -r requirements.txt
$ python setup.py install

Test your installation by running the command line:

$ tproxy examples/transparent.py

Then go to http://127.0.0.1:5000; you should see the Google homepage.
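For reference, a transparent routing script can be very small. The sketch below is an assumption about what examples/transparent.py roughly contains (the target google.com:80 is inferred from the expected result above); check the file in the repository for the actual code.

```python
# Minimal transparent routing script (a sketch, not necessarily the
# exact contents of examples/transparent.py).
def proxy(data):
    # Forward every connection, whatever the payload, to one backend.
    # "google.com:80" is an assumed target.
    return {"remote": "google.com:80"}
```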

Usage

$ tproxy -h

Usage: tproxy [OPTIONS] script_path

Options:
  --version                     show program's version number and exit
  -h, --help                    show this help message and exit
  --log-file=FILE               The log file to write to. [-]
  --log-level=LEVEL             The granularity of log outputs. [info]
  --log-config=FILE             The log config file to use. [None]
  -n STRING, --name=STRING      A base to use with setproctitle for process naming.
                                [None]
  -D, --daemon                  Daemonize the tproxy process. [False]
  -p FILE, --pid=FILE           A filename to use for the PID file. [None]
  -u USER, --user=USER          Switch worker processes to run as this user. [501]
  -g GROUP, --group=GROUP
                                Switch worker process to run as this group. [20]
  -m INT, --umask=INT           A bit mask for the file mode on files written by
                                tproxy. [0]
  -b ADDRESS, --bind=ADDRESS    The socket to bind. [127.0.0.1:8000]
  --backlog=INT                 The maximum number of pending connections. [2048]
  --ssl-keyfile=STRING          Ssl key file [None]
  --ssl-certfile=STRING         Ssl ca certs file. contains concatenated
                                "certification [None]
  --ssl-ca-certs=STRING         Ssl ca certs file. contains concatenated
                                "certification [None]
  --ssl-cert-reqs=INT           Specifies whether a certificate is required from the
                                other [0]
  -w INT, --workers=INT         The number of worker process for handling requests. [1]
  --worker-connections=INT      The maximum number of simultaneous clients per worker.
                                [1000]
  -t INT, --timeout=INT         Workers silent for more than this many seconds are
                                killed and restarted. [30]

Signals

QUIT    -   Graceful shutdown. Stop accepting connections immediately
            and wait until all connections close

TERM    -   Fast shutdown. Stop accepting and close all connections
            after 10s.
INT     -   Same as TERM

HUP     -   Graceful reloading. Reload all workers with the new code
            in your routing script.

USR2    -   Upgrade tproxy on the fly

TTIN    -   Increase the number of workers by 1

TTOU    -   Decrease the number of workers by 1
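
These signals can also be sent from Python. The sketch below assumes tproxy was started with --pid pointing at a file that holds the id of the master process; the path tproxy.pid and the helper names are hypothetical.

```python
import os
import signal


def read_pid(pid_file):
    """Return the process id stored in pid_file as an int."""
    with open(pid_file) as f:
        return int(f.read().strip())


def send_signal(pid_file, sig):
    """Send sig to the process whose id is stored in pid_file."""
    os.kill(read_pid(pid_file), sig)


# Usage (hypothetical path): reload the workers after editing the
# routing script.
# send_signal("tproxy.pid", signal.SIGHUP)
```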

Example of a routing script

import re
re_host = re.compile(r"Host:\s*(.*)\r\n")

class CouchDBRouter(object):
    # look at the routing table and return a couchdb node to use
    def lookup(self, name):
        """ do something """

router = CouchDBRouter()

# Perform content-aware routing based on the stream data. Here, the
# Host header information from the HTTP protocol is parsed to find the
# username and a lookup routine is run on the name to find the correct
# couchdb node. If no match can be made yet, do nothing with the
# connection. (make your own couchone server...)

def proxy(data):
    matches = re_host.findall(data)
    if matches:
        host = router.lookup(matches.pop())
        return {"remote": host}
    return None
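
To make the lookup step concrete, here is one possible way to fill in the router. It is only a sketch: the routing table, the node addresses, and the rule "the user name is the first label of the Host header" are assumptions for illustration. Bytes literals are used so the code behaves the same on Python 2, which tproxy targets, and on Python 3.

```python
import re

re_host = re.compile(br"Host:\s*(.*)\r\n")


class CouchDBRouter(object):
    """Map a user name to the CouchDB node that serves it."""

    def __init__(self, table):
        self.table = table

    def lookup(self, name):
        # Return "host:port" for a known user, None otherwise.
        return self.table.get(name)


# Hypothetical routing table; in practice it could live in a database.
router = CouchDBRouter({b"alice": "10.0.0.1:5984", b"bob": "10.0.0.2:5984"})


def proxy(data):
    matches = re_host.findall(data)
    if not matches:
        return None  # Host header not received yet: wait for more data
    # "alice.example.com:5000" -> "alice"
    username = matches.pop().split(b".")[0]
    node = router.lookup(username)
    if node is None:
        return {"close": True}  # unknown user: drop the connection
    return {"remote": node}
```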

Example SOCKS4 Proxy in 18 Lines

import socket
import struct

def proxy(data):
    if len(data) < 9:
        return

    command = ord(data[1])
    ip, port = socket.inet_ntoa(data[4:8]), struct.unpack(">H", data[2:4])[0]
    idx = data.index("\0", 8)  # the NUL that terminates the userid field
    userid = data[8:idx]

    if command == 1: #connect
        return dict(remote="%s:%s" % (ip, port),
                reply="\0\x5a\0\0\0\0\0\0",
                data=data[idx + 1:])
    else:
        return {"close": "\0\x5b\0\0\0\0\0\0"}
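
To try the SOCKS4 script without a real client, a request can be built by hand. The sketch below follows the SOCKS4 request layout (version, command, port, IPv4 address, NUL-terminated user id) and reads the fields back the same way the proxy function does. The helper build_socks4_connect and the sample address are illustrative and not part of tproxy.

```python
import socket
import struct


def build_socks4_connect(ip, port, userid=b""):
    """Build a SOCKS4 CONNECT request: VN, CD, DSTPORT, DSTIP, USERID, NUL."""
    return (struct.pack(">BBH", 4, 1, port)
            + socket.inet_aton(ip)
            + userid + b"\0")


request = build_socks4_connect("10.1.2.3", 8080, b"bob")

# Fields read back from the request, as in the proxy function:
port = struct.unpack(">H", request[2:4])[0]
ip = socket.inet_ntoa(request[4:8])
userid = request[8:request.index(b"\0", 8)]
```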

Example of returning a file

import os

WELCOME_FILE = os.path.join(os.path.dirname(__file__), "welcome.txt")

def proxy(data):
    fno = os.open(WELCOME_FILE, os.O_RDONLY)
    return {
            "file": fno,
            "reply": "HTTP/1.1 200 OK\r\n\r\n"
           }

Valid return values

  • { "remote": String or tuple } - String is the host:port of the server that will be proxied; a tuple is (host, port).
  • { "remote": String, "data": String } - Same as above, but send the given data instead.
  • { "remote": String, "data": String, "reply": String } - Same as above, but reply with the given data back to the client.
  • None - Do nothing.
  • { "close": True } - Close the connection.
  • { "close": String } - Close the connection after sending the String.
  • { "file": String } - Return the file specified by the file path and close the connection.
  • { "file": String, "reply": String } - Same as above, but reply with the given data back to the client.
  • { "file": Int } - Return the file specified by its file descriptor.
  • { "file": Int, "reply": String } - Same as above, but reply with the given data back to the client.

Notes:

If the sendfile API is available, it will be used to send a file with the "file" command.

The file command can have 2 optional parameters:

  • offset: specifies where to begin in the file.
  • nbytes: specifies how many bytes of the file should be sent.

To handle SSL for the remote connection, you can add these optional arguments:

  • ssl: True or False, whether to connect with SSL.
  • ssl_args: dict, optional SSL arguments. Read the ssl documentation for more information about them.
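
As an illustration of these options, the sketch below assumes that offset, nbytes and ssl are given as additional keys of the returned dict (the notes above do not show their exact placement). The file path and the backend address are hypothetical.

```python
# Hypothetical path and backend address, used for illustration only.
WELCOME_FILE = "welcome.txt"
SECURE_BACKEND = "127.0.0.1:8443"


def proxy(data):
    if data.startswith(b"GET /welcome "):
        # Send at most the first 1024 bytes of the file.
        return {
            "file": WELCOME_FILE,
            "reply": b"HTTP/1.1 200 OK\r\n\r\n",
            "offset": 0,
            "nbytes": 1024,
        }
    # Everything else goes to a backend that expects SSL.
    return {"remote": SECURE_BACKEND, "ssl": True}
```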

Handle errors

You can easily handle errors by adding a proxy_error function to your script:

def proxy_error(client, e):
    pass

This function gets the ClientConnection instance (the current connection) as its first argument and the error exception as its second argument.
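
A handler that does a little more than pass could log the failure. This is a minimal sketch: the logger name is hypothetical, and only the repr of the connection is used because its attributes are not described here.

```python
import logging

log = logging.getLogger("tproxy.script")  # hypothetical logger name


def proxy_error(client, e):
    # client is the current ClientConnection, e the exception raised
    # while handling it.
    log.error("proxy error on %r: %s", client, e)
```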

Rewrite requests & responses

The main goal of tproxy is to let you route TCP transparently to your applications. But in some cases you want to do more. For example, in HTTP 1.1 you may need to change the Host header so that the remote HTTP server knows what to do if it uses virtual hosting.

To do that, add a rewrite_request function to your script to rewrite the client request, and a rewrite_response function to rewrite the remote response. Both functions take a tproxy.rewrite.RewriteIO instance, which is based on the io.RawIOBase class.

See the httprewrite.py example for an example of HTTP rewrite.
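
As a rough idea of what a Host rewrite can look like, the sketch below assumes that the RewriteIO instance can be used through the read() and write() methods of io.RawIOBase, and that the request headers fit in the first chunk read. The upstream host name is hypothetical, and httprewrite.py remains the reference for the real API.

```python
import io
import re

# Hypothetical upstream virtual host; replace with your own.
UPSTREAM_HOST = b"backend.example.com"
HOST_RE = re.compile(br"^Host:[ \t]*[^\r\n]*", re.IGNORECASE | re.MULTILINE)


def rewrite_host(chunk, host=UPSTREAM_HOST):
    """Replace the value of the first Host header found in chunk."""
    return HOST_RE.sub(b"Host: " + host, chunk, count=1)


def rewrite_request(req):
    first = True
    while True:
        chunk = req.read(io.DEFAULT_BUFFER_SIZE)
        if not chunk:
            break
        if first:
            # Only the first chunk is assumed to carry the headers.
            chunk = rewrite_host(chunk)
            first = False
        req.write(chunk)
```

Only the first chunk is rewritten so that a request body containing a line starting with "Host:" is left alone; a keep-alive connection carrying several requests would need a real HTTP parser.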

Copyright

2011 (c) Benoît Chesneau <[email protected]>

tproxy's Issues

fork moved from gevent.hub to gevent.os

Traceback (most recent call last):
  File "/venv/bin/tproxy", line 7, in <module>
    from tproxy.app import run
  File "/venv/lib/python2.7/site-packages/tproxy/app.py", line 19, in <module>
    from . import util
  File "/venv/lib/python2.7/site-packages/tproxy/util.py", line 23, in <module>
    from gevent.hub import fork
ImportError: cannot import name fork
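
One way to keep the import working across gevent versions is a fallback chain. This is a sketch of a possible fix for tproxy/util.py, not the change made by the maintainers; the last fallback to os.fork is only there so that the snippet also runs without gevent installed.

```python
try:
    from gevent.os import fork        # newer gevent releases
except ImportError:
    try:
        from gevent.hub import fork   # older gevent releases
    except ImportError:
        from os import fork           # gevent not installed: plain fork
```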

proxy over ssh doesn't work

  1. my proxy script test.py,
def proxy(data):
    print data
    return {"remote": "127.0.0.1:22"}
  2. Then I run tproxy with tproxy test.py -b 0.0.0.0:8887

  3. Try to ssh to 127.0.0.1,

vagrant@vagrant ~/tmp $ ssh -vT 127.0.0.1 -p 8887
OpenSSH_5.9p1 Debian-5ubuntu1, OpenSSL 1.0.1 14 Mar 2012
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 19: Applying options for *
debug1: Connecting to 127.0.0.1 [127.0.0.1] port 8887.
debug1: Connection established.
debug1: identity file /home/vagrant/.ssh/id_rsa type 1
debug1: Checking blacklist file /usr/share/ssh/blacklist.RSA-2048
debug1: Checking blacklist file /etc/ssh/blacklist.RSA-2048
debug1: identity file /home/vagrant/.ssh/id_rsa-cert type -1
debug1: identity file /home/vagrant/.ssh/id_dsa type -1
debug1: identity file /home/vagrant/.ssh/id_dsa-cert type -1
debug1: identity file /home/vagrant/.ssh/id_ecdsa type -1
debug1: identity file /home/vagrant/.ssh/id_ecdsa-cert type -1

Then it just blocked and nothing more comes out.

  4. Output from tproxy,
vagrant@vagrant ~/github/tproxy $ tproxy examples/test.py -b 0.0.0.0:8887
2014-05-09 19:38:57 [30445] [INFO] tproxy 0.5.4 started
2014-05-09 19:38:57 [30445] [INFO] Listening on 0.0.0.0:8887
2014-05-09 19:38:57 [30446] [INFO] Booting worker with pid: 30446

Any suggestions?
Thanks!

tproxy + nginx

I wanted to use tproxy and nginx to serve static files. When I point my browser to the tproxy URL, most of the time the browser fails to show the page.

I got this in the console running tproxy with --log-level=debug

2011-05-10 23:04:17 [11337] [INFO] tproxy 0.5.3 started
2011-05-10 23:04:17 [11337] [INFO] Listening on 127.0.0.1:5000
2011-05-10 23:04:17 [11338] [INFO] Booting worker with pid: 11338
2011-05-10 23:21:07 [11338] [DEBUG] Successful connection to 127.0.0.1:8000
Traceback (most recent call last):
  File "/home/amirouche/ftv/hm2/lib/python2.7/site-packages/gevent/greenlet.py", line 390, in run
    result = self._run(*self.args, **self.kwargs)
TypeError: proxy_io() takes at most 4 arguments (5 given)
<Greenlet at 0x1eef160: <bound method Route.proxy_io of <tproxy.route.Route object at 0x1ed7c90>>(<socket at 0x1f61a10 fileno=15 sock=127.0.0.1:5000, <socket at 0x7fcf5f0081d0 fileno=16 sock=127.0.0.1, [], None)> failed with TypeError

2011-05-10 23:21:07 [11338] [DEBUG] got data from input
2011-05-10 23:21:07 [11338] [DEBUG] Close connection to 127.0.0.1:8000
2011-05-10 23:21:07 [11338] [DEBUG] Successful connection to 127.0.0.1:7999
Traceback (most recent call last):
  File "/home/amirouche/ftv/hm2/lib/python2.7/site-packages/gevent/greenlet.py", line 390, in run
    result = self._run(*self.args, **self.kwargs)
TypeError: proxy_io() takes at most 4 arguments (5 given)
<Greenlet at 0x1eef958: <bound method Route.proxy_io of <tproxy.route.Route object at 0x1ed7c90>>(<socket at 0x1f718d0 fileno=15 sock=127.0.0.1:5000, <socket at 0x1f79ed0 fileno=16 sock=127.0.0.1:5125, [], None)> failed with TypeError

2011-05-10 23:21:07 [11338] [DEBUG] Close connection to 127.0.0.1:7999

To make things clearer: at localhost:8000 there is gunicorn, and at 7999 there is nginx.

Here are the versions I used to do the test:
gevent==0.13.6
greenlet==0.3.1
gunicorn==0.12.1
http-parser==0.3.3
-e git://github.com/benoitc/tproxy.git@0f32b5b0d8a1f004ed060db4729943b37a3306c1#egg=tproxy-dev
wsgiref==0.1.2

There is an example setup in this repository: https://bitbucket.org/abki/ftv/src.

I think it has something to do with nginx, since when I tried gunicorn plus a file served with tproxy (like in the README example) it worked well.

I ran these tests on Linux 2.6.

tproxy support for script cli args

extend tproxy such that command line arguments can be passed all the way to the called script.

Current workaround:

assume the following command line (notice the extra --remote argument):

$ dproxy.py --daemon --log-file=MyLogFile.log --bind=127.0.0.1:5678 --remote=130.130.130.130:6789 transparent2.py

where dproxy.py replaces tproxy by supporting the --remote cli argument as follows:

from tproxy.app import run
from tproxy.config import *
import os

class RemoteAddress(Setting):
    name = "remote"
    section = "the address of the remote host"
    cli = ["-r", "--remote"]
    meta = "REMOTEADDRESS"
    default = "127.0.0.1:8080"
    validator = validate_string
    desc = """\
        address of remote host to connect to
        """
run() 

while transparent2.py would look as follows:

import argparse

# parse the arguments
parser          = argparse.ArgumentParser()
parser.add_argument('--remote',     dest='remote',      action='store', type=str, default=None)
args, remainder = parser.parse_known_args()

ip, port        = args.remote.split(':')

def proxy(data):
    return {'remote': (ip, int(port))}

Although this approach works, it would be nicer if tproxy provided support for extra CLI arguments and possibly facilitated access to them without going through argparse in the called script.
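
Another possible workaround, which avoids replacing the tproxy command, is to pass the value through the environment. This is a different technique from the one above and only a sketch: the variable name TPROXY_REMOTE and the helper parse_remote are hypothetical, and it assumes the workers inherit the environment of the process that launches tproxy.

```python
import os


def parse_remote(value, default="127.0.0.1:8080"):
    """Split "host:port" into (host, int(port)); use default if value is unset."""
    host, _, port = (value or default).rpartition(":")
    return host, int(port)


# Read once, when tproxy loads the routing script.
REMOTE = parse_remote(os.environ.get("TPROXY_REMOTE"))


def proxy(data):
    return {"remote": REMOTE}
```

It would be started as, for example, TPROXY_REMOTE=130.130.130.130:6789 tproxy --bind=127.0.0.1:5678 transparent2.py.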

"Connection refused" Error with example/httpproxy.py

Hi,

example/httpproxy always returns "Connection refused", whatever the URLs used.

http: //127.0.0.1:5000/http://www.google.fr

(tproxy)nassim@nba:~/Projects/tproxy$ tproxy examples/httpproxy.py -w 3 --log-level=debug
2011-05-12 10:29:47 [4174] [INFO] tproxy 0.5.4 started
2011-05-12 10:29:47 [4174] [INFO] Listening on 127.0.0.1:5000
2011-05-12 10:29:47 [4175] [INFO] Booting worker with pid: 4175
2011-05-12 10:29:47 [4176] [INFO] Booting worker with pid: 4176
2011-05-12 10:29:47 [4177] [INFO] Booting worker with pid: 4177
2011-05-12 10:29:51 [4175] [ERROR] Error while connecting: [socket error while connectinng: [[Errno 111] Connection refused]]
2011-05-12 10:29:56 [4177] [ERROR] Error while connecting: [socket error while connectinng: [[Errno 111] Connection refused]]

Version commit 8b37957
Python version 2.6.6

patch

Hello,

We found two issues when testing tproxy. Please find enclosed a quick patch for each; you may want to reformat or reorganise the code though.

Firstly, the old __import__(name) syntax was deprecated, so the code was changed to use the imp library.
Secondly, the class Worker is created before the fork. On my local version of gevent, this caused the version in the parent, instead of the one in the child, to be used some of the time. The new code (only lightly tested) creates a small class to store the required information, which does not inherit from StreamServer.

Regards

Thomas

diff -u /Users/thomas/source/others/proxy/tproxy/tproxy/arbiter.py tproxy/tproxy/arbiter.py
--- /Users/thomas/source/others/proxy/tproxy/tproxy/arbiter.py 2011-11-29 15:10:36.000000000 +0000
+++ tproxy/tproxy/arbiter.py 2011-11-29 14:50:44.000000000 +0000
@@ -19,7 +19,15 @@
 from .pidfile import Pidfile
 from .proxy import tcp_listener
 from .worker import Worker
+from .workertmp import WorkerTmp

+class WorkerInfo(object):
+    def __init__(self, age, ppid, listener, cfg, script,tmp):
+        self.name = cfg.name
+        self.age = age
+        self.ppid = ppid
+        self.cfg = cfg
+        self.tmp = tmp

 class HaltServer(Exception):
@@ -386,14 +394,17 @@

     def spawn_worker(self):
         self.worker_age += 1
-        worker = Worker(self.worker_age, self.pid, self.LISTENER, self.cfg,
-                self.script)
+        tmpworker = WorkerTmp(self.cfg)
+        worker = WorkerInfo(self.worker_age, self.pid, self.LISTENER, self.cfg,
+                self.script,tmpworker)
         pid = os.fork()
         if pid != 0:
             self.WORKERS[pid] = worker
             return

         # Process Child
+        worker = Worker(self.worker_age, self.pid, self.LISTENER, self.cfg,
+                self.script,tmpworker)
         worker_pid = os.getpid()
         try:
             self.log.info("Booting worker with pid: %s" % worker_pid)

diff -u /Users/thomas/source/others/proxy/tproxy/tproxy/worker.py tproxy/tproxy/worker.py
--- /Users/thomas/source/others/proxy/tproxy/tproxy/worker.py 2011-11-29 15:10:36.000000000 +0000
+++ tproxy/tproxy/worker.py 2011-11-29 14:50:25.000000000 +0000
@@ -14,7 +14,6 @@

 from . import util
 from .proxy import ProxyServer
-from .workertmp import WorkerTmp

 class Worker(ProxyServer):

@@ -25,7 +24,7 @@

     PIPE = []

-    def __init__(self, age, ppid, listener, cfg, script):
+    def __init__(self, age, ppid, listener, cfg, script,tmp):
         ProxyServer.__init__(self, listener, script,
                 spawn=Pool(cfg.worker_connections))

@@ -45,7 +44,7 @@
         self.age = age
         self.ppid = ppid
         self.cfg = cfg
-        self.tmp = WorkerTmp(cfg)
+        self.tmp = tmp
         self.booted = False
         self.log = logging.getLogger(__name__)

diff -u /Users/thomas/source/others/proxy/tproxy/tproxy/tools.py tproxy/tproxy/tools.py
--- /Users/thomas/source/others/proxy/tproxy/tproxy/tools.py 2011-11-29 15:10:36.000000000 +0000
+++ tproxy/tproxy/tools.py 2011-11-29 15:14:27.000000000 +0000
@@ -3,6 +3,7 @@
 # This file is part of tproxy released under the MIT license.
 # See the NOTICE for more information.

+import imp

 try:
     from importlibe import import_module
@@ -40,5 +41,4 @@
                 break
             level += 1
         name = _resolve_name(name[level:], package, level)
-    __import__(name)
-    return sys.modules[name]
+    return imp.load_source(name,name)
