andybalholm / redwood

Web content filter that runs as an HTTP proxy

License: BSD 2-Clause "Simplified" License

Go 99.54% Shell 0.46%

redwood's Issues

How do I define ssl-bump exception rules for a big set of domains?

In Squid I can use a regex for a single domain or several, e.g. .microsoft.com matches both microsoft.com and all of its subdomains.
I am also missing a destination-IP ACL; is it possible to add an option for that in some way?
Is there such an option in Redwood?
(I am willing to put some effort into it.)

Redwood service stops after some time

Hey,

I am testing the Redwood filter on port 80 only, in transparent mode, under heavy traffic (CentOS 7).
After some time (about 20 min) I get this in my messages log:
systemd: redwood.service: main process exited, code=exited, status=2/INVALIDARGUMENT
systemd: Unit redwood.service entered failed state.
systemd: redwood.service failed.

And if I try to restart it, it fails again. Why?

pruning elements

Quick question: my /etc/redwood/pruning.conf file has this line, but the "More information..." link still shows up when I browse to http://example.com/

example.com a

Thoughts?

Regex support for negative lookbehind/lookahead

Hi,

It seems that the regex implementation doesn't support negative lookbehind/lookahead syntax.

See here for more info:
Regex Tutorial - Lookahead and Lookbehind Zero-Length Assertions

Below an example of how I would like to use it (the ?! part is the negative lookahead):

^(.*\.)*((?!font[s]*).)*\.googleapis\.com$
Meaning: whitelist the googleapis.com domain, except for fonts.googleapis.com.

This would make it easier to block just one subdomain instead of whitelisting many, if not all, possible subdomains.

It is not a deal-breaker, but it makes whitelisting easier from a regex perspective.

Here are some regex examples I use:
https://github.com/cbuijs/accomplist/blob/master/chris/regex.white
https://github.com/cbuijs/accomplist/blob/master/chris/tld-black.regex

The second one is interesting, as it blocks anything except valid TLDs registered by IANA (it is a blacklist, since it negates).

Cheers,
-Chris
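
Redwood is written in Go, and Go's standard regexp package uses RE2 syntax, which has no lookahead or lookbehind, so a pattern like the one above will not compile there. One common workaround, sketched below, is to pair an allow pattern with a separate deny pattern. The names allowRE, denyRE, and allowed are hypothetical, and this is only an illustration of the idea, not Redwood's configuration mechanism.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// allowRE matches googleapis.com and any of its subdomains.
	allowRE = regexp.MustCompile(`^(.*\.)?googleapis\.com$`)
	// denyRE carves out the font(s) subdomain that a lookahead would have excluded.
	denyRE = regexp.MustCompile(`^(.*\.)?fonts?\.googleapis\.com$`)
)

// allowed reports whether host matches the allow pattern and not the deny pattern.
func allowed(host string) bool {
	return allowRE.MatchString(host) && !denyRE.MatchString(host)
}

func main() {
	for _, h := range []string{"maps.googleapis.com", "fonts.googleapis.com", "example.com"} {
		fmt.Println(h, allowed(h))
	}
}
```

Splitting the rule in two keeps each pattern readable and stays within what RE2 can express.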

Redirect all outgoing traffic through redwood

I want to set up redwood so that it transparently filters all the outgoing traffic on my Mac.

For my attempt, I copied the default configuration and (I hope correctly) created the root certificate and key. After this, redwood was run as user nobody and the firewall (PF) was configured to redirect all outgoing traffic not arising from user nobody to the redwood ports.

This configuration works for http, but for each https request the redwood logs show something like:

2018-03-14 02:34:03.817397,192.168.1.6,,127.0.0.1:6510,infinite redirect loop

Here is the PF configuration file I am using.

rdr pass inet proto tcp from any to any port = 80 -> 127.0.0.1 port 6502
rdr pass inet proto tcp from any to any port = 443 -> 127.0.0.1 port 6510
pass out route-to (lo0 127.0.0.1) inet proto tcp from any to any port = 80 user != nobody 
pass out route-to (lo0 127.0.0.1) inet proto tcp from any to any port = 443 user != nobody

These rules use a workaround mentioned here, because the redirect command only applies to incoming traffic, but that may or may not be relevant.

Even without going into the specific details, would such an approach work?

EDIT: By the way, awesome project :)

blackweb lists compatibility

I have seen this blacklist:
https://github.com/maravento/blackweb

I was wondering what would be the best way to use the blackweb lists with Redwood.

EOF as the body of a response

I noticed that sometimes a request is answered with the text "EOF".
In the logs I found something like:
2016/08/26 03:01:25 error fetching http://www.vassalengine.org/: EOF

Is it a known issue?

santanderbank login issue

I can't log in to santanderbank.com through the Redwood proxy. I don't get an error; after I enter my password it just takes me back to the home page. I confirmed that I can log in without the proxy.

I tried with a minimal Redwood setup.

redwood.conf:
http-proxy :6502
acls /etc/redwood/acls.conf
tls-cert /etc/redwood/root.pem
tls-key /etc/redwood/root_key.pem

acls.conf:
acl connect method CONNECT
ssl-bump connect

Just reporting the issue.

Thanks

Integration of drbl-peer into redwood

Hey,
I wrote https://github.com/elico/drbl-peer and I would like to add Redwood support for it.
I can try to contribute the code via a fork and then a pull request, but I will need a couple of pointers.
The basic setup is based on a single text file that contains the relevant details.
Caching can be done with a caching DNS server and/or an HTTP caching proxy, so there is no need to cache results inside Redwood.

I think a simple "if the file exists, use the function with the file's settings" check would be the most appropriate.
The drbl peers list can contain a custom DB or a public one like OpenDNS or Symantec.

Per user content pruning?

Is it possible to do per user content pruning? Let's say I have a user that wants to have all images removed from a number of sites. Is there a way that I can do this and apply it to a specific user or group of users or is there only a global policy?

Check Client Certificate in SSLBump

Hey,

How can I get client certificate details like "Issued By" (to see whether it is my self-signed certificate)
in tls.go, before tlsConn.Handshake()?

Thanks,
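
In Go's crypto/tls, a peer's certificates are exposed through ConnectionState().PeerCertificates, which is populated only once the handshake has completed, and only if the server's tls.Config asks for a client certificate through its ClientAuth field. The sketch below shows the issuer check itself on a throwaway self-signed certificate; issuedBy and selfSigned are hypothetical helpers, not functions from Redwood.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// issuedBy reports whether the certificate's issuer common name equals want.
func issuedBy(cert *x509.Certificate, want string) bool {
	return cert.Issuer.CommonName == want
}

// selfSigned creates a short-lived self-signed certificate for demonstration only.
func selfSigned(cn string) (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: cn},
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(time.Hour),
	}
	// Using the template as its own parent makes issuer and subject identical.
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

func main() {
	cert, err := selfSigned("Example Test CA")
	if err != nil {
		panic(err)
	}
	fmt.Println("issuer:", cert.Issuer.CommonName)
	fmt.Println("issued by our CA:", issuedBy(cert, "Example Test CA"))
}
```

On a live connection the same check would be applied to each entry of tlsConn.ConnectionState().PeerCertificates after Handshake returns. Matching on the common name alone is not proof of origin; verifying the chain against your own CA pool is the stronger test.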

URL based filtering

We are looking for a proxy to filter outbound connections from our servers based on a URL whitelist. We thought about running a combination of iptables and Redsocks on our servers to redirect outbound connections to a proxy for filtering (as done by Spotify).

This still poses problems, as adding something like github.com to the whitelist exposes quite a lot of content to those servers. We would like to filter based on the requested URL, such as https://github.com/andybalholm/redwood.*. This matters especially for bumped TLS traffic, since most of the destinations are already encrypted.

Is this something which is possible with redwood or worth considering as a future feature?
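
As a rough illustration of the kind of check being asked for, here is a minimal sketch of a URL-prefix allowlist. A proxy can only see the path for plain HTTP or for TLS connections it has bumped. The names allowedPrefixes and urlAllowed are hypothetical and do not describe Redwood's ACL syntax.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// allowedPrefixes is a hypothetical allowlist of host-plus-path prefixes.
var allowedPrefixes = []string{
	"github.com/andybalholm/redwood",
}

// urlAllowed reports whether raw is an https URL whose host and path
// start with one of the allowed prefixes.
func urlAllowed(raw string) bool {
	u, err := url.Parse(raw)
	if err != nil || u.Scheme != "https" {
		return false
	}
	target := strings.ToLower(u.Hostname()) + u.EscapedPath()
	for _, p := range allowedPrefixes {
		if strings.HasPrefix(target, p) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(urlAllowed("https://github.com/andybalholm/redwood/issues"))
	fmt.Println(urlAllowed("https://github.com/someone/else"))
}
```

A plain prefix match behaves like the trailing .* in the example pattern; a stricter policy could require the prefix to end at a path separator.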

Not an ICAP server

The description says "Web content filter that runs as an ICAP server" but 
looking at the code this actually runs as a proxy, not an ICAP server.

Original issue reported on code.google.com by [email protected] on 16 Aug 2014 at 4:16

Build and install instructions?

Greetings,
I am new to Go. Could you provide a set of build instructions?

I tried the following:

export GOPATH=/path/to/git-checkout-of/redwood/
go get ./

Then I got:
go install: no install location for directory /home/Code-Work/redwood outside GOPATH
For more details see: go help gopath

Not sure how to proceed. Please advise.

Logfile does not show real request?

I am pretty new to Redwood, so I am still trying to understand how it works. While doing that I stumbled on one client request being logged in a way that seems strange to me:

This is the request my client is running:
curl --proxy http://proxy:18081 https://www.youtube.com

In the logfile I can see entries like this:
2021-02-26 15:18:59.682487,192.168.178.68,allow,http://www.youtube.com:443,CONNECT,0,,0,,youtube.com 1,"localbump 500, youtube 7",youtube,,,,HTTP/1.1,,

So if the client requests some HTTPS URL, why does redwood understand HTTP?
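
One possible explanation, offered as an assumption rather than a statement about Redwood's logging code: when a client uses an HTTP proxy for an https:// URL, it first sends a CONNECT request whose target is only host:port, with no scheme. Anything that wants to log that target as a URL has to choose a scheme itself. The sketch below parses such a request with Go's net/http to show what the proxy actually receives; parseRequest is a hypothetical helper.

```go
package main

import (
	"bufio"
	"fmt"
	"net/http"
	"strings"
)

// parseRequest parses a raw HTTP request the way a Go proxy would read it off the wire.
func parseRequest(raw string) (*http.Request, error) {
	return http.ReadRequest(bufio.NewReader(strings.NewReader(raw)))
}

func main() {
	// This is roughly what curl sends to the proxy before any TLS is exchanged.
	raw := "CONNECT www.youtube.com:443 HTTP/1.1\r\nHost: www.youtube.com:443\r\n\r\n"
	req, err := parseRequest(raw)
	if err != nil {
		panic(err)
	}
	// The request carries a method and an authority, but no scheme.
	fmt.Printf("method=%s scheme=%q host=%s\n", req.Method, req.URL.Scheme, req.URL.Host)
}
```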

Domain Rewrite

Ref: #21

Would it be possible to implement a domain-changes capability similar to the query-changes option? This would enable the following types of scenarios:

youtube.com --> restrict.youtube.com
bing.com --> strict.bing.com
www.google.com --> forcesafesearch.google.com (alternate to vss querystring option)
duckduckgo.com --> safe.duckduckgo.com

HTTP/2 is being used for all bumped HTTP/1.1 servers, which breaks connections and the protocol

In my tests I now have a CentOS 7 server running Cockpit on port 9090.
I am using Redwood as a plain HTTP proxy with ssl-bump for all connections.
I have trouble accessing the Cockpit web interface, which is based on WebSockets.
The basic issue is that the connection is bumped blindly into HTTP/2 although the remote server speaks HTTP/1.1.
For many services this works fine, but for WebSocket (wss://) and a couple of other security features the connection breaks.
I do not remember in which project I saw it, but it is possible to check what TLS/HTTP versions the remote server supports before forcing the client into HTTP/2.

What do you think @andybalholm?

Tagging versions

Currently the software has no official version tags.
It would be nice to be able to say that a given revision corresponds to some version.
@andybalholm, if you have a list of the things the software is meant to do, progress toward v1.0 could be expressed against that list as a percentage, from 0% to 100%.

What do you think?

Yahoo.com stylesheets not loading...Is this a proxy issue?

I'm getting a 400 error on all stylesheets when opening http://www.yahoo.com. The website opens fine without Redwood. Have any of you experienced the same?

Content phrases problem

Hi,
I am trying to block pages based on:
< 18+ > 1000

but it blocks all pages that contain just 18 (it seems not to handle the plus sign).
How can I block pages that contain '18+'? (I also tried putting 18+, but it had no effect.)

Thanks

Content injection

Hi,
I'm thinking about adding content injection to this project. I am wondering whether there are any technical reasons why the author doesn't include content injection, or whether it is simply outside the author's use case.

Thanks

SARG support

Can Redwood's log file be formatted for SARG support: https://sourceforge.net/p/sarg/wiki/Logs%20options/

Serve PAC file over HTTPS instead of HTTP

What would it take to have Redwood serve PAC files over HTTPS instead of HTTP? I believe there are issues with some apps on iOS that won't respect all the rules in PAC file if it isn't served securely.

build error from source: too many arguments in call to brotli.NewWriter

You're probably aware of this, but FYI:

# go get -u github.com/andybalholm/redwood
go/src/github.com/andybalholm/redwood/proxy.go:525:34: too many arguments in call to brotli.NewWriter
        have (*bytes.Buffer, brotli.WriterOptions)
        want (io.Writer)

Error in the README example of blockpage value

For quite a while I couldn't find what was causing my setup to generate a bad block page, i.e. empty pages.
In the README the example is:
blockpage "/etc/redwood/block.html"

but it only works like this:
blockpage /etc/redwood/block.html

Without the double quotes.

Enhancement: Source ip acl action

We would like to specify the source ip address used for outbound requests in the ACL, so we are able to distinguish internal systems. Would that be an enhancement worth considering?

example:

acl users 10.0.0.0/24
acl managers 10.0.1.0/24
source 172.217.22.110 users
source 172.217.22.111 managers

Exclude some Https sites from ssl_bump

Hi,

Something in the ACL config file is not working.
When I try this (in acl.conf):

acl connect method CONNECT
acl nobump url dk.com
ssl-bump connect !nobump

everything works well (I get dk.com's real certificate).

But this does not work:

acl connect method CONNECT
acl nobump url /dk/h
ssl-bump connect !nobump

Why can't I use URL regular expressions?
How can I put the sites that need to be excluded from ssl-bump in a list file in the categories dir? (I tried it without success.)

RPM package for CentOS 6 and 7 and DEB for Debian and Ubuntu

Currently there is no binary package for CentOS 6 or 7 and similar distributions, or for Debian.
I have experience with this, and I think I can put packages together and help promote the server into more production environments that need it.

content pruning example

Hi,
I am having trouble getting content pruning to work.
I don't think the examples in the README are still relevant.
A current example of content pruning would be greatly appreciated.
Thanks

Further development in redwood-config categories

Thank you for this project; I have used much of your work in production.
I am wondering how you arrived at the contents of the redwood-config categories.
I would like to develop URL and web content classification based on this project and make it more accurate by doing it automatically. Do you have any suggestions for how to do so?
I am currently doing it manually, but I don't know which keywords or phrases would be suitable to add or how many points to give them, and doing it this way is time-consuming.

Thank you

iptables rules

What are the iptables rules for a Linux router to use the Redwood proxy on a separate machine on the local network?
Thanks

Content phrases support multiple words

Hey,

For now you support word matching like this (a page needs 100 to be blocked):

<game>  10
<sport>  10

If the word "game" appears 5 times in a page, all is OK.
If the word "sport" appears 2 times in another page, all is OK.
Now, if a page contains both "game" and "sport" (in different places), I want to block it.
How can I do that?
This is not supported:

<game>,<sport> 150

Thanks

Integration into NethServer (based on CentOS)

I introduced RedWood to the NethServer community at:
https://community.nethserver.org/t/redwood-filtering-proxy-server/6714/14

It has good potential and can be used in many deployments.
The CentOS 7 package can be used on NethServer, but someone there needs to put some time into integrating it with the current system web UI.

The first step would be to be able to enable\disable and stop\start\reconfigure the service.
The next step would be to add a DNAT\REDIRECT option with bypass Interception for specific ip addresses or domains.

Then there is more but I'm not there yet.

Craigslist.org very slow to load

I am noticing that whenever I try to access Craigslist through Redwood, it takes a really long time: over 2 minutes to fully load a page. The issue appears to be related to loading the JavaScript and CSS assets.

I am attaching a screenshot from developer tools to show what I mean.

What can I do to get around this?

Update from config files

Hey,

Can we reload updated config files without restarting the application (like reconfigure in Squid)?

[Enhancement] Project Structure

I have inspected your project and found the current structure somewhat confusing and hard to scale.
Do you have any plans to refactor the current code toward a package-oriented Go structure?
Moreover, do you plan to provide a way to create, read, update, and delete configuration and store it in a database?

dns server

Is there a simple way I can set the dns server that redwood should use?
Thanks
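
For reference, a Go program can direct its lookups to a specific DNS server by supplying a custom net.Resolver. This is a minimal sketch of that standard-library mechanism, not a Redwood configuration option; resolverFor is a hypothetical helper and 192.0.2.53:53 is a placeholder documentation address.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"time"
)

// resolverFor returns a resolver that sends every DNS query to server
// (given as "host:port") instead of the system-configured servers.
func resolverFor(server string) *net.Resolver {
	return &net.Resolver{
		PreferGo: true, // use Go's built-in resolver so that Dial is honored
		Dial: func(ctx context.Context, network, _ string) (net.Conn, error) {
			d := net.Dialer{Timeout: 5 * time.Second}
			return d.DialContext(ctx, network, server)
		},
	}
}

func main() {
	r := resolverFor("192.0.2.53:53")
	fmt.Println("custom resolver configured:", r.PreferGo && r.Dial != nil)
}
```

Such a resolver can be used directly, for example with r.LookupHost(ctx, "example.com"), or attached to outbound connections through the Resolver field of net.Dialer. Outside the program, pointing the host's resolver configuration at the desired server achieves the same effect.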

iptables typo?

In your README iptables command, I believe it should be

iptables -t nat -A PREROUTING -i eth1 -p tcp --dport 80 -j REDIRECT --to-ports 6502
iptables -t nat -A PREROUTING -i eth1 -p tcp --dport 443 -j REDIRECT --to-ports 6510

HTTPS

Hi,
I am having trouble configuring redwood for https.
For every https request the logs show:
2017-09-28 19:11:27,192.168.1.59,,,"error reading client hello: expected content type of 22, got 67",
What is content type 67? Any ideas?

Thanks
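
A note on the numbers in that message: 22 is the TLS record content type for handshake messages, and 67 is the ASCII code for the letter C. One plausible reading, stated as an assumption, is that the client sent plaintext such as "CONNECT host:443 HTTP/1.1" to a port that expects raw TLS, which would happen if a browser were configured to use the transparent HTTPS port as an explicit proxy. The sketch below classifies a first byte along those lines; describeFirstByte is a hypothetical helper.

```go
package main

import "fmt"

// tlsHandshake is the TLS record content type for handshake messages.
const tlsHandshake = 22

// describeFirstByte gives a rough guess at what kind of traffic starts with b.
func describeFirstByte(b byte) string {
	switch {
	case b == tlsHandshake:
		return "TLS handshake record"
	case b >= 'A' && b <= 'Z':
		return fmt.Sprintf("plaintext, starts with %q (likely an HTTP method)", b)
	default:
		return fmt.Sprintf("unrecognized first byte %d", b)
	}
}

func main() {
	fmt.Println(describeFirstByte(22))
	fmt.Println(describeFirstByte(67))
}
```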

Url rewrite

Hey,

In Squid, for example, I used "URL_Rewrite" to change the user's URL, so that
when they try to go to "www.google.com" I redirect them to 'bing.com'.
How can I redirect users here?

Server buffer size definition: does it exist?

Hey @andybalholm,
I have been using Squid-Cache and Redwood for quite some time, and I would like to know whether something like Squid-Cache's read_ahead_gap exists in Redwood.
I use it because I have a server in a server farm with unrestricted, unlimited traffic, and a DSL line connected to that server farm.
When I contact many sites directly, I get stuck because of a per-client connection limit and also because of a QoS system for the DSL clients.
But the connection to the server farm, which terminates the DSL PPPoE connection, is not restricted in any way; I can utilize 100% of this segment.
With Squid-Cache I am using "read_ahead_gap 16MB" and the server does the heavy lifting for me.
Does something like this exist, or can it be added to Redwood?

Blocking categories for specific users

I'm a little confused as to how you can block certain categories of sites based on the username. For example, I would like to create a policy that blocks all image searches by certain users while allowing them for others. I tried changing the action to acl under categories/image-search, but how can I apply this to a specific username?

Cross Platform build script: Resolved, look inside

I wrote a nice script that builds the binary for:
Windows
Linux (ARM included)
the BSDs
Solaris
Darwin

Can we somehow pull it into the repo?
Or should I post it as a gist?

Replace Content inside scanned Page

Hey,

Is there a way to replace HTML code inside a scanned page?
For now we have "pruning.conf", which can only remove content from a page.
For example, can it work like Dansguardian's contentregexplist:

-> <script language='javascript'>....some code.... Or: .*car -> new Text

iCloud block on iOS

I setup Redwood on an ubuntu server and it works on my devices, including iOS, but I'm receiving a certificate error, which is blocking iCloud. Is there any extra configuration that needs to be done to fix this?

Will it be possible to integrate Tproxy interception?

I wrote a tiny example for Linux tproxy usage at: https://github.com/elico/go-linux-tproxy
And was wondering if we can somehow add tproxy support for redwood?

android app ssl certificate

I am getting this type of error when some android apps try to connect with the proxy.
2017-10-11 00:28:53,192.168.1.90,slack.com,slack.com:443,error in handshake with client: remote error: tls: unknown certificate,cached certificate
Does this mean the app is not trusting my CA? i.e. it has a pinned CA?
Thanks

Captive Portal Redirect

I have noticed that when I connect to a network with a captive portal I do not get a splash page allowing me to authenticate and am thus unable to browse the web. This happens on my iPhone as well as my laptop. I am aware that this is not necessarily an issue with Redwood but with proxies in general and was wondering if you knew of a solution.

YouTube urls and links classification

Part of the logic of my classification DB, SquidBlocker, is YouTube-related.
To allow blocking URLs based on external classification pages, I will write a daemon that receives YouTube URLs and returns a category weight inside a JSON document.

Before getting to YouTube, I will write a tiny classification service that uses the drbl-peer (https://github.com/elico/drbl-peer/) library and only checks for malware and offensive or abusive content, i.e. porn, nudity, and violence.

Step one: write a hostname logger daemon for the URL "http://service/vote?url=http://google.com/file.js"
Step two: advance the logic to test for a specific category at an endpoint like "http://service/abusive/vote?url=http://google.com/file.js"
Step three: create an endpoint keyed by a score, as in spam filtering, like: "http://service/128/vote?url=http://google.com/file.js"
... a couple of other steps.

128 is the test for both abusive and malware content (phishing is considered abusive) while not testing for other categories.

Let me know if it sounds right.

YouTube Restrict

Hey,

To enforce Google SafeSearch I added this line to the safesearch.conf file:

/google/d safe=active

and it works very well.

How can I enforce safe search for YouTube, which says to add this:

YouTube-Restrict: Strict

in the header (not at the end of the URL)?
