andybalholm / redwood
Web content filter that runs as an HTTP proxy
License: BSD 2-Clause "Simplified" License
In Squid I can use a single pattern for one domain or for many: for example, .microsoft.com matches both microsoft.com and all of its subdomains.
A destination IP ACL is also missing for me; would it be possible to add an option for that in some way?
Is there such an option in RedWood?
(I am willing to put some effort into it.)
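To make the leading-dot behaviour described above concrete, here is a minimal Python sketch of that matching rule. It only illustrates the semantics the question asks for; the function name `domain_matches` is made up for this example and is not part of Squid or Redwood.

```python
def domain_matches(host, pattern):
    """Return True if host matches a leading-dot style domain pattern.

    A pattern with a leading dot (".microsoft.com") matches the domain
    itself and every subdomain; a pattern without one matches only the
    exact host name.
    """
    host = host.lower().rstrip(".")
    pattern = pattern.lower()
    if pattern.startswith("."):
        base = pattern[1:]
        # Require a dot boundary so "notmicrosoft.com" does not match.
        return host == base or host.endswith("." + base)
    return host == pattern
```

Checking for the dot boundary rather than a bare suffix is the important detail: a plain `endswith("microsoft.com")` would also accept unrelated domains that merely end in the same letters.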
Hey,
I am testing the Redwood filter on port 80 only, in transparent mode, under heavy traffic (CentOS 7).
After some time (20 min) I get this in my messages log:
systemd: redwood.service: main process exited, code=exited, status=2/INVALIDARGUMENT
systemd: Unit redwood.service entered failed state.
systemd: redwood.service failed.
And if I try to restart it, it fails again. Why?
Quick question: my /etc/redwood/pruning.conf file has this line, but the "More information..." link still shows up when I browse to http://example.com/
example.com a
Thoughts?
Hi,
It seems that the regex implementation doesn't support negative lookbehind/lookahead syntax.
See here for more info:
Regex Tutorial - Lookahead and Lookbehind Zero-Length Assertions
Below is an example of how I would like to use it (the ?! part is the negative lookahead):
^(.*\.)*((?!font[s]*).)*\.googleapis\.com$
This matches anything under the googleapis.com domain, except if it is fonts.googleapis.com.
This would make it easier to block just one subdomain instead of whitelisting many, if not all, possible subdomains.
It is not a deal-breaker, but it makes whitelisting easier from a regex perspective.
See some regex examples I use here:
https://github.com/cbuijs/accomplist/blob/master/chris/regex.white
https://github.com/cbuijs/accomplist/blob/master/chris/tld-black.regex
The second one is interesting, as it blocks anything except valid TLDs registered with IANA (it is a blacklist because it negates).
Cheers,
-Chris
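On the lookahead request above: Redwood is written in Go, and Go's standard regexp package follows RE2 syntax, which has no lookahead or lookbehind. If Redwood relies on that package, one workaround is to split the rule into a broad pattern plus an exception pattern, neither of which needs lookaround. The sketch below shows the idea in Python; the names `ANY_GOOGLEAPIS`, `FONTS_EXCEPTION` and `is_matched` are invented for the example, and how two such rules would be expressed in Redwood's own configuration is not shown here.

```python
import re

# Both patterns avoid lookahead/lookbehind, so the same expressions are
# also valid in RE2-style engines.
ANY_GOOGLEAPIS = re.compile(r"^(.*\.)?googleapis\.com$")
FONTS_EXCEPTION = re.compile(r"^(.*\.)?fonts?\.googleapis\.com$")


def is_matched(host):
    """Match googleapis.com hosts, but not font(s).googleapis.com."""
    host = host.lower()
    if FONTS_EXCEPTION.match(host):
        return False
    return ANY_GOOGLEAPIS.match(host) is not None
```

The exception is checked first, so the broad pattern never has to encode the "anything but fonts" condition itself.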
I want to set up redwood so that it transparently filters all the outgoing traffic on my Mac.
For my attempt, I copied the default configuration and (I hope correctly) created the root certificate and key. After this, redwood was run as user nobody and the firewall (PF) was configured to redirect all outgoing traffic not arising from user nobody to the redwood ports.
This configuration works for http, but for each https request the redwood logs show something like:
2018-03-14 02:34:03.817397,192.168.1.6,,127.0.0.1:6510,infinite redirect loop
Here is the PF configuration file I am using.
rdr pass inet proto tcp from any to any port = 80 -> 127.0.0.1 port 6502
rdr pass inet proto tcp from any to any port = 443 -> 127.0.0.1 port 6510
pass out route-to (lo0 127.0.0.1) inet proto tcp from any to any port = 80 user != nobody
pass out route-to (lo0 127.0.0.1) inet proto tcp from any to any port = 443 user != nobody
These rules use a workaround mentioned here, because the redirect command only applies to incoming traffic, but that may or may not be very relevant here.
Even without going into the specific details, would such an approach work?
EDIT: By the way, awesome project :)
I have seen this blacklist:
https://github.com/maravento/blackweb
And was wondering, what would be the best way to use blackweb lists with RedWood?
I noticed that sometimes a request is answered with the text "EOF".
In the logs I found something like:
2016/08/26 03:01:25 error fetching http://www.vassalengine.org/: EOF
Is it a known issue?
I can't log in to santanderbank.com through the redwood proxy. I don't get an error; after I enter my password, it just takes me back to the home page. I confirmed that I can log in without the proxy.
I tried with a minimal redwood setup.
redwood.conf
http-proxy :6502
acls /etc/redwood/acls.conf
tls-cert /etc/redwood/root.pem
tls-key /etc/redwood/root_key.pem
acls.conf
acl connect method CONNECT
ssl-bump connect
Just reporting the issue.
Thanks
Hey,
I wrote https://github.com/elico/drbl-peer and I wanted to add redwood support for it.
I can try to contribute the code for it via a fork and then a pull request, but I will need a couple of pointers.
The basic setup is based on a single text file which will contain the relevant details.
The caching can be done using a caching DNS server and/or an HTTP caching proxy, so there is no need to implement caching of results inside redwood.
I think that a simple "if file exists" then "use" the function with the file settings would be the most appropriate.
The drbl peers list can contain a custom DB or a public one like OpenDNS or Symantec.
Is it possible to do per user content pruning? Let's say I have a user that wants to have all images removed from a number of sites. Is there a way that I can do this and apply it to a specific user or group of users or is there only a global policy?
Hey,
How can I get client certificate details like "Issued By" (to see if it is my self-signed certificate) in TLS.GO (before tlsConn.Handshake())?
Thanks,
We are looking for a proxy to filter outbound connections from our servers based upon a URL whitelist. We thought about running a combination of iptables and Redsocks on our servers, which redirects outbound connections to a proxy for filtering (as done by Spotify).
This still poses problems, as adding something like github.com to the whitelist exposes quite a lot of content to those servers. We would like to filter based upon the requested URL, such as https://github.com/andybalholm/redwood.*, especially for bumped TLS traffic, as most of the destinations are encrypted already.
Is this something which is possible with redwood or worth considering as a future feature?
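As a rough illustration of the URL-level whitelist being asked for, the Python sketch below checks both the host and a path prefix. It is not Redwood configuration or Redwood code: `WHITELIST` and `url_allowed` are hypothetical names, and the check assumes the proxy can actually see the path, which for HTTPS is only the case when the TLS connection is bumped.

```python
from urllib.parse import urlsplit

# Hypothetical whitelist: (host, path prefix) pairs.
WHITELIST = [("github.com", "/andybalholm/redwood")]


def url_allowed(url):
    """Allow only HTTPS URLs whose host and path prefix are whitelisted."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        return False
    host = (parts.hostname or "").lower()
    for allowed_host, prefix in WHITELIST:
        if host == allowed_host and parts.path.startswith(prefix):
            return True
    return False
```

Like the pattern https://github.com/andybalholm/redwood.* in the question, a plain prefix test also admits sibling paths that merely start with the same characters; appending a trailing slash to the prefix would tighten that.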
The description says "Web content filter that runs as an ICAP server" but
looking at the code this actually runs as a proxy, not an ICAP server.
Original issue reported on code.google.com by [email protected] on 16 Aug 2014 at 4:16
Greetings,
I am new to Go. Could you provide a set of build instructions?
I tried the following:
export GOPATH=/path/to/git-checkout-of/redwood/
go get ./
Then I got:
go install: no install location for directory /home/Code-Work/redwood outside GOPATH
For more details see: go help gopath
Not sure how to proceed. Please advise.
I am pretty new to redwood, so I am still trying to understand the mechanism. While doing that, I stumbled on a client request that is printed to the log in a way that looks strange to me.
This is the request my client is running:
curl --proxy http://proxy:18081 https://www.youtube.com
In the logfile I can see entries like this:
2021-02-26 15:18:59.682487,192.168.178.68,allow,http://www.youtube.com:443,CONNECT,0,,0,,youtube.com 1,"localbump 500, youtube 7",youtube,,,,HTTP/1.1,,
So if the client requests some HTTPS URL, why does redwood understand HTTP?
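For anyone who wants to inspect such log entries programmatically, the line above is comma-separated with one quoted field, so a CSV parser splits it cleanly. The Python sketch below only prints the fields by position; it deliberately assigns no names to the columns, since their meaning is not documented in this thread, and it does not answer why the URL is shown with an http:// scheme.

```python
import csv
import io

# The log line quoted above, copied verbatim.
LINE = ('2021-02-26 15:18:59.682487,192.168.178.68,allow,'
        'http://www.youtube.com:443,CONNECT,0,,0,,youtube.com 1,'
        '"localbump 500, youtube 7",youtube,,,,HTTP/1.1,,')


def parse_log_line(line):
    """Split one CSV log line into fields, honoring quoted commas."""
    return next(csv.reader(io.StringIO(line)))


fields = parse_log_line(LINE)
for index, value in enumerate(fields):
    print(index, repr(value))
```

Using a real CSV reader instead of `line.split(",")` matters here because the quoted field contains a comma of its own.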
Ref: #21
Would it be possible to implement a domain-changes capability similar to the query-changes option? This would enable the following types of scenarios:
In my tests I now have a CentOS 7 server which runs Cockpit on port 9090.
I am using RedWood with ssl-bump for all connections as a plain http proxy.
I have trouble accessing the cockpit web interface which is based on web sockets.
The basic issue is that the remote host is being bumped blindly into HTTP/2 but the remote server is using HTTP/1.1.
For many services it works fine, but for WebSocket (wss://) and a couple of other security features the connection breaks.
I do not remember in which other project I saw the source, but it is possible to check the remote server's TLS/HTTP support before forcing the client into HTTP/2.
What do you think @andybalholm?
Currently there is no official version tagging for the software.
If it is possible to mark a revision with a version number, that would be nice.
@andybalholm, if you have a list of things that the software should do, then progress toward v1.0 could be expressed as a percentage from 0% to 100% in some fashion.
What do you think?
I'm getting a 400 error on all stylesheets when opening http://www.yahoo.com. The website opens fine without Redwood, have any of you guys experienced the same?
Hi,
I am trying to block pages based on:
< 18+ > 1000
but it blocks all pages that contain only 18 (it seems not to handle the plus sign).
How can I block pages that contain '18+'? (I also tried 18+, but with no effect.)
Thanks
Hi,
I'm thinking about adding content injection to this project. I am wondering whether there is any technical reason why the author doesn't include content injection, or whether it is just not within the author's use case.
Thanks
Can Redwood's log file be formatted for SARG support: https://sourceforge.net/p/sarg/wiki/Logs%20options/
What would it take to have Redwood serve PAC files over HTTPS instead of HTTP? I believe there are issues with some apps on iOS that won't respect all the rules in PAC file if it isn't served securely.
You're probably aware of this, but FYI:
# go get -u github.com/andybalholm/redwood
go/src/github.com/andybalholm/redwood/proxy.go:525:34: too many arguments in call to brotli.NewWriter
have (*bytes.Buffer, brotli.WriterOptions)
want (io.Writer)
For quite a while I couldn't find what was causing my setup to generate a bad block page, i.e. an empty page.
In the README the example is:
blockpage "/etc/redwood/block.html"
but it only works like this:
blockpage /etc/redwood/block.html
Without the double quotes.
We would like to specify the source ip address used for outbound requests in the ACL, so we are able to distinguish internal systems. Would that be an enhancement worth considering?
example:
acl users 10.0.0.0/24
acl managers 10.0.1.0/24
source 172.217.22.110 users
source 172.217.22.111 managers
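The proposal above amounts to a lookup from client network to outbound source address. A minimal Python sketch of that lookup follows; the `source` directive is the poster's suggestion rather than an existing option, and the names `ACLS`, `SOURCES` and `outbound_source` are made up for the illustration.

```python
import ipaddress

# Client networks from the example above, keyed by ACL name.
ACLS = {
    "users": ipaddress.ip_network("10.0.0.0/24"),
    "managers": ipaddress.ip_network("10.0.1.0/24"),
}

# Proposed outbound source address per ACL (hypothetical feature).
SOURCES = {
    "users": "172.217.22.110",
    "managers": "172.217.22.111",
}


def outbound_source(client_ip):
    """Return the source address to use for a client, or None."""
    addr = ipaddress.ip_address(client_ip)
    for name, network in ACLS.items():
        if addr in network:
            return SOURCES.get(name)
    return None
```

A proxy would then bind its outgoing socket to the returned address before connecting; in Python that is what the `source_address` argument of `socket.create_connection` is for.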
Hi,
There is an issue with the ACL config file.
When I try this (in acl.conf):
acl connect method CONNECT
acl nobump url dk.com
ssl-bump connect !nobump
everything works well (I get the real dk.com certificate).
But this does not work:
acl connect method CONNECT
acl nobump url /dk/h
ssl-bump connect !nobump
Why can't I use URL regular expressions?
How can I put the sites that need to be excluded from ssl-bump in a list file in the categories dir? (I tried it without success.)
Currently there is no binary package for CentOS 6 or 7 (or similar distributions), or for Debian.
I have experience with this, and I think I can put packages together to help promote the server into more production environments that need it.
Hi,
I am having trouble getting content pruning working.
I don't think the examples in the readme are still relevant.
If you could give me a current example of content pruning, it would be greatly appreciated.
Thanks
Thank you for this project; I have used much of your work in production.
I am wondering how you came up with the results in the redwood-config categories?
I would like to develop URL and web content classification based on this project and make it more accurate by doing it automatically. Do you have any suggestions for doing so?
I am currently doing it manually, but I don't know which keywords or phrases would be suitable to add or how many points to give, and doing it this way is time-consuming.
Thank you
What are the iptables rules for a Linux router to use the redwood proxy on a separate machine on the local network?
Thanks
Hey,
For now you support word matching like this (a page needs 100 points to be blocked):
<game> 10
<sport> 10
If the word "game" appears 5 times in a page, all is OK.
If the word "sport" appears 2 times in another page, all is OK.
Now, if a page contains both "game" and "sport" (in different places on the page), I want to block it.
How can I do that?
This is not supported:
<game>,<sport> 150
Thanks
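The request above can be pictured with a small scoring model in Python. This is a simplified sketch, not Redwood's actual scoring algorithm: it assumes each occurrence of a word adds its points, that a page is blocked at 100 points, and it adds a hypothetical `COMBO_POINTS` rule that fires when all listed words appear somewhere on the same page.

```python
import re

THRESHOLD = 100  # points needed to block a page, as stated above
PHRASE_POINTS = {"game": 10, "sport": 10}
# Hypothetical combined rule: bonus when every listed word is present.
COMBO_POINTS = {("game", "sport"): 150}


def count_word(text, word):
    """Count whole-word, case-insensitive occurrences of word."""
    return len(re.findall(r"\b" + re.escape(word) + r"\b", text.lower()))


def page_score(text):
    """Sum per-occurrence points plus any combined-rule bonuses."""
    score = sum(count_word(text, w) * pts for w, pts in PHRASE_POINTS.items())
    for words, pts in COMBO_POINTS.items():
        if all(count_word(text, w) > 0 for w in words):
            score += pts
    return score


def is_blocked(text):
    return page_score(text) >= THRESHOLD
```

With these numbers, five occurrences of "game" score 50 and two of "sport" score 20, so both pages pass, while a page with one of each scores 170 and is blocked.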
I introduced RedWood to the NethServer community at:
https://community.nethserver.org/t/redwood-filtering-proxy-server/6714/14
It has good potential and can be used in many deployments.
The CentOS 7 package can be used on NethServer, but someone there needs to put some time into integration with the current system web UI.
The first step would be the ability to enable/disable and stop/start/reconfigure the service.
The next step would be to add a DNAT/REDIRECT option with interception bypass for specific IP addresses or domains.
Then there is more but I'm not there yet.
I am noticing that whenever I try to access Craigslist through redwood it takes a really long time: over 2 minutes to fully load a page. The issue appears to be related to loading the JavaScript and CSS assets.
I am attaching a screenshot from developer tools to show what I mean.
What can I do to get around this?
Hey,
Can we reload updated config files without restarting the application (like reconfigure in Squid)?
I have inspected your project and found that the current structure is somewhat confusing and hard to scale.
Do you have any plans to refactor the current code toward a package-oriented Go structure?
Moreover, do you plan to provide a way to CRUD the config and store it in a DB?
Is there a simple way I can set the dns server that redwood should use?
Thanks
In your README iptables command, I believe it should be
iptables -t nat -A PREROUTING -i eth1 -p tcp --dport 80 -j REDIRECT --to-ports 6502
iptables -t nat -A PREROUTING -i eth1 -p tcp --dport 443 -j REDIRECT --to-ports 6510
Hi,
I am having trouble configuring redwood for https.
For every https request the logs show:
2017-09-28 19:11:27,192.168.1.59,,,"error reading client hello: expected content type of 22, got 67",
What is content type 67? Any ideas?
Thanks
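On the question above about content type 67: one possible reading, not confirmed anywhere in this thread, comes from the numbers themselves. The TLS record layer uses content type 22 for handshake messages, and 67 is the ASCII code for the letter C, so the first byte may have come from a plaintext request line (for example one beginning with CONNECT) arriving on a port that expects TLS. The Python sketch below classifies a first byte along those lines; `describe_first_byte` is a name invented for this example.

```python
TLS_HANDSHAKE = 22  # TLS record content type for handshake messages


def describe_first_byte(data):
    """Guess what kind of traffic starts with the given bytes."""
    if not data:
        return "empty"
    first = data[0]
    if first == TLS_HANDSHAKE:
        return "TLS handshake record"
    if 32 <= first < 127:
        # Printable ASCII: likely a plaintext protocol such as HTTP.
        return "plaintext starting with %r" % chr(first)
    return "unknown content type %d" % first
```

If that reading applies, the thing to check would be whether clients are configured to use the transparent HTTPS port as an explicit proxy port.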
Hey,
In Squid, for example, I used "URL_Rewrite" to change the user's URL, so that
when he tries to go to "www.google.com" I redirect him to 'bing.com'.
How can I redirect users here?
Hey @andybalholm,
I have been using Squid-Cache and RedWood for quite some time and I would like to know if something like: read_ahead_gap of Squid-Cache exists in RedWood.
I am using it here since I have a server in an unrestricted and unlimited traffic zone in a server farm and a DSL line connected to this server farm.
When I contact many sites directly, I get stuck because of a per-client connection limit and also because of some QoS system for the DSL clients.
But the connection to the server farm which terminates the DSL PPPoE connection is not restricted by any means; I can utilize 100% of this segment.
With Squid-Cache I am using "read_ahead_gap 16MB" and the server does the heavy lifting for me.
Does something like this exist, or can it be added to RedWood?
I'm a little confused as to how you can block certain categories of sites based on the username. For example, I would like to create a policy that blocks all image searches by certain users while allowing it for others. I tried changing the action to acl under categories/image-search, but how can I apply this to a specific username?
I wrote a nice script that builds the binary for:
Windows
Linux (ARM included)
BSDs
Solaris
Darwin
Can we somehow pull it into the repo?
Or should I post it as a gist?
Hey,
Is there a way to replace HTML code inside a scanned page?
For now we have "pruning.conf", which can only remove content from a page.
For example, can it do what Dansguardian does (contentregexplist):
I set up Redwood on an Ubuntu server and it works on my devices, including iOS, but I'm receiving a certificate error, which is blocking iCloud. Is there any extra configuration that needs to be done to fix this?
I wrote a tiny example for Linux tproxy usage at: https://github.com/elico/go-linux-tproxy
And was wondering if we can somehow add tproxy support for redwood?
I am getting this type of error when some android apps try to connect with the proxy.
2017-10-11 00:28:53,192.168.1.90,slack.com,slack.com:443,error in handshake with client: remote error: tls: unknown certificate,cached certificate
Does this mean the app is not trusting my CA? i.e. it has a pinned CA?
Thanks
I have noticed that when I connect to a network with a captive portal I do not get a splash page allowing me to authenticate and am thus unable to browse the web. This happens on my iPhone as well as my laptop. I am aware that this is not necessarily an issue with Redwood but with proxies in general and was wondering if you knew of a solution.
Part of the logic of my classification DB, SquidBlocker, is YouTube-related.
To allow blocking URLs based on external classification pages, I will write a daemon that receives YouTube URLs and returns the weight of a category as JSON.
Before getting to YouTube, I will write a tiny classification service that uses the drbl-peer (https://github.com/elico/drbl-peer/) library and only checks for malware and offensive content, i.e. porn, nudity and violence.
128 is the test for both abusive and malware content(phishing is considered abusive) while not testing for other categories.
Let me know if it sounds right.
Hey,
To enforce Google SafeSearch I added this line to the safesearch.conf file:
/google/d safe=active
and it works very well.
How can I enforce safe search for YouTube, which says to add this:
YouTube-Restrict: Strict
in the header (not at the end of the URL)?
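The two mechanisms mentioned above differ in where the setting travels: one rewrites the query string, the other adds a request header. The Python sketch below shows both transformations in isolation. It is not Redwood code or configuration; `add_query_param` and `add_restrict_header` are names chosen for the illustration, and whether Redwood can add request headers is exactly the open question here.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_query_param(url, key, value):
    """Return url with key=value set in its query string."""
    parts = urlsplit(url)
    # Drop any existing value for the key so the enforced one wins.
    query = [(k, v)
             for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != key]
    query.append((key, value))
    return urlunsplit(parts._replace(query=urlencode(query)))


def add_restrict_header(headers):
    """Return a copy of the request headers with YouTube-Restrict set."""
    updated = dict(headers)
    updated["YouTube-Restrict"] = "Strict"
    return updated
```

For example, `add_query_param("https://www.google.com/search?q=redwood", "safe", "active")` yields the same kind of URL that the safe=active line produces, while the header variant leaves the URL untouched.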