Comments (9)
I can see your point about the double-encoding issues.
First, excuse me for not being able to disclose how Yahoo implements this internally.
But let's take one of the most popular open-source projects, WordPress, as an example. From https://github.com/WordPress/WordPress/search?utf8=%E2%9C%93&q=rawurlencode, you can see that rawurlencode() (i.e., http://php.net/manual/en/function.rawurlencode.php, PHP's RFC 3986 percent-encoder, closest in behavior to JavaScript's encodeURIComponent()) is used where the URL is about to be concatenated into the HTML output. And occasionally, urldecode() runs right before that. It's inapparent to me that rawurlencode() is placed before DB calls. A modern DB can properly store non-alphanumeric characters.
from xss-filters.
It's inapparent to me that rawurlencode() is placed before DB calls. A modern DB can properly store non-alphanumeric characters.
It's not obvious why one should do that.
Not to mention that WordPress is an example of terrible coding practices, which one should never use as an argument.
from xss-filters.
It's inapparent to me that rawurlencode() is placed before DB calls. A modern DB can properly store non-alphanumeric characters.
It's not obvious why one should do that.
@zerkms, as long as the DB can store those non-alphanumeric characters, you agree that rawurlencode() is not needed before data is stored at rest, right?
I concur that WordPress might not be a good enough example. So, let's take a look at the following one.
In general, the DB should store the raw text from users, e.g. http://example.com/你好, which will then be encoded using encodeURI() or encodeURIComponent() depending on the output context.
- In <a href="/redirect?url={{url}}">, encodeURIComponent('http://example.com/你好') is needed, as in uriComponentInDoubleQuotedAttr()
- In <a href="{{url}}">, encodeURI('http://example.com/你好') is needed, as in uriInDoubleQuotedAttr()
If input filtering using encodeURI() is applied to the URL before it is saved into the DB, what is stored in the DB will be http://example.com/%E4%BD%A0%E5%A5%BD. Imagine that @bitinn's suggestion is followed, i.e., the encodeURI() and encodeURIComponent() calls are removed from the output filters. One could expect to see something like <a href="/redirect?url=http://example.com/%E4%BD%A0%E5%A5%BD">, which is undesirable, because the correctly encoded output should be <a href="/redirect?url=http%3A%2F%2Fexample.com%2F%E4%BD%A0%E5%A5%BD">.
So, unfortunately, @bitinn may need to work around the problem by running decodeURI() on everything that was previously encoded before using the context-aware output filters, e.g., uriInDoubleQuotedAttr(decodeURI(url)).
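To make the difference between the two output contexts concrete, here is a minimal sketch using only the built-in encodeURI(), encodeURIComponent() and decodeURI(). The variable names are illustrative, and the built-ins only stand in for the encoding step of the xss-filters functions named above, which may do more than this.

```javascript
// Raw text as it should be stored in the DB.
const raw = 'http://example.com/你好';

// Context <a href="{{url}}">: the whole URL is the attribute value.
const asUri = encodeURI(raw);

// Context <a href="/redirect?url={{url}}">: the URL is one query parameter.
const asComponent = encodeURIComponent(raw);

// If the URL was already passed through encodeURI() before being stored,
// encoding it again at output escapes the '%' signs a second time.
const stored = encodeURI(raw);
const doubleEncoded = encodeURIComponent(stored);

// Decoding the stored value first restores the expected output.
const repaired = encodeURIComponent(decodeURI(stored));

console.log(asUri);
console.log(asComponent);
console.log(doubleEncoded);
console.log(repaired === asComponent);
```

The double-encoded value contains %25E4 instead of %E4, so a receiver that decodes once ends up with a still-escaped path rather than 你好.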
from xss-filters.
@adon-at-work just want to clarify one main point: the raw user input is http://example.com/%E4%BD%A0%E5%A5%BD, because browsers automatically escape the URL when the user copies it from the address bar.
So you mean it's better to do decodeURI on input than on output, in general?
from xss-filters.
Yes, something like uriInDoubleQuotedAttr(decodeURI(url)) will ensure a correct output.
from xss-filters.
To further elaborate, I guess you're asking which of the following is preferable:
- when collecting input, url = decodeURI(url); saveDB(url). At output, uriInDoubleQuotedAttr(url)
- when collecting input, saveDB(url). At output, uriInDoubleQuotedAttr(decodeURI(url))
Surely, both will work. The first approach is better since decodeURI() runs only once per 'save', and does not need to run every time the page is visited.
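A rough sketch of the first approach follows. Everything named here is hypothetical: db is an in-memory Map standing in for the database, saveUrl() plays the role of saveDB(), and hrefValue() uses the built-in encodeURI() only as a stand-in for the encoding step of uriInDoubleQuotedAttr(), so it is not a security filter on its own. The try/catch is an added assumption, since decodeURI() throws a URIError on malformed escapes.

```javascript
// Hypothetical in-memory stand-in for the database.
const db = new Map();

// Normalize once, at input time, so the DB holds the raw (decoded) URL.
function normalizeUrl(input) {
  try {
    return decodeURI(input);
  } catch (err) {
    // decodeURI() throws URIError on malformed escapes such as a lone '%';
    // in that case keep the input unchanged.
    if (err instanceof URIError) return input;
    throw err;
  }
}

// Plays the role of saveDB(url) from the comment above.
function saveUrl(id, input) {
  db.set(id, normalizeUrl(input));
}

// Stand-in for the encoding step of an output filter; NOT a full XSS filter.
function hrefValue(id) {
  return encodeURI(db.get(id));
}

// A URL copied from the address bar and one typed by hand end up identical.
saveUrl('copied', 'http://example.com/%E4%BD%A0%E5%A5%BD');
saveUrl('typed', 'http://example.com/你好');
console.log(db.get('copied') === db.get('typed'));
console.log(hrefValue('copied'));
```

One caveat with this round trip: decodeURI() leaves escapes of reserved characters such as %2F untouched, so encoding such a value again at output turns it into %252F. Inputs that rely on escaped reserved characters would need separate handling.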
from xss-filters.
@adon-at-work cheers! On a semi-related note: we use xss-filters with virtual-dom and find that in many instances we have to decode the xss-filtered output again for virtual-dom to output properly, e.g.:
var input = '> <';
var filtered = filter.inHTMLData(input); // > &lt;
var vdom = h('div', filtered) // > &amp;lt;
So we end up needing to decode again before step 3, because on the client side virtual-dom eventually generates a text node using document.createTextNode, which auto-encodes &. On the server we follow the same standard.
TL;DR: in many vdom solutions the encoding is done automatically, leading to some performance loss when using xss-filters (nothing serious, but annoying enough).
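The double encoding described here can be reproduced without a DOM. In this sketch both functions are hypothetical stand-ins: inHTMLDataLike() assumes that the HTML-data filter encodes only '<', and serializeTextNode() imitates how the content of a text node is escaped when it is serialized to HTML. Neither is the actual xss-filters or virtual-dom code.

```javascript
// Hypothetical stand-in for the HTML-data filter; assumed to encode only '<'.
function inHTMLDataLike(s) {
  return String(s).replace(/</g, '&lt;');
}

// Stand-in for serializing a text node to HTML: '&', '<' and '>' are escaped.
function serializeTextNode(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

const input = '> <';

// Filtering first and then letting the renderer escape again double-encodes.
const filteredThenRendered = serializeTextNode(inHTMLDataLike(input));

// Passing the raw string to the renderer escapes exactly once.
const renderedOnly = serializeTextNode(input);

console.log(filteredThenRendered);
console.log(renderedOnly);
```

The first result contains &amp;lt;, which a browser displays as the literal text &lt; instead of <. That is why the filtered output has to be decoded again when the renderer already escapes text.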
from xss-filters.
I will open an issue on virtual-dom; closing for now.
from xss-filters.
So we end up needing to decode again before step 3
XSS filters are always error-prone if they are not applied at the last step before output.
on the client side virtual-dom eventually generates a text node using document.createTextNode
No XSS filters are needed if document.createTextNode() is used. But I don't know virtual-dom well enough to tell whether it is used 100% of the time. You may like to confirm with them.
from xss-filters.