hillbrad / sri-addressable-caching
How to do Subresource Integrity addressable caching.
I lost the link to the output document. Maybe add a link from the README or from the GitHub top bar?
Would it be possible to mitigate origin laundering if each SRI-cached resource also carried a list of the URLs it was previously fetched from?
That way, CSP could be validated against that list of URLs, and if none of them passes CSP, the resource would not be served from the cache.
If a resource was served from cache, the current URL would not be added to the resource's list of URLs until the browser had asynchronously validated the hash for that URL.
(Such an implementation might have significant memory and runtime costs, as it would need to keep a list of URLs with each resource and compare CSP against all of them. There might be ways to optimize this away, though.)
Alternatively, we could also say that SRI based caching will not be used for sites that deploy CSP unless they also whitelist those particular hashes.
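To make the first proposal concrete, here is a minimal Python sketch of a hash-addressed cache that remembers the URLs each resource was verified from. Everything in it is an assumption for illustration: the names `ProvenanceCache`, `store`, and `lookup` are hypothetical, and CSP is reduced to a plain set of allowed origins, which is far simpler than real CSP source matching.

```python
from dataclasses import dataclass, field
from typing import Optional
from urllib.parse import urlsplit


def origin_of(url: str) -> str:
    """Return scheme://host[:port] for a URL."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"


@dataclass
class CacheEntry:
    body: bytes
    # URLs from which this exact content was actually fetched and hash-verified.
    verified_urls: set = field(default_factory=set)


class ProvenanceCache:
    """Hypothetical hash-addressed cache that records where content came from."""

    def __init__(self) -> None:
        self._entries: dict = {}

    def store(self, integrity: str, url: str, body: bytes) -> None:
        # Called only after the browser has fetched `url` and verified its hash
        # (for a cache hit, that would be the asynchronous validation step).
        entry = self._entries.setdefault(integrity, CacheEntry(body))
        entry.verified_urls.add(url)

    def lookup(self, integrity: str, allowed_origins: set) -> Optional[bytes]:
        """Serve from cache only if some recorded URL satisfies the policy.

        `allowed_origins` is a simplified stand-in for a CSP script-src list.
        The requesting URL is deliberately NOT added to the list here.
        """
        entry = self._entries.get(integrity)
        if entry is None:
            return None
        if any(origin_of(u) in allowed_origins for u in entry.verified_urls):
            return entry.body
        return None
```

With this shape, a page whose policy allows only `https://scripts.example.com` gets a cache miss for content that was only ever seen at another origin, so the request goes to the network and fails there if the real origin does not host that content.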
“A substantial portion of the content downloaded by web user agents today consists of JavaScript frameworks. There are a relatively few frameworks which are both extremely popular and large in size. Many applications include these frameworks even if they only use a small portion of the functionality they provide.
In consequence, downloading, parsing, just-in-time compiling these frameworks represents a significant amount of the total network, battery and time budget for a modern browser.”
Rewrite for clarity and concision: “A small number of large, popular web application frameworks account for a substantial portion of the network, battery, and time budgets for a modern web user agent (UA). Many applications include these frameworks even if they only use a small portion of the functionality they provide. It would be a great improvement if UAs could pre-cache and pre-compile these libraries a single time, especially for UAs that do double-keyed caching for privacy.”
Question: What is double-keyed caching? Maybe link to a definition.
Question: But don’t people use a billion variants of jQuery that have only the features they need? And jQuery has a billion versions? So really, even though everyone uses jQuery, nobody uses the same jQuery as anyone else?
—
Suggestion: s/browser/UA/g
Suggestion: s/user agent/UA/g after the 1st instance
—
“Cache-Control header”: Use, or don’t use, the code font consistently for headers.
—
Timing leaks: If everyone really is using the same jQuery, then B can’t really assume that the UA got a copy from A, right? It could just as well have been from C – Z.
—
“sha-256:hash-of-a’s-data”: don’t use curly quotes in code.
Suggestion: Do use them in prose though.
—
“Content-Security-Policy CSP3 allows resources to specify from which origins script may be loaded, as an attack surface reduction.”
When discussing protocols and ceremonies, I find it super important to keep track of who does what when, and in what context. Suggestion for clarity: “To reduce an origin’s attack surface, Content-Security-Policy CSP3 allows origins to specify from which origins the UA may load script to run in the context of the parent frame.” Or “parent document” or whatever else you think is more accurate.
“it would be bad if an attacker could inject the following into the resource:” ➝ “…inject the following code into the parent document:”
Examples in the document show issues with pure content-addressable browser caches. Existence of such a cache can affect interactions with servers at arbitrary origins and result in leaks of information to outside parties.
In some examples the third party can detect that a resource from another origin has already been loaded into the user agent because it is not requested.
An example is also given where the URL given for a resource can be deceptive, since content in the cache is looked up only by the subresource integrity metadata and not by the accompanying URL.
All of these issues can be remedied by using the SRI metadata in the user agent only to verify that the desired content is already in the cache. (This is in addition to its role of verifying that the resource has not been tampered with, e.g. in a CDN.) The cache is addressed by the URL, but the user agent compares the SRI metadata of the request with the content already present, and the cache is used if and only if they match. (Some special cases such as "hard reload" may still ignore the cache.) In the very common case where most resources in the cache are valid, this can still eliminate a great many costly network roundtrips that would otherwise result in 304 response codes.
Perhaps this option is well-known to the participants in the original discussion, but I do not see it referenced elsewhere. If it is known and perhaps even being pursued elsewhere, I would be very interested to learn.
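A minimal Python sketch of the URL-addressed variant described above, under stated assumptions: `UrlKeyedCache` and its `fetch` callback are hypothetical names, and the integrity value uses the `sha256-` plus base64 form from the SRI specification rather than the `sha-256:...` placeholder notation used in this thread.

```python
import base64
import hashlib
from typing import Callable


def sri_sha256(body: bytes) -> str:
    """Compute an SRI-style integrity value: 'sha256-' + base64(SHA-256 digest)."""
    digest = hashlib.sha256(body).digest()
    return "sha256-" + base64.b64encode(digest).decode("ascii")


class UrlKeyedCache:
    """Hypothetical cache addressed by URL; SRI is used only for verification."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self._fetch = fetch          # stand-in for a network request
        self._by_url: dict = {}

    def load(self, url: str, integrity: str) -> bytes:
        cached = self._by_url.get(url)
        # Use the cached copy if and only if it matches the requested hash.
        if cached is not None and sri_sha256(cached) == integrity:
            return cached            # no network round trip, no revalidation
        body = self._fetch(url)
        if sri_sha256(body) != integrity:
            raise ValueError("integrity mismatch for " + url)
        self._by_url[url] = body
        return body
```

Because the lookup key is the URL, identical content requested from a different URL still triggers a fetch, so a third party learns nothing from the absence of a request and the URL can no longer be deceptive.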
The user visits evil.com, which contains the following line:
<script src="/probe.js" integrity="sha-256:hash-of-bank.com's-data" />
The evil.com server then tells the page whether or not it received a request. If it received the request, then the user has never been to bank.com, or has not been to bank.com with the specific script data that was hashed. If the request is never received, then the user has been to bank.com with that data.
(I think this can also be accomplished without server collaboration: if probe.js sets a global variable, then the presence of the global means that bank.com has not been visited, whereas its absence means bank.com has been visited in the past.)
This is not really a timing attack, but a 100% privacy leak. It seems worthwhile to mention it in the document.
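The probe above can be simulated in a few lines of Python. This is only a toy model under stated assumptions: `Browser`, `request_log`, and `visited_bank` are hypothetical names, the cache is keyed purely by the integrity value, and the "network" is a dictionary from URL to body.

```python
import base64
import hashlib
from typing import Optional


def integrity_of(body: bytes) -> str:
    """SRI-style value: 'sha256-' + base64(SHA-256 digest)."""
    digest = hashlib.sha256(body).digest()
    return "sha256-" + base64.b64encode(digest).decode("ascii")


class Browser:
    """Toy UA with a cache addressed only by hash (the problematic design)."""

    def __init__(self, servers: dict) -> None:
        self.servers = servers       # url -> body, stands in for the network
        self.cache: dict = {}        # integrity value -> body
        self.request_log: list = []  # URLs for which a request was really sent

    def load(self, url: str, integrity: str) -> Optional[bytes]:
        if integrity in self.cache:
            return self.cache[integrity]   # served from cache, no request sent
        self.request_log.append(url)       # the server at `url` sees this
        body = self.servers[url]
        if integrity_of(body) != integrity:
            return None                    # blocked by the integrity check
        self.cache[integrity] = body
        return body


def visited_bank(browser: Browser, bank_integrity: str) -> bool:
    """evil.com's view: no request for probe.js means the hash was cached."""
    before = len(browser.request_log)
    browser.load("https://evil.com/probe.js", bank_integrity)
    return len(browser.request_log) == before
```

In this model, a browser that has previously loaded the bank's script never contacts evil.com for `probe.js`, while a fresh browser does, so the answer is deterministic rather than inferred from timing.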
It is not clear to me why hash-based timing attacks are different from URL-based timing attacks (already possible today).
Is it possible to clarify the differences between the two and why hash-based timing attacks would reveal more than what is currently possible?
The first bug report left off here:
“it would be bad if an attacker could inject the following into the resource:” ➝ “…inject the following code into the parent document:”
So here's the rest:
<script src="https://scripts.example.com/angular.min.js" integrity="sha-256:..." /> ➝ <script src="https://scripts.example.com/angular.min.js" integrity="sha-256:DEADBEEF0BADCAFE…" />
“And by so doing force example.com to load an old version of the Angular framework that allows bypassing of CSP, if the real https://scripts.example.com origin didn’t actually have a copy of that resource.” ➝ “and by so doing cause the UA to load and execute an old, vulnerable, or otherwise incorrect version of Angular in the context of example.com.”
“Similar restrictions might apply for features like Workers or ServiceWorkers which are required to be loaded same-origin as a similar security precaution.” ➝ “A similar risk might apply to features like Workers or ServiceWorkers. (As a security mechanism, UAs require that Workers and ServiceWorkers can only invoke scripts that come from the same origin. Origin laundering could bypass that mechanism.)”
—
“Luckily, browsers can do magic behind-the-scenes, and performance improvements, so long as they are correct from the perspective of application semantics and security, need not be precisely identical across the population of user agents.” Suggestion for clarity/concision: “As long as they do not violate the application’s semantics (including security), performance improvements need not be uniform across the population of UAs.” However, I’m not sure what that means. :)
—
“Origin laundering is a bit more difficult to address, but it can be handled with the same strategy that also informs population of the cache, so long as one is aware of the issue in advance.” Question: What strategy is that?
—
Typo: “alloted” ➝ “allotted”
—
Ahh, I see now what double-keying is. Maybe put a “(see below)” in the 1st place you mention it.
—
“If a resource’s Content-Security-Policy header explicitly lists the hash of an external resource as allowed, that could be interpreted as an authoritative statement that its origin provenance is irrelevant” ➝ “If a resource’s Content-Security-Policy header explicitly lists the hash of an external resource as allowed, the UA could interpret that as an authoritative statement that the resource’s provenance is irrelevant”