sri-addressable-caching's People

Contributors

hillbrad

sri-addressable-caching's Issues

Origin laundering mitigation

Would it be possible to mitigate origin laundering if SRI-based cached resources also carried a list of the URLs from which the resource was previously fetched?

That way, CSP could be validated against that list of URLs, and if none of them passes CSP, the resource would not be served from the cache.

If a resource is served from the cache, the current URL is not added to the list of URLs associated with the resource until the browser has asynchronously validated the hash for that URL.

(Such an implementation might have significant memory and runtime costs, as it would need to keep a list of URLs with each resource and compare CSP against all of them. There might be ways to optimize this away, though.)

Alternatively, we could also say that SRI-based caching will not be used for sites that deploy CSP unless they also whitelist those particular hashes.
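To make the proposal concrete, here is a minimal sketch of a hash-keyed cache that remembers which URLs have actually served each piece of content. Everything in it is an assumption made for illustration: the names `ProvenanceCache`, `recordVerifiedFetch`, `lookup`, and `originOf` are hypothetical, and `Policy` reduces CSP to a bare origin allowlist, which is far simpler than real CSP matching.

```typescript
// Hypothetical model: every name below is illustrative, not taken from any spec.
interface CacheEntry {
  body: string;
  // URLs that actually served this exact content and passed hash validation.
  verifiedUrls: string[];
}

// Simplified stand-in for a CSP script-src check: an origin allowlist only.
interface Policy {
  allowedOrigins: string[];
}

// Extract "scheme://host[:port]" from an absolute URL (no normalization).
function originOf(url: string): string {
  const schemeEnd = url.indexOf("://");
  if (schemeEnd === -1) throw new Error("not an absolute URL: " + url);
  const pathStart = url.indexOf("/", schemeEnd + 3);
  return pathStart === -1 ? url : url.slice(0, pathStart);
}

class ProvenanceCache {
  private entries = new Map<string, CacheEntry>();

  // Called only after a real network fetch from `url` whose hash was validated.
  recordVerifiedFetch(integrity: string, url: string, body: string): void {
    let entry = this.entries.get(integrity);
    if (entry === undefined) {
      entry = { body, verifiedUrls: [] };
      this.entries.set(integrity, entry);
    }
    if (entry.verifiedUrls.indexOf(url) === -1) entry.verifiedUrls.push(url);
  }

  // Serve from cache only if at least one URL that really served this
  // content is permitted by the requesting page's policy.
  lookup(integrity: string, policy: Policy): string | undefined {
    const entry = this.entries.get(integrity);
    if (entry === undefined) return undefined;
    const permitted = entry.verifiedUrls.some(
      (url) => policy.allowedOrigins.indexOf(originOf(url)) !== -1
    );
    return permitted ? entry.body : undefined;
  }
}
```

In this sketch the URL written in the requesting page's markup never counts as provenance by itself. A page whose policy allows only https://scripts.example.com gets a miss for content that was only ever fetched from some other origin, so the user agent would fall back to a real network request.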

Suggestions, questions, thoughts on the text

“A substantial portion of the content downloaded by web user agents today consists of JavaScript frameworks. There are a relatively few frameworks which are both extremely popular and large in size. Many applications include these frameworks even if they only use a small portion of the functionality they provide.

In consequence, downloading, parsing, just-in-time compiling these frameworks represents a significant amount of the total network, battery and time budget for a modern browser.”

Rewrite for clarity and concision: “A small number of large, popular web application frameworks account for a substantial portion of the network, battery, and time budgets for a modern web user agent (UA). Many applications include these frameworks even if they only use a small portion of the functionality they provide. It would be a great improvement if UAs could pre-cache and pre-compile these libraries a single time, especially for UAs that do double-keyed caching for privacy.”

Question: What is double-keyed caching? Maybe link to a definition.

Question: But don’t people use a billion variants of jQuery that have only the features they need? And jQuery has a billion versions? So really, even though everyone uses jQuery, nobody uses the same jQuery as anyone else?

Suggestion: s/browser/UA/g
Suggestion: s/user agent/UA/g after the 1st instance

“Cache-Control header”: Use, or don’t use, the code font consistently for headers.

Timing leaks: If everyone really is using the same jQuery, then B can’t really assume that the UA got a copy from A, right? It could just as well have been from C – Z.

“sha-256:hash-of-a’s-data”: don’t use curly quotes in code.

Suggestion: Do use them in prose though.

“Content-Security-Policy CSP3 allows resources to specify from which origins script may be loaded, as an attack surface reduction.”

When discussing protocols and ceremonies, I find it super important to keep track of who does what when, and in what context. Suggestion for clarity: “To reduce an origin’s attack surface, Content-Security-Policy CSP3 allows origins to specify from which origins the UA may load script to run in the context of the parent frame.” Or “parent document” or whatever else you think is more accurate.

“it would be bad if an attacker could inject the following into the resource:” ➝ “…inject the following code into the parent document:”

<script src="https://scripts.example.com/angular.min.js" integrity="sha-256:..." /> ➝ <script src="https://scripts.example.com/angular.min.js" integrity="sha-256:DEADBEEF0BADCAFE…" />

“And by so doing force example.com to load an old version of the Angular framework that allows bypassing of CSP, if the real https://scripts.example.com origin didn’t actually have a copy of that resource.” ➝ “and by so doing cause the UA to load and execute an old, vulnerable, or otherwise incorrect version of Angular in the context of example.com.”

“Similar restrictions might apply for features like Workers or ServiceWorkers which are required to be loaded same-origin as a similar security precaution.” ➝ “A similar risk might apply to features like Workers or ServiceWorkers. (As a security mechanism, UAs require that Workers and ServiceWorkers can only invoke scripts that come from the same origin. Origin laundering could bypass that mechanism.)”

“Luckily, browsers can do magic behind-the-scenes, and performance improvements, so long as they are correct from the perspective of application semantics and security, need not be precisely identical across the population of user agents.” Suggestion for clarity/concision: “As long as they do not violate the application’s semantics (including security), performance improvements need not be uniform across the population of UAs.” However, I’m not sure what that means. :)

“Origin laundering is a bit more difficult to address, but it can be handled with the same strategy that also informs population of the cache, so long as one is aware of the issue in advance.” Question: What strategy is that?

Typo: “alloted” ➝ “allotted”

Ahh, I see now what double-keying is. Maybe put a “(see below)” in the 1st place you mention it.

“If a resource’s Content-Security-Policy header explicitly lists the hash of an external resource as allowed, that could be interpreted as an authoritative statement that its origin provenance is irrelevant” ➝ “If a resource’s Content-Security-Policy header explicitly lists the hash of an external resource as allowed, the UA could interpret that as an authoritative statement that the resource’s provenance is irrelevant”

Subresource integrity for cache content verification

Examples in the document show issues with pure content-addressable browser caches. The existence of such a cache can affect interactions with servers at arbitrary origins and leak information to outside parties.

In some examples, a third party can detect that a resource from another origin has already been loaded into the user agent because the resource is not requested.

Another example shows that the URL specified for a resource can be deceptive, since content in the cache is looked up only by the subresource integrity metadata and not by the accompanying URL.

Another design option

All of these issues can be remedied by using the SRI metadata in the user agent only to verify that the desired content is already in the cache. (This is in addition to its role of verifying that the resource has not been tampered with, e.g. in a CDN.) The cache is addressed by the URL, but the user agent compares the SRI metadata of the request with the content already present, and uses the cache if and only if they match. (Some special cases such as "hard reload" may still ignore the cache.) In the very common case where most resources in the cache are valid, this can still eliminate a great many costly network round trips that would otherwise result in 304 response codes.
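A rough sketch of this design option, under stated assumptions: the cache key is the URL, and the SRI value only decides whether the stored copy may be reused without a round trip. The names `UrlKeyedCache`, `StoredResponse`, and `LookupResult` are hypothetical, and the digest is assumed to have been computed when the response was stored, so no real hashing appears here.

```typescript
// Hypothetical sketch: the cache is addressed by URL; the integrity value
// is used only to confirm that the stored copy is the one being asked for.
interface StoredResponse {
  body: string;
  digest: string; // assumed to be computed from `body` when it was stored
}

type LookupResult =
  | { kind: "hit"; body: string }
  | { kind: "miss" }; // caller falls back to the network

class UrlKeyedCache {
  private byUrl = new Map<string, StoredResponse>();

  store(url: string, body: string, digest: string): void {
    this.byUrl.set(url, { body, digest });
  }

  // A hit requires BOTH the same URL and a matching digest, so content
  // fetched from one URL is never served for a different URL.
  lookup(url: string, integrity: string): LookupResult {
    const stored = this.byUrl.get(url);
    if (stored !== undefined && stored.digest === integrity) {
      return { kind: "hit", body: stored.body };
    }
    return { kind: "miss" };
  }
}
```

Because a different URL always misses in this sketch, a request with a known hash but an unrelated URL still reaches that URL's server, which is the property the cross-origin probing and deceptive-URL examples rely on being absent.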

Perhaps this option is well-known to the participants in the original discussion, but I do not see it referenced elsewhere. If it is known and perhaps even being pursued elsewhere, I would be very interested to learn.

Doesn't this miss the obvious non-timing attack?

The user visits evil.com, which contains the following line:

<script src="/probe.js" integrity="sha-256:hash-of-bank.com's-data" />

The evil.com server then tells the page whether or not it received a request. If it received the request, then the user has never been to bank.com, or has not been to bank.com with the specific script data that was hashed. If the request is never received, then the user has been to bank.com with that data.

(I think this can also be accomplished without server collaboration: if probe.js sets a global variable, then the presence of the global means that bank.com has not been visited, whereas its absence means bank.com has been visited in the past.)
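The server-side variant of this probe can be simulated with a toy model of a purely hash-addressed cache. This is only a sketch: `ContentAddressedCache`, `NetworkResponse`, `requestLog`, and `probeRevealsCachedCopy` are hypothetical names, the `network` callback stands in for real HTTP, and digests are treated as opaque strings rather than computed.

```typescript
// Toy model of a cache keyed ONLY by integrity hash (hypothetical names).
interface NetworkResponse {
  body: string;
  digest: string; // digest of the body actually served
}

class ContentAddressedCache {
  private byHash = new Map<string, string>();
  // URLs that really reached the network, i.e. what servers can observe.
  readonly requestLog: string[] = [];

  load(
    url: string,
    integrity: string,
    network: (url: string) => NetworkResponse
  ): string | undefined {
    const cached = this.byHash.get(integrity);
    if (cached !== undefined) return cached; // the URL is never contacted
    this.requestLog.push(url);
    const response = network(url);
    // Integrity failure: the resource is blocked and nothing is cached.
    if (response.digest !== integrity) return undefined;
    this.byHash.set(integrity, response.body);
    return response.body;
  }
}

// What the probing server learns: true when its own URL was never
// requested, meaning content with that hash was already cached.
function probeRevealsCachedCopy(
  cache: ContentAddressedCache,
  probeUrl: string,
  integrity: string,
  network: (url: string) => NetworkResponse
): boolean {
  const before = cache.requestLog.length;
  cache.load(probeUrl, integrity, network);
  return cache.requestLog.length === before;
}
```

In this model the answer is exact rather than statistical: the probing server either sees the request or it does not, with no timing measurement involved.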

This is not really a timing attack, but a 100% privacy leak. It seems worthwhile to mention it in the document.

Clarify timing attacks and history leaks

It is not clear to me why hash-based timing attacks are different from URL-based timing attacks (already possible today).

Is it possible to clarify the differences between the two, and why hash-based timing attacks would reveal more than what is currently possible?

Additional comments/suggestions/questions on the text

The first bug report left off here:

“it would be bad if an attacker could inject the following into the resource:” ➝ “…inject the following code into the parent document:”

So here's the rest:

<script src="https://scripts.example.com/angular.min.js" integrity="sha-256:..." /> ➝ <script src="https://scripts.example.com/angular.min.js" integrity="sha-256:DEADBEEF0BADCAFE…" />

“And by so doing force example.com to load an old version of the Angular framework that allows bypassing of CSP, if the real https://scripts.example.com origin didn’t actually have a copy of that resource.” ➝ “and by so doing cause the UA to load and execute an old, vulnerable, or otherwise incorrect version of Angular in the context of example.com.”

“Similar restrictions might apply for features like Workers or ServiceWorkers which are required to be loaded same-origin as a similar security precaution.” ➝ “A similar risk might apply to features like Workers or ServiceWorkers. (As a security mechanism, UAs require that Workers and ServiceWorkers can only invoke scripts that come from the same origin. Origin laundering could bypass that mechanism.)”

“Luckily, browsers can do magic behind-the-scenes, and performance improvements, so long as they are correct from the perspective of application semantics and security, need not be precisely identical across the population of user agents.” Suggestion for clarity/concision: “As long as they do not violate the application’s semantics (including security), performance improvements need not be uniform across the population of UAs.” However, I’m not sure what that means. :)

“Origin laundering is a bit more difficult to address, but it can be handled with the same strategy that also informs population of the cache, so long as one is aware of the issue in advance.” Question: What strategy is that?

Typo: “alloted” ➝ “allotted”

Ahh, I see now what double-keying is. Maybe put a “(see below)” in the 1st place you mention it.

“If a resource’s Content-Security-Policy header explicitly lists the hash of an external resource as allowed, that could be interpreted as an authoritative statement that its origin provenance is irrelevant” ➝ “If a resource’s Content-Security-Policy header explicitly lists the hash of an external resource as allowed, the UA could interpret that as an authoritative statement that the resource’s provenance is irrelevant”
