The sri-addressable-caching from hillbrad

Where did the output document go?

I lost the link to the output document. Maybe add a link from the README or from the GitHub top bar?

Origin laundering mitigation

Would it be possible to mitigate origin laundering if each SRI-cached resource also carried a list of the URLs it was previously fetched from?

That way, CSP could be validated against that list of URLs, and if none passes CSP, the resource would not be fetched from cache.

If a resource is served from cache, the current URL is not added to the resource's URL list until the browser has asynchronously validated the hash for that URL.

(Such an implementation might have significant memory and runtime costs, as it would need to keep a list of URLs with each resource and compare CSP against all of them. There might be ways to optimize this away, though.)

Alternatively, we could say that SRI-based caching will not be used for sites that deploy CSP unless they also whitelist those particular hashes.
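
To make the URL-list idea above concrete, here is a minimal sketch, not a proposal for how a UA would actually implement it. It assumes CSP can be reduced to a set of allowed script origins, which is a large simplification, and every name in it (`HashKeyedCache`, `verified_urls`, `lookup`, the example URLs and hash string) is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set
from urllib.parse import urlsplit


@dataclass
class CacheEntry:
    body: bytes
    # URLs the UA has itself fetched and hash-verified for this body.
    verified_urls: Set[str] = field(default_factory=set)


def origin(url: str) -> str:
    """Reduce a URL to scheme://host[:port]."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"


class HashKeyedCache:
    """Hypothetical SRI-keyed cache that remembers where each body came from."""

    def __init__(self) -> None:
        self._entries: Dict[str, CacheEntry] = {}

    def store(self, integrity: str, url: str, body: bytes) -> None:
        # Assumed to be called only after the UA fetched `url` and the hash matched.
        entry = self._entries.setdefault(integrity, CacheEntry(body))
        entry.verified_urls.add(url)

    def lookup(self, integrity: str, allowed_origins: Set[str]) -> Optional[bytes]:
        entry = self._entries.get(integrity)
        if entry is None:
            return None
        # Serve from cache only if at least one past source URL passes the policy.
        if any(origin(u) in allowed_origins for u in entry.verified_urls):
            return entry.body
        return None


cache = HashKeyedCache()
cache.store("sha256-example", "https://cdn.example/jquery.js", b"/* library */")

# A page whose policy allows cdn.example gets the cached copy.
hit = cache.lookup("sha256-example", {"https://cdn.example"})
# A page that only allows its own origin does not, so laundering fails.
miss = cache.lookup("sha256-example", {"https://self.example"})
```

The per-entry URL set is exactly the memory cost mentioned above: it grows with every distinct URL the resource has been verified from.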

Suggestions, questions, thoughts on the text

“A substantial portion of the content downloaded by web user agents today consists of JavaScript frameworks. There are a relatively few frameworks which are both extremely popular and large in size. Many applications include these frameworks even if they only use a small portion of the functionality they provide.

In consequence, downloading, parsing, just-in-time compiling these frameworks represents a significant amount of the total network, battery and time budget for a modern browser.”

Rewrite for clarity and concision: “A small number of large, popular web application frameworks account for a substantial portion of the network, battery, and time budgets for a modern web user agent (UA). Many applications include these frameworks even if they only use a small portion of the functionality they provide. It would be a great improvement if UAs could pre-cache and pre-compile these libraries a single time, especially for UAs that do double-keyed caching for privacy.”

Question: What is double-keyed caching? Maybe link to a definition.

Question: But don’t people use a billion variants of jQuery that have only the features they need? And jQuery has a billion versions? So really, even though everyone uses jQuery, nobody uses the same jQuery as anyone else?

—

Suggestion: s/browser/UA/g
Suggestion: s/user agent/UA/g after the 1st instance

—

“Cache-Control header”: Use, or don’t use, the code font consistently for headers.

—

Timing leaks: If everyone really is using the same jQuery, then B can’t really assume that the UA got a copy from A, right? It could just as well have been from C – Z.

—

“sha-256:hash-of-a’s-data”: don’t use curly quotes in code.

Suggestion: Do use them in prose though.

—

“Content-Security-Policy CSP3 allows resources to specify from which origins script may be loaded, as an attack surface reduction.”

When discussing protocols and ceremonies, I find it super important to keep track of who does what when, and in what context. Suggestion for clarity: “To reduce an origin’s attack surface, Content-Security-Policy CSP3 allows origins to specify from which origins the UA may load script to run in the context of the parent frame.” Or “parent document” or whatever else you think is more accurate.

“it would be bad if an attacker could inject the following into the resource:” ➝ “…inject the following code into the parent document:”

Subresource integrity for cache content verification

The examples in the document show issues with pure content-addressable browser caches. The existence of such a cache can affect interactions with servers at arbitrary origins and leak information to outside parties.

In some examples the third party can detect that a resource from another origin has already been loaded into the user agent because it is not requested.

Another example shows that the URL attached to a resource can be deceptive, since content in the cache is looked up only by the subresource integrity metadata and not by the accompanying URL.

Another design option

All of these issues can be remedied by using the SRI metadata in the user agent only to verify that the desired content is already in the cache. (This is in addition to its role of verifying that the resource has not been tampered with, e.g. in a CDN.) The cache is addressed by the URL, but the user agent compares the SRI metadata of the request with the content already present, and the cache is used if and only if they match. (Some special cases such as "hard reload" may still ignore the cache.) In the very common case where most resources in the cache are valid, this can still eliminate a great many costly network roundtrips that would otherwise result in 304 response codes.
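
A rough sketch of this option, under my own assumptions: the cache is keyed by URL, and the integrity metadata is only compared against what is already stored. The class and method names are made up for illustration. The `sha256-<base64 digest>` string is the form SRI metadata takes in the `integrity` attribute.

```python
import base64
import hashlib
from typing import Dict, Optional


def sri_sha256(body: bytes) -> str:
    """Compute SRI metadata in the `sha256-<base64 digest>` form."""
    digest = hashlib.sha256(body).digest()
    return "sha256-" + base64.b64encode(digest).decode("ascii")


class UrlKeyedCache:
    """Hypothetical cache addressed by URL and validated by SRI metadata."""

    def __init__(self) -> None:
        self._by_url: Dict[str, bytes] = {}

    def store(self, url: str, body: bytes) -> None:
        self._by_url[url] = body

    def lookup(self, url: str, integrity: str) -> Optional[bytes]:
        body = self._by_url.get(url)
        if body is None:
            return None  # nothing cached for this URL: go to the network
        if sri_sha256(body) != integrity:
            return None  # cached copy is not the requested content: refetch
        return body      # match: usable without a revalidation roundtrip


library = b"/* library source */"
cache = UrlKeyedCache()
cache.store("https://cdn.example/lib.js", library)

# Same URL and matching hash: served from cache.
same_url = cache.lookup("https://cdn.example/lib.js", sri_sha256(library))
# Same hash under a different URL: no hit, so the URL cannot be deceptive
# and another origin learns nothing about what was loaded elsewhere.
other_url = cache.lookup("https://evil.example/lib.js", sri_sha256(library))
```

Because the lookup never crosses URLs, the hash only ever confirms content the UA would have requested from that URL anyway.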

Perhaps this option is well-known to the participants in the original discussion, but I do not see it referenced elsewhere. If it is known and perhaps even being pursued elsewhere, I would be very interested to learn.

Doesn't this miss the obvious non-timing attack?

The user visits evil.com, which contains the following line:

<script src="/probe.js" integrity="sha-256:hash-of-bank.com's-data"></script>

The evil.com server then tells the page whether or not it received a request. If it received the request, then the user has never been to bank.com, or has not been to bank.com with the specific script data that was hashed. If the request is never received, then the user has been to bank.com with that data.

(I think this can also be accomplished without server collaboration: if probe.js sets a global variable, then the presence of the global means that bank.com has not been visited, whereas its absence means bank.com has been visited in the past.)
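
Here is a small simulation of the probe, assuming a cache that is looked up purely by integrity metadata. The function names and the request log are invented for this sketch, and the integrity string is the placeholder from the snippet above, not a real digest.

```python
from typing import Dict, List

BANK_INTEGRITY = "sha-256:hash-of-bank.com's-data"  # placeholder, not a real hash


def load_script(cache: Dict[str, bytes], url: str, integrity: str,
                request_log: List[str]) -> bool:
    """Simulate the UA loading a script; return True if a request was sent."""
    if integrity in cache:
        # Pure content-addressed hit: the URL is never contacted.
        return False
    request_log.append(url)  # the server at `url` observes this request
    return True


def evil_probe(cache: Dict[str, bytes]) -> bool:
    """Return True if evil.com concludes the user has visited bank.com."""
    evil_server_log: List[str] = []
    load_script(cache, "https://evil.com/probe.js", BANK_INTEGRITY, evil_server_log)
    # No request reached evil.com, so the UA already held bank.com's data.
    return len(evil_server_log) == 0


visitor_cache = {BANK_INTEGRITY: b"/* bank.com script */"}
fresh_cache: Dict[str, bytes] = {}

visited = evil_probe(visitor_cache)   # probe.js was never requested
not_visited = evil_probe(fresh_cache) # probe.js was requested from evil.com
```

Nothing here depends on measuring time: the presence or absence of a single request is the whole signal.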

This is not really a timing attack, but a 100% privacy leak. It seems worthwhile to mention it in the document.

Clarify timing attacks and history leaks

It is not clear to me why hash-based timing attacks are different from URL-based timing attacks (already possible today).

Is it possible to clarify the differences between the two and why hash-based timing attacks would reveal more than what is currently possible?

Additional comments/suggestions/questions on the text

The first bug report left off here:

“it would be bad if an attacker could inject the following into the resource:” ➝ “…inject the following code into the parent document:”

So here's the rest:
