ampproject / amppackager
Tool to improve AMP URLs via Signed Exchanges
Home Page: https://amp.dev/documentation/guides-and-tutorials/optimize-and-measure/signed-exchange/
License: Apache License 2.0
certcache.go should serve the cert-chain+cbor with a max-age corresponding to the OCSP midpoint.
dep ensure, etc.
Amend the README to advise that publishers only package AMP documents, and that they audit those documents for validity offline (e.g. at publication time, or a weekly cronjob that audits a random 1%).
Users of the packager may mistakenly transform and sign non-amp content. The AMP Cache will reject this, but for security we want some light protections against this content being usable. This issue tracks removing non-amp javascript from the document in a new transformer.
The steps we should take are:
For each <script> tag on the page, if any one of the following is true:
- It has a src attribute whose value is not prefixed by https://cdn.ampproject.org/ (case-insensitive match).
- It has no src attribute and no type attribute (case-insensitive match).
- It has a type attribute whose value is neither application/json nor application/ld+json (case-insensitive match on both name and value).
Then, remove the <script> tag and all descendant nodes of the <script> tag, including text / cdata nodes.
For example:
- <script async src="https://cdn.ampproject.org/v0.js"> should not be removed.
- <script async custom-element='amp-analytics' src='https://cdn.ampproject.org/v0/amp-analytics-0.1.js'> should not be removed.
- <script src='http://example.com/example.js'> should be removed.
- <script>foo</script> should be removed.
- <script type=application/javascript>foo</script> should be removed.
- <script type=application/json>foo</script> should not be removed.
- <script type=application/json src="https://cdn.ampproject.org/v0.js">
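The script-removal conditions above can be sketched as a pure predicate over a tag's attributes. This is an illustration only: shouldRemoveScript and the map-based attribute representation are hypothetical, and the second condition is read as "no src and no type", which is an interpretation of the issue text rather than something taken from the packager's code.

```go
package main

import (
	"fmt"
	"strings"
)

const ampCDNPrefix = "https://cdn.ampproject.org/"

// shouldRemoveScript applies the three conditions to one <script> tag.
// attrs maps attribute names to values as they appear in the tag.
func shouldRemoveScript(attrs map[string]string) bool {
	var src, typ string
	var hasSrc, hasType bool
	for name, value := range attrs {
		switch strings.ToLower(name) { // Attribute names match case-insensitively.
		case "src":
			src, hasSrc = strings.ToLower(value), true
		case "type":
			typ, hasType = strings.ToLower(value), true
		}
	}
	// Condition 1: src that does not point at the AMP CDN.
	if hasSrc && !strings.HasPrefix(src, ampCDNPrefix) {
		return true
	}
	// Condition 2 (assumed reading): inline script with no declared type.
	if !hasSrc && !hasType {
		return true
	}
	// Condition 3: a type other than the two JSON types.
	if hasType && typ != "application/json" && typ != "application/ld+json" {
		return true
	}
	return false
}

func main() {
	tags := []map[string]string{
		{"async": "", "src": "https://cdn.ampproject.org/v0.js"},
		{"src": "http://example.com/example.js"},
		{},
		{"type": "application/javascript"},
		{"type": "application/json"},
	}
	for _, attrs := range tags {
		fmt.Println(attrs, "remove:", shouldRemoveScript(attrs))
	}
}
```

A real transformer would run this check while walking the parsed HTML tree and drop the matching element together with its children.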
For every tag on the page, if the tag has an attribute with a case-insensitive prefix of on followed by another alphabetic character ([A-Za-z]), then remove that attribute. For example:
- on should not be removed.
- on-foo should not be removed.
- onfoo should be removed.
If the fetchResp is not valid for packaging (either via validateFetch or because of stateful headers), instead of returning an error code, the packager should simply proxy the content unsigned. This is a friendlier error response.
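The on-attribute rule described above (a case-insensitive on prefix followed by a letter) can be sketched as a small predicate. The name isEventHandlerAttr is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// isEventHandlerAttr reports whether an attribute name starts with "on"
// (case-insensitive) followed by at least one ASCII letter.
func isEventHandlerAttr(name string) bool {
	if len(name) < 3 {
		return false // "on" alone is AMP's own attribute and is kept.
	}
	if !strings.EqualFold(name[:2], "on") {
		return false
	}
	c := name[2]
	return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
}

func main() {
	for _, name := range []string{"on", "on-foo", "onfoo", "ONCLICK", "one"} {
		fmt.Printf("%-8s remove: %v\n", name, isEventHandlerAttr(name))
	}
}
```

Note that the rule as written also matches ordinary-looking names such as one, since it only inspects the first three characters.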
Packager currently only provides a null validityUrl response for all signed exchanges. It would be nice to support validity updates, to save on network bandwidth when responding to intermediaries and to allow client-side reverification of signed exchanges as a downgrade mitigation.
Note that this will require a refetch of the document to get the message to sign, unless a cache is added. If a cache is added, it should obey the usual HTTP caching semantics re: freshness and validation.
Add an option to use a remote signing oracle to sign messages, rather than needing filesystem access to the key. The Go Keyless client library will probably be of use, here.
This should parse https://github.com/ampproject/amphtml/blob/master/spec/amp-cache-transform.md from the request and ensure target is any or google, and no params are specified.
WICG/webpackage#121 will change the SXG signed message to include the OCSP response attached to the cert-chain+cbor. This means that:
This is a tracking bug to make sure the AMP packager serves valid signed exchanges per the evolving v=b1 spec. Support for v=b0 would need to co-exist for a few months.
Spec changes being drafted here: WICG/webpackage#232
At present, these Chromium changes seem relevant:
(I may eventually split out bugs for larger subprojects, like OCSP stapling.)
See also http://b/92515679.
This would implement the changes in ampproject/amphtml#18334.
Improve the error messages such that they are actionable by people who don't know code. Currently, they're pretty cryptic and mostly useful for devs.
b2 format is still in development, and it seems plans are for WICG/webpackage to support b1/b2 as a flag (see WICG/webpackage#291, WICG/webpackage#292).
Related Chromium changes:
HTTPError.LogAndRespond calls http.Error, which does not set Cache-Control: no-store. Update it to do so.
If the packager happens to proxy invalid AMP, and an attacker captures such an SXG and serves it somewhere that doesn't validate, the page would have its charset misinterpreted.
When [URLSet.Fetch] is not configured, then fetch and sign are the same URL object, so the destructive modifications that fetchURL() does to the query end up affecting the signed URL, too.
Follow the Standard Go Project Layout. That entails:
- Moving .go files from the root into internal/pkg/
- test/
- configs/
and maybe some other stuff.
Currently only tested indirectly in packager_test.go. This function should be tested directly, with a variety of urlSets and fetch/sign URLs, to cover all the combinations.
This would integrate the transform library with no transformers (except maybe a transformed="..." attribute).
As an alternative to reading the .pem from the local filesystem, allow the use of Vault's Read Certificate call to retrieve the certificate PEM at packager startup.
http://crrev.com/c/1060933 adds a requirement that certUrls be OCSP-stapled, as part of the new v=b1 spec. The packager should have a flag for outputting b1 format, and doing the necessary stapling things. A set of requirements to start with is https://gist.github.com/sleevi/5efe9ef98961ecfb4da8, and a good implementation to look at is mholt/caddy/caddytls.
At the very least, call https://golang.org/pkg/log/#New from main and pass to the handler constructors. This way we can make our tests less noisy.
Maybe upgrade to https://godoc.org/github.com/golang/glog or https://github.com/op/go-logging or https://github.com/juju/loggo or something, though https://dave.cheney.net/2015/11/05/lets-talk-about-logging argues against it.
If -development is not true, then the packager should refuse to run or sign exchanges if given a non-SXG cert.
Transforms may have unintended effects on invalid documents. There may be cases where the SXG is fetched and served by third parties without validation. Since full validation at serve time is too expensive, it should perform some light validation; at least to verify that the document intends to be AMP.
Add support for /priv/doc/https://example.com/foo.html?query=param as an alternative to /priv/doc?sign=https%3a%2f%2fexample.com%2ffoo.html%3fquery%3dparam, as nginx doesn't support URL-encoding during rewriting [1].
[1] See https://stackoverflow.com/questions/31266629/nginx-encoding-normalizing-part-of-uri or https://forum.nginx.org/read.php?2,221867,221874 for instance. Possible with a custom build, though, using https://github.com/vozlt/nginx-module-url.
Provide a config flag to allow non-transformed AMP to be served (i.e. not going through the AMP CDN). This is just for debugging.
Currently the default case is to respond 502.
This will also depend on a TBD versioning/update strategy.
Currently, amppkg only loads the cert file at startup. If it expires while the packager is running, the packager continues to sign with it and serve it. Instead, it should attempt to reload automatically starting a few days before expiry, and continuing at some regular interval until no longer imminently expiring. If the cert is expired, it should stop signing exchanges, and log a warning.
In addition, it should serve the cert-url with an http expiry no longer than the cert expiry (as a follow-up to #85).
This could be a collection of scripts, or packages of various formats (Docker, Flatpak, .deb, etc.). Broad coverage of most of the production environments probably necessitates multiple formats, though we should prefer a solution that covers as many as possible in as few variants as possible. This will reduce the cost of maintenance and the chance of error in one or more of them. Googlers, see go/amp-packager-deployment-requirements for more info.
If the fetch redirects to another URL, the packager will follow the redirect(s), but sign it with the original request URL. This has the following problems (at least):
When the packager gets a 30x, it should simply respond with that instead of following it and signing the result.
At a minimum, it needs to specify:
Content-Security-Policy
Content-Type: text/html
Add relevant preloads as determined based on the fetch response body. This is dependent on the implementation of the local transformer.
We need to decide how AMP packages should be discovered on the web. Options include <link rel=amppackage>, request headers, and sitemap.xml (or some combination).
The packager should handle conditional requests.
The fetch issued by fetchURL() in packager.go should pass along any of the conditional headers it received in ServeHTTP().
If the fetchResp is a 304 Not Modified, then ServeHTTP() should respond in kind, and not try to package up that 304 in an application/signed-exchange.
See also http://b/110430571.
-development or maybe a new flag.
Add X-Content-Type-Options: nosniff to the SXG outer response headers.
Copy ETag and Last-Modified from the fetchResp to the outer resp. This enables clients to transparently take advantage of #34 without having to understand the SXG format.
Rewrite the documentation to be action-oriented -- e.g. start with running the packager, and then move backwards up the chain to adding a FE and all that.
Implement versioning according to the AMP-Cache-Transform spec.
This is necessary for the same-origin policy imposed in https://crrev.com/c/1075833.
This is a sub-bug of #15.
At the very least, go over the code, figure out what tests need adding and file bugs for them.
Accept request header should either be missing, contain */*, or specify b2 as one of the accepted versions.
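A simplified sketch of that check, assuming the version is carried as a v parameter on the application/signed-exchange media type. The function acceptsSXGVersion is hypothetical; it ignores q-values and does not handle quoted parameter values.

```go
package main

import (
	"fmt"
	"strings"
)

// acceptsSXGVersion reports whether an Accept header value permits a signed
// exchange of the given version: the header is missing, contains */*, or
// lists application/signed-exchange with a matching v parameter.
func acceptsSXGVersion(accept, version string) bool {
	if strings.TrimSpace(accept) == "" {
		return true
	}
	for _, part := range strings.Split(accept, ",") {
		fields := strings.Split(part, ";")
		mediaType := strings.ToLower(strings.TrimSpace(fields[0]))
		if mediaType == "*/*" {
			return true
		}
		if mediaType != "application/signed-exchange" {
			continue
		}
		for _, param := range fields[1:] {
			if strings.EqualFold(strings.TrimSpace(param), "v="+version) {
				return true
			}
		}
	}
	return false
}

func main() {
	for _, accept := range []string{
		"",
		"text/html,*/*;q=0.8",
		"application/signed-exchange;v=b2",
		"application/signed-exchange;v=b1",
		"text/html",
	} {
		fmt.Printf("%q -> %v\n", accept, acceptsSXGVersion(accept, "b2"))
	}
}
```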
Use correct rtv endpoint
It may be useful for a single packager instance to sign for multiple domains that are authenticated by different certs. (This should be an uncommon case, though, as it is very simple and possibly more secure to run different instances each with access to only one certificate.)
Currently each method call performs a copy of this, which is unnecessarily costly.
Currently, the toml parser just silently ignores any fields that don't match the struct. This means that typos are hard to diagnose, and backwards-incompatible changes to the config will require special care to notify existing users.
Periodically (hourly?) poll to get the latest rtv (and css url), store in memory.
Pass these values into the local transformer library for writing the correct rtv script and css.
https://github.com/WICG/webpackage/pull/267/files added some instructions on submitting a cert to CT and getting an SCT in return, in cases where the CA has not already attached it. This would be nice for the packager to support automatically, though not strictly necessary. At the very least, the documentation should be updated to include a reference to this.