Giter VIP home page Giter VIP logo

amp-library's People

Contributors

adonisfigueroa avatar alexpott avatar berdir avatar cashewz avatar davereid avatar deuh avatar dzmitrykruchko-nbcuni avatar e0ipso avatar echosa avatar fuzzytree avatar jatorresrcn avatar jvrdom avatar karens avatar keskinbu avatar les-lim avatar m4olivei avatar mayrop avatar mtift avatar obeyer avatar pkailasu avatar renzit avatar romantselepnev-nbcuni avatar sebastianbenz avatar sidkshatriya avatar sirkitree avatar takeit avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

amp-library's Issues

Document how specification representation is generated

As a note to future developers please write a note (maybe at DEVELOPERS.md) on how the AMP specification generation representation (validator-generated.php ) works.

( We have a fork at https://github.com/Lullabot/amphtml which does this )

Fix bug with attribute trigger specs

Attribute trigger specs is a new feature that was implemented in our PHP validator recently.

In protoascii language it looks like this

...
attrs: {
    name: "on"
    trigger: {
      if_value_regex: "tap:.*"
      also_requires_attr: "role"
      also_requires_attr: "tabindex"
    }
  }
...

Essentially, if there is an attribute "on" on a tag, and its value starts with "tap:" then we should also require attributes role and tabindex on that tag.

Now:

  • Re-generate our validator-generated.php and copy it into this repo; this will bring in some attribute specs that use this feature in the validator
  • Fix a small bug exposed by this update
  • Add a test (from the amphtml project) to ensure this does not regress in the future

Code coverage related tweaks

  • Add a basic phpunit.xml configuration file to whitelist the files we want to do code coverage on
  • Find out why ParsedUrlSpecStylesheetErrorAdapter.php does not get any code coverage statistics when it is very obviously being called

Add support for CDATA validation

Add support for CDATA validation with tests. This type of validation supports validation of the contents of tags e.g. this would catch scenarios like the use of !important in css within style tags (which is not allowed). This uses things like a allowed data regex and a blacklist regex

This does not cover all of the scenarios for CSS validation. Some CSS validation will need to wait for more advanced CSS parsing validation (for another ticket).

This is related to the omnibus ticket #35

Improve overall friendliness of AMP Warnings reported

Currently the AMP warnings displayed have line numbers displayed beside them. Line numbers don't have much relevance if the HTML is coming from a rich text editor.

It might be a good idea to provide the HTML of the tag as a "context". The formatting of the message can be made better and more user friendly.

There are a lot more things that can be done here / possibly need to be done here. This will require refactoring of the warnings display/action taken related code.

Here are some general ideas mentioned by marc https://github.com/Lullabot/amp-module/issues/77#issuecomment-189687121 that can serve as an additional background for this ticket.

Duplicate amp component JS in full html

In Drupal 7, I'm seeing an error where the script tags for component JS are being added a second time when full HTML processing is turned on. I'm not seeing this error in Drupal 8. Examples include amp-iframe, amp-twitter, and amp-analytics. In Drupal 7, the tags are initially added to $head which appears towards the top of the head element. In Drupal 8, the tags are output by the JS placeholder, which is just above the main amp script tag. Maybe that has something to do with it? In any case, when full HTML processing is turned off, this doesn't happen, so it's something to do with the AMP library. I don't see any code in the D7 module that is doing this in conjunction with the full HTML pass, so I can only assume it's the library itself.

Make a good test suite

Now that basic testing infrastructure with sample tests is in place (See #11), please make a decent test suite.

After that is done, the update-validator branch can be confidently merged into master. (See PR #38)

Choose between competing errors like canonical validator

When a tag matches a specification then its is correct -- no problems.

When a tag does not satisfy any specifications, then there are competing tag specifications (e.g. amp-audio > track vs audio > track ) from which errors must be presented.

The amp php library should select the errors in a similar way that the canonical validator does. Otherwise, the output of the php validator may be less useful because its has chosen a longer set of errors or a tagspec thats not really the best fit etc.

Update the code so that we match canonical validator.

PHP Notice from the Mastermind HTML5 library

We keep getting this pesky PHP notice. It needs to hunted down and made to go away.

Notice: Undefined property: DOMElement::$data in 
Masterminds\HTML5\Serializer\OutputRules->element() (line 222 of 
/var/www/amp/docroot/vendor/masterminds/html5/src/HTML5/Serializer/OutputRules.php

As far as I can tell, this specific notice does not cause any problems in the AMP HTML output but we still need to try to figure out why it occurs and how to make it go away. Its probably something simple like the code of the html5 library like not doing an isset / !empty check before using the variable.

Should .git folder be downloaded with Composer?

When I installed this via Composer and uploaded the files to a server, I received an error about a git submodule being within my repo. I looked into the amp library files in the repo and found a .git directory within the amp php library directory, which seemed unlike the other libraries in my vendors directory. Once I removed the .git directory, I was able to upload the library to the server without an issue. Not sure if something should be changed in the setup, but wanted to note this.

Change README language regarding Composer

The section on Composer starts as follows:

If you're familiar with composer then you should not have any problems in getting this to work for your PHP project. If you're not familiar with composer then please read up on it before trying to set this up!

Perhaps change that to something like:

Use Composer to get this library working with your PHP project. Not familiar with Composer? Read on to learn more about how to get started.

Saying that setup steps are easy can trip people up when they run into unexpected problems. Not only can a setup step be frustrating, but now they are wondering why this wasn't easy. Better to just simply state what is needed to get started.

Other than that, great README!

Image dimensions for non-absolute URLs

The library should be able to handle conversion of <img> tags to <amp-img> tags even if the URLs are non-absolute. E.g. <img src="/abc/xyz.jpg" /> in that case the image dimensions should be attempted to be retrieved from the http://server-url.com/abc/xyz.jpg. The server url can be ascertained from the $_SERVER super globals.

Similarly it should be able to handle relative paths e.g. <img src="abc/xyz.jpg" /> (note that there is no leading /.

Encoding problems on Ubuntu 12.04 LTS (libxml 2.7.8)

I am using the amp module for Drupal 7 and i'm facing an encoding issue when using that in my hosting environment. I did a lot of research and debugging and i narrowed down the problem to the native DomDocument - implementation.

I already created an issue on d.o for that: https://www.drupal.org/node/2712895

AMP::convertToAmpHtml() calls $qp = QueryPath::withHTML($document, NULL, ['convert_to_encoding' => 'UTF-8']); that results (simplified) to an equivalent of this:

// see QueryPath\DOMQuery::parseXMLString(...)
$content = mb_convert_encoding($content, 'UTF-8', 'auto');
$document->loadHTML($content);

This is where my umlauts get broken in my hosting environment. When i change the lines to

// see QueryPath\DOMQuery::parseXMLString(...)
$content = mb_convert_encoding($content, 'HTML-ENTITIES', 'UTF-8');
$document->loadHTML($content);

Everything works fine.

I think that might be an issue with PHP/libxml? Is that library too old to work with AMP?
phpinfo-dom

Non Drupal PHP project

Hi,

I have a non-Drupal PHP website. It basically generates HTML files on the fly fetching from mysql database. In addition I have over 500 million HTML files that are generated on the fly as well. In case if you wondering what website is he talking about, head over to XYZCalledYou.com.

I am looking for an automated AMP HTML file creation integration.

I was this close to finishing, but converting so many pages can be a nightmare that can't end at all.
In case if you are still wondering, the site never breaches, abuses or violates any of Google's terms of conduct.

I can send my PHP code over to you if you need to examine it.

Thank you,
Esh

Instragram embed size ratio is not correct and leaves large gap under embed

Here's the embed code (note that there is no specified height) copied from this Instagram post:

<blockquote class="instagram-media" data-instgrm-captioned data-instgrm-version="7" style=" background:#FFF; border:0; border-radius:3px; box-shadow:0 0 1px 0 rgba(0,0,0,0.5),0 1px 10px 0 rgba(0,0,0,0.15); margin: 1px; max-width:658px; padding:0; width:99.375%; width:-webkit-calc(100% - 2px); width:calc(100% - 2px);"><div style="padding:8px;"> <div style=" background:#F8F8F8; line-height:0; margin-top:40px; padding:50.0% 0; text-align:center; width:100%;"> <div style=" background:url(); display:block; height:44px; margin:0 auto -44px; position:relative; top:-22px; width:44px;"></div></div> <p style=" margin:8px 0 0 0; padding:0 4px;"> <a href="https://www.instagram.com/p/BFUN97QpznO/" style=" color:#000; font-family:Arial,sans-serif; font-size:14px; font-style:normal; font-weight:normal; line-height:17px; text-decoration:none; word-wrap:break-word;" target="_blank">We&#39;ve still got #VR headsets at the #Lullabot #DrupalCon booth. Stop by and enter a virtual world!</a></p> <p style=" color:#c9c8cd; font-family:Arial,sans-serif; font-size:14px; line-height:17px; margin-bottom:0; margin-top:8px; overflow:hidden; padding:8px 0 7px; text-align:center; text-overflow:ellipsis; white-space:nowrap;">A photo posted by Lullabot (@lullabot) on <time style=" font-family:Arial,sans-serif; font-size:14px; line-height:17px;" datetime="2016-05-12T17:40:05+00:00">May 12, 2016 at 10:40am PDT</time></p></div></blockquote>
<script async defer src="//platform.instagram.com/en_US/embeds.js"></script>

Here is what the library transforms it to:

<amp-instagram layout="responsive" width="400" height="600" data-shortcode="BFUN97QpznO"></amp-instagram>

Here's what it looks like on a non-amp page:

image

Here's the AMP version

image

Make the HTML correction code more sophisticated

One of the objectives of this library is to find and automatically correct AMP HTML standard non-compliance automatically.

So if a style attribute is illegal for a tag, we remove it. If a tag cannot exist in a particular location in the document we remove it and so on.

As mentioned in the README.md this correction is currently basic and can be made a bit more sophisticated.

Modifications to support line numbers within CSS

As discussed in #78 (comment)

The canonical validator points out issues in the CSS on the exact line of the HTML document. In our case the line number we provide is the line number of the containing <style> tag. To provide exact line number within the CSS we might need to make changes to the the PHP parser library ( https://packagist.org/packages/sabberworm/php-css-parser ).

So we need to make two changes:

  • Make changes to our fork ( https://github.com/Lullabot/PHP-CSS-Parser) of sabberworm/php-css-parser to support line numbers. (This could later be submitted as a pull request upstream; but composer allows us to point to our repo if we need to also)
  • Make changes to our amp-library code to support validation error line numbers within the CSS

Run the Full HTML page through the library

We discussed this on the call. We should run the Full HTML page through the library on amp paths on Drupal 8.

While the library will strip out any bad attributes, bad tags and bad property value pairs it does not know how to "add" things that would fix any potential problems. So you have two situations:

  • An AMP page whose Full HTML was processed by the library, compliant in many more ways (but possibly messed up in some ways because Drupal may depend on some of things we removed)
  • An AMP page whose Full HTML was not processed by the library so still has a few non-complying issues.

It may still be worthwhile to do this exercise and see if we're net positive by running the full HTML through the library. And hopefully we should.

cc: @sirkitree @Mdrummond

Add continuous integration

Add continuous integration.

Most probably we can use TravisCI as its hosted and works well with composer and phpunit

Refactor AMP class and fix diffing

The AMP class is getting unwieldy. There is a lot of scope for refactoring code into smaller functions. Do this. Additionally, we have a facility to be able to diff the initial HTML input with the AMPized output HTML. This feature has some issues -- address them.

Provide better HTML5 support and add workaround for line numbers

While working on #62 I encountered problems related to the AMP library's support for HTML5. The main issue is that we're using the inbuilt HTML parser of PHP DOM (libxml version is 2.9.0).

The following does not parse properly for instance:

<audio controls autoplay>
    <p>Your browser does not support the <strong>audio</strong> tag</p>
    <source src="abc.wav" type="audio/wav">
    <source src="xyz.mp3" type="audio/mpeg">
</audio>

(The above HTML actually parses without any fatal errors but the parse tree is not correct).

This HTML needs to be changed to:

<audio controls autoplay>
    <p>Your browser does not support the <strong>audio</strong> tag</p>
    <source src="abc.wav" type="audio/wav" />
    <source src="xyz.mp3" type="audio/mpeg" />
</audio>

Notice that <source> was changed to <source /> so that the PHP dom parser knows that the tag has ended.

We can't expect users to do this while entering HTML. Users should be able to provide valid HTML5 in additional to traditional HTML.

The solution is to not use the default parser but the mastermind/html5-php library to parse the HTML5 to build the DOMDocument. (We're already using this library for HTML output)

The use of mastermind for input is a seemingly small change but causes a huge problem: We cannot point out line numbers when we find validation issues. DOMNode::getLineNo() always returns zero.

I filed an issue on the mastermind/html5-php. Sadly there has been no response on it (yet).

I have been able to make my own workaround though (See filed issue for workaround details).

Task for this ticket:

  • Make the changes in the AMP Library to use the mastermind/html5-php library for input HTML parsing. Add workaround for line numbers.

Add AMP Library statistics in form of an HTML comment in DOM

When the AMP library is run, it would be a good idea to see for performance tracking and understanding (for cache purposes) when a page was processed by the AMP library

  • At what time did the AMP Library process the html fragment or html document (good for understanding when a page was last cached)
  • How much time did the AMP Tag conversion take?
  • How many tags were processed?
  • Was memory_get_peak_usage() different at the start of the script vs at the end of the script?

Implement iframe tag conversion for soundcloud embed

The soundcloud embed code looks something like this:

<iframe width="100%" height="450" scrolling="no" 
frameborder="no" 
src="https://w.soundcloud.com/player/?url=https%3A//api.soundcloud.com/tracks/253267058&amp;auto_play=false&amp;hide_related=false&amp;show_comments=true&amp;show_user=true&amp;show_reposts=false&amp;visual=true"></iframe>

The specifications of the <amp-soundcloud> tag can be found here:
https://www.ampproject.org/docs/reference/extended/amp-soundcloud.html

Write a pass that will detect a soundcloud iframe and convert it into a <amp-soundcloud> tag with the appropriate javascript include.

Expand to the full range of validation such as CSS, Layout, Template, and CDATA

Our AMP HTML validator does not currently validate all aspects of the HTML to make sure it conforms to the AMP HTML standard.

CSS, Layout, Template, CDATA validations are still pending and should be implemented.

This is large task but can be listed under one ticket for now. (I'll create separate tickets if required as we go along).

Fix bugs in child tag validation

  • Child tag validation code is being called from wrong location
  • number of child tags is not being compared properly with number of mandatory child tags

This came to light when run test on:
https://github.com/ampproject/amphtml/blob/27ee29ffc3d809fcc8143044d22df9d176ad8169/extensions/amp-accordion/0.1/test/validator-amp-accordion.html

And comparing output against:
https://github.com/ampproject/amphtml/blob/27ee29ffc3d809fcc8143044d22df9d176ad8169/extensions/amp-accordion/0.1/test/validator-amp-accordion.out

Fix this bug and add this test to the full-html tests directory

Remove workaround for noscript tag (issue with tags being escaped)

In #22 there was a subtle workaround for <noscript> tags due to a problem with the mastermind/html5 library versions < 2.2.0

Now that the upstream library has fixed its bug as of 2.2.0 we need to remove this workaround. (See release 2.2.0 https://github.com/Masterminds/html5-php/releases/tag/2.2.0 . Also Masterminds/html5-php#98 ).

Not removing the workaround and running with 2.2.0 results in the noscript tag having html tags within <noscript> escaped:

i.e. instead of:

<noscript><style amp-boilerplate>body{-webkit-animation:none;-moz-animation:none;-ms-animation:none;animation:none}</style>

We get:
<noscript>&lt;style amp-boilerplate&gt;body{-webkit-animation:none;-moz-animation:none;-ms-animation:none;animation:none}&lt;/style&gt;</noscript>

In other words with the later versions of the masterminds/html5 library the workaround results in a html output issue.

Set up library on Packagist

I started trying to set this up on Packagist and got the following error when I tried:

"Uncaught Exception: [RuntimeException] The HOME or COMPOSER_HOME environment variable must be set for composer to run correctly"

I think we need something added to composer.json to make packagist happy.

Do more comparisons with canonical validator

While porting a subset of the canonical validator to PHP comparisons were made between the output given by the canonical validator and our (ported) validator.

This exercise should happen again with greater rigour and with more sample HTML documents to ensure that the AMP HTML warnings that we report are correct.

Please note that the output will not be the same between the two validators (and that is OK).

Our validator will give:

  • More warnings in certain cases (e.g. the canonical validator will abort as soon as it sees a bad attribute, we report more problems. Our solution is better and helps us do more corrections later on)
  • Less warnings in certain cases as we don't support things like template, layout etc validation

We just need to ensure that our warnings are correct.

This ticket is partially related to #11 but not fully.

Update PHP validator code to latest version of the specification

The AMP HTML specification is continuously changing. New tags are being introduced and the way existing tags are being validated is changing too.

Our PHP validator is based on the canonical javascript validator. We need to need to update our port to tackle non-trivial changes in the way the specification is validated (simple changes should automatically work when validator-generated.php is updated). (Examples of non-trivial changes are changes to the internal classes in validator-generated.php and so forth).

CSS validation (css_spec validation)

(This ticket is part of the omnibus ticket #35 )

In CSS validation we validate the CSS in the amp page. Some of the things that are not allowed are: @imports in your CSS, various rules around url() in your CSS and so forth. Please add functionality to support this CSS validation. See https://github.com/ampproject/amphtml/blob/master/spec/amp-html-format.md#stylesheets for a longer list.

Here is an example of a css_spec section that we have to use:
https://github.com/ampproject/amphtml/blob/dfaa8e1ba556165b24177aa4ec3c1aebdfcbd4f5/validator/validator-main.protoascii#L470

Note: CDATA validation as part of #75 already supports some CSS validation via regexes (e.g. !important is dissallowed and so forth). In this ticket we need to actually parse the CSS and report any problems with it.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.