Comments (5)
Hi Emma
The parsing rules I am following here are:
If a property (p-name) is empty, do not add it to the output. In this case "empty" means containing no non-whitespace text. As far as I know, there is no guidance on how to handle "empty" properties in the microformats parsing rules, so I followed the convention of JSON APIs of not returning "empty" properties.
The side effect of the above comes from p-name also having a number of "implied rules". The "implied rules" try to automatically fill in properties like p-name if there is no defined value. Because the empty p-name is dropped from the output, the implied rule runs, and in your example it uses the full text content of the parent h-entry.
I can see why the resulting different outputs would seem a little unexpected.
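To make the interaction concrete, here is a rough sketch of the two rules working on plain strings. It is only an illustration, not the parser's actual code: the function name parseNameOldBehaviour and the sample text are made up for this example.

```javascript
// Hypothetical helper, not part of microformat-node: models the two rules
// described above on plain strings instead of a real DOM.
function parseNameOldBehaviour(explicitName, parentText) {
  const properties = {};

  // Rule 1: an explicit p-name with no non-whitespace text counts as
  // "empty" and is left out of the output.
  const hasText = typeof explicitName === 'string' && explicitName.trim() !== '';
  if (hasText) {
    properties.name = [explicitName.trim()];
  }

  // Rule 2: with no name in the output, the implied rule fills it in
  // from the text content of the parent element.
  if (!properties.name) {
    properties.name = [parentText.trim()];
  }
  return properties;
}

const withText = parseNameOldBehaviour('My note', 'My note posted 2015');
const withEmpty = parseNameOldBehaviour('', 'posted 2015');
console.log(withText.name);  // [ 'My note' ]
console.log(withEmpty.name); // [ 'posted 2015' ]
```

The second call shows the surprising case: the author declared an empty name, yet the output carries the parent's text as the name.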
This is not really a bug, but it raises valid questions about how the parsing rules should work:
- Should I return empty properties if the author of the HTML has added the property p-name to an element with no text content?
- Should I execute the "implied rules" where there is an author-defined "empty" property?
I am going to have to post this issue to the microformats IRC channel and see if we can define the rules a bit more clearly for your use case. Once we have an agreed approach I can update the parser.
Personally, I would recommend that any author of microformats always add a p-name with some text to every h-*. Also, with my parser, try setting the options to {'textFormat': 'normalised'}; you may find the resulting text more useful.
from microformat-node.
Thanks for clarifying. It sounds like the implied property rule breaks the "note type algorithm" and recommended practice of including a p-name in notes, if the note happens to be a photo with no text. Should this be documented on http://microformats.org/wiki/microformats2-parsing-issues?
from microformat-node.
I have added this problem to http://microformats.org/wiki/microformats2-parsing-issues. Sorry it's taken a little while, but I wanted to go through my code carefully to make sure it was not an issue with my parser.
Please feel free to add your own view on how this should be dealt with to the wiki page.
from microformat-node.
Interesting question! mf2py and php-mf2 will both happily include empty string values in the output and not generate an implied name. Added comments/votes to the wiki.
from microformat-node.
Hi Emma, it's been a while, but your view of what the output should be was agreed and I have now updated all my JavaScript parser code. You can try out the HTML from your example at http://glennjones.net/tools/microformats and the p-name should now be returned as an empty string.
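As a rough sketch of the agreed behaviour as I read it from this thread (again on plain strings, with a made-up function name, and not the parser's real code): an author-declared p-name is kept even when empty, and the implied rule only runs when no p-name was declared at all.

```javascript
// Hypothetical helper, not part of microformat-node.
// explicitName is undefined when the author declared no p-name at all.
function parseNameAgreedBehaviour(explicitName, parentText) {
  // An author-declared p-name is always kept, even as an empty string,
  // so the implied rule is skipped.
  if (explicitName !== undefined) {
    return { name: [explicitName.trim()] };
  }

  // No p-name declared: fall back to the implied name from the parent text.
  return { name: [parentText.trim()] };
}

const declaredEmpty = parseNameAgreedBehaviour('', 'posted 2015');
const notDeclared = parseNameAgreedBehaviour(undefined, 'posted 2015');
console.log(declaredEmpty.name); // [ '' ]
console.log(notDeclared.name);   // [ 'posted 2015' ]
```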
from microformat-node.