
Comments (27)

HadrienGardeur avatar HadrienGardeur commented on June 19, 2024

@jccr @danielweck could you provide a few examples of the Readium 1 JSON document?

from architecture.

danielweck avatar danielweck commented on June 19, 2024

Will do.


jccr avatar jccr commented on June 19, 2024

Working on it.


jccr avatar jccr commented on June 19, 2024

@HadrienGardeur:
I was going to make a gist, but I thought this would be better:
https://github.com/readium/readium-2/tree/develop/readium1-data-samples


jccr avatar jccr commented on June 19, 2024

What I added was captured using ReadiumJS only. I did not include the other data that the ReadiumSDK parses yet.


HadrienGardeur avatar HadrienGardeur commented on June 19, 2024

Thanks @jccr for these files. I have a few questions so I'm including @danielweck too:

  • Is runtime-data.js pretty much a JSON dump of the object kept in memory?
  • and static-data.js the equivalent of the document that we're talking about in this issue?
  • I'm surprised that the re-serialization of media overlay is actually contained in the same document (static-data.js), is that really necessary?

For the table of contents itself, this looks very straightforward and would most likely look almost the same if we add it to the Web Publication Manifest.

For media overlays, I would much rather treat it as an additional service (media overlay resolver) and have a link for it that shows up with other services:

"links": [
  {"href": "mo.json", "rel": "http://readium.org/mo-resolver", "type": "application/json"},
  {"href": "search.json?q={searchTerms}", "rel": "search", "type": "application/json", "templated": true}
]
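
To make the "services as links" idea above concrete, here is a minimal sketch of how a client might pick a service out of links by its rel value and expand the templated search href. The helper names (find_link, expand_search) and the search terms are hypothetical, and the simple string substitution only handles the single {searchTerms} variable shown in the example; it is not a general URI template processor.

```python
from urllib.parse import quote

# Links as proposed in the comment above. "templated" is written as a
# boolean here, which is an assumption about the intended value.
links = [
    {"href": "mo.json", "rel": "http://readium.org/mo-resolver", "type": "application/json"},
    {"href": "search.json?q={searchTerms}", "rel": "search", "type": "application/json", "templated": True},
]


def find_link(links, rel):
    """Return the first link object whose rel matches, or None."""
    return next((link for link in links if link.get("rel") == rel), None)


def expand_search(link, terms):
    """Substitute percent-encoded search terms into a templated href."""
    if not link.get("templated"):
        return link["href"]
    return link["href"].replace("{searchTerms}", quote(terms, safe=""))


search_url = expand_search(find_link(links, "search"), "white whale")
mo_url = find_link(links, "http://readium.org/mo-resolver")["href"]
print(search_url)
print(mo_url)
```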


jccr avatar jccr commented on June 19, 2024

@HadrienGardeur
Yes, runtime-data is a JSON-like dump of the objects in memory. My thinking is that it will be useful to have a reference to this as we move forward.
As for static-data.json, yes, it's the Readium 1 JSON document, the product of this call: https://github.com/readium/readium-js/blob/develop/js/epub-model/package_document_parser.js#L48


HadrienGardeur avatar HadrienGardeur commented on June 19, 2024

@jccr thanks for the clarification.

Regarding runtime-data, I think the major difference is that with Readium 2 we're targeting platform-specific code, which means that we probably won't have (or need) a universal representation.

It might be necessary to handle a similar object in JS for some modules, but IMO it's not a requirement.

For the ToC example, I see a single list. What happens in Readium-1 when there are multiple lists available (navigation, list of illustrations, guided navigation)?


HadrienGardeur avatar HadrienGardeur commented on June 19, 2024

I'm still waiting on additional info regarding Readium-1, but in the meantime, here's my proposal for handling ToC and other lists in the manifest:

  • we'll introduce a number of new collection roles, in a new extension that will cover the following elements from the EPUB structural semantic vocabulary: toc, page-list, landmarks, loi, loa, lov and lot
  • we'll also introduce a new key for the link object that can contain other link objects: children

Here's an example using toc:

"toc": [
  {
    "href": "pr01.xhtml",
    "title": "Preface",
    "children": [
      {"href": "pr01.xhtml#I_sect1_d1e137","title": "Conventions Used in This Book"}, 
      {"href": "pr01s02.xhtml", "title": "Using Code Examples"}, 
      {"href": "pr01s03.xhtml", "title": "Safari® Books Online"}, 
      {"href": "pr01s04.xhtml", "title": "How to Contact Us"}, 
      {"href": "pr01s05.xhtml", "title": "Acknowledgments"}
    ]
  },
  {"href": "ch01.xhtml", "title": "1. Introduction"}
]

The only real limitation with this design is that we can't have more than one collection of the same type, but according to the EPUB spec, you're not supposed to have more than one toc, landmarks or page-list anyway.

The syntax should also remain compatible with Readium-1 from what I've seen so far (I'm just not 100% sure how the different nav types work in Readium-1).

Since I've already made a proposal for handling media overlay in the Web Publication Manifest, I think that pretty much covers the current gap between Web Publications and the current JSON documents in use in Readium-1.
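
As an illustration of the proposed children key, the nested toc from the comment above can be walked with a short recursive generator. This is only a sketch: the toc data is a shortened copy of the example, and the walk function is a hypothetical helper, not part of any Readium module.

```python
# Shortened copy of the toc example above.
toc = [
    {
        "href": "pr01.xhtml",
        "title": "Preface",
        "children": [
            {"href": "pr01.xhtml#I_sect1_d1e137", "title": "Conventions Used in This Book"},
            {"href": "pr01s02.xhtml", "title": "Using Code Examples"},
        ],
    },
    {"href": "ch01.xhtml", "title": "1. Introduction"},
]


def walk(items, depth=0):
    """Yield (depth, title, href) for every link object, parents before children."""
    for item in items:
        yield depth, item["title"], item["href"]
        # Link objects without a "children" key are leaves.
        yield from walk(item.get("children", []), depth + 1)


flat = list(walk(toc))
for depth, title, href in flat:
    print("  " * depth + f"{title} -> {href}")
```

Because every level uses the same link object shape, the same function would work unchanged for the other proposed collections (page-list, landmarks, loi and so on).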


danielweck avatar danielweck commented on June 19, 2024

@HadrienGardeur please take a look at this Google Doc:
https://docs.google.com/document/d/1PbbJIUtDDsLTyOKcx9WkLK-IOANho_18qziLL85UxhI

A couple of remarks:
  • In readium-shared-js, the reader.openBook(json) function is effectively the "main entry point" for the rendering / layout engine common to both native ReadiumSDK apps and ReadiumJS apps. The JSON parameter therefore adheres to a shared syntax / schema (as described in the above document). This JSON structure carries information about the parsed EPUB (ordered list of spine items, SMIL / Media Overlays), but crucially lacks information about the navigation document (table of contents, list of pages / landmarks, etc.). This is currently not "standardized" within Readium-1, so SDK / JS apps implement their own parsing logic.
  • In Readium-2, I would suggest not aiming for any backwards compatibility with Readium-1's internal EPUB representation. We have an opportunity to design from the ground up, using a more cohesive / coherent approach. Should we need Readium-1/2 interoperability in the interim (for example, to re-use existing software modules), we would easily be able to implement data conversion utilities.
  • I think that @HadrienGardeur's proposed JSON syntax is a good candidate for a next-gen eBook / web-publication serialization format, and I think it would make sense as a Readium-specific interchange data format too (i.e. the internal output of the "publication parser" module).


HadrienGardeur avatar HadrienGardeur commented on June 19, 2024

@danielweck thanks, this covers additional interactions that I didn't have access to before.

Overall I agree with your comment, backward compatibility will only be a bonus here, not a goal.
In the case of the navigation document and its various lists, it's indeed a bonus:

  • link objects in the Web Publication Manifest already use href and title
  • children could be useful elsewhere, for example guided navigation for comics or in OPDS

Compared to what's in Readium-1, the Web Publication Manifest will be less "runtime ready" but also far more structured and extensible, for instance:

  • the metadata section will cover all EPUB metadata, not a limited subset
  • links provide an easy way to expose a number of additional services available for a publication
  • in the example from Readium-1 that I've seen, the spine is clearly exposed but not the manifest. In the Web Publication Manifest, items that show up in the spine are separated from the rest (resources). Knowing about these other resources in advance could help for a number of things (caching for example).
  • the manifest itself can be easily extended using new collection roles, rel values or properties on link objects
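
The caching remark above can be sketched as follows, assuming a manifest that keeps reading-order items under spine and everything else under resources. The file names and the cache_list helper are hypothetical; the de-duplication is purely defensive, since the two collections are meant to be disjoint.

```python
# Hypothetical manifest fragment: spine holds the reading order,
# resources holds everything else the publication needs.
manifest = {
    "spine": [
        {"href": "chapter1.html", "type": "text/html"},
        {"href": "chapter2.html", "type": "text/html"},
    ],
    "resources": [
        {"href": "style.css", "type": "text/css"},
        {"href": "cover.jpg", "type": "image/jpeg"},
    ],
}


def cache_list(manifest):
    """Return hrefs to prefetch: reading order first, then the remaining resources."""
    seen = set()
    ordered = []
    for section in ("spine", "resources"):
        for link in manifest.get(section, []):
            href = link["href"]
            if href not in seen:  # defensive: skip anything listed twice
                seen.add(href)
                ordered.append(href)
    return ordered


to_cache = cache_list(manifest)
print(to_cache)
```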


iherman avatar iherman commented on June 19, 2024

On 2 Nov 2016, at 11:18, Hadrien Gardeur wrote:

@danielweck thanks, this covers additional interactions that I didn't have access to before.

Overall I agree with your comment, backward compatibility will only be a bonus here, not a goal.
In the case of the navigation document and its various lists, it's indeed a bonus:

link objects in the Web Publication Manifest already use href and title

Just to avoid misunderstandings for an outside reader: "Web Publication Manifest" does not exist yet. If everything goes as planned, then we may end up setting up a WG sometime in 2017 and, most probably, defining a Web Publication Manifest will be part of the work. But no decision has been taken on any of its content as of now, and we should not give the impression that any decision has been taken.

As a related issue/comment: I very much hope that (a) such a WG will be set up and we will work on this in earnest and (b) that Readium may play a leading role as an implementer/proof-of-concept tester. Meaning that if the internal structure of Readium-2 is such that the strictly EPUB3-specific parts are well separated from the rendering, etc., parts, that would be a win from that perspective…


HadrienGardeur avatar HadrienGardeur commented on June 19, 2024

@iherman I've been using the term "Web Publication Manifest" for some time now to reference the post-EPUB BFF work that I've been doing. At the time, PWP was only referenced as such and was not really working on a manifest (as far as I can tell, this is still the case).

For this particular discussion, we're talking about https://github.com/HadrienGardeur/webpub-manifest, not the future specification from the DPUB WG.

Given the schedule for Readium-2 (planning in 2016, development in 2017), it's quite likely that the DPUB WG variant won't be ready in time for the first version, but the current architecture being discussed here should be flexible enough to adopt another manifest format in the future.


iherman avatar iherman commented on June 19, 2024

On 2 Nov 2016, at 11:46, Hadrien Gardeur wrote:

@iherman I've been using the term "Web Publication Manifest" for some time now to reference the post-EPUB BFF work that I've been doing. At the time, PWP was only referenced as such and was not really working on a manifest (as far as I can tell, this is still the case).

O.k., no problem. We will have to be careful about terminology, I just wanted to avoid any kind of misunderstanding (not between you and me, but with any third-party observer).

For this particular discussion, we're talking about https://github.com/HadrienGardeur/webpub-manifest, not the future specification from the DPUB WG.

Which, at some point, should converge, but that is for another discussion:-)

Given the schedule for Readium-2 (planning in 2016, development in 2017), it's quite likely that the DPUB WG variant won't be ready in time for the first version, but the current architecture being discussed here should be flexible enough to adopt another manifest format in the future.

Great. I think that is really the important point…

Thx


HadrienGardeur avatar HadrienGardeur commented on June 19, 2024

To continue our discussions from the weekly call, we need to decide:

  • if the ToC and associated navigation lists are embedded in the manifest or accessible through a separate call (exposed in the manifest using a link)
  • same thing for the "media overlay resolver"

I think it might also be worth starting a separate issue for MO, where we can discuss what the output of the MO resolver should be.

@danielweck @rkwright @jccr @dmitrym0 thoughts?


rkwright avatar rkwright commented on June 19, 2024

@HadrienGardeur Agreed about a new issue for MO and the manifest.


danielweck avatar danielweck commented on June 19, 2024

I'm not fully convinced on this yet, but ... I feel that navigation "superstructures" (i.e. centralized sets of links that target several documents within a single publication) such as the hierarchical table of contents, and various semantic lists (e.g. page breaks, illustrations, figures, generic landmarks, etc.) should be available in the main publication "manifest", thereby requiring no level of (fetch) indirection. A single HTTP request, or a single call to the EPUB parsing module, would return what a typical content processor would consider "core" information about the publication. Just like in EPUB3's OPF, sophisticated metadata would be available separately too.
I should point out that in this model, I also envision the possibility of richer styled markup alternative to the core TOC, not dissimilar to EPUB3's dual-purpose Navigation Document (which is both "raw" microdata, and regular HTML5 suitable for direct rendering), except that the "raw" navigation data would be encoded as JSON (perhaps with limited localization / internationalization capabilities, compared to HTML).
Media Overlays (SMIL text-audio synchronization data) should definitely be exposed through a link, otherwise there would be too much "noise" in the main publication manifest. This extra level of indirection will cause no hardship for reading system implementations. Readium-2's own MO engine would query this data in a separate pass, after the main "loading" of the publication's core information.


HadrienGardeur avatar HadrienGardeur commented on June 19, 2024

@danielweck do we have any idea of what the impact is in terms of performance if we include these navigation collections in the manifest?

We've discussed this during the call and while everyone seemed to agree that the MO should be a separate call, there was no clear consensus regarding navigation.

For the "richer styled markup alternative to the core TOC", this can still show up in the manifest in either spine or resources and can be clearly identified using the contents rel value.


danielweck avatar danielweck commented on June 19, 2024

Well, I can see how the expression of a "tree of links, with language/locale-identified textual labels" could become quite verbose when using JSON, so this is worth keeping in mind.


HadrienGardeur avatar HadrienGardeur commented on June 19, 2024

I've tweaked one of the examples from the Web Publication Manifest draft and added a few things to it:

  • a new toc collection
  • media-overlays (SMIL + audio) are available in resources along with a MO resolver link in links
  • self link points to a local HTTP server
  • there's also a link to search for specific terms in the book, which will return locators

I've followed @jccr and pushed this example to the develop branch at https://github.com/readium/readium-2/blob/develop/examples/manifest.json

For now there's no equivalent to the media-overlay attribute of the item element in EPUB. I don't know if this is truly necessary or not.
If it is needed even with the resolver (@danielweck ?), it could be handled using a new value for properties that directly points to the location of a SMIL file:

{"href": "html/c002.html", "type": "text/html", "properties": ["media-overlay: chapter2_audio.smil"]}


HadrienGardeur avatar HadrienGardeur commented on June 19, 2024

I've just created a new issue on the Web Publication Manifest repo, with a focus this time on link properties: HadrienGardeur/webpub-manifest#1

EPUB 3.x has a long list of properties that I'd like to avoid in the streamer, replacing them with a simpler model.

Thoughts? cc @danielweck @rkwright @llemeurfr


danielweck avatar danielweck commented on June 19, 2024

+1 for allowing an HTML link item to reference its associated SMIL media overlays, as this improves readability and discoverability (reading system implementations do not have to scan the MO dataset in order to find the matching Media Overlay based on ID or, god forbid, href/path).
We could use the properties group but, just like image width/height, perhaps this should be a first-class attribute?


danielweck avatar danielweck commented on June 19, 2024

I would recommend a duration attribute (rather than a generic item in the properties group) on SMIL and MP3 resource items.


HadrienGardeur avatar HadrienGardeur commented on June 19, 2024

duration can and should definitely be used for both SMIL and MP3 references in resources.

For an HTML link referencing a SMIL media overlay, here's an example of what I had in mind:

{
  "href": "chapter1.html",
  "type": "text/html",
  "properties": {"media-overlay": "chapter1.smil"}
}
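
A reading system could consume this shape roughly as sketched below: look up the spine item, read its media-overlay property, and resolve it against resources to get the SMIL entry and its duration. Everything except the chapter1.html link object is an assumption made for illustration: the overlay_for helper, the chapter2.html and chapter1.mp3 entries, the duration values, and the choice of seconds as the unit.

```python
# Spine item taken from the example above, plus a hypothetical item without an overlay.
spine = [
    {"href": "chapter1.html", "type": "text/html", "properties": {"media-overlay": "chapter1.smil"}},
    {"href": "chapter2.html", "type": "text/html"},
]

# Hypothetical resources; duration is assumed to be expressed in seconds.
resources = [
    {"href": "chapter1.smil", "type": "application/smil+xml", "duration": 1403.5},
    {"href": "chapter1.mp3", "type": "audio/mpeg", "duration": 1403.5},
]


def overlay_for(href, spine, resources):
    """Return the resource entry for the SMIL overlay of a spine item, or None."""
    item = next((i for i in spine if i["href"] == href), None)
    if item is None:
        return None
    smil_href = item.get("properties", {}).get("media-overlay")
    if smil_href is None:
        return None
    return next((r for r in resources if r["href"] == smil_href), None)


overlay = overlay_for("chapter1.html", spine, resources)
print(overlay["href"], overlay["duration"])
```

Note that this lookup matches on href strings, which is exactly the path-matching approach whose pitfalls are raised later in the thread.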


danielweck avatar danielweck commented on June 19, 2024

I'm not sure I like the "resolver" term to define a URL / link that dereferences to a JSON file that represents all or parts of the publication's media overlays.
Wording aside, I think it would be nice if the proposed model / query API allowed fetching the entire (aggregated) set of media overlays / SMIL files, or only partial data sufficient for playback of one given HTML file (for example). The design challenge is that a given HTML file can only reference a single SMIL file, but a given SMIL file may reference multiple HTML files.
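
The one-to-many relationship described above can be sketched as a pair of indexes built from the spine, assuming the media-overlay property form proposed earlier in the thread. The file names and the index_overlays helper are hypothetical; a real query API would also have to return the overlay data itself, which is out of scope here.

```python
from collections import defaultdict

# Hypothetical spine: two chapters share one SMIL file, one item has no overlay.
spine = [
    {"href": "chapter1.html", "properties": {"media-overlay": "part1.smil"}},
    {"href": "chapter2.html", "properties": {"media-overlay": "part1.smil"}},
    {"href": "chapter3.html", "properties": {"media-overlay": "part2.smil"}},
    {"href": "colophon.html"},
]


def index_overlays(spine):
    """Build both directions of the HTML <-> SMIL relationship."""
    html_to_smil = {}                  # each HTML file has at most one SMIL file
    smil_to_html = defaultdict(list)   # each SMIL file may cover several HTML files
    for item in spine:
        smil = item.get("properties", {}).get("media-overlay")
        if smil is not None:
            html_to_smil[item["href"]] = smil
            smil_to_html[smil].append(item["href"])
    return html_to_smil, dict(smil_to_html)


html_to_smil, smil_to_html = index_overlays(spine)
print(html_to_smil)
print(smil_to_html)
```

With both indexes available, a "partial" request for one HTML file maps to a single SMIL file, while the aggregated set is simply the keys of smil_to_html.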


danielweck avatar danielweck commented on June 19, 2024

Regarding the media-overlay property, parsers prefer a strong reference to a resource ID, but humans prefer to see the SMIL file relative path instead. Path matching can be tricky at authoring and other-kind-of-processing time, because of URL/URI and Unicode escapes, normalization / canonicalisation, etc. This is why I would personally prefer cross-referencing using an ID / IDREF mechanism, but I am also concerned about the readability pitfalls (see the metadata refines precedent, or even just EPUB's spine - manifest item cascade).
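
To show why path matching is tricky, here is a small sketch that reduces two differently written hrefs to the same comparable form by dropping the fragment, decoding percent-escapes, applying Unicode NFC normalization and resolving dot segments. The canonical_path helper and the sample paths are hypothetical, and this is not a complete URL canonicalization (case folding, absolute URLs and so on are ignored).

```python
import posixpath
import unicodedata
from urllib.parse import unquote, urlsplit


def canonical_path(href, base_dir=""):
    """Reduce a relative href to a comparable path string."""
    path = unquote(urlsplit(href).path)         # drop query/fragment, decode %XX escapes
    path = unicodedata.normalize("NFC", path)   # unify composed and decomposed characters
    return posixpath.normpath(posixpath.join(base_dir, path))  # resolve "." and ".."


# Percent-encoded, with a dot segment and a fragment.
a = canonical_path("smil/../audio/caf%C3%A9.smil#par1")
# Same file name written with a decomposed "e" + combining acute accent.
b = canonical_path("audio/cafe\u0301.smil")
print(a == b)
```

The two inputs differ byte for byte yet name the same file, which is the kind of mismatch an ID-based reference avoids and a path-based one has to normalize away.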


HadrienGardeur avatar HadrienGardeur commented on June 19, 2024

@danielweck we should move the "resolver" discussions to a new issue dedicated to the syntax of the JSON output.

For the media-overlay, id/idref is something that we absolutely want to avoid in the Web Publication Manifest. Even EPUB 3.1 is moving away from this by dropping refines in metadata.

