Helix Pipeline

This project provides helper functions and default implementations for creating Hypermedia Processing Pipelines.

It uses reducers and continuations to create a simple processing pipeline that can pre- and post-process HTML, JSON, and other hypermedia.

Helix Markdown

The Helix Pipeline supports some Markdown extensions.

Anatomy of a Pipeline

A pipeline consists of the following main parts:

  • pre-processing functions.
  • the main response generating function.
  • an optional wrapper function.
  • post-processing functions.
  • error handling functions.

Each step of the pipeline processes a single context object, which is eventually used to generate the HTTP response.

See below for the anatomy of the context.

Typically, there is one pipeline for each supported content type, and pipelines are identified by file name, e.g.:

  • html.pipe.js – creates HTML documents with the text/html content-type
  • json.pipe.js – creates JSON documents with the application/json content-type

Building a Pipeline

A pipeline builder can be created by writing a CommonJS module that exports a function pipe, which accepts the following arguments and returns a Pipeline function:

  • cont: the main function that will be executed as a continuation of the pipeline.
  • context: the context (formerly known as payload) that is accumulated during the pipeline.
  • action: the action that serves as a holder for extra pipeline invocation arguments.

This project's main entry point provides a function for pipeline construction along with a few helper functions, so that a basic pipeline can be constructed like this:

// the pipeline itself
const pipeline = require("@adobe/hypermedia-pipeline");

module.exports.pipe = function(cont, context, action) {
    action.logger.debug("Constructing Custom Pipeline");

    return pipeline()
        .use(adjustContent)   // a custom pre-processing step
        .use(cont)            // execute the continuation function
        .use(cleanupContent); // a custom post-processing step
};

In a typical pipeline, you will add additional processing steps as .use(require('some-module')).
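
For example, assuming a hypothetical some-module that exports a pipeline-compatible function:

return pipeline()
    .use(require('some-module')) // hypothetical module exporting (context, action) => context
    .use(cont);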

The Main Function

The main function is typically a pure function that converts the request and content properties of the context into a response object.

In most scenarios, the main function is compiled from a template in a templating language like HTL, JST, or JSX.

Typically, there is one template (and thus one main function) for each content variation of the file type. Content variations are identified by a selector (the piece of the file name before the file extension, e.g. in example.navigation.html the selector would be navigation). If no selector is provided, the template is the default template for the pipeline.

Examples of possible template names include:

  • html.jsx (compiled to html.js) – default for the HTML pipeline.
  • html.navigation.jst (compiled to html.navigation.js) – renders the navigation.
  • dropdown.json.js (not compiled) – creates pure JSON output.
  • dropdown.html.htl (compiled to dropdown.html.js) – renders the dropdown component.

(Optional) The Wrapper Function

Sometimes it is necessary to pre-process the context in a template-specific fashion. This wrapper function (often called "Pre-JS" for brevity's sake) allows the full transformation of the pipeline's context.

Compared to the pipeline-specific pre-processing functions which handle the request, content, and response, the focus of the wrapper function is implementing business logic needed for the main template function. This allows for a clean separation between:

  1. presentation (in the main function, often expressed in declarative templates).
  2. business logic (in the wrapper function, often expressed in imperative code).
  3. content-type specific implementation (in the pipeline, expressed in functional code).

A simple implementation of a wrapper function would look like this:

// All wrapper functions must export `pre`
// The function takes the following arguments:
// - `cont` (the continuation function, i.e. the main template function)
// - `context` (the context of the pipeline)
// - `action` (the action of the pipeline)
module.exports.pre = (cont, context, action) => {
    const { request, content, response } = context;

    // modify the context content before invoking the main function
    content.hello = 'World';
    const modifiedcontext = { request, content, response };

    // invoke the main function with the new context, capturing the resulting
    // context for further modification
    const responsecontext = cont(modifiedcontext, action);

    // add a value to the context response
    responsecontext.response.hello = 'World';

    return responsecontext;
};

Pre-Processing Functions

Pre-Processing functions are meant to:

  • parse and process request parameters.
  • fetch and parse the requested content.
  • transform the requested content.
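
For instance, the adjustContent step referenced in the pipeline builder example above might look like this (a minimal sketch; the property and parameter names are made up):

// hypothetical pre-processing step: copy a request parameter into the content
function adjustContent(context, action) {
    context.content = context.content || {};
    context.content.requestedTitle = context.request.params.title;
    return context;
}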

Post-Processing Functions

Post-Processing functions are meant to:

  • process and transform the response.
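
Correspondingly, the cleanupContent step from the builder example could be sketched like this (the header name and value are illustrative):

// hypothetical post-processing step: add a caching header to the response
function cleanupContent(context, action) {
    context.response.headers = context.response.headers || {};
    context.response.headers['Cache-Control'] = 'max-age=300';
    return context;
}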

Error Handlers

By default, the pipeline processes all normal functions and skips error handlers (.error()). The pipeline is in an error state if context.error is defined, which can happen when a processing function throws an exception or sets the context.error object directly. While in the error state, normal processing functions are suspended until the end of the pipeline is reached or the error state is cleared, and error handlers are executed instead.

Example:

new pipeline()
  .use(doSomething)
  .use(render)
  .use(cleanup)
  .error(handleError)
  .use(done);

If, in the example above, doSomething causes an error, then render and cleanup will not be invoked, but handleError will. If handleError clears the error state (i.e. sets context.error = null), normal processing resumes and the done function will be invoked.

If none of the functions in the example causes an error, handleError will never be invoked.
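
A minimal handleError sketch, assuming the error and response shapes described in "Anatomy of the Context" below:

// format the error as a response and clear the error state
function handleError(context, action) {
    context.response = {
        status: 500,
        body: `<h1>Error</h1><p>${context.error.message}</p>`,
    };
    context.error = null; // resume normal processing
    return context;
}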

Extension Points

In addition to the (optional) wrapper function, which can be invoked prior to the once function (the main response-generating function), pipeline creators can expose named extension points. These extension points allow users of a pipeline to inject additional functions that will be called right before, right after, or instead of an extension point. To keep the extension points independent from the implementation (i.e. the name of the function), pipeline authors should use the expose(name) function to expose a particular extension point.

Example:

new pipeline()
  .use(doSomething).expose('init')
  .use(render)
  .use(cleanup).expose('cleanup')
  .use(done);

In this example, two extension points, init and cleanup, have been defined. Note how the name of the extension point can be the same as the name of the function (i.e. cleanup), but does not have to be (i.e. init vs. doSomething).

Common Extension Points

The creation of extension points is the responsibility of the pipeline author, but in order to standardize extension points, the following common names have been established:

  • fetch for the pipeline step that retrieves raw content, e.g. a Markdown document.
  • parse for the pipeline step that parses the raw content and transforms it into a document structure such as a Markdown AST.
  • meta for the pipeline step that extracts metadata from the content structure.
  • html for the pipeline step that turns the Markdown document into a (HTML) DOM.
  • esi for the pipeline step that scans the generated output for ESI markers and sets appropriate headers.

Using Extension Points

The easiest way to use extension points is by expanding on the Wrapper Function described above. Instead of just exporting a pre function, the wrapper can also export:

  • a before object.
  • an after object.
  • a replace object.

Each of these objects can have keys that correspond to the named extension points defined for the pipeline.

Example:

module.exports.before = {
  html: (context, action) => {
    // will get called before the "html" pipeline step
  }
}

module.exports.after = {
  fetch: (context, action) => {
    // will get called after the "fetch" pipeline step
  }
}

module.exports.replace = {
  meta: (context, action) => {
    // will get called instead of the "meta" pipeline step
  }
}

All functions using the before and after extension points need to follow the same interface as all other pipeline functions, i.e. they have access to context and action, and they should return a modified context object.
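
For example, a before function that adjusts the request before the fetch step might look like this (the path rewrite is hypothetical):

module.exports.before = {
  fetch: (context, action) => {
    // hypothetical: request the Markdown source instead of the HTML path
    context.request.params.path = context.request.params.path.replace(/\.html$/, '.md');
    return context;
  }
};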

A more complex example of using these extension points to implement custom markdown content nodes and handle 404 errors can be found in the helix-cli integration tests.

Anatomy of the Context

The following main properties exist:

  • request
  • content
  • response
  • error

also see context schema.

The request object

  • params: a map of request parameters.
  • headers: a map of HTTP headers.

also see request schema.

The content object

  • body: the unparsed content body as a string.
  • mdast: the parsed Markdown AST.
  • meta: a map of metadata properties, including:
    • title: title of the document.
    • intro: a plain-text introduction or description.
    • type: the content type of the document.
    • image: the URL of the first image in the document.
  • document: a DOM-compatible Document representation of the (HTML) document (see below).
  • sections[]: The main sections of the document, as an enhanced MDAST (see below).
  • html: a string of the content rendered as HTML.
  • children: an array of top-level elements of the HTML-rendered content.

also see content schema.

content.document in Detail

For developers who prefer using the rendered HTML over the input Markdown AST, content.document provides a representation of the rendered HTML that is API-compatible with the window.document object you would find in a browser.

The most common way of using it is probably calling content.document.innerHTML, which gives the full HTML of the page, but other functions like:

  • content.document.getElementsByClassName
  • content.document.querySelector
  • content.document.querySelectorAll

are also available. Please note that some functions like:

  • content.document.getElementsByClassName
  • content.document.getElementById

are less useful, because the HTML generated by the default pipeline does not inject class name or ID attributes.
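
For example, collecting all image URLs from the rendered document in a wrapper function could look like this (a sketch):

// gather the src attribute of every image in the rendered document
const imageUrls = Array.from(content.document.querySelectorAll('img'))
  .map((img) => img.src);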

The tooling for generating (Virtual) DOM nodes from Markdown AST is made available as a utility class, so that it can be used in custom pre.js scripts; it is described below.

content.sections in Detail

The default pipeline extracts sections from a Markdown document, using both "thematic breaks" like *** or --- and embedded YAML blocks as section markers. If no sections can be found in the document, the entire content.mdast will be identical to content.sections[0].

content.sections is an array of section nodes, with type (String) and children (array of Node) properties. In addition, each section has a types attribute, which is an array of derived content types. Project Helix (and Hypermedia Pipeline) uses implied typing over declared content typing: it is not the task of the author to explicitly declare the content type of a section or document; rather, the template interprets the contents of a section to understand the type of content it is dealing with.

The types property is an array of string values that describes the type of the section based on the occurrence of child nodes. This makes it easy to copy the value of types into the class attribute of an HTML element, so that CSS expressions matching types of sections can be written with ease. The following patterns of type values can be found:

  • has-<type>: for each type of content that occurs at least once in the section, e.g. has-heading
  • has-only-<type>: for sections that only have content of a single type, e.g. has-only-image
  • is-<type-1>-<type-2>-<type-3>, is-<type-1>-<type-2>, and is-<type-1> for the top 3 most frequent types of children in the section. For instance, a gallery with a heading and description would be is-image-text-heading. You can infer additional types using utils.types.
  • nb-<type>-<occurrences>: the number of occurrences of each type in the section.
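
For illustration, a section consisting of one heading followed by three images might, under the rules above, carry a types array like this (values assumed):

[
  "has-heading",
  "has-image",
  "is-image-heading",
  "nb-heading-1",
  "nb-image-3"
]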

Each section has additional content-derived metadata properties, in particular:

  • title: the value of the first headline in the section.
  • intro: the value of the first paragraph in the section.
  • image: the URL of the first image in the section.
  • meta: the parsed YAML metadata of the section (as an object).

The response object

  • body: the unparsed response body as a string.
  • headers: a map of HTTP response headers.
  • status: the HTTP status code.

also see response schema.

The error object

This object is only set when there has been an error during pipeline processing. Any step in the pipeline may set the error object. Subsequent steps should simply skip any processing if they encounter an error object.

Alternatively, steps can attempt to handle the error object, for instance by generating a formatted error message and leaving it in response.body.

The only known property in error is:

  • message: the error message.
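
A sketch of a pipeline step that respects the error state:

function mystep(context, action) {
  // skip all processing while the pipeline is in an error state
  if (context.error) {
    return context;
  }
  // ... normal processing ...
  return context;
}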

Utilities

Generate a Virtual DOM with utils.vdom

VDOM is a helper class that transforms MDAST Markdown into DOM nodes using customizable matcher functions or expressions.

It can be used in scenarios where:

  • you need to represent only a section of the document in HTML.
  • you have made changes to content.mdast and want them reflected in HTML.
  • you want to customize the HTML output for certain Markdown elements.

Getting Started

In most cases, you can simply access a pre-configured transformer from the action.transformer property. It is available in the pipeline after the parse step and used in the html step, so after.parse and before.html are the ideal extension points for adjusting the generated HTML.

Alternatively, load the VDOM helper through:

const VDOM = require('@adobe/helix-pipeline').utils.vdom;

Simple Transformations

content.document = new VDOM(content.mdast).getDocument();

This replaces content.document with a re-rendered representation of the Markdown AST. It can be used when changes to content.mdast have been made.

When using action.transformer, this manual invocation is not required.

content.document = new VDOM(content.sections[0]).getDocument();

This uses only the content of the first section to render the document.

Matching Nodes

Nodes in the Markdown AST can be matched in two ways: either using a select-statement or using a predicate function.

action.transformer.match('heading', () => '<h1>This text replaces your heading</h1>');
content.document = vdom.getDocument();

Every node with the type heading will be rendered as <h1>This text replaces your heading</h1>.

action.transformer.match(function test(node) {
  return node.type === 'heading';
}, () => '<h1>This text replaces your heading</h1>');
content.document = vdom.getDocument();

Instead of the select-statement, you can also provide a predicate function that returns true or false. The two examples above will have the same behavior.

Creating DOM Nodes

The second argument to match is a node-generating function that should return one of the following three options:

  1. a DOM Node.
  2. a String containing HTML tags.
  3. an HTAST node (a plain object describing the HTML element), as shown in the first example below.

action.transformer.match('link', (_, node) => {
  return {
    type: 'element',
    tagName: 'a',
    properties: {
      href: node.url,
      rel: 'nofollow'
    },
    children: [
      {
        type: 'text',
        value: node.children.map(({ value }) => value).join('')
      }
    ]
  };
});

Above: injecting rel="nofollow" using HTAST.

const h = require('hyperscript');

action.transformer.match('link', (_, node) => h(
    'a',
    { href: node.url, rel: 'nofollow' },
    node.children.map(({ value }) => value),
));

Above: doing the same using Hyperscript (which creates DOM elements) is notably shorter.

action.transformer.match('link', (_, node) =>
  `<a href="${node.url}" rel="nofollow">${node.children.map(({ value }) => value).join('')}</a>`);

Above: plain strings can be constructed using ES6 template literals for the same result.

Dealing with Child Nodes

The examples above have only been processing terminal nodes (or almost terminal nodes). In reality, you need to make sure that all child nodes of the matched node are processed too. For this, the signature of the handler function provides you with a handlechild function.

// match all "emphasis" nodes
action.transformer.match('emphasis', (h, node, _, handlechild) => {
  // create a new HTML tag <i class="its-not-semantic…">
  const i = h(node, 'i', { className: 'its-not-semantic-html-but-i-like-it' });
  // make sure all child nodes of the markdown node are processed
  node.children.forEach((childnode) => handlechild(h, childnode, node, i));
  // return the i HTML element
  return i;
});

The handlechild function is called with:

  • h: a DOM-node producing function.
  • childnode: the child node to be processed.
  • node: the parent node of the node to be processed (usually the current node).
  • i: the DOM node that will be the new parent node for newly created DOM nodes.

Infer Content Types with utils.types

In addition to the automatically inferred content types for each section, utils.types provides a TypeMatcher utility class that allows matching section content against a simple expression language and thus enriching the sections[].types values.

const TypeMatcher = require('@adobe/hypermedia-pipeline').utils.types;

const matcher = new TypeMatcher(content.sections);
matcher.match('^heading', 'starts-with-heading');
content.sections = matcher.process();

In the example above, all sections that have a heading as the first child will get the value starts-with-heading appended to the types array. ^heading is an example of the content expression language, which allows matching content against a simple regular expression-like syntax.

Content Expression Language

  • ^heading – the first element is a heading.
  • paragraph$ – the last element is a paragraph.
  • heading image+ – a heading followed by one or more images.
  • heading? image – an optional heading followed by one image.
  • heading paragraph* image – a heading followed by any number of paragraphs (also no paragraphs at all), followed by an image.
  • (paragraph|list) – a paragraph or a list.
  • ^heading (image paragraph)+$ – one heading, followed by pairs of image and paragraph, but at least one.
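
For example, combining these expressions with the TypeMatcher from above (the type name is made up):

// tag sections that consist of a heading followed by one or more images
matcher.match('^heading image+$', 'is-gallery');
content.sections = matcher.process();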

Inspecting the Pipeline Context

When running in non-production mode, i.e. outside an OpenWhisk action (for example in hlx up), pipeline dumping is enabled. Pipeline dumping allows developers to easily inspect the context object at each step of the pipeline and can be used to debug pipeline functions and to generate realistic test cases.

Each stage of the pipeline processing will create a file like $PWD/logs/debug/context_dump_34161BE5KuR0nuFDp/context-20180902-1418-05.0635-step-2.json inside the debug directory. These dumps will be removed when the node process ends, so that after stopping hlx up the debug directory will be clean again. The -step-n in the filename indicates the step in the pipeline that has been logged.

A simple example might look like this:

Step 0:

{}

Step 1:

{
  "request": {}
}

Step 2:

{
  "request": {},
  "content": {
    "body": "---\ntemplate: Medium\n---\n\n# Bill, Welcome to the future\n> Project Helix\n\n## Let's talk about Project Helix\n![](./moscow/assets/IMG_0167.jpg)\n",
    "sources": [
      "https://raw.githubusercontent.com/trieloff/soupdemo/master/hello.md"
    ]
  }
}

Step 3 (diff only):

@@ -1,6 +1,58 @@
 {
   "content": {
-    "body": "Hello World"
+    "body": "Hello World",
+    "mdast": {
+      "type": "root",
+      "children": [
+        {
+          "type": "paragraph",
+          "children": [
+            {
+              "type": "text",
+              "value": "Hello World",
+              "position": {
+                "start": {
+                  "line": 1,
+                  "column": 1,
+                  "offset": 0
+                },
+                "end": {
+                  "line": 1,
+                  "column": 12,
+                  "offset": 11
+                },
+                "indent": []
+              }
+            }
+          ],
+          "position": {
+            "start": {
+              "line": 1,
+              "column": 1,
+              "offset": 0
+            },
+            "end": {
+              "line": 1,
+              "column": 12,
+              "offset": 11
+            },
+            "indent": []
+          }
+        }
+      ],
+      "position": {
+        "start": {
+          "line": 1,
+          "column": 1,
+          "offset": 0
+        },
+        "end": {
+          "line": 1,
+          "column": 12,
+          "offset": 11
+        }
+      }
+    }
   },
   "request": {}
 }

Step 5 (diff only):

@@ -52,7 +52,49 @@
           "offset": 11
         }
       }
-    }
+    },
+    "sections": [
+      {
+        "type": "root",
+        "children": [
+          {
+            "type": "paragraph",
+            "children": [
+              {
+                "type": "text",
+                "value": "Hello World",
+                "position": {
+                  "start": {
+                    "line": 1,
+                    "column": 1,
+                    "offset": 0
+                  },
+                  "end": {
+                    "line": 1,
+                    "column": 12,
+                    "offset": 11
+                  },
+                  "indent": []
+                }
+              }
+            ],
+            "position": {
+              "start": {
+                "line": 1,
+                "column": 1,
+                "offset": 0
+              },
+              "end": {
+                "line": 1,
+                "column": 12,
+                "offset": 11
+              },
+              "indent": []
+            }
+          }
+        ]
+      }
+    ]
   },
   "request": {}
 }

Step 6 (diff only):

@@ -92,9 +92,19 @@
              "indent": []
            }
          }
-        ]
+        ],
+        "title": "Hello World",
+        "types": [
+          "has-text",
+          "has-only-text"
+        ],
+        "intro": "Hello World",
+        "meta": {}
      }
-    ]
+    ],
+    "meta": {},
+    "title": "Hello World",
+    "intro": "Hello World"
  },
  "request": {}
}

Step 9 (diff only):

@@ -169,7 +169,11 @@
        "search": "",
        "hash": ""
      }
-    }
+    },
+    "html": "<p>Hello World</p>",
+    "children": [
+      "<p>Hello World</p>"
+    ]
  },
  "request": {}
}

Step 10 (diff only):

@@ -175,5 +175,9 @@
      "<p>Hello World</p>"
    ]
  },
-  "request": {}
+  "request": {},
+  "response": {
+    "status": 201,
+    "body": "<p>Hello World</p>"
+  }
}

helix-pipeline's Issues

Improve URL construction

The fetch-markdown step struggles with constructing proper URLs when the input data is not clean:

  • the RAW_URL must contain a trailing slash
  • the path may not contain a trailing slash

It should be more forgiving in what it accepts and more strict in the URLs it generates.

Enable global scope objects debugging

@kptdobe commented on Jul 11, 2018, 9:54 AM UTC:

While developing the htl markup, I am not sure of the content of the global scope (it for now) objects. Would be cool to have a debugging feature, maybe something like README.html.it.json that would render the object in my browser.
A console log would be enough though in a first step.

Workaround: add a breakpoint in the generated html.js (inside return runtime.run(function* () {)

This issue was moved by tripodsan from adobe/project-helix#257.

implementation detail bleeding (defaults.js)

The example of how to build the pipeline shows that the defaults source file location is exposed:

// the pipeline itself
const pipeline = require("@adobe/hypermedia-pipeline");
// helper functions and log
const { adaptOWRequest, adaptOWResponse, log } = require('@adobe/hypermedia-pipeline/src/defaults/default.js');

suggestion

Export the defaults:

  1. move the pipeline code to its own file, e.g. src/pipeline.js
  2. export pipeline and defaults in index.js:

index.js (draft):

const defaults = require('./src/defaults/defaults.js')
const pipeline = require('./src/pipeline.js')

module.exports = {
  pipeline,
  defaults
}

usage:

// the pipeline itself
const { pipeline } = require("@adobe/hypermedia-pipeline");
// helper functions and log
const { adaptOWRequest, adaptOWResponse, log } = require('@adobe/hypermedia-pipeline').defaults;

or:

const {
  pipeline,
  defaults: {
    adaptOWRequest,
    adaptOWResponse,
    log
  }
} = require("@adobe/hypermedia-pipeline");

Implement documented Global Scope

As pointed out in adobe/project-helix#237, we need to document the properties of the payload object in a structured fashion.

@tripodsan I've thought that we could maybe write JSON Schemas for the payload object and have a validation step in our pipelines that runs in development-mode, but is disabled in production. WDYT?

Action required: Greenkeeper could not be activated 🚨

🚨 You need to enable Continuous Integration on all branches of this repository. 🚨

To enable Greenkeeper, you need to make sure that a commit status is reported on all branches. This is required by Greenkeeper because it uses your CI build statuses to figure out when to notify you about breaking changes.

Since we didn’t receive a CI status on the greenkeeper/initial branch, it’s possible that you don’t have CI set up yet. We recommend using Travis CI, but Greenkeeper will work with every other CI service as well.

If you have already set up a CI for this repository, you might need to check how it’s configured. Make sure it is set to run on all new branches. If you don’t want it to run on absolutely every branch, you can whitelist branches starting with greenkeeper/.

Once you have installed and configured CI on this repository correctly, you’ll need to re-trigger Greenkeeper’s initial pull request. To do this, please delete the greenkeeper/initial branch in this repository, and then remove and re-add this repository to the Greenkeeper App’s white list on Github. You'll find this list on your repo or organization’s settings page, under Installed GitHub Apps.

Iterable HTML as JSON

@trieloff commented on Apr 20, 2018, 3:09 PM UTC:

While working on #98 I noticed that neither the AST (fully parsed AST, too granular) object nor the children (innerHTML of all children, too coarse) object nor the html (innerHTML of the entire document, way too coarse) work well for the use case of generating a navigation.

What is needed is the ability to walk the tree of lists, list items, and links, so that HTML can be created that matches the requirements of the CSS framework (in this case, Bulma)

One option to explore is https://www.npmjs.com/package/@thi.ng/hiccup

This issue was moved by trieloff from adobe/project-helix#99.

Layout Machine

@simonwex built a state machine for inferring the content type of a section based on the sequence of children of the section.

I think this could be a great complement to the simple frequency-based approach taken in #47

I've been looking for a way to express the same pattern matching logic with less code and stumbled upon https://github.com/mrijk/speculaas (documentation here https://mrijk.github.io/speculaas/api-reference.html), which is a Javascript implementation of Clojure Spec (https://clojure.org/guides/spec#_sequences)

It would allow us to specify sequences in a syntax like this:

// read as regex: heading? image image image+ i.e. one optional heading and at least three images
s.cat(s.question('heading'), 'image', 'image', s.plus('image'));

// one optional heading and then paragraphs or lists
s.cat(s.question('heading'), s.plus(s.alt('paragraph', 'list')));

@kptdobe do you think this would be useful?

Request object from payload is not complete

Step to reproduce:

  1. run hlx init test
  2. edit src/html.pre.js and add console.log(payload.request); to the pre function
  3. run hlx up
  4. open http://localhost:3000/index.html
  5. check the logs:
   { headers: undefined,
     params: 
      { owner: 'helix',
        repo: 'content',
        ref: 'master',
        path: 'index.md' },
     method: undefined }

headers and method are undefined, and request details like uri, server... are also missing.

`Cannot convert undefined or null to object` while running the pipeline

Input md: https://raw.github.com/Adobe-Marketing-Cloud/reactor-user-docs/master/README.md

Full stack:

TypeError: Cannot convert undefined or null to object
    at Function.assign (<anonymous>)
    at getmetadata (/Users/alex/work/dev/playground/repos/adobe/helix-helpx/node_modules/@adobe/hypermedia-pipeline/src/html/get-metadata.js:22:23)
    at merge (/Users/alex/work/dev/playground/repos/adobe/helix-helpx/node_modules/@adobe/hypermedia-pipeline/src/pipeline.js:74:30)
    at tryCatcher (/Users/alex/work/dev/playground/repos/adobe/helix-helpx/node_modules/bluebird/js/release/util.js:16:23)
    at Object.gotValue (/Users/alex/work/dev/playground/repos/adobe/helix-helpx/node_modules/bluebird/js/release/reduce.js:157:18)
    at Object.gotAccum (/Users/alex/work/dev/playground/repos/adobe/helix-helpx/node_modules/bluebird/js/release/reduce.js:144:25)
    at Object.tryCatcher (/Users/alex/work/dev/playground/repos/adobe/helix-helpx/node_modules/bluebird/js/release/util.js:16:23)
    at Promise._settlePromiseFromHandler (/Users/alex/work/dev/playground/repos/adobe/helix-helpx/node_modules/bluebird/js/release/promise.js:512:31)
    at Promise._settlePromise (/Users/alex/work/dev/playground/repos/adobe/helix-helpx/node_modules/bluebird/js/release/promise.js:569:18)
    at Promise._settlePromise0 (/Users/alex/work/dev/playground/repos/adobe/helix-helpx/node_modules/bluebird/js/release/promise.js:614:10)
    at Promise._settlePromises (/Users/alex/work/dev/playground/repos/adobe/helix-helpx/node_modules/bluebird/js/release/promise.js:693:18)
    at Async._drainQueue (/Users/alex/work/dev/playground/repos/adobe/helix-helpx/node_modules/bluebird/js/release/async.js:133:16)
    at Async._drainQueues (/Users/alex/work/dev/playground/repos/adobe/helix-helpx/node_modules/bluebird/js/release/async.js:143:10)
    at Immediate.Async.drainQueues [as _onImmediate] (/Users/alex/work/dev/playground/repos/adobe/helix-helpx/node_modules/bluebird/js/release/async.js:17:14)
    at runCallback (timers.js:763:18)
    at tryOnImmediate (timers.js:734:5)
    at processImmediate (timers.js:716:5)

cc @trieloff

Rewrite Links in HTML

  • In a.md, create a link to b.md.
  • Helix will render this link with the appropriate <a> tag.

-> When you click on the link, you get a 404 (true on Petridish and production).

Problem: our link in Markdown points to the .md file, but we should link to the .html

This would not be too hard to achieve with a simple mapping filter in the HTAST structure.
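
A sketch of such a filter, using the VDOM matcher described above (illustrative only):

// rewrite relative .md links to .html when generating the anchor element
action.transformer.match('link', (_, node) => ({
  type: 'element',
  tagName: 'a',
  properties: { href: node.url.replace(/\.md$/, '.html') },
  children: [{ type: 'text', value: node.children.map(({ value }) => value).join('') }],
}));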

See also: #107.

This issue was moved by trieloff from adobe/project-helix#108.

Compact Markdown as JSON

In one of these issues, @kptdobe expressed the hope for a more compact (but still visitable) Markdown representation than mdast offers us.

Here are a couple of ideas how we could compress the representation:

  1. represent paragraph as simple strings, with all inline elements converted into HTML strings.
  2. represent heading, blockquote, listitem, etc. as {type: 'heading', level: 1, value: 'Head <b>Line</b>'} (using the same encoding as above)
  3. represent list as {type: 'orderedlist', items: ["One", "Two", "Three"]}

This would massively limit the depth of the tree you have to walk (unless you are confronted with nested lists, there are only two levels), but you might end up doing a lot of if (typeof node === 'string') {} else if (node.type && node.type === 'heading') switches.

Add support for anchors on headings

Anchors are nicely managed on github.com: when rendering the markdown, GitHub adds an anchor before each heading and shows an icon that links directly to the corresponding heading.
While working on our Helix documentation, I realised this would be a really useful feature, especially for handling navigation: you can have a long md file but create a nav that gives direct access to a "section" of content below in the page. Without anchors, I had to split the long md file into various small md files to give direct access.

The pipeline could automatically generate an anchor (an <a> tag with an id) for each "section" and "heading". Whether or not to show an icon would be delegated to the project and could be done with CSS.

This issue was moved by trieloff from adobe/project-helix#154.

Add JSON pipeline

Like the HTML pipeline, but assume that the output is a JSON document

Conditional Sections

In order to enable testing and personalization on a section-level, conditional sections allow two modes of operation:

  1. strain-dependent sections: sections that have a strain property set will only be included in the output if the current strain equals the value of strain, or equals any value of strain if strain is an array (see the sketch below).
  2. tested sections: sections that have a test property set will only be included in the output if a test selection algorithm has selected them from all other sections that have the same test value. The algorithm groups all sections based on their test value, then hashes each section's contents together with the current strain name and selects the section with the lowest hash value. This creates a selection that is random, but stable for stable content and strain names.
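
An illustrative sketch of a strain-dependent section marker, using an embedded YAML block as described for sections above (the exact syntax is a proposal, not final):

---
strain: breaking-news
---

# Only visible in the "breaking-news" strain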

HTML Navigation/TOC Pipeline and Markdown and JSON Format

Rendering navigation elements is a pretty common use case that can be handled by creating a TOC.md, SUMMARY.md or SITEMAP.md that contains lists of links to be shown in a navigation.

I suggest we define a Markdown subset that is best suited for Navigations and is able to express patterns commonly found in site navigations.

We then provide a separate HTML+JSON Pipeline for navigation that will parse the Markdown, and represent the navigation in a compact JSON format, so that custom navigations can be built easily.

Embed External Content

In order to support embeds (OEmbed and Twitter cards, etc.), I suggest the following approach:

In the first step, detect embeds and create embed nodes in the MDAST. Embeds can take multiple forms:

  1. Gatsby-style: `video: https://www.youtube.com/embed/2Xc9gXyf2G4`
  2. Link + Image-style: [![Audi R8](http://img.youtube.com/vi/KOxbO0EI4MA/0.jpg)](https://www.youtube.com/watch?v=KOxbO0EI4MA "Audi R8")
  3. Link-style: ![](https://www.youtube.com/watch?v=KOxbO0EI4MA)
  4. iA-Writer-Style: https://www.youtube.com/watch?v=KOxbO0EI4MA (the URL is the only content of a paragraph)

The embed node will simply take the previous nodes as children, so that normal HTML processing just skips over it.

In the second step, we add a default handler to the #53 VDOM Util, which finds embed nodes in MDAST and outputs the appropriate HTML to embed the referenced content. The handler uses a service (with configurable URL) for generating appropriate HTML representations.

Remove `request.req` support

as mentioned in #77, the request.req parameter is not really used in production since most of the headers are passed through from the edge and the request.params contains the essential params from the client.

Debug for the `action` object

With #19, we introduce a console log of the payload. It is still hard to know what is in the action object. This one contains:

  • logger: which is useless to output
  • secrets: which in theory should not be output. In dev mode (Petridish), knowing what's in the secrets is still required and useful but could be achieved with the debugger in the pre.js file.
  • request: needed when doing more advanced stuff

In general, interacting with the action object is needed for advanced operations, thus I think using the debugger would be enough. Especially now we could debug the pre.js directly.

Document the use of the JSON pipeline

#16 has been merged with no explanation / sample of how to use it. Out of the box, it cannot work since Petridish has not been modified; I do not see how this could be used, but maybe I missed something...

cc @trieloff

Improve error handling

The pipeline does not have any meaningful error handling at the moment.

At a minimum, every step of the pipeline should log errors and add any errors found to the error object as part of the response payload.

Subsequent steps in the pipeline can then attempt to handle the error, e.g. by formatting it nicely, or transforming it to a proper OpenWhisk error

Add pathInfo to request

  • request.pathInfo – request path info, similar to o.a.s.api.request.RequestPathInfo
  • request.pathInfo.path – the resource path
  • request.pathInfo.extension – the extension
  • request.pathInfo.selectors[] – the selectors
  • request.pathInfo.selectorString – the selector string
  • request.pathInfo.suffix – the suffix
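
For illustration, a request for /docs/example.navigation.html might populate these fields roughly as follows (values assumed):

{
  "path": "/docs/example",
  "extension": "html",
  "selectors": ["navigation"],
  "selectorString": "navigation",
  "suffix": ""
}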

Improve "silly" logs in main pipeline function

Right now this is logging the entire fn.toString(), which leads to massively unreadable logs. Instead, it would be ideal to get the CallSite for the function that will be called/has been called.

Support authenticated GitHub access

This will be needed eventually for pre-release content and other private code we want to share.

From @trieloff:

  1. https://github.com/adobe/hypermedia-pipeline/blob/master/src/html/fetch-markdown.js needs to be adjusted to make authenticated requests to GitHub using credentials that are provided as parameters.

  2. We need to find a way to pass the parameters. A simple first step would be to just pass them as default params to the action in helix-cli (you would need to create a personal API token in the GitHub UI)

We might eventually want to support GitCorp repos as well.

This issue covers making Github requests using the provided GITHUB_TOKEN.
