ECMAScript Pattern Matching

Stage: 1

Spec Text: https://tc39.github.io/proposal-pattern-matching

Authors: Originally Kat Marchán (Microsoft, @zkat__); now, the below champions.

Champions: (in alphabetical order)

Introduction

Problem

There are many ways to match values in the language, but there are no ways to match patterns beyond regular expressions for strings. However, wanting to take different actions based on patterns in a given value is a very common desire: do X if the value has a foo property, do Y if it contains three or more items, etc.

Current Approaches

switch has the desired structure -- a value is given, and several possible match criteria are offered, each with a different associated action. But it's severely lacking in practice: it may not appear in expression position; an explicit break is required in each case to avoid accidental fallthrough; scoping is ambiguous (block-scoped variables inside one case are available in the scope of the others, unless curly braces are used); the only comparison it can do is ===; etc.

if/else has the necessary power (you can perform any comparison you like), but it's overly verbose even for common cases, requiring the author to explicitly list paths into the value's structure multiple times, once per test performed. It's also statement-only (unless the author opts for the harder-to-understand ternary syntax) and requires the value under test to be stored in a (possibly temporary) variable.

Priorities for a solution

This section details this proposal’s priorities. Note that not every champion may agree with each priority.

Pattern matching

The pattern matching construct is a full conditional logic construct that can do more than just pattern matching. As such, there have been (and there will be more) trade-offs that need to be made. In those cases, we should prioritize the ergonomics of structural pattern matching over other capabilities of this construct.

Subsumption of switch

This feature must be easily searchable, so that tutorials and documentation are easy to locate, and so that the feature is easy to learn and recognize. As such, there must be no syntactic overlap with the switch statement.

This proposal seeks to preserve the good parts of switch, and eliminate any reasons to reach for it.

Be better than switch

switch contains a plethora of footguns such as accidental case fallthrough and ambiguous scoping. This proposal should eliminate those footguns, while also introducing new capabilities that switch currently cannot provide.

Expression semantics

The pattern matching construct should be usable as an expression:

  • return match { ... }
  • let foo = match { ... }
  • () => match { ... }
  • etc.

The value of the whole expression is the value of whatever clause is matched.

Exhaustiveness and ordering

If the developer wants to ignore certain possible cases, they should specify that explicitly. A development-time error is less costly than a production-time error from something further down the stack.

If the developer wants two cases to share logic (what we know as "fall-through" from switch), they should specify it explicitly. Implicit fall-through inevitably silently accepts buggy code.

Clauses should always be checked in the order they’re written, i.e. from top to bottom.

User extensibility

Userland objects should be able to encapsulate their own matching semantics, without unnecessarily privileging builtins. This includes regular expressions (as opposed to the literal pattern syntax), numeric ranges, etc.

Prior Art

This proposal adds a pattern matching expression to the language, based in part on the existing Destructuring Binding Patterns.

This proposal was approved for Stage 1 in the May 2018 TC39 meeting, and slides for that presentation are available here. Its current form was presented to TC39 in the April 2021 meeting (slides).

This proposal draws from, and partially overlaps with, corresponding features in CoffeeScript, Rust, Python, F#, Scala, Elixir/Erlang, and C++.

Userland matching

A list of community libraries that provide similar matching functionality:

  • Optionals — Rust-like error handling, options and exhaustive pattern matching for TypeScript and Deno
  • ts-pattern — Exhaustive Pattern Matching library for TypeScript, with smart type inference.
  • babel-plugin-proposal-pattern-matching — Minimal grammar, high performance JavaScript pattern matching implementation.
  • match-iz — A tiny functional pattern-matching library inspired by the TC39 proposal.
  • patcom — Feature parity with TC39 proposal without any new syntax

Specification

This proposal introduces three new concepts to JavaScript:

  • the "matcher pattern", a new DSL closely related to destructuring patterns, which allows recursively testing the structure and contents of a value in multiple ways at once, and extracting some of that structure into local bindings at the same time,
  • the match(){} expression, a general replacement for the switch statement that uses matcher patterns to resolve to one of several values, and
  • the is boolean operator, which allows for one-off testing of a value against a matcher pattern, potentially also introducing bindings from that test into the local environment.

Matcher Patterns

Matcher patterns are a new DSL, closely inspired by destructuring patterns, for recursively testing the structure and contents of a value while simultaneously extracting some parts of that value as local bindings for use by other code.

Matcher patterns can be divided into three general varieties:

  • Value patterns, which test that the subject matches some criteria, like "is the string "foo"" or "matches the variable bar".
  • Structure patterns, which test that the subject matches some structural criteria like "has the property foo" or "is at least length 3", and also let you recursively apply additional matchers to parts of that structure.
  • Combinator patterns, which let you match several patterns in parallel on the same subject, with simple boolean and/or logic.

Value Matchers

There are several types of value patterns, performing different types of tests.

Primitive Patterns

All primitive values and variable names can be used directly as matcher patterns, representing a test that the subject matches the specified value, using SameValue semantics (except when otherwise noted).

For example, 1 tests that the subject is SameValue to 1, "foo" tests that it's SameValue to "foo", etc.

Specifically, boolean literals, numeric literals, string literals, untagged template literals, and the null literal can all be used.

  • Numeric literals can additionally be prefixed with a + or - unary operator: + is a no-op (but see the note about 0, below), but - negates the value, as you'd expect.
  • Within the interpolation expressions of template literals, see Bindings for details on what bindings are visible.

The one exception to SameValue matching semantics is that the pattern 0 is matched using SameValueZero semantics. +0 and -0 are matched with SameValue, as normal. (This has the effect that an "unsigned" zero pattern will match both positive and negative zero values, while the "signed" zero patterns will only match the appropriately signed zero values.)

(Additional notes for SameValue semantics: it works "as expected" for NaN values, correctly matching NaN values against NaN patterns; it does not do any implicit type coercion, so a 1 value will not match a "1" pattern.)

Note: No coercion takes place in these matches: if you match against a string literal, for example, your subject must already be a string or else it'll automatically fail the match, even if it would stringify to a matching string.

Examples
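
As a rough illustration of the equality rules above, the following sketch emulates primitive-pattern matching in today's JavaScript. The helper name matchesPrimitive and its unsignedZero option are hypothetical, introduced only for this example; Object.is is the built-in that implements SameValue.

```javascript
// Hypothetical helper: emulate how a primitive pattern tests a subject.
// `unsignedZero` marks the bare `0` pattern (as opposed to `+0` or `-0`).
function matchesPrimitive(subject, patternValue, { unsignedZero = false } = {}) {
  if (unsignedZero) {
    // SameValueZero: +0 and -0 are treated as the same value.
    return subject === 0;
  }
  // SameValue: NaN equals NaN, +0 differs from -0, no type coercion.
  return Object.is(subject, patternValue);
}

console.log(matchesPrimitive(-0, 0, { unsignedZero: true })); // true
console.log(matchesPrimitive(-0, +0));                        // false
console.log(matchesPrimitive(NaN, NaN));                      // true
console.log(matchesPrimitive(1, "1"));                        // false
```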

Variable Patterns

A variable pattern is an "ident expression": foo, foo.bar, foo[bar] etc., excluding those that are already primitives like null, and optionally prefixed by a + or - unary operator.

A variable pattern resolves the identifier against the visible bindings (see Bindings for details), and if it has a + or - prefix, converts it to a number (via toValue) and possibly negates it. If the result is an object with a Symbol.customMatcher property, or is a function, then it represents a custom matcher test. See custom matchers for details. Otherwise, it represents a test that the subject is SameValue with the result, similar to a Primitive Pattern.

Note: This implies that, for example, a variable holding an array will only match that exact array, via object equivalence; it is not equivalent to an array pattern doing a structural match.

Note that several values which are often thought of as literals, like Infinity or undefined, are in fact bindings. Since Primitive Patterns and Variable Patterns are treated largely identically, the distinction can fortunately remain academic here.

Note: Just like with Primitive Patterns, no coercion is performed on the subjects (or on the pattern value, except for the effect of +/-, which is explicitly asking for a numeric coercion), so the type has to already match.

Examples
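
The sketch below illustrates only the non-custom-matcher case described above: the resolved value is compared to the subject with SameValue, after an optional + or - prefix has coerced the pattern value. The helper matchesVariable and its sign parameter are hypothetical names for this example, and values that would act as custom matchers are deliberately out of scope.

```javascript
// Hypothetical helper: `value` is what the identifier resolved to,
// `sign` is the optional "+" or "-" prefix written in the pattern.
function matchesVariable(subject, value, sign) {
  if (sign === "+") value = +value; // numeric coercion of the pattern value
  if (sign === "-") value = -value; // coercion plus negation
  return Object.is(subject, value); // the subject itself is never coerced
}

const origin = [0, 0];
console.log(matchesVariable(origin, origin)); // true: the very same array
console.log(matchesVariable([0, 0], origin)); // false: equal contents, different object
console.log(matchesVariable(-5, "5", "-"));   // true: "-" turns "5" into -5
console.log(matchesVariable("5", 5));         // false: no coercion of the subject
```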

Custom Matchers

If the object that the variable pattern resolves to has a Symbol.customMatcher property in its prototype chain, then it is a "custom matcher".

To determine whether the pattern matches or not, the custom matcher function is invoked with the subject as its first argument, and an object with the key "matchType" set to "boolean" as its second argument.

If it returns a truthy value (one which becomes true when !! is used on it) the match is successful; otherwise, the match fails. If it throws, it passes the error up.

Note: Extractor patterns use the identical machinery, but allow further matching against the returned value, rather than being limited to just returning true/false.

Examples
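
A minimal sketch of the protocol described above, runnable without the proposal's syntax. The local customMatcher symbol is a stand-in for the proposed Symbol.customMatcher (which this example does not assume to exist), and matchesCustom and Even are hypothetical names.

```javascript
// Stand-in for the proposed Symbol.customMatcher.
const customMatcher = Symbol("customMatcher");

// Hypothetical helper emulating a custom matcher used as a plain (boolean) pattern.
function matchesCustom(subject, matcher) {
  const result = matcher[customMatcher](subject, { matchType: "boolean" });
  return Boolean(result); // any truthy result counts as a match; errors propagate
}

// A userland object that encapsulates its own matching semantics.
const Even = {
  [customMatcher](subject) {
    return typeof subject === "number" && subject % 2 === 0;
  },
};

console.log(matchesCustom(4, Even));   // true
console.log(matchesCustom(3, Even));   // false
console.log(matchesCustom("4", Even)); // false
```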

Built-in Custom Matchers

Several JS objects have custom matchers installed on them by default.

All of the classes for primitive types (Boolean, String, Number, BigInt, Symbol) expose a built-in Symbol.customMatcher static method, matching if and only if the subject is a primitive (or a boxed object) corresponding to that type. The return value of a successful match (for the purpose of extractor patterns) is an iterator containing the (possibly auto-unboxed) primitive value.

class Boolean {
    static [Symbol.customMatcher](subject) {
        return typeof subject == "boolean";
    }
}
/* et cetera for the other primitives */

Function.prototype has a custom matcher that checks if the function has an [[IsClassConstructor]] slot (meaning it's the constructor() function from a class block); if so, it tests whether the subject is an object of that class (using brand-checking to verify, similar to Array.isArray()); if not, it invokes the function as a predicate, passing it the subject, and returns the return value:

/* roughly */
Function.prototype[Symbol.customMatcher] = function(subject) {
    if(isClassConstructor(this)) {
        return hasCorrectBrand(this, subject);
    } else {
        return this(subject);
    }
}

This way, predicate functions can be used directly as matchers, like x is upperAlpha, and classes can also be used directly as matchers to test if objects are of that class, like x is Option.Some.
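
The dispatch above can be approximated in userland, with two loudly labeled assumptions: the internal [[IsClassConstructor]] slot is not observable, so this sketch guesses from the function's source text, and instanceof stands in for the brand check. The names isClassConstructor, matchesFunction, Point and upperAlpha are illustrative only.

```javascript
// Heuristic stand-in for the internal [[IsClassConstructor]] check:
// the source text of a class definition starts with "class".
function isClassConstructor(fn) {
  return /^class\b/.test(Function.prototype.toString.call(fn));
}

function matchesFunction(subject, fn) {
  if (isClassConstructor(fn)) {
    // Assumption: instanceof approximates the brand check.
    return subject instanceof fn;
  }
  // Otherwise treat the function as a predicate.
  return Boolean(fn(subject));
}

class Point {}
const upperAlpha = (s) => typeof s === "string" && /^[A-Z]+$/.test(s);

console.log(matchesFunction(new Point(), Point)); // true
console.log(matchesFunction({}, Point));          // false
console.log(matchesFunction("ABC", upperAlpha));  // true
console.log(matchesFunction("abc", upperAlpha));  // false
```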

RegExp.prototype has a custom matcher that executes the regexp on the subject, and matches if the match was successful:

RegExp.prototype[Symbol.customMatcher] = function(subject, {matchType}) {
    const result = this.exec(subject);
    if(matchType == "boolean") return result;
    // exec() returns null on failure, so guard before slicing.
    if(matchType == "extractor") return result ? [result, ...result.slice(1)] : false;
}

Note: We have two conflicting desires here: (1) allow predicate functions to be used as custom matchers in a trivial way, and (2) allow classes to be used as matchers in a trivial way. The only way to do (1) is to put a custom matcher on Function.prototype that invokes the function for you, but this is also the only normal way to do (2), since Function.prototype is the nearest common prototype to all classes (save those that explicitly null out their prototype, of course). The solution we hit on here -- having the class{} syntax install a default matcher if there's not one present -- mirrors the behavior of constructor methods, and lets us have both behaviors cleanly. However, if this is deemed unacceptable, we could pursue other approaches, like having a prefix keyword to indicate a "boolean predicate pattern" so we can tell it apart from a custom matcher pattern.

Regex Patterns

A regex pattern is a regex literal, representing a test that the subject, when stringified, successfully matches the regex.

(Technically, this just invokes the RegExp.prototype[Symbol.customMatcher] method; that is, x is /foo/; and let re = /foo/; x is re; behave identically, including when that built-in method has been modified.)

A regex pattern can be followed by a parenthesized pattern list, identical to extractor patterns. See that section for details on how this works.

Examples
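
In plain JavaScript, the boolean test performed by a regex pattern can be sketched as follows. The helper name matchesRegex is hypothetical; the explicit String() call just makes visible the stringification that exec() performs on its argument.

```javascript
// Hypothetical helper: does `subject`, when stringified, match `re`?
function matchesRegex(subject, re) {
  return re.exec(String(subject)) !== null;
}

console.log(matchesRegex("foobar", /foo/));  // true
console.log(matchesRegex(2024, /^\d{4}$/));  // true: the number is stringified first
console.log(matchesRegex("bar", /foo/));     // false
```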

Binding Patterns

A let, const, or var keyword followed by a valid variable name (identical to binding statements anywhere else). Binding patterns always match, and additionally introduce a binding, binding the subject to the given name with the given binding semantics.

Binding Behavior Details

As with normal binding statements, the bindings introduced by binding patterns are established in the nearest block scope (for let/const) or the nearest function scope (for var).

Issue: Or in the obvious scope, when used in a for/while header or a function arglist. Don't know the right term for this off the top of my head.

Bindings are established according to their presence in a pattern; whether or not the binding pattern itself is ever executed is irrelevant. (For example, [1, 2] is ["foo", let foo] will still establish a foo binding in the block scope, despite the first pattern failing to match and thus skipping the binding pattern.)

Standard TDZ rules apply before the binding pattern is actually executed. (For example, when [x, let x] is an early ReferenceError, since the x binding has not yet been initialized when the first pattern is run and attempts to dereference x.)

Unlike standard binding rules, within the scope of an entire top-level pattern, a given name can appear in multiple binding patterns, as long as all instances use the same binding type keyword. It is a runtime ReferenceError if more than one of these binding patterns actually executes, however (with one exception - see or patterns). (This behavior has precedent: it was previously the case that named capture groups had to be completely unique within a regexp. Now they're allowed to be repeated as long as they're in different branches of an alternative, like /foo(?<part>.*)|(?<part>.*)foo/.)

Examples

(x or [let y]) and (z or {key: let y})

Valid at parse-time: both binding patterns name y with let semantics. This establishes a y binding in the nearest block scope.

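Because or short-circuits, a binding pattern only executes when the alternative before it fails. If exactly one of x or z matches, only the other group's binding pattern executes, and y gets bound appropriately. If both x and z match, neither binding pattern executes, so y remains uninitialized (and it's a runtime ReferenceError to use it). If neither matches, both binding patterns can execute, and a runtime ReferenceError is thrown while executing the second let y pattern, as its binding has already been initialized.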

(x or [let y]) and (z or {key: const y})

Early ReferenceError, as y is being bound twice with differing semantics.

x and let y and z and if(y == "foo")

Valid at parse-time, establishes a y binding in block scope.

If x doesn't match, y remains uninitialized, but the guard pattern is also skipped, so no runtime error (yet). If z doesn't match, y is initialized to the match subject, but the if() test never runs.

[let x and String] or {length: let x}

Valid at parse-time, establishes an x binding.

or pattern semantics allow overriding an already-initialized binding, if that binding came from an earlier failed sub-pattern, to avoid forcing authors to awkwardly arrange their binding patterns after the fallible tests.

So in this example, if passed an object like [5], it will pass the initial length check, execute the let x pattern and bind it to 5, then fail the String pattern, as the subject is a Number. It will then continue to the next or sub-pattern, and successfully bind x to 1, as the existing binding was initialized in a failed sub-pattern.

Structure Patterns

Structure patterns let you test the structure of the subject (its properties, its length, etc) and then recurse into that structure with additional matcher patterns.

Array Patterns

A comma-separated list of zero or more patterns, surrounded by square brackets. It represents a test that:

  1. The subject is iterable.
  2. The subject contains exactly as many iteration items as the length of the array pattern.
  3. Each item matches the associated sub-pattern.

For example, ["foo", {bar}] will match when the subject is an iterable with exactly two items, the first item is the string "foo", and the second item has a bar property.

The final item in the array pattern can optionally be a "rest pattern": either a literal ..., or a ... followed by another pattern. In either case, the presence of a rest pattern relaxes the length test (2 in the list above) to merely check that the subject has at least as many items as the array pattern, ignoring the rest pattern. That is, [a, b, ...] will only match a subject that contains 2 or more items.

If the ... is followed by a pattern, like [a, b, ...let c], then the iterator is fully exhausted, all the leftover items are collected into an Array, and that array is then applied as the subject of the rest pattern's test.

Note: The above implies that [a, b] will pull three items from the subject: two to match against the sub-patterns, and a third to verify that the subject doesn't have a third item. [a, b, ...] will pull only two items from the subject, to match against the sub-patterns. [a, b, ...c] will exhaust the subject's iterator, verifying it has at least two items (to match against the sub-patterns) and then pulling the rest to match against the rest pattern.

Array pattern execution order is as follows:

  1. Obtain an iterator from the subject. Return failure if this fails.
  2. For each expected item up to the number of sub-patterns (ignoring the rest pattern, if present):
    1. Pull one item from the iterator. Return failure if this fails.
    2. Execute the corresponding pattern. Return failure if this doesn't match.
  3. If there is no rest pattern, pull one more item from the iterator, verifying that it's a {done: true} result. If so, return success; if not, return failure.
  4. If there is a ... rest pattern, return success.
  5. If there is a ...<pattern> rest pattern, pull the remaining items of the iterator into a fresh Array, then match the pattern against that. If it matches, return success; otherwise return failure.
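
The execution order above can be sketched with the ordinary iterator protocol. This is an illustration under stated assumptions, not the proposal's specification: sub-patterns are modeled as predicate functions, and the hypothetical rest parameter is undefined (no rest pattern), true (a bare ...), or a predicate applied to the leftover items.

```javascript
// Hypothetical emulation of array-pattern matching.
function matchArrayPattern(subject, subPatterns, rest) {
  // Step 1: obtain an iterator, failing if the subject is not iterable.
  if (subject == null || typeof subject[Symbol.iterator] !== "function") return false;
  const iterator = subject[Symbol.iterator]();

  // Step 2: pull one item per sub-pattern and test it immediately.
  for (const pattern of subPatterns) {
    const step = iterator.next();
    if (step.done) return false;
    if (!pattern(step.value)) return false;
  }

  // Step 3: without a rest pattern, one extra pull must report completion.
  if (rest === undefined) return iterator.next().done === true;

  // Step 4: a bare `...` succeeds without pulling anything further.
  if (rest === true) return true;

  // Step 5: `...<pattern>` exhausts the iterator into a fresh Array.
  const leftovers = [];
  for (let step = iterator.next(); !step.done; step = iterator.next()) {
    leftovers.push(step.value);
  }
  return rest(leftovers);
}

const any = () => true;
console.log(matchArrayPattern([1, 2], [any, any]));          // true
console.log(matchArrayPattern([1, 2, 3], [any, any]));       // false: too long
console.log(matchArrayPattern([1, 2, 3], [any, any], true)); // true: `...` relaxes the length check
```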

Issue: Or should we pull all the necessary values from the iterator first, then do all the matchers?

Examples

match (res) {
  when isEmpty: ...;
  when {data: [let page] }: ...;
  when {data: [let frontPage, ...let pages] }: ...;
  default: ...;
}

Array patterns implicitly check the length of the subject.

The first arm is a variable pattern, invoking the default Function.prototype custom matcher which calls isEmpty(res) and matches if that returns true.

The second arm is an object pattern which contains an array pattern, which matches if data has exactly one element, and binds that element to page for the RHS.

The third arm matches if data has at least one element, binding that first element to frontPage, and binding an array of any remaining elements to pages using a rest pattern.

Array Pattern Caching

To allow for idiomatic uses of generators and other "single-shot" iterators to be reasonably matched against several array patterns, the iterators and their results are cached over the scope of the match construct.

Specifically, whenever a subject is matched against an array pattern, the subject is used as the key in a cache, whose value is the iterator obtained from the subject, and all items pulled from the subject by an array pattern.

Whenever something would be matched against an array pattern, the cache is first checked, and the already-pulled items stored in the cache are used for the pattern, with new items pulled from the iterator only if necessary.

For example:

function* integers(to) {
  for(var i = 1; i <= to; i++) yield i;
}

const fiveIntegers = integers(5);
match (fiveIntegers) {
  when [let a]:
    console.log(`found one int: ${a}`);
    // Matching a generator against an array pattern.
    // Obtain the iterator (which is just the generator itself),
    // then pull two items:
    // one to match against the `a` pattern (which succeeds),
    // the second to verify the iterator only has one item
    // (which fails).
  when [let a, let b]:
    console.log(`found two ints: ${a} and ${b}`);
    // Matching against an array pattern again.
    // The generator object has already been cached,
    // so we fetch the cached results.
    // We need three items in total;
    // two to check against the patterns,
    // and the third to verify the iterator has only two items.
    // Two are already in the cache,
    // so we’ll just pull one more (and fail the pattern).
  default: console.log("more than two ints");
}
console.log([...fiveIntegers]);
// logs [4, 5]
// The match construct pulled three elements from the generator,
// so there are two left over afterwards.

When execution of the match construct finishes, all cached iterators are closed.
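
The caching behavior can be imitated in current JavaScript to see why the generator example ends with two items left over. This is a sketch under assumptions: createIteratorCache, take and hasExactLength are hypothetical names, only the length test of an array pattern is modeled, and the cached iterators are deliberately not closed afterwards.

```javascript
// Hypothetical cache for one match construct: subject -> { iterator, items, done }.
function createIteratorCache() {
  const cache = new Map();
  // Return up to `count` leading items of `subject`, pulling new ones only if needed.
  return function take(subject, count) {
    let entry = cache.get(subject);
    if (!entry) {
      entry = { iterator: subject[Symbol.iterator](), items: [], done: false };
      cache.set(subject, entry);
    }
    while (!entry.done && entry.items.length < count) {
      const step = entry.iterator.next();
      if (step.done) entry.done = true;
      else entry.items.push(step.value);
    }
    return entry.items.slice(0, count);
  };
}

// An array pattern with `n` sub-patterns and no rest needs n + 1 pulls:
// n items to match, plus one more to confirm the iterator is exhausted.
function hasExactLength(take, subject, n) {
  return take(subject, n + 1).length === n;
}

function* integers(to) {
  for (let i = 1; i <= to; i++) yield i;
}

const take = createIteratorCache();
const fiveIntegers = integers(5);
console.log(hasExactLength(take, fiveIntegers, 1)); // false: pulled 1 and 2
console.log(hasExactLength(take, fiveIntegers, 2)); // false: reused 1 and 2, pulled 3
console.log([...fiveIntegers]);                     // [4, 5]
```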

Object Patterns

A comma-separated list of zero or more "object pattern clauses", wrapped in curly braces. Each "object pattern clause" is either <key>, let/var/const <ident> or <key>: <pattern>, where <key> is an identifier or a computed-key expression like [Symbol.foo]. It represents a test that the subject:

  1. Has every specified property in its prototype chain.
  2. If the key has an associated sub-pattern, then the value of that property matches the sub-pattern.

A <key> object pattern clause is exactly equivalent to <key>: void. A let/var/const <ident> object pattern clause is exactly equivalent to <ident>: let/var/const <ident>.

That is, when {foo, let bar, baz: "qux"} is equivalent to when {foo: void, bar: let bar, baz: "qux"}: it tests that the subject has foo, bar, and baz properties, introduces a bar binding for the value of the bar property, and verifies that the value of the baz property is the string "qux".

Additionally, object patterns can contain a "rest pattern": a ... followed by a pattern. Unlike array patterns, a lone ... is not valid in an object pattern (since there's no strict check to relax). If the rest pattern exists, then all enumerable own properties that aren't already matched by object pattern clauses are collected into a fresh object, which is then matched against the rest pattern. (This matches the behavior of object destructuring.)

Issue: Do we want a key?: pattern pattern clause as well? Makes it an optional test - if the subject has this property, verify that it matches the pattern. If the pattern is skipped because the property doesn't exist, treat any bindings coming from the pattern the same as ones coming from skipped or patterns.

Issue: Ban __proto__? Do something funky?

Object pattern execution order is as follows:

  1. For each non-rest object pattern clause key: sub-pattern, in source order:
    1. Check that the subject has the property key (using in, or HasProperty(), semantics). If it doesn't, return failure.
    2. Get the value of the key property, and match it against sub-pattern. If that fails to match, return failure.
  2. If there's a rest pattern clause, collect all enumerable own properties of the subject that weren't tested in the previous step, and put them into a fresh Object. Match that against the rest pattern. If that fails, return failure.
  3. Return success.

Examples
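
The execution order above can be sketched as a plain function. Several things here are assumptions made for illustration: clauses are modeled as [key, subPattern] pairs with predicate sub-patterns (omitted for a bare key), keys are strings or symbols, rest is an optional predicate over the leftover object, and non-object subjects are simply treated as failing.

```javascript
// Hypothetical emulation of object-pattern matching.
function matchObjectPattern(subject, clauses, rest) {
  // Assumption of this sketch: primitives never match.
  if (subject === null || (typeof subject !== "object" && typeof subject !== "function")) {
    return false;
  }
  const tested = new Set();
  for (const [key, subPattern] of clauses) {
    // `in` semantics: inherited properties count.
    if (!(key in subject)) return false;
    tested.add(key);
    if (subPattern && !subPattern(subject[key])) return false;
  }
  if (rest) {
    // Collect enumerable own properties not already tested into a fresh object.
    const leftovers = {};
    for (const key of Reflect.ownKeys(subject)) {
      if (tested.has(key)) continue;
      if (Object.prototype.propertyIsEnumerable.call(subject, key)) {
        leftovers[key] = subject[key];
      }
    }
    if (!rest(leftovers)) return false;
  }
  return true;
}

// Roughly `{foo, bar, baz: "qux"}`, ignoring bindings.
const clauses = [["foo"], ["bar"], ["baz", (v) => v === "qux"]];
console.log(matchObjectPattern({ foo: 1, bar: 2, baz: "qux" }, clauses)); // true
console.log(matchObjectPattern({ foo: 1, baz: "qux" }, clauses));         // false: no bar property
```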

Object Pattern Caching

Similar to array pattern caching, object patterns cache their results over the scope of the match construct, so that multiple clauses don’t observably retrieve the same property multiple times.

(Unlike array pattern caching, which is necessary for this proposal to work with iterators, object pattern caching is a nice-to-have. It does guard against some weirdness like non-idempotent getters (including, notably, getters that return iterators), and helps make idempotent-but-expensive getters usable in pattern matching without contortions, but mostly it’s just for conceptual consistency.)

Whenever a subject is matched against an object pattern, for each property name in the object pattern, a (<subject>, <property name>) tuple is used as the key in a cache, whose value is the value of the property.

Whenever something would be matched against an object pattern, the cache is first checked, and if the subject and that property name are already in the cache, the value is retrieved from cache instead of by a fresh Get against the subject.

For example:

const randomItem = {
  get numOrString() { return Math.random() < .5 ? 1 : "1"; }
};

match (randomItem) {
  when {numOrString: Number}:
    console.log("Only matches half the time.");
    // Whether the pattern matches or not,
    // we cache the (randomItem, "numOrString") pair
    // with the result.
  when {numOrString: String}:
    console.log("Guaranteed to match the other half of the time.");
    // Since (randomItem, "numOrString") has already been cached,
    // we reuse the result here;
    // if it was a string for the first clause,
    // it’s the same string here.
}
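
A small sketch of the (subject, property name) cache, using a deterministic getter in place of Math.random() so the effect is easy to observe. The names createPropertyCache, getCached and item are hypothetical, and the cache layout (a Map of Maps) is just one possible choice.

```javascript
// Hypothetical cache for one match construct: subject -> Map(property name -> value).
function createPropertyCache() {
  const cache = new Map();
  return function getCached(subject, key) {
    let perSubject = cache.get(subject);
    if (!perSubject) {
      perSubject = new Map();
      cache.set(subject, perSubject);
    }
    // Only the first lookup performs a real Get against the subject.
    if (!perSubject.has(key)) perSubject.set(key, subject[key]);
    return perSubject.get(key);
  };
}

let reads = 0;
const item = {
  // Non-idempotent getter: alternates between 1 and "1" on each real read.
  get numOrString() { reads += 1; return reads % 2 === 1 ? 1 : "1"; },
};

const getCached = createPropertyCache();
console.log(typeof getCached(item, "numOrString")); // "number"
console.log(typeof getCached(item, "numOrString")); // "number": served from the cache
console.log(reads);                                  // 1
```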

Issue: This potentially introduces a lot more caching, and the major use-case is just making sure that iterator caching works both at the top-level and when nested in an object. Expensive or non-idempotent getters benefit, but that's a much less important benefit. This caching is potentially droppable, but it will mean that we only cache iterables at the top level.

Extractor Patterns

A dotted-ident followed by a parenthesized "argument list" containing the same syntax as an array matcher. Represents a combination of a custom matcher pattern and an array pattern:

  1. The dotted-ident is resolved against the visible bindings. If that results in an object with a Symbol.customMatcher property, and the value of that property is a function, then continue; otherwise, this throws an XXX error.

  2. The custom matcher function is invoked with the subject as its first argument, and an object with the key "matchType" set to "extractor" as its second argument. Let result be the return value.

  3. Match result against the arglist pattern.

Note: While custom matchers only require the return value be truthy or falsey, extractor patterns are stricter about types: the value must be exactly true or false, or an Array, or an iterable.

Arglist Patterns

An arglist pattern is a sub-pattern of an Extractor Pattern, and is mostly identical to an Array Pattern. It has identical syntax, except it's bounded by parentheses (()) rather than square brackets ([]). It behaves slightly differently with a few subjects, as well:

  • a false subject always fails to match
  • a true subject matches as if it were an empty Array
  • an Array subject is matched per-index, rather than invoking the iterator protocol.

If the subject is an Array, then it's matched as follows:

  1. If the arglist pattern doesn't end in a rest pattern, then the subject's length property must exactly equal the length of the pattern, or it fails to match.

  2. If the arglist pattern does end in a rest pattern, then the subject's length property must be greater than or equal to the length of the pattern - 1, or it fails to match.

  3. For each non-rest sub-pattern of the arglist pattern, the corresponding integer-valued property of the subject is fetched, and matched against the corresponding sub-pattern.

  4. If the final sub-pattern is a ...<pattern>, collect the remaining integer-valued properties of the subject, up to but not including its length value, into a fresh Array, and match against that pattern.

  5. If any of the matches failed, the entire arglist pattern fails to match. Otherwise, it succeeds.

Other than the above exceptions, arglist patterns are matched exactly the same as array patterns.
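
The special cases above can be sketched as follows. As before, this is only an illustration: sub-patterns are modeled as predicates, the hypothetical rest parameter is undefined (no rest), true (a bare ...), or a predicate, and non-Array iterables (which fall back to array-pattern matching) are left out.

```javascript
// Hypothetical emulation of arglist matching for boolean and Array subjects.
function matchArglist(subject, subPatterns, rest) {
  if (subject === false) return false; // false never matches
  if (subject === true) subject = [];  // true behaves like an empty Array
  if (!Array.isArray(subject)) {
    throw new TypeError("this sketch only covers booleans and Arrays");
  }
  const hasRest = rest !== undefined;
  // Length test: exact without a rest pattern, a lower bound with one.
  if (!hasRest && subject.length !== subPatterns.length) return false;
  if (hasRest && subject.length < subPatterns.length) return false;
  // Per-index matching, without invoking the iterator protocol.
  for (let i = 0; i < subPatterns.length; i++) {
    if (!subPatterns[i](subject[i])) return false;
  }
  // `...<pattern>` receives the remaining indexed items as a fresh Array.
  if (typeof rest === "function") return rest(subject.slice(subPatterns.length));
  return true;
}

const isNumber = (v) => typeof v === "number";
console.log(matchArglist([5], [isNumber])); // true
console.log(matchArglist([5], []));         // false: a 1-item array does not match ()
console.log(matchArglist([5], [], true));   // true: (...) accepts any length
console.log(matchArglist(true, []));        // true
```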

Issue: Do we cache arglists the same way we cache array patterns?

Note: The Array behavior here is for performance, based on implementor feedback. Invoking the iterator protocol is expensive, and we don't want to discourage use of custom matchers when the by far expected usage pattern is to just return an Array, rather than some more complex iterable. We're (currently) still sticking with iterator protocol for array matchers, to match destructuring, but could potentially change that.

Examples

class Option {
  constructor() {
    // Option itself is abstract; only its subclasses can be instantiated.
    if (new.target === Option) { throw new TypeError(); }
  }
  static Some = class extends Option {
    constructor(value) { super(); this.value = value; }
    map(cb) { return new Option.Some(cb(this.value)); }
    // etc
    static [Symbol.customMatcher](subject) {
      if (subject instanceof Option.Some) { return [subject.value]; }
      return false;
    }
  };

  static None = class extends Option {
    constructor() { super(); }
    map(cb) { return this; }
    // Use the default custom matcher,
    // which just checks that the subject matches the class.
  };
}

let val = new Option.Some(5);
match(val) {
  when Option.Some(String and let a): console.log(`Got a string "${a}".`);
  when Option.Some(Number and let a): console.log(`Got a number ${a}.`);
  when Option.Some(...): console.log(`Got something unexpected.`);
  // Or `Option.Some`, either works.
  // `Option.Some()` will never match, as the return value
  // is a 1-item array, which doesn't match `[]`
  when Option.None(): console.log(`Operation failed.`);
  // or `Option.None`, either works
  default: console.log(`Didn't get an Option at all.`)
}

Issue: We don't have an easy way to get access to the "built-in" custom matcher, so the above falls back to doing an instanceof test (rather than the technically more correct branding test that the built-in one does). To work "properly" I'd have to define the class without a custom matcher, then pull off the custom matcher, save it to a local variable, and define a new custom matcher that invokes the original one and returns the [subject.value] on success. That's a silly amount of work for correctness.

Regex Extractor Patterns

Similar to how regex patterns allowed you to use a regex literal as a pattern, but simply invoked the normal custom matcher machinery, a regex literal can also be followed by an arglist pattern, invoking the normal extractor pattern machinery.

For this purpose, on a successful match the "return value" (what's matched against the arglist pattern) is an Array whose items are the regex result object, followed by each of the positive numbered groups in the regex result (that is, skipping the "0" group that represents the entire match).

Execution order is identical to extractor patterns, except the first part is matched as a regex pattern, and the second part's subject is as defined above.

The full definition of the RegExp.prototype custom matcher is thus:

RegExp.prototype[Symbol.customMatcher] = function(subject) {
    const result = this.exec(subject);
    if(result) {
        return [result, ...result.slice(1)];
    } else {
        return false;
    }
}

Examples

match (arithmeticStr) {
  when /(?<left>\d+) \+ (?<right>\d+)/({groups:{let left, let right}}):
    // Using named capture groups
    processAddition(left, right);
  when /(\d+) \* (\d+)/({}, let left, let right):
    // Using positional capture groups
    processMultiplication(left, right);
  default: ...
}
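
The value that the regex matcher hands to the arglist pattern can be inspected with ordinary destructuring. In this sketch, regexExtract is a hypothetical free-function version of the custom matcher shown earlier, and the destructuring stands in for the arglist patterns of the two clauses.

```javascript
// Hypothetical free-function version of the RegExp custom matcher.
function regexExtract(re, subject) {
  const result = re.exec(subject);
  // The result object first, then each numbered group except group 0.
  return result ? [result, ...result.slice(1)] : false;
}

// Named capture groups, as in the first clause.
const [{ groups: { left, right } }] = regexExtract(/(?<left>\d+) \+ (?<right>\d+)/, "1 + 2");
console.log(left, right); // 1 2

// Positional capture groups, as in the second clause (the result object is skipped).
const [, a, b] = regexExtract(/(\d+) \* (\d+)/, "3 * 4");
console.log(a, b); // 3 4

console.log(regexExtract(/(\d+) \* (\d+)/, "no match")); // false
```

Note that the captured values are strings, so any arithmetic on them needs an explicit conversion.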

Combinator Patterns

Sometimes you need to match multiple patterns on a single value, or pass a value that matches any of several patterns, or just negate a pattern. All of these can be achieved with combinator patterns.

And Patterns

Two or more patterns, each separated by the keyword and. This represents a test that the subject passes all of the sub-patterns. Any pattern can be (and in some cases must be, see combining combinators) wrapped in parentheses.

Short-circuiting applies; if any sub-pattern fails to match the subject, matching stops immediately.

and pattern execution order is as follows:

  1. For each sub-pattern, in source order, match the subject against the sub-pattern. If that fails to match, return failure.
  2. Return success.

Or Patterns

Two or more patterns, each separated by the keyword or. This represents a test that the subject passes at least one of the sub-patterns. Any pattern can be (and in some cases must be, see combining combinators) wrapped in parentheses.

Short-circuiting applies; if any sub-pattern successfully matches the subject, matching stops immediately.

or pattern execution order is as follows:

  1. For each sub-pattern, in source order, match the subject against the sub-pattern. If that successfully matches, return success.
  2. Return failure.

Note: As defined in Binding Behavior Details, a binding pattern in a failed sub-pattern can be overridden by a binding pattern in a later sub-pattern without error. That is, [let foo] or {length: let foo} is valid both at parse-time and run-time, even tho the foo binding is potentially initialized twice (given a subject like [1, 2]).

Not Patterns

A pattern preceded by the keyword not. This represents a test that the subject does not match the sub-pattern. The pattern can be (and in some cases must be, see combining combinators) wrapped in parentheses.

Combining Combinator Patterns

Combinator patterns cannot be combined at the same "level"; there is no precedence relationship between them. Instead, parentheses must be used to explicitly provide an ordering.

That is, foo and bar or baz is a syntax error; it must be written (foo and bar) or baz or foo and (bar or baz).

Similarly, not foo and bar is a syntax error; it must be written (not foo) and bar or not (foo and bar).
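The short-circuiting rules for and, or, and not can be approximated in today's JavaScript by treating a pattern as a plain predicate. This is only a sketch under that assumption: the helper names `and`, `or`, `not`, and `traced` are hypothetical, and bindings are not modelled at all. Function-call nesting makes the grouping explicit, much as the required parentheses do in the proposed syntax.

```javascript
// Hypothetical stand-ins: a "pattern" here is just a predicate (subject) => boolean.
const and = (...patterns) => (subject) => patterns.every((p) => p(subject));
const or = (...patterns) => (subject) => patterns.some((p) => p(subject));
const not = (pattern) => (subject) => !pattern(subject);

const isString = (s) => typeof s === "string";
const isEmpty = (s) => s.length === 0;

// Record which sub-patterns actually run, to observe short-circuiting.
const calls = [];
const traced = (name, p) => (subject) => {
  calls.push(name);
  return p(subject);
};

// Rough equivalent of `isString and (not isEmpty)`.
const nonEmptyString = and(
  traced("isString", isString),
  traced("notEmpty", not(isEmpty))
);

console.log(nonEmptyString("hi")); // prints "true"
console.log(nonEmptyString(42)); // prints "false"; "notEmpty" is never tested
console.log(calls); // ["isString", "notEmpty", "isString"]

// Rough equivalent of `(isString and isEmpty) or (not isString)`.
const emptyOrNonString = or(and(isString, isEmpty), not(isString));
console.log(emptyOrNonString("")); // prints "true"
```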

Guard Patterns

A guard pattern has the syntax if(<expression>), and represents a test that the expression is truthy. This is an arbitrary JS expression, not a pattern.

match expression

match expressions are a new type of expression that makes use of patterns to select one of several expressions to resolve to.

A match expression looks like:

match(<subject-expression>) {
    when <pattern>: <value-expression>;
    when <pattern>: <value-expression>;
    ...
    default: <value-expression>;
}

That is, the match head contains a <subject-expression>, which is an arbitrary JS expression that evaluates to a "subject".

The match block contains zero or more "match arms", consisting of:

  • the keyword when
  • a pattern
  • a literal colon
  • an arbitrary JS expression
  • a semicolon (yes, required)

After the match arms, it can optionally contain a "default arm", consisting of:

  • the keyword default
  • a literal colon
  • an arbitrary JS expression
  • a semicolon

After obtaining the subject, each match arm is tested in turn, matching the subject against the arm's pattern. If the match is successful, the arm's expression is evaluated, and the match expression resolves to that result.

If all match arms fail to match, and there is a default arm, the default arm's expression is evaluated, and the match expression resolves to that result. If there is no default arm, the match expression throws a TypeError.
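The evaluation order described above can be sketched in current JavaScript. This is an emulation under stated assumptions, not the proposal's semantics: `matchValue` is a hypothetical helper, patterns are modelled as predicates, and each arm's expression is wrapped in a function so that only the selected arm is evaluated.

```javascript
// Hypothetical emulation: each arm is [pattern, valueFn], where pattern is a
// predicate and valueFn is evaluated lazily, only for the arm that matches.
function matchValue(subject, arms, defaultArm) {
  for (const [pattern, valueFn] of arms) {
    if (pattern(subject)) return valueFn(subject);
  }
  if (defaultArm) return defaultArm(subject);
  // No arm matched and there is no default arm.
  throw new TypeError("No match arm matched the subject");
}

const describe = (value) =>
  matchValue(
    value,
    [
      [(x) => x === 0, () => "zero"],
      [(x) => typeof x === "number", (x) => `number ${x}`],
    ],
    () => "something else"
  );

console.log(describe(0)); // prints "zero"
console.log(describe(7)); // prints "number 7"
console.log(describe("x")); // prints "something else"
```

Calling `matchValue(1, [])` with no default arm throws a TypeError, matching the behavior described for a match expression whose arms all fail.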

Bindings

The <subject-expression> is part of the nearest block scope.

Each match arm and the default arm are independent nested block scopes, covering both the pattern and the expression of the arm. (That is, different arms can't see each other's bindings, and the bindings don't escape the match expression. Within each arm, they shadow the outer scope's bindings.)

Examples

match (res) {
  when { status: 200, let body, ...let rest }: handleData(body, rest);
  when { const status, destination: let url } and if (300 <= status && status < 400):
    handleRedirect(url);
  when { status: 500 } and if (!this.hasRetried): do {
    retry(req);
    this.hasRetried = true;
  };
  default: throwSomething();
}

This example tests a "response" object against several patterns, branching based on the .status property, and extracting different parts from the response in each branch to process in various handler functions.


match (command) {
  when ['go', let dir and ('north' or 'east' or 'south' or 'west')]: go(dir);
  when ['take', /[a-z]+ ball/ and {let weight}]: takeBall(weight);
  default: lookAround();
}

This sample is a contrived parser for a text-based adventure game.

The first match arm matches if the command is an array with exactly two items. The first must be exactly the string 'go', and the second must be one of the given cardinal directions. Note the use of the and pattern to bind the second item in the array to dir using a binding pattern before verifying (using the or pattern) that it’s one of the given directions.

The second match arm is slightly more complex. First, a regex pattern is used to verify that the object stringifies to "something ball", then an object pattern verifies that it has a .weight property and binds it to weight, so that the weight is available to the arm's expression.

Statement vs Expression

For maximum expressivity, the match expression is an expression, not a statement. This allows for easy use in expression contexts like return match(val){...}.

It can, of course, be used in statement context, as in the first example above. However, the match arms still contain expressions only.

It is expected that do-expressions will allow for match arms to execute statements (again, as in the first example above). If that proposal does not end up advancing, a future iteration of this proposal will include some way to have a match arm contain statements. (Probably just by inlining do-expr's functionality.)

is operator

The is operator is a new boolean operator, of the form <subject-expression> is <pattern>. It returns a boolean result, indicating whether the subject matched the pattern or not.

Bindings

Bindings established in the pattern of an is are visible in the nearest block scope, as defined in Binding Patterns.

This includes when used in the head of an if() statement:

function foo(x) {
    if(x is [let head, ...let rest]) {
        console.log(head, rest);
    } else {
        // `head` and `rest` are defined here,
        // but will throw a ReferenceError if dereferenced,
        // since if the pattern failed
        // the binding patterns must not have been executed.
    }
}

function bar(x) {
    if(x is not {let necessaryProperty}) {
        // Pattern succeeded, because `x.necessaryProperty`
        // doesn't exist.
        return;
    }
    // Here the pattern failed because `x.necessaryProperty`
    // *does* exist, so the binding pattern was executed,
    // and the `necessaryProperty` binding is visible here.
    console.log(necessaryProperty);
}

When used in the head of a for(), the usual binding scopes apply: the bindings are scoped to the for() head+block, and in the case of for-of, are copied to the inner per-iteration binding scopes.

while and do-while do not currently have any special scoping rules for things in their heads. We propose that they adopt the same rules as for-of blocks: the head is in a new scope surrounding the loop, and its bindings are copied to a per-iteration scope surrounding the {} block. For do-while, the bindings are TDZ on the first iteration, before the head is executed.
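The "defined, but throws a ReferenceError if dereferenced" behavior described for failed is patterns is the same temporal dead zone (TDZ) that let and const already have. The following sketch uses only current JavaScript to show that state; the function name is hypothetical and no proposed syntax is involved.

```javascript
// Today's JavaScript: a `let` binding exists for its whole block, but reading it
// before its declaration has executed throws a ReferenceError (the TDZ).
function readBeforeInit() {
  try {
    return value; // `value` is in scope here, but still uninitialized
  } catch (err) {
    return err instanceof ReferenceError ? "ReferenceError" : "other error";
  }
  // Never reached; present only to create the function-level binding.
  let value = 1;
}

console.log(readBeforeInit()); // prints "ReferenceError"
```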

Motivating examples

Below are selected situations where we expect pattern matching will be widely used. As such, we want to optimize the ergonomics of such cases to the best of our ability.


Validating JSON structure.

Here's the simple destructuring version of the code, which does zero checks on the data ahead of time, just pulls it apart and hopes everything is correct:

var json = {
  'user': ['Lily', 13]
};
var {user: [name, age]} = json;
print(`User ${name} is ${age} years old.`);

Destructuring with checks that everything is correct and of the expected shape:

if ( json.user !== undefined ) {
  var user = json.user;
  if (Array.isArray(user) &&
      user.length == 2 &&
      typeof user[0] == "string" &&
      typeof user[1] == "number") {
    var [name, age] = user;
    print(`User ${name} is ${age} years old.`);
  }
}

Exactly the same checks, but using pattern-matching:

if( json is {user: [String and let name, Number and let age]} ) {
  print(`User ${name} is ${age} years old.`);
}

Matching fetch() responses:

const res = await fetch(jsonService)
match (res) {
  when { status: 200, headers: { 'Content-Length': let s } }:
    console.log(`size is ${s}`);
  when { status: 404 }:
    console.log('JSON not found');
  when { let status } and if (status >= 400): do {
    throw new RequestError(res);
  };
};

More concise, more functional handling of Redux reducers (compare with this same example in the Redux documentation):

function todosReducer(state = initialState, action) {
  return match (action) {
    when { type: 'set-visibility-filter', payload: let visFilter }:
      { ...state, visFilter };
    when { type: 'add-todo', payload: let text }:
      { ...state, todos: [...state.todos, { text, completed: false }] };
    when { type: 'toggle-todo', payload: let index }: do {
      const newTodos = state.todos.map((todo, i) => {
        return i !== index ? todo : {
          ...todo,
          completed: !todo.completed
        };
      });

      ({
        ...state,
        todos: newTodos,
      });
    };
    default: state; // ignore unknown actions
  }
}

Concise conditional logic in JSX (via Divjot Singh):

<Fetch url={API_URL}>
  {props => match (props) {
    when {loading}: <Loading />;
    when {let error}: do {
      console.error("something bad happened");
      <Error error={error} />
    };
    when {let data}: <Page data={data} />;
  }}
</Fetch>

Possible future enhancements

Void Patterns

The keyword void is a pattern that always matches, and does nothing else. It's useful in structure patterns, when you want to test for the existence of a property without caring what its value is.

This is the most likely proposal to move back into the main proposal; it's pulled out solely because we want to make sure that it stays consistent with Void Bindings.

async match

If the match construct appears inside a context where await is allowed, await can already be used inside it, just like inside do expressions. However, just like async do expressions, there are uses for being able to use await and produce a Promise, even when not already inside an async function.

async match (await subject) {
  when { let a }: await a;
  when { let b }: b.then(() => 42);
  default: await somethingThatRejects();
} // produces a Promise
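A rough present-day approximation of this idea wraps arm selection in an async function, so the overall result is always a Promise. This is a sketch only: `asyncMatch` and `hasKey` are hypothetical helpers, and patterns are modelled as predicates rather than real patterns.

```javascript
// Hypothetical emulation of `async match`: an async function wraps the arm
// selection, so the caller always receives a Promise.
async function asyncMatch(subjectPromise, arms, defaultArm) {
  const subject = await subjectPromise;
  for (const [pattern, valueFn] of arms) {
    if (pattern(subject)) return valueFn(subject);
  }
  if (defaultArm) return defaultArm(subject);
  throw new TypeError("No match arm matched the subject");
}

// Predicate standing in for the object patterns { let a } and { let b }.
const hasKey = (key) => (s) => s !== null && typeof s === "object" && key in s;

const pending = asyncMatch(
  Promise.resolve({ b: Promise.resolve("ignored") }),
  [
    [hasKey("a"), async (s) => await s.a],
    [hasKey("b"), (s) => s.b.then(() => 42)],
  ],
  () => Promise.reject(new Error("nothing matched"))
);

console.log(pending instanceof Promise); // prints "true"
pending.then((value) => console.log(value)); // logs 42 once the promise settles
```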

Relational Patterns

Currently there are patterns for expressing various types of equality, and kinda an instanceof (for custom matchers against a class). We could express more types of operator-based checks, like:

match(val) {
    when < 10: console.log("small");
    when >= 10 and < 20: console.log("mid");
    default: console.log("large");
}

Generally, all the binary boolean operators could be used, with the subject as the implicit LHS of the operator.

(This would slightly tie our hands on future syntax expansions for patterns, but it's unlikely we'd ever want to reuse existing operators in a way that's different from how they work in expression contexts.)
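To make the "subject as the implicit LHS" idea concrete, here is a sketch in current JavaScript. The predicate factories `lt`, `gte`, and `both` are hypothetical stand-ins for the proposed `< n`, `>= n`, and and forms; they are not part of the proposal.

```javascript
// Hypothetical predicate factories: each returns a test that uses the
// subject as the left-hand side of the comparison.
const lt = (n) => (subject) => subject < n;
const gte = (n) => (subject) => subject >= n;
const both = (p, q) => (subject) => p(subject) && q(subject);

function sizeOf(val) {
  if (lt(10)(val)) return "small"; // when < 10
  if (both(gte(10), lt(20))(val)) return "mid"; // when >= 10 and < 20
  return "large"; // default
}

console.log(sizeOf(3)); // prints "small"
console.log(sizeOf(10)); // prints "mid"
console.log(sizeOf(25)); // prints "large"
```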

Default Values

Destructuring can supply a default value with = <expr> which is used when a key isn’t present. Is this useful for pattern matching?

Optional keys seem reasonable; right now they’d require duplicating the pattern like ({a, b} or {a}) (b will be bound to undefined in the RHS if not present).

Do we need/want full defaulting? Does it complicate the syntax too much to have arbitrary JS expressions there, without anything like wrapper characters to distinguish it from surrounding patterns?

This would bring us into closer alignment with destructuring, which is nice.

Destructuring enhancements

Both destructuring and pattern matching should remain in sync, so enhancements to one would need to work for the other.

Integration with catch

Allow a catch statement to conditionally catch an exception, saving a level of indentation:

try {
  throw new TypeError('a');
} catch match (e) {
  when RangeError: ...;
  when /^abc$/: ...;
  // unmatched, default to rethrowing e
}

Or possibly just allow an is check in the catch head:

try {
  throw new TypeError('a');
} catch (e is RangeError) {
    ...
} catch (e is /^abc$/) {
    ...
}

(In both cases, the name used for the subject automatically creates a binding, same as catch (e) does today.)
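For comparison, this is how conditional catching has to be written today: catch everything, test the error by hand, and rethrow whatever is not handled. The function `parseLevel` is a hypothetical example, not taken from the proposal.

```javascript
// Current JavaScript: a catch clause cannot filter, so the handler must
// test the error itself and rethrow anything it does not handle.
function parseLevel(input) {
  try {
    const n = Number(input);
    if (Number.isNaN(n)) throw new TypeError("not a number");
    if (n < 0 || n > 10) throw new RangeError("out of range");
    return n;
  } catch (e) {
    if (e instanceof RangeError) return 0; // roughly `when RangeError: ...`
    throw e; // unmatched: rethrow, as the proposed form would by default
  }
}

console.log(parseLevel("5")); // prints "5"
console.log(parseLevel("50")); // prints "0"
```

Calling `parseLevel("abc")` lets the TypeError propagate, since only RangeError is handled.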

Chaining guards

Some reasonable use-cases require repetition of patterns today, like:

match (res) {
  when { status: 200 or 201, let pages, let data } and if (pages > 1):
    handlePagedData(pages, data);
  when { status: 200 or 201, let pages, let data } and if (pages === 1):
    handleSinglePage(data);
  default: handleError(res);
}

We might want to allow match constructs to be chained, where the child match construct sees the bindings introduced in its parent clause, and which will cause the entire parent clause to fail if none of the sub-clauses match.

The above would then be written as:

match (res) {
  when { status: 200 or 201, let data } match {
    when { pages: 1 }: handleSinglePage(data);
    when { pages: >= 2 and let pages }: handlePagedData(pages, data);
  };
  default: handleError(res);
  // runs if the status indicated an error,
  // or if the data didn't match one of the above cases,
  // notably if pages == 0
}

Note the lack of a <subject-expression> in the child (just match {...}), to signify that it’s chaining from the when rather than just being part of an independent match construct in the RHS (which would, instead, throw if none of the clauses match):

match (res) {
  when { status: 200 or 201, let data }: match(res) {
    when { pages: 1}: handleSinglePage(data);
    when { pages: >= 2 and let pages}: handlePagedData(pages, data);
    // just an RHS, so if pages == 0,
    // the inner construct fails to match anything
    // and throws a TypeError
  };
  default: handleError(res);
}

The presence or absence of the separator colon also distinguishes these cases, of course.

proposal-pattern-matching's People

Contributors

bterlson, chicoxyzzy, chocolateboy, coderitual, dotproto, fregante, geoffreybooth, geoffreydhuyvetters, getify, gibson042, indierojo, jack-works, jlhwung, jonathansampson, juancaicedo, kimroen, kyle-cameron, legodude17, ljharb, mpcsh, oguimbal, oliverbrotchie, pyrolistical, rkirsling, robpalme, tabatkins, thescottyjam, tniessen, vitapluvia, zkat

proposal-pattern-matching's Issues

Collision/confusion of syntax, bindings, and variable refs

In the pattern syntax, raw syntax, binding names, and variable references are mixed in ways that are not immediately obvious, and which restrict what you can actually do in seemingly confusing and unfortunate ways:

  • In x, x is a variable reference, forming an identifier pattern - the x variable in the containing scopes will be queried for a Symbol.matches property. In [x], x is a binding name - it matches a length-1 array and binds its first value to x. It's thus impossible to nest an identifier pattern matching inside of an array matching. However, it is possible to nest it into an object matching, via {x: identifier}.
  • In else, else is raw syntax - it denotes the fall-through case. It's thus impossible to have an identifier pattern named by else. This is a small hazard for user code (disallowed names sucked in ES3), and is an extension hazard for the future, as you'll have to wrestle with the possibility of breaking userland code already using a new name as an identifier pattern.

I don't have good solutions for either of these, tho.

guard clauses

this very much looks like the erlang if notation, which is wonderful

one thing that's super useful in erlang matching is guard clauses, which allow you to express constraints which are not about physical layout, but frequently still important

Foo = case Bar of 
    { label, Quux } when is_integer(Quux) -> ...

Proposal to add an ExpressionPattern

I think when working with primitives, an ExpressionPattern will be enormously helpful. Take the following trivial example code:

if (x < 5) {
  console.log("small");
} else if (x < 10) {
  console.log("medium");
} else {
  console.log("large");
}

By having an ExpressionPattern this code is greatly simplified:

match (x) {
  < 5: console.log("small")
  < 10: console.log("medium")
  else: console.log("large")
}

An alternate syntax (which, admittedly, would significantly impact this proposal) would be to auto-bind to a generic value (like it in Kotlin), allowing expressions like this:

match (x) {
  it < 5: console.log("small")
  it < 10: console.log("medium")
  else: console.log("large")
}

How implicit do expressions impact this proposal?

Hi!

there is an idea for implicit/aggressive do expressions tc39/proposal-do-expressions#9 and facebook/jsx#88 (for jsx). This basically makes various statements in JS expressions (ex. switch would be an expression).

With this in mind, maybe instead of adding a new keyword match, improving switch would be a better idea. CSharp recently did something similar — they added special powers to good old switch statement instead of adding a new keyword (unfortunately switch is still a statement which sucks but we would avoid it).

I wonder if this was discussed somewhere already?

CC: @xtuc

How to match multiple patterns (empty legs, &&, ||)

Implicit fall-through is bad in switch because you have to manually insert a terminator, so in cases where you have an interesting body, you still fall through if you forget the terminator.

However, in this proposal, there's an obvious automatic terminator: the end of the expression or block.

So I think there's room for implicit fall-through, in the case that something is not terminated. Which is only when there is no body. Consider:

const result = match(node.type) {
  "tag":
  "script":
  "style":
    createElement(node);
  "root":
    createDocument(node);
  "text":
  "comment":
  "cdata":
    createText(node);
};

Forcing me to add continue;s here seems pointless. It's much more likely that I meant for there to be a fall-through, than that I just forgot to write a body for tag/script/text/comment.

Disallow unreachable clauses

I would propose to ensure that no other clause is after the else one since it should be the last (probably throwing an error at that point).

Currently nothing in the spec disallows this:

match (4) {
  1: "one",
  else: "else",
  4: "abandoned"
}

Since the 4 clause will never match, I don't see why developers should be able to write it. That's also a behavior linters will need to implement for case exhaustiveness.

Missing other literals

I think you'll want to allow undefined, null, true, and false. (Plus bigint literals, eventually.)

I'd also suggest regexp literals, although I agree making them go down the [Symbol.matches] path makes the most sense.

See also #6 for a perhaps-more-elegant way of making this all work.

What happens if nothing matches?

I didn't see this specified, although I might have missed it.

I think it'd be pretty useful if it threw by default, so you had to add an explicit fallthrough value with else.

clarify instanceof checks

I am curious how this should behave, especially with classes that extend null. Symbols are not typically added to things by using a class vs a constructor function.

Syntax sugar for scala case class style usage with match

Structural pattern matching is great.
But nominal is too, for matching on union-type like set of classes:

class Dog { x = 5 }
class Cat { y = 6 }

match (p) {
  { x } && o if o instanceof Dog: x,
  { y } && o if o instanceof Cat: y
};

match (p) {
    { constructor: Dog, x }:  x,
    { constructor: Cat, y }: y
};

// could be something like
match p {
  Dog{ x }: x,
  Cat{ y }: y
}

... on any pattern and CustomType clarified?

Can you do some of the following under the current proposal?

var names = {first:"John",father:{first:"Joseph"}};
var t = function(){};
match (names) {
    { first, father:{first} }: // first is?
    { first : f, father:{first: ...fatherfirst} }: // f contains John and fatherfirst contains "Joseph"
    [ ...[1,y] ] : /*  match?? [[1,2],[1,3]] */
    [ ...t ] : /* matches? [new t(),new t()] or stores ... in t */
}

Also, I should probably raise this somewhere else, but I don't believe this proposal supports subclassed arrays, depending on how the subclassing is done, at least judging by the polyfill in the pull request. Forgive me if something I stated was obvious.

Matching undefined

I'm not sure how this is supposed to be handled, but consider this example:

match (e) {
  {x: undefined}: true,
  else: false
}

Since undefined is just a regular identifier, this will not match objects with a property x set to the value undefined, but rather any object which has an x, and binds the value of the x property to undefined. Or am I reading this proposal wrong?
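The concern rests on a fact of current JavaScript: undefined is an ordinary identifier, not a keyword, so it can be rebound in a nested scope. A minimal sketch (the function name `shadowed` is hypothetical):

```javascript
// `undefined` is a plain identifier, so it is a legal parameter name
// and shadows the global `undefined` inside the function body.
function shadowed(undefined) {
  return undefined;
}

console.log(shadowed("surprise")); // prints "surprise"
console.log(shadowed() === void 0); // prints "true"
```

This is why a pattern syntax that treats bare identifiers as bindings cannot tell `undefined` apart from any other name without a special case.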

Disallow SequenceExpression consequents

I believe you will need to disallow SequenceExpression consequents:

match (foo) {
  1: a, 
  2: b
}

the above would parse as a SequenceExpression of a, 2, and then error at the colon.

I assume you were aware of this, but I didn't see it specified in the grammar.

(This is commonly accomplished in the babylon codebase, for example, by calling parseMaybeAssign instead of parseExpression)
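The comma behavior behind this concern can be seen in current JavaScript. Where a full Expression is allowed, a comma forms a SequenceExpression; where only an AssignmentExpression is allowed, as for array elements, the comma acts as a separator instead. A small sketch, with illustrative variable names:

```javascript
const a = "first";
const b = "second";

// Parenthesized Expression: the comma is the sequence operator, so `a` is
// evaluated and discarded, and the result is `b`.
const viaSequence = (a, b);

// Array elements are AssignmentExpressions, so the comma separates elements.
const viaList = [a, b];

console.log(viaSequence); // prints "second"
console.log(viaList.length); // prints "2"
```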

match { } without parameter

I feel like 90% of use cases of match will be like this example from the README:

let isVerbose = config => match (config) {
    { output: { verbose: true } }: true,
    else: false
}

Wouldn't it make sense to allow a no-arg version of match that effectively returned a function? Thus:

let isVerbose = match {
    { output: { verbose: true } }: true,
    else: false
}

This will remove the need to bind a name to config, thus allowing point-free pattern matching.

Iterable vs. array-like for the [] syntax

I noted this was in the readme as a design alternative. I am strongly in favor of the iterable version, in order to not be super-confusing when compared with destructuring. So I thought I'd open an issue to discuss.

Match against array values

Would be nice to extend array matching such that we can match arrays with specific literals. I wrote a post about pattern matching in JavaScript with a similar proposal. It works similarly to how object matching works, using a :. Here is the post.

http://jarrodkahn.com/posts/6/slug/

I really don't mean this as a plug, I think it might be a meaningful contribution. Glad to see other people are pushing for this feature as well.

I did some work to implement this feature as a `polyfill`.

I love pattern matching and want to use it in JavaScript now.
Taking async-promise as a reference, I wondered whether we could add a function that implements this feature, so that it can be used without syntax support before it is formally added to ECMAScript.

I tried transforming the example above:

let getLength = vector => match (vector) {
    { x, y, z }: Math.sqrt(x ** 2 + y ** 2 + z ** 2),
    { x, y }:    Math.sqrt(x ** 2 + y ** 2),
    [...]:       vector.length,
    else: {
        throw new Error("Unknown vector type");
    }
}

to this form:

let getLength = match([
    [[num, num, num],   (x, y, z) => Math.sqrt(x ** 2 + y ** 2 + z ** 2)],
    [[num, num],        (x, y) => Math.sqrt(x ** 2 + y ** 2)],
    [res,               (...vector) => vector.length],
    [els, (...args) => { throw new Error("Unknown vector type: " + String(args)); }]
]);

And then, I build the match as a higher-order function:

const match = ((isArr, isFn, any, mPns) => (mFns => mFns([[[isArr], mFns], [[isArr, isFn], mFns], [[any, isArr], mPns]]))(
    (fns, elseFn) => (...args) => (
        fns.find(([pattern]) => mPns(pattern, args)) || [, elseFn || ((...args) => { throw new Error('no matched pattern.'); })]
    )[1](...args)
))(arr => arr instanceof Array, fn => fn instanceof Function, () => true,
    (pattern, args) => pattern instanceof Function ?
        pattern(...args) : pattern.length == args.length && pattern.every((p, i) => p instanceof Function ? p(args[i]) : p === args[i]));

With this polyfill, it works as I expect. num, res, and els are validator functions. I'm not sure whether they should be included in the polyfill, but they are easy to implement, like this:

const num = Number.isInteger;
const res = (...args) => args.length > 1;
const els = (..._) => true;

You can find more about it in this repository: js-pattern-match.

What do you think? Is it possible to add a new function like this to bring pattern matching to JavaScript in advance? Or, put another way, is it possible to implement this feature with only a new function, without a syntax change?

`match` keyword

It would probably be better to avoid adding another language keyword, especially this one (String#match, ...)

I'd have loved to do

function foo({a, b}) { return 'matches a and b' }
function foo({a}) { return 'matches a only' }

const bar = (a, {b}) => '...';
const bar = (a, {c}, e) => '...';

allowing an exception to the const redeclaration rule in the case of pattern matching

Binding an outer form

In Haskell's matching syntax, it's possible to specify a nested form, but also bind the outer object to a variable name. A pattern that splits a list into a head/tail might be (x:xs), binding the head to x and the tail to xs, while one that also binds the entire list might be s@(x:xs), binding the list to s.

(This is a more general requirement for destructuring; it just also applies here.)

Alternate proposal

Edit: Note: the bulleted list of differences may be out of date at this point.

I was working on my own personal strawman about the same time, and I didn't realize that one of you all would end up beating me to it.

I did, shortly after noticing this, make a few modifications (allow blocks and modify how I define bindings), but I am curious how you all feel about mine compared to this.

Note: Please comment here, not on my gist. I'm not going to get notified of comments there, because GitHub seems to have missed a very nice bit of UX in that area. 😦


Here's a quick summary of some of the differences:

  • I use case instead of match. There is a cover grammar (included in the gist) to differentiate from case clauses in switch statements, but it's fairly trivial, and I'm aware of no other potential ambiguities.
  • I separate value checks, binding creation, and brand checks, to allow some better optimization.
  • I use is for all native brand checks, including typeof and instanceof. (A little Ruby-like.)
  • I allow custom destructuring, with Map, Set, and libraries like Lodash/Immutable in mind.
  • I match iterables, not array-like objects (although that could be added)
  • I separate values from bindings, to keep it from becoming visually or grammatically ambiguous.
  • Instead of binding -> pattern, I use binding = pattern. (It's not ambiguous.)
  • I have two symbols involved: @@test and @@unapply.
  • My explanation of the proposal in that gist admittedly isn't the best way to explain it. 😄

And yes, it's a bit meatier, but I've also looked into some of the optional extensions:

  • Active patterns like in F# involve @@test and potentially @@unapply for advanced cases. (I drew more inspiration from Scala than F# in it, though. The similarities are just coincidental)
  • if predicates are already included, since I thought of them more as a primitive than an add-on.
  • Arrays match iterables by default. I honestly didn't even think of matching array-like structures until reading this proposal.
  • All the dynamic stuff is in is clauses (i.e. brand checking), as opposed to something like @@matches. It's destructuring or === everywhere else.
  • I decided against allowing closed objects at all, since it's rarely useful in practice, and the community tends to treat object literals as open, mutable records anyways. (TypeScript doesn't even support exact types.)
  • I addressed some of the optimization issues (object properties accessed at most once, iterables iterated at most once). Of course, that's only observable behavior.

Algebraic datatypes and pattern matching

If we want to take JS in a more functional direction, one way is to continue with using object literals each time we want to construct something, and another way would be to use something more like algebraic data types which are constructed explicitly. For example, you could have a class with two constructors: Cons(car, cdr) and Nil. Nil would fit cleanly into this pattern matching system, but to pattern match on Cons, it seems pretty tough: the arguments passed in would be evaluated, rather than bound to. Do you think this could be fixed in a follow-on extension, if we decide to go down a path of having algebraic data types?

Multiple heads and nested lists and functions?

Is there a way to match multiple heads, with 1, 1+, or 0+ occurrences (a head is also known as the variable type or instance, depending on the implementation, or the first value in a list)? For example, let's say I have the following arrays of JavaScript instances, where Number and String could be any type. There is a left and a right side of the .. operator: the left side defines the name of the pattern, and the right is the instanceof object (with caveats).

    [1,2,"3"] // matches [x..Number,y..Number,z..String]
    [1,2] // matches [x..Number,y..Number,z...String]
    [1,2,"3","4"] // matches [x..Number,y..Number,z...String]

In addition what about nested patterns of functions? Doing so would make JavaScript much more like a LISP.

I'm thinking something like the following in Mathematica, which would return True. Remember that functions in Mathematica are written as f[args] instead of f(args), and arrays are written like {1,2}.

MatchQ[f[1,{"List",1,2}], f[x_,{"List",y__},z___]]

If you are familiar with LISP the following is a basic example.

(setf (readtable-case *readtable*) :invert)
(load 'init.lisp)
(compile-mma)
(load-mma)
(|ReplaceAll| '(f 2 2) '(|Rule| (|Pattern| a (|BlankSequence|)) (f1 a) ) )
 > (F1 ((F1 F) (F1 2) (F1 2)))
(elt '(F1 ((F1 F) (F1 2) (F1 2))))
(|ReplaceAll| '(f 2) '(|Rule| (f (|Pattern| a (|Blank|))) (f1 a) ) )
 > (f1 a)
(|ReplaceAll| '(f a b) '(|Rule| (|Pattern| a (|Blank|))) (f 1 2) )))
> (f a b)

In Mathematica the variable name comes before the underscore, while the head you are trying to match (such as a function name) comes after it.
Adopting Mathematica's approach, but using .. and ... instead:

f(x..,["List",y...],z...)

Basically, in LISP you can match both functions and lists, and they can be nested. Here are some more examples, adapted to this syntax:

let length = vector => match (vector) {
    { x : x.. , y : y.. , z: z.. }: Math.sqrt(x ** 2 + y ** 2 + z ** 2),
    { x: x.. , y : y.. }:   Math.sqrt(x ** 2 + y ** 2),
    [v...] :      [v].length,                   // supports 0 or more elements
    else: {
        throw new Error("Unknown vector type");
    }
}

match (arr) {
    []: /* match an empty array */,
    [x...]: /* match array of any size including 0 */,
    [x..]: /* match an array of length 1, bind its first element as x */,
    [x]: /* match an array with the first variable an actual x */,
    [x..]: /* match an array of at least length 1, bind its first element as x */,
    [ { x: 0, y: 0}, ... ]: /* match an array with the 2d origin as the first element */
}

match (val) {
    1: /* match the Number value 1 */,
    "hello": /* match the String value "hello" */,
}

match (val) {
    r..someRegExp: /* val & r matches the regexp */,
    a..Array: /* val & a is an instance of an array */,
    c..CustomType: /* val & c is an instance (or something) of CustomType */,
    p..PointInterface: /* val & p perhaps a tagged union of some sort */
}

let isVerbose = config => match (config) {
    {output: {verbose: true }}: true,
    else: false
}

Values that can get re-evaluated can get tricky.

let obj = {
    get x() { /* calculate many things */ }
}
match (obj.x) {
    //...
    else: obj.x // recalculates.
}

match (obj.x) {
    // ...
    x..: x // the result of evaluating obj.x is bound as x and returned
}
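
The re-evaluation concern can be observed in plain JavaScript today. The following is a minimal sketch, not proposal syntax; the counter `calls` and the variable names are illustrative assumptions.

```javascript
// Count how many times the getter runs (illustrative only).
let calls = 0;
const source = {
  get x() {
    calls += 1;
    return calls * 10;
  },
};

// Reading source.x twice runs the getter twice and may yield different values.
const first = source.x;
const second = source.x;

// Binding the result once lets it be reused without running the getter again.
const bound = source.x;
const reused = bound;
```

A binding form in the match leg would play the role of `bound` here: the value is computed once and then reused.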

Advice from an existing pattern matching library for JavaScript

Hello, I'm a contributor to z, a native pattern matching library for JavaScript: https://github.com/z-pattern-matching/z. Yes, native pattern matching for JavaScript has existed for a while.

I have some advice after working with pattern matching in the JavaScript world for 2 years, and it seems you are missing a few things in this proposal:

1. In the beginning I tried to make it as close to genuine pattern matching as possible, as in Haskell, and as close to functional programming as I could get in JavaScript. The article Why type classes aren’t important in Elm yet describes why that approach failed. The JavaScript community doesn't adopt the strictness of genuine pattern matching, and JavaScript programmers aren't really acquainted with functional programming. As soon as z became "less Haskell" and "more JavaScript", it started to shine and to be adopted, instead of being just a fancy-feature-which-is-used-on-experimental-projects-only.

2. At first I was frustrated at being limited to the original ES2015 syntax; I wished I could change the syntax to gain more power. Now I realize how good that limit was! The challenge of being confined to the original syntax forced me to be creative and deliver the power I wanted in the "JavaScript way". Now any programmer, expert or beginner, who looks at z pattern matching code understands it at first glance and finds it familiar, whether experienced in functional programming or not. Still pattern matching, still JavaScript, still simple and still powerful.

3. TL;DR: be JavaScript, less is more. E.g. why have the fancy monster else as the exhaustive pattern in the proposal code:

const result = a => match (a) {
    { x: 0 }: someCode(),
    else: {
        throw new Error("this is the default");
    }
}

when we can achieve that naturally, as in z?

const result = match(a)(
  (x = 0) => someCode(),
  (x) => { throw new Error("this is the default") }
)

Disallow multiline MatchExpressionPattern

The spec is currently allowing multiple lines in the MatchExpressionPattern since it includes ObjectMatchPattern and ArrayMatchPattern.

In some cases it seems to me to hurt the readability of the code, for example:

let detail = match(test) {
    {
        name,
        birthday: {year},
        birthplace: {country},
    }: name + "(" + year + ")" + " from " + country
}

On the other hand, putting this pattern on one line would be even less readable, but I hope it would force the developer to refactor it somehow, making it harder to do bad things.

What do you think?

Clarify object matching w.r.t. own vs. proto properties

It's not clear to me whether a pattern will match only own properties or properties found along the prototype chain.

For example, is the following match expression true or false?

match(Object.create({p: 0})) {
  {p}: true,
  else: false
}

Bonus question: Would { length } match an array, or do only array patterns match arrays?
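
For reference, here is how existing JavaScript operations treat the same object. This is only a sketch of current language behavior; which of these semantics the proposal adopts is exactly the open question, and the variable names are illustrative.

```javascript
// An object whose only property p lives on its prototype.
const inherited = Object.create({ p: 0 });

// `in` walks the prototype chain; hasOwnProperty does not.
const viaIn = "p" in inherited;
const viaOwn = Object.prototype.hasOwnProperty.call(inherited, "p");

// Destructuring uses ordinary property access, so it also sees inherited properties.
const { p: viaDestructuring } = inherited;

// Arrays expose length as an own property, so object destructuring can read it.
const arr = [1, 2, 3];
const { length: arrLength } = arr;
const lengthIsOwn = Object.prototype.hasOwnProperty.call(arr, "length");
```

If patterns followed destructuring, `{p}` would match the object above and `{ length }` would match any array.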

What Are Idiomatic Brand Checks for Built-ins?

The different types of brand checks I've identified in JS code follow these patterns:

x instanceof y
y in x
x.hasOwnProperty(y)
!!x.y
typeof x === y
isArray(x)
IsPromise(x) // (abstract operation not yet exposed)
x.nodeType === y

It would be ideal if these were unified by pattern matching so that it is not necessary to use other mechanisms. Currently we tend to mix these in the same place which looks pretty messy. If we were to continue using the same brand checks we'd have to bail out of the pattern matching syntax to do so. That in turn means that whatever the default matchers are basically say what the best practice is.

One interpretation is that the proposal only defines the x instanceof y and y in x. This means that this matching strategy is what we define as the best practice or common use case. However, this is not the cow path often used by libraries.

For duck typing properties it is fairly uncommon to use in. It's common to use hasOwnProperty to protect against prototype chains being tainted or expanded. Although it is also common to check for truthiness.

For primitives it is more common to check typeof x === "string" than x instanceof String to ensure that you have a true primitive string rather than the object form. An instanceof check mostly works when you only use primitives but it lets through cross-realm objects which can be bad since most code relies on the prototype methods being the same. It doesn't compose well when other parts of the system use typeof. It might be ok to try to push through instanceof as the defacto way everyone should do it going forward, if we assume that the normal use case is to use primitive values in those cases.

To be resilient against multiple realms, it is common to use isArray to allow arrays to pass between realms. Similarly, iframes have multiple documents and each of those have different Node and Element prototypes. Meaning that instanceof Element is not safe across frames. node.nodeType === Node.ELEMENT_NODE is used for such brand checks. Matching on Array or Element won't catch those in the current proposal. However, similarly to allowing cross-realm strings, it might also be confusing that they do match when the prototype may have different methods on it.

Promise polyfills typically have a variant of IsPromise (which is not directly exposed by the standard library) to check for Promises for the same reason. Should matching a Promise do instanceof or IsPromise?
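
As a small illustration of why these checks are not interchangeable, here is what plain JavaScript reports today. Nothing below is part of the proposal; the `checks` object and its keys are illustrative names.

```javascript
const primitive = "abc";
const wrapped = new String("abc");

// An ordinary object that merely inherits from Array.prototype.
const fakeArray = Object.create(Array.prototype);

const checks = {
  // typeof identifies the primitive but not the wrapper object.
  primitiveTypeof: typeof primitive === "string",
  wrappedTypeof: typeof wrapped === "string",
  // instanceof identifies the wrapper object but not the primitive.
  primitiveInstanceof: primitive instanceof String,
  wrappedInstanceof: wrapped instanceof String,
  // instanceof only inspects the prototype chain; Array.isArray checks the brand.
  fakeInstanceof: fakeArray instanceof Array,
  fakeIsArray: Array.isArray(fakeArray),
  realIsArray: Array.isArray([]),
};
```

A default matcher built on instanceof alone would therefore disagree with both typeof-based and isArray-based code on some of these inputs.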

Pass a matches function into [Symbol.matches](...)

Basically I'm suggesting passing a function into [Symbol.matches] as the second argument so that you can test for other patterns.

Some simple examples:

class Point {
    static [Symbol.matches](value) {
        return value instanceof Point
    }
    
    constructor(x, y) {
        this.x = x
        this.y = y
    }

    [Symbol.matches](other, matches) {
        // Check another pattern first
        // Ideally we don't want to need to inline `other instanceof Point`
        // and it'd be nice that if Point[Symbol.matches] changes in future
        // this will still be fine
        return matches(other, Point)
            && other.x === this.x
            && other.y === this.y
    }
}

// ------

const greaterThan = x => {
    return {
        [Symbol.matches](value, matches) {
            return matches(value, Number)
                && value > x
        }
    }
}

// ------

const FunctionExpressionNode = {
    [Symbol.matches](astNode, matches) {
        return matches(astNode, FunctionNode)
            && // Rest of logic
    }
}
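
To make the suggestion concrete, here is a runnable sketch that emulates it in current JavaScript. Everything in it is an assumption for illustration: `matchesSymbol` stands in for the proposed `Symbol.matches` (which does not exist today), `matches` is a hypothetical helper that would be supplied by the engine, and `NumberPattern` stands in for a built-in matcher on `Number`.

```javascript
// Stand-in for the proposed well-known symbol.
const matchesSymbol = Symbol("matches");

// Hypothetical helper: delegates to the pattern's matcher and passes itself
// along as the second argument so matchers can test other patterns.
function matches(value, pattern) {
  return pattern[matchesSymbol](value, matches);
}

class Point {
  static [matchesSymbol](value) {
    return value instanceof Point;
  }

  constructor(x, y) {
    this.x = x;
    this.y = y;
  }

  // Instance matcher: reuses the static matcher instead of inlining instanceof.
  [matchesSymbol](other, matches) {
    return matches(other, Point) && other.x === this.x && other.y === this.y;
  }
}

// Assumed stand-in for a built-in Number matcher.
const NumberPattern = {
  [matchesSymbol]: (value) => typeof value === "number",
};

const greaterThan = (x) => ({
  [matchesSymbol](value, matches) {
    return matches(value, NumberPattern) && value > x;
  },
});

const samePoint = matches(new Point(1, 2), new Point(1, 2));
const plainObject = matches({ x: 1, y: 2 }, new Point(1, 2));
const fiveOverThree = matches(5, greaterThan(3));
const stringOverThree = matches("5", greaterThan(3));
```

The point of passing `matches` in is visible in `Point`: if the static matcher changes later, the instance matcher keeps working without modification.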

Support multiple parameters to pattern matching

While JavaScript doesn't have a formal concept of a tuple, matching multiple values in one statement rather than using nested match statements would be nice. A proposed syntax using commas draws from Swift:

match (a, b) {
  { x }, { t }: // handle case
  { y }, { z }: // handle case
  else: throw new Error("unmatched")
}

Without the multiple match syntax, this code expands to the following:

match (a) {
  { x }: match (b) {
    { t }: // handle case
    else: throw new Error("unmatched")
  }
  { y }: match (b) {
    { z }: // handle case
    else: throw new Error("unmatched")
  }
  else: throw new Error("unmatched")
}

Optional explicit binding in else leg

In the discussion of Else Leg Syntax, the idea of an F#-like syntax (_: /* do something with _ */) is floated. The advantage of this would be that the result of the match value would be bound, allowing it to be used without re-evaluation or double side effects.

You could allow for optional explicit binding of the match value with else foo: /* do something with foo */ where foo is any valid identifier which would achieve the above as an addition to the proposed else: /* do something */ syntax with the added benefit of binding to any name you wish.

For example:

let obj = {
  get getterWithSideEffects() { /* calculate many things */ }
};

match (obj.getterWithSideEffects) {
  //...
  else foo: /* foo is bound to the result of obj.getterWithSideEffects without re-evaluating */
}

Another example:

match (window.prompt('Where do you want to go?')) {
  'north': y--,
  'east': x++,
  'south': y++,
  'west': x--,
  else direction: window.alert(`Sorry, '${direction}' is not a direction I'm aware of!`)
}

Nested Named Patterns/Groups

So there is currently a proposal to add named capture groups to RegExp. I'm only really familiar with pattern matching in Mathematica, so I can't comment on other languages, but it would be nice if the proposal supported something like named, grouped pattern matching from the get-go. Mathematica uses : for this (name : pattern), which is kinda taken here for obvious reasons.

Basically they are nested. all references the entire array and first references 1. Here is an example using Isiah's syntax.

match ([1,2,3]) {
  all of [first of 1,2,3]: ...
}

Reconsider Symbol.matches mechanism for destructuring

It would be neat to allow the destructuring pattern to just be a single identifier to create a simple binding. It's helpful when it is otherwise unnecessary to create another binding:

match (getValue()) {
  MyMatcher -> foo: doSomething(foo),
  else: doSomethingElse()
}

Currently the proposal implies that the return value of Symbol.matches can be further destructured. However, that only works if the value is truthy. There are matchers that match on something falsy, where you may still want the actual value. Similarly, there are things that might match a falsy value that you want to create a binding for.

Example:

let Some = { [Symbol.matches](v){ return v != null; } };
let None = { [Symbol.matches](v){ return v == null; } };

match(getValue()) {
  Some -> value: ...,
  None: ...
};

Imagine that getValue() can be an empty string or the number zero for example.

It would be good to decouple this mechanism so that you can both indicate that something matches, and if so, what its binding value should be.
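
One possible shape for such a decoupled protocol, sketched in current JavaScript: the matcher returns a result object with a separate `matched` flag and `value`. The symbol `matcher`, the result shape, and the `matchFirst` helper are all assumptions made for illustration, not part of the proposal.

```javascript
// Hypothetical protocol symbol; stands in for Symbol.matches.
const matcher = Symbol("matcher");

// Each matcher reports success separately from the value to bind.
const Some = {
  [matcher]: (v) => (v != null ? { matched: true, value: v } : { matched: false }),
};
const None = {
  [matcher]: (v) => (v == null ? { matched: true, value: v } : { matched: false }),
};

// Tries each [pattern, handler] leg in order and runs the first that matches.
function matchFirst(value, legs) {
  for (const [pattern, handler] of legs) {
    const result = pattern[matcher](value);
    if (result.matched) return handler(result.value);
  }
  throw new Error("no leg matched");
}

const describe = (v) =>
  matchFirst(v, [
    [Some, (value) => `some:${String(value)}`],
    [None, () => "none"],
  ]);
```

With this shape, `describe(0)` and `describe("")` take the `Some` leg even though the bound values are falsy.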

Ability to no-bind multiple slots

Haskell blesses the _ identifier in match expressions to match anything but bind nothing; as a result, you can use it in multiple places in the match expression to indicate multiple holes you don't care about.

It's not clear from the README whether binding the same name multiple times would be an error or not, but if it is, this would conflict with this useful Haskell ability.

How strict is the matching with regard to rest syntax?

Love this proposal, can imagine discovering many great applications for it.

I would just like to know how strict the matching is, specifically for the ... (rest syntax).

For example:

const obj = { x: "something" };

match (obj) {
    { x }: /* match an object with x */,
    { x, ... y }: /* match an object with x, stuff any remaining properties in y */,
}

Would this match both conditions? I feel like it probably would, but I can't help thinking that being more strict about it might have its uses down the line.

The same applies with the array pattern:

const arr = ["something"];

match (arr) {
    [x]: /* match an array of length 1, bind its first element as x */,
    [x, ...]: /* match an array of at least length 1, bind its first element as x */,
}

Should it match the second condition, even though it seems to imply that there are more values? I see in the comment that at the moment matching both is the intended behaviour.
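
For comparison, existing destructuring is not strict about rest elements: the rest binding may be empty. The sketch below shows only current destructuring behavior, which the proposal may or may not mirror; the variable names are illustrative.

```javascript
// Object rest: with no remaining properties, the rest binding is an empty object.
const source = { x: "something" };
const { x: objX, ...objRest } = source;
const objRestKeys = Object.keys(objRest);

// Array rest: with no remaining elements, the rest binding is an empty array.
const list = ["something"];
const [first, ...tail] = list;
```

If patterns behave the same way, both legs in each example would be eligible and the first one listed would win.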

Consideration of generic sequence and mapping literal syntax

@isiahmeadows suggested on https://esdiscuss.org/topic/stage-0-proposal-extensible-collection-literal that we consider possible sequence and mapping literal proposals in tandem with pattern matching and generic destructuring to ensure that everything fits together nicely.

My proposal for such is here: https://github.com/alex-weej/es-extensible-collection-literal

As an example of how these might be related

// Note: 42 is a number, not a string.
const someMap = Immutable.Map#{42: "forty two"};

console.log(
  match (someMap) {
    #{42: _}: `It knew the answer, and it was ${_}!`,
    _: `It didn't know the answer...`,
  }
);

More nested matching examples

Especially compositions of different matching types. For example, the following was raised by @marcfawzi on Twitter:

match (p) {
  { name: String, age: Number }: "valid input",
  else: "invalid input"
}

Why not more connection to destructuring?

When people theorized about pattern matching syntax in the past, it was always very related to destructuring, using the same syntax in most cases. I can appreciate there are probably reasons to depart, but I think it'd be useful to have a section explaining the connection, or lack thereof.

Multiline support

Has this, or some variant of it, been considered so that a leg can span multiple lines?

match( obj ) {
  case {x, y}:
    ...
  break;
}

Matching identifier value

Is there a way in the current proposal to match against an identifier's value? Passing someIdent in a match leg currently would only bind that value, so maybe explicit parens like (someIdent) could be used to treat it as an expression for comparison purposes?

This is also somewhat relevant to the undefined case introduced by 44531a0 and mentioned in #15. Technically, undefined is an identifier and not a literal, so that commit creates a precedent for it being special, which in turn could raise similar questions about NaN, Infinity and potentially others.

An explicit way to compare expressions could generalise this nicely.

Examples with try/catch

Firefox implements non-standard conditional catch clauses like

try {
  myCode();
} catch (error if error instanceof MyError) {
  // ...
}

I think this is a useful shorthand, as it helps avoid accidentally forgetting to rethrow errors you're not interested in.

The reason I bring it up in this proposal is that it is often mentioned that pattern matching is in some sense a blocker for standardizing conditional catches. e.g. here or here.

Do you see a path for applying this proposal to conditional catches?
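
For context, this is the standard JavaScript that a conditional catch would abbreviate. It is a minimal sketch: `MyError` and the `run` wrapper are illustrative names, and the explicit `throw error` is the step that is easy to forget.

```javascript
class MyError extends Error {}

// Runs myCode, handles MyError, and rethrows everything else.
function run(myCode) {
  try {
    myCode();
    return "ok";
  } catch (error) {
    if (error instanceof MyError) {
      return "handled MyError";
    }
    // Without this line, unrelated errors would be silently swallowed.
    throw error;
  }
}
```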

Confusing syntax

The syntax for

let length = vector => match (vector) {
    { x, y, z }: ...,
    { x, y }:   ...,
}

is confusing, as it is not clear what it means in this case:

let x = 'foo';
let length = vector => match (vector) {
    { x, y, z }: ...,
    { x, y }:   ...,
}

Does this use the newly proposed meaning, or the shorthand object syntax that expands to:

let x = 'foo';
let length = vector => match (vector) {
    { x: x, y, z }: ...,
    { x: x, y }:   ...,
}

Should literals also go through Symbol.matches? Plus, general special-casing considerations

It seems at least doable that numeric, string, and boolean literals (see #3 for the latter) could go through Number.prototype[Symbol.matches].

I'm not sure whether this buys you much though, given that you already have special-cases for the object and array patterns.

I guess maybe it would allow you to simplify the syntax to:

  • Object pattern
  • Array pattern
  • Any other expression

where in the third case you go the Symbol.matches route. This third case would subsume the current identifier and literal cases.

Curious what people think.

Integration with try/catch

I've been hoping that we could eventually get some way of having multiple catch blocks which switch on which type of exception is thrown. The key here would be, if none of them match, it implicitly rethrows. This is a feature of most other languages with exceptions.

This pattern matching proposal ended up pretty differently from what might be more useful there. One problem is, we already do destructuring on the catch variable, so we can't start applying a different grammar.

Do you see extensions that would allow it to be used in a context like this? One alternative would be https://github.com/zenparsing/es-typed-catch . I guess the viability of this ties into the answer to #13 as well.

labeled `match` expressions and `continue`

There is a brief discussion of 'continue' to achieve fall-through. Considering that match expressions can be nested, continue would be more useful if they could be labelled.

Has this been considered? I think this is doable:

(a: match {
   {x: true}:
      match {
         {y: false}: continue a
      },
   else: true
 })

There is no reason to have `else`

Given that these work:

let length = vector => match (vector) {
    { x, y, z }: Math.sqrt(x ** 2 + y ** 2 + z ** 2),
    { x, y }:   Math.sqrt(x ** 2 + y ** 2),
}

Then there is no reason why this would not work

let length = vector => match (vector) {
    { x, y, z }: Math.sqrt(x ** 2 + y ** 2 + z ** 2),
    { x, y }:   Math.sqrt(x ** 2 + y ** 2),
    x: x,
}

Clarify what it means for an object to be "with certain properties"

Specifically, while it's clear from the spec that {a: 1} would match the ObjectMatchExpression {a} and not match {a, b}, it's not clear which would be matched by {a: 1, b: undefined}. I argue that {a: 1, b: undefined} should in fact match {a} and not match {a, b}. Even if you disagree, the spec must be clarified.

Background

At a language level, 'prop' in obj is not the same as obj.prop !== undefined, even though if either is false it follows that obj.prop === undefined. In particular, when discussing whether an object contains a property, it would be imprecise at best to describe {a: undefined} as not containing the property a, since both 'a' in {a: undefined} and ({a: undefined}).hasOwnProperty('a') are true.
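
The distinction can be checked directly in current JavaScript. This is a minimal sketch; the object and key names are illustrative.

```javascript
const explicit = { a: undefined }; // property present, value undefined
const absent = {};                 // property not present at all

const presence = {
  // Both presence checks report the property as existing on `explicit`.
  explicitIn: "a" in explicit,
  explicitOwn: Object.prototype.hasOwnProperty.call(explicit, "a"),
  // The value check cannot tell the two objects apart.
  explicitDefined: explicit.a !== undefined,
  absentIn: "a" in absent,
  absentDefined: absent.a !== undefined,
};
```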

Reasons to use !== undefined semantics rather than in semantics

However, in real-world code/APIs it's common to treat {a: 1, b: undefined} equivalently to {a: 1}. For example, with jQuery, $('#thing').css({color: 'red', backgroundColor: undefined}) is equivalent to $('#thing').css({color: 'red'}). In fact, there are even articles discouraging people from delete-ing properties and encouraging them to instead overwrite them with undefined or null, claiming that this is more performant in modern JS classes because delete changes the hidden class of the object (example of such an article, another example). It doesn't matter whether this claim is true, my point is that people believe this and deliberately write their code this way.

This means in practice it's common for people to write APIs with the intention that this:

const options = {a: 1};
if (someCondition()) options.b = 2;
someAPICall(options);

be equivalent to this:

someAPICall({
  a: 1,
  b: someCondition() ? 2 : undefined
});

For match to be useful to the implementor of such an API, {a: 1, b: undefined} would have to count as an object "with" property a but not b.

Finally, the spec mentions an intention to be "similar" to destructuring. Note that for destructuring, var {a, b} = {a: 1}; is equivalent to var {a, b} = {a: 1, b: undefined};.
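
The destructuring equivalence mentioned above can be sketched as follows. The helper names `pick` and `withDefault` are illustrative, and the default value is only there to show that defaults also treat a missing property and an undefined one alike.

```javascript
// Destructure a and b from a parameter object.
const pick = ({ a, b }) => [a, b];

// Same, but with a default for b; defaults apply whenever the value is undefined.
const withDefault = ({ a, b = "fallback" }) => [a, b];

const fromMissing = pick({ a: 1 });
const fromUndefined = pick({ a: 1, b: undefined });

const defaultFromMissing = withDefault({ a: 1 });
const defaultFromUndefined = withDefault({ a: 1, b: undefined });
```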

Open question

Should we go even further and allow even {a: 1, b: null} to match {a}? What about all falsy values, so that the ternary expression in the example above could be replaced with simply someCondition() && 2? I don't think so, I think matching destructuring makes the most sense here.

Contrast with #27

This issue concerns the same ambiguous language that #27 is asking about, but #27 is asking a different question. This issue asks about whether to use !== undefined semantics rather than in semantics. #27 is asking whether to use in semantics or .hasOwnProperty() semantics.
