Giter VIP home page Giter VIP logo

bch-loops's Introduction

CHIP-2021-05-loops: Bounded Looping Operations

    Title: Bounded Looping Operations
    Type: Standards
    Layer: Consensus
    Maintainer: Jason Dreyzehner
    Status: Draft
    Initial Publication Date: 2021-05-28
    Latest Revision Date: 2021-05-28

Summary

This proposal includes two new BCH virtual machine (VM) operations, OP_BEGIN and OP_UNTIL, which enable a variety of loop constructions in BCH contracts without increasing the processing or memory requirements of the VM.

Deployment

Deployment of this specification is proposed for the May 2022 upgrade.

This proposal requires CHIP: Targeted Virtual Machine Limits, or equivalent limits on the processing and memory usage of the VM.

Motivation

The Bitcoin Cash VM is strictly limited to prevent maliciously-designed transactions from requiring excessive resources during transaction validation.

Loops were originally excluded from this design on the mistaken assumption that they risk consuming excessive validation resources. In reality, a small number of VM operations make up the majority of resource usage during transaction validation, and any implementation of bounded loops can be many orders of magnitude faster than the most pathological VM bytecode constructions.

Introducing this basic control flow structure would make BCH contracts significantly more powerful and efficient.

Benefits

Looping operations enable many use cases where specifying a precise procedure using OP_IF is impractical or impossible, and they also significantly reduce the bytecode length of many contracts with repeated procedures.

Costs & Risk Mitigation

The following costs and risks have been assessed.

Modification to Transaction Validation

Without adequate limits, looping operations have the potential to increase worst-case transaction validation costs and expose VM implementations to Denial of Service (DOS) attacks.

Mitigations: this proposal includes strict limits which ensure loops do not increase the maximum executable bytecode length (MAX_SCRIPT_SIZE, 10,000 bytes). This altogether avoids increasing the available surface area for Denial of Service (DOS) attacks.

Node Upgrade Costs

This proposal affects consensus โ€“ all fully-validating node software must implement these VM changes to remain in consensus.

These VM changes are backwards-compatible: all past and currently-possible transactions remain valid under these new rules.

Ecosystem Upgrade Costs

Because this proposal only affects internals of the VM, standard wallets, block explorers, and other services will not require software modifications for these changes. Only software which offers VM evaluation (e.g. Bitauth IDE) will be required to upgrade.

Wallets and other services may optionally upgrade to add support for new contracts which will become possible after deployment of this proposal.

Technical Specification

Two new VM operations are added: OP_BEGIN (0x65/101) and OP_UNTIL (0x66/102).

Note: these codepoints were previously used by OP_VERIF (0x65) and OP_VERNOTIF (0x66) in early versions of Bitcoin, but those operations were disabled when it was realized that they created a fork-risk between different versions of the Bitcoin software. Both codepoints currently render the transaction invalid (even when found in an unexecuted OP_IF branch), so they are effectively equivalent to any codepoints in the undefined ranges. They are particularly well-suited for this application because they are the only two available, adjacent codepoints within the control-flow range of operations.

To support these operations, the control stack is modified to support integer values, and a new unsigned integer counter Repeated Bytes is added to the VM state.

Control Stack

This proposal modifies the control stack (A.K.A. ExecStack or ConditionStack) to support integer values. When testing for operation execution, integer values must be ignored.

Prior to this proposal, the control stack (A.K.A. ExecStack or ConditionStack) is conceptually an array of boolean values: when a flow control operation (like OP_IF) is encountered, a true is pushed to the stack if the branch is to be executed, and false if not. (OP_ELSE toggles the top value on this stack.) Non-flow control operations are only evaluated if the top of the stack is true.

Repeated Bytes

A new unsigned integer counter, Repeated Bytes, is added to the VM state, initialized to 0 at the beginning of evaluation.

Repeated Bytes is only used by OP_UNTIL to prevent excessive use of loops.

OP_BEGIN (0x65/101)

OP_BEGIN pushes the current instruction pointer index to the control stack (A.K.A. ConditionStack).

This operation sets a pointer to which the evaluation will return when the matching OP_UNTIL is encountered. If a matching OP_UNTIL is never encountered, the evaluation returns an error.

OP_UNTIL (0x66/102)

OP_UNTIL pops the top item from the control stack:

  • if this control value is not an integer, error.
  • The difference between the current instruction pointer and the control value is added to Repeated Bytes; if Repeated Bytes + script.size() is greater than MAX_SCRIPT_SIZE, error.
  • The top item is popped from the stack, if the value is 0 (the same test as OP_NOTIF) the instruction pointer is set to the OP_BEGIN instruction (otherwise, evaluation continues past the OP_UNTIL).

OP_UNTIL has two possible error conditions: 1) a mismatched OP_BEGIN (either OP_BEGIN is missing or the loop contains an OP_IF without a matching OP_ENDIF), or 2) executing the loop would cause the evaluation to exceed MAX_SCRIPT_SIZE. If neither of these occur, and the top stack item is 0, evaluation returns to the OP_BEGIN (and that instruction pointer index is again pushed to the control stack).

Rationale

This section documents design decisions made in this specification.

Naming and Semantics

This proposal is influenced by the BEGIN ... UNTIL indefinite loop construction in the Forth programming language (upon which the rest of the Bitcoin Cash VM is based). This particular construction offers maximal flexibility: definite loops can be easily emulated with indefinite loops, but indefinite loops cannot be emulated with definite loops.

Suitability in Covenant Applications

One alternative to the proposed OP_BEGIN ... OP_UNTIL construction could be a single OP_GOTO operation which (with certain limits) jumps to the instruction pointer at the top of the stack.

While such an operation could help to further increase the efficiency and expressiveness of the BCH instruction set, a GOTO instruction also makes modeling execution more difficult. This is particularly limiting for covenant applications โ€“ it is notably more difficult for covenants to keep track of expected instruction pointer indexes (to reconstruct contracts with hypothetical OP_GOTO operations) than it is for them to use this proposal's construction.

The higher-level OP_BEGIN ... OP_UNTIL construction also better matches the semantics of the existing VM instruction set (e.g. OP_IF ... OP_ENDIF).

Usage Examples

The following examples demonstrate usage of OP_BEGIN and OP_UNTIL.

Simplifying Repeated Procedures

Bounded loops reduce wasted bytes in repeated procedures like merkle proof validation.

Merkle trees offer a versatile data structure for storing state in BCH contracts. Below is one possible strategy for validating a merkle proof (shown validating the left-most element in a tree of 16 elements):

<sibling_tier_4_hash>
<sibling_tier_3_hash>
<sibling_tier_2_hash>
<sibling_leaf_hash>
<value>
<1> OP_TOALTSTACK
<1> OP_TOALTSTACK
<1> OP_TOALTSTACK
<1> OP_TOALTSTACK

OP_FROMALTSTACK // check if leaf requires swap
OP_IF OP_SWAP OP_ENDIF
OP_CAT OP_HASH160

OP_FROMALTSTACK // check if tier 2 requires swap
OP_IF OP_SWAP OP_ENDIF
OP_CAT OP_HASH160

OP_FROMALTSTACK // check if tier 3 requires swap
OP_IF OP_SWAP OP_ENDIF
OP_CAT OP_HASH160

OP_FROMALTSTACK // check if tier 4 requires swap
OP_IF OP_SWAP OP_ENDIF
OP_CAT OP_HASH160

<root> OP_EQUALVERIFY

With bounded loops, this procedure can be simplified to:

<sibling_tier_4_hash>
<sibling_tier_3_hash>
<sibling_tier_2_hash>
<sibling_leaf_hash>
<value>
<0> OP_TOALTSTACK
<1> OP_TOALTSTACK
<1> OP_TOALTSTACK
<1> OP_TOALTSTACK
<1>

OP_BEGIN
  OP_IF OP_SWAP OP_ENDIF // check if leaf requires swap
  OP_CAT OP_HASH160
  OP_FROMALTSTACK
OP_UNTIL

<root> OP_EQUALVERIFY

Aggregation

Without bounded loops, contracts are forced to emulate aggregation by re-specifying a particular aggregation procedure for each possible set size.

For example, to aggregate the UTXO value of each transaction input without bounded loops (for up to 4 inputs using transaction introspection operations):

<0> // aggregated value
<4> // the count of values to aggregate

// locking bytecode
OP_DUP OP_TXINPUTCOUNT OP_LESSTHANOREQUAL OP_VERIFY // prevent malleability
OP_1SUB OP_DUP <0> OP_GREATERTHANOREQUAL OP_IF
    OP_SWAP
    <0> OP_UTXOVALUE
    OP_ADD
    OP_SWAP
OP_ENDIF
OP_1SUB OP_DUP <0> OP_GREATERTHANOREQUAL OP_IF
    OP_SWAP
    <1> OP_UTXOVALUE
    OP_ADD
    OP_SWAP
OP_ENDIF
OP_1SUB OP_DUP <0> OP_GREATERTHANOREQUAL OP_IF
    OP_SWAP
    <2> OP_UTXOVALUE
    OP_ADD
    OP_SWAP
OP_ENDIF
OP_1SUB OP_DUP <0> OP_GREATERTHANOREQUAL OP_IF
    OP_SWAP
    <3> OP_UTXOVALUE
    OP_ADD
    OP_SWAP
OP_ENDIF
OP_DROP // drop remaining index, leaving aggregated value on top

With bounded loops, this aggregation can be simplified to:

<0> <0>
OP_BEGIN                            // loop (re)starts with index, value
  OP_OVER OP_UTXOVALUE OP_ADD       // Add UTXO value to total
  OP_SWAP OP_1ADD OP_SWAP           // increment index
  OP_OVER OP_TXINPUTCOUNT OP_EQUAL
OP_UNTIL                            // loop until next index would exceed final index
OP_NIP                              // drop index, leaving sum of UTXO values

Implementations

Please see the following reference implementations for additional examples and test vectors:

[TODO: after initial public feedback]

Stakeholders & Statements

[TODO]

Feedback & Reviews

Copyright

This document is placed in the public domain.

bch-loops's People

Contributors

bitjson avatar

Stargazers

 avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar

Forkers

yttiktak

bch-loops's Issues

Motivation reads a bit glib :-)

(yes, I have a new account as github blocked my old one).

At this point the 'Motivation' chapter has this text;

Loops were originally excluded from this design on the mistaken
assumption that they risk consuming excessive validation
resources. In reality, a small number of VM operations make up
the majority of resource usage during transaction validation,
and any implementation of bounded loops can be many orders
of magnitude faster than the most pathological VM bytecode
constructions.

This makes people believe that the actual loops are the thing that was the worry. Naturally that makes no sense, jumping a number of times is indeed cheap, and that was never the problem.
The reason loops were disabled was the fact that it makes calling those really heavy resource-usage type of operations much much cheaper.

The logic simply is that if you have an operation that has cost Y to it, and then you surround it with a bunch of opcodes to be able to repeatedly do that same operation, you worry about the n * Y cost. Not the overhead of looping, but the fact that it became cheap to multiply cost Y.

It probably makes sense to be honest about this in the motivation and look at what kind of mitigation the original Satoshi stack had and compare that with what you present.

Motivation needs non-techy usecases

Motivation assumes we need loops, but the concept of CHIPs is that the status-quo is what we have today. Any divergence should be backed up with actual rationale's and preferably usecases.

So the motivation needs to be improved a lot in order to actually show what can not be done today (or may be too expensive) but could be possible with this new feature. Because from the CHIP process this is a new feature since it is not available in BCH today.

ps. the merkle-proof code doesn't seem to take under consideration that the unlocking transaction is the one that picks the depth. And as more tx go in a block, the depth will change.

Constrol stacks "integer" is confusing.

Under OP_UNTIL you write:

if this control value is not an integer, error.

This is confusing as the push to the control stack is described as "pushes the current instruction pointer index to the control stack".

The proposal freely mixes "integer" and "instruction pointer" for this concept and I feel that the 'integer' is an implementation detail that should not be described in the spec. The spec doesn't care if its an actual pointer or some integer offset or something.

Please consider dropping the term 'integer' in the context of the Condition Stack and simply using the term "instruction pointer". Maybe with a single note saying that this likely is best implemented as an offset-integer.

Repeated Bytes isn't recursive as described.

The spec states that begin/until are using a stack. And this makes them recursive. You can have a loop in a loop.

This doesn't vibe with the Repeated Bytes concept as described as that is stated to be a VM-global variable.

I would drop the idea of adding a global state to the VM and simply make the UNTIL opcode do the math of the amount of bytes moved (until-pos minus begin-pos).

While this is unfair in case of skipped blocks, its pretty darn cheap to do.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.