Collected notes

Presentation

General quantifiers r{min,delta,greedy}

The problem with general quantifiers is that their compilation expands the regex min amount of times in the bytecode. With an example such as ((a{20}){20}){20}, this generates 20^3 instructions.

Implementing general quantifiers in the PikeVM can be approached from two perspectives, either,

Since the complexity proof of linearity is in terms of the bytecode length, the second approach would work well. Otherwise, the first approach requires incorporating a separate AST rewrite step. It will be easy to incorporate in proofs if it is done in the compile function. That might be also good grounds for supporting proofs of AST rewrites for optimization.

Adding anchors to the PikeVM

Adding anchors to the PikeVM was a straightforward change. The following changes were needed:

  1. Add a CheckAnchor bytecode instruction and define its semantics

    PikeVM’s small step semantics were extended to process this instruction. This is done by handling the CheckAnchor instruction when computing the epsilon step of a thread. When handling it, we check if the anchor is satisfied by using the anchor_satisfied function. It is exactly the same check performed by the backtracking tree semantics.

    1. If the anchor is satisfied, the PikeVM progresses that thread to the next instruction (one program counter ahead)
    2. If not, the PikeVM kills that thread
  2. Compile the Anchor AST node to a CheckAnchor instruction. The next instruction to be executed is stored at the program counter following CheckAnchor’s.

  3. Add Anchor to the subset of regexes the PikeVM supports. Add AnchorPass to the subset of backtracking trees the PikeVM supports.

  4. Handle AnchorPass in the PikeTree algorithm by simply continuing exploration of the backtracking tree.

  5. UNCLEAR WHAT THIS IS: Add anchor support to tree_rep in TreeRep.v

  6. State that a representation of the bytecode can start with a CheckAnchor instruction.

  7. Finally, update all proofs to handle the new cases.

To discuss

Action items