Declarative String Processing Overview

Patrick_Smith · September 30, 2021, 2:14am

Declarative parsers like this are really powerful, and conceptually I find them a lot easier to understand.

I’ve been playing around with similar ideas using generator functions in JavaScript (GitHub - JavaScriptRegenerated/yieldparser: Parse using JavaScript generator functions — it’s like components but for parsing!). I think of the model as being akin to components for parsing. The power of components is that they allow you to compose larger solutions out of smaller solutions, and they force you to give those smaller pieces a name.

import { parse, mustEnd } from 'yieldparser';

function* Digit() {
  const [digit]: [string] = yield /^\d+/;
  const value = parseInt(digit, 10);
  if (value < 0 || value > 255) {
    return new Error(`Digit must be between 0 and 255, was ${value}`);
  }
  return value;
}

function* IPAddress() {
  const first = yield Digit;
  yield '.';
  const second = yield Digit;
  yield '.';
  const third = yield Digit;
  yield '.';
  const fourth = yield Digit;
  yield mustEnd;
  return [first, second, third, fourth];
}

parse('1.2.3.4', IPAddress());
/*
{
  success: true,
  result: [1, 2, 3, 4],
  remaining: '',
}
*/
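
To give a feel for how a driver like parse could work, here is a minimal sketch of one way to step through such generators. It is not yieldparser's actual implementation: run, Step, ParserGen, ParseResult, Octet and Pair are hypothetical names made up for illustration, and it leaves out mustEnd and most error reporting.

```typescript
// Hypothetical sketch only: NOT how yieldparser is actually implemented.
// A parser yields a string, an anchored RegExp, or another parser function.
type Step = string | RegExp | (() => ParserGen);
type ParserGen = Generator<Step, unknown, any>;

type ParseResult =
  | { success: true; result: unknown; remaining: string }
  | { success: false; failedOn: string; remaining: string };

function run(input: string, makeParser: () => ParserGen): ParseResult {
  const gen = makeParser();
  let remaining = input;
  let reply: unknown = undefined;

  while (true) {
    // Send the previous match back in; receive the next thing to match.
    const next = gen.next(reply);
    if (next.done) {
      // Returning an Error from the generator signals a failed parse.
      if (next.value instanceof Error) {
        return { success: false, failedOn: next.value.message, remaining };
      }
      return { success: true, result: next.value, remaining };
    }

    const step = next.value;
    if (typeof step === 'string') {
      if (!remaining.startsWith(step)) {
        return { success: false, failedOn: JSON.stringify(step), remaining };
      }
      remaining = remaining.slice(step.length);
      reply = step;
    } else if (step instanceof RegExp) {
      // Assumes a non-global regex; the match must start at position 0.
      const match = step.exec(remaining);
      if (match === null || match.index !== 0) {
        return { success: false, failedOn: String(step), remaining };
      }
      remaining = remaining.slice(match[0].length);
      reply = match;
    } else {
      // A nested parser: run it on what is left and pass its result back.
      const sub = run(remaining, step);
      if (!sub.success) return sub;
      remaining = sub.remaining;
      reply = sub.result;
    }
  }
}

function* Octet(): ParserGen {
  const [digits]: [string] = yield /^\d+/;
  const value = parseInt(digits, 10);
  if (value > 255) return new Error(`Octet must be at most 255, was ${value}`);
  return value;
}

function* Pair(): ParserGen {
  const a: number = yield Octet;
  yield '.';
  const b: number = yield Octet;
  return [a, b];
}

const pairResult = run('1.2', Pair);
const badResult = run('1.999', Pair);
```

The point of the sketch is that the generator never touches the input string itself. It only describes what it wants next, and the driver decides how to match it, which is what makes the small pieces easy to compose.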

Another interesting library is elm/parser (GitHub - elm/parser: A parsing library, focused on simplicity and great error messages) which uses parser combinators. Their prior art is also worth reading: parser/comparison.md at master · elm/parser · GitHub
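
For contrast, here is a rough sketch of what the same IP address parser might look like in a combinator style. To be clear, this is TypeScript rather than Elm, and literal, octet, andThen, map, skipThen and ipAddress are hypothetical helpers written for this example, not elm/parser's API. It also does not check for end of input.

```typescript
// Hypothetical combinator sketch; none of these names come from elm/parser.
// A parser takes the input and returns a value plus the unconsumed rest, or null.
type Parser<T> = (input: string) => { value: T; rest: string } | null;

// Match an exact piece of text at the start of the input.
const literal = (text: string): Parser<string> => (input) =>
  input.startsWith(text) ? { value: text, rest: input.slice(text.length) } : null;

// Match a run of digits and accept it only if it is in the range 0..255.
const octet: Parser<number> = (input) => {
  const match = /^\d+/.exec(input);
  if (match === null) return null;
  const value = parseInt(match[0], 10);
  return value <= 255 ? { value, rest: input.slice(match[0].length) } : null;
};

// Run `first`, then use its value to pick the parser for the rest of the input.
const andThen = <A, B>(first: Parser<A>, next: (a: A) => Parser<B>): Parser<B> =>
  (input) => {
    const r = first(input);
    return r === null ? null : next(r.value)(r.rest);
  };

// Transform the value a parser produces without consuming anything extra.
const map = <A, B>(p: Parser<A>, f: (a: A) => B): Parser<B> => (input) => {
  const r = p(input);
  return r === null ? null : { value: f(r.value), rest: r.rest };
};

// Run `skip`, throw its value away, then run `keep`.
const skipThen = <A, B>(skip: Parser<A>, keep: Parser<B>): Parser<B> =>
  andThen(skip, () => keep);

const dotOctet = skipThen(literal('.'), octet);

// Larger parser composed from the named smaller ones.
const ipAddress: Parser<number[]> =
  andThen(octet, (a) =>
  andThen(dotOctet, (b) =>
  andThen(dotOctet, (c) =>
  map(dotOctet, (d) => [a, b, c, d]))));

const parsedAddress = ipAddress('1.2.3.4');
```

The composition idea is the same as with generators, but the sequencing shows up as nested function calls instead of a series of yield statements.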