Current Project

My current free-time project is implementing a SAX-like XML parser (much like the one offered by Apache), written in and for Mythryl, a functional language heavily derived from SML/NJ.  The goal of the folks working on Mythryl is to transform SML/NJ from an academic language used mainly for PhD work into a production-level language usable by commercial entities.  I’m doing this for these reasons:

  1. Mythryl doesn’t have such an XML parser;
  2. I’m doing it using a recursive descent parser engine;
  3. And I’m doing it using the EBNF definition of XML.

The latter two reasons are best illustrated with an example.  The XML specification includes this production:

prolog ::= XMLDecl? Misc* (doctypedecl Misc*)?

The ‘?’ means optional, and the ‘*’ means the element may appear 0 or more times.  This is an ordered sequence, as in “first this, then this, then this ….”  Other productions can also include “or” alternatives.  The words themselves refer to other productions; eventually, those productions bottom out in actual characters in specific formats.
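To make the reading of that production concrete, here is a minimal hand-written sketch in Python (not Mythryl) in the recursive descent style, with one function per production.  It rests on a loud assumption: the input has already been reduced to a list of token names, so "XMLDecl", "Misc" and "doctypedecl" are matched as whole tokens rather than parsed from characters.  The names parse_prolog and skip_misc are hypothetical and exist only for this illustration.

```python
# Hypothetical simplification: tokens is a list of production names,
# standing in for text that a real parser would read character by character.

def skip_misc(tokens, pos):
    """Misc* : consume zero or more Misc tokens, return the new position."""
    while pos < len(tokens) and tokens[pos] == "Misc":
        pos += 1
    return pos


def parse_prolog(tokens, pos=0):
    """prolog ::= XMLDecl? Misc* (doctypedecl Misc*)?

    Returns the position just past the prolog.  Every part of the
    production is optional, so this function never fails.
    """
    # XMLDecl? : consume at most one XMLDecl
    if pos < len(tokens) and tokens[pos] == "XMLDecl":
        pos += 1
    # Misc* : zero or more
    pos = skip_misc(tokens, pos)
    # (doctypedecl Misc*)? : the whole group is optional
    if pos < len(tokens) and tokens[pos] == "doctypedecl":
        pos += 1
        pos = skip_misc(tokens, pos)
    return pos
```

Each piece of the EBNF maps onto one statement: ‘?’ becomes an if, ‘*’ becomes a while loop, and the left-to-right ordering becomes the order of the statements.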

In Mythryl I can write, given sufficient preparation,

prolog = |xml_decl| & <misc> & |(doctypedecl & <misc> )| ;

Here, |xyz| means xyz is optional, <xyz> means xyz may appear 0 or more times, and ‘a & b’ means ‘a and then b’, the ordering mechanism discussed above.  Given a little leeway for capitalization conventions and symbol changes, the Mythryl code bears an amazing resemblance to the EBNF, which means I can copy the EBNF from the document, make a couple of changes, and all of a sudden I have a parser to handle the syntax; a little more meddling and I have semantic support.  And the semantics can be added as I have time; the parsing works on any valid XML doc, and I can slowly add in the other details, such as detecting problems with well-formedness.
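For readers without Mythryl at hand, the same idea can be sketched with operator overloading in Python.  This is an analogue under stated assumptions, not the library described in this post: the Parser class and the helpers tok, opt and many are hypothetical names, opt(p) stands in for |p|, many(p) stands in for <p>, and the input is assumed to be a list of token names rather than raw characters.

```python
class Parser:
    """Wraps a function (tokens, pos) -> new position, or None on failure."""

    def __init__(self, run):
        self.run = run

    def __and__(self, other):
        # a & b : run a, then run b from wherever a stopped
        def run(tokens, pos):
            mid = self.run(tokens, pos)
            return None if mid is None else other.run(tokens, mid)
        return Parser(run)


def tok(name):
    """Match exactly one token with the given name."""
    def run(tokens, pos):
        if pos < len(tokens) and tokens[pos] == name:
            return pos + 1
        return None
    return Parser(run)


def opt(p):
    """Optional: if p fails, succeed without consuming anything."""
    def run(tokens, pos):
        end = p.run(tokens, pos)
        return pos if end is None else end
    return Parser(run)


def many(p):
    """Zero or more: apply p until it fails or stops making progress."""
    def run(tokens, pos):
        while True:
            end = p.run(tokens, pos)
            if end is None or end == pos:
                return pos
            pos = end
    return Parser(run)


xml_decl = tok("XMLDecl")
misc = tok("Misc")
doctypedecl = tok("doctypedecl")

# prolog ::= XMLDecl? Misc* (doctypedecl Misc*)?
prolog = opt(xml_decl) & many(misc) & opt(doctypedecl & many(misc))
```

Calling prolog.run(tokens, 0) returns the position just past the prolog.  The last line is the point of the exercise: once the combinators exist, the grammar rule is transcribed almost symbol for symbol.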

I’m interested in just how close I can come to using the EBNF, how quickly I can go from copying to full-blown functionality, and the post-development maintenance aspects, if any.  One of the interesting facets of using Mythryl is that about half the time, once you get something to compile (and that can be a challenge for someone new to Mythryl, even an experienced programmer), it Just Works, a phrase that the developer of Mythryl has been using.  No further debug …

That hasn’t applied to the parsing engine, as shoving large amounts of data at it had a performance impact, though I may have gotten around that with an optimization.  Generally, though, functional programmers are encouraged to design a good solution without worrying about performance and let the compiler handle it.  One estimate of Mythryl’s garbage collection is that it’s 10 times faster than Java’s.

As a quick PS, the recursive descent parser isn’t built into Mythryl; it’s part of a small library I developed using a Mythryl tutorial as a starting point.  Operator overloads and currying are very interesting after decades of programming in C and some OO languages.

 


About Hue White

Former BBS operator; software engineer; cat lackey.
