An update on this project and the informal methodology I’m adopting. The BNF approach has, as previously noted, simplified the work necessary for syntax – the work has been mostly done for me.
The semantics also appear to be greatly simplified, at least when it comes to recording them. I’ve been very slowly working on the DTD section of XML, which requires various options for recording allowable data formats. I decided to take the naive approach of simply recording the metadata in question as implied by the BNF, and I’m now predicting this will be very useful when the time comes to actually implement the strictures implied by the DTD. So far, it appears that it just works.
Typically, I’d try to be clever and probably wrap myself around an axle or two; here I’m letting the BNF tell me what metadata is important and, given the spec’s Validity Constraint notes, how to save it and apply it. No real cleverness, just paying attention to the spec. I really can’t emphasize this enough – my usual tendency to be clever (or cleverly obvious) is being suppressed here, and that’s unusual. Except you could say that I’m being cleverly obvious by taking this approach.
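To make the "record what the BNF implies" idea concrete, here is a minimal sketch in Python (not Mythryl, and not the project's actual code). The record shapes follow the XML 1.0 productions for attribute-list declarations (AttlistDecl, AttDef, DefaultDecl), and the error labels are the spec's Validity Constraint names. The class names, field names, the "ENUM" type tag, and check_attributes are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class DefaultKind(Enum):
    # One member per alternative of the DefaultDecl production.
    REQUIRED = "#REQUIRED"
    IMPLIED = "#IMPLIED"
    FIXED = "#FIXED"
    DEFAULT = "default"  # a plain AttValue with no keyword in front


@dataclass
class AttDef:
    # AttDef ::= S Name S AttType S DefaultDecl
    name: str
    att_type: str                    # e.g. "CDATA", "ID", or the hypothetical tag "ENUM"
    enum_values: tuple = ()          # only used when att_type == "ENUM"
    default_kind: DefaultKind = DefaultKind.IMPLIED
    default_value: Optional[str] = None


@dataclass
class AttlistDecl:
    # AttlistDecl ::= '<!ATTLIST' S Name AttDef* S? '>'
    element_name: str
    att_defs: list = field(default_factory=list)


def check_attributes(decl, attrs):
    """Return a list of validity-constraint violations for one element's attributes."""
    errors = []
    for d in decl.att_defs:
        value = attrs.get(d.name)
        if value is None:
            if d.default_kind is DefaultKind.REQUIRED:
                errors.append(f"Required Attribute: {d.name} is missing")
            continue
        if d.default_kind is DefaultKind.FIXED and value != d.default_value:
            errors.append(f"Fixed Attribute Default: {d.name} must be {d.default_value!r}")
        if d.att_type == "ENUM" and value not in d.enum_values:
            errors.append(f"Enumeration: {d.name}={value!r} is not an allowed value")
    return errors
```

The point of the sketch is that nothing in it is clever: each record is a transcription of a production, and each check is a transcription of a Validity Constraint note, so the metadata is already in the shape the later validation step needs.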
Interestingly, I’ve also observed something similar about Mythryl, the implementation language. In general, once you can get something to compile, it does what you want it to do, at least at the top levels – No Debug Necessary. In C and MODeL (a C-based OO language – company proprietary), my primary development languages, once you start writing anything complex, compilation is only the first step. Even once you have it compiling, the loosey-goosey type systems mean you may still have written something that is illegal in the last analysis, just not in the first analysis, i.e., the compilation.
The extremely tight type system of Mythryl quite often means that compilation can be quite laborious, but once you’ve satisfied the type checker, your code is now so good that you don’t need to worry about it.
This is not 100% certain. I’ve found when working with parsers, I can often get the logic wrong and the code still compiles. I suspect a language expert could tell me exactly why. (It does make intuitive sense – and has something to do with parsers working with languages.) But compiled code doing the right thing happens often enough that the lead has coined the phrase It Just Works. (To help realize that dream, he’s written an informal tutorial on how to write better code in Mythryl.)
Speaking of parsers and the XML parsing project, I also plan to use a different parser to validate the actual data in the XML document against the DTD. The parsers will be built dynamically using the DTD, and then run against the applicable data elements as they are encountered.
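As a rough illustration of building a checker dynamically from a DTD, the Python sketch below turns one element-content model into a regular expression over child element names and then runs it against an element's children. This is a stand-in technique, not the Mythryl parsers planned for the project. It assumes ASCII-only names and handles only element content (no #PCDATA, EMPTY, or ANY); compile_content_model and children_match are hypothetical names.

```python
import re

# ASCII-only simplification of the spec's Name production (an assumption of this sketch).
NAME = re.compile(r"[A-Za-z_:][A-Za-z0-9_.:\-]*")


def compile_content_model(model):
    """Translate a content model such as '(title, para+)' into a compiled regex.

    Each element name becomes a group that matches the name followed by a space,
    ',' (sequence) becomes plain concatenation, and '|', '?', '*', '+' and
    parentheses keep the meaning they already share with regex syntax.
    """
    parts = []
    pos = 0
    while pos < len(model):
        ch = model[pos]
        m = NAME.match(model, pos)
        if m:
            parts.append("(?:" + re.escape(m.group()) + " )")
            pos = m.end()
        elif ch in "()|?*+":
            parts.append(ch)
            pos += 1
        elif ch == "," or ch.isspace():
            pos += 1
        else:
            raise ValueError(f"unexpected character {ch!r} in content model")
    return re.compile("".join(parts))


def children_match(regex, child_names):
    """Check a sequence of child element names against a compiled content model."""
    encoded = "".join(name + " " for name in child_names)
    return regex.fullmatch(encoded) is not None
```

In use, a model would be compiled once when its element declaration is read, for example `compile_content_model("(title, (para | list)+, footer?)")`, and the resulting object reused with `children_match` each time a matching element is encountered in the document.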
Bonus: there should be no memory leaks. In C and many other languages, if you have garbage collection at all, it’s slow. In Mythryl, according to the lead developer, it’s devastatingly fast.
I’m also guessing that in production systems handling enterprise levels of data we may run into problems. But evaluating that capability is a future post and probably dependent on the cleverness of the programmers – not to mention contributions of other programmers in the areas of DB access and so on.