A few days ago I described my current project, and this morning I had one of those early morning thoughts that rings clear as a bell, probably because my critical faculties aren’t online yet. I’ll bring it up anyways; I think it not only holds true for me, but for a lot of software engineers.
I’ll start with an analogy: physicists, especially theoretical physicists, aspire to mathematical descriptions of phenomena which are not only accurate and tractable, but also elegant. I am not a mathematician, so I’m not quite sure what makes for elegant mathematics, but I can say that, within programming, there are elegant programs and there are hacks. I don’t know that anyone’s ever defined an elegant program, but I’m sure a number of these qualities might be ascribed to it:
- Immediately comprehensible;
- Documented (somewhat redundant with the first point);
- Abstracted such that modification of one part of the program only positively affects all other parts;
- Extensible, which is to say adding new capabilities is easy and does not require working on parts not directly applicable to the problem to be solved;
- Others that don’t come immediately to mind
Extra points if it’s
- Scalable; which is to say, if you pump more data through the program, or sometimes even change the data’s topology, the program’s consumption of resources (memory, CPU cycles) does not grow even more vigorously; we often talk about the complexity of a program, measured by “order of”, symbolized O(). So if your amount of data is symbolized with the number n, then we hope to do no worse than O(n). Order of, say, O(n^2), in which doubling the amount of data would result in the amount of time and/or memory used is SQUARED rather than doubled, is definitely a bad program, resulting in a program that runs in an unacceptable amount of time, or actually crashes due to consumption of all available memory; O(log n), on the other hand, is doing quite well; nirvana is O(1), i.e., your algorithm is insensitive to the amount (or topology) of the data.
- Performs well; this is allied, yet sometimes opposed to, the Scalability point. For example, sometimes you can really make a program fly by profligate use of memory, which then makes you vulnerable to that user who applies a great deal more data to your program than you ever expected.
So I bring this all up to better explain the project I mentioned earlier. I am well aware of yacc and lex, the standard parsing tools. I view them as inelegant to the point where I don’t use them, because they are not part and parcel of any other language than themselves; instead, they are used to generate C source code, which then must be sometimes massaged into usefulness, and even if not, there are other problems, such as the use of global variables, a key source of inelegance; and I’ve come to regard the mixture of languages in a single program to be a hack in itself. I am well aware of “the right tool for the right problem” philosophy, but I take the position that we all have only so many neurons available, and using them on multiple languages is a waste and a kludge when it should be eminently possible to create a language that can resolve all of these problems.
So this is an evaluation of this intuition, you might say. Does a language from the functional paradigm (Mythryl) have the capability to take code that looks like a BNF and result in a parser that is easy to build, expand, and fix, while running well in terms of performance and scalability? I have some concerns on the last point, but they’re not worth going into at the moment; better to wait until I’m at least testing partial functionality.
And for those wondering, I have nearly all of the BNF for the XML specification in place. I have discovered, unsurprisingly, that recursive and mutually recursive BNFs cannot be handled with BNF-like code, so there’s a minus. OTOH, they can be handled and with only some small disturbance to the general feel of BNFs.
I have, BTW, ignored the question of whether or not BNFs are really readable. They are, to my knowledge, the most common method for specifying something requiring parsing. Reading one easily takes some practice; but then, nothing easy is worth doing, said JFK.
Or is it wrong to mix politics with programming?
Given the importance of algorithms in today’s society, this is not exactly a jest.