Extensible Programming

A programming language is a language for describing programs: virtual machines with memory that accept inputs and produce outputs. The constructs of the language describe the relationship between inputs, the contents of the computer's memory, and outputs.

These are some ideas inspired by an article by Greg Wilson on extensible programming. The idea is to create a computer language whose source code is stored as XML. Although programmers could edit the XML directly, they would usually choose to use an editor that would transparently convert between the XML and a more easily understood format.

Viewing Modes

An XML-based language would allow an improvement on documentation generation systems like JavaDoc or Doxygen: one could choose to view a source file either as an implementer or as a user. A user would see only the public interfaces and external documentation. An implementer would also see private methods, internal documentation and code.
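To make the two views concrete, here is a minimal sketch in Python. The XML vocabulary (class, method, doc, code, and the visibility and audience attributes) is invented for illustration; it is not taken from Wilson's article or from any real language.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML source format; every element and attribute name is an assumption.
SOURCE = """
<class name="Stack">
  <doc audience="user">A last-in, first-out container.</doc>
  <doc audience="implementer">Backed by a Python list.</doc>
  <method name="push" visibility="public">
    <doc audience="user">Add an item to the top.</doc>
    <code>self.items.append(item)</code>
  </method>
  <method name="_grow" visibility="private">
    <doc audience="implementer">Reserve extra capacity.</doc>
    <code>self.items.extend([None] * 8)</code>
  </method>
</class>
"""

def render(source, mode):
    """Return the lines a reader would see; mode is 'user' or 'implementer'."""
    root = ET.fromstring(source)
    implementer = mode == "implementer"
    lines = ["class " + root.get("name")]
    for doc in root.findall("doc"):
        if implementer or doc.get("audience") == "user":
            lines.append("  # " + doc.text)
    for method in root.findall("method"):
        # Users never see private methods.
        if not implementer and method.get("visibility") != "public":
            continue
        lines.append("  method " + method.get("name"))
        for doc in method.findall("doc"):
            if implementer or doc.get("audience") == "user":
                lines.append("    # " + doc.text)
        # Only implementers see the code itself.
        if implementer:
            for code in method.findall("code"):
                lines.append("    " + code.text)
    return lines

if __name__ == "__main__":
    print("\n".join(render(SOURCE, "user")))
    print("\n".join(render(SOURCE, "implementer")))
```

The same stored file yields both views; nothing is generated into a separate documentation tree.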

A separate viewing mode for debugging would be a plus, with each subcommand and side effect on its own line. Note: this might be just as difficult with XML source as it is with plain text, since many optimizers perform an assembly-to-assembly translation that obscures the correspondence between source and object code.

For platform-dependent languages like C++, one could insert pragmas into the source that would only show up in certain environments. Yes yes, pragmas are evil, I know, yes...

Editor as an Outliner

Once one accepts the idea that an editor is mediating one's interaction with the source, a variety of fun things become easy. For instance, it becomes trivial to expand and compress control structures as one does in environments like Eclipse. One might choose to hide the actual code, showing only the comments and control structures as a skeleton. This seems promising, but in use programmers might find themselves expanding and contracting control structures all the time.
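The skeleton view might look something like the following Python sketch, which walks a method stored as XML and keeps only comments and control structures. The element names (method, comment, while, if, statement) and the test attribute are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML for one method; the vocabulary is invented for illustration.
METHOD = """
<method name="drain">
  <comment>Empty the queue, logging failures.</comment>
  <while test="queue">
    <comment>Handle one job at a time.</comment>
    <statement>job = queue.pop()</statement>
    <if test="not job.run()">
      <statement>log(job)</statement>
    </if>
  </while>
</method>
"""

CONTROL = {"while", "if"}

def skeleton(node, depth=0):
    """Yield outline lines: comments and control structures only."""
    pad = "  " * depth
    if node.tag == "comment":
        yield pad + "# " + node.text
    elif node.tag in CONTROL:
        yield pad + node.tag + " " + node.get("test") + ":"
    elif node.tag == "method":
        yield pad + "method " + node.get("name") + ":"
    else:
        return  # plain statements are hidden in this view
    for child in node:
        yield from skeleton(child, depth + 1)

if __name__ == "__main__":
    print("\n".join(skeleton(ET.fromstring(METHOD))))
```

An editor would presumably make the hidden statements expandable rather than dropping them; this sketch only shows the filtering step.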

Magnification

It might be better to have the middle band of each source window fully expanded, with source above and below increasingly compressed, outlined and greeked. Scrolling through a file then becomes like moving a magnifying glass over it. Alternatively, one could always see the entire file and move the magnifier up and down the window.

To run with this idea: a programmer often needs to compare two or more sections of a source file. This argues that the editing system should allow the programmer to move magnifying glasses over multiple sections of the source, leaving the rest of the file visible in compressed form.

This brings the sections to be compared closer together on the screen while illustrating the relationship of the magnified sections to the rest of the source.
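One way to think about the magnifier is as a function from line number to level of detail, driven by the distance to the nearest magnifying glass. The Python sketch below is only an illustration under assumptions of my own: three discrete levels, a band parameter for the half-width of the fully expanded region, and an outlined zone three times as wide.

```python
def detail_levels(line_count, foci, band=3):
    """Assign each line a detail level: 0 = full text, 1 = outlined, 2 = greeked.

    foci is a non-empty list of line numbers, one per magnifying glass.
    The thresholds (band and 3 * band) are arbitrary choices for this sketch.
    """
    levels = []
    for line in range(line_count):
        # Distance to the nearest magnifier decides how much to compress.
        distance = min(abs(line - focus) for focus in foci)
        if distance <= band:
            levels.append(0)
        elif distance <= 3 * band:
            levels.append(1)
        else:
            levels.append(2)
    return levels

if __name__ == "__main__":
    # Two magnifiers, at lines 5 and 25 of a 30-line file.
    print(detail_levels(30, [5, 25], band=2))
```

Scrolling then amounts to changing one entry in foci and recomputing the levels.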

Smart Search

Once we break from the one-to-one mapping between storage format and what we see on the screen, we can make every view of the source the result of a smart search, updated dynamically as the programmer and coworkers work on different parts of the project. Search criteria might collect all methods belonging to a certain class, or to all subclasses of a certain class (with all overrides of a method grouped together and sorted superclass-to-subclass), or all methods that call a certain method.
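Two of these searches can be sketched in a few lines of Python over an XML project. As before, the schema (project, class, method, call, and the extends and target attributes) is hypothetical, and single inheritance is assumed to keep the ordering simple.

```python
import xml.etree.ElementTree as ET

# Hypothetical project file; element and attribute names are assumptions.
PROJECT = """
<project>
  <class name="Shape">
    <method name="draw"><call target="moveTo"/></method>
    <method name="area"/>
  </class>
  <class name="Circle" extends="Shape">
    <method name="draw"><call target="arc"/></method>
  </class>
  <class name="Disc" extends="Circle">
    <method name="draw"><call target="arc"/><call target="fill"/></method>
  </class>
</project>
"""

def callers_of(root, target):
    """Return 'Class.method' for every method containing a call to target."""
    hits = []
    for cls in root.findall("class"):
        for method in cls.findall("method"):
            if any(c.get("target") == target for c in method.iter("call")):
                hits.append(cls.get("name") + "." + method.get("name"))
    return hits

def depth(root, cls_name):
    """Count the ancestors of a class by following 'extends' links."""
    parents = {c.get("name"): c.get("extends") for c in root.findall("class")}
    n = 0
    while parents.get(cls_name):
        cls_name = parents[cls_name]
        n += 1
    return n

def overrides_of(root, method_name):
    """Return classes defining method_name, ordered superclass-to-subclass."""
    owners = [c.get("name") for c in root.findall("class")
              if c.find("method[@name='%s']" % method_name) is not None]
    return sorted(owners, key=lambda name: depth(root, name))

if __name__ == "__main__":
    root = ET.fromstring(PROJECT)
    print(callers_of(root, "arc"))
    print(overrides_of(root, "draw"))
```

A real editor would rerun such queries as the underlying XML changes; this sketch evaluates them once against a static string.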

Canonical View

Although smart search views are seductive, programmers also need the ability to apply their own order and formatting to the code. The order in which methods are defined can express subtle relationships. For instance, in a C++ class I often group the methods that access and use a certain set of members. The order of the methods tells a story, and the order of the groups of methods is also significant.

Whitespace variations can also be expressive. E.g. I use extra blank lines to separate semantic paragraphs within a method. Sometimes I compress each case in a switch statement onto a single line to facilitate comparisons between similar lines. At other times each case spreads over several lines with an extra blank line to set it off from the case below.

This is an argument for a canonical view of the source in which the user can play with order and formatting. Playing with the source, trying out one order or grouping and then another, is integral to the act of finding a good object model. Having a persistent ordering would also give the programmer landmarks by which to navigate.

It might be possible to have the best of both worlds, i.e. user-defined ordering and every view as the result of a search, by making formatting and ordering hints first-class members of the language. That is, one has to be able to include 'is-semantically-related-to' as a search term, and the language has to allow user-defined method ordering.
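As a rough illustration of hints living in the source, suppose each method carried a group attribute (standing in for 'is-semantically-related-to') and an order attribute recording the programmer's chosen sequence. Both attributes are inventions for this Python sketch, not part of any existing format.

```python
import xml.etree.ElementTree as ET

# Hypothetical hints: 'group' marks semantic relatedness, 'order' the author's sequence.
CLASS = """
<class name="Account">
  <method name="withdraw" group="balance" order="2"/>
  <method name="owner" group="identity" order="1"/>
  <method name="deposit" group="balance" order="1"/>
  <method name="rename" group="identity" order="2"/>
</class>
"""

def related_to(root, method_name):
    """Search on the 'group' hint and return names in the author's order."""
    anchor = root.find("method[@name='%s']" % method_name)
    group = anchor.get("group")
    members = [m for m in root.findall("method") if m.get("group") == group]
    members.sort(key=lambda m: int(m.get("order")))
    return [m.get("name") for m in members]

if __name__ == "__main__":
    print(related_to(ET.fromstring(CLASS), "withdraw"))
```

The search result respects the stored ordering even though the methods appear in a different sequence in the file, which is the point of keeping the hint in the language rather than in the layout.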

Graphical Programming

Over the years I've chatted about graphical programming languages with Jason Bellenger (who works at Alias), who dreams of using graphical metaphors to capture the multidimensional nature of source code. An underlying XML representation might allow an editing system to assign graphical metaphors to different code dimensions as desired.

Traditional programming languages use slightly more than two dimensions (vertical and horizontal within a text file, multiple files) to represent the interwoven tree structures that make up a modern computer program. A GUI program will have a class hierarchy, "uses" graph, hierarchies of visual objects on screen, a command-handling hierarchy, control structures within methods, etc.

In his "No Silver Bullet" chapter, Frederick Brooks argues that this inherent complexity makes it difficult to depict a program graphically. As his argument conflates "flowchart" with "graphical representation", I think there's hope. The key is to view only one or two networks at a time, and to allow the user to "pivot" on an object of interest, keeping that object visible and choosing which relationships to display.
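The pivot can be sketched as a lookup over a graph whose edges are labelled with the kind of relationship. The ProgramGraph class and the relation names below are hypothetical, chosen only to show the shape of the idea in Python.

```python
from collections import defaultdict

class ProgramGraph:
    """Several networks over the same objects, keyed by relationship name."""

    def __init__(self):
        # (relation, source) -> set of targets
        self.edges = defaultdict(set)

    def add(self, relation, source, target):
        self.edges[(relation, source)].add(target)

    def pivot(self, focus, relations):
        """Keep the focus object fixed and return only the chosen relationships."""
        return {rel: sorted(self.edges.get((rel, focus), ()))
                for rel in relations}

if __name__ == "__main__":
    g = ProgramGraph()
    g.add("inherits", "Button", "Widget")
    g.add("uses", "Button", "Font")
    g.add("uses", "Button", "Color")
    g.add("handles", "Button", "click")
    # Show only the "uses" network around Button, then switch networks.
    print(g.pivot("Button", ["uses"]))
    print(g.pivot("Button", ["inherits", "handles"]))
```

The display layer would draw whatever pivot returns, so the user never sees more than the one or two networks asked for.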

Graphical Metaphors

In traditional programming languages such as Java, the way source is laid out on screen makes multiple uses of each dimension. Depending on context, horizontal indentation can mean control structure nesting or containment within an enclosing class. The vertical dimension indicates order of operations within methods and is often used to group related methods within a class.

This contextual overloading makes for compact presentation and comparison of ideas. Programmers are unlikely to move to a new system unless it offers similar information density. So far, the most successful environment I've used has been CodeWarrior's C++ IDE, with syntax colouring, class hierarchy views, GUI painter, and popups for convenient navigation to related classes and methods. The important thing is the way they've limited the problem, using the directed-graph view only to represent the class hierarchy and not all the other potential networks.

Insofar as an XML representation of the underlying code would make this sort of thing easier to implement and extend, I'm all for it.
