My thoughts on Software Engineering
I spent the last year thinking about and prototyping a program for manipulating HTML documents visually within a web browser. This project served both as my inspiration to get involved in the software industry and as a muse guiding me to experiment with various approaches. My experiences have led me to liken software engineering to many complicated things, though I have always felt a voice in the back of my head telling me this is wrong. Most of the machines we build inside computers are simple at heart, though their simplicity is lost in translation. Many solutions are in fact designed to defensively handle leaky hardware abstractions rather than the actual problem. Being forced to design a system around the computer's implementation is a sure way to take the fun out of programming. The problems I most often find myself having to work around include:
- Equality
Most languages base their equality tests on memory addresses: if two things are stored in the same location, then they are the same. That is not equality at all; it is identity. Using '==' to check identity is misleading syntax. Matters get even more complicated when a language stores all numbers and strings in global sets so that '==' effectively checks them by value, since each string or number can have only one identity in such a system. Now the language is using '==' correctly and incorrectly at the same time.
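To make the distinction concrete, here is a minimal TypeScript sketch. The variable names and the `shallowEqual` helper are my own illustration, not part of any library, and the helper only compares one level deep.

```typescript
// Two separately built objects with identical contents, plus an alias of the first.
const a = { x: 1, y: 2 };
const b = { x: 1, y: 2 };
const c = a;

// On objects, `===` asks "is this the same object?", which is identity.
const aliasIsSame = a === c;    // true: both names refer to one object
const twinIsSame = a === b;     // false: same contents, different objects

// On strings, the very same operator compares by value.
const s1 = "ab" + "c";
const s2 = "abc";
const stringsMatch = s1 === s2; // true

// Hypothetical helper: equality by value, one level deep only.
function shallowEqual(
  p: Record<string, unknown>,
  q: Record<string, unknown>
): boolean {
  const keys = Object.keys(p);
  return (
    keys.length === Object.keys(q).length &&
    keys.every(
      (k) => Object.prototype.hasOwnProperty.call(q, k) && p[k] === q[k]
    )
  );
}

const twinIsEqual = shallowEqual(a, b); // true: compared by contents
```

The same operator answers two different questions depending on what it is handed, which is the inconsistency described above.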
- Identity
Because most languages are built around storing data at memory addresses, changing a value once it is set is awkward: when I want to update the value of an entity, I must update it in place. This makes working with data difficult, because what I am looking at isn't really a value; it's a location referring to a value. This is fine as long as the value can't change between the time I collect it and the time I actually do some work with it. However, in all modern systems it can. On multi-threaded machines it is obvious how this could happen; it is more subtle in environments with a single thread of control, such as a web browser. In such an environment you can only be sure a value hasn't changed under your nose if you know which processes have run between the time you collected the value and now. This is not always practical: most developers like to decouple the functionality of their app through message passing systems, thereby electing not to keep track of when processes are run. As a countermeasure, we must ensure every identity is private to the unit of functionality in which it will eventually be used.
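The single-threaded case can be sketched in a few lines of TypeScript. Everything here is hypothetical: the tiny `subscribe`/`publish` bus stands in for whatever message passing system an app uses, and `cursor` stands in for any shared entity.

```typescript
type Handler = () => void;

// Hypothetical minimal message bus standing in for a real one.
const handlers: Handler[] = [];
function subscribe(h: Handler): void {
  handlers.push(h);
}
function publish(): void {
  for (const h of handlers) h();
}

// A shared entity that lives at one location.
const cursor = { line: 1 };

// Some decoupled part of the app updates it in place when a message arrives.
subscribe(() => {
  cursor.line += 10;
});

// Holds the location: the "value" moves while we are holding it.
function readTwiceShared(): [number, number] {
  const ref = cursor;
  const before = ref.line;
  publish(); // unrelated handlers run here, on the same thread
  return [before, ref.line];
}

// Holds a private copy: nothing else can reach it.
function readTwiceSnapshot(): [number, number] {
  const snapshot = { ...cursor };
  const before = snapshot.line;
  publish();
  return [before, snapshot.line];
}

const shared = readTwiceShared();     // [1, 11]
const isolated = readTwiceSnapshot(); // [11, 11]
```

No second thread is involved; the change happens simply because other code ran between the two reads. Copying the data on collection is one way of keeping an identity private to the unit that uses it.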
- Composition
Functions are designed to be black boxes. You give them data and they give you a response. They are a contract to which agreement is implied through use. They are non-negotiable: simple take-it-or-leave-it offers. Their authors may make provisions for some options, though, if they are forward-thinking enough. Functions do exactly what they say they will do, no more, no less. This might sound like a good thing, but negotiating a contract before signing it does not make it any less of a contract. Allowing a function to be extended or shortened at run time dramatically increases the odds of it being reusable. Just like with contracts, a little bit of cooperation goes a long way.
One way I increase the flexibility of a function is by making it emit an event on completion rather than return a value. This event can then be subscribed to by an arbitrary number of functions at run time, effectively extending the functionality of the function. Monkey patching third-party functions to use this pattern is not a great reuse story, though. It seems fundamentally limiting to work in a language whose only built-in data flow model is the weak one of dumping a result in place, if only because of the design-up-front mentality it encourages. One possible solution could be a function subtype with multiple output options, like the pin arrangements of electrical system components. I have experimented with this idea in the syntactically limited way I was able to.
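As a rough illustration of the idea, and not a record of my actual experiments, here is a TypeScript sketch of a function that pushes its result out through named outputs instead of returning it. The `Pin` class, the `ok` and `failed` outputs, and `parseNumber` are all names invented for this example.

```typescript
type Listener<T> = (value: T) => void;

// Hypothetical "pin": a named output that any number of listeners can attach to.
class Pin<T> {
  private listeners: Listener<T>[] = [];

  connect(listener: Listener<T>): void {
    this.listeners.push(listener);
  }

  emit(value: T): void {
    for (const listener of this.listeners) listener(value);
  }
}

// Two output pins take the place of a single return value.
const ok = new Pin<number>();
const failed = new Pin<string>();

// The function returns nothing; it signals its outcome on one of the pins.
function parseNumber(input: string): void {
  const n = Number(input);
  if (input.trim() === "" || Number.isNaN(n)) {
    failed.emit(input);
  } else {
    ok.emit(n);
  }
}

// Consumers are wired up at run time, and more than one can share a pin.
const results: number[] = [];
const doubled: number[] = [];
const rejects: string[] = [];

ok.connect((n) => {
  results.push(n);
});
ok.connect((n) => {
  doubled.push(n * 2);
});
failed.connect((raw) => {
  rejects.push(raw);
});

parseNumber("42");
parseNumber("abc");
// results is [42], doubled is [84], rejects is ["abc"]
```

The second listener on `ok` extends what `parseNumber` does without touching its body. The cost is that the wiring lives outside the function, which is exactly the kind of thing a language-level construct could make explicit.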
The quality of a software system should not be judged solely on the correctness of its operations. Correctness is of course important, but if the code only achieves it by working around every leak in the hardware abstraction, then I would not consider it a good solution. A good software solution should teach its readers about the problem domain. It should be concise not through clever syntax but because every part directly addresses an aspect of the problem. Software is a model of a machine. If the language the software is written in correctly models the machine's natural environment, then we are free to express the machine as clearly as we understand it.