Information and Software Development

by Alexandru Bolboacă and Maria Diaconu

Software development is a young domain and is still finding its way. The truth is that we only partially understand why existing practices work, or fail to work, as they do.

We believe that one of the core questions of software development has not yet been answered. The question is:

What are we actually doing when developing software?

In the following, we will explore a possible answer to this question.

Refining Information

Software projects are essentially steps in refining information until it’s precise enough to be executed on a computer.

We start with an idea and very little information about what we’re trying to accomplish. We learn about the market and the users, then about the technical aspects. We then detail the features, develop them, and learn from our users. Throughout these steps, we gather and process information until it becomes executable.

In order to make information executable, two things are needed:

  • A system that can execute the information
  • Information that follows the very strict rules dictated by the system that executes it

Software programs are computer-executable information, and the computer contains the system that executes that information. The rules for executable information (software programs) are given by the machine language. Generally, machine code is produced by compiling a program written in a programming language; alternatively, an interpreter that is itself machine code executes the program directly.

There is at least one other type of executable information that we know of: genes. The system that runs the genes is biochemical, and it grows and maintains an organism based on the information encoded in them.

A Model for Information

According to Shannon’s information theory, information is measured in bits. A bit is the amount of information carried by the answer to a yes/no question whose two answers are equally likely.

We will start from the idea that any question can be reduced to a number of yes/no questions. For example, the question “where” is equivalent to asking, for each possible location X, “Is [the object] at X?”, which is a yes/no question. Similar reductions apply to “why”, “how”, “when” and “what”. Therefore, any question can be expressed as a number of bits. (We do not know whether this number is always finite, but intuitively the questions can be asked in other ways that make it finite for common cases.)

Thus any piece of information can be viewed as a string of questions and answers.
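
As a small illustration of the “where” example, the sketch below locates an object among eight locations using only yes/no questions. It is a minimal sketch under our own assumptions: the room names, the `locate_by_halving` function and the strategy of asking about half of the remaining locations at a time are illustrative and not part of the model itself. Asking “Is it at X?” for each location in turn also works, but halving needs only about log2(N) questions for N locations, which is one way of asking that keeps the number of bits small.

```python
import math


def locate_by_halving(locations, is_in):
    """Find the target by asking 'is it in this half?' until one location is left.

    `is_in` is a hypothetical oracle that answers each yes/no question.
    Returns the location found and the list of answers (one bit each).
    """
    candidates = list(locations)
    answers = []
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        answer = is_in(half)          # one yes/no question, one bit of information
        answers.append(answer)
        candidates = half if answer else candidates[len(half):]
    return candidates[0], answers


# Illustrative data: eight possible locations and one hidden object.
rooms = ["kitchen", "hall", "office", "garage", "attic", "cellar", "porch", "study"]
hidden = "attic"

found, answers = locate_by_halving(rooms, lambda group: hidden in group)
print(found, answers, len(answers))
```

In this run the answer to “where” is recorded as three yes/no answers, which matches log2(8) = 3 bits.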

Operations on Information

In software development projects, the team is responsible for storing the information and executing operations on it with the purpose of refining it until it becomes executable.

The types of operations that we identified are the following:

  • Store
  • Add
  • Discard
  • Copy
  • Collapse
  • Execute

Let’s look at each operation in detail.

Store Information

In software projects, there are multiple ways to store the information, classified based on the storage medium:

  • Human memory
  • Physical medium (e.g. paper, post-its)
  • Electronic medium (files on a file system)

and based on how structured it is:

  • Quasi-chaotic (the human memory looks this way from the outside, presumably because we need to learn more about it)
  • Unstructured text, on electronic or physical medium (words on paper)
  • Partially structured (wiki, semantic knowledge base, project wall), on electronic or physical medium
  • Very precise, in the form of executable information (programs)

The ideal situation is when all information is precise, thus executable. However, ideas start in human minds so we cannot escape the process of refining the information until we reach the point when it’s executable.

The way the human brain stores information is thus extremely important. A very simple description would be that the human brain stores information in a compressed, lossy format. The compression is based on pattern matching, and each time a piece of information is accessed, the compressed information and the pattern are used to reconstruct the initial information. The format is lossy because the initial information is not reconstructed exactly, only as closely as possible. In a way, each time we access our memory we get a slightly different version of the initially stored information.

We know at least one way to alleviate this problem. If a person uses the same information again and again, the connections in the brain are strengthened and the model becomes more accurate.

Add Information

Since we defined information as a string of questions and answers, adding information is simply asking a new question and getting a new answer.

Storing the new information in the human mind is more interesting. To do it, the existing model has to be decompressed, which yields a slightly different version of the initial information. The new piece of information is added on top of it and the result is compressed again into a new model. This increases the risk of errors, since the new model is also lossy.
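
The decompress, add and recompress cycle can be sketched with a toy analogy. This is only an analogy and makes no claim about how the brain works: we assume that “memory” is a list of integers, that compression means rounding each value to a multiple of a step size, and that the new model uses a coarser step. The function names and the numbers are hypothetical.

```python
def compress(values, step):
    """Lossy compression: keep only which multiple of `step` each value rounds to."""
    return [(v + step // 2) // step for v in values]


def recall(codes, step):
    """Reconstruct approximate values from the compressed codes."""
    return [c * step for c in codes]


def add_to_memory(codes, step, new_value, new_step):
    """Decompress the old model, add the new value, compress into a new model."""
    values = recall(codes, step)      # already an approximation of the original
    values.append(new_value)
    return compress(values, new_step)


original = [14, 27, 41]
codes = compress(original, step=10)
first_recall = recall(codes, step=10)

# Adding a fourth value forces a new (here: coarser) model.
new_codes = add_to_memory(codes, 10, new_value=58, new_step=25)
via_memory = recall(new_codes, step=25)

# For comparison: the same new model built from the true values.
direct = recall(compress(original + [58], 25), 25)

print(first_recall)
print(via_memory)
print(direct)
```

With these numbers the first value, 14, is recalled as 10, then as 0 after the update, whereas building the new model from the true values would have given 25. The second loss is applied on top of the first, which is the risk described above.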

Discard Information

Sometimes, a piece of information outlives its use and we discard it. In an electronic medium, we simply delete it. Inside the human brain, things are more complex, but in principle the connections between pieces of information weaken over time if not used. Therefore, the discarded information becomes unrelated to the model, even though team members may still remember it.

Copy Information

Copying information means passing all the bits from one storage to another.

With electronic media, this is very easy, very accurate and very low risk. Copying from one physical medium to another, or from a physical medium to an electronic one, is a well-studied problem (copiers and scanners help).

Copying information from a physical medium to human memory is tied to the way we build mental models when learning. We will not dwell on it, because it’s a long topic and because we assume that each member of a software development team knows how to do it.

In software projects we have two other interesting paths for copying information:

  • From person to person
  • From person to computer

Copy Information from Person to Person

Based on our model, the only sure way to copy information from person to person is to provide all the questions and answers that form the information. The problem is that, as we discussed, people don’t store information as questions and answers.

So is there a reliable way of copying information from person to person?

We don’t know of any. There are, however, tools for copying information from person to person with less friction:

  • Stories
  • Mindmaps
  • Structured conversations
  • Questions and answers

and probably others.

None of them works perfectly by itself, but the greater the variety of tools we use, the more accurate the copy.

Since we know that there will be copy errors, we also need mechanisms for identifying them and recovering from them. In general, such a mechanism is a feedback loop of some sort:

  • Asking questions to see if the other person understood
  • Letting the other person do something based on the information and giving feedback on the result (e.g. wireframes or prototypes)

and others.
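
The feedback loop can be sketched as follows. This is a minimal sketch under stated assumptions: information is a dictionary of questions and answers, `lossy_tell` is a hypothetical stand-in for an imperfect conversation that deterministically loses some facts on each telling, and the sample questions are invented for illustration.

```python
def lossy_tell(facts, attempt):
    """Hypothetical lossy channel: each telling loses some of the facts."""
    keys = sorted(facts)
    return {k: facts[k] for i, k in enumerate(keys) if (i + attempt) % 3 != 0}


def copy_with_feedback(source, max_rounds=10):
    """Repeat: check what the listener got wrong or missed, then tell only that again."""
    copy = {}
    rounds = 0
    while rounds < max_rounds:
        # Feedback step: ask every question and compare the answers.
        missing = {q: a for q, a in source.items() if copy.get(q) != a}
        if not missing:
            break
        copy.update(lossy_tell(missing, rounds))
        rounds += 1
    return copy, rounds


source = {
    "Who uses the report?": "accountants",
    "How often is it generated?": "monthly",
    "Which format is needed?": "PDF",
    "Must it work offline?": "no",
}

one_shot = lossy_tell(source, 0)              # a single telling, no feedback
copy, rounds = copy_with_feedback(source)     # telling plus feedback loop
print(len(one_shot), len(copy), rounds)
```

In this run a single telling transfers two of the four facts, while the loop reaches a complete copy after two rounds. The loop does not make the channel more reliable; it only detects the gaps and repeats what was lost.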

Copy Information from Person to Computer

Information from people’s minds can be copied to electronic formats. The difficulty of doing so grows with how precise the information needs to be. That is, it’s simpler to write documents than to write executable programs, because programs will not execute unless the information is sufficiently precise.
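
The gap between a document and a program can be seen with a small check. The sketch below uses Python’s built-in `compile` function to test whether a piece of text obeys the language’s rules; the helper name `follows_the_rules` and the two sample texts are our own illustration. Note that this only checks the form of the text, not whether the program does what was intended.

```python
def follows_the_rules(text):
    """Return True if `text` is syntactically valid Python, False otherwise."""
    try:
        compile(text, "<sketch>", "exec")
        return True
    except SyntaxError:
        return False


as_a_document = "add the tax to the price"    # clear enough for a person
as_a_program = "total = price + tax"          # precise enough for the system

print(follows_the_rules(as_a_document))
print(follows_the_rules(as_a_program))
```

The first text is perfectly good information for a human reader, but the system rejects it; only the second follows the strict rules that execution requires.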

Collapse

This is probably the most puzzling operation that can be applied to information. Under specific conditions, the same information can be expressed in fewer bits than before.

We intuitively know that this happens, but we don’t claim to understand it completely. A few ways to collapse information are:

  • Remove duplication – if a bit pattern repeats, it can be replaced with only one instance and references
  • Generalize – replace a series of bits with a pattern that can generate them
  • Deduce – replace two series of bits with their logical conclusion
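
Each of these three can be illustrated with a small sketch. The functions below are hypothetical examples that we chose for brevity, assuming very simple kinds of information (a list of words, an arithmetic series, and two ranges for the same value); they are not a general method for collapsing information.

```python
def remove_duplication(items):
    """Store each distinct item once and replace every occurrence by a reference."""
    table, refs = [], []
    for item in items:
        if item not in table:
            table.append(item)
        refs.append(table.index(item))
    return table, refs


def generalize(series):
    """Replace an arithmetic series by (start, step, count); None if it does not fit."""
    if len(series) < 2:
        return None
    step = series[1] - series[0]
    if any(b - a != step for a, b in zip(series, series[1:])):
        return None
    return series[0], step, len(series)


def regenerate(start, step, count):
    """Rebuild the series from its generating pattern."""
    return [start + i * step for i in range(count)]


def deduce(first, second):
    """Replace two (low, high) facts about the same value by their joint conclusion."""
    return max(first[0], second[0]), min(first[1], second[1])


table, refs = remove_duplication(["save", "load", "save", "save", "load"])
pattern = generalize([3, 6, 9, 12, 15])
conclusion = deduce((1, 10), (7, 20))

print(table, refs)
print(pattern, regenerate(*pattern))
print(conclusion)
```

In the first two cases the original can be rebuilt exactly from the shorter form, so nothing is lost. In the third, the two ranges are replaced by the single range that both imply.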

Execute

As we discussed before, executing information requires a medium that allows execution and very precise information that follows the rules accepted by that medium.

The whole purpose of a software project is to reach the point where information becomes executable. Therefore, the purpose of any software development framework or methodology should be to minimize the time needed to reach executable information.

Brief Recap

This article introduces a way of looking at software development that tries to answer the question “What are we actually doing when developing software?”. We propose looking at a software project as a sequence of steps in processing information, with the purpose of reaching the level of precision required for executable information. The information is transformed through a number of operations: store, add, discard, copy, collapse, execute. Since ideas start in the human mind, it is very important to understand how people process information and to use that understanding effectively.

Further

This idea opens many exciting possibilities, starting with understanding why certain development practices work better than others and possibly opening the road for creating new practices or improving the existing ones.

The first consequence that we see in this model is that software developers improve their information processing skills over time by practicing them day by day. These skills are important from the very beginning of the project. So not only should we involve developers as early as possible in the project, but we should also seek ways of improving their information processing skills.

We will revisit this idea in future articles and discuss current development practices on the basis of this model.

We would like to thank our reviewers, JB Rainsberger, Sarah Rainsberger and Felix Plesoianu, for their suggestions and patience.