Tool building organizations

Douwe Osinga
Apr 20, 2022

Last time we talked about the superpower of being able to build your own tools. This might be a bit of a lost art, with the Industrial Revolution having replaced artisans with laborers doing standard jobs. Now it looks like building your own tools is going to be key again in the data economy. In a reply on Twitter, Mikio Braun wondered how we can make this work in the “enterprise”.

Today I want to dive into that question a bit. Which information ecosystems encourage people to build their own tools? And what can we learn from the successful ones when designing our own organizations?

A fascinating example of such an ecosystem was early Twitter. When Twitter started it was almost nothing more than an app to send SMS. You would send an SMS to Twitter and then Twitter would send that SMS to your followers. Oh, and you could only use 140 characters. SMS allows for 160 characters, but Twitter reserved the other 20 for the username.

The beauty of such a simple system is that it is all text based. If you have a fancy-looking UI, adding a feature is hard work. But if it is all simple, malleable text, conventions can replace implemented features. Think of using ‘*’ in plain text to simulate a bulleted list.

When early Twitter users wanted to address a tweet to somebody, they just added @<username>. Similarly, if you wanted to retweet, you would write “RT @username <original-tweet>”. DM’ing somebody? Start with “DM”, never mind that everybody can see what you typed.

Even the now ubiquitous hashtag was invented by the early Twitter community. IRC, an ancient chat protocol, has the convention of prefixing the names of public channels with the ‘#’ sign, so maybe that inspired Chris Messina when, in a 2007 tweet, he proposed using ‘#’ to mark groups.

And that’s how the hashtag started. No focus groups or product managers needed.

All these conventions of course turned into actual features and spread beyond Twitter. They are now universal concepts. We barely remember where they came from.
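To make the step from convention to feature concrete, here is a minimal sketch of how software could read those conventions back out of plain text. The function name `parse_tweet`, the regular expressions and the sample tweet are illustrative assumptions, not how Twitter actually implemented any of this.

```python
import re

# Hypothetical patterns for the community conventions described above.
MENTION = re.compile(r"@(\w+)")
HASHTAG = re.compile(r"#(\w+)")
RETWEET = re.compile(r"^RT @(\w+):?\s*(.*)$", re.DOTALL)


def parse_tweet(text):
    """Recover retweets, mentions and hashtags from a plain-text tweet."""
    result = {"retweet_of": None, "body": text}
    match = RETWEET.match(text)
    if match:
        # "RT @user <original-tweet>" marks a retweet of @user.
        result["retweet_of"] = match.group(1)
        result["body"] = match.group(2)
    result["mentions"] = MENTION.findall(result["body"])
    result["hashtags"] = HASHTAG.findall(result["body"])
    return result


# Made-up sample tweet, for illustration only.
parsed = parse_tweet("RT @alice: heading to #barcamp with @bob")
print(parsed)
```

The point is how little is needed: because the convention lives in the text itself, a few patterns are enough to turn it into structure.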

Web 3.0 now sounds like crypto but originally it meant the Semantic Web. After inventing Web 1.0, Tim Berners-Lee decided there should be more than the chaotic creative explosion of the early Internet. What if instead of free form text, people would contribute knowledge directly? Mark-up knowledge, not just text.

And so a Plan was made to encode all knowledge in ontologies, expressed in the Resource Description Framework. It failed. Imposing pre-set rules on how to do this is hard enough in a professional bureaucracy. It turned out to be almost impossible to do on the web. On the web, official standards follow practice, not the other way around. Around the same time though, a group of people embarked on a different project to encode all knowledge: Wikipedia.

The Wikipedians allowed anybody to write articles. No plan and no structure beyond text based linking between articles and some text based formatting. Linking to another page was a matter of putting a term between [[ and ]]. If the target page didn’t exist, it would show up in red, inviting others to write that missing article. As we all know, the anybody-can-edit-anything encyclopedia took off like a jack rabbit.
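The red-link mechanism is simple enough to sketch in a few lines. The following is a rough illustration, assuming pages are held in a dictionary from title to wikitext; the helper names and the sample pages are made up, and real MediaWiki link syntax has more cases than this pattern handles.

```python
import re

# Matches [[Target]] and [[Target|label]]; captures only the target title.
WIKILINK = re.compile(r"\[\[([^\]|]+)(?:\|[^\]]*)?\]\]")


def link_targets(wikitext):
    """Return the titles of all pages this text links to."""
    return [target.strip() for target in WIKILINK.findall(wikitext)]


def red_links(pages):
    """Return linked titles that have no page yet, sorted alphabetically."""
    missing = set()
    for text in pages.values():
        for target in link_targets(text):
            if target not in pages:
                missing.add(target)
    return sorted(missing)


# Tiny made-up wiki with two pages.
pages = {
    "Leonardo da Vinci": "Born in [[Vinci, Tuscany|Vinci]], trained by [[Andrea del Verrocchio]].",
    "Vinci, Tuscany": "A town in [[Tuscany]], birthplace of [[Leonardo da Vinci]].",
}
print(red_links(pages))
```

Every title this returns is an open invitation to write an article, which is all the "planning" the early encyclopedia needed.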

But then something beyond plain text started to happen. Editors came up with certain rules to keep things somewhat organized. Take category pages. Today most Wikipedia pages end with a list of categories. But category pages started as lists of pages with something in common, for example all Renaissance painters. A convention sprang up to start the names of those pages with “Category:” and editors started to link back to them. Categorizing is now close to hashtagging. An article linking back to [[Category:1452_births]] is really saying that this person was born in 1452.

Wikipedia’s templates started out as a way to reduce the amount of boilerplate text. Instead of typing out a full citation every time, insert a {{cite …}} text fragment. Add the author, the source and the year and voila, all citations look the same. Similarly, all cities ended up having an “infobox settlement” template. Infoboxes show a right side panel with the key facts about a city in a standardized format.

But from a different perspective, infoboxes are just semantic claims about a page. Vinci,_Tuscany contains a template of type “infobox settlement” where the field population_total has a value of 14579. This is really saying this is a city with 14579 people in it. There’s your semantic web! The category hierarchy builds out your ontology. The key/value pairs in the templates are your semantic triples (the subject is the page).
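As a rough sketch of that reading, the snippet below turns infobox fields and category links into (subject, predicate, object) triples. It assumes the simplest possible layout, one `| key = value` pair per line, whereas real templates nest and span lines. The function names, the page titles used as subjects and the sample wikitext are illustrative.

```python
import re

FIELD = re.compile(r"\s*\|\s*(\w+)\s*=\s*(.+?)\s*$")
CATEGORY = re.compile(r"\[\[Category:([^\]|]+)")


def infobox_triples(title, wikitext):
    """Read '| key = value' lines as (page, key, value) triples."""
    triples = []
    for line in wikitext.splitlines():
        match = FIELD.match(line)
        if match:
            triples.append((title, match.group(1), match.group(2)))
    return triples


def category_triples(title, wikitext):
    """Read [[Category:...]] links as (page, 'category', name) triples."""
    return [(title, "category", name.strip()) for name in CATEGORY.findall(wikitext)]


# Simplified sample wikitext; values taken from the examples in the text.
vinci = """{{Infobox settlement
| name = Vinci
| population_total = 14579
}}"""

print(infobox_triples("Vinci,_Tuscany", vinci))
print(category_triples("Leonardo_da_Vinci", "[[Category:1452_births]]"))
```

Here the page title plays the role of the subject, the field name is the predicate and the field value is the object.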

Anybody who has spent time mining Wikipedia for data can tell you that the reality is quite messy. The categories and templates grow organically and only slowly do editors agree on shared standards. Should there be a category of Renaissance painters? Or Renaissance artists? Are the infoboxes for cities different from village infoboxes? But it works. Over time a consensus emerges. The irony of it all is of course that the Wikidata project is the closest thing we have to the semantic web.

Early computer terminal

The archetypal ecosystem of tool growing is probably software engineering. In the olden days, programmers typed on large keyboards, staring at small, green screens filled with only letters and numbers. Hollywood still very much likes to depict hackers like that. Sure, keyboards tend to be smaller and the screens are certainly bigger and mostly not green (unless the person doing the typing feels nostalgic). But writing code involves a lot of typing commands into terminals, basically applications that look like 40-year-old CRTs.

It turns out that having a set of small, text based tools that can be combined and chained is a very effective way of getting work done. One might remove all lines that don’t contain a word. Another might for each line only keep the third column. A third one might just sort the lines. Combine them and you have something that automates what a spreadsheet could do. These combinations become their own tools that can be shared with others. And it is easy to write a new basic tool (if not easy to come up with one that does something interesting).
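The three tools just described are roughly what grep, cut and sort do on a Unix command line. As a minimal sketch of the same idea in Python, each tool below is a small function over lines of text and `pipeline` chains them. All names and the sample log lines are made up for illustration.

```python
def keep_matching(lines, word):
    """Keep only the lines that contain the given word."""
    return (line for line in lines if word in line)


def column(lines, index, sep=None):
    """Keep only one whitespace-separated column (0-based) of each line."""
    for line in lines:
        fields = line.split(sep)
        if index < len(fields):
            yield fields[index]


def sort_lines(lines):
    """Sort the lines alphabetically."""
    return sorted(lines)


def pipeline(lines, *stages):
    """Feed the output of each stage into the next one."""
    for stage in stages:
        lines = stage(lines)
    return list(lines)


# Made-up input, standing in for a log file.
log = [
    "2022-04-20 error disk",
    "2022-04-20 info boot",
    "2022-04-19 error cache",
]

result = pipeline(
    log,
    lambda ls: keep_matching(ls, "error"),  # drop lines without the word
    lambda ls: column(ls, 2),               # keep the third column
    sort_lines,                             # sort what is left
)
print(result)
```

The combination is itself a new tool: wrap the `pipeline` call in a function and it can be shared and chained like any of its parts.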

The three examples mentioned here are all text based and I don’t think that is a coincidence. Text is malleable and so anybody can build text based tools — if a community decides that RT stands for retweet, it does and then at some point Twitter will make it so. And that will be better, but not even by that much. Text is magic.

We’ve talked about this before, but spreadsheets do this well, too — they are text based, but in a more structured way. Anybody can start a spreadsheet because it just starts out as a list. And then you add some more columns and now you have a little database. And now you add some formulas and you are programming (and of course if you keep adding complexity it gets better and better until you hit a brick wall from where there is no coming back).

How does this translate to organizations? As Mikio was hinting at, good organizations have got this working. Early Google did, in a way. I doubt that there are any hard and fast rules, but certain principles sound right. Let small groups self-organize and come up with their own processes. Small, but not too small: Bezos’ two-pizza rule seems like a good guideline.

Don’t be afraid if different teams have different solutions for the same problem — different corners of Wikipedia tried different approaches too before settling on the template to use for populated areas. The key is to try different things and over time zoom in on the one that works.

Use simple tools that can be adapted to the task at hand — it’s remarkable how little you need to run an organization beyond a whiteboard, a folder with text files and some spreadsheets. And by all means pick the cloud based version if you are remote, but don’t start too fancy — do retros using sticky notes on the whiteboard, write peer feedback in Google Docs and design your roadmap in Excel.

And finally, when you establish a practice, automate. Machines are really good at repetitive and boring stuff. Humans are not, and they make mistakes. Shell scripts (or PowerShell scripts) are a good start. There are many no-code and low-code solutions these days. Lots of tools come with their own macros to make mundane jobs happen automatically. Ultimately though, there is no substitute for real code and for organizations that know how to program. In the data economy, writing code is the only real way to build your own tools.



Douwe Osinga

Entrepreneur, Coding enthusiast and co-founder of Neptyne, the programmable spreadsheet