What is make?

Before we talk about a better make, we should first define what make actually is. It’s a rather simple tool for transforming files from their source form into something else, with some amount of dependency tracking along the way. It’s a “dependency maintenance tool” and a “software construction tool” that is at its best when used together with other tools to form a “build system” and/or a “package management system”.

(This article is part of a series on build tools.)

Make as a tool

The Wikipedia page on make starts off like this:

make is a utility that automates the process of converting files from one form to another, doing dependency tracking and invoking external programs to do additional work as needed. Its dependency tracking is very simple and centers on using the modification time of the input files. Most frequently it is used for compiling source code into object code, joining and then linking object code into executables or libraries. It uses files called “makefiles” to determine the dependency graph for a given output, and the build scripts which need to be passed to the shell to build them. The term “makefile” stems from their traditional file name of “makefile” or (later) “Makefile”.

That’s a rather good, if compact, description. There are many variants of make available, all with slightly different features. The ones in most common use today are GNU make and BSD make.
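To make this concrete, here is a minimal sketch of a makefile (the file names are placeholders; note that in a real makefile each command line must be indented with a literal tab):

    # 'hello' is rebuilt whenever hello.o is newer than it;
    # hello.o is in turn rebuilt whenever hello.c changes.
    hello: hello.o
        cc -o hello hello.o

    hello.o: hello.c
        cc -c hello.c

Running make walks this dependency graph, compares modification times, and only re-runs the commands whose inputs have changed.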

Make as a part of a process

There is more to software than transforming source files into object files; I’ve written more about the software management process elsewhere. Make is often the “driver” program for several of these stages. To build software, you often just type make. To test it, you type make test. To package it up into a release, you type make dist. To distribute it, you type make publish. To install it, you type make install.
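As a sketch of such a “driver” makefile (the program name and commands are placeholders, and the publish step is omitted since it varies wildly between projects):

    myprog: main.c              # plain 'make' builds the software
        cc -o myprog main.c

    test: myprog                # 'make test' runs the test suite
        ./run-tests.sh

    dist: myprog                # 'make dist' rolls a release tarball
        tar czf myprog-1.0.tar.gz main.c Makefile

    .PHONY: test dist

The .PHONY declaration tells make that test and dist are command names rather than files to be produced.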

Make really is optimized primarily for the “build” step, i.e. for the actual source-file-to-object-file transformations. But since it uses the shell to execute commands, and the shell is the usual way you execute commands anyway (yeah yeah, I know there is such a thing as a GUI), it’s really easy to hook up commands for most of the other stuff, too. It’s even possible to implement rules such as “if a source file changed, recompile the output files created from it, then rerun the tests that exercise those output files”, even if maintaining such rules quickly becomes complex and awkward.
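A hedged sketch of such a rule chain (the file names are hypothetical); the classic idiom is to record a successful test run in a “stamp” file, so that make’s timestamp comparison can drive the rerun:

    # parser.o is rebuilt whenever parser.c changes...
    parser.o: parser.c
        cc -c parser.c

    # ...and the test is rerun whenever parser.o (or the test
    # script) is newer than the stamp left by the last good run.
    test-parser.passed: parser.o test_parser.sh
        ./test_parser.sh
        touch test-parser.passed

Multiply this by dozens of modules and test suites, and the awkwardness becomes apparent.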

Using make for building software

Now, even when using make for just “the build”, it usually does not operate in isolation. In the more-or-less standard way make is used to build a piece of “native” software, the sequence of commands is actually more like ./configure; make. Here, configure is a (rather extensive) shell script.

Since writing a really complex, portable shell script is very hard, many developers of C/C++ software use GNU Autoconf and the other parts of the GNU build system (autoconf, automake, and libtool) to assist them. These tools help with generating the configure script and the Makefile. The complete process looks something like this:

[Figure: simplified program flow for the GNU build system]

So to make effective use of make for builds of software of realistic complexity, which must build on a variety of platforms, we have to add several more steps of macro expansion, introspection, transformation, compilation, and so on. Automake and autoconf are usually run by the original package developer, and the generated configure script and Makefile are then shipped to the end user. This means that the automake and autoconf dependencies (like Perl and the GNU implementation of the M4 macro language) are not needed by the end user. However, since the configure script and Makefile are very complex, an end user who wants or needs to change the build process usually still has to have automake and autoconf installed.
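In command form, the split between developer and end user looks roughly like this (autoreconf is a wrapper that runs autoconf, automake, and friends in the right order):

    # On the developer's machine (requires autoconf/automake/libtool):
    autoreconf --install   # generates configure, Makefile.in, etc.

    # On the end user's machine (no autotools needed):
    ./configure            # probes the platform, writes the Makefile
    make
    make install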

Note that while the GNU build system has some support for software written in languages such as Perl, Python, or Java, developers using those languages tend not to use make at all, instead opting for a language-specific tool.

Using make for other tasks

Any kind of task that involves transforming one kind of file into another is somewhere make can be very useful. For example, make can be used to package generated code into a tarball, or to invoke any of the multitude of LaTeX tools out there to generate HTML or PDF documentation.
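For instance, a couple of hypothetical GNU make rules for exactly those two tasks might look like:

    # Package the generated code into a tarball.
    generated.tar.gz: gen/
        tar czf generated.tar.gz gen/

    # Pattern rule: any .pdf can be produced from the matching .tex.
    %.pdf: %.tex
        pdflatex $<

(Here $< is make’s automatic variable for the first prerequisite, i.e. the .tex file.)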

Make is also often used for tasks that have little to do with transforming input files into output files, but which are related to the more general process of dealing with software. For example, many Makefiles support an install target, which copies compiled software to a location on the filesystem where it can be easily invoked. Similarly, make is the basis of the BSD ports system.
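A minimal sketch of such an install target, with myprog and the prefix as placeholder choices (the ?= assignment lets the user override PREFIX on the command line):

    PREFIX ?= /usr/local

    install: myprog
        mkdir -p $(PREFIX)/bin
        install -m 755 myprog $(PREFIX)/bin/myprog

    .PHONY: install

Invoked as make install PREFIX=$HOME/opt, this copies the compiled program to a directory of the user’s choosing.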

How make interacts with packaging systems

Many Linux distributions have a special kind of “packaging format” which adds its own “metadata” to a particular piece of software. This metadata describes what commands should be invoked to make the software compile and install successfully on the target platform. Several management tools are usually provided for managing this kind of metadata and/or these kinds of packages. In most cases, these tools at some point in their execution invoke make to do the actual software build. The BSD ports system is an interesting exception: it is built entirely around make, so make tends to be used to invoke itself.

Here’s a picture showing the relationship between make and many common packaging tools:

[Figure: the relationship between make and common packaging tools]

(Note how building software for Windows generally does not involve make. Also note how messed up package management on Windows generally is. Coincidence? I don’t think so.)

Make as something to improve on

Make is rather simple, stable, and mature software, available on just about every computing platform out there and installed by default on just about every operating system but Microsoft Windows. While make is primarily optimized for building object files from source code, and for tracking dependencies between object files and the corresponding source files, it is flexible enough to be integrated with and used for many other tasks in the software management process.

So what is there to improve on? Together with the previous two posts, this post should provide enough “background” so I can start compiling a list…

April 17 update: minor formatting updates and reference to the overarching series.

Build systems for games

Thinking more about build systems, I figured the gaming industry must have some pretty complex problems to solve there as well. Indeed they do, and there’s quite a bit of information about it readily found on the web. Performance is key, and no one seems to care much about “open development”. But with Microsoft XNA on the horizon, that might change.

(This article is part of a series on build tools.)

Where gaming industry engineers hang out

Noel Llopis provides “an engineering look at the game development process” on his blog, and has written a bunch about build systems. He refers to the sweng-gamedev mailing list, which is a true gem of insight into what goes on in the game engineering world. It’s a place where people from actual game development houses interact directly with the Microsoft developer teams.

I’m going to equate the gamedev community with the gaming industry for a bit. That’s probably not all that accurate, but it helps the thought process.

The gamedev list has build systems as a recurring theme. For example, December 2004 has several long threads about build systems, and June 2005 rehashes some of those, after which Noel wrote some of the entries referenced above, and these guys then spent a lot of time on a Quest for the (More) Perfect Build Tool.

The gaming industry likes Perforce and Jam

The mailing list policies for the gamedev list discourage quoting or “publishing” from it, so I won’t. Instead, I’ll sum up some things I found very interesting:

  • Games are developed in C++, primarily on Windows and secondarily on Linux (I presume especially for the servers), meaning Visual Studio is pretty much the default (and only) option. Those that used CodeWarrior still long for it. Poor bastards. We Java people have Eclipse, IntelliJ, and more, with integrated support for our favorite tools (Ant and, up and coming, Maven). Their build tools tend to generate Visual Studio project files.
  • Microsoft seems to listen well to concerns and complaints, and has a team in place to get its act together
  • Most of them currently seem to use MS tools with various extensions, with Jam being a favorite (because of its speed)
  • They seem to like SCons (except, of course, the people who don’t like Python’s syntax; these are C++ people, after all), but it is too slow for their big builds
  • They use Perforce a whole lot (again, the main argument seems to be speed), which is also why Jam makes a lot of sense for them
  • Relatively speaking, they don’t need a whole lot of “SCM” for code at all; most of their code is “just C++”. What they do need is a whole lot of asset management, with assets (or “resources” or “data”) being things like textures, 3D models, and all the other stuff the non-programmers make for games.

The gaming industry doesn’t do much open development (yet)

In many ways the gaming industry seems to be at about the stage where the “content management” and “middleware” industries were 10 years ago as far as “getting” open source is concerned. They use some open source tools, yet don’t contribute a whole lot back (given the size of the industry!), and don’t seem to have much of a clue when it comes to licensing. Sharing takes the form of posting code and/or patches to the Perforce public depot.

With all the stories of big losses made on games and of game studios going bankrupt, I’m guessing they’ll be playing catch-up much faster; they have to. There are quite a few open source game engines already. I wonder if other parts of their workflow systems are going to become open source soon(ish) too.

I think the reason there is not a lot of open development in gaming yet is that the gaming industry hasn’t felt the commoditization of software very much just yet. Game publishers do a lot more than just write software, and have business models that go way beyond being about “just” software.

This is probably one reason why efforts to collaborate on things like an improved Jam don’t seem to work out: the decision makers don’t see the incentive clearly enough.

It’s either going to be open or “completely Microsoft”

Doing a good analysis of the requirements the big gaming studios have for build systems, and then pitching an open source solution to them, sounds quite doable. Sony seems to like the open-standards approach here (how ironic, given all their DRM slip-ups), and it seems they’ll be churning out some nice tools for the PS3, but they’re doing a spectacularly bad job of talking about it.

Microsoft has a strong XNA team, which is building on top of the horrible product (yes, yes, in many ways it’s rather cool too) that is Visual Studio 2005. With all these developers already working on Windows, and all of them using C++ as their main environment, Visual Studio seems firmly poised to remain the standard baseline environment for these people, and this puts XNA in a rather strong position. Gotta love monopolies!

Paraphrased from this presentation on XNA (it’s a good presentation, if a bit slow):

Why should each studio have to uniquely solve the same sets of problems faced across the industry? (…) Without tools vendors devoted to supporting the unique needs of the gaming industry, studios are left with no choice but to continue investing in building and maintaining these technologies.

And then, of course, Microsoft is positioned as the ideal partner and the only real tool vendor. Yeah, right. Um, hello, gaming studios? Don’t listen to this crap, please. There is this other choice, called commons-based peer production, or more briefly, open source, which is the direction your tool and build teams have been pursuing for many, many years, with arguably stellar success: you’ve been shipping some very complex pieces of software on insane schedules for years now. How much of your pipeline do you think is built around Visual Studio today? Not much. Why do you think that is?

Whose judgement do you trust more? That of your own experts, or that of the people who want to store all your gargantuan asset files in a rather crappy relational database?

Evolving make

Inventing yet another “better make” does not make sense; it’s better to work with the maintainers of GNU make. I’ve taken a brief look at how make has evolved over the years. Pretty impressive.

Evolution instead of revolution

After defining “a better make”, I’ve had a few people ask me when I was going to show off the replacement tool I was building. Guess what: I’m not building one. I made quite a few sketches of what such a thing would look like if I were to do it all from scratch, thinking about how I would approach it, and so on. I ended up with a design not unlike rake, and the basics of a simple scripting language not unlike miniruby.

It dawned on me rather quickly that going further down that path would eventually lead to a rather broken variant of rake. Besides, rake hasn’t really seen the adoption that I described as the target goal, so I rather doubt this road would be successful.

This leads me to the conclusion that the best kind of “revolution” I can come up with is not good enough. That leaves evolutionary improvement, which is often a better idea anyway.

Looking at make’s evolution

Interestingly enough, GNU make (the most popular variant) has had only two active developers for the last five years or so. They have dutifully done maintenance, but they have not added a whole lot of new features, especially not big ones. The last sizeable new feature I could identify was the implementation of “if… else if… endif”, on May 13, 2005. Most everything else seems to be bug fixes, optimizations, and additions of new built-in functions. The last really big new thing was the “job server”, added on April 13, 1999.
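For reference, here is what that conditional syntax looks like (the compiler test is just an illustration):

    ifeq ($(CC),gcc)
      CFLAGS += -Wall -Wextra
    else ifeq ($(CC),clang)
      CFLAGS += -Weverything
    else
      CFLAGS += -O2
    endif

The job server, by contrast, is the machinery that coordinates parallel builds under make -jN, including across recursive make invocations.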

Given such a track record, I wonder how eager these guys are to accept patches constituting big changes… we’ll see. They have an issue tracker, a CVS repository, and mailing lists. The only difference from the open source environments I’m used to is the license.

Peeking deep into Unix history…

Speaking of GNU make’s track record, it is really impressive how easy it is to follow and trace back. The changelog (split into three files) goes back to 1988, when make was apparently split off from the RCS tree. It’s hard to get the story of its development from before 1988 without actually harassing the authors, though Google Groups helps a lot (for example, here is one of the oldest online sources for Cake, the make replacement, here is version 1.0 of Perl, and let’s top it off with the initial release of flex).