Web application platform technology choices

The hardest bit in the web application platform challenge is making reasonable choices. Here’s a stab at some of them…

Hosting models

I see these basic choices:

  1. LAMP virtual hosting. If you can build everything you need with mysql+php and you have few enough users that you need only one database server, this is by far the easiest and cheapest option.
  2. Application hosting. Code on github, project management with basecamp or hosted jira, build on AppEngine or Heroku or force.com. You don’t have to do your own infrastructure but you’re limited in what you can build. Also comes with a large chance of lock-in.
  3. Managed hosting. Rent (virtual) servers with pre-installed operating systems and managed networking. Expensive for large deployments, but you don’t need a full web operations skill set in-house and you have a lot of flexibility (famously, twitter do this).
  4. Dedicated hosting. Buy or rent servers, rent rackspace or build your own data center. You need network engineers and people that can handle hardware. Usually the only cost-effective option beyond a certain size.

Given our stated requirements, we are really only talking about option #4, but I wanted to mention the alternatives because they will make sense for a lot of people. Oh, and I think all the other options are these days called cloud computing :-)

Hardware platform

I’m not really a hardware guy, normally I leave this kind of stuff to others. Anyone have any good hardware evaluation guides? Some things I do know:

  • Get at least two of everything.
  • Get quality switches. Many of the worst outages have something to do with blown-up switches, and since you usually have only a few, losing one during a traffic spike is uncool.
  • Get beefy database boxes. Scaling databases out is hard, but they scale up nicely without wasting resources.
  • Get beefy (hardware) load balancers. Going to more than 2 load balancers is complicated, and while the load balancers have spare capacity they can help with SSL, caching, etc.
  • Get beefy boxes to run your monitoring systems (remember, two of everything). In my experience most monitoring systems suffer from pretty crappy architectures, and so are real resource hogs.
  • Get hardware RAID (RAID 5 seems common) with a battery-backed write cache, for all storage systems. That is, unless you have some other redundancy architecture and don’t need RAID for redundancy.
  • Don’t forget about hardware for backups. Do you need tape?

Other thoughts:

  • Appliances. I really like the idea. Things like the schooner appliances for mysql and memcache, or the kickfire appliance for mysql analytics. I have no firsthand experience with them (yet), though. I’m guessing oracle+sun is going to be big in this space.
  • SSD. It is obviously the future, but right now they seem to come with limited warranties, and they’re still expensive enough that you should only use them for data that will actually get hot.

Operating system

Choice #1: unix-ish or windows or both. The Microsoft Web Platform actually looks pretty impressive to me these days but I don’t know much about it. So I’ll go for unix-ish.

Choice #2: ubuntu or red hat or freebsd or opensolaris.

I think Ubuntu is currently the best of the debian-based linuxes. I somewhat prefer ubuntu to red hat, primarily because I really don’t like RPM. Unfortunately red hat comes with better training and certification programs, better hardware vendor support and better available support options.

FreeBSD and solaris have a whole bunch of advantages (zfs, zones/jails, smf, network stack, many-core, …) over linux that would make linux seem like a useless toy, were it not for the fact that linux sees so much more use. This is important: linux has the largest array of pre-packaged software that works on it out of the box, linux runs on more hardware (like laptops…), and many more developers are used to linux.

One approach would be solaris for database (ZFS) and media (ZFS!) hosting, and linux for application hosting. The cost of that, of course, would be the complexity in having to manage two platforms. The question then is whether the gain in manageability offsets the price paid in complexity.

And so, red hat gains another (reluctant) customer.

Database

As much sympathy as I have for the NoSQL movement, the relational database is not dead, and it sure as hell is easier to manage. When dealing with a wide variety of applications by a wide variety of developers, and a lot of legacy software, I think a SQL database is still the default model to go with. There’s a large range of options there.

Choice #1: clustered or sharded. At some point some application will have more data than fits on one server, and it will have to be split. Either you use a fancy database that supports clustering (like Oracle or SQL Server), or you use some fancy clustering middleware (like continuent), or you teach your application to split up the data (using horizontal partitioning or sharding) and you use a more no-frills open source database (mysql or postgres).
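To make the sharding option a bit more concrete, here is a minimal sketch of application-level shard routing. The shard map, table and connection details are made up for illustration (and MySQLdb is just one possible driver); a real setup also has to deal with re-sharding and cross-shard queries:

import MySQLdb

# one entry per database server; in real life this lives in configuration
SHARDS = [
    {'host': 'db1.internal', 'db': 'app_shard_0'},
    {'host': 'db2.internal', 'db': 'app_shard_1'},
]

def shard_for(user_id):
    # horizontal partitioning: every user lives on exactly one database server
    return SHARDS[user_id % len(SHARDS)]

def load_profile(user_id):
    cfg = shard_for(user_id)
    conn = MySQLdb.connect(host=cfg['host'], db=cfg['db'], user='app')
    try:
        cur = conn.cursor()
        cur.execute("SELECT bio FROM profile WHERE user_id = %s", (user_id,))
        return cur.fetchone()
    finally:
        conn.close()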

I suspect that the additional cost of operating an oracle cluster may very well be worth paying for – besides not having to do application level clustering, the excellent management and analysis tools are worth it. I wish someone did a model/spreadsheet to prove it. Anyone?

However, it is much easier to find developers skilled with open source databases, and it is much easier for developers to run a local copy of their database for development. Again there’s a tradeoff.

The choice between mysql and postgres has a similar tradeoff. Postgres has a much more complete feature set, but mysql is slightly easier to get started with and has significantly easier-to-use replication features.

And so, mysql gains another (reluctant) customer.

With that choice made, I think it’s important to invest early on in providing some higher-level APIs so that while the storage engine might be InnoDB and the access to that storage engine might be MySQL, many applications are coded to talk to a more constrained API. Things like Amazon’s S3, SimpleDB and the Google AppEngine data store provide good examples of constrained APIs that are worth emulating.
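To illustrate what “constrained” could mean in practice, here is a rough sketch of a narrow key-value facade; the table layout, class and column names are invented for the example. The point is that applications only ever see get/put, never raw SQL:

import MySQLdb

class KVStore(object):
    """Deliberately narrow API: get/put by key only, no ad-hoc SQL.

    Assumes a table like:
      CREATE TABLE kv (k VARCHAR(180) PRIMARY KEY, v MEDIUMBLOB) ENGINE=InnoDB;
    """

    def __init__(self, **connect_kwargs):
        self.conn = MySQLdb.connect(**connect_kwargs)

    def get(self, key):
        cur = self.conn.cursor()
        cur.execute("SELECT v FROM kv WHERE k = %s", (key,))
        row = cur.fetchone()
        return row[0] if row else None

    def put(self, key, value):
        cur = self.conn.cursor()
        cur.execute(
            "INSERT INTO kv (k, v) VALUES (%s, %s) "
            "ON DUPLICATE KEY UPDATE v = VALUES(v)", (key, value))
        self.conn.commit()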

HTTP architecture

Apache HTTPD. Easiest choice so far. Its swiss army knife characteristic is quite important. It’s what everyone knows. Things like nginx are pretty cool and can be used as the main web server, but I suspect most people that switch to them should’ve spent some time tuning httpd instead. Since I know how to do that…I’ll stick with what I know.

As easy as that choice is, the choice of what to put between HTTPD and the web seems to be harder than ever. The basic sanctioned architecture these days seems to use BGP load sharing to have the switches direct traffic at some fancy layer 7 load balancers where you terminate SSL and KeepAlive. Those fancy load balancers then may point at a layer of caching reverse proxies, which in turn point at the (httpd) app servers.

I’m going to assume we can afford a pair of F5 Big-IPs per datacenter. Since they can do caching, too, we might avoid building that reverse proxy layer until we need it (at which point we can evaluate squid, varnish, HAProxy, nginx and perlbal, with that evaluation showing we should go with Varnish :-) ).

Application architecture

Memcache is nearly everywhere, obviously. Or is it? If you’re starting mostly from scratch and most stuff can be AJAX, http caching in front of the frontends (see above) might be nearly enough.

Assuming a 3-tier (web, middleware, db) system, reasonable choices for the front-end layer might include PHP, WSGI+Django, and mod_perl. I still can’t see myself rolling out Ruby on Rails on a large scale. Reasonable middleware choices might include java servlets, unix daemons written in C/C++ and more mod_perl. I’d say Twisted would be an unreasonable but feasible choice :-)

Communication between the layers could be REST/HTTP (probably going through the reverse proxy caches) but I’d like to try and make use of thrift. Latency is a bitch, and HTTP doesn’t help.

I’m not sure whether considering a 2-tier system (i.e. PHP direct to database, or perhaps PHP link against C/C++ modules that talk to the database) makes sense these days. I think the layered architecture is usually worth it, mostly for organizational reasons: you can have specialized backend teams and frontend teams.

If it was me personally doing the development, I’m pretty sure I would go 3-tier, with (mostly) mod_wsgi/python frontends using (mostly) thrift to connect to (mostly) daemonized python backends (to be re-written in faster/more concurrent languages as usage patterns dictate) that connect to a farm of (mostly) mysql databases using raw _mysql, with just about all caching in front of the frontend layer. I’m not so sure it’s easy to teach a large community of people that pattern; it’d be interesting to try :-)
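For what it’s worth, a frontend in that pattern might look roughly like the sketch below. The backend service, its get() method, the port and the thrift-generated kvservice module are all made up for illustration; the shape is what matters: a mod_wsgi frontend that does no caching of its own and defers to a thrift backend.

from thrift.transport import TSocket, TTransport
from thrift.protocol import TBinaryProtocol

from kvservice import KVBackend  # hypothetical thrift-generated client module

def call_backend(key):
    # one connection per request keeps the sketch simple; a real frontend
    # would pool and reuse connections to keep latency down
    transport = TTransport.TFramedTransport(
        TSocket.TSocket('backend.internal', 9090))
    protocol = TBinaryProtocol.TBinaryProtocol(transport)
    client = KVBackend.Client(protocol)
    transport.open()
    try:
        return client.get(key)  # hypothetical service method
    finally:
        transport.close()

def application(environ, start_response):
    # mod_wsgi entry point
    value = call_backend(environ['PATH_INFO'].lstrip('/')) or 'not found\n'
    start_response('200 OK', [('Content-Type', 'text/plain'),
                              ('Content-Length', str(len(value)))])
    return [value]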

As for the more boring choice…PHP frontends with java and/or C/C++ backends with REST in the middle seems easier to teach and evangelize, and it’s also easier to patch up bad apps by sticking custom caching stuff (and, shudder, mod_rewrite) in the middle.

Messaging

If there’s anything obvious in today’s web architecture it is that deferred processing is absolutely key to low-latency user experiences.

The obvious way to do asynchronous work is by pushing jobs on queues. One hard choice at the moment is what messaging stack to use. Obvious contenders include:

  • Websphere MQ (the expensive incumbent)
  • ActiveMQ (the best-known open source system with stability issues)
  • OpenAMQ (AMQP backed by interesting startup)
  • 0MQ (AMQP bought up by same startup)
  • RabbitMQ (AMQP by another startup; erlang yuck)
  • MRG (or QPid, AMQP by red hat which is not exactly a startup).

A less obvious way to do asynchronous work is through a job architecture such as gearman, app engine cron or quartz, where the queue is not explicit but rather exists as a “pending connections” set of work.

I’m not sure what I would pick right now. I’d probably still stay safe and use ActiveMQ with JMS and/or STOMP with JMS semantics. Two months from now I might choose differently.
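Whichever broker wins, the application-side pattern stays the same: serialize a small job description, push it onto a queue during the request, and let workers drain the queue afterwards. A minimal in-process illustration of that pattern (no broker at all, just the Python standard library; in production the put() becomes a publish to ActiveMQ/RabbitMQ/etc. and the worker runs on another machine):

import threading
import Queue

jobs = Queue.Queue()

def enqueue_thumbnail_job(image_id):
    # with a real broker this would serialize the job (e.g. to JSON) and publish it
    jobs.put({'type': 'thumbnail', 'image_id': image_id})

def worker():
    while True:
        job = jobs.get()
        # ... do the slow work here, outside the request path ...
        jobs.task_done()

t = threading.Thread(target=worker)
t.setDaemon(True)
t.start()

enqueue_thumbnail_job(42)
jobs.join()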

[RT] MyCouch

The below post is an edited version of a $work e-mail, re-posted here at the request of some colleagues who wanted to forward the story. My apologies if some of the bits are unclear due to lack of context. In particular, let me make clear:

  • we have had a production CouchDB setup for months that works well
  • we are planning to keep that production setup roughly intact for many more months and we are not currently planning to migrate away from CouchDB at all
  • overall we are big fans of the CouchDB project and its community and we expect great things to come out of it

Nevertheless, using pre-1.0 software based on an archaic language with rather crappy error handling can get frustrating :-)

Subject: [RT] MyCouch
From: Leo Simons 
To: Forge Engineering 

This particular RT gives one possible answer to the question “what would be a good way to make this KV debugging somewhat less frustrating?” (we have been fighting erratic response times from CouchDB under high load while replicating and compacting)

That answer is “we could probably replace CouchDB with java+mysql, and it might even be easy to do so”. And, then, “if it really is easy, that’s extra cool (and _because of_ CouchDB)”.

Why replace CouchDB?

Things we really like about CouchDB (as the backend for our KV service):

  • The architecture: HTTP/REST all the way down, MVCC, many-to-many replication, scales without bound, neat composable building blocks that make an evolvable platform.
  • Working system: It’s in production, it’s running, and it’s running pretty well.
  • Community: open source, active project, know the developers, “cool”.
  • Integrity: it hasn’t corrupted or lost any data yet, and it probably won’t anytime soon.

Things we like less:

  • Debugging: cryptic error messages, erlang stack traces, process deaths.
  • Capacity planning: many unknown and changing performance characteristics.
  • Immaturity: pre-1.0.
  • Humanware: lack of erlang development skills, lack of DBA-like skills, lack of training material (or trainers) to gain skills.
  • Tool support: JProfiler for erlang? Eclipse for erlang? Etc.
  • Map/Reduce and views: alien concept to most developers, hard to audit and manage free-form javascript from tenants, hard to use for data migrations and aggregations.
  • JSON: leads to developers storing JSON which is horribly inefficient.

Those things we don’t like about couch unfortunately aren’t going to change very quickly. For example, the effort required to train up a bunch of DBAs so they can juggle CouchDB namespaces and instances and on-disk data structures is probably rather non-trivial.

The basic idea

It is not easy to see what other document storage system out there would be a particularly good replacement. Tokyo Cabinet, Voldemort, Cassandra, … all of these are also young and immature systems with a variety of quirks. Besides, we really really like the CouchDB architecture.

So why don’t we replace CouchDB with a re-implemented CouchDB? We keep the architecture almost exactly the same, but re-implement the features we care about using technology that we know well and is in many ways much more boring. “HTTP all the way down” should mean this is possible.

We could use mysql underneath (but not use any of its built-in replication features). The java program on top would do the schema and index management, and most importantly implement the CouchDB replication and compaction functionality.

We could even keep the same deployment structure. Assuming one java server is paired with one mysql database instance, we’d end up with 4 tomcat instances on 4 ports (5984-5987) and 4 mysql services on 4 other ports (3306-3309). Use of mysqld_multi probably makes sense. Eventually we could perhaps optimize a bit more by having one tomcat process and one mysql process – it’ll make better use of memory.

Now, what is really really really cool about the CouchDB architecture and its complete HTTP-ness is that we should be able to do any actual migration one node at a time, without downtime. Moving the data across is as simple as running a replication. Combined with the fact that we’ve been carefully avoiding a lot of its features, CouchDB is probably one of the _easiest_ systems to replace :-D

Database implementation sketch

How would we implement the database? If we think of our KV data as having the form

  ns1:key1 [_rev=1-12345]: { ...}
  ns1:key2 [_rev=2-78901]: { subkey1: ..., }
  ns2:key3 [_rev=1-43210]: { subkey1: ..., subkey2: ...}

where the first integer part of the _rev is dubbed “v” and the remainder “src”, then a somewhat obvious database schema looks like this (disclaimer: schema has not been tested, do not use :-)):

CREATE TABLE namespace (
  id varchar(64) CHARACTER SET ascii COLLATE ascii_bin
      NOT NULL PRIMARY KEY,
  state enum('enabled','disabled','deleted') NOT NULL DEFAULT 'enabled'
) ENGINE=InnoDB;

CREATE TABLE {namespace}_key (
  ns varchar(64) CHARACTER SET ascii COLLATE ascii_bin
      NOT NULL,
  `key` varchar(180) CHARACTER SET ascii COLLATE ascii_bin
      NOT NULL,
  v smallint UNSIGNED NOT NULL,
  src int UNSIGNED NOT NULL,

  PRIMARY KEY (ns, `key`, v, src),
  FOREIGN KEY (ns) REFERENCES namespace(id)
) ENGINE=InnoDB;

CREATE TABLE {namespace}_value (
  ns varchar(64) CHARACTER SET ascii COLLATE ascii_bin
      NOT NULL,
  `key` varchar(180) CHARACTER SET ascii COLLATE ascii_bin
      NOT NULL,
  v smallint UNSIGNED NOT NULL,
  src int UNSIGNED NOT NULL,
  subkey varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci
      NOT NULL,
  small_value varchar(512) CHARACTER SET utf8 COLLATE utf8_general_ci
      DEFAULT NULL
      COMMENT 'will contain the value if it fits',
  large_value mediumtext CHARACTER SET utf8 COLLATE utf8_general_ci
      COMMENT 'will contain the value if it is big',

  PRIMARY KEY (ns, `key`, v, src, subkey),
  FOREIGN KEY (ns) REFERENCES namespace(id),
  FOREIGN KEY (ns, `key`, v, src)
      REFERENCES {namespace}_key(ns, `key`, v, src)
      ON DELETE CASCADE
) ENGINE=InnoDB;

With obvious queries including

  SELECT id FROM namespace WHERE state = 'enabled';

  SELECT `key` FROM {namespace}_key WHERE ns = ?;
  SELECT `key`, v, src FROM {namespace}_key WHERE ns = ?;
  SELECT v, src FROM {namespace}_key WHERE ns = ?
      AND `key` = ?;
  SELECT v, src FROM {namespace}_key WHERE ns = ?
      AND `key` = ? ORDER BY v DESC LIMIT 1;
  SELECT subkey, small_value FROM {namespace}_value
      WHERE ns = ? AND `key` = ? AND v = ? AND src = ?;
  SELECT large_value FROM {namespace}_value
      WHERE ns = ? AND `key` = ? AND v = ? AND src = ?
      AND subkey = ?;

  -- (note: CREATE/DROP TABLE cause an implicit commit in MySQL,
  --  so these BEGIN/COMMIT pairs are not truly atomic)
  BEGIN;
  CREATE TABLE {namespace}_key (...);
  CREATE TABLE {namespace}_value (...);
  INSERT INTO namespace(id) VALUES (?);
  COMMIT;

  UPDATE namespace SET state = 'disabled' WHERE id = ?;
  UPDATE namespace SET state = 'deleted' WHERE id = ?;

  BEGIN;
  DROP TABLE {namespace}_value;
  DROP TABLE {namespace}_key;
  DELETE FROM namespace WHERE id = ?;
  COMMIT;

  INSERT INTO {namespace}_key (ns,`key`,v,src)
      VALUES (?,?,?,?);
  INSERT INTO {namespace}_value (ns,`key`,v,src,subkey,small_value)
      VALUES (?,?,?,?,?,?),(?,?,?,?,?,?),(?,?,?,?,?,?),(?,?,?,?,?,?);
  INSERT INTO {namespace}_value (ns,`key`,v,src,subkey,large_value)
      VALUES (?,?,?,?,?,?);

  DELETE FROM {namespace}_key WHERE ns = ? AND `key` = ?;
  DELETE FROM {namespace}_key WHERE ns = ? AND `key` = ?
      AND v < ?;
  DELETE FROM {namespace}_key WHERE ns = ? AND `key` = ?
      AND v = ? AND src = ?;

The usefulness of {namespace}_value is debatable; it helps a lot when implementing CouchDB views or some equivalent functionality (“get me all the documents in this namespace where subkey1=…”), but if we decide not to care, then it’s redundant and {namespace}_key can grow some additional small_value (which should then be big enough to contain a typical JSON document, i.e. maybe 1k) and large_value columns instead.

Partitioning the tables by {namespace} manually isn’t needed if we use MySQL 5.1 or later; table partitions could be used instead.

I’m not sure if we should have a ‘state’ on the keys and do soft-deletes; that might make actual DELETE calls faster; it could also reduce the impact of compactions.

Webapp implementation notes

The java “CouchDB” webapp also does not seem that complicated to build (famous last words?). I would probably build it roughly the same way as [some existing internal webapps].

The basic GET/PUT/DELETE operations are straightforward mappings onto queries that are also rather straightforward.
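For instance, a GET for a single key might map onto the queries above roughly like this (a sketch, assuming MySQLdb and the schema as written; error handling and JSON serialization are left out):

import MySQLdb

def get_document(conn, ns, key):
    """Return (v, src, subkey->value dict) for the latest revision, or None."""
    cur = conn.cursor()
    cur.execute(
        "SELECT v, src FROM %s_key WHERE ns = %%s AND `key` = %%s "
        "ORDER BY v DESC LIMIT 1" % ns, (ns, key))
    row = cur.fetchone()
    if row is None:
        return None  # maps to a 404
    v, src = row
    cur.execute(
        "SELECT subkey, small_value, large_value FROM %s_value "
        "WHERE ns = %%s AND `key` = %%s AND v = %%s AND src = %%s" % ns,
        (ns, key, v, src))
    doc = dict((subkey, small if small is not None else large)
               for subkey, small, large in cur.fetchall())
    return v, src, doc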

The POST /_replicate and POST /_compact operations are of course a little bit more involved, but not that much. Assuming some kind of a pool of url fetchers and some periodic executors…

Replication:

  1. get last-seen revision number for source
  2. get list of updates from source
  3. for each update
    • INSERT key
    • if duplicate key error, ignore and don’t update values
    • REPLACE (or INSERT … ON DUPLICATE KEY UPDATE) all the values
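A rough sketch of that replication loop, assuming the source exposes a CouchDB-style _changes feed and per-document GETs over HTTP, and that we write into the schema above. The URLs, the checkpoint helpers and the value serialization are all made up for illustration:

import json
import urllib2
import MySQLdb

def replicate(conn, source_url, ns):
    """Pull changes for one namespace from source_url into our MySQL store."""
    cur = conn.cursor()
    since = load_checkpoint(conn, source_url, ns)  # hypothetical helper

    changes = json.load(urllib2.urlopen(
        "%s/%s/_changes?since=%s" % (source_url, ns, since)))
    for change in changes['results']:
        key = change['id']
        v, src = change['changes'][0]['rev'].split('-', 1)
        doc = json.load(urllib2.urlopen("%s/%s/%s" % (source_url, ns, key)))
        try:
            cur.execute(
                "INSERT INTO %s_key (ns,`key`,v,src) VALUES (%%s,%%s,%%s,%%s)" % ns,
                (ns, key, v, src))
        except MySQLdb.IntegrityError:
            continue  # we already have this revision; don't touch the values
        for subkey, value in doc.items():
            if subkey.startswith('_'):
                continue  # skip _id, _rev metadata
            cur.execute(
                "REPLACE INTO %s_value (ns,`key`,v,src,subkey,small_value) "
                "VALUES (%%s,%%s,%%s,%%s,%%s,%%s)" % ns,
                (ns, key, v, src, subkey, json.dumps(value)))
        conn.commit()

    save_checkpoint(conn, source_url, ns, changes['last_seq'])  # hypothetical helper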

Compaction:

  1. get list of namespaces
  2. for each namespace:
    • SELECT `key`, v, src FROM {namespace}_key WHERE ns = ? ORDER BY `key` ASC, v DESC, src DESC;
    • skip the first row for each key
    • if the second row for the key is the same v, conflict, don’t compact for this key
    • DELETE IGNORE FROM {namespace}_key WHERE ns = ? AND `key` = ? AND v = ? AND src = ?;
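In code, that compaction pass might look roughly like the sketch below (again against the schema above; the ON DELETE CASCADE on {namespace}_value is what cleans up the values, and a real implementation would stream and batch rather than fetchall()):

def compact(conn, ns):
    """Delete all but the newest revision of every key in one namespace."""
    cur = conn.cursor()
    cur.execute(
        "SELECT `key`, v, src FROM %s_key "
        "ORDER BY `key` ASC, v DESC, src DESC" % ns)

    current_key, newest_v, skip_key = None, None, False
    for key, v, src in cur.fetchall():
        if key != current_key:
            current_key, newest_v, skip_key = key, v, False  # newest revision: keep
            continue
        if v == newest_v:
            skip_key = True  # conflicting revisions: don't compact this key
        if skip_key:
            continue
        cur.execute(
            "DELETE IGNORE FROM %s_key WHERE ns = %%s AND `key` = %%s "
            "AND v = %%s AND src = %%s" % ns, (ns, key, v, src))
    conn.commit()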

So we need some kind of a replication record; once we have mysql available using “documents” seems awkward; let’s use a database table. We might as well have one more MySQL database on each server with a full copy of a ‘kvconfig’ database, which is replicated around (using mysql replication) to all the nodes. Might also want to migrate away from NAMESPACE_METADATA documents…though maybe not, it is nice and flexible that way.

Performance notes

In theory, the couchdb on-disk format should be much faster than innodb for writes. In practice, innodb has seen quite a few years of tuning. More importantly, in our tests on our servers raw mysql performance seems to be rather better than couchdb. Some of that is due to the extra fsyncs in couchdb, but not all of it.

In theory, the erlang OTP platform should scale out much better than something java-based. In practice, the http server inside couchdb is pretty much a standard fork design using blocking I/O. More importantly, raw tomcat can take >100k req/s on our hardware, which is much much more than our disks can do.

In theory, having the entire engine inside one process should be more efficient than java talking to mysql over TCP. In practice, I doubt this will really show up if we run java and mysql on the same box. More importantly, if this does become an issue, longer-term we may be able to “flatten the stack” by pushing the java “CouchDB” up into the service layer and merging it with the KV service, at which point java-to-mysql will be rather more efficient than java-to-couch.

In theory and in practice innodb has better indexes for the most common SELECTs/GETs so it should be a bit faster. It also is better at making use of large chunks of memory. I suspect the two most common requests (GET that returns 200, GET that returns 404) will both be faster, which incidentally are the most important for us to optimize, too.

We might worry java is slow. That’s kind-of silly :). In theory and in practice garbage collection makes software go faster. We just need to avoid doing those things that make it slow.

The overhead of ACID guarantees might be a concern. Fortunately MySQL is not _really_ a proper relational database if you don’t want it to be. We can probably set the transaction isolation level to READ UNCOMMITTED safely, and the schema design / usage pattern is such that we don’t need transactions in most places. More importantly we are keeping the eventual consistency model, with MVCC and all, on a larger scale. Any over-ACID-ness will be local to the particular node only.
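For example, relaxing the isolation level is a per-connection one-liner at connection setup time (a sketch assuming MySQLdb; whether READ UNCOMMITTED is actually safe would still need checking against each query pattern):

import MySQLdb

def connect(**kwargs):
    conn = MySQLdb.connect(**kwargs)
    cur = conn.cursor()
    # we rely on MVCC at the replication level, not on local transactions
    cur.execute("SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED")
    return conn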

Most importantly, this innodb/mysql thing is mature/boring technology that powers a lot of the biggest websites in the world. As such, you can buy books and consultancy and read countless websites about mysql/innodb/tomcat tuning. Its performance characteristics are pretty well-known and pretty predictable, and lots of people (including here at $work) can make those predictions easily.

So when are we doing this?

No no, we’re not, that’s not the point, this is just a RT! I woke up (rather early) with this idea in my head so I wrote it down to make space for other thoughts. At a minimum, I hope the above helps propagate some ideas:

  • just how well we applied REST and service-oriented architecture here and the benefits it’s giving us
  • in particular because we picked the right architecture we are not stuck with / tied to CouchDB, now or later
  • we can always re-engineer things (though we should have good enough reasons)
  • things like innodb and/or bdb (or any of the old dbs) are actually great tools with some great characteristics

Just like FriendFeed?

Bret Taylor has a good explanation of how FriendFeed built a non-relational database on top of a relational one. The approach outlined above resembles the solution they implemented rather a lot, though there are also important differences.

Getting a feel for the cost of using mysql with WSGI

So is the database the bottleneck?

Note that compared to the last post, I enabled KeepAlive, i.e. I added the -k flag to ab:

ab -k -n 10000 -c 100 http://localhost/hello.txt

I also toned down the MPM limits a bunch:

StartServers         16
MinSpareServers      16
MaxSpareServers     128
MaxClients           64
MaxRequestsPerChild   0

(Apache will now complain about reaching MaxClients, but given the db is on the same machine, things end up faster overall, this way)

And then I decided to just run ‘ab’ locally since I couldn’t really see that much difference vs running it over the network.

Then I proceeded to get myself a MySQL 5 server with a single InnoDB table inside with a single row in it that reads (key:1, value:'Hello world!'). I also set up a memcached server. Now, look at this…

What?              Requests per second    Relative performance hit    CPU usage
hello.txt          9700                   -                           httpd 2.2% per process
python string      7500                   23%                         httpd 2% per process
memcached lookup   2100                   23% * 72%                   memcached 10%, httpd 2% per process
_mysql             1400                   23% * 82%                   MySQL 77%, httpd 1.2% per process
mysql              880                    23% * 82% * 37%             MySQL 65%, httpd 4.5% per process
SQLAlchemy         700                    23% * 82% * 37% * 20%       MySQL 19%, httpd 5.5% per process

Using a mysql database backend costs 82% in req/s compared to the “serve python string from global variable” case. Totally expected of course, and we have also been taught how to deal with it (caching).

In this example, we’re not actually getting a lot out of using memcached – the memcached approach (I’m using python-memcached) is still 72% slower than having the in-process memory, though if you look at it the other way around, it is 50% faster than using the _mysql driver. Nevertheless, this shows that the mysql query cache is pretty fast, too, esp. if you have a 100% cache hit ratio :-)
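For reference, the usual shape of that caching is check-the-cache-first, fall back to the database, then populate the cache. A sketch using python-memcached and MySQLdb; the hello table and its column names are guesses at the one-row test table described above:

import MySQLdb
import memcache

mc = memcache.Client(['127.0.0.1:11211'])
db = MySQLdb.connect(user='test', db='test')

def get_value(key):
    # check the cache first...
    value = mc.get('hello:%s' % key)
    if value is None:
        # ...and fall back to the database, populating the cache on a miss
        cur = db.cursor()
        cur.execute("SELECT value FROM hello WHERE `key` = %s", (key,))
        row = cur.fetchone()
        if row is not None:
            value = row[0]
            mc.set('hello:%s' % key, value, time=60)
    return value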

Using the SQLAlchemy ORM framework makes things 20% slower, and it gobbles up a lot of CPU. We also expected that, and we know how to deal with it (hand-optimize the SQL in performance-sensitive areas, or perhaps use a faster ORM, like storm).

But, did you know that you can take a 37% performance hit just by using the MySQLdb object oriented wrappers instead of the _mysql bindings directly? Looks like some optimizations should be possible there…

Now, in “real life”, imagine a 90% hit ratio on the memcache, and for the other 10% there’s some expensive ORM processing. Further imagine a 5% performance hit in the logic to decide between memcache and ORM layer. You’d end up doing about 1850 requests per second.

This leads to Leo’s WSGI performance rule of thumb number 1: Even if your python webapp is really really really fast, it will still be about 5 times slower than apache serving up static files.

Getting a feel for the performance of mod_wsgi

Disclaimer: there’s a lot that’s wrong with my testing setup and/or methodology. Given the kind of concurrency I’m playing around with, I run into a whole bunch of operating system and TCP/IP limits. Though I’ve tweaked those a little bit (for example # of open file descriptors, TCP window size), I haven’t been particularly scientific about it. Though I did various warm-ups and system reboots, I wasn’t particularly good at keeping consistent timing or consistently checking that there were no lingering connections in TIME_WAIT or whatever.

System under test:

  • Server: Latest-gen macbook 2.4ghz intel core 2 duo
  • Test client: Mac Pro tower dual core G5
  • Switch: some crappy 1gbit whitelabel
  • Stock Apache on Mac OS X 10.5 (prefork MPM, 64bit)
  • Stock python 2.5 on Mac OS X 10.5
  • mod_wsgi 2.3
  • ApacheBench, Version 2.3, commands like ab -n 5000 -c 100 http://10.0.0.2/wsgitest

Out of the box:

  • about 2000-3000 req/s for static file
  • about 2000-2800 req/s for mod_wsgi in-process
  • about 1500-2500 req/s when using mod_wsgi in daemon mode (with errors beyond concurrency of about 250, for various settings of p, t)
  • concurrency=1000 makes ApacheBench start reporting lots of failures

Follows some MPM tuning, arriving to:

StartServers        100
MinSpareServers      10
MaxSpareServers     500
MaxClients          500
MaxRequestsPerChild   0

Results then become better (especially more stable):

  • about 5000 req/s for static file
  • With EnableMMAP off and EnableSendfile off, even concurrency=10 is already a problem for static file scenario, and req/s doesn’t go above 3000 req/s for concurrency>=5
  • about 4300 req/s for mod_wsgi in process
  • about 2700 req/s for mod_wsgi in daemon mode
  • concurrency=1000 still makes ApacheBench start reporting lots of failures

Some more data:

                    hello.txt           wsgitest        wsgitest    wsgitest    wsgitest
                                        in-process      p=2,t=10    p=2,t=100   p=20,t=10
concurrency 10      req/s:  4784
                    ms/req:    0.21

concurrency 100     req/s:  5081        4394            3026        3154        2827
                    ms/req:    0.20        0.23            0.33        0.32        0.35

concurrency 200     req/s:  5307        4449            2688        2988        2711
                    ms/req:    0.19        0.24            0.37        0.34        0.36

concurrency 500     req/s:  4885        4137            2779        3019        2738
                    ms/req:    0.21        0.24            0.36        0.33        0.36

hello.txt is a 13 byte file containing "Hello World!\n"
wsgitest is a really simple wsgi script spitting out "Hello World!\n"
concurrency is the argument to ab -c
p is number of processes for mod_wsgi daemon mode
t is number of threads for mod_wsgi daemon mode
ms/req is the mean time per request across all concurrent requests as reported by ab
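For reference, the wsgitest script itself isn’t included here; a minimal WSGI app along these lines would behave equivalently (a sketch, not the original script):

def application(environ, start_response):
    # the simplest possible WSGI app: a static response, no I/O at all
    body = 'Hello World!\n'
    start_response('200 OK', [('Content-Type', 'text/plain'),
                              ('Content-Length', str(len(body)))])
    return [body]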

Tentative conclusions:

  • With my hardware I have little chance of actually finding the limits of apache or mod_wsgi unless I spend a whole lot of time on much more carefully testing what I’m actually measuring
  • running mod_wsgi in-process is probably a good idea if you tweak the MPM for it
  • mod_wsgi on my laptop can probably easily serve over a billion requests/month after a little tuning
  • mod_wsgi on my laptop can deal with 500 “concurrent” users without errors

…so, in other words, mod_wsgi is most likely “fast enough” and most likely will not be a bottleneck if I build something with it. Not exactly surprising. But, more importantly, now I have some baseline numbers for req/s performance on my system, so that I can run some “performance smoke tests” against software I write.
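As a rough sketch of such a smoke test (the URL and the 2000 req/s threshold are made-up placeholders; it just wraps ab and parses its summary line):

import re
import subprocess

def requests_per_second(url, n=5000, c=100):
    """Run ApacheBench against url and return the req/s figure it reports."""
    out = subprocess.Popen(['ab', '-k', '-n', str(n), '-c', str(c), url],
                           stdout=subprocess.PIPE).communicate()[0]
    return float(re.search(r'Requests per second:\s+([\d.]+)', out).group(1))

if __name__ == '__main__':
    # fail loudly if we drop far below the baseline measured above
    assert requests_per_second('http://localhost/wsgitest') > 2000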

Repeatable builds and maven?

A repeatable build is a build which you can re-run multiple times using the same source and the same commands, and that then results in exactly the same build output time and time again. The capability to do repeatable builds is an important cornerstone of every mature release management setup.

Of course, lots of people use some silly, much more limited definition of a repeatable build, and are happy as long as “all tests pass”.

Getting to repeatable builds is nearly impossible for mere mortals using maven 1 (heck, maven 1 out of the box doesn’t even work anymore, since the ibiblio repository was changed in a way that causes maven 1 to break), and it is still prohibitively difficult with maven 2.

Of course, the people that do repeatable builds really well tend to create big all-encompassing solutions that are really hard to use with the tools used in real life, and they only really help you when you either do not have a gazillion dependencies, or you do the SCM for all your dependencies, too. For the average java developer, that all breaks down when you find out you can’t quite bootstrap the sun JDK very well, or are missing some other bit of important ‘source’.

Don’t get me wrong. Maven can be a very useful tool. Moreover, in practice, if you do large-scale java, you simply tend to run into maven at some point, and as a release engineer you cannot always do much about that. You must simply realize that when you’re doing release engineering based around maven, that is only sensible if you still really, really pay lots of attention to what you’re doing. Like running maven in offline mode for official builds. And wiping your local repository before you build releases. And keeping archived local repositories around with your distributions. And such and so forth.

Not paying attention or not thinking these kinds of tricky release engineering things through just isn’t very sensible, not when you’re doing so-called enterprise stuff where you might have to re-run a build 3 years after the fact. You cannot afford to count on maven to just magically do the right thing for you. Historically and typically, it doesn’t, at least not quite.

Memories from ApacheCon US 2005

Leo Simons at the ApacheCon US 2005 Lightning Lottery Talks, holding two beers and an iBook

Talk about blogging backlog. Over on my old blog, the last entry is about the day before the hackathon before apachecon. The longtime reader will know by now that after that I got a little too busy doing other things to be blogging about those things. Fortunately, other people took pictures. Shockingly, this picture of my 5 minutes of fame bashing the ASF at the lightning lottery talks is not under a creative commons license, so I don’t think I’m allowed to reproduce it here. On the other hand, I think I’m allowed to show a thumbnail…

Update: Ted promptly put his ApacheCon photos on flickr under the creative commons license. Thank you!