LSD::RELOAD

May 9, 2008

Can I connect my console to my apple cinema display?

Filed under: Tech — Leo Simons @ 12:58

No.

XBOX 360 and PS3 both supply an HDMI signal.

Apple’s cinema displays only accept a digital DVI signal.

There is no way to convert the HDMI signal from xbox into a DVI signal.

One company (forget which one) sold a special HDMI-to-DVI converter box for a bit, but they’re no longer selling it and aren’t intending to start selling something like it again.

If you’re buying an LCD computer display, buy one that has an HDMI input, or at least composite or VGA.

(figured I’d put this out here since it’s the 2nd time someone asked me, and I remember being really frustrated when I couldn’t get it to work)

April 12, 2008

App Engine is an interesting start

Filed under: Python, Tech — Leo Simons @ 14:44

So I looked at google’s App Engine today. Some thoughts:

  1. We need to know what it costs.
  2. Yay for WSGI.
  3. Yay for Apache 2 licensing.
  4. Let me deep-link to the code.
  5. Amazon provides more power. For example, queues.
  6. As others have said, the lock-in risk needs some attention. I’m sure someone will solve it soon.
  7. I still don’t believe we’ll stop distributing entity relationships. They are useful. Facebook at least provides subselects. Amazon doesn’t.

All in all, if I were to launch some new web app today, I would probably still do that on Amazon S3/EC2.

January 7, 2008

software darwinism and technology bubbles

Filed under: Tech — Leo Simons @ 19:10
  • Some developers are 10 times as productive as others. That’s one factor of 10.
  • 50% of software projects fail. That’s a factor of 2.
  • Bigger teams are less productive. Let’s assume a 16-person team is a factor of 4 less productive than a one-person team.
  • 80% of software projects are not actually aligned with the market; only 20% of software projects really build the right thing. Due to competition, the rest eventually get replaced by better software. That’s a factor of 5.

Multiplying the multipliers, taking the worst case everywhere, the productivity of your team is 400 times less than in the best case scenario.
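The multiplication is simple enough to spell out. A quick sketch in Python, using only the four factors from the list above (the dictionary keys and variable names are mine):

```python
# The four worst-case multipliers from the bullets above.
factors = {
    "developer skill": 10,
    "project failure": 2,
    "team size": 4,
    "market fit": 5,
}

# Taking the worst case everywhere means multiplying all of them together.
worst_case_penalty = 1
for factor in factors.values():
    worst_case_penalty *= factor

print(worst_case_penalty)  # 400
```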

Add a few statistics, and then it follows that one successful scenario could be to hire 99 one-person teams working on semi-random solutions, and 1 person to pick the right solution out of all of those. It’s quite likely most of those solutions are not worst-case scenarios. You would probably still be about 4 times more productive than in the worst case scenario.

The math above is easy to argue with, but it’s the general principle of having extremely skewed productivity that matters.

Applying this principle is sometimes called software darwinism, which is a term sometimes used to explain the success of open source. It’s probably also one reason why companies like google and microsoft are as consistently successful as they are.

The silver bullet in this story would be a way to somehow measure and predict which of those many teeny tiny tiger teams will end up doing the really successful project. If you are really good at betting on the right startup, and also get really really lucky, you might just get back 400 dollars for every dollar you invest. The odds are really tempting. This is why so much of the software industry is controlled by venture capitalism, and this is why we have bubbles like the dot com bubble or the web 2.0 bubble.

Software development, then, is something that attracts cowboys. I do hope that someday software engineering will consistently and provably offer better ROI than software cowboyism, but I’m afraid that’s empirically not true today. You cannot be in the software business today without accepting that you are part of an ecosystem that is full of cowboys, some of whom are quite successful.

December 11, 2007

Our choices for python web applications

Filed under: Python, Tech, Work — Leo Simons @ 21:23

So, at work, we’re doing some “next generation” versions of a bunch of our backoffice tooling. That involves producing a bunch of cute little web applications that often control not so cute and not so little processes (like transcoding and publishing and whatnot). The coarse-grained architecture pattern is pretty simple and familiar: database with information about files, jobs, tasks and metadata, some common libraries for interacting with the database, some web application middleware using those libraries, and a web server frontend serving up the middleware.

Pretty much normal bread-and-butter stuff. It’s not quite like document-based CMS work (you don’t really want to store many-gigabyte video in a JCR repo), but a lot of the technology choices are still similar.

Customizing a snake

Based on the various tech we have deployed today, and the skills of the people working on this kind of thing, we’re trying to standardize around two main server-side technologies: java and python. This post explains the choices we made for the python universe. At the moment those choices are actually not so easy, since there’s so much happening and so many projects are moving so fast. We scouted the web quite a bit to figure out what to do.

Lower layers

  • OS: ubuntu 7.10 (still some nodes on 6.10)
  • Database: MySQL 5.0.45 (comes with ubuntu, a bit reconfigured of course) with some little bits of replication
  • Python: Python 2.4.4 (2.5 is not on ubuntu 6.10 and not on all our developer workstations, but we’re testing with it and will upgrade eventually)
  • WSGI server: right now we have a slightly customized cherrypy wsgi server (so that it accepts signals, restarts itself, runs from /etc/init.d, logs in all the right places, etc) behind an apache httpd 2.2 ProxyPass, which also handles SSL/AAA. We want to try and move to mod_wsgi but first we need its mac install to suck a bit less, and so far, cherry is not quite falling over on us. If mod_wsgi doesn’t work out it’ll probably be back to twisted, probably also behind apache for SSL reasons.

Application glue

  • Database access layer: storm, our own slightly modified version. We really like storm, and every now and then we find we are pushing it a bit beyond its limits, which leads to some bits of patching (by people smarter than me!). Fortunately it seems the guys working on it are quite responsive on IRC. I expect there’ll be a few (more) patches from us that flow back upstream. I really hope someone implements support for forking out reads and writes to different nodes (like you get for free with MySQL Connector/J), either in MySQLdb or inside storm.
  • Python web glue: We’re trying to do everything completely WSGI-based, though most everything at the moment is actually inside CherryPy 3.1b1 handlers. The WSGI pattern works just fine and scales nicely enough in our tests.
  • Templating: Genshi 0.4.4 (we had to pick one, there’s a few good choices here)
  • XML bits and pieces: lxml 1.3.6. It’s the best XML support in python so far, but it still isn’t quite as good as what you get in java. All the various bits and pieces just aren’t quite as mature, and the underlying libxml2 doesn’t quite do XML schema support as well, and I also miss something like XMLBeans for python.
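In case the WSGI pattern is new to you, here is roughly what it boils down to. This is a minimal sketch using only the Python 3 standard library (so not the Python 2.4 and CherryPy setup described above, and not our actual code); `hello_app`, `HeaderMiddleware` and `call_wsgi` are names made up for illustration.

```python
from wsgiref.util import setup_testing_defaults


def hello_app(environ, start_response):
    """A WSGI application: a callable taking environ and start_response."""
    body = b"hello from a tiny WSGI app\n"
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


class HeaderMiddleware:
    """WSGI middleware: wraps another app and adds one response header."""

    def __init__(self, app, name, value):
        self.app = app
        self.name = name
        self.value = value

    def __call__(self, environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            # Pass everything through, with one extra header appended.
            headers = list(headers) + [(self.name, self.value)]
            return start_response(status, headers, exc_info)
        return self.app(environ, patched_start_response)


def call_wsgi(app):
    """Invoke a WSGI app once with a fake request; no server needed."""
    environ = {}
    setup_testing_defaults(environ)
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body


wrapped = HeaderMiddleware(hello_app, "X-Example", "demo")
status, headers, body = call_wsgi(wrapped)
print(status, headers, body)
```

The point of the pattern is that `wrapped` is itself just another WSGI callable, so any WSGI server (or another piece of middleware) can host it without knowing what is inside.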

Out of the box?

We took a look at a bunch of the web frameworks out there. We didn’t seriously consider zope, but we took a long stare at pylons, turbogears and django before deciding not to bother with them. We’re not using much of paste either. Basically we missed one or more of

  • good support for storm out of the box
  • doing everything the WSGI way
  • good and correct documentation
  • easy to scale / make efficient
  • stable core with excellent compatibility and bugfixing

And perhaps a few other things, and on balance we guessed it would be easier to roll our own and integrate components, rather than strip something else down and maintain lots of vendor branches.

Key point: standardization good

Two years ago I would’ve picked twisted without blinking and invented another fancy wheel on top of it, but I’m happy I don’t have to do that anymore. Twisted has quite a learning curve, not just for app developers, but also for the people that need to deploy and scale the beast.

Two good things happened to the python webapp world: competition and standardization. Now things are progressing rapidly.

Progress is good, but it can result in various kinds of chaos that don’t help the application developer that likes to plan ahead a bit. The new scripting language based mega frameworks seem to attract a certain kind of developer and they probably work for a certain set of use cases, but standardizing on patterns and interfaces is much more useful for (opinionated!) people like us (with subtly deviating use cases). So framework authors: please do keep working on bridging the gap between all of them by cutting ‘em down into tiny little WSGI middleware bits and pieces, and turn frameworks into libraries where you can.

November 21, 2007

cherry-picking changesets is hard but possible

Filed under: Open Source, Tech — Leo Simons @ 1:41

At work we use svn, with svnmerge for managing branches (most projects tend to use a trunk with a stable branch or two). I’m pretty much a svnmerge newbie, so every now and then I mess up one of my trees. When that happens, I’m lucky enough to have a fair share of total svn experts around to help me figure out what happened.

So here’s a little test scenario that shows what kind of merge conflict you tend to get when you cherry-pick a change to merge to stable, out of chronological order of change to the trunk, and then later you merge an older change that has a conflicting line in it:

$ cd /tmp
$ mkdir svnmergetest
$ cd svnmergetest
$ mkdir repo
$ svnadmin create /tmp/svnmergetest/repo
$ svn co file:///tmp/svnmergetest/repo co
Checked out revision 0.
$ cd co
$ svn mkdir trunk
A
$ cd trunk
$ cat > A <<END
>       line ASTART
>       line A1
>       line A2
>       line AEND
> END
$ svn add A
A         A
$ svn commit -m "Init trunk"
Adding
Adding         trunk/A
Transmitting file data .
Committed revision 1.
$ cat > A <<END
>       line ASTART
>       line A1
>       line A3
>       line AEND
> END
$ svn diff A
Index: A
===================================================================
--- A   (revision 1)
+++ A   (working copy)
@@ -1,4 +1,4 @@
       line ASTART
       line A1
-      line A2
+      line A3
       line AEND
$ svn commit -m "commit 1"
Sending        trunk/A
Transmitting file data .
Committed revision 2.
$ cat > A <<END
>       line ASTART
>       line A1
>       line A4
>       line AEND
> END
$ svn diff A
Index: A
===================================================================
--- A   (revision 2)
+++ A   (working copy)
@@ -1,4 +1,4 @@
       line ASTART
       line A1
-      line A3
+      line A4
       line AEND
$ svn commit -m "commit 2"
Sending        trunk/A
Transmitting file data .
Committed revision 3.
$ cd ..
# FWIW, I messed up commits 4 and 5; commit 6 is to un-do the mess-up
$ svn cp -m "Init stable" -r 1 file:///tmp/svnmergetest/repo/trunk \
>   file:///tmp/svnmergetest/repo/stable

Committed revision 7.
$ svn up
A    stable
A    stable/A
Updated to revision 7.
$ cd stable
$ cat A
      line ASTART
      line A1
      line A2
      line AEND
$ svnmerge.py init
property 'svnmerge-integrated' set on '.'

$ svn commit -F svnmerge-commit-message.txt
Sending

Committed revision 8.
$ rm svnmerge-commit-message.txt
$ svnmerge.py avail
2-3
$ svnmerge.py merge 3
svnmerge: "3" is not a subversion working directory
$ svnmerge.py merge -r 3
C    A

property 'svnmerge-integrated' set on '.'

$ cat A
      line ASTART
      line A1
<<<<<<< .working
      line A2
=======
      line A4
>>>>>>> .merge-right.r3
      line AEND
$ cat > A <<END
>       line ASTART
>       line A1
>       line A4
>       line AEND
> END
$ svn resolved A
Resolved conflicted state of 'A'
$ svn diff

Property changes on: .
___________________________________________________________________
Name: svnmerge-integrated
   - /trunk:1
   + /trunk:1,3

Index: A
===================================================================
--- A   (revision 7)
+++ A   (working copy)
@@ -1,4 +1,4 @@
       line ASTART
       line A1
-      line A2
+      line A4
       line AEND
$ svn commit -m "Hand-resolve merge conflict"
Sending
Sending        stable/A
Transmitting file data .
Committed revision 9.
$ svnmerge.py avail
2
$ svnmerge.py merge -r 2
C    A

property 'svnmerge-integrated' set on '.'

$ cat A
      line ASTART
      line A1
<<<<<<< .working
      line A4
=======
      line A3
>>>>>>> .merge-right.r2
      line AEND
$ cat > A <<END
>       line ASTART
>       line A1
>       line A4
>       line AEND
> END
$ svn resolved A
Resolved conflicted state of 'A'
$ svn diff

Property changes on: .
___________________________________________________________________
Name: svnmerge-integrated
   - /trunk:1,3
   + /trunk:1-3
$ svn commit -m "Hand-resolve merge conflict"
Sending

Committed revision 10.

Justin tells me there’s very little in terms of handy fancy dandy software tooling that magically does the ‘right thing’ here without a human’s intervention. The main reason for that is that there is, fundamentally, no ‘right thing’.

So it’s possible to resolve this cleanly, and svn + svnmerge make it pretty clear what is going on, but it’s still a bit of work to figure out what to do. To stay out of trouble, it’s a safe bet that you should limit cherry picking as much as possible, and try and do merges between branches in chronological order whenever you can.
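The bookkeeping in the transcript above (svnmerge-integrated going from /trunk:1 to /trunk:1,3 to /trunk:1-3) can be modelled in a few lines. This is only a toy model in Python of how the revision lists behave, assuming trunk consists of exactly revisions 1 to 3; it is not how svnmerge.py is implemented, and `parse_revs` and `format_revs` are hypothetical helpers.

```python
def parse_revs(spec):
    """Expand a revision list such as "1,3" or "1-3" into a set of ints."""
    revs = set()
    for part in spec.split(","):
        if "-" in part:
            low, high = part.split("-")
            revs.update(range(int(low), int(high) + 1))
        else:
            revs.add(int(part))
    return revs


def format_revs(revs):
    """Collapse a set of ints back into the compact "1-3" style."""
    ordered = sorted(revs)
    if not ordered:
        return ""
    parts = []
    start = prev = ordered[0]
    for rev in ordered[1:]:
        if rev == prev + 1:
            prev = rev
            continue
        parts.append(str(start) if start == prev else f"{start}-{prev}")
        start = prev = rev
    parts.append(str(start) if start == prev else f"{start}-{prev}")
    return ",".join(parts)


trunk = {1, 2, 3}                      # assumption: trunk is just r1..r3
integrated = parse_revs("1")           # state right after 'svnmerge.py init'

integrated.add(3)                      # cherry-pick r3 out of order
after_cherry_pick = format_revs(integrated)
available = format_revs(trunk - integrated)

integrated.add(2)                      # later, merge the older r2
after_catch_up = format_revs(integrated)

print(after_cherry_pick, available, after_catch_up)
```

The gap in "1,3" is exactly the cherry-pick: r2 is still available, and because it touches the same line as r3, merging it later is where the second conflict comes from.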

March 25, 2007

Review of Google Apps

Filed under: Moved from old blog, Tech — Leo Simons @ 14:00

If you e-mail lsimons A jicarilla dot nl (which no-one ever does), the e-mail no longer ends up in my inbox, it ends up at gmail.jicarilla.nl. Encouraged by a colleague, I’ve been giving Google Apps a try.

Setting up

They have this figured out amazingly well. It definitely made me go “wow”. And again, “wow”. If you own a domain, and you have access to the DNS settings for that domain, they guide you, step by step, to making their different services available under that domain, including using it for e-mail.

I had it all done within an hour, just clicking around and following some instructions (most of them, I didn’t read). It just works so much better than other managed hosting solutions I’ve tried in the past (like things using cPanel), it’s amazing.

It was a little annoying that it kept forgetting that I, while being Dutch and living in The Netherlands, like to have all my user interfaces in English. I had to change that behavior in every individual app, and when I changed my location information, some of it flipped back to Dutch.

GMail

The free account is 2GB, which is not enough for me. The paid account is 10GB, which is still not enough for me. So migrating is not even an option, even if GMail was deemed secure enough by the powers that be @ work.

Things I like

  • the ‘conversation’ model where you can manage whole conversations (wouldn’t work for my work e-mail though, where many people just hijack threads all the time)
  • the keyboard shortcuts
  • the gtalk integration
  • how editing of an e-mail starts inline by default, and you can break it out later

Things I don’t like

  • The unnecessary information and hints like, “visit settings to save time with keyboard shortcuts”. I’m a geek, the first thing I visited was the settings.
  • The primitive filtering. I am used to having all the power of procmail available to me (filtering by mailing list headers that ezmlm adds, for example). I can’t ever see this working.
  • Tags. I’m used to folders, and they are a vital part of what helps me mentally “switch hats” between job/open source/personal tech interest/private life. I don’t see how I could do that with tags.
  • No offline mode, or at least it’s not that clear to me how to get that.

All in all, it’s just very different, and I’m not actually looking for different. I suspect large parts of the world feel similarly and just want to keep their basic three-pane interface. Of course, there’s not that many people that process as much e-mail (and as weirdly) as I do, so I can understand how the GMail interface is not exactly optimized for people like me. I definitely would recommend it to, for example, my parents, if they hadn’t been using Outlook Express (and before that, Netscape Mail) for years and didn’t just want to keep using such an interface.

Google Calendar

Day view, week view, month view. Snappy interface. Search. Multiple calendars distinguished by color. iCal support. Creating an event in the right spot takes just a few seconds. I like.

What I miss in the calendaring interface is the GTalk mini-window that GMail does have.

What would truly make this a killer app for me is if I could sync from Apple’s iCal to it, instead of just subscribe to Google Calendar from iCal. The reason for that is that I often jot things down while offline, and I’d want those automatically appearing (or at the click of a button) on google as soon as I had a network connection again.

Oh, and then of course I’d need Apple’s iCal to support SSL.

I would definitely recommend google calendar to anyone that needs a calendaring solution, especially if you want to use shared calendaring, and you don’t need to edit your calendar while offline.

Google Docs and spreadsheets

It works, and it autosaves your work, and it has revisions. I could never actually use this (since I need to work with all these big technical and/or legal MS Word documents which have comments, master pages, revision tracking, etc), but it is otherwise really nice for simpler things. Especially the shared editing.

This would be one of my favorite apps on the web, and I would probably actually use it the next time I need some loose planning for something (like a party or a conference call or whatever) if it weren’t for the existence of Dabble DB. Google’s solution works, but dabble is magic.

Integration between pieces

While everything except the page creator-generated website is available over SSL, the links between the pieces (i.e. when you’re in google calendar and you click on “gmail”) keep redirecting back to plain HTTP. In general, security seems, ehm, not paranoid enough for people like me.

While, wherever you are, there’s always a link to the other apps within reach in a consistent location, there’s still a highly modal way of working, where you have to mentally switch from writing an e-mail to writing a document. You can’t actually send a google docs document as an e-mail easily; you have to export it, save it locally, switch to gmail, create an e-mail, navigate to the file, upload the attachment, type the message, and send it. Hpff. Similarly, you can’t select a document in google docs and use “send to” (windows) or drag-and-drop (mac) to create an e-mail out of it.

This is really where either the microsoft solution (outlook, office, windows explorer integration, “start > my recent documents”) or a geeky solution (like me…mutt, SubEthaEdit, spotlight, locate/find/ls, office available when I need it) offers stellar productivity, and where trying to do it “in the web browser” still completely breaks down.

Managedness

Think of all the things you get for free in this solution! High availability (from any computer!), good backups (or so I assume), incredible search and filtering capabilities, real web-based collaborative editing. I think what it removes from the average computer user (or IT administrator) is the need to worry about many of these things. Once you get used to it, it works. And google has a proven record of keeping things working.

It’s just that I already have these things…high availability (yay, svn), good backups (hurray for backuppc), great search (spotlight, mairix), better filtering (procmail, unix command line) and more mature collaborative editing (again svn, and SubEthaEdit)…and I can trust in myself instead of placing trust in a US company (with the US having laws I am not always a fan of).

Page Creator

It seems to work ok enough, and is easy enough to understand. It produces HTML 4.01 Transitional which doesn’t validate. I didn’t spend much time with it; I prefer writing my XHTML-compliant HTML by hand…

Google Talk

We pretty much standardize on skype at work, and my non-geek friends pretty much standardize on MSN these days. Until they integrate GTalk with those closed silos, it is not that useful to me.

Start Page

Yawn. I’ve never seen one of these that I liked. I tried to use it, but quickly just dragged the service URLs into my firefox bookmarks toolbar.

Conclusion

Google’s current offering is easy to set up and easy to use for certain distinct, well-defined tasks. Yet, it is not secure enough for the paranoid (even if you trust google with your data). Nor does it offer productivity resembling anything a modern-day knowledge worker is used to.

As far as web-based solutions go…I like the e-mail offering from Yahoo! better. I like Dabble DB better for collaborative editing. Yet, google currently has the best calendaring solution I’ve seen, and is seemingly the farthest along with offering integration between all the different pieces.

The big plus to this offering, and google is getting this right in an amazingly intuitive way, is that it simplifies administration dramatically. For any small business IT administrator that is struggling to get MS Exchange and SharePoint up and running, going with the google setup will definitely be a breath of fresh air.

I won’t be switching to actual use of Google Apps any time soon, but I would definitely recommend that small businesses, schools, families, and individuals evaluate it and try it out. It is simple, it is intuitive, and it is free (or really cheap at $50/user/year if you want paid support and more storage).

Google Apps is serious competition for microsoft, even if perhaps not yet for the enterprise. Even if it doesn’t take off in a big way, this new solution will help to break the microsoft monopoly, and will get microsoft to produce better products at a lower price. Yay!

March 24, 2007

Replace maven with a shell script

Filed under: Java, Moved from old blog, Open Source, Tech — Leo Simons @ 16:29

One of the things I find myself trying to instill in a lot of our developers these days is that a little pragmatism can often go a long way.

By popular demand (really!), here’s my trivial shell script that pretends to be maven. For smallish projects and small sizes of your local maven repository, it is orders of magnitude faster than doing an actual maven run, and it has many other advantages over the “real” maven.

Of course, I don’t actually use this script (much). Lately I’ve been using Ant 1.7 with Ivy. Oh, and mod_perl’s Apache::Test for TripleSoup.

#!/usr/bin/env bash

# Pull the project metadata out of the maven 1 project.xml
artifactId=`xmllint --noblanks project.xml |
        egrep -o '<id>[^>]+<\/id>' |
        sed -e 's/<id>//' -e 's/<\/id>//'`
groupId=`xmllint --noblanks project.xml |
        egrep -o '<groupId>[^>]+<\/groupId>' |
        sed -e 's/<groupId>//' -e 's/<\/groupId>//'`
currentVersion=`xmllint --noblanks project.xml |
        egrep -o '<currentVersion>[^>]+<\/currentVersion>' |
        sed -e 's/<currentVersion>//' -e 's/<\/currentVersion>//'`
shortDescription=`xmllint --noblanks project.xml |
        egrep -o '<shortDescription>[^>]+<\/shortDescription>' |
        sed -e 's/<shortDescription>//' \
            -e 's/<\/shortDescription>//'`
package=`xmllint --noblanks project.xml |
        egrep -o '<package>[^>]+<\/package>' |
        sed -e 's/<package>//' \
            -e 's/<\/package>//'`
organization=`xmllint --noblanks project.xml |
        grep -A5 '<organization>' |
        egrep -o '<name>[^>]+<\/name>' |
        sed -e 's/<name>//' \
            -e 's/<\/name>//'`

# Put every jar in the local maven repository on the classpath
for jar in `find $HOME/.maven/repository -name "*.jar"`; do
    CLASSPATH=$CLASSPATH:$jar
done
CLASSPATH=`pwd`/target/classes:`pwd`/target/test-classes:$CLASSPATH
export CLASSPATH

echo Building $artifactId-$currentVersion.jar...

# Start from a clean target directory
rm -Rf target
mkdir -p target/classes
mkdir -p target/test-classes

# Compile the sources, then copy over the non-java resources
cd src/java
javac -nowarn -Xlint:-deprecation -source 1.4 -target 1.4 \
        -d ../../target/classes \
        `find . -name '*.java'`
for dir in `find . -type d -not -path '*svn*'`; do
    mkdir -p ../../target/classes/$dir
done
cp -r `find . -type f -not -name '*.java' -not -path '*svn*'` \
        ../../target/classes
cd ../..

# Add license files and a manifest
mkdir -p target/classes/META-INF
cp -f LICENSE* NOTICE* target/classes/META-INF 2>/dev/null
cat > target/classes/META-INF/MANIFEST.MF <<MFEND
Manifest-Version: 1.0
Created-By: Apache Maven Simulator 1.0
Extension-Name: $artifactId
Specification-Title: $shortDescription
Specification-Vendor: $organization
Specification-Version: $currentVersion
Implementation-Vendor: $organization
Implementation-Title: $package
Implementation-Version: $currentVersion
MFEND

# Package the jar
cd target/classes
jar cf ../$artifactId-$currentVersion.jar *
cd ../..

echo Installing $artifactId-$currentVersion.jar...

# "Install" the jar into the local maven repository
mkdir -p $HOME/.maven/repository/$groupId/jars
cp target/$artifactId-$currentVersion.jar \
        $HOME/.maven/repository/$groupId/jars

echo done
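Scraping XML with egrep and sed is about as pragmatic as it gets. For comparison, here is a hedged sketch of the same metadata lookup with a real XML parser, in Python. The sample project.xml content is invented for the example, `read_project` is a hypothetical helper, and the sketch assumes the file uses no XML namespace.

```python
import xml.etree.ElementTree as ET

# Invented sample; a real project.xml has many more elements.
SAMPLE_PROJECT_XML = """<project>
  <id>example</id>
  <groupId>org.example</groupId>
  <currentVersion>1.0</currentVersion>
  <shortDescription>Example project</shortDescription>
  <package>org.example</package>
  <organization>
    <name>Example Org</name>
  </organization>
</project>"""


def read_project(xml_text):
    """Return the fields the shell script extracts, as a dict."""
    root = ET.fromstring(xml_text)

    def text(path):
        # Missing or empty elements become an empty string.
        node = root.find(path)
        return node.text.strip() if node is not None and node.text else ""

    return {
        "artifactId": text("id"),
        "groupId": text("groupId"),
        "currentVersion": text("currentVersion"),
        "shortDescription": text("shortDescription"),
        "package": text("package"),
        "organization": text("organization/name"),
    }


info = read_project(SAMPLE_PROJECT_XML)
jar_name = "{artifactId}-{currentVersion}.jar".format(**info)
print(jar_name)
```

Unlike the grep -A5 trick, looking up organization/name by path does not depend on how the file happens to be laid out in lines.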

March 18, 2007

RDF modelling at Joost: no bnodes

Filed under: Moved from old blog, Tech, Work — Leo Simons @ 16:24

As I mentioned in a previous entry, Joost™ uses quite a bit of RDF. I’m sorry, but I’m not going to share our full data model with you (though we might do that in the future). All I want to try and do is highlight some basic choices that we (mostly Alberto) have made on how to model things using RDF.

Choice number one:

No bnodes

A blank node (definition) or “bnode” for short is when you have a subject in RDF that doesn’t have a ‘real’ URI.

Where do you use bnodes?

You encounter bnodes when modelling things in a ‘normal’ object-oriented fashion, and especially a lot in ‘normal’ modern XML. For example, the XML document

Listing 1
(the XML listing is missing from this copy of the post)

might be turned into RDF as

Listing 2
@prefix :  .

leo      isA               Person ;
         name              "Leo Simons" .

alberto  isA               Person ;
         name              "Alberto Reggiori" .

foo      isA               Group ;
         name              "Mentioned in article" ;
         containsPeople    ( leo alberto ) .

which is a special Notation3 (or Turtle) shorthand for

Listing 3
@prefix :  .

leo      isA               Person ;
         name              "Leo Simons" .

alberto  isA               Person ;
         name              "Alberto Reggiori" .

foo      isA               Group ;
         name              "Mentioned in article" ;
         containsPeople    _:1 .

_:1      rdf:first         leo ;
         rdf:rest          _:2 .

_:2      rdf:first         alberto ;
         rdf:rest          rdf:nil .

_:1 and _:2 are bnodes. Doesn’t seem to be a problem with this, does there? (aside from rdf collections being cumbersome)

How does it look without bnodes?

Well, consider this alternative:

Listing 4
@prefix :  .

leo      isA               Person ;
         name              "Leo Simons" .

alberto  isA               Person ;
         name              "Alberto Reggiori" .

foo      isA               Group ;
         name              "Mentioned in article" ;
         containsPerson    leo ;
         containsPerson    alberto .

It consists of fewer triples, obviously meaning less storage space, and, given the nature of RDF databases today, also better performance. As a data model grows in complexity, it seems that the percentage of bnodes will normally grow a bit as well, so the effect is more pronounced for lots of data.
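To put a number on it, here is a small sketch in Python that writes out both models as plain (subject, predicate, object) tuples and counts them. It is a toy representation rather than a real RDF library, the variable names are mine, and it uses rdf:rest, the standard name for the property that links one list cell to the next.

```python
# Triples that both models share.
shared = [
    ("leo", "isA", "Person"),
    ("leo", "name", "Leo Simons"),
    ("alberto", "isA", "Person"),
    ("alberto", "name", "Alberto Reggiori"),
    ("foo", "isA", "Group"),
    ("foo", "name", "Mentioned in article"),
]

# Listing 3: membership through an RDF collection, which needs bnodes.
with_bnodes = shared + [
    ("foo", "containsPeople", "_:1"),
    ("_:1", "rdf:first", "leo"),
    ("_:1", "rdf:rest", "_:2"),
    ("_:2", "rdf:first", "alberto"),
    ("_:2", "rdf:rest", "rdf:nil"),
]

# Listing 4: membership through a plain one-to-many property.
without_bnodes = shared + [
    ("foo", "containsPerson", "leo"),
    ("foo", "containsPerson", "alberto"),
]

print(len(with_bnodes), len(without_bnodes))  # 11 8
```

For a group with n members, the collection costs 1 + 2n membership triples where the one-to-many property costs n.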

But, more importantly, the software you have to write becomes more involved. Let’s investigate.

The effect of bnode use on source code

Here’s some imaginary java code (using jena) that prints certain data it finds in the model:

Listing 5

// Example is the (imaginary) vocabulary class for the model above.

private String[] getName(Model m, Resource r) {
  List<String> names = new ArrayList<String>();
  for(RDFNode nameNode : m.listObjectsOfProperty(r, Example.name).toList()) {
    if(!nameNode.isLiteral()) {
      continue;
    }
    Literal nameLiteral = (Literal)nameNode.as(Literal.class);
    try {
      names.add(nameLiteral.getString());
    } catch(DatatypeFormatException e) {
      // skip names that are not valid for their datatype
    }
  }
  return names.toArray(new String[names.size()]);
}

private boolean hasType(Model m, Resource toCheck, Resource expectedType) {
  return m.contains(toCheck, Example.isA, expectedType);
}

private void printHeader(Model m, Resource groupResource) {
  String[] groupName = getName(m, groupResource);
  if(groupName.length > 0) {
    for(String name : groupName) {
      System.out.println("Group name: " + name);
    }
  } else {
    System.out.println("Group name: ");
  }
  System.out.println("----");
}

private void printName(Model m, Resource personResource) {
  String[] personName = getName(m, personResource);
  if(personName.length > 0) {
    for(String name : personName) {
      System.out.println("  " + name);
    }
  } else {
    System.out.println("  ");
  }
}

public void printInfoAboutGroup(Model m, URI groupId) {
  ValidationUtil.checkNotNull(m, "m");
  ValidationUtil.checkNotNull(groupId, "groupId");

  Resource groupResource = m.getResource(groupId.toString());
  if(groupResource == null) {
    System.err.println("Warning: no such group: " + groupId.toString());
    return;
  }

  if(!hasType(m, groupResource, Example.Group)) {
    System.err.println("Warning: not typed as a group: " + groupId.toString());
  }

  printHeader(m, groupResource);

  //
  // NOTE: for loop in a for loop in a for loop
  //
  for(RDFNode groupHead : m.listObjectsOfProperty(groupResource, Example.containsPeople).toList()) {
    if(!groupHead.canAs(Container.class)) {
      System.err.println("Warning: Group "+groupId.toString()+" containsPeople points to a non-container");
      continue;
    }
    Container container = (Container)groupHead.as(Container.class);

    for(RDFNode peopleNode : container.iterator().toList()) {
      if(!peopleNode.isResource()) {
        System.err.println("Warning: Group "+groupId.toString()+" containsPeople holds a literal: " +
          ((Literal)peopleNode.as(Literal.class)).getLexicalForm());
        continue;
      }
      Resource peopleResource = (Resource)peopleNode.as(Resource.class);
      if(!hasType(m, peopleResource, Example.Person)) {
        String identifier = (peopleResource.isAnon())?
            peopleResource.getId().getLabelString() :
            peopleResource.getURI();
        System.err.println("Warning: not typed as a person: " + identifier);
      }
      printName(m, peopleResource);
    }
  }
}

...

Here’s the printInfoAboutGroup() method again, now for the RDF model structure from listing 4.

Listing 6

I suspect that, if you haven’t seen RDF-inspecting source code before, all of the above looks a little scary. There’s a load of looping and checking that you don’t have to take into account when using simple javabeans. This is the price to pay for the open world assumption, though of course a lot of it can be abstracted out in utility code a lot better than I’ve done above.

However, in the midst of all that java fluff, the difference should still be clear — Listing 6 has one nested for loop less than Listing 5. No matter how much you clean up this code, that fundamental difference remains, and, because of the open world assumption, it is rather more important…compare…
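The same difference shows up in a stripped-down form. The sketch below is Python over plain tuples rather than java and jena, so it is only an illustration of the shape of the traversal; `objects`, `names_with_bnodes` and `names_without_bnodes` are hypothetical helpers, and the data mirrors the two models from the earlier listings.

```python
PEOPLE = [
    ("leo", "name", "Leo Simons"),
    ("alberto", "name", "Alberto Reggiori"),
]

# Group membership modelled as an RDF collection (with bnodes).
LIST_MODEL = PEOPLE + [
    ("foo", "containsPeople", "_:1"),
    ("_:1", "rdf:first", "leo"),
    ("_:1", "rdf:rest", "_:2"),
    ("_:2", "rdf:first", "alberto"),
    ("_:2", "rdf:rest", "rdf:nil"),
]

# Group membership modelled as a one-to-many property (no bnodes).
FLAT_MODEL = PEOPLE + [
    ("foo", "containsPerson", "leo"),
    ("foo", "containsPerson", "alberto"),
]


def objects(triples, subject, predicate):
    """All objects for a subject and predicate; there may be zero or many."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def names_with_bnodes(triples, group):
    names = []
    for head in objects(triples, group, "containsPeople"):      # every list head
        node = head
        while node != "rdf:nil":                                # every list cell
            for person in objects(triples, node, "rdf:first"):  # every member
                names.extend(objects(triples, person, "name"))
            rest = objects(triples, node, "rdf:rest")
            node = rest[0] if rest else "rdf:nil"
    return names


def names_without_bnodes(triples, group):
    names = []
    for person in objects(triples, group, "containsPerson"):    # every member
        names.extend(objects(triples, person, "name"))
    return names


print(names_with_bnodes(LIST_MODEL, "foo"))
print(names_without_bnodes(FLAT_MODEL, "foo"))
```

Both functions return the same two names, but the first one has to walk the list structure, and under the open world assumption every step of that walk can turn up zero or several answers.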

Conclusion

Because of the open-world assumption, making use of bnodes is very expensive when doing real-world software development. Therefore, bnodes should be avoided. Compare:

  • Object oriented world: foo.bar.getXyz() vs. foo.getBarXyz().
  • XML world: foo.getElementsByTagName("bar").getAttribute("xyz") vs. foo.getAttribute("barXyz")
  • RDF world: for(foo) { for(bar) { for(xyz) { ... }}} vs. for(foo) { for(barXyz) { ... }}.

You can forget all of the above, just remember these rules:

  • Don’t use RDF collections. Use one-to-many properties that result in “collections” instead.
  • If you need ordering, define the sorting algorithm instead of putting the ordering in your data.
  • If you have (sort-of) one-to-one relationships in your model, and one or both sides of the relationship is identified by a bnode, merge the concepts into one and distinguish using properties.

I upgraded to SVN 1.4

Filed under: Moved from old blog, Tech — Leo Simons @ 13:20

What should’ve been simple (“port install subversion”) took about two months. The reason is that the SVN working copy format changed in a backwards-incompatible fashion. The SVN team has always warned that the format would break at some point, but many people just ignored those warnings (mainly because, with early subversion releases, there wasn’t exactly a convenient way to get the same information any other way). So…

  • I had to buy an upgrade for IntelliJ IDEA, from version 5 to version 6, to use my IDE with SVN
    • (because IDEA uses SVNKit, a pure java replacement for the SVN client, written by a different developer team)
  • I had to fiddle for hours to get the security settings just right. I still haven’t found out exactly what broke; it has something to do with our certificate-based setup at work. I made sure to tell the SVN developers we employ around here, since I don’t dare share details of our security measures, and I hope one of them finds time to fix it.
  • I had to upgrade my custom blog software, xblog, since it, like SVNKit, also parsed the contents of .svn/entries. Here’s the patch.

Needless to say, I got consistently more unhappy during all of this. What I hated the most was having to fork out a bunch of money (of course, the company paid, but I’d rather see that cash go to the soccer table so badly desired by some) just to keep my trusted developer toolset working.

It’s also rather painful to note that the XML support in the subversion client still isn’t quite what you would expect out of a ‘normal’ XML tool. For example:

$ svn info --xml foo

<?xml version="1.0"?>

<info>

foo:  (Not a versioned resource)</info>

$ echo $?

0

$ # note how the error message does go to stderr properly

$ svn info --xml foo 2>/dev/null

<?xml version="1.0"?>

<info>

</info>

It would be much better if, in this case, --xml returned some kind of <error>, or at least set the exit code to something that readily lets you see there might be a problem. It doesn’t really matter for my little xblog script, but for a tool like gump, it seems likely to cause problems one way or another.
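Until the client behaves like that, a calling script can work around it by treating any output on stderr as a failure. The following is only a minimal sketch under that assumption: run_strict is a hypothetical helper name, not something svn ships, and it assumes the client keeps writing its error message to stderr as in the transcript above.

```shell
# run_strict: hypothetical helper, not part of svn.
# Runs the given command, passes its stdout through untouched, and
# returns non-zero if the command failed OR wrote anything to stderr.
run_strict() {
  errfile=$(mktemp) || return 1
  status=0
  "$@" 2>"$errfile" || status=$?
  if [ -s "$errfile" ]; then
    # replay the captured error output so the caller still sees it
    cat "$errfile" >&2
    [ "$status" -eq 0 ] && status=1
  fi
  rm -f "$errfile"
  return "$status"
}

# Example (assumes an svn client on the PATH):
#   run_strict svn info --xml foo > info.xml || echo "svn info failed" >&2
```

The trade-off is that harmless warnings on stderr also count as failures, which may be too strict for some tools.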

Make Rocks (too)

Filed under: Moved from old blog, Tech — Leo Simons @ 12:49

William writes:

Nothing made this more clear than working with Rake, Make, and Ant—all in the same day. Make is ridiculous, Ant is reasonable, and Rake rocks.

He has a nice example of how to do flash-y things without flash and how to use rake to do cool things. However, I think it doesn’t justify his “Make is ridiculous” statement at all. Here’s a solution to his problem using make and bash (might be proper sh, who knows):

Makefile

IMAGES:=$(wildcard resources/images/*.jpg) $(wildcard resources/images/*.png)

resources/images.xml: $(IMAGES)

	bash resources/images.xml.sh $(IMAGES) > resources/images.xml

images.xml.settings.sh

SWF_VERSION=7

SWF_WIDTH=450

SWF_HEIGHT=550

SWF_BACKGROUND="#ffffff"

SWF_FRAMERATE=24

images.xml.sh

# load settings

. images.xml.settings.sh

# header

cat << END

<?xml version="1.0" encoding="iso-8859-1"?>

<movie version="$SWF_VERSION"

       width="$SWF_WIDTH"

       height="$SWF_HEIGHT"

       framerate="$SWF_FRAMERATE">

  <background color="$SWF_BACKGROUND"/>

  <frame>

    <library>

END

# line for each clip

for fname in "$@"; do

  name=`basename "$fname" | sed -r 's/\.(png|jpg)$//'`

  echo "      <clip id=\"$name\" import=\"$fname\"/>"

done

# footer

cat << END

    </library>

  </frame>

</movie>

END

I haven’t tested it anywhere but on my laptop, but I’m reasonably confident this setup will work by default on just about all linux/unix/mac os x machines out there, including ones from 10 years ago. It also doesn’t require one to learn a new language (ruby) or a new domain-specific language (rake) if you’re an “old fart”, integrates easily with most existing build systems one can imagine, and has about the same number of lines of code.
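One caveat on portability: the -r flag passed to sed in images.xml.sh is, as far as I know, a GNU sed option rather than something every sed supports. If that turns out to be a problem on some machine, the clip name can be derived with plain shell parameter expansion instead. A minimal sketch, where clip_name is a hypothetical helper name that is not part of the original script:

```shell
# clip_name: hypothetical helper that strips the directory and a
# trailing .png or .jpg extension using only POSIX shell features.
clip_name() {
  name=${1##*/}                      # drop everything up to the last slash
  case $name in
    *.png) name=${name%.png} ;;      # drop a trailing .png
    *.jpg) name=${name%.jpg} ;;      # drop a trailing .jpg
  esac
  printf '%s\n' "$name"
}

clip_name "resources/images/my photo.jpg"   # prints: my photo
```

Inside the loop this would be used as name=$(clip_name "$fname"), which also avoids starting a basename and a sed process for every image.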
