Java Is Now Officially Fast
I don’t use our product outside of debug mode often enough, apparently. Having played around with the wiki system I mentioned previously (you do read your planets from bottom to top, right?), I suddenly noticed that the progress bar our editor applet shows while it’s starting up wasn’t displaying. It turns out the applet was loaded and ready instantly, so the screen wasn’t getting a chance to repaint. Awesome! This, by the way, is only on a 1.4GHz AMD machine with 512MB of RAM, so it’s about average for corporate desktops these days, maybe a little above, and it is running the Java 1.5 beta, which provided some massive improvements in start-up time. Still, our ActiveX-based editor doesn’t load this fast. Even better, I’m using an old version of the applet since I didn’t have a recent copy on my laptop when I came home tonight and couldn’t be bothered downloading a new version. It should be faster still with the performance improvements we’ve put in.
I Love Regex, I Hate Regex
I’ve been playing around with writing a mini-wiki that uses the full complement of HTML as its syntax (instead of forcing me to learn yet another markup language) and uses EditLive! for Java as the editor – eating one’s own dog food and all that. Frankly, that’s the way a wiki should work: no messing around with markup at all, just simple, easy-to-use WYSIWYG editing. Anyway, I wrote the back end in PHP since we don’t have any PHP examples in our SDK and I couldn’t be bothered working out why Perl refused to install the MySQL drivers. Loading and saving from the database is simple enough, and I settled in to make the CamelCase words hyperlinks. The obvious answer: regex. The obvious problem: working out which regular expression to use (I don’t use regex often since I usually live in the land of custom automatons instead). I’ve wound up with:
The Default Namespace
Byron complains about what he calls a limitation of XPath. It’s not actually a limitation of XPath at all but rather a very common mistake people make when working with XML namespaces. Let’s take a tour into the dark depths of XML namespaces to discover what’s really going on. Originally, XML didn’t have namespaces at all; every element was identified purely by its name. So the element <html> was known as html and the world was simple. Then people discovered that they wanted to combine XML documents and that quite often they’d wind up with two elements called html that had completely different meanings and uses – i.e. they were actually two different elements. To solve this problem, the clever folk over at the W3C changed the way XML elements and attributes were named. Now, instead of a simple string, elements would be named using a QName (short for Qualified Name). A QName is a compound data object consisting of a URI and the regular name we’re used to. The important thing to note in this distinction is that it is impossible to refer to an element only by its local-name; you have to use a QName to refer to it, and that QName must have a namespace attribute (all QNames do). It is, however, possible to assign nil to the namespace attribute of a QName, in which case the element is said to be in the nil namespace. Now, one of the important things about adding namespaces to XML was backwards compatibility, so there had to be some way to assign a namespace to all those elements which were previously referred to only by their local-name. This is where the default namespace comes in (it also happens to be quite convenient). In an XML document, any element that doesn’t have a prefix to declare which namespace it is in is assigned the “default namespace”. The default namespace, however, isn’t an actual namespace; it’s just a default value for the namespace attribute. Unless it is changed, the default namespace is the nil namespace.
Now, in XML you can specify what the default namespace is by adding an xmlns attribute, e.g. xmlns="http://www.w3.org/1999/xhtml". The new value for the default namespace is then in effect for that element and any element under it (unless it’s changed again). It is important to note here that an XML document doesn’t have a default namespace; rather, each element has a value which it inherits (attributes never ever use the default namespace). There can therefore be as many different values for the default namespace as there are elements. So if we were to add Byron’s idea of matching any element in the default namespace, it would never match anything, because every element (and attribute) would have an explicit namespace. Worse still, the default namespace would change depending on which element we were in and what the specific representation of the XML was (maybe the default namespace was left as nil and every element used a prefix, or maybe no elements were prefixed and the default namespace was changed all through the document). Depending on the representation of XML instead of the actual data it represents is very bad practice and will cause problems. This leads right into the other common mistake in Byron’s post: //*[name() = 'foo'] will do nothing particularly useful. What it will do is match any element with a local-name of foo, in any namespace, as long as the element was represented without a prefix. It will not match <my:foo xmlns:my="http://www.intencha.com/my" /> because name() for my:foo will return my:foo. Elements are the same if they have the same QName, regardless of whether or not a prefix was used or whether different prefixes were used.
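As a rough check of the name() behaviour described above, here is a minimal sketch using the XPath engine that ships with the JDK. The class name, the sample document and the http://example.com/a namespace are made up for illustration; only the my:foo element comes from the text above. It counts how many of three foo elements (one in the nil namespace, one unprefixed under a changed default namespace, one prefixed) are matched by a name() test and by a local-name() test.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class NameVersusLocalName {
    // Hypothetical sample document: three elements whose local-name is foo.
    static final String XML =
        "<root>"
        + "<foo/>"                                             // nil namespace, no prefix
        + "<foo xmlns=\"http://example.com/a\"/>"              // changed default namespace, no prefix
        + "<my:foo xmlns:my=\"http://www.intencha.com/my\"/>"  // prefixed
        + "</root>";

    // Returns how many nodes in the sample document the given XPath selects.
    static int count(String xpath) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // without this the parser ignores namespaces
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            Double n = (Double) XPathFactory.newInstance().newXPath()
                .evaluate("count(" + xpath + ")", doc, XPathConstants.NUMBER);
            return n.intValue();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // name() compares against the name as written, prefix included.
        System.out.println(count("//*[name() = 'foo']"));
        // local-name() ignores the prefix entirely.
        System.out.println(count("//*[local-name() = 'foo']"));
    }
}
```

The name() test picks up the two unprefixed elements even though they are in different namespaces, and misses my:foo, which is exactly the dependence on representation that causes trouble.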
The correct way to select any element with a local-name of foo in any namespace, regardless of what prefix was used, is: //*[local-name() = 'foo']. If I were to take a shot at selecting any node which had a namespace-uri the same as the default namespace in effect on the root node of the original representation of the XML document, it would be: //*[string(/*/namespace::*[name() = ""]) = namespace-uri()]. Which is to say:
The Curse Of Testing Text
One of the major challenges in my job is testing our product. Most people think that testing is reasonably easy but requires discipline. That's not true if the product you write happens to be a styled text editor, and it's nearly impossible to do really well if you're working with something as flexibly defined as HTML.
The problems start at the unit testing level. Try taking the standard JTextPane class and writing unit tests for it. How do you test that it can render an HTML list correctly? You could write a test to make sure that the list numbering at least comes out in order and in the right format, but that still won't guarantee that the list numbers actually paint correctly. For instance, we recently had a bug where the list numbers painted correctly if the list fit on screen, but if it required scrolling, whatever item was at the top of the screen was numbered 1 even if it was actually the third item in the list. Our list numbering unit tests had no way of picking up on that because it was the actual rendering code (which we didn't write) that was wrong.
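To make the gap concrete, here is a toy sketch of that bug. It is not Swing code and not our actual code; every class and method name in it is hypothetical. It models list numbering in one method and a painter that numbers from the first visible item in another, so you can see that a test against the model passes while the painted output is wrong.

```java
import java.util.ArrayList;
import java.util.List;

public class ListNumberingSketch {
    // Model-level numbering: item i of an ordered list is labelled "i+1.".
    static List<String> modelLabels(int itemCount) {
        List<String> labels = new ArrayList<>();
        for (int i = 0; i < itemCount; i++) {
            labels.add((i + 1) + ".");
        }
        return labels;
    }

    // What a correct painter would draw when scrolled so that
    // firstVisible (0-based) is the top item on screen.
    static List<String> paintedLabels(int itemCount, int firstVisible) {
        return modelLabels(itemCount).subList(firstVisible, itemCount);
    }

    // Hypothetical buggy painter: restarts numbering at whatever item
    // happens to be at the top of the screen.
    static List<String> buggyPaintedLabels(int itemCount, int firstVisible) {
        List<String> labels = new ArrayList<>();
        for (int i = firstVisible; i < itemCount; i++) {
            labels.add((i - firstVisible + 1) + ".");
        }
        return labels;
    }

    public static void main(String[] args) {
        // A unit test that only looks at the model sees nothing wrong.
        System.out.println(modelLabels(5));
        // Scrolled so the third item is at the top of the screen.
        System.out.println(paintedLabels(5, 2));
        System.out.println(buggyPaintedLabels(5, 2));
    }
}
```

A test on modelLabels passes for both painters. Only a test that looks at what is drawn after scrolling can tell them apart, and that is the part that is hard to get at in a real text component.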
Object.equals()
Andrae Muys provides some excellent advice on implementing Object.equals(); however, I do have to correct one thing. Andrae presents the code:
class A {
    int n1;

    public boolean equals(Object o) {
        if (!(o instanceof A)) {
            return false;
        } else {
            A rhs = (A)o;
            return this.n1 == rhs.n1;
        }
    }
}
and suggests that it is incorrect. It is not. This is absolutely, 100% the correct way to implement equals for that class. The alternative he presents on the other hand is incorrect:
Riverfire
I was fortunate enough to be invited out to a friend’s place to see the fireworks last night. They happen to own the penthouse apartment on the river front with an awesome view of nearly all the fireworks. The fireworks are launched from numerous sites along the river, so being able to see all of them is really quite unusual. Better yet, one of the launching barges is positioned directly in front of the balcony, as if it were a private show just for us. That particular barge had some issues launching its fireworks this year. There were about three sections where all the other barges launched fireworks in unison but “our barge” sat there doing nothing. Apparently it was unplanned, because at the end our barge suddenly started firing off every firework it had missed, all at once. A display that was intended to take about 10 minutes was sent up in about a minute flat. Extremely impressive! Also impressive was the dump and burn, which was close enough to feel the heat hit you and felt like you could just reach out and catch the plane. I’ve never been overly impressed by fly-bys when standing at ground level at South Bank, but from the penthouse it really is quite an experience. Oh, and Iain has photos.
When Marketing Goes Wrong
I’m currently wearing one of the shirts that James Gosling hurled into the crowd by various means. It depicts Duke aiming a rocket launcher at a weird-looking demon with four arms, labeled “complexity”. Earlier today I was accosted by a very young man who asked what that was on my shirt. When I pointed to Duke and said “this guy’s called Duke”, the response was: “he’s crap. I like this guy ‘cause he has four arms”. I’m not sure that’s the reaction the designer was after…
We Will Rock You
And they did. Went to see We Will Rock You – the musical by Queen and Ben Elton – last night and it was sensational. The songs fit into the story line brilliantly, and the story itself was interesting and not just an excuse for singing the songs. The constant use of song references as bad puns just added to the experience for me, and it was particularly impressive to see the customization for the Australian audience. The lead bohemian was named after the great rock singer from the past, “John Farnham”, and when captured he was told – “This really is the last time”. That’s the kind of joke that flowed through the night, and all of them went down extremely well. The music itself was very loud and very energetic. Unless you are an absolute die-hard, no-one-can-match-Freddie-Mercury type of person, you’ll appreciate the vocal talent that performed the extremely difficult songs Queen put together. Definitely worth seeing.
Just When You Thought It Was Safe…
Just when you thought it was safe to turn the TV on again, Young Talent Time makes a comeback. The worst part is that the Minogue sisters have promised to appear on the show – any hope that some actual talent may be found is lost…
Amazon Goodness
I have slightly obscure tastes in music – in particular, I like musicals, and not the highlights CDs but the full recording of the original cast. It’s certainly not the most obscure taste in music, but it does lead to an awful lot of trouble tracking down what I want. Worse still, I know what I want ahead of time, unlike most people with really obscure tastes who just stumble across things they like. Coming back to the point though, you can’t just walk into HMV, or pretty much any music store that I’ve found, and pick up a copy of the original 1986 cast recording of The Phantom of The Opera or Miss Saigon or Les Miserables. However, with this newfangled technology intarweb thingy I can head on over to Amazon and order it from there. There’s a bunch of other online stores around that may or may not have what I want for prices roughly equal to Amazon, but what I love about Amazon is watching it try to predict my buying habits. I know most people freak out about privacy violations when computer systems start gathering data about them, but with Amazon it’s like a fun game. It managed to pick that I was looking to purchase Miss Saigon and The Phantom Of The Opera the last time I went there and offered a package deal on them. Its recommendations are also really quite good – including detecting that I tend to buy the versions that Lea Salonga is in (she tends to be part of the original cast of a lot of big musicals). Sadly, it doesn’t seem to have worked out that I only buy CDs from them, as it keeps offering books and sheet music. While I do tend to buy a fair bit of sheet music of musicals, I won’t purchase it without first flicking through it to make sure I have a chance of being able to play it. The big downside of buying from Amazon though is that I have to wait two and a half weeks for things to arrive (that or pay an extra arm for postage).
That Pesky Caps Lock
Tor Norbye politely requests that the caps lock key be removed and the control key put there instead. There’s one very good reason why that shouldn’t be done: everyone (except old-school UNIX geeks) is used to the control key being where it is. Moving the control key would seriously annoy people. If you’re one of the people who are used to control being next to ‘a’, then imagine the whole world being as annoyed as you are every time they use a computer and find that control is in the “wrong” place. More importantly though, beside ‘a’ isn’t a good place for control anyway. The little finger is the most difficult finger to control on the human hand and is the least commonly used. In touch typing, the left little finger currently sits over ‘a’ and moves up for ‘q’ and down for ‘z’. If you’re British or Australian, ‘q’ and ‘z’ are incredibly uncommon letters (Americans customi_z_ed their language by putting a bunch of Zs in). Now think of the most common keyboard shortcuts used on computers these days (think Windows users, not emacs users):
Pointless Schemas
There seems to be a growing trend for projects to use XML configuration files – fine. There seems to be a growing trend for those projects to provide a schema for those files – good. There seems to be a growing trend for those projects never to validate their configuration files against the schema – bad. As I’ve previously mentioned, my job involves creating an XML forms editor, and it turns out that this forms editor is really quite good at editing configuration files (see our very own configuration tool). We thought it might be nice to create a simple editor for Maven POM files. Sadly, it seems that the schema for a POM file doesn’t come anywhere close to describing what should actually be in a POM. Maven has support for inheriting a POM and using the information in it, thus allowing that information to be omitted from the POM itself. Now admittedly, it’s not possible to specifically describe this in XML Schema (the POM being inherited from can omit any information it likes as well, assuming that it will be filled in by the extending POM), but the current schema insists that everything be specified in every POM file, which makes validation completely useless. The POM files from the plugins don’t validate against the schema for a number of reasons as well. JDNC is even worse – Xerces finds errors in the schema itself (and I’m fairly sure Xerces is correct). This, combined with fairly poor documentation, makes it extremely difficult to implement tool support. It’s a shame; I’d probably use Maven if it weren’t such a pain in the neck to create the POM file correctly (that and dependency management, which is wonderfully simple for things in the public repository and annoyingly difficult for things that aren’t). Maybe I’ll come back and create a more useful schema for Maven at some point, but updating the schema doesn’t really help me identify the areas of our product that need improving.
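For what it’s worth, validating a configuration file against its schema takes very little code with the javax.xml.validation API in the JDK. Here is a minimal sketch; the tiny config schema and the class name are made up for illustration and have nothing to do with the Maven or JDNC schemas.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

public class ConfigValidationSketch {
    // Hypothetical schema: a config element containing a name and a port.
    static final String SCHEMA =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
        + "<xs:element name=\"config\">"
        + "<xs:complexType><xs:sequence>"
        + "<xs:element name=\"name\" type=\"xs:string\"/>"
        + "<xs:element name=\"port\" type=\"xs:int\"/>"
        + "</xs:sequence></xs:complexType>"
        + "</xs:element>"
        + "</xs:schema>";

    // Returns true if the given XML text validates against SCHEMA.
    static boolean isValid(String xml) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            // newSchema itself throws if the schema document is broken.
            Schema schema = factory.newSchema(new StreamSource(new StringReader(SCHEMA)));
            // With no ErrorHandler set, validation errors are thrown as SAXException.
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<config><name>demo</name><port>8080</port></config>"));
        System.out.println(isValid("<config><name>demo</name></config>")); // port is missing
    }
}
```

Running a check like this over a project’s own sample configuration files as part of the build would catch a schema that has drifted away from what the tool actually accepts.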