Archive for February, 2008

Making MySQL OR performance better

Thursday, February 28th, 2008

MySQL experts and Old SQL hands look away now. This is like teaching grannies to suck eggs, but if you’ve always had experience of advanced (which used to mean commercial) SQL databases then you probably don’t know this trick.

In the old days performance for OR clauses was pretty bad and a lot of DBMSs would ignore the index and start table scanning and performance would stink.

The old (and MySQL) way to get around this if the problem exhibits itself is to simply use a UNION and have each part of the OR on each side.So for example for table employee we need to find any employee where their name is ’stuart’ or ‘roger’ (i’m assuming we’re using an OR here - we could of course use an IN..).

The first version is the clear version

SELECT * FROM employee WHERE forename = 'stuart' OR forename = 'roger'

However, if you do an EXPLAIN on it and notice that it’s table scanning it’s trivial to re-write as a UNION

SELECT * FROM employee WHERE forename = 'stuart' UNION SELECT * FROM employee WHERE forename = 'roger'

Now if you know about your data (and the OR condition) then you can optimise a dash further. UNION merges the two result sets and checks for duplicate entries (de-dupes - just like a DISTINCT) before returning the values, but if you’re sure that the two sets won’t clash (and above is a great example) then use UNION ALL which ignore the de-duping stage.

So, why not always use a UNION? Well firstly it’s not clear and secondly there’s a duplication of code. Imagine if your OR clause was more complex (forename could be ’stuart’, ‘bob’, ‘roger’, ‘terry’, ‘june’) then that’s 5 repeats of the same statement. Throw in the fact that you’ll often explicitly mention the columns you want (remember i’m using a * in the SELECT clause for example only - try your best not to do this in real code of course) and that’ll you’ll have multiple joins then you realise you only want to do this if and when you get a performance problem.

Have fun and I hope not too many of you are sucking eggs at the moment :)

Java: (IMHO) Why independent application builders are shying away from it

Saturday, February 23rd, 2008

It’s time for a little bit of opinion which I hope will feed into the developer usability of Java in future.

Recently we’ve been working with Java for a major client. They’re pretty typical really: they don’t use Windows on anything but the desktop and so Windows/Microsoft technology is out. They needed an “enterprise” quality language and framework so they chose Java EE over Microsoft’s .NET software stack.

Now before I go any further I need to tell you that I use VIM for most of my editing/coding needs - it’s quick and easy and for coding there isn’t an IDE that can beat it (although it’d be great if somebody could add some refactoring functionality). Using a test editor like VIM keeps the code clean and means that the cost of switching between IDEs for each language isn’t needed (for most tasks anyway).

Until recently we’ve been doing a lot of PHP for clients (with all the problems that brings - see earlier blog posts) so we’ve got used to a simple way of working. Build the database, PHP database tier, PHP Business Logic tier, and use Smarty for presentation, it’s easy and quick to do, then for unit testing we use SimpleTest (which I can’t recommend enough).

With the Java work it’s becoming impossible to work outside of an IDE; you’re either building xml files for descriptors, or bagging things up into EARs of WARs and then deploying to the app server. Now this isn’t a problem once you’re up and running and the environment and ANT scripts are in place, but if you’re working on a 5 day development for a smaller client that wants Java then you need to add another day for setup and environment matching - and that’s a large cost to most clients (that’s a 20% surcharge).

Now I’m not saying that Java’s much worse than .NET (esp. ASP.NET) but at least the Microsoft platform is a single platform and that makes things simpler, but even there with each incarnation of ASP.NET it seems to get more complex.

Java deployment needs a simple mode to build and deploy web based application, otherwise small clients just won’t and can’t afford to use it. How about a command line tool that you run against the top of the source tree, it builds it and deploys JSPs to the front end, and EJBs into the app server without needing a load of xml files? Is that even possible or realistic?

I originally loved Java because it was so much simpler than the alternatives; VB (with all it’s foibles and awful/non-existent thread handling) and C++ (with myriad libraries to use) it was simple and quick to do things. Then for the Microsoft crowd C# did the same thing and allowed them to leap forwards. Now both those frameworks are big (and bloated?) and there’s few of us who can quote what all parts of the framework libraries do.
What I think we’ll see in the coming future is more enterprises will follow the lead of smaller businesses and move to scripting languages for their internal and external application development. By doing this they’ll reap quicker turnaround and faster dev cycles. Some of the biggest IT firms on the planet are already doing this (google for example) so It’ll be interesting to see what happens.