Web Services - Picking the right one for the right job

Thursday, July 24th, 2008

Picking your web service standard is an important decision and shouldn’t be taken lightly, and development teams shouldn’t be allowed to make the decision on their own.

Roughly speaking, there are three “standards” to choose from: SOAP (Simple Object Access Protocol), XML-RPC, and “REST” (“REpresentational State Transfer”).  It’s worth touching briefly on what each is before continuing:

  • SOAP is very heavily pushed by IBM and Microsoft and has standard tooling within various IDEs and frameworks.  This means that for a lot of developers it is the only Web Service standard they consider.  It allows the Web Service to handle complex (“rich”) datatypes both as parameters and as return values, and it has a companion standard called WSDL that lets Web Services be defined in a language-agnostic manner (for those of you old enough: yes, this is very much like CORBA’s IDL).  The client downloads the WSDL and uses it to work out how to call the service.  Best of all, this means that you can pass and retrieve objects without having to know how the information is transferred, which for programmers is a massive boon.  It works by encoding the call as an XML envelope and sending it to the service, usually via HTTP POST; the return values are passed back as XML, and the SOAP framework does all the hard work of translating to and from XML on both sides.
  • XML-RPC seemed like a reasonable competitor to SOAP, but seems to have diminished over time. It too allows “rich” and complex data to be passed to and returned from the Web Service.
  • REST is the outsider.  Although there are toolkits and frameworks to help, it relies on simple parameter passing, while any data format can be passed back.  This means it’s hard to pass complex data, and what it returns needs to be interpreted by the caller (rather than being magically converted into objects/structs that are usable within the calling language).  It also frequently requires custom coding to create the web service, and most of that work will be manual.
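To make the difference concrete, here is a rough sketch of what actually travels over the wire in each style. The host, the findTracks operation and the artist element are all made up for illustration, and the envelope is simplified (a real toolkit would namespace-qualify the operation based on the WSDL), so treat it as an approximation rather than what any particular SOAP stack emits.

```php
<?php
// Hypothetical service details: the host and the findTracks operation are
// illustrative assumptions, not a real API.
const SERVICE_HOST = 'http://someexample.com';

/** Build a simplified version of the XML envelope a SOAP toolkit generates for you. */
function buildSoapEnvelope(string $artist): string
{
    // Escape the value so characters like & and < stay valid inside XML.
    $escaped = htmlspecialchars($artist, ENT_XML1 | ENT_QUOTES, 'UTF-8');
    return '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
        . '<soap:Body>'
        . '<findTracks><artist>' . $escaped . '</artist></findTracks>'
        . '</soap:Body>'
        . '</soap:Envelope>';
}

/** Build the equivalent REST-style call: the whole request is just a URL. */
function buildRestUrl(string $artist): string
{
    // rawurlencode() makes the value safe to use as a path segment.
    return SERVICE_HOST . '/webservices/rest/findtracks/' . rawurlencode($artist);
}

echo buildSoapEnvelope('AC/DC & friends'), "\n";
echo buildRestUrl('AC/DC & friends'), "\n";
```

With SOAP you would normally never see or write that envelope yourself; with REST the URL is the whole request, which is exactly why it is easy to build by hand.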

So, you’re probably thinking, why would I go for anything but SOAP?  It seems to be the best, the most widely used and the easiest to develop with, so surely it’s the only choice?  Unfortunately, the answer is not that obvious.

The answer is related to who you’re creating the Web Service for (your “target audience”) and what you want them to do with your data.

If you’re on an intranet and you need .NET and Java code bases to communicate with each other, then SOAP is an excellent way to go: development is quick, it’s easily secured, and toolkits are in place on both platforms to let you create, consume (and debug) Web Services very easily.

In addition, if you’re an internet based app and you’re working with medium and large enterprises, then SOAP could again be a good fit, especially if you’re looking at clients that are based heavily around .NET and Java.

If, however, you’ve got a wide range of development languages within your intranet (including scripting languages such as Python, PHP, Perl, Ruby etc.) then it may not be a good fit.  Scripting languages will need to execute more code before they can make the calls, and you’ll need to start including additional libraries to support it.  The calls may still have minor issues too: despite what the popular press says, SOAP still has issues and incompatibilities once things get complex.

REST really shines when you have a varied user base (you can’t control which languages consumers of the service will use), and/or when some of the users will be small businesses or even single developers creating mash-ups.  This is because it’s incredibly simple: parameters are encoded either as part of the path or on the query string, and it returns whatever is most useful.  If it makes sense to return CSV from a REST Web Service then that’s just fine.  JSON?  Yep, fine again.
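As a small illustration of the two parameter styles, the sketch below builds one URL of each kind and then decodes a JSON reply. The base URL, the artistname and format parameters, and the response body are all assumptions invented for the example, not a real service.

```php
<?php
// Hypothetical base URL and parameter names, used only for illustration.
$base = 'http://someexample.com/webservices/rest/findtracks';

// Style 1: the parameter is part of the path.
$pathUrl = $base . '/' . rawurlencode('Daft Punk');

// Style 2: parameters go on the query string.
$queryUrl = $base . '?' . http_build_query(['artistname' => 'Daft Punk', 'format' => 'json']);

// A made-up JSON response body, standing in for what such a service might return.
$body = '[{"trackName":"Track One"},{"trackName":"Track Two"}]';

// Decode into plain PHP arrays and pull out just the track names.
$tracks = json_decode($body, true);
$names = array_column($tracks, 'trackName');

echo $pathUrl, "\n", $queryUrl, "\n", implode(', ', $names), "\n";
```

Note that nothing here needed a toolkit or a service description: the caller just has to know the URL layout and the response format.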

Because REST is based around a standard HTTP GET or POST call it’s easy to implement.  Just about any language can make the call with cURL (including shelling out to it if needed from UNIX shell scripts or MS-DOS bat files), and even if you don’t have an easy way to make a GET or POST call you can custom code it: it’s trivial to create a socket yourself and send an HTTP GET request on port 80 (although I seriously doubt you’ll do that).

The other time you should seriously consider REST above the other choices is when you want to use the Web Service for more than backend server usage - for instance when you want JavaScript based AJAX to use the same information or you’re creating JavaScript based mashups.  Ever tried building a SOAP client usable from JavaScript?  It’s non-trivial and you’ll be fixing bugs from here to eternity.

So, let’s take a look at some code and see why SOAP is so attractive to developers and why REST is more difficult but more accessible.  For the examples we’ll create a fake music based web service that allows users to find tracks that have been created by an artist.

For a SOAP example (in PHP)

/* create an instance of the web service proxy that you'll call on to - pass it the link to the WSDL file that tells the SOAP client what the web service needs as parameters and what it returns */
$musicfinder = new SoapClient('http://someexample.com/webservices/musicfinder.wsdl');

/* use the proxy object and call the "findTracks" method and pass the first argument (in this case the artist) */
$tracks = $musicfinder->findTracks('madonna');

/* iterate and do whatever you need with them - we assume each track comes back as a plain object with a trackName property */
foreach ($tracks as $track) {
    print("TRACK IS : " . $track->trackName);
}

REST example (in PHP) version 1 - we’ll assume it returns XML, and we’ll use the verbose (non-wrappered) version of curl

function CallWebService($url) {
    $ch = curl_init($url);
    //return the response as a string rather than printing it straight out
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $response = curl_exec($ch);
    //optionally you can check the response and see what the HTTP Code returned was
    $response_code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    curl_close($ch);
    return $response;
}

//build the string to call it (escape the artist name so it's safe to use in the path)
$artistname = 'madonna';
$url = "http://someexample.com/webservices/rest/findtracks/" . rawurlencode($artistname);
$resultsxml = CallWebService($url);
//load the dom - skipped here and set up an iteration etc.

REST example (in PHP) version 2 - we’ll assume it returns XML, and we’ll use the file_get_contents cheat method

function CallWebService($url) {
    return file_get_contents($url);  //needs allow_url_fopen, which may be disabled on some ISPs
}

//build the string to call it
$url = "http://someexample.com/webservices/rest/findtracks/" . rawurlencode($artistname);
$resultsxml = CallWebService($url);
//load the dom - skipped here and set up an iteration etc.

REST example (in PHP) version 3 - we’ll assume it returns CSV, and we’ll use the file_get_contents cheat method again

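function CallWebService($url) {
    return file_get_contents($url);  //may be disabled on some ISPs
}

//which CSV column holds the track name - assumed here to be the first one
define('COLUMN_CONTAINING_TRACK_NAME', 0);

//build the string to call it
$url = "http://someexample.com/webservices/rest/findtracks?artistname=" . urlencode($artistname);
$resultcsv = CallWebService($url);

//split() is gone from modern PHP, so use explode() for the lines and str_getcsv() for the columns
foreach (explode("\n", trim($resultcsv)) as $line) {
    $columns = str_getcsv($line, ',', '"', '\\');
    print("TRACK: " . $columns[COLUMN_CONTAINING_TRACK_NAME]);
}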

You can see it’s easy to use SOAP, but the REST call is very clear in the server logs, and it’s easy to construct the call and to process the call on the server side (read the query string parameters, do something and return something).
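For the server side, a minimal sketch of "read the query string parameters, do something and return something" might look like the following. The catalogue contents, the handleFindTracks name and the artistname parameter are hypothetical, and a real service would query a database and validate its input far more carefully.

```php
<?php
// A made-up in-memory catalogue standing in for a real database.
const CATALOGUE = [
    'madonna' => ['Example Track A', 'Example Track B'],
];

/**
 * Handle a findtracks request: read the query string parameters,
 * look the artist up, and return plain text with one track name per line.
 */
function handleFindTracks(array $query): string
{
    // Normalise the parameter; a missing parameter becomes an empty string.
    $artist = strtolower(trim($query['artistname'] ?? ''));
    // Unknown artists simply produce an empty result.
    $tracks = CATALOGUE[$artist] ?? [];
    return implode("\n", $tracks);
}

// In a real script this would be: echo handleFindTracks($_GET);
echo handleFindTracks(['artistname' => 'Madonna']), "\n";
```

Keeping the handler as a plain function that takes the parameters as an array makes it easy to test without a web server in front of it.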

I know of one major company that has provided SOAP interfaces for its Web Services without realising its core audience is single developers and small development houses, almost all of whom use scripting languages very heavily (and most of whom have poor programming skills).  As a consequence it’s barely being used.  The company keeps enhancing its interfaces to give more capabilities (I can only presume they think the issue is that they’re not providing enough functionality) but the user base just can’t use them.

There’s a reason why Google, Amazon, eBay etc. all provide REST interfaces - hopefully now you can get an idea why.

PS: If you’re trying hard to use PHP with SOAP then I’ll recommend this “Dirty Secrets” document, which has the inside scoop on using PHP with SOAP.   Also, I’d ignore anything but the core SOAP library - the alternatives are poorly maintained (although I’d love to be proved wrong).  However, sometimes it’s difficult to get the extension installed; in that case you’ll just have to try other libraries until it works.

Architects need to properly Communicate with Developers: Scale-Up vs. Scale-Out

Wednesday, July 2nd, 2008

Unfortunately, this is a problem we see all too often.

Architects are employed to ensure that software projects have a reasonable base: that solutions are well thought out, robust, and follow industry best practices.  They also provide a more detailed but higher-level view of application spread (and inter-operation) than a senior developer, project manager or programme manager could offer.

However, there’s a gap: the architects produce plans and ideas but don’t communicate them effectively to the developers, so new problems arise. Often it then gets worse because the architects blame the developers (and especially the lead developers) for not following the plan, while the developers blame the architects for poorly thought out plans and non-committal/non-understandable language. Out of frustration, I thought we could cover one of those major communication problems today: Scale-Up vs. Scale-Out.

Any developer who’s been in the industry for a while has heard these terms a few times (probably from people who have used them interchangeably); however, they mean very different things and have different design decisions and costs associated with them.

a. Scale-Up -> To add additional hardware (processing power, memory, faster/bigger disk) to a solution to enable it to scale further

b. Scale-Out -> To add additional hardware nodes (web servers/database servers/application servers) to an existing solution to enable scaling by spreading the workload further. (Note: this also includes strategies where duplicated hardware is put in place and the solution is segmented along functional/data lines so that each set of hardware runs independently of the others.) Grid computing is an example of Scale-Out pushed to the maximum.

Scale-Up is usually the first port of call for badly performing applications, but there’s always a ceiling on what can be done. Even if you started on a single-CPU Windows 2003 server and then ported up to a Sun Fire E25K, you’ll still hit a ceiling eventually. However, for most projects this makes sense, as the user/work loading will never be orders of magnitude larger than it started. So long as there’s a sufficient upgrade path (and the application can handle it) then Scale-Up will often come out best in a cost-benefit analysis, primarily because one only upgrades (or moves on to new hardware) if it’s strictly necessary.

True Scale-Out requires designing from the start, as it’s hard to push a Scale-Out strategy on to existing solutions. If the service is known to be massive from day one then segment your workload like Google/eBay and co and have multiple servers that each do a small part of the workload, and other servers that collate the results and send responses back to the user. In that way you can always add more hardware to do individual tasks without a major problem. However, for smaller projects this isn’t worth the cost or the stress.
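One simple way to segment a workload along data lines is to route each key to a node by hashing it. The sketch below is only an illustration: the node names and the nodeForKey helper are hypothetical, and plain modulo hashing is just one of several possible routing schemes.

```php
<?php
// Hypothetical node names; in practice these would be real server addresses.
$nodes = ['db-node-0', 'db-node-1', 'db-node-2'];

/** Pick the node responsible for a given key (e.g. a user id or an artist name). */
function nodeForKey(string $key, array $nodes): string
{
    // crc32() gives a stable integer hash of the key; the mask keeps it
    // non-negative on 32-bit builds as well as 64-bit ones.
    $hash = crc32($key) & 0x7fffffff;
    // The remainder selects one of the available nodes.
    return $nodes[$hash % count($nodes)];
}

foreach (['madonna', 'prince', 'bjork'] as $artist) {
    echo $artist, ' -> ', nodeForKey($artist, $nodes), "\n";
}
```

The same key always lands on the same node, which is what lets each node work independently. The catch with plain modulo routing is that changing the number of nodes remaps most keys, which is one reason Scale-Out is easier to design in from the start than to retrofit.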

Some architects like to say they’ve got the ultimate scaling strategy: a mix of Scale-Out and Scale-Up, and to an extent some do. However, having your webserver, app server and database server on a single machine, then scaling out to three machines, before beefing up each of those machines further isn’t true Scale-Out (it’s just a modification of the Scale-Up strategy really). For those solutions you have to be extremely careful about issues like marshalling: inefficient data passing between tiers won’t really hurt when you’re on the same machine, but if you have to pass that data across a network, serializing and de-serializing as you go, then it can cause major issues.
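To see where the marshalling cost comes from, the sketch below contrasts handing data over within one process with the round trip it needs once the tiers sit on different machines. The rows are made up, and PHP’s serialize() is only standing in for whatever wire format a real system would use; the network transfer itself is not simulated.

```php
<?php
// Made-up rows standing in for data passed between the app tier and another tier.
$rows = [];
for ($i = 0; $i < 1000; $i++) {
    $rows[] = ['id' => $i, 'name' => "track-$i"];
}

// Same machine: the array is simply handed over, no conversion needed.
function passInProcess(array $rows): array
{
    return $rows;
}

// Across a network: the data must be serialized, sent as bytes, and rebuilt.
function passAcrossTiers(array $rows, int &$bytesOnWire): array
{
    $payload = serialize($rows);       // marshal into a byte string
    $bytesOnWire = strlen($payload);   // this is what would cross the network
    return unserialize($payload);      // rebuild the structure on the other side
}

$bytes = 0;
$copy = passAcrossTiers($rows, $bytes);
echo 'rows: ', count($copy), ', bytes on the wire: ', $bytes, "\n";
```

The result is identical either way, but the second path pays for encoding, transfer and decoding on every call, so chatty interfaces that were harmless on one machine can become a bottleneck once the tiers are split.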

Hope that helps explain Scale-Up (aka ScaleUp) and Scale-Out (aka ScaleOut). As ever I’m happy to hear from you with questions etc.