Archive for the 'SQL Server' Category

OLAP MDX Optimisation Tip

Wednesday, September 12th, 2007

Ok Ok I know this is trivial but it’s saved more than my own bacon before…

You’ve built an OLAP (SSAS2005) Cube (and this isn’t your normal job) and some brain box has asked for a calculated member to be available (or alternately you’re adding this as a calc. member to a report and you want it calculated on the fly - see below it’s basically identical)

For example you want to find the ratio of the current bank accounts balance relative to all the others so you add a member of the ilk (in script form this is);

CREATE MEMBER CURRENTCUBE.[MEASURES].[Ratio of Account Balance]
AS iif ( IsEmpty( [Measures].[Account Balance] ),
[Measures].[Account Balance],
( [Measures].[Account Balance] ) / ( ROOT([Account]), [Measures].[Account Balance] )
),
VISIBLE = 1 ;

So, you run the cube in a report or the browser and suddenly that lovely cube response has shot up from sub-10 seconds to never coming back… what could you have done?

Well i’m assuming you’ve got a lot of data ™ - probably hundreds of millions of fact rows and you’ve now stopped SSAS optimising the search path and ignoring the empty bits. So, give it a hint and add the NON EMPTY BEHAVIOR (sic - behaviour - that gets me every time - colour becoming color i get but behaviour is just wrong) tip.

In the GUI simply select the requisite measure field, in the script tab add
NON_EMPTY_BEHAVIOR = { [Account Balance] }

voila! instant speed up and less heart attack!

(note: if this was an inline calc-on-the-fly type member then the full wording would be…

WITH MEMBER CURRENTCUBE.[MEASURES].[Ratio of Account Balance]
AS iif ( IsEmpty( [Measures].[Account Balance] ),
[Measures].[Account Balance],
( [Measures].[Account Balance] ) / ( ROOT([Account]), [Measures].[Account Balance] )
),
NON_EMPTY_BEHAVIOR = { [Account Balance] },
VISIBLE = 1 ;

SQL Server Reporting Services - doing custom subtotals

Wednesday, September 12th, 2007

Chris has an excellent post about custom subtotals using INSCOPE that has been pretty helpful to me.

Using a Matrix with OLAP cubes can sometimes be a bit painful and icky, but this technique makes adding additional info really easy.

Now all we need is access to a proper ternary operator instead of that horrible IIF vb-istic construct and we’d be laughing.

SQL Server Triggers containing Cursors (AKA: how *NOT* to audit)

Monday, June 25th, 2007

I’m working with a client at the moment who has a lot of legacy code.

One of the things that makes the systems they’ve got so bad is that there’s little thought of performance when things have been written. An excellent example is the use of multiple nested cursors within triggers in SQL Server.

Obviously at some point a developer was given a requirement for auditting. As not all of of the tables (nor every column in a table) was required to be auditted they thought to themselves that triggers using cursors would be an excellent plan.

So, they created a table called “audit” that contained a timestamp; user id and name that caused the change, the source row id, table and column; and before and after values.

Now i’ve no doubt that this is a nice scheme for quickly showing a list of changes on a screen (i.e. select * from audit where sourcetable = ‘address’ and sourcetableid = ‘’ order by changedate DESC) but this is a performance killer if you ever have much concurrency and you update things in batch.

To get the data within the trigger to build the audit info you need to encode the table name, look up the username (for the user that made the change - as that was denormalised into this table too - ironically no doubt to speed up read performance), then iterate around another table that tells you which columns you’re interested in auditting, for each of those check if they’ve changed (in a cursor of course), then find the differences and do an insert for each.

So, if we’ve got 8 columns to change then we’ve got one trigger, one lookup for the user’s username, one cursor iterating around 8 “columns to audit” entries, at least 8 lookups against before and after data in the trigger, then up to 8 inserts into the audit table.

Now, some people are thinking… what’s the problem?

Firstly, one of volume: the client mentioned has a few million rows and does updates that can sometimes effect 10s of thousands of rows, suddenly a trigger per row hurts.

Secondly, they’ve obviously been looking at performance the wrong way - they have about 1000 users (and the table for users is the ntusername, vs the userid - not exactly large - so it’s easy for sql server to cache in tempdb as it’s used all the time). Next up the audit data isn’t going to change on a day by day basis - so why not generate that list of columns to check straight into the trigger code? At least that way you’ve skipped another table lookup.

I’m guessing you’re thinking - so what’s the right way? what’s the magic bullet/golden hammer? - There isn’t one, for every situation there’s a different solution, but it all starts by understanding the way your application needs to work.

I use two basic practises a lot;

a) Never update a row - have a timestamp against a row and the user id that created it and a marker that it’s historical. when you need to change data simply insert a new row (with the Now timestamp and the current user’s id) with the new data, then mark the original row as defunct. You either get the history live by showing before and afters, or have a batch job that turns it into user friendly stuff in a downtime (if one exists). If you use Stored Procedures/SPs (i’m assuming you do!) then it’s easy to implement this scheme in there.
b) Generate a table that matches the schema of the original table you wish to audit and prefix the name with audit or history (potentially you can trim down the columns in there if you’re not interested in all of them) and then insert the original data into the audit table whenever you go to update it. Again this is easy to do in the Stored Procs/SPs.

In general (a) is the simplest to implement but it’s only really usable if you have smallish data sets (especially if you put a clustered key on the “defunct” marker column - as that update will require a table reshuffle). (b) is more work but at least you’re segmenting your data into historical vs. current - so any stupid programmer mistakes (such as not using the defunct marker in a select query on the data can return a history row instead and confuse the user).