Clean, Tidy, Fast. Pick Three

Ten Simple Tricks to Speed up Your Excel VBA Code is a fine post that offers some highly practical insights into Excel/VBA performance that insiders have been using to their benefit. This is the kind of post that would quickly result in one’s expulsion from an Excel Magic Circle. If there was one. Which there isn’t. Unless you know otherwise.

The first five tips all deal with asking Excel to suspend various time-consuming activities during macro execution. Of the five, #1 is both the most important and the most dangerous:

Application.Calculation = xlCalculationManual
'Place your macro code here
Application.Calculation = xlCalculationAutomatic

It has the potential to save a lot of time, but crucially, it may also affect the way your workbook works. A naive (or just plain bad) workbook construction may have macro code interacting with worksheet calculations, so the state of the sheets can change during macro execution depending on when calculations happen. This situation often arises when macros grow out of code created with the Macro Recorder. Be warned.

Of course, we should also be wary of assuming that the calculation mode started out as automatic: the user may have switched to manual for very good reasons. Anyway, it’s a bit rude.
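A politer (if wordier) version stores whatever mode was in force and puts it back afterwards. A minimal sketch:

Dim previousCalcMode As XlCalculation
previousCalcMode = Application.Calculation
Application.Calculation = xlCalculationManual
'Place your macro code here
Application.Calculation = previousCalcMode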

Tips #2 to #5 add further performance boosters, managing the enabling and disabling of screen updating, status bar updating, page break display and event processing (the last of which may need the same kind of consideration as calculation mode, described above). By now we’ve got quite a lot of boiler-platey code to wrap around the code that’s actually supposed to do something useful. I don’t like that.

How much cleaner would it be if we could write something like this:

With New ExcelInteractionSuspender ' please feel free to think of a better name...
    'Place your macro code here
End With

Quite a lot cleaner, I’d say.

Such a class (a class module, named ExcelInteractionSuspender to match the With block above) might look something like this:

Option Explicit

Private storedCalcMode As XlCalculation
Private storedScreenUpdateMode As Boolean

Private Sub Class_Initialize()
    ' store current settings
    storedCalcMode = Application.Calculation
    storedScreenUpdateMode = Application.ScreenUpdating
    ' now switch 'em off
    Application.Calculation = xlCalculationManual
    Application.ScreenUpdating = False
End Sub

Private Sub Class_Terminate()
    Application.Calculation = storedCalcMode
    Application.ScreenUpdating = storedScreenUpdateMode
End Sub

For the sake of brevity I’ve only included the first two options (which are the ones I usually find provide the most benefit in any case). The others are left as an exercise for the student…
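If you’d rather skip the homework, here’s a sketch of how the event and status bar switches might be folded in (page break display belongs to the Worksheet rather than the Application, so it’s omitted here):

Option Explicit

Private storedCalcMode As XlCalculation
Private storedScreenUpdateMode As Boolean
Private storedEventMode As Boolean
Private storedStatusBarMode As Boolean

Private Sub Class_Initialize()
    ' store current settings
    storedCalcMode = Application.Calculation
    storedScreenUpdateMode = Application.ScreenUpdating
    storedEventMode = Application.EnableEvents
    storedStatusBarMode = Application.DisplayStatusBar
    ' now switch 'em off
    Application.Calculation = xlCalculationManual
    Application.ScreenUpdating = False
    Application.EnableEvents = False
    Application.DisplayStatusBar = False
End Sub

Private Sub Class_Terminate()
    Application.Calculation = storedCalcMode
    Application.ScreenUpdating = storedScreenUpdateMode
    Application.EnableEvents = storedEventMode
    Application.DisplayStatusBar = storedStatusBarMode
End Sub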

Optically Illusory

I came across a charming little optical brain-bender a while back, and wondered what would be needed to reproduce it in Excel/VBA. Not much, as it transpired (see below). I’ve golfed the code somewhat for brevity. Dump the following in an empty VBA module and run “DrawIllusion”…

Private pi As Single
Public Sub DrawIllusion()
  Dim shp, c
  pi = WorksheetFunction.pi()
  ActiveWindow.DisplayGridlines = False
  For Each shp In ActiveSheet.Shapes: shp.Delete: Next
  NewShape 0, 0, 420, 0, vbWhite
  For c = 1 To 4: DrawACircle 200, 200, 60 + (c - 1) * 40, 15, 5, -20 + (c Mod 2) * 40: Next
End Sub

Private Sub DrawACircle(x, y, r, side, gap, rotationOffset)
  Dim numSquares, i, angle, shp
  numSquares = NumToDraw(r, side, gap)
  For i = 1 To numSquares
    angle = (i - 1) * 2 * pi / numSquares
    Set shp = NewShape(x + r * Cos(angle), y + r * Sin(angle), side, angle * 180# / pi, IIf(i Mod 2 = 1, vbBlack, vbWhite))
    shp.Rotation = shp.Rotation + rotationOffset
  Next
End Sub

Private Function NumToDraw(r, side, gap)
  NumToDraw = Int(2 * pi * r / (side + gap))
  NumToDraw = NumToDraw - NumToDraw Mod 2
End Function

Private Function NewShape(x, y, side, rot, color)
  Set NewShape = ActiveSheet.Shapes.AddShape(msoShapeRectangle, x, y, side, side)
  NewShape.Rotation = rot
  NewShape.Line.ForeColor.RGB = color
  NewShape.Line.Weight = 1
  NewShape.Fill.ForeColor.RGB = RGB(128, 128, 128)
End Function

Getting Bigger All The Time

in which we try to get tricky with Excel and find that Excel has adequate built-in trickiness already, thank you very much indeed

Over the years I’ve had a number of roles that involved fixing, tuning, cleaning and otherwise future- and bullet-proofing a fair number of Excel workbooks. There’s a whole catalog of potential issues to entertain and amuse, but a very common one is the Extending Input Data Set. (I capitalised that in the hope that I might have christened some kind of Excel anti-pattern).

Your average Excel user is smart enough to know that some input data may grow over time: they may be copy-pasting from some other source, grabbing a database query result or even just typing in more values over time. It’s very common to see a row-by-row enrichment of such data for analysis, and the formulae need to be extended to match the rows of input data. In the worst case, users have pre-copied more rows of formulae than they expect ever to need and don’t notice when the data eventually outgrows them. More happily, they copy down as they see the data extending. If they see the data extending, that is1.

Helping a colleague to avoid such an unpleasantness recently led to a couple of interesting (to me) solutions. Firstly, we looked at extending the calculations applied to a list of data that we knew would extend every time we ran the spreadsheet model.

We assume we have a defined name, input_data, that does what it says. Further, we have output_formulae, which should correspond one-for-one with the input_data rows. When input_data changes, we want to automatically extend output_formulae to match. Pretty straightforward, as it turns out (the handler below lives in the code module of the sheet that holds input_data):

Private Sub Worksheet_Change(ByVal Target As Range)
  If Intersect(Target, Range("input_data")) Is Nothing Then
    Exit Sub
  End If
  ExtendOutputFormulaeAsRequired
End Sub

Sub ExtendOutputFormulaeAsRequired()

  Dim inputRowCount As Long, calcRowCount As Long
  inputRowCount = Range("input_data").Rows.Count
  calcRowCount = Range("output_formulae").Rows.Count

  If inputRowCount <= calcRowCount Then
    Exit Sub ' enough formula rows already - could reduce if needed, but not done here
  End If

  With Range("output_formulae")
    ' Assumes just formulae are needed:
    .Offset(calcRowCount, 0).Resize(inputRowCount - calcRowCount, .Columns.Count).Formula = _
        .Rows(calcRowCount).Formula
  End With
End Sub

That works pretty well. At least, it does if input_data is defined appropriately, something we can usually manage fairly easily using a named formula, something like =OFFSET(B2,,,COUNT(B:B),6) (assuming we have 6 columns of data and there’s a non-numeric column heading in B1, which in this case there is).
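If you’d rather set the name up from code than through the Name Manager, something along these lines should do it (the sheet name and layout here are assumptions about this particular workbook, so adjust to taste):

Sub DefineInputDataName()
    ' Create (or replace) the workbook-level input_data name
    ThisWorkbook.Names.Add _
        Name:="input_data", _
        RefersTo:="=OFFSET(Sheet1!$B$2,0,0,COUNT(Sheet1!$B:$B),6)"
End Sub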

Here’s the wrinkle that gives rise to the second interesting (to me) problem, the one that necessitated a question on Stack Overflow. What if I can’t be sure that all my columns will be the same length? I want the length of the longest column (there won’t be gaps vertically, which is something at least). So in the sample below, there are up to 7 rows of data in columns B to G, with a heading on row 1; the longest column has 7 entries, which defines the number of rows in input_data.

Input data – extend me!

B  C  D  E  F  G
1  1  1  1  1  1
1  1  2  1  1  1
1  1  1  1  1  1
1  1  1  1  1  1
      1     1  1
      1
      1

I don’t want to use .CurrentRegion because I can’t be certain that adjacent cells won’t be populated at some time in the future. We can’t do something like MAX(COUNT(B:G)) because COUNT() happily swallows a two-dimensional range and returns a single total. I tried to be tricky with {=MAX(COUNT(OFFSET(B1:G1,,,1000000,1)))}, hoping that Excel would evaluate the COUNT() for a set of OFFSETs, one from each of B1 to G1. Excel wasn’t fooled. Stack Overflow time then, and enter SUBTOTAL(), a function I confess I’ve never used in over 20 years of Excellence.

I define a new name, input_data_columns as =Sheet1!$B:$G, which I can then use in the definition of input_data_rowcount:

=MAX(SUBTOTAL(2,OFFSET(input_data_columns,,COLUMN(input_data_columns)-MIN(COLUMN(input_data_columns)),,1)))

In a worksheet this needs to be Control-Shift-Entered as an array formula, but there’s no obvious way to do this when defining a Named Formula. Fortunately, although I don’t understand why, it works OK anyway, leading in turn to input_data becoming

=OFFSET(Sheet1!$B$2,0,0,input_data_rowcount,6)

That works! Is it future-proof? Probably not2 but it might very well be an improvement on what came before.

Briefly checking what happens, if we remove the MAX() and put the remainder into a worksheet as an array formula, we get {4, 4, 7, 4, 5, 5}, which is indeed the result of COUNT() on each column.
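For completeness, a VBA take on the same question: a little UDF that could stand in for the SUBTOTAL() gymnastics, callable from a cell or from the definition of input_data_rowcount. This is a sketch rather than what the workbook actually uses:

Public Function LongestColumnCount(rng As Range) As Long
    ' Return the largest per-column count of numeric entries in rng
    Dim col As Range
    Dim thisCount As Long
    For Each col In rng.Columns
        thisCount = Application.WorksheetFunction.Count(col)
        If thisCount > LongestColumnCount Then LongestColumnCount = thisCount
    Next
End Function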


1 From a selfish perspective, this is a Good Thing – it requires “professional” expertise to deal with and has put bread on my table many a time.

2 See 1

Another notch in the utility belt

Having been largely an all-thumbs C# programmer for the last year or so, I’ve had little opportunity to add to the world of Excel know-how (and I don’t really feel qualified to say much about the .Net world yet, even if I did once get a Tweet from Jon Skeet). I do keep an eye out for Excel questions on StackOverflow, though, and one caught my eye this morning.

The questioner wanted to add a zero to a chart range. Not sure why he didn’t just add an extra constant value at the start or end of the range, but perhaps he couldn’t (maybe the data was in another, read-only workbook?). Anyway, I came up with something that extended an existing trick a little and I thought it was mildly interesting.

Using a named formula in a chart series is1 a fairly well-known idea these days. It lets us define a dynamic series (usually using OFFSET()) and have our charts update automagically as our series grow and shrink. What I hadn’t previously thought of2 was that I could use a VBA function call as the Name’s “Refers to”, with the result being usable in a chart Series definition.

Turns out I could.

Public Function PrependZero(rng As Range)
Dim val As Variant, res As Variant
Dim idx As Long
    val = Application.Transpose(rng.Columns(1).Value) ' nifty!
    ReDim res(LBound(val) To UBound(val) + 1)
    res(LBound(res)) = 0 ' this is where the function lives up to its name
    For idx = LBound(val) To UBound(val)
        res(idx + 1) = val(idx) ' copy the remainder
    Next
    PrependZero = res
End Function

(That Transpose thing is handy – I hadn’t known about it as a way of getting a 1-D range value as a vector. Two transposes needed for data along a row.)
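To hook it up, you’d define a name – chart_data, say – whose “Refers to” is something like =PrependZero(Sheet1!$C$2:$C$13), then edit the chart’s series formula to point at it, something like =SERIES(,,MyBook.xlsm!chart_data,1). (Those names and ranges are invented for illustration; substitute your own.)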

You could use this approach for lots of things. Like, well, lots of things. Honestly, you don’t need me to spell them out.


1 This makes me a little sad – it was a source of Excel-fu that impressed people to a quite disproportionate degree. Array formulae and VBA classes still have this power.
2 I’m aware that I might be the last kid in the class here. Be gentle.

Split Me, Shape Me

in which we publish something we wrote a while back but forgot to post for reasons now lost in the mists of time

While I’d love to be able to take credit for having thought of this myself, the truth is I can’t. I did clean up the code a fair bit, though, so any tidiness that particularly appeals is entirely down to me.

Reading this post a while back, I leapt to post my CellSplit() function in the comments, then realised that I typically use it in conjunction with a rather more substantial macro, one that wasn’t going to sit well in someone else’s blog comments.

Here’s my CellSplit(), which is just a wrapper for the VBA Split() function:

Public Function CellSplit(celValue As String, delim As String) As Variant
    CellSplit = Split(celValue, delim)
End Function
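To use it on a sheet, you’d select a run of cells across a row – three of them, say – type something like =CellSplit(A1,"|") and confirm with Control-Shift-Enter so it goes in as an array formula (the cell and delimiter are invented for illustration).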

On its own, it’s moderately useful, but unless you know how many parts your input string will split into, you’re going to be fooling about with range resizing. In these dark days there’s no time for fooling about – we need a tool to get that resizing done for us in One Click.

Enter a toolbar (sorry, RibbonX) button, assigned to the Range Resize Wizard…

Public Sub RangeResizeWizard()

Dim result As Variant
Dim fmla As String
Dim rng As Range
Dim targetRows As Long
Dim targetCols As Long

On Error GoTo Catch

Set rng = Selection
If IsEmpty(rng) Then Exit Sub

If rng.HasArray Then
  Set rng = rng.CurrentArray
  fmla = rng.FormulaArray
ElseIf rng.Rows.Count = 1 And rng.Columns.Count = 1 Then
  fmla = rng.Formula
Else
  Exit Sub
End If

result = Evaluate(fmla)

With rng
  .ClearContents
  If IsArray(result) Then
    If NumberOfDimensions(result) = 2 Then
      targetRows = UBound(result, 1) - LBound(result, 1) + 1
      targetCols = UBound(result, 2) - LBound(result, 2) + 1
    Else
      targetRows = 1
      targetCols = UBound(result, 1) - LBound(result, 1) + 1
    End If
    On Error GoTo RestoreFormula
    .Resize(targetRows, targetCols).FormulaArray = fmla
    On Error GoTo Catch
  Else
    .Formula = fmla
  End If
End With

Finally:
  On Error GoTo 0
  Exit Sub

Catch:
  Debug.Print Err.Description
  Resume Finally

RestoreFormula:
  Debug.Print Err.Description
  rng.FormulaArray = fmla
  Resume Finally
End Sub

Function NumberOfDimensions(arr As Variant)

Dim dimensions As Long
Dim junk As Long

On Error GoTo FinalDimension

For dimensions = 1 To 60000
  junk = LBound(arr, dimensions)
Next
Exit Function

FinalDimension:
  NumberOfDimensions = dimensions - 1
End Function

Note that the array formula in question will be evaluated twice: once to determine the dimensions of the output array and again when the formula is pushed back into the new, perfectly-sized range. Further, the routine assumes you haven’t left anything important lying around where it can get overwritten. So it’s to be used with care, but if you spend much time with array formulae, you’ll probably find it as useful as I do (if you haven’t already written it yourself).

Note further that it’s not really a “wizard” in the programming sense, in that it doesn’t do any of that walking-you-through-steps thing. It’s just a bit magic.

My (Im)perfect Cousin?

in which we start to worry about the source of our inspiration
Mona Lisa Vito: So what’s your problem?
Vinny Gambini: My problem is, I wanted to win my first case without any help from anybody.
Lisa: Well, I guess that plan’s moot.
Vinny: Yeah.
Lisa: You know, this could be a sign of things to come. You win all your cases, but with somebody else’s help. Right? You win case after case, and then afterwards, you have to go up to somebody and you have to say “thank you”! Oh my God, what a fuckin’ nightmare!


It is one of the all-time great movies, and netted Marisa Tomei an Oscar in the process. Yes it is. It really is1.

Not only that, but My Cousin Vinny2 throws up parallels in real life all the time. Yes it does. It really does3.

Why, only recently, I was puzzling over the best (or least worst) way to implement a particularly nonsensical requirement for an intransigent client. After summarising the various unpalatable options in an email, a reply arrived from a generally unproductive source. The message content made it obvious that he’d somewhat missed the point, but the conclusion he drew from that misunderstanding triggered a new thought process that gave us a new, even less, er, worser solution to our problem.

Sadly, my unwitting muse has moved on now, but he left his mark for all time4 on our latest product. I suppose he should also take partial credit for the creation of a hitherto unknown development methodology: Powerpoint-Driven Development, but that’s a story for another day.


1 All right, IMHO
2 See also My Cousin Vinny At Work, application of quotes therefrom
3 YMMV.
4 Or at least until we have a better idea and change the whole damn thing

Pin Me Up, Pin Me Down

(in which we return to VBA for a Small But Useful macro-writing session)

My friend Jens comes up with interesting questions. Even better, he doesn’t come up with them too often.

Today’s poser refers to Excel 2007, a technological marvel that has only reached our work PCs relatively recently. Jens had discovered the Usefulness that is the “pin” tool on the “Recent Documents” list:

So the pinned documents don’t go away, but they do move around on the list, rising up when used, drifting down as lesser workbooks are opened. What Jens wanted was to have his pinned files stay at the top. Or the bottom. Either, really, providing they hung around together.

We couldn’t find anything in Options, Advanced or otherwise, and some Googling didn’t turn up much of use1 and at time of writing (admittedly not too long ago) I hadn’t got any useful feedback on Stackoverflow. Time to apply a different set of talents.

Excel has Application.RecentFiles, which does pretty much what it suggests. Even better, the “.Add” method, when called with a file that’s already on the list, promotes that file to the top. If we could call that method for the pinned files, we’d be laughing. If we could identify them. It turns out we can.

Excel has to store the list somewhere so it can find it when restarting. Where do “modern” Windows programs store that kind of information? The registry2. Specifically, “HKEY_CURRENT_USER\Software\Microsoft\Office\12.0\Excel\File MRU”, where MRU stands, most likely, for Most Recently Used. My registry has 50 entries here, probably reflecting that Advanced Excel Options allows me to display up to 50 documents, screen real estate permitting.

So looking at the entries, we see that there’s one string for each of our MRU entries and that each string has some structure to it:

That first part, “[F00000001]”, is particularly interesting: the last digit is a “1” for all the files we’ve marked as pinned. So we can identify pinned files. All we need now is to be able to read the registry from VBA. We can’t use “GetSetting” because it’s limited in scope (it only reads the VB and VBA Program Settings branch of the registry). More Googling and we discover that the Windows Scripting Host knows about registries, so we add a reference to “Windows Script Host Object Model” in order to be able to early-bind to WshShell.

So here’s the first attempt. Probably heaps of unhandled situations, but right now it works on my machine…

Public Sub PromotePinnedFiles()

Const KEY_ROOT = "HKEY_CURRENT_USER\Software\Microsoft\Office\12.0\Excel\File MRU\Item "

Dim itemIndex As Long
Dim itemData As String
Dim itemPath As String

On Error Resume Next ' general-purpose error-ignoring

With New WshShell ' Or use CreateObject("WScript.Shell") if reference not set
 For itemIndex = 1 To 50 ' I think 50 is the most Excel will store
  itemData = .RegRead(KEY_ROOT & itemIndex) ' get the item (ignoring errors if fewer than 50 present)
  If Len(itemData) > 0 Then
   If InStr(itemData, "*") > 0 Then
    If Mid(itemData, 10, 1) = "1" Then
     itemPath = Mid(itemData, InStr(itemData, "*") + 1)
     Application.RecentFiles.Add itemPath ' "promotes" the item to #1 in the list
    End If
   End If
  End If
 Next
End With

End Sub

The effect of which is this:

It just remains to decide how best to package it for my learned colleague.
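One obvious packaging option (an assumption on my part – nothing’s decided yet) would be to drop the routine into PERSONAL.XLSB or a small add-in and have it run when Excel starts, something like:

' In the ThisWorkbook module of PERSONAL.XLSB (or the add-in)
Private Sub Workbook_Open()
    PromotePinnedFiles
End Sub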


1 Inevitably, there will be any number of comments telling me precisely where I should have looked.
2 Readers who answered “INI files” are invited to remove themselves to the back of the classroom, where the machines are still running Excel 3.0.

This Wheel Goes To Eleven

(in which we make an unexpected connection regarding the D in SOLID and get all hot under the collar about it)

Let’s not beat about the bush: I think I may have reinvented Dependency Injection. While it looks rather casual, stated like that, I’ve actually spent much of the last six months doing it. (Were you wondering? Well, that.)

I’ve been designing/building/testing/ripping apart/putting back together again a library/app/framework/tool thing that allows us to assemble an asset allocation algorithm for each of our ten or so products1, each of which may have been modified at various times since inception. It’s been interesting and not a little fun, plus I’ve been climbing the C# learning curve (through the three versions shipped since my last serious exposure) like Chris Bonnington on amphetamines.

Our products are broadly similar but differ in detail in some places. So there’s lots of potential for reuse, but no real hierarchy (if you can see a hierarchy in the little chart here, trust me, it’s not real).

So Product A needs features 1, 2 & 3, in that order. B needs 1 & 4, C 1, 3 & 5, etc. What I came up with was to encapsulate each feature in a class, each class implementing a common interface. Call it IFeature or some such. At run-time, I can feed my program an XML file (or something less ghastly, perhaps) that says which classes I need (and potentially the assemblies in which they may be found), applying the wonder that is System.Reflection to load the specified assemblies and create instances of the classes I need, storing them in, for example, a List<IFeature>. To run my algorithm, all I need to do is call the method defined in my interface on each object in turn. A different product, or a new version of an existing one, has a different specification and it Should Just Work.

It’s all very exciting.

So changing a single feature of an existing product means writing one new class that implements the standard interface and pointing the product definition at the library that contains the new class (which may – should – be different from those already in use).

The discerning reader may, er, discern that there are elements of Strategy and Command patterns in here as well. Aren’t we modern?

While all this is very exciting (to me at least – a profound and disturbing symptom of work-life imbalance) it’s still not the end of the line. I’ve built functions and then chosen to access them serially, relying on careful (or tricky and tedious) XML definitions to dictate sequence. I’m thinking that I can go a long way further into declarative/functional territory, possibly gaining quite a bit. And there’s a whole world of Dynamic to be accessed, plus Excel and C++ interfaces of varying degrees of sexiness to be devised.

More on much of that when I understand it well enough to say something.


1 There are billions at stake, here, billions I tell you.

All You Wanna Do Is Jaw-Jaw

(in which we apply our face, at speed, to the brick wall of Enterprise IT Development Methodologies)

I am, for my sins, embroiled in a Project with our IT people. I assume that some Holy Standards Manual1 somewhere must decree that in order to transition from stage 0.7.4.1.a to stage 0.7.4.1.b2 there is a requirement to develop and publish a “Communication Plan”, such a document having also assaulted my inbox this week. That’s an eight-slide Powerpoint presentation, adorned with pithy quotes and charmingly apposite images. More Powerpointedness than I’ve personally produced in my life, for content that could adequately be summarised in a four-sentence email. Three, if you worked at it. Something like this:

  1. There will be a weekly meeting on Thursdays at 1400 to review status and issues. (There are a couple more, but they don’t have general applicability, so let’s not bother everybody with those.)
  2. Everybody should talk to each other as often as necessary to ensure that lack of information does not hold up progress. (I’m paraphrasing wildly here, attempting to extract some sense from two ludicrously complicated diagrams.)
  3. Documents will be stored on the Sharepoint server that everybody already uses.
  4. Communication will be in English except where everyone involved speaks German and no non-German speaker will need to read/listen.

There. I did it in four without even trying that hard. Took about five minutes. No pictures though. Sorry about that.


1 With no apparent irony, a document received this week declares that the Project is “following a strict Waterfall approach.”
2 The stages may be more fine-grained than that – I don’t know and I don’t want to know.

Ghosts in the Machines

in which we consider the death of Social Networking? How Seasonal…

The world is Socially Networked these days, or at least a seriously chunky proportion of the interwebular world is. I’ll admit to being present (to a greater, or more likely lesser extent) on Facebook, Friends Reunited, Twitter, LiveJournal and to having at least registered on a good few others.

Beyond those, there are places like WordPress, Stackoverflow (and sundry other StackExchange sites), Blogger, Opera, Google, Yahoo! etc etc ad nauseam. You’re reading this so it’s likely that you’re aware of most, if not all of this stuff.

As well as grumpy, I’m also old. Well, by the dictionary I appear to be in the first blush of middle-age, but when you reach the stage in life when all your relatives from the previous generation have died and your peers are starting to shuffle off this mortal coil then you’re at least beginning to be aware – uncomfortably so – of mortality.

So right now I’m all over the Internet. But in fifty years it’s statistically almost certain that I won’t be actively contributing any more.

People die.

More pertinently to my musings here, people on Facebook die. Take the ten countries with the highest estimated Facebook populations, assume their age distributions are similar to the general population (obviously they’re not, but stay with me), apply mortality rates from the 2009 CIA World Factbook, and we get this:

Country      FB Pop (m)   Deaths/1,000/yr   FB deaths/yr
USA               145.3         8.38           1,217,614
Indonesia          31.4         6.26             196,564
UK                 28.8        10.02             288,576
Turkey             23.8         6.10             145,180
France             20.3         8.56             173,768
Philippines        18.8         5.10              95,880
Mexico             17.8         4.80              85,440
Italy              17.6        10.72             188,672
Canada             17.4         7.74             134,676
India              16.5         6.23             102,795
TOTAL                                          2,629,165

…or about 2.6 million Facebook users dying each year (each row is simply population times crude death rate: for the USA, 145.3 million × 8.38 per 1,000 ≈ 1.22 million). Obviously the estimate is on the low side; there are only ten countries represented, for one thing. What are you going to do when all your friends are dead?

Add in all the other sites and we’re looking at a small-to-medium-sized country’s worth of “ghost” accounts, persisting – and inviting interaction, even.

I’m not offering any solutions – heck, I’m not even suggesting that this is even a Bad Thing – perhaps it’s as close to life after death as we’ll ever get. I just keep extrapolating to some inflection point where half the content on the Internet was written by people who are now dead.

So if I stop posting here, is it because I’m bored, out of things to write about or something more … terminal? And if the latter, how will you know?