Grumpy Old Programmer

He’s old, he’s a programmer and he’s grumpy

Johnny 99

with one comment

Sometimes I write legacy code. There, I’ve said it. The secret’s out, the dirty laundry’s aired and the cat’s out of Pandora’s box.

I don’t think it makes me a bad person. I understand the value of tests, especially in the highly incremental development world I currently inhabit, and I strive to use tests to drive my code. Sometimes it doesn’t happen, for many reasons, none of which I’m proud of, such as my Excel test framework being a little clumsy, there not being anyone around me to nag, the “quick” change that turns into just a little more uncovered code than I expected (but it works and the user needs it Right Now).

I understand that all that code sitting in the wild in an uncovered state is Legacy and represents an accumulation of Technical Debt that will have to be repaid. By me.

It’s not that code without tests is necessarily bad. I mean, heck, even Kent Beck sometimes flies without a safety net. If it’s write-once-and-throw-away code, then there’s an argument for just getting it done. But honestly, how often does that code get resurrected months, maybe years later? It’s when code needs to be changed, often requiring some refactoring in the process, that the absence of that warm, covered-by-tests feeling starts to be felt.

Still there, still doing a job, but just try changing it...

Still there, still doing a job, but just try changing it...

The trouble with beginning to repay technical debt is that interest tends to accrue, compounded, at some arbitrary (but almost always positive) rate. So no matter how trivial the original untest-covered change seems to have been, the longer you leave it, the more unpleasantness tends to have accreted around it by the time you come back to it. I suppose there’s always the possibility that you’ll come back to discover a glistening pearl, but in my programming life I’ve never returned to anything other than a thick coating of rust.

Worse than that, getting the encrusted nodule of code under test usually turns out to be painful: it’s seldom structured as it would have been had tests been used in the first place, so unwanted and hard-to-separate interdependencies are rife and the whole thing becomes, well, a bit tricky.

I’d been meaning to buy Michael Feathers‘ “Working Effectively with Legacy Code” for years and recently got around to buying it. Scott Hanselman’s interview last week with Mr Feathers was a serendipitous bonus. The book, even for non-Java or C++ programmers, is excellent, full of practical advice on how to break things up in a way that can let you get tests around the locus of change. In fact, when you see how just plain nasty working with C++ code can be, working with code in any other language starts to look like a breeze.

I wish I’d read the book earlier.

Written by mikewoodhouse

3 July 2009 at 11:20

Posted in Development, design

Tagged with , ,

The Glove Puppet Programmer

leave a comment »

While I prevaricate over the four (!) drafts sitting at various stages of incompletion in my “Posts” view, I’ll take lunchtime out to reminisce about a non-contemporaneous* colleague and his unique status as probably the most expensive programmer I’ve ever had to deal with.

This particular individual, who was either French or French Canadian, had the singular identifying characteristic of a name that was a literal translation of a stuffed children’s TV character from my childhood. This in itself should have been a warning.basil_brush He had created the Credit Risk reporting system that a team of ten expensive contractors, myself including, were labouring feverishly to replace. On his own. As a series of Excel spreadsheets. With extensive VBA macros, coming from no discernible programming background. If you’re wincing, I’m not surprised.

Why am I asserting he was peerless in the cost stakes? Because the application he’d built actually worked. Badly, slowly, requiring regular manual intervention, occasionally obviously inaccurate and more frequently rather less so, but it worked. And we were trying to reverse-engineer it**, distinguish between the right and wrong parts and deliver something that was everything the old “system” was not. Well, almost: we did need our system to work.

This is one of the gems I discovered in the VBA:


    br_OutputWB.Worksheets(bv_sOutPutWSName).Range("A1").offset(br_lCount + 1, 0).value = bv_sSTRATEGY
    br_OutputWB.Worksheets(bv_sOutPutWSName).Range("A1").offset(br_lCount + 1, 1).value = bv_sTRADENAME
    br_OutputWB.Worksheets(bv_sOutPutWSName).Range("A1").offset(br_lCount + 1, 2).value = bv_sTRADESTATUS
    br_OutputWB.Worksheets(bv_sOutPutWSName).Range("A1").offset(br_lCount + 1, 3).value = bv_sStructType
    br_OutputWB.Worksheets(bv_sOutPutWSName).Range("A1").offset(br_lCount + 1, 4).value = bv_sSTRUCTMODEL
    br_OutputWB.Worksheets(bv_sOutPutWSName).Range("A1").offset(br_lCount + 1, 5).value = bv_sTRANCHENAME
    br_OutputWB.Worksheets(bv_sOutPutWSName).Range("A1").offset(br_lCount + 1, 6).value = bv_iTRANCHENUMBER
    br_OutputWB.Worksheets(bv_sOutPutWSName).Range("A1").offset(br_lCount + 1, 7).value = bv_lTRANCHENOTIONAL

    ... are you getting the picture? We continue incrementing the offset for a while, until:

    br_OutputWB.Worksheets(bv_sOutPutWSName).Range("A1").offset(br_lCount + 1, 29).value = bv_sCOMMENTDEFAULT
    br_OutputWB.Worksheets(bv_sOutPutWSName).Range("A1").offset(br_lCount + 1, 30).value = bv_sNEWTRANSID
    br_OutputWB.Worksheets(bv_sOutPutWSName).Range("A1").offset(br_lCount + 1, 31).value = bv_sCOMMENT

In case you were wondering, the “bv_” prefix signifies “By Value.” And yes, that does mean that each of those values appeared as parameter to the routine. I counted 32 in total. That’s thirty-two input parameters. You really should be wincing by now.

gogglesSeveral other routines also had the same (or 99% the same) code blocks. Our friend the cut-and-paste programmer must have really earned his money that day…

Oh yes, you should see an example of how our chum called his parameter-heavy subroutines. Consider acquiring welding goggles before looking at this, by the way.


Call SS_BuildMTMByTrade(br_OutputWB:=OutputWB, br_lCount:=lMMcount_SS, bv_dbFXrate:=GetFXRate(vMM_FXRates, RgCursor.offset(lOffset, Range("TRANCHECCY").Column - 1).value), _
bv_sSourceFile:=m_sSS_MM_Source & IIf(bv_sTradeNamePrefix <> "", bv_sTradeNamePrefix & RgCursor.offset(lOffset, Range("TRADENAME").Column - 1).value, RgCursor.offset(lOffset, Range("TRADENAME").Column - 1).value) & "_" & "MM" & "_" & RgCursor.offset(lOffset, Range("MM_CSB").Column - 1).value & "_" & RgCursor.offset(lOffset, Range("MM_R").Column - 1).value & "_" & IIf(RgCursor.offset(lOffset, Range("MM_PW").Column - 1).value, "T", "F") & "_" & IIf(RgCursor.offset(lOffset, Range("MM_SCS").Column - 1).value, "T", "F") & "_" & Format(m_dSS_MM_Asof, "yyyymmdd") & ".xls", bv_dbParticipationFactor:=RgCursor.offset(lOffset, Range("DEALNOTIONAL").Column - 1).value / RgCursor.offset(lOffset, Range("TRANCHENOTIONAL").Column - 1).value, bv_iShift:=RgCursor.offset(lOffset, Range("MM_R").Column - 1).value, bv_bIgnoreFirstCprty:=CBool(RgCursor.offset(lOffset, Range("IGNORECPTY").Column - 1).value), _
bv_dbSpread:=RgCursor.offset(lOffset, Range("DEALSPREADBP").Column - 1).value, _
bv_bUsePVFormula:=RgCursor.offset(lOffset, Range("USEPVFORMULA").Column - 1).value, _
bv_sSTRATEGY:=RgCursor.offset(lOffset, Range("STRATEGY").Column - 1).value, _
bv_sTRADENAME:=IIf(bv_sTradeNamePrefix <> "", bv_sTradeNamePrefix & RgCursor.offset(lOffset, Range("TRADENAME").Column - 1).value, RgCursor.offset(lOffset, Range("TRADENAME").Column - 1).value), _
bv_sTRADESTATUS:=RgCursor.offset(lOffset, Range("TRADESTATUS").Column - 1).value, _
bv_sStructType:=RgCursor.offset(lOffset, Range("STRUCTTYPE").Column - 1).value, bv_sSTRUCTMODEL:=RgCursor.offset(lOffset, Range("STRUCTMODEL").Column - 1).value, _
bv_sTRANCHENAME:=RgCursor.offset(lOffset, Range("TRANCHENAME").Column - 1).value, bv_iTRANCHENUMBER:=RgCursor.offset(lOffset, Range("TRANCHENUMBER").Column - 1).value, bv_lTRANCHENOTIONAL:=RgCursor.offset(lOffset, Range("TRANCHENOTIONAL").Column - 1).value, bv_sTRANCHESUBORDINATION:=RgCursor.offset(lOffset, Range("TRANCHESUBORDINATION").Column - 1).value, bv_sTRANCHECCY:=RgCursor.offset(lOffset, Range("TRANCHECCY").Column - 1).value, bv_dtTRANCHEMATURITY:=RgCursor.offset(lOffset, Range("TRANCHEMATURITY").Column - 1).value, _
bv_sBUYSELL:=RgCursor.offset(lOffset, Range("BUYSELL").Column - 1).value, _
bv_lDEALNOTIONAL:=RgCursor.offset(lOffset, Range("DEALNOTIONAL").Column - 1).value, _
bv_iDEALSPREADBP:=RgCursor.offset(lOffset, Range("DEALSPREADBP").Column - 1).value, _
bv_sUPFRONTFEE:=RgCursor.offset(lOffset, Range("UPFRONTFEE").Column - 1).value, _
bv_sCOUNTERPARTY:=RgCursor.offset(lOffset, Range("COUNTERPARTY").Column - 1).value, _
bv_dtTRADEDATE:=RgCursor.offset(lOffset, Range("TRADEDATE").Column - 1).value, _
bv_dtSETTLMTDATE:=RgCursor.offset(lOffset, Range("SETTLMTDATE").Column - 1).value, _
bv_sREPACKVEHICLE:=RgCursor.offset(lOffset, Range("REPACKVEHICLE").Column - 1).value, _
bv_sCOMMENT:=RgCursor.offset(lOffset, Range("COMMENT").Column - 1).value, _
bv_lADRNOTIONAL:=RgCursor.offset(lOffset, Range("ADRN").Column - 1).value, _
bv_sCOMMENTDEFAULT:=RgCursor.offset(lOffset, Range("COMMENTDEFAULT").Column - 1).value, _
bv_sNEWTRANSID:=RgCursor.offset(lOffset, Range("NEWTRANSID").Column - 1).value, _
bv_sOutPutWSName:=MMTEMPSHEETNAME, _
bv_BookName:=RgCursor.offset(lOffset, Range("BOOKNAME").Column - 1).value, bv_UniqueID:=RgCursor.offset(lOffset, Range("UNIQUEID").Column - 1).value, bv_KMVCorr:=RgCursor.offset(lOffset, Range("KMVCORR").Column - 1).value, bv_MarketCorr:=RgCursor.offset(lOffset, Range("MARKETCORR").Column - 1).value, bv_RunFlag:=RgCursor.offset(lOffset, Range("RUNLISTFLAG").Column - 1).value, bv_DataSource:=RgCursor.offset(lOffset, Range("SOURCETYPE").Column - 1).value)

The second line is about 900 characters long. Nine. Hundred. If you’re starting to want to make it go away, perhaps by clawing out your eyeballs, I don’t blame you.

Finally, for dessert, if you will, or maybe like one of those hot steamy towels presented at the end of some Indian meals, here’s our friend’s take on copying an array:


Public Sub CopyArray(ByVal bv_FromArray As Variant, ByRef br_ToArray() As Variant)
    Dim l As Long
    Dim i As Integer

    For l = LBound(bv_FromArray, 2) To UBound(bv_FromArray, 2)
        ReDim Preserve br_ToArray(1, l)
        For i = LBound(bv_FromArray, 1) To UBound(bv_FromArray, 1)
        Debug.Print bv_FromArray(i, l)

            br_ToArray(i, l) = bv_FromArray(i, l)
        Next
    Next

End Sub

From the blank line following the “Debug.Print” line I surmise that some operation may, in the distant past, have been occurring that made the routine more than a criminal waste of electricity. I hope the use of lower-case “L” is a soothing balm to your eyes, as is the continuous “ReDim Preserve” of the outer loop…

I don’t know the final cost of the project that replaced this, I elected not to renew my contract after nine months. I’d guess it didn’t exceed $10 million, not counting the possible losses incurred from trading on inaccurate information in the several years the “system” was live.

I’d call that dangerous.


*we worked at the same place but not at the same time, if that makes any sense

** specs? Don’t be ridiculous.

Written by mikewoodhouse

12 June 2009 at 13:38

harrylillis.com would probably have been cheaper

leave a comment »

Putting 2 and 2 together, Jeff Atwood appears to have paid* a fairly large (to me) sum to acquire the superuser.com domain. I wonder how much Microsoft paid for bing.com?

I switched my default search engine to Microsoft’s new beta search engine yesterday. Today I switched back to Google. Not that bing was all that bad – to be honest I couldn’t see much difference between what it gave me and what I see from Google. The background picture, which I guessed was of some Greek island yesterday (it’s somewhere different, but similarly attractive today) was certainly pleasant.

The killer was that after Firefox (3.0.10) reported that the page load was “Done” (and the results certainly seemed to be present) there was a delay – during which time FF froze completely – of about 12 seconds, after which my browser shook itself and woke up.

I don't need to search to find Microsoft being annoying...

I don't need to search to find Microsoft being annoying...

Bing could be the best search engine in the world ever and I’d still not use it if that delay were present. I can’t believe it occurs for all users for all browsers but I’m only me and I prefer FireFox. It could be some interaction with one or more of my – fairly standard – plugins. Maybe I’ll try it again when the beta is done.

Your mileage, of course, may vary.

* Unless of course he was tweeting about another one, which is entirely possible.

Written by mikewoodhouse

2 June 2009 at 10:13

Posted in Web technology

Tagged with , , ,

Version control for Excel workbooks, Part 2

with 2 comments

(stackoverflow rep: 4824, Project Euler 72/241 complete)

Not a month ago (about seven , actually) I started to gibber about source code version control and Excel. Then Stuff got In The Way and I was derailed. Well, Ruby-on-Railed, really – I’ve been almost totally immersed in intranet development and related matters and mostly having lots of fun.

exportThen along came Internal Audit (shudder). Some degree of consternation appeared to exist about the potential (albeit slight) vulnerability of our Source Control solutions (SourceSafe for Visual Studio and Excel stuff, subversion for the webby bits). Combined with the newly-available – and rather sexy – new enterprise SourceForge that had materialised, we bit the bullet and set out to put everything into the Company-Approved Solution. Which dragged me seven months into the past to ask again – how does one diff an XLS?

That’s not a rhetorical question, I do have a sort of answer. I’m thinking along the following lines:

  • Some VBA component-extraction code in an add-in
  • Perform the export on a Save or Close
  • Save VBA references as a separate text file

I’m not going to get into dumping worksheet content, although there are arguments for doing so.

Excel exposes the handy WorkbookBeforeClose and WorkbookBeforeSave events on the Application object. We’ll need an Application object and it’ll have to be declared WithEvents, so we need a class. I’m going to call it WorkbookDecomposer:


Option Explicit

Public wkbk As Workbook
Public WithEvents xlapp As Application

Private Sub xlapp_WorkbookBeforeSave(ByVal Wb As Workbook, ByVal SaveAsUI As Boolean, Cancel As Boolean)
    Set wkbk = Wb
    Decompose
End Sub

(We’ll have the same code for the Close event). Each component can be exported, and the references can be dealt with something like this:


Public Sub ExportReferences()

Dim ref As Reference

    With New FileSystemObject
        With .OpenTextFile(.BuildPath(wkbk.path, wkbk.Name & ".references"), ForWriting, True)
            For Each ref In wkbk.VBProject.References
                .WriteLine RefToString(ref)
            Next
        End With
    End With

End Sub

To start the thing, we can have a module with a little Sub such as:


Option Explicit
Option Private Module

Public decomposer As WorkbookDecomposer

Public Sub StartDecomposer()
        Set decomposer = New WorkbookDecomposer
        Set decomposer.xlapp = Application
End Sub

Now, as long as we don’t lose state, running this code once will give us an object that will automatically export copies of our code for storing separately in our code control system. So we can diff. The decomposer can be extended to shell out useful source control instructions such as commit, perhaps automatically, perhaps not.

I don’t like that “as long as we don’t lose state” thing. When we’re coding in VBA we lose state all the time. The best I can come up with is to use an OnTime macro to keep trying to restart the decomposer object:


    If decomposer Is Nothing Then
        Set decomposer = New WorkbookDecomposer
        Set decomposer.xlapp = Application
        Debug.Print "Decomposer (re)started @ " & Now
    End If
    Application.OnTime Now + TimeSerial(0, 0, 10), "StartDecomposer"

It sort of works, but it’s potentially intrusive when we have incomplete/invalid code (hardly unlikely in the scenario under consideration) and the OnTime macro may fail to fire for many reasons.

What are the alternatives? Custom external scripts? An external process that monitors changes on XLS files and auto-exports?

Written by mikewoodhouse

20 April 2009 at 16:28

Fifty Candles (each)

with one comment

(stackoverflow rep: 4675, Project Euler 72/240 complete)

The first programming language I learned was a peculiar version of BASIC, running on the ICL mainframe installed during my abbreviated university career. I seem to recall it used a magnetic drum as its primary storage device. My second programming language, and the first I was ever paid real money for working with, was COBOL. It was the lingua franca of business computing, had been around forever and I started to learn it early in 1979. It turns out that “ancient” old COBOL had at the time only been around for about 20 years. As indeed had I, and this year we both turn 50.

Ah, the fun we had!

Ah, the fun we had!

When the only tool you have is a hammer, everything looks like a nail. (I remember vividly a school “handyman”, inexplicably nicknamed “Sausage”, who would repair desks by applying a large hammer to drive screws). I used COBOL for things to which it really wasn’t suited. For a couple of years, that didn’t just mean writing peculiar code, it meant punching cards on an IBM 029 punch machine.

Programming for work was done on paper coding sheets, converted into machine-readable format (80-column punched cards) two evenings a week by Hazel the Punch Girl. Times change.

Once upon a time all code was written on these

Once upon a time all code was written on these

It wasn’t as bad as you might think: we only got to compile or run our code twice a day anyway, the rest of the time involved pencils and paper. Lots of paper. Some things change less than others.

Anyway, designing, coding, compiling and testing/debugging a program was a mammoth task: it took months. Lots of months. I think in the three years I was a programmer in my first job I wrote about eight complete programs.

Virgin input to the 029...

Virgin input to the 029...

In all, I was primarily a COBOL programmer for about 12 years, although there were secondary activities in PL/1, Fortran and C in the same period. I haven’t written a line since some time in early 1990. Can’t say that I miss it, not even now that it has object-orientation, a scary concept for a language that didn’t even use to have data scoping below “global”.

So happy birthday to COBOL, whenever it falls during the year. I can’t say I miss you but you paid the bills for over a decade, and for that I’ll always be grateful.

Written by mikewoodhouse

15 April 2009 at 13:41

What Would Proudhon Do?

leave a comment »

(stackoverflow rep: 3958, Project Euler 66/236 complete)

Dick Kusleika wrote about some of his personal code construction methods in Daily Dose of Excel recently. Something that caught my eye was this:

I generally end up with a lot of Property statements that do nothing but read and write to a module-level variable.

Probably fairly familiar, goodness knows I’ve written enough of those, and I never used MZ-Tools to automate the process, either. So I was nodding away as I read, muttering “amen brother” and whatnot, when I had a vicious attack of the Whys.

Why do we write a Property Set/Let/Get just to set, let or get a member variable? What’s a Property for, anyway?

I think a Property should probably only be used to assign a value when some processing needs to occur other than simply setting the member variable. Code such as:


Private mName As String
Public Property Let Name(value As String)
    mName = value
End Property
Serendipitously, I just completed Project Euler problem 84...

Serendipitously, I just completed Project Euler problem 84...

…just doesn’t seem to be adding any value. Rename the member variable, make it Public and move on, secure in the knowledge that you just saved some keystrokes and reduced your carbon footprint (well, maybe) by some infinitesimal fraction of a gram. The world thanks you.

Now if you need to validate the input in some way, a Property Let may be the ideal place to do so. It may not, as well, but that’s not important right now.

What else? What if you need something to happen, other than validation, when you set a property? Here’s where I’d suggest that you’re doing something wrong: I’m not a big fan of hidden side-effects (British understatement warning) – I have too many scars from self-inflicted wounds from doing that. If setting a property causes some activity or state-change, then the code should probably be refactored into a Sub with a suitably informative name.

I suppose write-only properties need a Let/Set (so that we can ensure no Get is written) but how often do we need that?

Otherwise, let’s just access the member variable directly without the indirection (think of the planet).

Getting.

Honestly, how different is a Property Get from a Function?

Given a class defined thus:


Public ws As Worksheet

Public Property Get Wksh() As Worksheet
    Set Wksh = ws
End Property

Public Function WkshF() As Worksheet
    Set Wksh = ws
End Function

… what are the differences between the three Sets here:


Public Sub GetWksh()
Dim wsh As Worksheet
    With New Class1
        Set wsh = .ws
        Set wsh = .Wksh
        Set wsh = .WkshF
    End With
End Sub

I submit, Your Honours, that the only time a Property Get makes much sense over a Function is when there’s no pre- or (ugh) post-processing around the accessing of the member variable. In which case, why bother?

Anyway, Proudhon wouldn’t have liked Properties.

Written by mikewoodhouse

17 March 2009 at 15:11

Posted in Development, Excel

Tagged with , ,

Round-tripper

leave a comment »

(stackoverflow rep: 3856, Project Euler 63/235 complete)

Good grief. I wrote the first draft of this about a month ago, planning on completing and posting it when the code was done. I expected that to take a few more days. A little more work required on the estimating front, then.

I’m starting to go off Oracle.

Let me put that into context a little. I first encountered Oracle some time in 1998, when version 5 was all the rage. I’d actually taught data analysis, third and fifth normal form, stuff like that for a few years previously but actual hands-on table creation had to wait. Strange but true. Anyway, over the next two or three years, some of which I spent as “Technical Architect” for the investment bank where I worked, I got to be something of a whiz with both version 5 and the swanky new version 6. Heck, I know the query optimiser’s rules off by heart. I’m not just blowing my own trumpet, mind: when I was untimately (fortuitous typo retained) laid off, I was offered a job by Oracle, which I rejected because I didn’t want to take a pay cut.

I spent five more years in Oracle-land with another bank before drifting into the realms of Sybase in its early MS SQL Server guise, and then Sybase itself across three jobs and four years (it seems like longer). Now, fourteen years after we parted company, Oracle and I are back together.

But we’ve both changed. I no longer code in COBOL and have acquired a pathological dislike of business logic in the database. Oracle has a cost-based optimiser, loves to grab all your business rules (more processors = more revenue) and has become a fat bloated porcine creation. Even the free “personal” 10g Express Edition for Windows is a 165MB download.  (OK, SQL Server Express 2008 is even larger, I checked). When running, the thing takes out a 642MB virtual machine. OK, it’s almost entirely swapped out, but still.

How we did parallel processing in the old days

How we did parallel processing in the old days

But Oracle is still a helluva fast platform. Unoptimised I was seeing about 8K inserts a minute on my development PC, three times that on a real server. Unfortunately our db server currently lives abroad for tax reasons (or something) and the network latency is fierce. About 900 inserts a minute fierce. So I needed to batch up my inserts or enter the living hell that is SQL Loader.

In order to get multiple insert processes working within my Ruby On Rails-based code, I split each file into several fragments, then run a separate process on each fragment. This takes a bit of doing, generating lots of CMD files that run ruby scripts with “START [/WAIT] CMD /C whatever_I_want_goes_here“.

My file-splitting code, I thought, was rather spiffy – it needs to put the headings from the original to each fragment (because they’re used to figure out what’s in the file) then it starts dealing out the records:


def create_fragment_files(paths)
  File.open(file_path, 'r') do |fin|
    hdgs = fin.readline.chomp
      files = paths.map { |path| File.open(path, 'w+') }
      files.each { |fout| fout.puts hdgs }
      fin.each_line do |line|
        files.first.puts line
        files.push files.shift # the first shall be last...
      end
    files.each { |fout| fout.close }
  end
end

There are faster ways, I’m sure – I could calculate the “ideal” file size and dump records into a file until it’s reached, but this is fast enough (well under a minute for an 85MB file) and it pleases me.

There’s a handy little library, ar-extensions, that makes batching of inserts possible within ActiveRecord (which is the default data mapping library within Rails). It works nicely with MySQL, but turned out to have the Oracle code stubbed and invalid. It only took me a day or two to find a solution to that problem, although I still haven’t figured out how to push an update through a proxy server to github. Finally a chance to do something open sourceful, and I’m thwarted at every turn.

So all in all, it’s taken a month. OK, a month in which a lot of other stuff got done, but still.On the plus side, I just fired it up and I’m watching about 36,000 inserts a minute go through. It’ll be faster when the lookup tables are fully populated. (Another day on, and I’m looking at it: 46,000 – and I still have a few tricks up my sleeve)

While the nearly-two years’ of data is backfilling I now get to rewrite the front end.

And the point of this post? In no small part, to remind me of what I actually spent the lion’s share of the last month doing. Also, to record my first-ever open-source contribution, even if I still haven’t worked out how to get my source out into the open.

If you have been, thanks for your forebearance.

Written by mikewoodhouse

12 March 2009 at 13:14

Lazing On A Sunny Afternoon

with 2 comments

(stackoverflow rep: 2906, Project Euler 59/230 complete)

Below (lightly edited) is a recent answer to a question on StackOverflow. The question is pretty much a waste of time, but what an answer!

Laziness is indeed the first of the Three Programming Virtues, but it is misunderstood. Programming Perl defines it well:

* LAZINESS: The quality that makes you go to great effort to reduce overall energy expenditure.

Good programming calls for laziness, but laziness requires hard work. Good programmers must constantly think of and implement new ways to be lazy. The first compiler had to be written in assembly, and the first assembler had to be written in machine language. Wonderfully lazy, but hard. You don’t get to call it a day after an hour just because what used to take a day takes an hour.

That’s about as close to an encapsulation of my personal programming philosophy as I’ve ever seen. The virtue of Laziness is closely related, if not identical to the DRY (Don’t Repeat Yourself) Principle as espoused by the Pragmatic Programmers in The Pragmatic Programmer.

(I just found myself wondering if there’s a way to refactor that last sentence to remove some of the duplication)

Laziness - all I have to do is this...

Laziness - all I have to do is this...

The idea behind xlUnit (which has been languishing somewhat since being codeplexed in, good grief, September) was to minimise the amount of manual repetition involved in testing Excel/VBA code as it is developed. As a useful side-effect, it also made TDD (Test-Driven Design, or Development, depending on your choice of definition) possible. A fair amount of the code complexity of the framework is involved in eliminating repetitive manual tasks such as test class creation.

There are only three significant UI-level entry points in xlUnit as it stands: application creation (used rarely, only about once per app) class creation, used rather more often, and test execution, which is used all the time. Oh, and there’s a little “Options” dialog, but it only has one setting and that’s not really very interesting so we’ll gloss over that.

Since the first two things are used relatively infrequently, I tucked them away on a menu, whereas the test execution, which is an all-the-time thing, gets a toolbar button. I could have been fancier with that, but I never got past the simple text “Run Tests”, which had the extra benefit of giving a larger target area for the mouse.

...and I get this for free.

...and I get this for free.

There is no interface within the VBE, which is most definitely a shortcoming. To be honest, I only even tried briefly (failing, obviously) to get it to work: with a dual screen setup, which my working environment has had for the best part of a decade, both workbook and VBA are usually visible, so I can get to the “Run” button easily enough. For a similar reason, there is no built-in keyboard shortcut at present. Of course, in Excel 2007, it’s all a bit of a mess. I have the (massive) RibbonX book, by the way, I just need the intestinal fortitude to sit down and read it.

The “Create Testable Application” and “Create Testable Class” routines both live in the add-in itself and are located within UserAccessibleEntryPoints.bas. I’ll come back to “Run Tests” another time – not only was it a bit tricky but I have some new ideas that would make it even trickier.

Choosing to create an application causes a throwaway instance of the xlUnitCodeBuilder class to create a workbook and its tester and create references from the tester to both the application (so it can call classes there) and the framework (so it can use the framework). The same class contains the code to create new classes. I think I put the functions together because they’re about building code but I don’t think I’ve achieved very good separation of concerns here: there’s workbook creation and test class creation and they would probably be better off being separated. Since there’s work to do with Excel 2007 anyway, I think I’ll refactor that bit next time it’s open.

I mentioned “throwaway instance” above without explanation: it’s a way I describe this VB pattern:

    With New UsefulClassThatDoesntReallyNeedToBeDimmed
        .DoSomethingProfound
        .DoSomethingSimilarlyDeep "AndMeaningful"
    End With

Is there a more “standard” way to describe this? Something using “anonymous” perhaps? Anyway, I use it a lot (and I’m now fervently hoping that no-one points out some awful risk that I’m running as a consequence) as it’s a way to get part of the economy of class (”static” methods in Java or C#, where the word means something rather different than in VBA) method.

Written by mikewoodhouse

5 February 2009 at 14:20

[Wi|Ga]dgets – what’s the use?

with one comment

(stackoverflow rep: 2539, Project Euler 47/227 complete)

Gadgets? We don’t need no stinking gadgets!

Microsoft, Yahoo! and Google, to name but three proponents, each have a desktop widget (or gadget, or whatever) model, either providing stuff spread out all over your Windows (are OSX and other *nix environments similarly afflicted?) desktop or tucked away in a “Sidebar”. These handy-dandy little applets provide information on such vitally otherwise hard-to-get functions such as:

  • The weather (for those not in sight of a window?);
  • RSS feeds (for those who want to read their feeds in a 2″x1″ window);
  • An e-mail notifier (because you still don’t understand the meaning of ‘asynchronous’);
  • The time (for those who can’t read the little digital clock on their taskbar and who don’t own a watch);
  • A calculator (because the world needs another computer desktop calculator);
  • Resource usage monitors (so you can tell your computer isn’t running slow);
  • A mediocre controller for your media player (because media players don’t have minimised controls … oh, yes they do);
  • Teeny-tiny picture viewers (for when flickr is just too much detail)
  • Out-of-date stock tickers (so you can see in delayed real-time how much poorer you are today)

… you get the idea.

Even better, all these are neatly tucked away on a “sidebar” (a sort of non-window window) or even better, spread out all over the shop, which is exactly what you want on a two- or three-screen setup. And they’re all handily concealed beneath the applications you’re running at any time. You know, those stupid time-wasters like Outlook, Excel, Firefox, Word, Visual Studio, a couple of Explorers and a database utility or two.

Marvellous.

What we seem to have here are a collection of (sometimes) graphically pleasing little applications that deliver functionality available elsewhere, each having one or more of the following drawbacks:

  • Always-present, seldom needed;
  • Inadequate functionality;
  • Duplicate of something that’s already fit-for-purpose;
  • Pointless eye-candy;
  • Invisible.

Unloved

Looking at Yahoo!’s programming category, I find that the most popular has been downloaded 80,000 times. It’s a widget that performs geolocation for a given IP address. With a flag. I’m trying to imagine a situation where I (or anyone) would need that often enough to abandon a browser-based function, opting for a desktop-resident applet against a “proper” application because I don’t need it that much. I don’t exactly see the “programming” connection either, come to think.

Yahoo programming widget downloads as at 12-Jan-2009

Data as at 12-Jan-2009

I’m probably not being fair – I thought the most useful stuff would be written for programmers. What does it look like overall? I can’t tell much from Google’s list because they don’t give download stats, although sorting by popularity shows the expected four C’s (clock, climate, calculator, calendar). So back to Yahoo! where the current Number One, with a snappy 4.5 million downloads, is Yahoo! Weather. In fact, as I write this, a whopping six widgets have passed the million mark. Only two are clocks.

Oh look - Microsoft Sidebar does weather too

Oh look - Microsoft Sidebar does weather too

The highlights of a quick-and-dirty breakdown of the top 100 are 19 fun-and-games, 15 system monitor thingies, 14 clocks, 9 calendars, 8 media players, 7 weather reports, 6 post-it notes, 5 gold rings. Very similar to the Google list. I was too depressed to look at Microsoft’s in any detail, but it’s the same ol’ same ol’, although their weather widget claims over 22 million downloads. I’m guessing it’s downloaded automatically when a Vista PC connects to the Internet…

Number one Google gadget - because you can never have too many clocks

Number one Google gadget - because you can never have too many clocks

I’m not getting it. Looks like many others aren’t either. The numbers of people who come back to provide a rating are miniscule: about 7,000 for the weather app. About 0.16%

Apart from the shocking paucity of imagination in the applets themselves, what’s wrong with the whole idea? (IMHO, of course, YMMV).

Real estate is Precious

There are people out there who have enough monitors to be able to allocate space for widgets. Two 1280×1024 screens isn’t enough for me though. Utility drops to almost zero if the things aren’t always available. Google make things worse by allowing applets to live anywhere on the desktop. Stuff needs to go in a sidebar that manages window maximisation to keep itself visible. So it needs to be on the right- or left-most monitor, unless you’re prepared to give up on dual-screen workbooks. For the average user, the bar needs to be a lot narrower than at present to be tolerable. Somewhere between, say, 25 and 50 pixels? I could live with that.

Useful vs Pretty

Useful doesn’t always win. Useful-but-ugly often doesn’t get past “Go”, whereas Pretty at least gets a chance. Long-term, it’s got to have both: too much of what’s on offer seems to be limited to pretty useless.

To offer a compelling argument against rapid deletion, applets have to either provide something in a better way than is currently available or provide something that isn’t available at all elsewhere . Example: there is a Ruby script that allows, from the command line, simple copying of files to an Amazon S3 bucket. Useful. Maybe we could have a widget (maybe it already exists, but I couldn’t find one) that allows upload via drag-and-drop from Explorer. That would be better than what’s already available. Something I can’t do at all except via cut and paste is store stuff in Google Notebook. A drag-and-drop gadget to simplify that would be providing something I can’t do at all.

Conclusion

I don’t know that I have one. There are several similar implementations of a desktop XML/Javascript applet technology that has a lot of money invested in it. Well, I don’t know how much exactly, but I bet I’d be a happy old programmer if you’d given it all to me instead. And you might as well have done exactly that, for all the benefit mankind appears to be accruing. It ought to be good for something, oughtn’t it?

Written by mikewoodhouse

14 January 2009 at 22:56

Route 55 (and Route 19)

with one comment

(Stackoverflow reputation down to 2232 after they cleaned up some over-voted stuff from the early days, sniff)

While I prevaricate over all kinds of things, including a redesign of the xlUnit interface, I have been enjoying Michael’s series of articles on Project Euler solutions in VBA, posted at Daily Dose of Excel. There are some ingenious solutions to long-standing VB/VBA deficiencies, not least the absence of built-in facilities for handling arbitrarily large numbers beyond double precision variables.

I have an abiding fondness for the “classic” VB family – not least because it was the skill that fed and housed me and my family for a good 15 years or more. But boy, it can look a bit tired these days.

As it happens, I’d been taking a few shots at the Euler problems myself, but in Ruby, since that’s my language of choice these days (not least because much of my working day is currently spent working on intranet applications using Rails). So it was interesting to compare the two.

Let’s take problem 55. I took Michael’s code (with the neat little large number AddAsStrings routine) into Excel on my whizz-bang dual-dual-Xeon machine and it solved the problem in 0.554 seconds, which, considering the amount of string-based arithmetic that’s going on, is a testament to the speed of modern PCs.

Below is my Ruby version, which takes a rather different approach. Firstly, in Ruby we have support for arbitrarily large numbers, via the built-in Bignum class, so the string adding business is taken care of. Secondly, classes in Ruby, even compiled standard ones, are open to modification, via a technique colloquially known as monkey-patching. So I could patch in a method directly to the Integer class, which seems appropriate, since we’re looking for a property of the number.

Here’s the code:

class Integer
  def lychrel?(max = 50)
    temp = self
    max.times do
      temp = temp + temp.to_s.reverse.to_i
      return false if temp.to_s == temp.to_s.reverse
    end
    true
  end
end
puts (1..9999).inject(0) { |t, i| t + (i.lychrel? ? 1 : 0) }

That took 0.410 seconds on the same machine. I can see at least one inefficiency: calling to_s twice on the same number, which is expensive.

On the other hand, VBA has the edge on problem 19. I spotted a little optimisation in Michael’s code, which gave me this in VBA, which is about 20 times faster at 0.0019 seconds than the original:

Dim Start   As Date
Dim Answer  As Long
Start = DateSerial(1901, 1, 1)
Do While Start < DateSerial(2001, 1, 1)
   If Weekday(Start) = vbSunday Then
      Answer = Answer + 1
   End If
   Start = DateSerial(Year(Start), Month(Start) + 1, 1)
Loop
Debug.Print Answer

Ruby’s Date class doesn’t do anything clever with months outside the 1 to 12 range, so I had to inject a little logic, but otherwise we’re pretty much in synch, algorithmically:

d = Date.new(1901,1,1)
end_date = Date.new(2000,12,31)
until d > end_date do
  res += 1 if d.cwday == 7
  y, m = d.year, d.month + 1
  y, m = y + 1, 1 if m > 12
  d = Date.new(y, m, 1)
end
puts res

Not much in it, lines-of-code wise as you’d probably expect, but the Ruby code takes about 0.36 seconds, which is a hell of a difference.

Call it one-all for now.

Written by mikewoodhouse

2 January 2009 at 20:01

Posted in Excel, Ruby

Tagged with , , ,