in which we try to get tricky with Excel and find that Excel has adequate builtin trickiness already, thank you very much indeed
Over the years I’ve had a number of roles that involved fixing, tuning, cleaning and otherwise future and bulletproofing a fair number of Excel workbooks. There’s a whole catalog of potential issues to entertain and amuse but very common is the Extending Input Data Set. (I capitalised that in the hope that I might have christened some kind of Excel antipattern).
Your average Excel user is smart enough to know that some input data may grow over time: they may be copypasting from some other source, grabbing a database query result or even just typing in more values over time. It’s very common to see a rowbyrow enrichment of such data for analysis and the formulae need to be extended to match the rows of input data. In the worst case, users have precopied more rows than they expect ever to need and don’t notice when they’ve started to miss data. More happily, they copy down as they see the data extending. If they see the data extending, that is^{1}.
Helping a colleague to avoid such an unpleasantness recently led to a couple of interesting (to me) solutions. Firstly, we looked at extending the calculations applied to a list of data that we knew would extend every time we ran the spreadsheet model.
We assume we have a defined name, input_data that does what it says. Further, we have output_formulae, which should correspond oneforone with the input_data rows. When input_data changes, we want to automatically extend output_formulae to match. Pretty straightforward, as it turns out:
Private Sub Worksheet_Change(ByVal Target As Range)
If Intersect(Target, Range("input_data")) Is Nothing Then
Exit Sub
End If
ExtendOutputFormulaeAsRequired
End Sub
Sub ExtendOutputFormulaeAsRequired()
Dim inputRowCount As Long, calcRowCount As Long
inputRowCount = Range("input_data").Rows.Count
calcRowCount = Range("output_formulae").Rows.Count
If inputRowCount <= calcRowCount Then
Exit Sub ' enough formula rows already  could reduce if needed, but not done here
End If
With Range("output_formulae")
' Assumes just formulae are needed:
.Offset(calcRowCount, 0).Resize(inputRowCount  calcRowCount, .Columns.Count).Formula = _
.Rows(calcRowCount).Formula
End With
End Sub
That works pretty well. At least, it does if input_data is defined, something we can usually do fairly easily, using a named formula, something like =OFFSET(B2,,,COUNT(B:B),6) (assuming we have 6 rows and there’s a nonnumeric column heading in B1, which in this case there is).
Here’s the wrinkle that gives rise to the second interesting (to me) question, that necessitated a question on Stack Overflow. What if I can’t be sure that all my columns will be the same length? I want the length of the longest column (there won’t be gaps vertically, which is something at least). So in the sample below, there are 7 rows of data in columns B to G, with a heading on row 1, so 7 is the longest column and defines the number of rows in input_data.
Input data – extend me! 
1 
1 
1 
1 
1 
1 
1 
1 
2 
1 
1 
1 
1 
1 
1 
1 
1 
1 
1 
1 
1 
1 
1 
1 


1 

1 
1 


1 





1 



I don’t want to use .CurrentRegion because I can’t be certain that adjacent cells won’t be populated at some time in the future. We can’t do something like MAX(COUNT(B:G)) because COUNT() is built to handle twodimensional ranges. I tried to be tricky, {=MAX(COUNT(OFFSET(B1:G1),,,1000000,1))}, hoping that Excel would evaluate the COUNT() for a set of OFFSETs from B1 to G1. Excel wasn’t fooled. Stack Overflow time then, and enter SUBTOTAL(), a function I confess I’ve never used in over 20 years of Excellence.
I define a new name, input_data_columns as =Sheet1!$B:$G, which I can then use in the definition of input_data_rowcount:
=MAX(SUBTOTAL(2,OFFSET(input_data_columns,,COLUMN(input_data_columns)MIN(COLUMN(input_data_columns)),,1)))
In a worksheet this needs to be ControlShiftEntered as an array formula, but there’s no obvious way to do this when defining a Named Formula. Fortunately, although I don’t understand why, it works OK anyway, leading in turn to input_data becoming
=OFFSET(Sheet1!$B$2,0,0,input_data_rowcount,6)
That works! Is it futureproof? Probably not^{2} but it might very well be an improvement on what came before.
Briefly checking what happens, if we remove the MAX() and put the remainder into a worksheet as an array formula, we get {4, 4, 7, 4, 5, 5}, which is indeed the result of COUNT() on each column.
^{1} From a selfish perspective, this is a Good Thing – it requires “professional” expertise to deal with and has put bread on my table many a time.
^{2} See ^{1}