10
\$\begingroup\$

The code below was refactored for performance improvements for another user on this site.

Functionality, high level:

  • Sheet1 - CodeName aIndex: used as the main reference to the structure of the data being processed in 2 other sheets: mapping column headers for incoming data in sheet2, to column headers to be processed for the final result on Sheet3

  • Sheet2 - CodeName bImport: this where external (raw) data is imported before processing. Importing of data is not part of this process

  • Sheet3 - CodeName cFinal: out of a set of about 50 incoming columns, Sheet1 will define a subset of 20 to 30 columns to be processed for the final result

The code is fully functional, without issues, and decent performance (50,000 rows and 44 columns processed in 4 to 5 seconds); it contains more comments than usual for learning purposes, explaining some basic steps, or things that may not be obvious or clear to an inexperienced person.

Notes:

  • This is not a request that requires understanding of the functionality, or finding inefficiencies (unless there are obvious parts that can be optimized).
  • It's about self improvement relative to coding practices: I am open to any criticism no matter how harsh, for any mistakes I may have made - I'll easily swallow my pride, as long as I can improve any bad habits I may have picked up along the way.
  • When I posted the question intended to make it as relevant to this site as possible: Does this code make my ass look fat?
  • I realize that members of this community are volunteers (like me), and provide feedback out of passion about the subject, so I tried to analyse the question objectively, as a reviewer:
    • The code is way too long to make me feel it's worth the effort, and this is the reason I didn't bring its functionality into the mix: there is less effort required for analyzing it at a high level (coding style), and not intricacies of functionality
    • There is nothing I can do to make it shorter: I was curious about its structure: did I modularize it enough, or maybe too much
    • I wouldn't want to get involved in a long review by attempting to understand its logic and reasons of doing what it does, but just quick feedback about anything obviously bad from a readability and maintainability perspective

.

That said, I will provide relevant details about functionality for each part as a contexts for the algorithm

The first Sub controls the start and end of the entire process (after an imported file): turns off all events and calculations in Excel that can slow down execution, starts a timer, starts the main process, captures the total duration, and turns all Excel features back on: .

Option Explicit Public Sub projectionTemplateFormat() Dim t1 As Double, t2 As Double fastWB True 'turn off all Excel features related to GUI and calculation updates t1 = Timer 'start performance timer mainProcess t2 = Timer 'process is completed fastWB False 'turn Excel features back on 'MsgBox "Duration: " & t2 - t1 & " seconds" 'optional measurement output End Sub 

The next Sub is where the main processing is done, and makes calls to smaller helper functions:

  • Sets up all references needed during processing: the 3 workbooks, and a set of local variables
  • Determines the columns and size of imported data (Sheet2)
  • Determines if there is any previous data on the result sheet (Sheet3) for cleanup
    • It doesn't remove the headers: these are the column to be migrated from the imported data
  • Overwrites the headers in Imported Sheet with a standard set of headers defined on Sheet1
    • The headers on Sheet1 can be adjusted by the user (added, removed, renamed) relative to the expected headers in the imported data
    • They are also aligned with the headers on Sheet3 (the final result)
  • Re-formats the imported data with specific text, number, and date formats
  • If there is at least 1 row of imported data on Sheet2, it starts the main process

The following steps are the most CPU intensive task:

  • Start looping over each column on Sheet3 (columns of the final result)
    • Find the first column to be migrated (based on the header name from Sheet3)
    • If found, set a reference to the entire column with data (50,000 rows or more)
    • Set a reference on Sheet3, to an area of the same size as the column of imported data
    • Copy the data from Sheet2 to Sheet3
  • Move on the the next column on Sheet3 an repeat the process until all predefined columns on Sheet3 are populated

  • Overwrite some imported values on Sheet3 with hard-coded data from Sheet1

  • Reformat the dates on 2 specific columns on Sheet3 to "YYYY" requirement
  • Reformat other specific columns on Sheet3
  • Convert all data on Sheet3 to UPPER CASE
  • Apply cell and font formatting to all data on Sheet3
  • Zoom all sheets to 85%

Private Sub mainProcess() Const SPACE_DELIM As String = " " Dim wsIndex As Worksheet Dim wsImport As Worksheet 'Raw data Dim wsFinal As Worksheet 'Processed data Dim importHeaderRng As Range Dim importColRng As Range Dim importHeaderFound As Variant Dim importLastRow As Long Dim finalHeaderRng As Range Dim finalColRng As Range Dim finalHeaderRow As Variant Dim finalHeaderFound As Variant Dim indexHeaderCol As Range Dim header As Variant 'Each item in the FOR loop Dim msg As String Set wsIndex = aIndex 'This is the Code Name; top-left pane: aIndex (Index) Set wsImport = bImport 'Direct reference to Code Name: bImport.Range("A1") Set wsFinal = cFinal 'Reference using Sheets collection: ThisWorkbook.Worksheets("Final") With wsImport.UsedRange Set importHeaderRng = .Rows(1) 'Import - Headers importLastRow = getMaxCell(wsImport.UsedRange).Row 'Import - Total Rows End With With wsFinal.UsedRange finalHeaderRow = .Rows(1) 'Final - Headers (as Array) Set finalHeaderRng = .Rows(1) 'Final - Headers (as Range) End With With wsIndex.UsedRange 'Transpose col 3 from Index (without the header), as column names in Import Set indexHeaderCol = .Columns(3).Offset(1, 0).Resize(.Rows.Count - 1, 1) wsImport.Range(wsImport.Cells(1, 1), wsImport.Cells(1, .Rows.Count - 1)).Value2 = Application.Transpose(indexHeaderCol) End With applyColumnFormats bImport 'Apply date and number format to Import sheet If Len(bImport.Cells(2, 1).Value2) > 0 Then 'if Import sheet is not empty (excluding header row) With Application For Each header In finalHeaderRow 'Loop through all headers in Final If Len(Trim(header)) > 0 Then 'If the Final header is not empty importHeaderFound = .Match(header, importHeaderRng, 0) 'Find header in Import sheet If IsError(importHeaderFound) Then msg = msg & vbLf & header & SPACE_DELIM & wsImport.Name 'Import doesn't have current header Else finalHeaderFound = .Match(header, finalHeaderRng, 0) 'Find header in Final sheet With wsImport Set importColRng = .UsedRange.Columns(importHeaderFound).Offset(1, 0).Resize(.UsedRange.Rows.Count - 1, 1) End With With wsFinal Set finalColRng = .Range(.Cells(2, finalHeaderFound), .Cells(importLastRow, finalHeaderFound)) finalColRng.Value2 = vbNullString 'Delete previous values (entire column) End With finalColRng.Value2 = importColRng.Value2 'Copy Import data in Final columns End If End If Next End With setStaticData importLastRow extractYears applyColumnFormats cFinal 'Apply date and number format to Import sheet allUpper wsFinal 'wsFinal.UsedRange.AutoFilter applyFormat wsFinal.Range(wsFinal.Cells(1, 1), wsFinal.Cells(importLastRow, wsFinal.UsedRange.Columns.Count)) Dim ws As Worksheet For Each ws In Worksheets ws.Activate ActiveWindow.Zoom = 85 ws.Cells(2, 2).Activate ActiveWindow.FreezePanes = True ws.Cells(1, 1).Activate Next Else MsgBox "Missing raw data (Sheet 2 - 'Import')", vbInformation, " Missing Raw Data" End If End Sub 

Next method is a straight overwrite operation of static data from Sheet1 onto Sheet3


Private Sub setStaticData(ByVal lastRow As Long) With cFinal .Range("D2:D" & lastRow).Value = aIndex.Range("H2").Value .Range("F2:F" & lastRow).Value = aIndex.Range("H9").Value .Range("AC2:AC" & lastRow).Value = aIndex.Range("H3").Value .Range("X2:X" & lastRow).Value = aIndex.Range("H4").Value .Range("Y2:Y" & lastRow).Value = aIndex.Range("H5").Value .Range("AE2:AE" & lastRow).Value = aIndex.Range("H6").Value .Range("AF2:AF" & lastRow).Value = aIndex.Range("H7").Value .Range("AD2:AD" & lastRow).Value = aIndex.Range("H8").Value End With End Sub 

Another method of applying a specific text, number, date format to a set of columns (the same set of columns on either Sheet2 (Import), or Sheet3 (final result)


Private Sub applyColumnFormats(ByRef ws As Worksheet) With ws.UsedRange .Cells.NumberFormat = "@" 'all cells will be "General" .Columns(colNum("G")).NumberFormat = "MM/DD/YYYY" .Columns(colNum("I")).NumberFormat = "MM/DD/YYYY" '.Columns(colNum("A")).NumberFormat = "@" '.Columns(colNum("B")).NumberFormat = "@" '.Columns(colNum("C")).NumberFormat = "@" .Columns(colNum("R")).NumberFormat = "MM/DD/YYYY" .Columns(colNum("Q")).NumberFormat = "MM/DD/YYYY" .Columns(colNum("T")).NumberFormat = "MM/DD/YYYY" .Columns(colNum("W")).NumberFormat = "@" '"YYYY" .Columns(colNum("V")).NumberFormat = "@" '"YYYY" .Columns(colNum("AC")).NumberFormat = "MM/DD/YYYY" .Columns(colNum("N")).NumberFormat = "_($* #,##0.00_);_($* (#,##0.00);_($* ""-""??_);_(@_)" .Columns(colNum("AM")).NumberFormat = "_($* #,##0.00_);_($* (#,##0.00);_($* ""-""??_);_(@_)" .Columns(colNum("AN")).NumberFormat = "_($* #,##0.00_);_($* (#,##0.00);_($* ""-""??_);_(@_)" .Columns(colNum("AO")).NumberFormat = "_($* #,##0.00_);_($* (#,##0.00);_($* ""-""??_);_(@_)" End With End Sub 

Helper method: Cell, border, and font formatting to all data on Sheet3


Private Sub applyFormat(ByRef rng As Range) With rng .ClearFormats With .Font .Name = "Georgia" .Color = RGB(0, 0, 225) End With .Interior.Color = RGB(216, 228, 188) With .Rows(1) .Font.Bold = True .Interior.ColorIndex = xlAutomatic End With With .Borders .LineStyle = xlDot 'xlContinuous .ColorIndex = xlAutomatic .Weight = xlThin End With End With refit rng End Sub 

Helper method: Converts all data to upper case

The main aspect about all helper methods acting on large ranges of data is that they perform:

  • Only one interaction with the worksheet to copy all data to memory
  • Processes each individual value by looping over the memory arrays (unavoidable nested loops for 2 dimensional arrays)
  • Then in another single interaction with the sheet places all data transformed back in the same area

  • This is, by far, the most overlooked performance improvement. It requires minimum coding effort, but might be perceived as a somewhat difficult concept to grasp for novice VBA enthusiasts (including myself) who just want to get the job done, without "complicating" things


Private Sub allUpper(ByRef sh As Worksheet) Dim arr As Variant, i As Long, j As Long If WorksheetFunction.CountA(sh.UsedRange) > 0 Then arr = sh.UsedRange For i = 2 To UBound(arr, 1) 'each "row" For j = 1 To UBound(arr, 2) 'each "col" arr(i, j) = UCase(RTrim(Replace(arr(i, j), Chr(10), vbNullString))) Next Next sh.UsedRange = arr End If End Sub 

Helper method: converts dates on certain columns to a YYYY format. In retrospect, I should have made it generic to accept a column name, range, letter, or number, as a parameter instead of hard-codding 2 columns. The point I was trying to make here was to combine multiple columns within one loop for improved performance, instead of several loops performing the same operation, on different columns


Private Sub extractYears() Dim arr As Variant, i As Long, j As Long, ur As Range, colW As Long, colV As Long Set ur = cFinal.UsedRange '3rd sheet If WorksheetFunction.CountA(ur) > 0 Then colW = colNum("W") colV = colNum("V") arr = ur For i = 2 To getMaxCell(ur).Row 'each "row" If Len(arr(i, colW)) > 0 Then arr(i, colW) = Format(arr(i, colW), "yyyy") If Len(arr(i, colV)) > 0 Then arr(i, colV) = Format(arr(i, colV), "yyyy") Next ur = arr End If End Sub Private Sub refit(ByRef rng As Range) With rng .WrapText = False .HorizontalAlignment = xlGeneral .VerticalAlignment = xlCenter .Columns.EntireColumn.AutoFit .Rows.EntireRow.AutoFit End With End Sub 

Helper method: next, are 2 generic functions that return:

  • The column letter from the column number
  • The column number from the column letter

Not ideal naming convention as it's not descriptive enough (not intuitive or self-documented). My reason (not excuse): long names don't fit well in the small area provided - doesn't make it OK


Public Function colLtr(ByVal fromColNum As Long) As String 'get column leter from column number 'maximum number of columns in Excel 2007, last column: "XFD" (16384) Const MAX_COLUMNS As Integer = 16384 If fromColNum > 0 And fromColNum <= MAX_COLUMNS Then Dim indx As Long, cond As Long For indx = Int(Log(CDbl(25 * (CDbl(fromColNum) + 1))) / Log(26)) - 1 To 0 Step -1 cond = (26 ^ (indx + 1) - 1) / 25 - 1 If fromColNum > cond Then colLtr = colLtr & Chr(((fromColNum - cond - 1) \ 26 ^ indx) Mod 26 + 65) End If Next indx Else colLtr = 0 End If End Function Public Function colNum(ByVal fromColLtr As String) As Long 'A to XFD (upper or lower case); if the parameter is invalid it returns 0 'maximum number of columns in Excel 2007, last column: "XFD" (16384) Const MAX_LEN As Byte = 4 Const LTR_OFFSET As Byte = 64 Const TOTAL_LETTERS As Byte = 26 Const MAX_COLUMNS As Integer = 16384 Dim paramLen As Long Dim tmpNum As Integer paramLen = Len(fromColLtr) tmpNum = 0 If paramLen > 0 And paramLen < MAX_LEN Then Dim i As Integer Dim tmpChar As String Dim numArr() As Integer fromColLtr = UCase(fromColLtr) ReDim Preserve numArr(paramLen) For i = 1 To paramLen tmpChar = Asc(Mid(fromColLtr, i, 1)) If tmpChar < 65 Or tmpChar > 90 Then Exit Function 'make sure it's a letter. upper case: 65 to 90, lower case: 97 to 122 numArr(i) = tmpChar - LTR_OFFSET 'change lettr to number indicating place in alphabet (from 1 to 26) Next Dim highPower As Integer highPower = UBound(numArr()) - 1 'the most significant digits occur to the left For i = 1 To highPower + 1 tmpNum = tmpNum + (numArr(i) * (TOTAL_LETTERS ^ highPower)) 'convert the number array using powers of 26 highPower = highPower - 1 Next End If If tmpNum < 0 Or tmpNum > MAX_COLUMNS Then tmpNum = 0 colNum = tmpNum End Function 

For the next method I applied an extra performance improvement to the usual known method of determining the last cell with data:

  • Normal methods perform an inverse search of the first data value staring at the last row\column of an Excel sheet (which now has over 1 million rows and and 16 thousand columns

  • This method expects only on the "UsedRange" - the notoriously inaccurate range that remembers cell formatting, unused formulas, hidden objects, etc. However, this inaccurate range is much smaller the the entire sheet, but large enough to include all data, so it performs the inverse search over only a few excess rows and columns

  • By my definition, the last used cell can also be empty, a long as it represents the longest row and column with data


Public Function getMaxCell(ByRef rng As Range) As Range 'search the entire range (usually UsedRange) 'last row: find first cell with data, scanning rows, from bottom-right, leftwards 'last col: find first cell with data, scanning cols, from bottom-right, upwards With rng Set getMaxCell = rng.Cells _ ( _ .Find( _ What:="*", _ SearchDirection:=xlPrevious, _ LookIn:=xlFormulas, _ After:=rng.Cells(1, 1), _ SearchOrder:=xlByRows).Row, _ .Find( _ What:="*", _ SearchDirection:=xlPrevious, _ LookIn:=xlFormulas, _ After:=rng.Cells(1, 1), _ SearchOrder:=xlByColumns).Column _ ) End With End Function 

  • Helper method: another set of versatile general functions for turning off Excel features that might hinder VBA performance, main ones:
    • xlCalculationAutomatic - extremely convenient for manual interactions with sheets, huge potential of performance issues when performing VBA updates to large ranges as it triggers exponential calculations to all dependent formulas on the sheet(s)
    • EnableEvents - can trigger nested events (infinite recursion) which Excel terminates eventually). Also may cause inexplicable or unexpected VBA behavior when not turned back on
    • ScreenUpdating - well known
    • DisplayPageBreaks: I've seen an earlier comment referring to this. To me this is insidious, perceived harmless, when in fact it can cause extra work behind the scenes, especially when re-sizing rows and columns. I never print anything, so I never care about page breaks, but Excel cares about them at every move: re-size 1 column\row - it recalculates page size for all used area; it should be used and only when printing

Public Sub fastWB(Optional ByVal opt As Boolean = True) With Application .Calculation = IIf(opt, xlCalculationManual, xlCalculationAutomatic) If .DisplayAlerts <> Not opt Then .DisplayAlerts = Not opt If .DisplayStatusBar <> Not opt Then .DisplayStatusBar = Not opt If .EnableAnimations <> Not opt Then .EnableAnimations = Not opt If .EnableEvents <> Not opt Then .EnableEvents = Not opt If .ScreenUpdating <> Not opt Then .ScreenUpdating = Not opt End With fastWS , opt End Sub Public Sub fastWS(Optional ByVal ws As Worksheet, Optional ByVal opt As Boolean = True) If ws Is Nothing Then For Each ws In Application.ActiveWorkbook.Sheets setWS ws, opt Next Else setWS ws, opt End If End Sub Private Sub setWS(ByVal ws As Worksheet, ByVal opt As Boolean) With ws .DisplayPageBreaks = False .EnableCalculation = Not opt .EnableFormatConditionsCalculation = Not opt .EnablePivotTable = Not opt End With End Sub Public Sub xlResetSettings() 'default Excel settings With Application .Calculation = xlCalculationAutomatic .DisplayAlerts = True .DisplayStatusBar = True .EnableAnimations = False .EnableEvents = True .ScreenUpdating = True Dim sh As Worksheet For Each sh In Application.ActiveWorkbook.Sheets With sh .DisplayPageBreaks = False .EnableCalculation = True .EnableFormatConditionsCalculation = True .EnablePivotTable = True End With Next End With End Sub 

Any suggestions to improve readability for ease of maintenance, restructuring functions, naming conventions, etc, will be much appreciated

\$\endgroup\$
3
  • 2
    \$\begingroup\$ I've asked here why this has been put on hold out of interest for when I post code asking for review. \$\endgroup\$ Commented Jun 17, 2015 at 11:14
  • \$\begingroup\$ The specific problem, from a different aspect: can this code be improved in any way. I will review the question to include details about each part of the functionality so it will be easier to follow and provide as much and reasonable context as possible. Will these be addressing the issue main issue? Also, I carefully read the article on How to Ask, and ultimately this was my interpretation of it - obviously I got it wrong - as this being my first question, any help with this aspect will be appreciated and useful to me in the future \$\endgroup\$ Commented Jun 17, 2015 at 12:53
  • \$\begingroup\$ for a more detailed "how to ask" guidance (which is admittedly huge) you may want to check out meta.codereview.stackexchange.com/q/2436/37660 also you're always welcome to Code Review Chat for questions about closures and general site workings. Feel free to drop by ;) \$\endgroup\$ Commented Jun 17, 2015 at 15:43

3 Answers 3

9
\$\begingroup\$

This isn't going to be a full-blown, fine-combed review. Just a few points.


Use PascalCase for procedure/member identifiers. Being consistent about this helps readability because it makes it easy to tell members from locals and parameters at a glance, without even reading them.


In general your indenting is fine, except here:

fastWB True 'turn off all Excel features related to GUI and calculation updates t1 = Timer 'start performance timer mainProcess t2 = Timer 'process is completed fastWB False 'turn Excel features back on 

Yes, it's a logical block, a bit like On Error Resume Next {instruction} On Error GoTo 0 would be. But it's not a syntactic code block. A different usage of vertical whitespace makes a better job at regrouping the statements I find:

fastWB True 'turn off all Excel features related to GUI and calculation updates t1 = Timer 'start performance timer mainProcess t2 = Timer 'process is completed fastWB False 'turn Excel features back on 

The comments are annoying more than anything else. Consider using more descriptive identifiers instead:

ToggleExcelPerformance startTime = Timer RunMainProcess endTime = Timer ToggleExcelPerformance False 

Note that the difference between startTime and endTime will be skewed if you run this code a few seconds before midnight on your system, because of how Timer works. Shameless plug, but with a little bit of abuse there are much more precise and reliable ways to time method execution (I co-own the project), especially if you don't need the duration to be in your "production code".


This declaration came as a surprise:

Dim ws As Worksheet For Each ws In Worksheets 

Why? Because it's the only declaration in the MainProcess method, that's declared close to usage (as it should). Either stick it to the top of the procedure with the other ones (eh, don't do that), or move the other declarations closer to their first usage (much preferred).

Pretty much the entire procedure's body is wrapped in this If..Else block:

If Len(bImport.Cells(2, 1).Value2) > 0 Then 'wall of code Else MsgBox "Missing raw data (Sheet 2 - 'Import')", vbInformation, "Missing Raw Data" End If 

I suggest you revert the condition to reduce nesting:

If Len(bImport.Cells(2, 1).Value2) = 0 Then MsgBox "Missing raw data (Sheet 2 - 'Import')", vbInformation, "Missing Raw Data" Exit Sub End If 'wall of code 

This is what I like to call an abuse of the With statement:

With Application 'wall of code End With 

I like that you're making explicitly qualified references to the Application object like this, ...but not like this - a With block should look like this:

With someInstance foobar = .Foo(42) .DoSomething .Bar smurf End With 

If you're merely wrapping a whole method with a With block just to avoid having to type Application the 3-4 times you're referring to the Application object, ...sorry to say, but you're just being lazy - and you've uselessly increased nesting for that reason, too.

IMO this is another abusive/lazy usage of With:

With wsImport Set importColRng = .UsedRange.Columns(importHeaderFound).Offset(1, 0).Resize(.UsedRange.Rows.Count - 1, 1) End With 

Versus:

Set importColRng = wsImport.UsedRange.Columns(importHeaderFound) _ .Offset(1, 0) _ .Resize(wsImport.UsedRange.Rows.Count - 1, 1) 

This is awkward:

With rng Set getMaxCell = rng.Cells _ ( _ .Find( _ What:="*", _ SearchDirection:=xlPrevious, _ LookIn:=xlFormulas, _ After:=rng.Cells(1, 1), _ SearchOrder:=xlByRows).Row, _ .Find( _ What:="*", _ SearchDirection:=xlPrevious, _ LookIn:=xlFormulas, _ After:=rng.Cells(1, 1), _ SearchOrder:=xlByColumns).Column _ ) End With 

You open up a With block, but the first statement in it ignores it:

 Set getMaxCell = rng.Cells _ 

Should be

 Set getMaxCell = .Cells _ 

And then After:=rng.Cells(1, 1) is also referring to rng. What do you need that With block for, really?

Now, I really don't like that .Cells call: that 15-liner single instruction is doing way too many things. An instruction should only have as few as possible reasons to fail. If either Find fails, you'll have a runtime error 91, and no clue if it's the row or the column find that's blowing up.

Function GetMaxCell(ByRef rng As Range) As Range On Error GoTo CleanFail Const NONEMPTY As String = "*" Dim foundRow As Long foundRow = rng.Find(What:=NONEMPTY, _ SearchDirection:=xlPrevious, _ LookIn:=xlFormulas, _ After:=rng.Cells(1, 1), _ SearchOrder:=xlByRows) _ .Row Dim foundColumn As Long foundColumn = rng.Find(What:=NONEMPTY, _ SearchDirection:=xlPrevious, _ LookIn:=xlFormulas, _ After:=rng.Cells(1, 1), _ SearchOrder:=xlByColumns) _ .Column Set GetMaxCell = rng.Cells(foundRow, foundColumn) CleanExit: Exit Function CleanFail: Set GetMaxCell = Nothing Resume CleanExit 'break here Resume 'set next statement here End Function 

That will return Nothing to the caller (for it to handle of course) instead of blowing up if the function is given an empty range, or any other edge case that wasn't accounted for. And as a bonus, all you need to do to find the problem is to place a breakpoint just before the error-handling subroutine finishes.


There's certainly a lot more to say about this code, ...but this answer is already long enough as it is ;-)

\$\endgroup\$
5
  • \$\begingroup\$ This is GOLD! Thank you for your time!!!. I knew I was getting lazy and fat :) --- PascalCase - point taken (didn't know) --- Logical block - to make its point, but for myself I usually use your suggestion --- Comments - I don't have any comments in my code and quite long identifiers (on my wide screen) --- Declarations close to first use: I kept all at the top and move them at cleanup (but I never get to it) --- Reverted IF - point taken ! \$\endgroup\$ Commented Jun 19, 2015 at 3:53
  • 1
    \$\begingroup\$ <pre> --- Abuse of the With - I'm not actually lazy about this; a long time ago I learned that the compiler makes a separate reference to the object and it's faster, so I do use it whenever a get a chance - am I wrong about my info though? --- Awkward .Find statements - I know but can't find a shorter way --- Ignored With for the .Find statements - thank you (there was a nested With before, I took it out and forgot to clean up --- On Error - excellent ! thank you I was expecting "brutal" and I just got slapped :) <code> \$\endgroup\$ Commented Jun 19, 2015 at 4:05
  • \$\begingroup\$ Just a note: none of the Markdowns I tried seem to work, so my comments are very ugly (sorry) \$\endgroup\$ Commented Jun 19, 2015 at 4:07
  • \$\begingroup\$ @paulbica comments only support mini-markdown, a bit like in chat but even more limited: `code`, *italic*, **bold**.. but I think \$MathJax\$ is supported. Let's see.. \$O(n)\$ ..yup. oh and [links work, too](url). \$\endgroup\$ Commented Jun 19, 2015 at 4:20
  • \$\begingroup\$ no problem - I think the checkmark might be a bit early though; I'm sure you'll end up with more answers.. give it a day or two :) \$\endgroup\$ Commented Jun 19, 2015 at 5:01
6
\$\begingroup\$

Improved versions of GetMaxCell()

  • The first function, using an array is much faster
  • If called without the optional parameter, will default to .ThisWorkbook.ActiveSheet
  • If the range is empty will returns Cell( 1, 1 ) as default, instead of Nothing

GetMaxCell (Array): Duration: 0.0000790063 seconds
GetMaxCell (Find): Duration: 0.0002903480 seconds

.Measured with MicroTimer

Public Function GetLastCell(Optional ByVal ws As Worksheet = Nothing) As Range Dim uRng As Range, uArr As Variant, r As Long, c As Long Dim ubR As Long, ubC As Long, lRow As Long If ws Is Nothing Then Set ws = Application.ThisWorkbook.ActiveSheet Set uRng = ws.UsedRange uArr = uRng If IsEmpty(uArr) Then Set GetLastCell = ws.Cells(1, 1): Exit Function End If If Not IsArray(uArr) Then Set GetLastCell = ws.Cells(uRng.Row, uRng.Column): Exit Function End If ubR = UBound(uArr, 1): ubC = UBound(uArr, 2) For r = ubR To 1 Step -1 '----------------------------------------------- last row For c = ubC To 1 Step -1 If Not IsError(uArr(r, c)) Then If Len(Trim$(uArr(r, c))) > 0 Then lRow = r: Exit For End If End If Next If lRow > 0 Then Exit For Next If lRow = 0 Then lRow = ubR For c = ubC To 1 Step -1 '----------------------------------------------- last col For r = lRow To 1 Step -1 If Not IsError(uArr(r, c)) Then If Len(Trim$(uArr(r, c))) > 0 Then Set GetLastCell = ws.Cells(lRow + uRng.Row - 1, c + uRng.Column - 1) Exit Function End If End If Next Next End Function 

'Returns last cell (max row & max col) using Find Public Function GetMaxCell2(Optional ByRef rng As Range = Nothing) As Range Const NONEMPTY As String = "*" Dim lRow As Range, lCol As Range If rng Is Nothing Then Set rng = Application.ActiveWorkbook.ActiveSheet.UsedRange If WorksheetFunction.CountA(rng) = 0 Then Set GetMaxCell2 = rng.Parent.Cells(1, 1) Else With rng Set lRow = .Cells.Find(What:=NONEMPTY, LookIn:=xlFormulas, _ After:=.Cells(1, 1), _ SearchDirection:=xlPrevious, _ SearchOrder:=xlByRows) If Not lRow Is Nothing Then Set lCol = .Cells.Find(What:=NONEMPTY, LookIn:=xlFormulas, _ After:=.Cells(1, 1), _ SearchDirection:=xlPrevious, _ SearchOrder:=xlByColumns) Set GetMaxCell2 = .Parent.Cells(lRow.Row, lCol.Column) End If End With End If End Function 

Private Declare PtrSafe Function getFrequency Lib "Kernel32" Alias "QueryPerformanceFrequency" (cyFrequency As Currency) As Long Private Declare PtrSafe Function getTickCount Lib "Kernel32" Alias "QueryPerformanceCounter" (cyTickCount As Currency) As Long 'https://msdn.microsoft.com/en-us/library/office/ff700515(v=office.14).aspx#Anchor_5 Function MicroTimer() As Double Dim cyTicks1 As Currency Static cyFrequency As Currency MicroTimer = 0 If cyFrequency = 0 Then getFrequency cyFrequency 'Get frequency getTickCount cyTicks1 'Get ticks If cyFrequency Then MicroTimer = cyTicks1 / cyFrequency 'Returns Seconds End Function 

More info on Macro performance slow when page breaks are visible (Microsoft)

\$\endgroup\$
1
\$\begingroup\$
Public Function getMaxCell(ByRef rng As Range) As Range 'search the entire range (usually UsedRange) 'last row: find first cell with data, scanning rows, from bottom-right, leftwards 'last col: find first cell with data, scanning cols, from bottom-right, upwards With rng Set getMaxCell = rng.Cells _ ( _ .Find( _ What:="*", _ SearchDirection:=xlPrevious, _ LookIn:=xlFormulas, _ After:=rng.Cells(1, 1), _ SearchOrder:=xlByRows).Row, _ .Find( _ What:="*", _ SearchDirection:=xlPrevious, _ LookIn:=xlFormulas, _ After:=rng.Cells(1, 1), _ SearchOrder:=xlByColumns).Column _ ) End With End Function 

Kudos for using the UsedRange to cut down on unnecessary cell searching, but although this popular method is very good, it's vulnerable to an unlikely bug.

If the active cell of the worksheet is in a filtered ListObject, the code will fail (incorrect Range returned from Find method). To fix this, you have to disable events, select away from the table, then select back to the original cell maybe to avoid any risk of upsetting the user or any other macros.

This means that the most robust method which avoids all bugs(AFAIK) is the below:

Public Function GetLastRow(ByRef rng As Range) As Long Dim arr as Variant arr = rng.Value2 Dim i As Long, j As Long For i = UBound(arr) To 1 Step - 1 For j = Ubound(arr, 2) To 1 Step - 1 If Not IsError(arr(i, j)) If arr(i, j) <> vbNullString Then GetLastRow = i + rng.Row -1 Exit Function End If Else GetLastRow = i + rng.Row -1 Exit Function End If Next j Next i End Function 

I have a similar function for GetLastColumn; with these two combined you can get your MaxCell easily...

\$\endgroup\$
3
  • \$\begingroup\$ Thanks for the feedback. You have a valid poit - any filter would cause an incorrect result (ListObject or not). The only way I see to fix getMaxCell(), using the find method would be to preserve any existing filter, unfilter the data, get last cell, and filter it back with the initial filter - I'll need to update it, however, the GetLastCell() , using the loops, works as expected \$\endgroup\$ Commented Aug 27, 2017 at 11:55
  • \$\begingroup\$ Your GetLastRow() should work but there are a few issues with it: 1) it errors out at line Set arr = rng.Value2 (Type mismatch), to fix it remove the Set. 2) It errors out if parameter rng is Nothing, or something other than a Range object. 3) On lineIf arr(i, j) <> vbNullString Then it will fail if the data contains #N/A (check for error values when comparing). 4) Line getMaxRow = i + rng.Row -1 the name of the function should be GetLastRow \$\endgroup\$ Commented Aug 27, 2017 at 12:13
  • \$\begingroup\$ Nice catch. I'll edit the post for the set and name. I typed it out wrong \$\endgroup\$ Commented Aug 27, 2017 at 14:02

You must log in to answer this question.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.