11

I am new to VBA. I have job in my hand to improve performance of VBA code. To improve performance of the code, I have to read entire row and compare it with another row. Is there any way to do this in VBA?

Pseudocode:

sheet1_row1=read row1 from sheet1 sheet2_row1=read row1 from sheet2 if sheet1_row1 = sheet2_row1 then print "Row contains same value" else print "Row contains diff value" end if 
2
  • 1
    I can give you the code but first lets see if you can put in some efforts :) Try searching SO or Google on how to store Excel Range in an Array and work with them. After that if you are still stuck, post the code that you tried and we will take it from there. Questions asking for code must demonstrate a minimal understanding of the problem being solved. Include attempted solutions, why they didn't work, and the expected results. See also: Stack Overflow question checklist Commented Oct 16, 2013 at 5:32
  • @Siddharth- For your kind information, i have done two excel sheet comparision by checking each cell value. If you believe me, you can check my question list. I have posted the question for performance improvement(check in my questions: what is best way to compare excel sheets?). I was looking for little help. Commented Oct 16, 2013 at 6:17

11 Answers 11

35
Sub checkit() Dim a As Application Set a = Application MsgBox Join(a.Transpose(a.Transpose(ActiveSheet.Rows(1).Value)), Chr(0)) = _ Join(a.Transpose(a.Transpose(ActiveSheet.Rows(2).Value)), Chr(0)) End Sub 

What's going on:

  • a is just shorthand for Application to keep the code below easier to read
  • ActiveSheet.Rows(1).Value returns a 2-D array with dimensions (1 to 1, 1 to {number of columns in a worksheet})
  • We'd like to condense the array above into a single value using Join(), so we can compare it with a different array from the second row. However, Join() only works on 1-D arrays, so we run the array twice through Application.Transpose(). Note: if you were comparing columns instead of rows then you'd only need one pass through Transpose().
  • Applying Join() to the array gives us a single string where the original cell values are separated by a "null character" (Chr(0)): we select this since it's unlikely to be present in any of the cell values themselves.
  • After this we now have two regular strings which are easily compared

Note: as pointed out by Reafidy in the comments, Transpose() can't handle arrays with more than approx. 65,000 elements, so you can't use this approach to compare two whole columns in versions of Excel where sheets have more than this number of rows (i.e. any non-ancient version).

Note 2: this method has quite bad performance compared to a loop used on a variant array of data read from the worksheet. If you're going to do a row-by-row comparison over a large number of rows, then the approach above will be much slower.

Sign up to request clarification or add additional context in comments.

9 Comments

@Jean-FrançoisCorbett: Chr(0) is the NULL character. It is usually used as a delimiter in strings. However, it is much better to use the inbuilt constant vbNullChar instead of Chr(0) as it doesn't equire runtime evaluation.
@Vicky - Join requires working with a 1-D array: the code is getting a 2-D array from the row, so two passes through Transpose converts it to 1-D. Chr(0) is used as a joining character because it's very unlikely to be in any of the cells on your sheet.
+1, Note: beware of the limitations of working with arrays in vba and the transpose limit of 2^16 (65536 rows).
@Tim: Maybe you need to convert this amazing answer to a full fledged answer (Sort of FAQ) :) Explaining what exactly is happening behind the scenes. This is the 2nd time I have to refer this answer to other users :) First Time Second Time
it's just intended as a "trick" - performance does indeed suck: it's about 30-50x worse than a regular loop after picking up the values in an array.
|
13

For your specific example, here are two ways...

Case Insensitive:

MsgBox [and(1:1=2:2)] 

Case Sensitive:

MsgBox [and(exact(1:1,2:2))] 

...

Below are generalized functions to compare any two contiguous ranges.

Case Insensitive:

Public Function RangesEqual(r1 As Range, r2 As Range) As Boolean RangesEqual = Evaluate("and(" & r1.Address & "=" & r2.Address & ")") End Function 

Case Sensitive:

Public Function RangesEqual(r1 As Range, r2 As Range) As Boolean RangesEqual = Evaluate("and(exact(" & r1.Address & "," & r2.Address & "))") End Function 

2 Comments

For those wondering what is this [ x ] notation, it's the same as Evaluate(" x ")
The square brackets do indeed invoke the Evaluate function, but there is a big difference. With the square brackets, the content between the brackets must be a literal text string... not an expression... and the literal string cannot be enclosed in quotation marks.
6

OK, this ought to be fairly fast: minimal interaction between Excel UI and VBA (which is where much of the slowness lives). Assumes worksheets have similar layouts from $A$1 and that we're only going to attempt to match the common area of the UsedRanges for the two sheets:

Public Sub CompareSheets(wks1 As Worksheet, wks2 As Worksheet) Dim rowsToCompare As Long, colsToCompare As Long rowsToCompare = CheckCount(wks1.UsedRange.Rows.Count, wks2.UsedRange.Rows.Count, "Row") colsToCompare = CheckCount(wks1.UsedRange.Columns.Count, wks2.UsedRange.Columns.Count, "Column") CompareRows wks1, wks2, rowsToCompare, colsToCompare End Sub Private Function CheckCount(count1 As Long, count2 As Long, which As String) As Long If count1 <> count2 Then Debug.Print "UsedRange " & which & " counts differ: " _ & count1 & " <> " & count2 End If CheckCount = count2 If count1 < count2 Then CheckCount = count1 End If End Function Private Sub CompareRows(wks1 As Worksheet, wks2 As Worksheet, rowCount As Long, colCount As Long) Debug.Print "Comparing first " & rowCount & " rows & " & colCount & " columns..." Dim arr1, arr2 arr1 = wks1.Cells(1, 1).Resize(rowCount, colCount).Value arr2 = wks2.Cells(1, 1).Resize(rowCount, colCount).Value Dim rIdx As Long, cIdx As Long For rIdx = LBound(arr1, 1) To UBound(arr1, 1) For cIdx = LBound(arr1, 2) To UBound(arr1, 2) If arr1(rIdx, cIdx) <> arr2(rIdx, cIdx) Then Debug.Print "(" & rIdx & "," & cIdx & "): " & arr1(rIdx, cIdx) & " <> " & arr2(rIdx, cIdx) End If Next Next End Sub 

1 Comment

Yup. Picking up the range in one go makes a huge difference to the speed of your code. It's dismaying that we still have to keep pointing this out...
2

Excel 2016 has a built in function called TEXTJOIN

https://support.office.com/en-us/article/textjoin-function-357b449a-ec91-49d0-80c3-0e8fc845691c

Looking at @Tim Williams answer and using this new function (which does not have the 65536 row limit):

Sub checkit() MsgBox WorksheetFunction.TextJoin(Chr(0), False, ActiveSheet.Rows(1).Value) = _ WorksheetFunction.TextJoin(Chr(0), False, ActiveSheet.Rows(2).Value) End Sub 

Written as a function:

Public Function CheckRangeValsEqual(ByVal r1 As Range, ByVal r2 As Range, Optional ByVal strJoinOn As String = vbNullString) As Boolean CheckRangeValsEqual = WorksheetFunction.TextJoin(strJoinOn, False, r1.Value) = _ WorksheetFunction.TextJoin(strJoinOn, False, r2.Value) End Function 

Comments

1
Match = True Row1length = Worksheets("Sheet1").Cells(1, Columns.Count).End(xlToLeft).Column Row2length = Worksheets("Sheet2").Cells(1, Columns.Count).End(xlToLeft).Column If Row1length <> Row2length Then 'Not equal Match = False Else For i = 1 To Row1length If Worksheets("Sheet1").Cells(1, i),Value <> Worksheets("Sheet2").Cells(1, i) Then Match = False Exit For End If Next End If If Match = True Then Debug.Print "match" Else Debug.Print "not match" End If 

2 Comments

If Worksheets("Sheet1").Cells(1, i),Value <> Worksheets("Sheet2").Cells(1, i) Then A typo there?
cell-by-cell accessing of worksheet values will be slow for any non-trivial numbers of rows/columns
1

Here's a bit of code that will do two vector ranges. You can run it against two rows, two columns.

Don't think it's as fast as the x2 transpose method, but it's more flexible. The column invocation takes a bit longer since there are 1M items to compare!

Option Explicit Public Sub Test() 'Check two columns Debug.Print DataAreasAreSame(Columns("a"), Columns("b")) 'Check two rows Debug.Print DataAreasAreSame(Rows(1), Rows(2)) End Sub Public Function DataAreasAreSame(ByVal DataArea1 As Range, ByVal DataArea2 As Range) As Boolean Dim sFormula As String sFormula = "=SUM(If(EXACT(" & DataArea1.Address & "," & DataArea2.Address & ")=TRUE,0,1))" If Application.Evaluate(sFormula) = 0 Then DataAreasAreSame = True End Function 

Comments

0

=EXACT(B2;D2) formula and drag down, best option for me.

1 Comment

But, it does between only specific cell of a row with another! Not to read entire row and compare it with another row as mentioned in question! @q3kep
0

I'll put in a sledgehammer-to-crack-a-nut answer here, for completeness, because the question 'Are these two ranges identical?' is turning up as an unexamined component of everyone else's 'compare my ranges and then do this complicated thing...' questions.

Your question is a simple question about small ranges. My answer is for large ones; but the question is a good one, and a good place for a more general answer, because it's simple and clear: and 'Do these ranges differ?' and 'Has someone tampered with my data?' are relevant to most commercial Excel users.

Most of the answers to the typical 'compare my rows' questions are cell-by-cell reads and comparisons in VBA. The simplicity of these answers is commendable, but this approach performs very slowly on a large data sets because:

  1. Reading a range one cell at a time is very slow;
  2. Comparing values pair-by-pair is inefficient, especially for strings, when the number of values gets into the tens of thousands,
Point(1) is the important one: it takes the same amount of time for VBA to pick up a single cell using var = Range("A1") as it does to pick up the entire range in one go using var = Range("A1:Z1024")...

...And every interaction with the sheet takes four times as much time as a string comparison in VBA, and twenty times longer than an comparison between floating-point decimals; and that, in turn, is three times longer than an integer comparison.

So your code will probably be four times faster, and possibly a hundred times faster, if you read the entire range in one go, and work on the Range.Value2 array in VBA.

That's in Office 2010 and 2013 (I tested them); for older version of Excel, you'll see quoted times between 1/50th and 1/500th of a second, for each VBA interaction with a cell or range of cells. That'll be way slower because, in both old and new versions of Excel, the VBA actions will still be in single-digit numbers of microseconds: your code will run at least a hundred times faster, and probably thousands of times faster, if you avoid cell-by-cell reads from the sheet in older versions of Excel.

 arr1 = Range1.Values arr2 = Range2.Values
' Consider checking that the two ranges are the same size ' And definitely check that they aren't single-cell ranges, ' which return a scalar variable, not an array, from .Value2
' WARNING: THIS CODE WILL FAIL IF YOUR RANGE CONTAINS AN ERROR VALUE
For i = LBound(arr1, 1) To Ubound(arr1, 2)
For j = LBound(arr1, 2) To Ubound(arr1, 2)
If arr1(i, j) <> arr2(i, j) Then bMatchFail = True Exit For End If
Next j
If bMatchFail Then Exit For
Next i
Erase arr1 Erase arr2

You'll notice that this code sample is generic, for two ranges of the same size taken from anywhere - even from separate workbooks. If you're comparing two adjacent columns, loading a single array of two columns and comparing IF arrX(i, 1) <> arrX(i,2) Then is going to halve the runtime.

Your next challenge is only relevant if you're picking up tens of thousands of values from large ranges: there's no performance gain in this extended answer for anything smaller than that.

What we're doing is:

Using a hash function to compare the values of two large ranges

The idea is very simple, although the underlying mathematics is quite challenging for non-mathematicians: rather than comparing one value at a time, we run a mathematical function that 'hashes' the values into a short identifier for easy comparison.

If you're repeatedly comparing ranges against a 'reference' copy, you can store the 'reference' hash, and this halves the workload.

There are some fast and reliable hashing functions out there, and they are available in Windows as part of the security and cryptography API. There is a slight problem in that they run on strings, and we have an array to work on; but you can easily find a fast 'Join2D' function that gets a string from the 2D arrays returned by a range's .Value2 property.

So a fast comparison function for two large ranges will look like this:

 Public Function RangeCompare(Range1 as Excel.Range, Range2 As Excel.Range) AS Boolean ' Returns TRUE if the ranges are identical. ' This function is case-sensitive. ' For ranges with fewer than ~1000 cells, cell-by-cell comparison is faster
' WARNING: This function will fail if your range contains error values.
RangeCompare = False
If Range1.Cells.Count <> Range2.Cells.Count Then RangeCompare = False ElseIf Range1.Cells.Count = 1 then RangeCompare = Range1.Value2 = Range2.Value2 Else RangeCompare = MD5(Join2D(Range1.Value2)) = MD5(Join2D(Range2.Value2)) Endif
End Function

I've wrapped the Windows System.Security MD5 hash in this VBA function:

 Public Function MD5(arrBytes() As Byte) As String ' Return an MD5 hash for any string
' Author: Nigel Heffernan Excellerando.Blogspot.com
' Note the type pun: you can pass in a string, there's no type conversion or cast ' because a string is stored as a Byte array and VBA recognises this.
oMD5 As Object 'Set a reference to mscorlib 4.0 to use early binding

Dim HashBytes() As Byte Dim i As Integer

Set oMD5 = CreateObject("System.Security.Cryptography.MD5CryptoServiceProvider") HashBytes = oMD5.ComputeHash_2((arrBytes))
For i = LBound(HashBytes) To UBound(HashBytes) MD5 = MD5 & Right("00" & Hex(HashBytes(i)), 2) Next i

Set oMD5 = Nothing ' if you're doing this repeatedly, declare at module level and persist Erase HashBytes

End Function
There are other VBA implementations out there, but nobody seems to know about the Byte Array / String type pun - they are not equivalent, they are identical - so everyone codes up unnecessary type conversions.

A fast and simple Join2D function was posted by Dick Kusleika on Daily Dose of Excel in 2015:

Public Function Join2D(ByVal vArray As Variant, Optional ByVal sWordDelim As String = " ", Optional ByVal sLineDelim As String = vbNewLine) As String Dim i As Long, j As Long Dim aReturn() As String Dim aLine() As String ReDim aReturn(LBound(vArray, 1) To UBound(vArray, 1)) ReDim aLine(LBound(vArray, 2) To UBound(vArray, 2)) For i = LBound(vArray, 1) To UBound(vArray, 1) For j = LBound(vArray, 2) To UBound(vArray, 2) 'Put the current line into a 1d array aLine(j) = vArray(i, j) Next j 'Join the current line into a 1d array aReturn(i) = Join(aLine, sWordDelim) Next i Join2D = Join(aReturn, sLineDelim) End Function 

If you need to excise blank rows before you make the comparison, you'll need the Join2D function I posted in StackOverflow back in 2012.

The most common application of this type of hash comparison is for spreadsheet control - change monitoring - and you'll see Range1.Formula used instead of Range1.Value2: but your question is about comparing values, not formulae.

Footnote: I've posted a very similar answer elsewhere. I'd've posted it here first if I'd seen this question earlier.

Comments

0

If you want to do this in MS excel, you can do the following.

For example, you have column range of each row from "A" to "F" and have to compare between Row 2 and Row 3. To check entire row and compare it with another row we can specify this in formula in a new Result column and instead of pressing Enter after typing the formula, press Ctrl + Shift + Enter.

=AND(EXACT(A2:F2,A3:F3)) 

The result will be TRUE if they match and FALSE if they don't. You'll see curly braces around your formula if you've correctly entered it as an array formula. After this, drag down every row so that each cell of this Result Column will have comparison result between this row and the following!

Comments

0

I know there are already answers here, but here is a simple VBA-only function that compares the values in any two ranges, returning TRUE if they match, or the first non-matching item number if they don't. (It returns FALSE if the ranges do not have the same number of cells.)

Function RangesEqualItemNo(Range1 As Range, Range2 As Range) As Variant Dim CellCount As Long If Range1.Count = Range2.Count Then For CellCount = 1 To Range1.Cells.Count If Range1.Cells.item(CellCount).Value <> Range2.Cells.item(CellCount).Value Then RangesEqualItemNo = CellCount Exit Function End If Next CellCount RangesEqualItemNo = True Else RangesEqualItemNo = False End If End Function 

Or as a simple boolean function:

Function RangesEqual(Range1 As Range, Range2 As Range) As Boolean Dim CellCount As Long If Range1.Count = Range2.Count Then For CellCount = 1 To Range1.Cells.Count If Range1.Cells.item(CellCount).Value <> Range2.Cells.item(CellCount).Value Then RangesEqual = False Exit Function End If Next CellCount RangesEqual = True Else RangesEqual = False End If End Function 

Although this may not be fancy, this sort of brute-force approach is often the fastest.

This compares values, so it will automatically transpose between columns and rows, which may or may not be what you want.

To take this to the logical next step, the following function will return an array of each item number that is different.

Function RangeDiffItems(Range1 As Range, Range2 As Range, Optional DiffSizes As Boolean = False) As Long() Dim CellCount As Long Dim DiffItems() As Long Dim DiffCount As Long ReDim DiffItems(1 To Range1.Count) DiffCount = 0 If Range1.Count = Range2.Count Or DiffSizes Then For CellCount = 1 To Range1.Cells.Count If Range1.Cells.item(CellCount).Value <> Range2.Cells.item(CellCount).Value Then DiffCount = DiffCount + 1 DiffItems(DiffCount) = CellCount End If Next CellCount If DiffCount = 0 Then DiffItems(1) = 0 Else DiffItems(1) = -1 End If If DiffCount = 0 Then ReDim Preserve DiffItems(1 To 1) Else ReDim Preserve DiffItems(1 To DiffCount) RangeDiffItems = DiffItems End Function 

If there are no differences, it returns a 0 in the first array slot, or if the arrays are of different sizes, it returns a -1 for the first array spot. To allow it to compare arrays of different sizes, optionally enter TRUE for the third parameter.

There are also a few more answers to this question elsewhere.

Comments

-1

In my version, I do not declare anything (Dim). It might be wrong but for all the years I have used Excel VBA - mostly it never stopped the code from working. In this case, the range is given as an area. If not, then you should use Dim Range1 As Range, etc. In my code, the area must be the same witch it mostly is anyway, the same rows and columns. This here is in my opinion the shortest way:

Set Range1 = Range("A1:B5") Set Range2 = Range("D1:E5") 'Range1 For Each rng In Range1 'Selection st1 = st1 & rng & "," Next rng 'Range2 For Each rng In Range2 'Selection st2 = st2 & rng & "," Next rng 'compare If st1 = st2 Then MsgBox "the same" Else MsgBox "different" End If 

This other example is a bit longer, it reads the rows and columns and uses this for both areas. If you set a range then range(1) will still be the first cell in this range from the top. Here is my code:

Sub COMPARE() Set Range1 = Range("A1:B5") Set Range2 = Range("D1:E5") 'area coly = Range1.Columns.Count 'columns rowy = Range1.Rows.Count 'rows For i = 1 To coly For j = 1 To rowy st1 = st1 & Range1(i, j) st2 = st2 & Range2(i, j) Next j Next i 'compare If st1 = st2 Then MsgBox "the same" Else MsgBox "different" End If End Sub 

6 Comments

"I do not declare anything (Dim). It might be wrong but for all the years I have used Excel VBA - mostly it never stopped the code from working" You're either very lucky or you've never written a script with more than a few dozen lines of code. First of all, if you ever make a typo in the name of one of your variables while using it, it can go unnoticed for a long time, producing unexpected results; and if you ever notice, it can be very difficult to find the source of the bug in a long script. Not to mention that all your variables are of type Variant (which has a performance overhead).
On the other hand, if you declare your variables (and have Option Explicit enabled), the compiler will scream at you as soon as you attempt to compile the code. Think about how much time that could save you.
Please note that this is very inefficient as it reads the cell values one-by-one. Much faster is to read the data into a Variant array and loop the array, as has been already demonstrated in this thread.
I have written really difficult projects in excel. This is the faster way for a small project. I do not use Option Explicit - then you can do it like this. Yes, I know how OE is useful for not making some errors, but it also makes you a lot more work and time to declare everything. So - this may not be the correct way but it is also not the wrong way. If you learn VBA then all the time you use Dim to declare variables - it does not mean that you must use it. Is my code slower? Yes, if there is a lot of data. But works great on small data.
No it's still the slower way, even for a small project. Also not using Option Explicit is a terrible recommendation. In my opinion, this answer adds nothing of value and should be deleted.
|

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.