9
\$\begingroup\$

This fairly clunky looking VBA script successfully pulls names of countries from an Array, but could it be made shorter?

Mass of text from Column C onward in each row may contain the names of countries, which I want to pull into the corresponding cell in row A. What I have so far is not pretty...

Sub PullCountries() Dim Rng As Range Dim Block As Range Dim i As Long i = 1 Dim LastRow As Long LastRow = Range("B1").End(xlDown).Row While i <= LastRow Set Rng = Range("B" & i) Set Block = Rows(i) Range("I1").Value = i If InStr(1, Range("C" & i), "Canada") Or InStr(1, Range("D" & i), "Canada") Or InStr(1, Range("E" & i), "Canada") Then Rng.Offset(0, -1).Value = "Canada" i = i + 1 ElseIf InStr(1, Range("C" & i), "United States") Or InStr(1, Range("D" & i), "United States") Or InStr(1, Range("E" & i), "United States") Then Rng.Offset(0, -1).Value = "United States" i = i + 1 ElseIf InStr(1, Range("C" & i), "Britian") Or InStr(1, Range("D" & i), "Britian") Or InStr(1, Range("E" & i), "Britian") Then Rng.Offset(0, -1).Value = "UK" i = i + 1 ElseIf InStr(1, Range("C" & i), "UK") Or InStr(1, Range("D" & i), "UK") Or InStr(1, Range("E" & i), "UK") Then Rng.Offset(0, -1).Value = "UK" i = i + 1 ElseIf InStr(1, Range("C" & i), "Spain") Or InStr(1, Range("D" & i), "Spain") Or InStr(1, Range("E" & i), "Spain") Then Rng.Offset(0, -1).Value = "Spain" i = i + 1 ElseIf InStr(1, Range("C" & i), "Portugal") Or InStr(1, Range("D" & i), "Portugal") Or InStr(1, Range("E" & i), "Portugal") Then Rng.Offset(0, -1).Value = "Portugal" i = i + 1 ElseIf InStr(1, Range("C" & i), "Ireland") Or InStr(1, Range("D" & i), "Ireland") Or InStr(1, Range("E" & i), "Ireland") Then Rng.Offset(0, -1).Value = "Ireland" i = i + 1 ElseIf InStr(1, Range("C" & i), "Japan") Or InStr(1, Range("D" & i), "Japan") Or InStr(1, Range("E" & i), "Japan") Then Rng.Offset(0, -1).Value = "Japan" i = i + 1 ElseIf InStr(1, Range("C" & i), "Greece") Or InStr(1, Range("D" & i), "Greece") Or InStr(1, Range("E" & i), "Greece") Then Rng.Offset(0, -1).Value = "Greece" i = i + 1 ElseIf InStr(1, Range("C" & i), "Italy") Or InStr(1, Range("D" & i), "Italy") Or InStr(1, Range("E" & i), "Italy") Then Rng.Offset(0, -1).Value = "Italy" i = i + 1 ElseIf Application.WorksheetFunction.CountA(Rng.Offset(0, 1)) = 0 Then i = i + 1 Else: i = i + 1 End If Wend End Sub 

Obviously, this isn't pretty. And given that the task is very repetitive, I reckon it could be made shorter. If this is possible, how could I start.

Furthermore, would it be possible to pull from an array of countries pre-prepared in another worksheet, and refer to that instead of inserting the actual names into the code, as is done here? And finally, if a range contained more than one country, how could I expand the code to pull that information too?

\$\endgroup\$

2 Answers 2

10
\$\begingroup\$

The small(er) things before we tackle the elephant in the room:

Naming

Your names aren't terrible, but could be a lot better. However, whatever you name things, you should comply with standard naming conventions. To wit:

Local Variables: Written in camelCase.

Dim localVariable As String
includes method arguments.

Module / Global Variables: Written in PascalCase.

Private ModuleVariable As String
Global PublicVariable As Long

Method Names: Verbs. Written in PascalCase

Private Function ReturnThisValue() As Long
Public Sub DoThisThing()

Constants: Written in SHOUTY_SNAKE_CASE

Public Const CONSTANT_VALUE As String = "This Value Never Changes"

Also, Block is declared and set but never used. It should be removed.


Don't use :

Just don't. Keep your instructions on separate lines. They're too easy to miss and violate too many conventions.


Put things in Variables

Range("C" & i) 

You see this? This is telling Excel to go and find that range. Every time you write it. What if you want to check a different column? Right now, you'll have to rewrite the declaration on 20 different lines.

Instead, put it in a variable then just reference the Variable. Now, if the variables need to change, you only have to change them in 1 place, and the rest takes care of itself.

Dim cCell As Range, dCell As Range, eCell As Range Set cCell = Range("C" & i) Set dCell = Range("D" & i) Set eCell = Range("E" & i) Dim countryName As String countryName = "Canada" If Instr(1, cCell, countryName) Or Instr(1, dCell, countryName) Or Instr(1, eCell, countryName) Then ... ... 

Don't Repeat Yourself

Also known as DRY. Take your i = i + 1 statement. That is always going to happen. So why write it 20 times when you can just put it at the start or end of your loop?

While i <= lastRow Code Code Code ... i = i + 1 Wend 

Boom. 12 lines of code gone


And now the big stuff:

Refactoring

Refactoring is the process of splitting one big thing into many little things. Any time you find yourself copy-pasting code, you should be thinking "Hmm, this can probably be turned into a method of some kind".


1st Refactoring

This check:

If Instr(1, cCell, countryName) Or Instr(1, dCell, countryName) Or Instr(1, eCell, countryName) Then 

Can be a Separate Method Like so:

Public Function NameIsInRange(ByVal searchName As String, ByRef range1 As Range, range2 As Range, range3 As Range) As Boolean Dim result As Boolean result = InStr(1, range1, searchName) Or InStr(1, range2, searchName) Or InStr(1, range3, searchName) NameIsInRange = result End Function 

And now we're down to:

Sub PullCountries() Dim i As Long i = 1 Dim LastRow As Long LastRow = Range("B1").End(xlDown).Row While i <= LastRow Dim resultRange As Range Set resultRange = Range("A" & i) Dim cCell As Range, dCell As Range, eCell As Range Set cCell = Range("C" & i) Set dCell = Range("D" & i) Set eCell = Range("E" & i) If NameIsInRange("Canada", cCell, dCell, eCell) Then resultRange = "Canada" ElseIf NameIsInRange("United States", cCell, dCell, eCell) Then resultRange = "United States" ElseIf NameIsInRange("Britian", cCell, dCell, eCell) Then resultRange = "UK" ElseIf NameIsInRange("UK", cCell, dCell, eCell) Then resultRange = "UK" ElseIf NameIsInRange("Spain", cCell, dCell, eCell) Then resultRange = "Spain" ElseIf NameIsInRange("Portugal", cCell, dCell, eCell) Then resultRange = "Portugal" ElseIf NameIsInRange("Ireland", cCell, dCell, eCell) Then resultRange = "Ireland" ElseIf NameIsInRange("Japan", cCell, dCell, eCell) Then resultRange = "Japan" ElseIf NameIsInRange("Greece", cCell, dCell, eCell) Then resultRange = "Greece" ElseIf NameIsInRange("Italy", cCell, dCell, eCell) Then resultRange = "Italy" End If i = i + 1 Wend End Sub Public Function NameIsInRange(ByVal searchName As String, ByRef range1 As Range, range2 As Range, range3 As Range) As Boolean Dim result As Boolean result = InStr(1, range1, searchName) Or InStr(1, range2, searchName) Or InStr(1, range3, searchName) NameIsInRange = result End Function 

2nd Refactoring

The only things that actually change in the loop are the name to check and the name to output. So, why don't we make those a list? For an iterable list with more than 1 element per line, I'd use an Array.

Let's make a new sheet and give it the codename wsCountryNames. Then a function to get the table and pass it to an Array:

enter image description here

Public Function GetCountryNamesTable() As Variant With wsCountryNames Dim topLeftCell As Range Set topLeftCell = .Cells(1, 1) '/ "A1" Dim finalRow As Long finalRow = .Cells(.Rows.Count, topLeftCell.Column).End(xlUp).Row Dim tableWidth As Long tableWidth = 2 Dim tableRange As Range Set tableRange = .Range(topLeftCell, Cells(finalRow, topLeftCell.Column + tableWidth - 1)) End With GetCountryNamesTable = tableRange End Function 

The value in Cell A1 is now in GetCountryNamesTable(1, 1), A2, (2, 1), B1, (1, 2) etc.

Now, we can just iterate through your list:

Sub PullCountries() Dim i As Long, j As Long i = 1 Dim LastRow As Long LastRow = Range("B1").End(xlDown).Row Dim namesList As Variant namesList = GetCountryNamesTable Dim searchName As String, displayName As String While i <= LastRow Dim resultRange As Range Set resultRange = Range("A" & i) Dim cCell As Range, dCell As Range, eCell As Range Set cCell = Range("C" & i) Set dCell = Range("D" & i) Set eCell = Range("E" & i) For j = LBound(namesList, 1) + 1 To UBound(namesList, 1) '/ +1 for header row searchName = namesList(j, 1) displayName = namesList(j, 2) If NameIsInRange(searchName, cCell, dCell, eCell) Then resultRange = displayName Exit For '/ We found our result so we can terminate the loop early End If Next j i = i + 1 Wend End Sub Public Function NameIsInRange(ByVal searchName As String, ByRef range1 As Range, range2 As Range, range3 As Range) As Boolean Dim result As Boolean result = InStr(1, range1, searchName) Or InStr(1, range2, searchName) Or InStr(1, range3, searchName) NameIsInRange = result End Function Public Function GetCountryNamesTable() As Variant With wsCountryNames Dim topLeftCell As Range Set topLeftCell = .Cells(1, 1) '/ "A1" Dim finalRow As Long finalRow = .Cells(.Rows.Count, topLeftCell.Column).End(xlUp).Row Dim tableWidth As Long tableWidth = 2 Dim tableRange As Range Set tableRange = .Range(topLeftCell, Cells(finalRow, topLeftCell.Column + tableWidth - 1)) End With GetCountryNamesTable = tableRange End Function 

3rd Refactoring

I'm going to put your search range in an Array:

Public Function GetSearchRange() As Variant With (codename of your sheet here, call it wsSearchSheet for now) Dim topLeftCell As Range Set topLeftCell = .Cells(1, 1) '/ "A1" Dim finalRow As Long finalRow = .Cells(.Rows.Count, 2).End(xlUp).Row '/ "2" for "B" column Dim finalCol As Long finalCol = 5 '/ "E" column Dim tableRange As Range Set tableRange = .Range(topLeftCell, Cells(finalRow, finalCol)) End With GetCountryNamesTable = tableRange End Function 

And Re-Jig the IsInRange function to deal with an array value instead:

Public Function ValueContainsString(ByVal valueToSearch As Variant, ByVal searchString As String) As Boolean ValueContainsString = InStr(1, CStr(valueToSearch), searchString) End Function 

And implement these changes to the main sub:

Option Explicit Sub PullCountries() Dim i As Long, j As Long, k As Long Dim namesList As Variant namesList = GetCountryNamesTable Dim searchNameCol As Long, displayNameCol As Long searchNameCol = 1 displayNameCol = 2 Dim searchArray As Variant searchArray = GetSearchValues Dim searchStartCol As Long, searchEndCol As Long searchStartCol = 3 searchEndCol = 5 Dim outputCol As Long outputCol = 1 Dim foundMatch As Boolean Dim valueToSearch As Variant Dim searchName As String, displayName As String For i = LBound(searchArray, 1) To UBound(searchArray, 1) foundMatch = False For j = searchStartCol To searchEndCol valueToSearch = searchArray(i, j) For k = LBound(namesList, 1) + 1 To UBound(namesList, 1) '/ +1 for header row searchName = namesList(k, searchNameCol) If ValueContainsString(valueToSearch, searchName) Then displayName = namesList(k, displayNameCol) searchArray(i, outputCol) = displayName foundMatch = True Exit For '/ We found our result so we can terminate the loop early End If Next k If foundMatch Then Exit For Next j Next i '/ Read output back to sheet. For i = LBound(searchArray, 1) To UBound(searchArray, 1) Range("A" & i) = searchArray(i, outputCol) Next i End Sub Public Function ValueContainsString(ByVal valueToSearch As Variant, ByVal searchString As String) As Boolean ValueContainsString = InStr(1, CStr(valueToSearch), searchString) End Function Public Function GetCountryNamesTable() As Variant With wsCountryNames Dim topLeftCell As Range Set topLeftCell = .Cells(1, 1) '/ "A1" Dim finalRow As Long finalRow = .Cells(.Rows.Count, topLeftCell.Column).End(xlUp).Row Dim tableWidth As Long tableWidth = 2 Dim tableRange As Range Set tableRange = .Range(topLeftCell, Cells(finalRow, topLeftCell.Column + tableWidth - 1)) End With GetCountryNamesTable = tableRange End Function Public Function GetSearchValues() As Variant With (codename of your sheet here, call it wsSearchSheet for now) Dim topLeftCell As Range Set topLeftCell = .Cells(1, 1) '/ "A1" Dim finalRow As Long finalRow = .Cells(.Rows.Count, 2).End(xlUp).Row '/ "2" for "B" column Dim finalCol As Long finalCol = 5 '/ "E" column Dim tableRange As Range Set tableRange = .Range(topLeftCell, Cells(finalRow, finalCol)) End With GetSearchValues = tableRange End Function 

Want to change your country names? Just change the values in your table. Your data moves about? Just change the GetSearchValues targets. You can extend either as far as you like.

Total number of values you now have to change in your code: Maybe 10. And then only if your sheet data moves around positions.


Codenames Addendum

Codenames are big and clever. Every worksheet and workbook has a "name" that the user can see and change.

MyCurrentWB.Worksheets("Country Names")

is referencing a sheet name.

A Codename on the other hand is a secret name that can only be set/changed in the IDE.

enter image description here

the name in brackets is the "name". The name not in brackets is the "codename". It is set in the properties window.

enter image description here

If you give a sheet a codename then the user can change the name as much as they like, all you have to do is use

wsCountryNames.Cells()

in your code and it will keep running.

\$\endgroup\$
7
  • \$\begingroup\$ Awesome. This is great advice, given that I'm totally oblivious to naming conventions. What do you mean by "Don't Use :" Is there an issue with the line Else: i = i + 1? \$\endgroup\$ Commented Apr 1, 2016 at 14:49
  • \$\begingroup\$ Don't go anywhere. This got posted halfway through me writing it so more is coming. The problem with : is that it literally says to the compiler "Treat everything after the : as though it were on a new line". You should just put it on a new line and prevent any confusion. \$\endgroup\$ Commented Apr 1, 2016 at 14:58
  • \$\begingroup\$ Ok, I'm confused at the 2nd Refactoring. Running the updated code as is is giving me a Object required error for Set topLeftCell = .Cells(1, 1) Is this expected? \$\endgroup\$ Commented Apr 1, 2016 at 16:11
  • 2
    \$\begingroup\$ I've added an addendum on codenames. \$\endgroup\$ Commented Apr 1, 2016 at 16:15
  • 1
    \$\begingroup\$ You could, but then you're accessing the worksheet every time you set a new range, and accessing a worksheet is slow. Do it once (to put it in an array) and you'll never notice, do it 10,000 times and your sub will slow to a crawl. w/rt multiple outputs: I would consider having a table(array) where each country gets a column and for each row, check off if each country appears in said row. \$\endgroup\$ Commented Apr 4, 2016 at 12:53
8
\$\begingroup\$

You could improve readability by lining things up with line continuations:

If InStr(1, Range("C" & i), "Canada") Or _ InStr(1, Range("D" & i), "Canada") Or _ InStr(1, Range("E" & i), "Canada") _ Then 

Now, if you're doing that 20 times, it's still a clunky piece of code.. just with less horizontal scrolling.

You noticed the repetition, that's good! Next step, is to eliminate it.

All branches do the same thing:

Rng.Offset(0, -1).Value = country i = i + 1 

How about you introduce a variable for country, and only write that bit once?

'...some logic to determine the value of 'country' Rng.Offset(0, -1).Value = country i = i + 1 

You need the values from 3 cells - store them in local variables, and reuse variables instead of accessing the worksheet over and over: accessing a worksheet is the slowest thing Excel-VBA does.

'todo: rename to... probably what the headers are saying Dim goodNameForColumnC As String goodNameForColumnC = Range("C" & i).Value Dim goodNameForColumnD As String goodNameForColumnD = Range("D" & i).Value Dim goodNameForColumnE As String goodNameForColumnE = Range("E" & i).Value 

Something bothers me here. Range, when used all by itself, is actually calling the Application.Range function, which implicitly refers to the active sheet. That's bad, especially if you have code that uses Select and Activate (which, luckily, you don't -- kudos for that!).

Never assume what the active sheet is going to be when your macro runs. If your code is supposed to run off Sheet1, then qualify the Range function call with Sheet1:

'todo: rename to... probably what the headers are saying Dim goodNameForColumnC As String goodNameForColumnC = Sheet1.Range("C" & i).Value Dim goodNameForColumnD As String goodNameForColumnD = Sheet1.Range("D" & i).Value Dim goodNameForColumnE As String goodNameForColumnE = Sheet1.Range("E" & i).Value 

Just doing that makes your code much less fragile.


So you have a number of strings you're looking for. Make an array for them:

Dim countries As Variant countries = Array("Canada", "United Stated", "UK", "Spain", "Portugal", "Japan", "Greece", "Italy") 

Now, what you really want to do, is to determine if either of the 3 cells you're looking in contains any of these strings, and if it does, you need to know what that string is. That really sounds like a job for a function.

Make another array for your cell values:

Dim cellValues As Variant cellValues = Array(goodNameForColumnC, goodNameForColumnD, goodNameForColumnE) 

Now you can pass that array as a parameter to a function that does just that:

Private Function ContainsAnyOf(ByRef cellValues(), ByRef countries(), ByRef outCountry As String) As Boolean End Function 

Notice the out prefix in outCountry? That's "the right way" to use Hungarian Notation - kudos for not using the "wrong way" in your variable names. Too often we see unreadable names with prefixes like s for String, i for Integer, o for Object, etc. But out here, means "this variable is really a return value".

So you can do this:

Dim country As String If ContainsAnyOf(cellValues, countries, country) Then Rng.Offset(0, -1).Value = country i = i + 1 End If 

And be done with it.

What's the implementation like, you'll ask? Simple - it's all about loops, and returning early:

Private Function ContainsAnyOf(ByRef cellValues(), ByRef countries(), ByRef outCountry As String) As Boolean Dim valueIndex As Integer Dim countryIndex As Integer For valueIndex = LBound(cellValues) To UBound(cellValues) For countryIndex = LBound(countries) To UBound(countries) If InStr(1, cellValues(valueIndex), countries(countryIndex)) > 0 Then outCountry = countries(countryIndex) ContainsAnyOf = True Exit Function End If Next countryIndex Next valueIndex outCountry = vbNullString ContainsAnyOf = False End Function 

The function will return True when any of the cell values contains any of the country names, and the outCountry will be either an empty string or the country name that was found.

Best of all, if you need to add more countries, all you need to do is add items to the array!


That could leave your final code looking like this:

Public Sub PullCountries() Dim lastRow As Long lastRow = Sheet1.Range("B1").End(xlDown).Row Dim currentRow As Long currentRow = 1 Dim target As Range Dim countries As Variant countries = Array("Canada", "United Stated", "UK", "Spain", _ "Portugal", "Japan", "Greece", "Italy") 'todo: rename to... probably what the headers are saying Dim goodNameForColumnC As String Dim goodNameForColumnD As String Dim goodNameForColumnE As String Dim cellValues As Variant Dim country As String While currentRow <= lastRow Set target = Sheet1.Range("B" & currentRow) goodNameForColumnC = Sheet1.Range("C" & currentRow).Value goodNameForColumnD = Sheet1.Range("D" & currentRow).Value goodNameForColumnE = Sheet1.Range("E" & currentRow).Value cellValues = Array(goodNameForColumnC, goodNameForColumnD, goodNameForColumnE) If ContainsAnyOf(cellValues, countries, country) Then target.Offset(0, -1).Value = country End If currentRow = currentRow + 1 Wend End Sub 
\$\endgroup\$
1
  • 1
    \$\begingroup\$ Awesome answer. These are great suggestions, and I'm really learning a lot. Putting your suggestions to use right now! Cheers! \$\endgroup\$ Commented Apr 1, 2016 at 15:31

You must log in to answer this question.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.