The overall question could be better for stackoverflow. This is more about reviewing code, where one would expect to see the code. There is no source for the Levenshtein distance.
But well, let me separate my answer in two parts.
First, the overall question
Are there any good ideas as to how to improve the code, or trim my search space so I might come to a faster solution?
According to the given preconditions:
One string being a description of a item in inventory and one string being a description of a item, but possibly truncated/missing some characters
One string is correct, the other one has flaws in a way, that characters are missing.
If we rephrase this, we have a string correct and a string flawed with this properties:
For all characters char in flawed
char is included in correct and count(char) in flawed is smaller or equal to count(char) in correct ("contained"). index(char) in flawed is smaller or equal to index(char) in correct ("ordered").
We use this properties to base our algorithm.
Trivial version:
For every flawed string, take every char, look if it is inside any correct string, if yes, remove char from correct string and continue with next char. If we found a correct string, which includes all chars from flawed string, we have a good candidate.
An implementation could be:
public static boolean isCharArrayIncludedInString(final char[] arrayChar, final String string) { final char[] targetArray = string.toCharArray(); for (final char arrayCharItem : arrayChar) { boolean isFound = false; for (int i = 0; i < targetArray.length; ++i) { //if we have large targetArrays, we could improve this by a linkedlist and removing the entries if (arrayCharItem == targetArray[i]) { targetArray[i] = 0; isFound = true; break; } } if (!isFound) return false; } return true; } //if we use the ordered property, too: public static boolean isCharArrayIncludedInString(final char[] arrayChar, final String string) { final char[] targetArray = string.toCharArray(); int innerIndex = 0; for (final char arrayCharItem : arrayChar) { boolean isFound = false; while (innerIndex < targetArray.length) { if (arrayCharItem == targetArray[innerIndex++]) { // here we use the "ordered" property isFound = true; break; } } if (!isFound) return false; } return true; } // hint: i choosed isCharArrayIncludedInString instad of isStringIncludedInString, because the second typically assumes contiguous appearance
We can improve this, if you know that at least x of the first characters must be the same. Then do a startsWith for the substring.
The algorithm has a worst time complexity of (nmlen(longest string))
The not trivial solution:
Create a modified trie data structure. If we did not found a character for a node, look for all subnodes. The complexity is hard to guess, but in the average case should be better than the trivial version.
Second part, Code quality
Some parts are already pointed out, I try to avoid them.
public boolean updateItems2(LinkedList<String> arg)
Bad name for method (what is the meaning of the 2? What is the difference between 1 and 2?), expected result from a update method with return boolean is the success state, which is not the case(?). Rename it perhaps to printAllMatchingDescriptions or something like that. getAllMatchingDescriptions if you want to return it later.
public boolean updateItems2(LinkedList<String> arg)
Bad name for argument. Clearly, this is an argument, and arg could be an abbreviation for this. listDescriptions could be a better name. And use List if there is no specific need for LinkedList:
public boolean printAllMatchingDescriptions(List<String> listDescriptions)
// Create a list of items with the CL descriptions LinkedList<Item> clItems = new LinkedList<Item>();
Yes, the comment describes the next line. But it has no additional value as already written by the source code. You should avoid such comments.
LinkedList<Item> clItems = new LinkedList<Item>();
Again, i would avoid abbreviations and would name it listItems.
for(String a : arg) { Item newCL = new Item(); newCL.setDesc(a); clItems.add(newCL); }
It looks like the description is a good candidate for the constructor.
And again, try to avoid abbreviations.
It could be:
for (final String listDescriptionsItem : listDescriptions) listItems.add(new Item(listDescriptionsItem));
System.out.println("Update Items got " + clItems.size() + " items");
A logger could be nice. But ok, for quick & dirty debugging in small programs, println could be fine.
// Try to set one item in the inventory to active for each clItem int setNum = 1; for(Item cl : clItems) for(Item cur : theApp.TheMasterList.Inventory) if(cur.getActv() != "YES" && sameItemDesc(cl.getDesc(), cur.getDesc())) { System.out.println("Set#:" + setNum + "\nInvItem:" + cur.getDesc() + "\nCLItem:" + cl.getDesc()); cl.setActv("YES"); cur.setActv("YES"); setNum++; break; }
int setNum = 1;why does it start with 1? From the code, it looks like you are counting the number of activated items. At least, this is what is happening. From the println it looks like this should be a index. If the second is the case, you should modify the code.
Use brackets for statements with more than one line. This is confusing.
Again, abbreviations.
theApp.TheMasterList.Inventory this does not look good. But I can not make any suggestions, because code is unknown. Try to avoid static access for dynamic parts.
The string "YES"/"NO" thing was already mentioned. Names for better method names could be isActivated() and activate().
You could call the sameItemDesc with two items, rather than the description.
cur.setActv("YES"); this does not make sense. You are only inside the statement, if this is already true. It could be:
for (final Item listItemsItem : listItems) { for (final Item inventoryItem : Inventory) { if (inventoryItem.isActivated()) continue; if (!isFirstItemSameAsSecondItem(listItemsItem, inventoryItem)) continue; System.out.println("Set#:" + numberOfActivatedItems + "\nInvItem:" + inventoryItem.getDescription() + "\nCLItem:" + listItemsItem.getDescription()); listItemsItem.activate(); numberOfActivatedItems++; break; } }
// Print out the items that were not set int notSet = 0; for(Item cl : clItems) if(cl.getActv() != "YES") {System.out.println("Not Set:" + cl.getDesc());notSet++;}
You know it already, abbreviations. And unexpected brackets. And this method is perhaps not necessary. Because the number of deactivated items is size of the list (all items) minus activated:
final int numberOfDeactivatedItems = listItems.size() - numberOfActivatedItems;
// See how many we set int actv = 0; for(Item a : theApp.TheMasterList.Inventory) if(a.getActv() == "YES") actv++; System.out.println("UpdateItems set:" + actv + " notset:" + notSet);
Why do you calculate this a second time?
The sameItemDesc could be changed too, but I will spare this part because I do not know if it is necessary.
Overall, we could have something like this:
public void printAllMatchingDescriptions(final List<String> listDescriptions) { final List<Item> listItems = new LinkedList<Item>(); for (final String listDescriptionsItem : listDescriptions) listItems.add(new Item(listDescriptionsItem)); System.out.println("Update Items got " + listItems.size() + " items"); // Plan: Check if any of the listDescriptions is contained in any state in the overall list int numberOfActivatedItems = 0; for (final Item listItemsItem : listItems) { for (final Item inventoryItem : Inventory) { if (inventoryItem.isActivated()) // if you have a lot of activated ones, it could be better to create a new list with only deactivated entries continue; if (!isFirstItemSameAsSecondItem(listItemsItem, inventoryItem)) continue; System.out.println("Set#:" + numberOfActivatedItems + "\nInvItem:" + inventoryItem.getDescription() + "\nCLItem:" + listItemsItem.getDescription()); listItemsItem.activate(); numberOfActivatedItems++; break; } } final int numberOfDeactivatedItems = listItems.size() - numberOfActivatedItems; System.out.println("Items activated:" + numberOfActivatedItems + ", deactivated:" + numberOfDeactivatedItems); showUserDif(listItems); // I do not know what is happening here. At least abbreviations again } public static boolean isFirstItemSameAsSecondItem(final Item item, final Item other) { return isCharArrayIncludedInString(item.getDescription().toCharArray(), other.getDescription()); } public static boolean isCharArrayIncludedInString(final char[] arrayChar, final String string) { final char[] targetArray = string.toCharArray(); int innerIndex = 0; for (final char arrayCharItem : arrayChar) { boolean isFound = false; while (innerIndex < targetArray.length) { if (arrayCharItem == targetArray[innerIndex++]) { // here we use the "ordered" property isFound = true; break; } } if (!isFound) return false; } return true; }