5

I need to find the longest common substring of n strings and use the result in my project.

Is there any existing implementation/library in java which already does this?

3

5 Answers 5

10

What about concurrent-trees ?

It is a small (~100 KB) library available in Maven Central. The algorithm uses combination of Radix and Suffix Trees. Which is known to have a linear time complexity (wikipedia).

public static String getLongestCommonSubstring(Collection<String> strings) { LCSubstringSolver solver = new LCSubstringSolver(new DefaultCharSequenceNodeFactory()); for (String s: strings) { solver.add(s); } return solver.getLongestCommonSubstring().toString(); } 
Sign up to request clarification or add additional context in comments.

Comments

6

We can use below code to identify longest common sub string of n Strings

public static String identifyCommonSubStrOfNStr(String [] strArr){ String commonStr=""; String smallStr =""; //identify smallest String for (String s :strArr) { if(smallStr.length()< s.length()){ smallStr=s; } } String tempCom=""; char [] smallStrChars=smallStr.toCharArray(); for (char c: smallStrChars){ tempCom+= c; for (String s :strArr){ if(!s.contains(tempCom)){ tempCom=c; for (String s :strAarr){ if(!s.contains(tempCom)){ tempCom=""; break; } } break; } } if(tempCom!="" && tempCom.length()>commonStr.length()){ commonStr=tempCom; } } return commonStr; } 

Comments

0

This page pretty much gives exactly what you want in quite a few languages.

public static int longestSubstr(String first, String second) { if (first == null || second == null || first.length() == 0 || second.length() == 0) { return 0; } int maxLen = 0; int fl = first.length(); int sl = second.length(); int[][] table = new int[fl][sl]; for (int i = 0; i < fl; i++) { for (int j = 0; j < sl; j++) { if (first.charAt(i) == second.charAt(j)) { if (i == 0 || j == 0) { table[i][j] = 1; } else { table[i][j] = table[i - 1][j - 1] + 1; } if (table[i][j] > maxLen) { maxLen = table[i][j]; } } } } return maxLen; } 

4 Comments

its for two substrings... it want it for n substrings... say if you have s1="aa111" s2="aa221" s3="1", lcs for s1 & s2 is aa , lcs for aa nad s3 is "" , but the correct answer is "1"
@user1628340 you can use the above code and just run it iteratively. If you have an N-element array, just run the code N-1 times, running it on the pairs 0,1; 1,2; 2,3; 3,4; ... until you're done.
@sigpwned that does'nt work too. say if have s1= "--a.txt", s2= "--b.txt" ,s3= "--c.swe" ... lcs of s1 and s2 is s4= .txt, lcs of s2 and s3 id s5= "--" ... in the next iteration, lcs of s4 and s5 is "", while the correct answer is "--"
ah, true. you'd need to track all common substrings for each pair, and compute the intersection as you go. choose the longest string from the final set; that's your lcs.
0

Well you could try to extend the version of wikipedia (http://en.wikipedia.org/wiki/Longest_common_substring_problem) for n Strings by putting it into a loop which iterates over all your Strings.

3 Comments

looping wont work! say if you have s1="aa111" s2="aa221" s3="1", lcs for s1 & s2 is aa , lcs for aa nad s3 is "" , but the correct answer is "1"
the iteration should work on original strings, not on temporary results of other comparisons... then you get the correct solution!
that does'nt work too. say if have s1= "--a.txt", s2= "--b.txt" ,s3= "--c.swe" ... lcs of s1 and s2 is s4= .txt, lcs of s2 and s3 id s5= "--" ... in the next iteration, lcs of s4 and s5 is "", while the correct answer is "--"
0
public String findLongestCommonSubstring(String source, String dest) { int table[][] = new int[source.length() + 1][dest.length() + 1]; for (int col = 0; col < table[0].length; col++) { table[0][col] = 0; } for (int row = 0; row < table.length; row++) { table[row][0] = 0; } int max = 0; int index = 0; for (int row = 1; row < table.length; row++) { char sourceChar = source.charAt(row - 1); for (int col = 1; col < table[row].length; col++) { char destChar = dest.charAt(col - 1); if (sourceChar == destChar) { table[row][col] = table[row - 1][col - 1] + 1; if (table[row][col] > max) { max = table[row][col]; index = row; } } else { table[row][col] = 0; } } } return source.substring(index - max, index); } 

1 Comment

Welcome to Stack Overflow! While this code may solve the question, including an explanation of how and why this solves the problem would really help to improve the quality of your post, and probably result in more up-votes. Remember that you are answering the question for readers in the future, not just the person asking now. Please edit your answer to add explanations and give an indication of what limitations and assumptions apply.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.