4

I need ideas for the best-performing way to remove/filter parts of a string

I have:

string Input = "view('512', 3, 159);"; 

What's the best-performing way to remove "view(", ");" and the quotes? I can do this:

Input = Input.Replace("view(","").Replace("'","").Replace("\"","").Replace(");",""); 

but that seems rather inelegant. This alternative:

Input.Split('(')[1].Split(')')[0].Replace("'", "");

seems somewhat better.

I do not want to use regular expressions; I need to make the application as fast as I can. Thanks in advance! :)

5
  • There's got to be some linq in here somewhere Commented Jul 11, 2011 at 22:10
  • Why do you not want to use regular expressions? Commented Jul 11, 2011 at 22:11
  • You're saying best as in best performance, right? Commented Jul 11, 2011 at 22:12
  • Regex is possibly faster in this case... Do you need to deal with nested parentheses? Commented Jul 11, 2011 at 22:15
  • @Freed: yes, best performance. @Frédéric Hamidi: yes, question updated. Commented Jul 12, 2011 at 2:47

10 Answers

4

You could use a simple linq statement:

string Input = "view('512', 3, 159);";
string output = new String( Input.Where( c => Char.IsDigit( c ) || c == ',' ).ToArray() );

Output: 512,3,159

If you want the spaces, just add a check in the where clause.
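As a minimal sketch of the extra check mentioned above, the filter below also lets spaces through. The class and method names (`DigitFilter`, `KeepDigitsCommasAndSpaces`) are hypothetical and only serve the illustration. Note that `char.IsDigit` accepts any Unicode decimal digit, not just ASCII `0` to `9`.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of the original answer.
public static class DigitFilter
{
    // Keeps digits, commas and spaces; drops every other character.
    public static string KeepDigitsCommasAndSpaces(string input)
    {
        return new string(
            input.Where(c => char.IsDigit(c) || c == ',' || c == ' ').ToArray());
    }
}
```

Calling `DigitFilter.KeepDigitsCommasAndSpaces("view('512', 3, 159);")` should give `512, 3, 159`.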


2

You could use just a Substring to remove the view( and );:

Input.Substring(5, Input.Length - 7) 

Other than that it looks reasonably efficient. Plain string operations are pretty well optimised.

So:

Input = Input.Substring(5, Input.Length - 7)
    .Replace("'", String.Empty)
    .Replace("\"", String.Empty);


2

char[] Output = Input
    .SkipWhile(x => x != '(')            // skip before open paren
    .Skip(1)                             // skip open paren
    .TakeWhile(x => x != ')')            // take everything until close paren
    .Where(x => x != '\'' && x != '\"')  // except quotes
    .ToArray();
return new String(Output);


2

Hope this helps

Regex.Replace("view('512', 3, 159);",@"[(view)';]","") 


1

IndexOf, LastIndexOf, and Substring are probably fastest.

string Input = "view('512', 3, 159);";
int p1 = Input.IndexOf('(');
int p2 = Input.LastIndexOf(')');
Input = Input.Substring(p1 + 1, p2 - p1 - 1);
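The snippet above leaves the quotes in place and assumes both parentheses are present. A possible sketch that adds quote removal and a guard for missing parentheses follows; `ParenExtractor` and `Extract` are made-up names, and returning the input unchanged when no pair is found is an assumption, not something the answer specifies.

```csharp
using System;

// Hypothetical helper, not part of the original answer.
public static class ParenExtractor
{
    // Returns the text between the first '(' and the last ')', with single
    // and double quotes removed. When no well-ordered pair of parentheses
    // exists, the input is returned unchanged (an assumed fallback).
    public static string Extract(string input)
    {
        int open = input.IndexOf('(');
        int close = input.LastIndexOf(')');
        if (open < 0 || close <= open)
        {
            return input;
        }

        return input.Substring(open + 1, close - open - 1)
                    .Replace("'", string.Empty)
                    .Replace("\"", string.Empty);
    }
}
```

With the sample input, `ParenExtractor.Extract("view('512', 3, 159);")` should yield `512, 3, 159`.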


1

Use the following:

System.Text.StringBuilder sb = new System.Text.StringBuilder();
int state = 0;
for (var i = 0; i < Input.Length; i++)
{
    switch (state)
    {
        case 0: // beginning
            if (Input[i] == '(')
            {
                state = 1; // seen left parenthesis
            }
            break;
        case 2: // seen end parentheses
            break; // ignore
        case 1:
            if (Input[i] == ')')
            {
                state = 2; // seen right parentheses
            }
            else if (Input[i] != '\'')
            {
                sb.Append(Input[i]);
            }
            break;
    }
}
Console.WriteLine(sb.ToString());

3 Comments

Trouble is this might well be quickest (worth checking anyway)
I will reinstate the regular expression if it does turn out to be faster, though I believe it wouldn't make much difference.
As it turns out, the regular expression that was previously here was more than twice as slow, after running each approach 200,000 times.
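For anyone who wants to repeat that kind of measurement, here is a rough timing sketch built on `Stopwatch`. The names `FilterBenchmark`, `WithLoop`, `WithRegex` and `Time` are hypothetical. The regular expression used here is the character-class pattern from another answer on this page, which is an assumption: the expression the comment refers to is no longer shown. Results will vary with machine, runtime and build configuration.

```csharp
using System;
using System.Diagnostics;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical benchmark harness, not part of the original answer.
public static class FilterBenchmark
{
    // Assumed pattern: removes the characters ( v i e w ) ' ; one by one.
    private static readonly Regex Unwanted =
        new Regex(@"[(view)';]", RegexOptions.Compiled);

    // Single pass: copy everything between '(' and ')' except single quotes.
    public static string WithLoop(string input)
    {
        var sb = new StringBuilder(input.Length);
        bool inside = false;
        foreach (char c in input)
        {
            if (!inside)
            {
                inside = c == '(';
            }
            else if (c == ')')
            {
                break;
            }
            else if (c != '\'')
            {
                sb.Append(c);
            }
        }
        return sb.ToString();
    }

    public static string WithRegex(string input)
    {
        return Unwanted.Replace(input, string.Empty);
    }

    // Returns the elapsed milliseconds for `iterations` calls of `filter`.
    public static long Time(Func<string, string> filter, string input, int iterations)
    {
        filter(input); // warm-up call so JIT compilation is not measured
        var watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            filter(input);
        }
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }
}
```

A comparison could then look like `FilterBenchmark.Time(FilterBenchmark.WithLoop, "view('512', 3, 159);", 200000)` against the same call with `FilterBenchmark.WithRegex`. A single `Stopwatch` run is only a coarse indicator, so repeating the measurement several times is advisable.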

1

var result = new string(Input.ToCharArray()
    .SkipWhile(i => i != '\'')
    .TakeWhile(i => i != ')')
    .ToArray());


1

Why don't you want to use regular expressions? Regular expressions are heavily optimised and will be much faster than any hand-written hack.

This is Java (I run Linux, so I can't run C#), but I hope you get the idea.

input.replace("view(","").replace("'","").replace("\"","").replace(");",""); 

A million repetitions of the above run in about 6 seconds on my computer, whereas the regular expression below runs in about 2 seconds.

// create java's regex matcher object
// matcher is looking for sequences of digits (valid integers)
Matcher matcher = Pattern.compile("(\\d+)").matcher(s);
StringBuilder builder = new StringBuilder();
// whilst we can find matches append the match plus a comma to a string builder
while (matcher.find()) {
    builder.append(matcher.group()).append(',');
}
// return the built string less the last trailing comma
return builder.substring(0, builder.length()-1);

If you want to find valid decimals as well as integers, use the following pattern instead, though it runs slightly slower than the original.

"(\\d+(\\.\\d*)?)" 
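Since the question is about C#, a rough .NET equivalent of the Java approach might look like the sketch below. `NumberExtractor` and `Extract` are hypothetical names, and no timing claim is made for this version; the figures quoted above were measured on the Java code only.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical C# counterpart of the Java snippet, for illustration only.
public static class NumberExtractor
{
    // Matches a run of digits with an optional fractional part,
    // for example "512" or "1.5".
    private static readonly Regex Number =
        new Regex(@"\d+(\.\d*)?", RegexOptions.Compiled);

    // Joins every match with a comma, so no trailing comma has to be trimmed.
    public static string Extract(string input)
    {
        return string.Join(
            ",",
            Number.Matches(input).Cast<Match>().Select(m => m.Value));
    }
}
```

For the sample input, `NumberExtractor.Extract("view('512', 3, 159);")` should return `512,3,159`. Using `string.Join` also avoids the substring call on an empty builder when the input contains no numbers.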

1 Comment

I know you said you didn't want to use regular expressions, but I think you're making a mistake. After the initial learning curve, writing a good regex is faster, clearer and cleaner than a hand-written alternative, and it will also perform faster than the functional and imperative answers here.
0

More generic

void Main()
{
    string Input = "view('512', 3, 159);";
    var startingPoint = Input.IndexOf('(') + 1;
    var result = Input.Substring(startingPoint, Input.IndexOf(')') - startingPoint);
}


0

The fastest way would be Input = Input.Substring(5, Input.Length - 7)

5 Comments

This won't get rid of the quote after 512.
I assume the question is how to extract part of a string in quotes, not how to extract that specific range in the specific example?
@Femaref, or of the one before 512 either.
That will cause an ArgumentOutOfRangeException. You can't get characters beyond the end of the string.
@Guffa: fixed, I forgot the 2nd param is length and not endPos
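The exception discussed in these comments comes from `Substring(startIndex, length)` being asked for more characters than the string holds. One way to keep the fixed-offset approach while avoiding that is to check the wrapper first, as in this sketch. `FixedOffsetTrim` and `Strip` are hypothetical names, and returning the input untouched when the wrapper is absent is an assumed policy. As the comments point out, the quotes are still left in place.

```csharp
using System;

// Hypothetical helper, not part of the original answer.
public static class FixedOffsetTrim
{
    private const string Prefix = "view(";
    private const string Suffix = ");";

    // Strips the fixed prefix and suffix when both are present;
    // otherwise returns the input unchanged (an assumed fallback).
    public static string Strip(string input)
    {
        bool wrapped = input.Length >= Prefix.Length + Suffix.Length
            && input.StartsWith(Prefix, StringComparison.Ordinal)
            && input.EndsWith(Suffix, StringComparison.Ordinal);
        if (!wrapped)
        {
            return input;
        }

        // Second argument is a length, not an end position.
        return input.Substring(
            Prefix.Length,
            input.Length - Prefix.Length - Suffix.Length);
    }
}
```

For the sample input, `FixedOffsetTrim.Strip("view('512', 3, 159);")` should return `'512', 3, 159`.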
