2

I need to remove all characters that can't be part of URLs, such as spaces, <, >, etc.

I am getting the data from a database.
For example, if the retrieved data is: Product #number 123!

the new string should be: Product-number-123

Should I use a regex? Is there a regex pattern for that? Thanks

I take it you want an SEO-friendly string, not a data-preserving (URI-escaped) string? Commented Jul 24, 2009 at 10:33

3 Answers

2

Here is an example of how to generate a URL-friendly string from a "normal" string:

using System.Text.RegularExpressions;

public static string GenerateSlug(string phrase)
{
    string str = phrase.ToLower();
    str = Regex.Replace(str, @"[^a-z0-9\s-]", "");   // remove invalid chars
    str = Regex.Replace(str, @"\s+", " ").Trim();    // convert multiple spaces into one space
    str = str.Substring(0, str.Length <= 45 ? str.Length : 45).Trim(); // cut to 45 chars and trim
    str = Regex.Replace(str, @"\s", "-");            // replace spaces with hyphens
    return str;
}

You may want to remove the cut-and-trim step (the Substring call) if you are sure that you always want the full string.

Source
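As a rough sketch of how the same steps could also guard against runs of hyphens, the variant below collapses any mix of whitespace and hyphens into a single separator before inserting hyphens. The class name SlugDemo and the maxLength parameter are made up for illustration and are not part of the original answer; like the original, it lowercases the input, so the question's sample comes out in lower case.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SlugDemo
{
    // Hypothetical variant of GenerateSlug; maxLength is an assumed parameter.
    public static string GenerateSlug(string phrase, int maxLength = 45)
    {
        string str = phrase.ToLowerInvariant();
        str = Regex.Replace(str, @"[^a-z0-9\s-]", "");    // drop everything except letters, digits, whitespace, hyphens
        str = Regex.Replace(str, @"[\s-]+", " ").Trim();  // collapse runs of whitespace and hyphens into one space
        if (str.Length > maxLength)
        {
            str = str.Substring(0, maxLength).Trim();     // cut to the maximum length and trim again
        }
        return str.Replace(' ', '-');                     // single hyphen between words
    }

    public static void Main()
    {
        Console.WriteLine(GenerateSlug("Product #number 123!")); // product-number-123
        Console.WriteLine(GenerateSlug("my- --name- -is"));      // my-name-is
    }
}
```

Because the separators are normalized in one pass, input that already contains hyphens next to spaces should not produce consecutive hyphens in the result.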


4 Comments

It might be worth replacing multiple hyphens at the end of the above too, or you can end up with URLs like my----name----is.
Is this a problem for strings other than those that already contain hyphens, such as "my- --name- -is"?
This seems like a very complicated piece of code to accomplish something that can be done with a single regex replace.
It's not that complicated, just verbose. And much easier to read than a single regex-replace.
1

An easy regex to do this is:

string cleaned = Regex.Replace(url, @"[^a-zA-Z0-9]+","-"); 
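A minimal sketch of this one-liner applied to the sample from the question. Because the trailing "!" is also replaced by a hyphen, the sketch trims hyphens from both ends; that Trim('-') call, the class name CleanDemo and the method name Clean are additions for illustration, not part of the answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CleanDemo
{
    // Hypothetical helper wrapping the single regex replace.
    public static string Clean(string input)
    {
        // Each run of non-alphanumeric characters becomes exactly one hyphen.
        string cleaned = Regex.Replace(input, @"[^a-zA-Z0-9]+", "-");
        // Remove hyphens left over from leading or trailing punctuation.
        return cleaned.Trim('-');
    }

    public static void Main()
    {
        Console.WriteLine(Clean("Product #number 123!")); // Product-number-123
        Console.WriteLine(Clean("O'Donnel"));             // O-Donnel
    }
}
```

The second call shows the trade-off with this pattern: an apostrophe inside a name is turned into a hyphen as well.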

4 Comments

Yes, this is fairly simple, perhaps follow it up with a replace of consecutive "-". Off the top of my head something like: cleaned = Regex.Replace(cleaned, @"--+",""); should do the trick.
Edited the answer to include my suggestion as it checked out. Hope you don't mind :)
I wouldn't mind if your edit is correct, but it isn't. My original regex replacement never produces consecutive dashes.
This yields "too many" hyphen characters in certain situations, including when you want to spell the name "O'Donnel".
1

To just perform the replacement of special characters like "<" you can use Server.UrlEncode(string s). And you can do the opposite with Server.UrlDecode(string s).
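A small round-trip sketch of encoding and decoding. Server.UrlEncode needs an ASP.NET request context, so this example swaps in System.Net.WebUtility.UrlEncode and UrlDecode, which can be run from a console program; that substitution and the class name EncodeDemo are assumptions made for illustration, and the exact escaped output may differ between the two APIs.

```csharp
using System;
using System.Net;

public static class EncodeDemo
{
    public static void Main()
    {
        string original = "Product #number 123!";

        // Escape characters that are not safe in a URL.
        string encoded = WebUtility.UrlEncode(original);

        // Decoding should give back the original text, so no data is lost.
        string decoded = WebUtility.UrlDecode(encoded);

        Console.WriteLine(encoded);
        Console.WriteLine(decoded == original);
    }
}
```

Unlike a slug, the encoded string preserves every character of the input, which is why it is reversible but less readable.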

2 Comments

Looks like he's after a human-readable (aka SEO-friendly) URL rather than one that includes all the extra characters. This would work, but the result would not be all that readable.
That's a fair point. I obviously missed that part of the question.