General
http://www.geekzilla.co.uk/
Innovation Team's dumping ground
GeekZilla.co.uk (editor@GeekZilla.co.uk, webmaster@GeekZilla.co.uk)
Tue, 10 Jun 2003 04:00:00 GMT
Tue, 10 Jun 2003 09:41:01 GMT

Cleaning a string using a regular expression ready for placing in a URL
http://www.geekzilla.co.uk/viewFE7A2539-E0EE-4452-A1CC-BC4D18966F0F.htm
9/1/2008

Regex match html content without screwing up the tags
http://www.geekzilla.co.uk/view86FE27C3-E286-4EC7-8275-6E0D8BB58C42.htm

When needing to highlight words in a string containing HTML, we soon ran into problems when the word we were searching for appeared in the middle of a tag.

Imagine a fragment of HTML in which the word geekzilla appears both in the visible text and inside a tag. If I wanted to bold all occurrences of geekzilla, I'd usually do a simple replace on the word. Unfortunately, when dealing with HTML rather than just text, this will screw up my tag, because the bold markup also lands inside it.

We did a lot of googling and found loads of people discussing ways to ignore the tags. Suggestions ranged from SAX parsers to character-by-character loops (nasty). Armed with an excellent regex for matching an entire HTML tag, we came up with the following solution.

Our Solution

Use a custom Regex match evaluator to ignore any tags. This works well and is very fast.

There may be a slicker way to do this. I hope someone is inspired enough to figure it out and post a comment :)

26/11/2007

Using Regex Look Arounds
http://www.geekzilla.co.uk/view6BD88331-350D-429F-AB49-D18E90E0E705.htm

.NET supports four types of "Look Around". These can be used to make sure that patterns do or do not appear before or after whatever it is you're matching on.
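As a rough illustration of what a look-around buys you, the sketch below runs an unguarded pattern and a guarded one over the same sentence. The sample sentence, the assumed shape of a registration number (two letters, up to three digits, three letters) and the names LookAroundDemo and FindAll are all made up for this sketch; they are not the code from the original post.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LookAroundDemo
{
    // Hypothetical sample text (an assumption, not taken from the original post).
    public const string Sample = "Car ab123abc will be ready in 1 or 2 days";

    // Assumed shape: two letters, up to three digits, three letters, optional spaces.
    public const string Naive = @"[a-z]{2}\s?\d{1,3}\s?[a-z]{3}";

    // Same pattern, guarded so a match cannot start or end in the middle of a word.
    public const string Guarded = @"(?<![a-z])[a-z]{2}\s?\d{1,3}\s?[a-z]{3}(?![a-z])";

    // Returns the text of every match of pattern in input, ignoring case.
    public static List<string> FindAll(string pattern, string input)
    {
        List<string> found = new List<string>();
        foreach (Match m in Regex.Matches(input, pattern, RegexOptions.IgnoreCase))
        {
            found.Add(m.Value);
        }
        return found;
    }

    public static void Main()
    {
        // prints: ab123abc, or 2 day
        Console.WriteLine(string.Join(", ", FindAll(Naive, Sample).ToArray()));
        // prints: ab123abc
        Console.WriteLine(string.Join(", ", FindAll(Guarded, Sample).ToArray()));
    }
}
```

The guarded version rejects "or 2 day" because the letter "s" follows it, which is exactly what the negative look ahead forbids.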
Consider some text that contains a registration number. If I wanted to pluck out the registration number, I'd probably start with a simple expression for its shape. The problem is, although that expression does indeed return '''ab123abc''' as a match, it also returns '''or 2 day'''.

Using a negative lookbehind and a negative lookahead I can stop the RegEx engine from matching parts of words. Wrapping the expression in the negative look behind '''(?<![a-z])''' and the negative look ahead '''(?![a-z])''' means it only matches on the registration number.

The available "Look Around" expressions are:

||'''?='''||Positive Look Ahead||
||'''?<!'''||Negative Look Behind||
||'''?<='''||Positive Look Behind||
||'''?!'''||Negative Look Ahead||

18/9/2006

Using named match groups in expressions
http://www.geekzilla.co.uk/view1C8D902A-0E30-4EAC-8287-9D6CB27FC66F.htm

Referring to a match group by its position returns the captured value, '''user''' in this post's example. Sometimes it makes life easier to refer to the group by name rather than its position (or GroupNum). To do this we need to insert the name into the expression.

Here the match group is called '''username'''. The Regex parser recognises this name because it is declared within the group brackets '''()''' and is preceded by a '''?''', i.e. ('''?<username>'''.*?). The Groups collection can be indexed by either GroupNum or GroupName.

4/8/2006

Using Regex.Replace
http://www.geekzilla.co.uk/viewAACA13B1-A56A-4F6F-8331-13A40F49F0AD.htm

Say you want to get the username from a fully qualified username such as '''MyDomain\AUser'''. Most developers would turn to the Substring and LastIndexOf functions. Instead of using these functions, you should consider the use of Regex.Replace().
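A minimal sketch of that suggestion, assuming the input has the DOMAIN\user shape. The pattern, the class name UserNameDemo and both method names are illustrative assumptions rather than the post's original code; the Substring version is included only for comparison.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UserNameDemo
{
    // The Substring/LastIndexOf approach, for comparison.
    // LastIndexOf returns -1 when there is no backslash, so the whole string comes back.
    public static string StripDomainWithSubstring(string qualifiedName)
    {
        return qualifiedName.Substring(qualifiedName.LastIndexOf('\\') + 1);
    }

    // The Regex.Replace approach: match the whole string, capture whatever follows
    // the last backslash in group 1, and replace the match with "$1".
    // If there is no backslash, nothing matches and the input is returned unchanged.
    public static string StripDomain(string qualifiedName)
    {
        return Regex.Replace(qualifiedName, @"^.*\\(.*?)$", "$1");
    }

    public static void Main()
    {
        Console.WriteLine(StripDomain(@"MyDomain\AUser"));              // prints: AUser
        Console.WriteLine(StripDomainWithSubstring(@"MyDomain\AUser")); // prints: AUser
    }
}
```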
The Replace method matches on the whole string and returns just the first group: '''(.*?)''' in the pattern is represented in the replace expression by '''$1'''.

As I'm sure you can imagine, the possibilities here are endless. I hope you find uses for Regex.Replace in your applications.

4/8/2006

Using Regular Expressions to validate a filename in a FileUpload control
http://www.geekzilla.co.uk/view9833D332-2813-43EE-9E8A-45F2B97189C3.htm

I use a short regular expression check to ensure that the file that has been uploaded is of type JPG, JPEG, PNG or GIF.

31/7/2006

Highlighting keywords in text using Regex.Replace (Perfect for SEO)
http://www.geekzilla.co.uk/view9106A22C-16B7-49C7-AC47-0CE9A1106CC8.htm

Why

I needed to take some text and bold certain keywords before returning the data to the web browser, to enhance my {Search Engine Optimization}http://www.kwiboo.com/SearchEngineOptimisation.aspx

Example

The following example shows how I achieved this, although it does contain dummy data. I created a new C# 2005 Console App, added the calling code to the Main method, and then added the supporting static methods.

Explanation

First of all, I needed to swap out the comma+space (or just comma in some cases) for the pipe character (the regex "or").

I then prepared a Regex object for the main keyword replace. I chose to ignore case and match based on Singleline. Singleline makes the dot ('''.''') match newline characters as well, so a wildcard in the expression can run across a line break.

Now comes the replace. I pass a MatchEvaluator into the Replace method and use it to choose what to replace the match with.
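The steps described so far can be sketched roughly as follows. This is not the original listing: the class name KeywordHighlighter, the Highlight method, the sample data, the word-boundary anchors and the escaping of each keyword are assumptions added for illustration. Only the overall shape (comma list to pipe-separated pattern, IgnoreCase plus Singleline, a MatchEvaluator named MatchEval) follows the text.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class KeywordHighlighter
{
    // Bolds every keyword found in text. keywords is a comma-separated list
    // such as "regex, seo" (made-up sample data).
    public static string Highlight(string text, string keywords)
    {
        // Swap the comma separators for the pipe character (regex "or").
        // Escaping each keyword and skipping empty entries is an addition of this sketch.
        List<string> escaped = new List<string>();
        foreach (string keyword in keywords.Split(','))
        {
            string trimmed = keyword.Trim();
            if (trimmed.Length > 0)
            {
                escaped.Add(Regex.Escape(trimmed));
            }
        }
        if (escaped.Count == 0)
        {
            return text; // nothing to highlight
        }

        // \b assumes the keywords are plain words; a single group wraps the alternatives.
        string pattern = @"\b(" + string.Join("|", escaped.ToArray()) + @")\b";
        Regex regex = new Regex(pattern, RegexOptions.IgnoreCase | RegexOptions.Singleline);

        // The MatchEvaluator decides what each match is replaced with.
        return regex.Replace(text, new MatchEvaluator(MatchEval));
    }

    private static string MatchEval(Match match)
    {
        if (match.Groups[1].Success)
        {
            return "<b>" + match.Value + "</b>";
        }
        return match.Value; // no group matched: leave the text untouched
    }

    public static void Main()
    {
        // prints: <b>Regex</b> is great for <b>SEO</b>.
        Console.WriteLine(Highlight("Regex is great for SEO.", "regex, seo"));
    }
}
```

Because the evaluator returns match.Value rather than the keyword from the list, the original capitalisation of the text is preserved.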
The MatchEval method only looks for the first group match, which is all I need. Had the master regular expression contained two groups, the MatchEval method would have required a second '''If'''.

20/7/2006