Regex For Block With Start And End Over Multiple Lines

December 22, 2012

I was reformatting some HTML code for a table with many rows, columns and in-cell tables, which had HTML comments that I wanted to remove.

There were over a hundred comments in between the cells, spanning multiple lines, with tabs and spaces at the beginning and end of the comments. A manual highlight and delete would have taken quite some time, and some required HTML code might have been accidentally deleted during such a laborious process.

I thought that a search and replace using a regular expression would be quicker so I did a Google search but could not find anything appropriate.

So I sat down with my favourite code editing and regular expression tools, EditPad Pro and RegexBuddy. After about 5 minutes of trial and error I had a regular expression which found HTML comments over multiple lines, and I simply did a replace with a blank string to delete the comments.

HTML comments start with <!-- and end with -->, which you will find in the start and end capturing groups (the brackets) in the regular expression below. You can replace the start and end capturing groups with whatever the start and end text strings are for the block you are looking for. If you find another use for the regex, please let me know.

The regex is quite simply:

(<!--)(.|[\r\n])+?(-->)
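
For anyone without EditPad Pro to hand, here is a minimal sketch of the same search and replace in Python. It assumes Python's built-in re module accepts the regex unchanged. The function names remove_html_comments and block_regex are made up for this illustration, and block_regex is only one possible way of swapping in other start and end strings.

```python
import re

# The regex from the post: start group, lazily matched body, end group.
# (.|[\r\n]) lets the body run across line breaks, and +? stops at the first end marker.
COMMENT_RE = re.compile(r"(<!--)(.|[\r\n])+?(-->)")


def remove_html_comments(text):
    """Replace every matched comment with a blank string."""
    return COMMENT_RE.sub("", text)


def block_regex(start, end):
    """Hypothetical helper: the same regex shape with other start/end strings.

    re.escape is used so that characters such as * or / are taken literally.
    """
    return re.compile(
        "(" + re.escape(start) + r")(.|[\r\n])+?(" + re.escape(end) + ")"
    )


if __name__ == "__main__":
    sample = "<td>A</td>\n\t<!-- old cell\n\t     spanning lines -->\n<td>B</td>"
    print(remove_html_comments(sample))
```

Only the comment itself is removed, so the tab in front of it and the line break after it stay in the output. Because the body uses +?, a block needs at least one character between the start and end strings to match.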

 Appreciation:

  • RegexBuddy : Software for creating, testing and using regular expressions.
  • EditPad Pro : A powerful text editor especially for programming.
About Bharat Karavadra

"I research and share leading-edge information, tools and exercises to help people transform and heal their life situations."
