I previously wrote about using jQuery to Strip All HTML Tags From a Div. This post covers a related task: removing all bad characters from an HTML string (which may have been provided by a $.getScript() call or similar).
This is an easy way to clean up your HTML and remove bad characters. It is useful when you get the HTML from somewhere else and want to .match() for strings, but .match() throws an error because of the bad characters. We can do this with a regex and still retain the angle brackets of our HTML tags, like so:
// the ^ must come first to negate the class: keep only < > letters, digits and spaces
rawData = rawData.replace(/[^<>a-zA-Z 0-9]+/g,'');
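To make the effect concrete, here is a minimal sketch that runs a negated character class over a made-up string. The input and the variable names (sampleData, keptBasics) are assumptions for illustration only. Note that the forward slash of the closing tag is stripped too, because it is not in the list of allowed characters.

```javascript
// Hypothetical input: a tag plus a NUL character, a non-ASCII letter and a line separator.
var sampleData = '<p>Hello\u0000 W\u00f6rld\u2028 123</p>';

// The leading ^ negates the class, so every character NOT listed is removed.
var keptBasics = sampleData.replace(/[^<>a-zA-Z 0-9]+/g, '');

console.log(keptBasics); // <p>Hello Wrld 123<p>
```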
If we want to be more specific, we can also keep other common characters that HTML needs (slashes, double quotes, underscores, plus signs, hyphens and equals signs) and strip everything else:
rawData = rawData.replace(/[^\/\\\"_+\-<>=a-zA-Z 0-9]+/g,'');
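One thing worth watching in patterns like this: inside a character class, an unescaped hyphen between two characters defines a range, so `+-<` matches everything from `+` to `<` (which includes `,`, `.`, `:` and `;`). The sketch below compares the two spellings on a made-up string; the sample and variable names are assumptions for illustration.

```javascript
// Hypothetical input containing a semicolon that we expect to be stripped.
var sampleLink = '<a href="/x-y">1+1=2; ok</a>';

// Hyphen escaped: it is a literal hyphen, so ";" is not in the allowed set.
var escapedHyphen = sampleLink.replace(/[^\/\\"_+\-<>=a-zA-Z 0-9]+/g, '');

// Hyphen unescaped: "+-<" is a range that happens to include ";", so it survives.
var rangeHyphen = sampleLink.replace(/[^\/\\"_+-<>=a-zA-Z 0-9]+/g, '');

console.log(escapedHyphen); // <a href="/x-y">1+1=2 ok</a>
console.log(rangeHyphen);   // <a href="/x-y">1+1=2; ok</a>
```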
cleanUpHTML() Function
I wrote this little function to help with cleaning up the HTML so it is ready for running regex on it.
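var JQUERY4U = {};
JQUERY4U.UTIL =
{
    cleanUpHTML: function(html) {
        // normalise every single quote to a double quote (a string pattern only replaces the first match)
        html = html.replace(/'/g, '"');
        // strip everything outside the whitelist; the hyphen is escaped so it is a literal, not a range
        html = html.replace(/[^\/\\\"_+\-\?!<>\[\]{}()=\*\.|a-zA-Z 0-9]+/g,'');
        return html;
    }
};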
//usage:
var cleanedHTML = JQUERY4U.UTIL.cleanUpHTML(htmlString);
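As a rough end-to-end illustration, the sketch below cleans a made-up string and then runs .match() on the result. It uses a standalone function, cleanUpHTMLSketch, which is a hypothetical stand-in for the helper above (with a global quote replacement and an escaped hyphen) so the snippet runs on its own; the sample markup is also an assumption.

```javascript
// Hypothetical standalone version of the helper, so this sketch is self-contained.
function cleanUpHTMLSketch(html) {
  // Replace every single quote with a double quote.
  html = html.replace(/'/g, '"');
  // Remove every character that is not in the allowed set.
  return html.replace(/[^\/\\"_+\-?!<>\[\]{}()=*.|a-zA-Z 0-9]+/g, '');
}

// Made-up input with single quotes, an accented letter and an en dash.
var dirtyHTML = "<div class='title'>Caf\u00e9 \u2013 menu</div>";
var tidyHTML = cleanUpHTMLSketch(dirtyHTML);

// The cleaned string now contains only allowed characters and can be matched safely.
var found = tidyHTML.match(/<div class="title">([^<]*)<\/div>/);

console.log(tidyHTML); // <div class="title">Caf  menu</div>
console.log(found[1]); // Caf  menu
```

The two spaces in the output are expected: the characters between them were removed, but the spaces on either side are in the allowed set.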


