Welcome to WebProNews Breaking eBusiness and Search News

Cleaning Up Verity Results (ColdFusion)

Raymond Camden
Expert Author
Published: 2007-04-18


Christian Ready pinged me a few days ago about an interesting problem he was having at one of his web sites.

His search (Verity-based on CFMX7) was returning HTML. The HTML was escaped so the user literally saw stuff like this in the results:
Hi, my name is <b>Bob</b> and I'm a rabid developer!
I pointed out that the regex used to remove HTML would also work for escaped HTML:

<cfset cleaned = rereplace(str, "&lt;.*?&gt;", "", "all")>

In English, this regex matches the escaped less than sign (&lt;), any characters (non greedy, more on that in a bit), and then the escaped greater than sign (&gt;).

The "non greedy" part tells the regex to find the smallest possible match. Without it, the regex would match from the first escaped less than sign all the way to the last escaped greater than sign, removing the tags and everything between them! We just want to remove the tags themselves.
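
To make the greedy versus non greedy difference concrete, here is a small sketch. It assumes the text holds entity-escaped markup (&lt; and &gt;), and the variable names str, lazy and greedy are made up for illustration:

```cfm
<!--- Hypothetical sample text containing entity-escaped markup --->
<cfset str = "my name is &lt;b&gt;Bob&lt;/b&gt; and I code">

<!--- Non greedy: each escaped tag is matched on its own,
      so the word between the two tags (Bob) survives --->
<cfset lazy = rereplace(str, "&lt;.*?&gt;", "", "all")>

<!--- Greedy: a single match runs from the first &lt; to the last &gt;,
      so Bob is removed along with both tags --->
<cfset greedy = rereplace(str, "&lt;.*&gt;", "", "all")>

<cfoutput>#lazy#<br>#greedy#</cfoutput>
```

The only difference between the two calls is the question mark after the asterisk.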

This worked - but then exposed another problem. Verity was returning text with incomplete HTML tags. As an example, consider this text block:

ul>This is some <b>bold</b> html with <i>markup</i> in it.
Here is <b

Notice the incomplete HTML tag at the beginning and end of the string.

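Luckily regex provides us with a simple way to look for patterns at either the beginning or end of a string. Consider these two lines:

<cfset cleaned = rereplace(cleaned, "&lt;.*?$", "", "all")>
<cfset cleaned = rereplace(cleaned, "^.*?&gt;", "", "all")>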

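The first line matches an escaped < (&lt;) plus whatever follows it up to the end of the string. The next line matches everything from the beginning of the string up to and including the first escaped > (&gt;).

In both cases the leftover bits of the HTML tag are removed along with the bracket.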

So all together this is the code I gave him:

<cfset cleaned = rereplace(str, "&lt;.*?&gt;", "", "all")>
<cfset cleaned = rereplace(cleaned, "&lt;.*?$", "", "all")>
<cfset cleaned = rereplace(cleaned, "^.*?&gt;", "", "all")>

Most likely this could be done in one regex instead.
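
For what it's worth, here is one way the single regex version might look. Treat it as an untested sketch rather than a drop-in replacement: it assumes the text uses the &lt; and &gt; entities, the sample string and the pattern variable are made up for illustration, and it relies on the negative lookahead (?!...) that ColdFusion MX regular expressions support.

```cfm
<!--- Hypothetical sample: a tag fragment at each end, complete tags in between --->
<cfset str = "ul&gt;Some &lt;b&gt;bold&lt;/b&gt; text. Here is &lt;b">

<!--- Three alternatives, tried in order at each position:
      1. a complete escaped tag
      2. an opening fragment that runs to the end of the string
      3. a closing fragment at the very start of the string; the
         (?!&lt;) lookahead stops it from running past the start
         of a complete tag and eating the text in front of it --->
<cfset pattern = "&lt;.*?&gt;|&lt;.*$|^(?:(?!&lt;).)*?&gt;">

<cfset cleaned = rereplace(str, pattern, "", "all")>
<cfoutput>#cleaned#</cfoutput>
```

The lookahead matters because everything now happens in one pass. A plain ^.*?&gt; would be tried before any complete tags had been stripped, so on a string that starts with ordinary text it would swallow that text up to the end of the first tag. On unusual input (a stray &gt; in the middle of the text, for example) this version will not always give the same result as the three separate calls.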



About the Author:
Raymond Camden, ray@camdenfamily.com
http://ray.camdenfamily.com

Raymond Camden is Vice President of Technology for roundpeg, Inc. A longtime ColdFusion user, Raymond has worked on numerous ColdFusion books and is the creator of many of the most popular ColdFusion community web sites. He is an Adobe Community Expert, user group manager, and the proud father of three little bundles of joy.
