Strip the tags out of an HTML string


A relational technique for stripping the HTML tags out of a string. This solution shows how a simple lookup table and SQL Server's string-search functions (PATINDEX, CHARINDEX and STUFF) can solve a problem that would otherwise call for procedural, iterative code.

-- This table contains the tags to be removed. The % in <head%> is a
-- wildcard that matches any attributes inside the tag, so you don't need
-- to list each variation separately. Add every tag that needs to be
-- searched for and stripped to this table.
create table #html ( tag varchar(30) )
insert #html values ( '<html>' )
insert #html values ( '<head%>' )
insert #html values ( '<title%>' )
insert #html values ( '<link%>' )
insert #html values ( '</title>' )
insert #html values ( '</head>' )
insert #html values ( '<body%>' )
insert #html values ( '</html>' )
go

-- A simple table with the HTML strings
create table #t ( id tinyint identity , string varchar(255) )
insert #t values (
'<HTML><HEAD><TITLE>Some Name</TITLE>
<LINK REL="stylesheet" HREF="/style.css" TYPE="text/css" ></HEAD>
<BODY BGCOLOR="FFFFFF" VLINK="#444444">
Some HTML text after the body</HTML>'

)
insert #t values (
'<HTML><HEAD><TITLE>Another Name</TITLE>
<LINK REL="stylesheet" HREF="/style.css"></HEAD>
<BODY BGCOLOR="FFFFFF" VLINK="#444444">Another HTML text after the body</HTML>'

)
go

-- This is the code that strips the tags out.
-- For each listed tag it finds the starting position of the tag in the
-- HTML string, then works out the tag's length, including any attributes,
-- by locating the closing '>'. STUFF replaces that span with an empty
-- string. The loop repeats until none of the listed tags remain in any row.
-- The transaction is rolled back at the end, so #t keeps its original data.
begin tran
while exists(select * from #t join #html on patindex('%' + tag + '%' , string ) > 0 )
   update #t
   set string = stuff( string , patindex('%' + tag + '%' , string ) ,
               charindex( '>' , string , patindex('%' + tag + '%' , string ) )
               - patindex('%' + tag + '%' , string ) + 1 , '' )
   from #t join #html
   on patindex('%' + tag + '%' , string ) > 0

select * from #t
rollback
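
To see what a single pass of the loop does, the same three functions can be traced on one string. This is only a minimal sketch: the variables @s, @tag, @start and @end and the shortened sample string are illustrative and not part of the original script. It assumes a case-insensitive collation (which the script above also relies on, since the tags are stored in lower case while the sample HTML is upper case).

```sql
-- Trace one pass of the tag-stripping logic on a single string.
-- @s, @tag, @start and @end are illustrative names, not from the original script.
declare @s varchar(255), @tag varchar(30)
set @s   = '<BODY BGCOLOR="FFFFFF">Some text</HTML>'
set @tag = '<body%>'

declare @start int, @end int
-- 1-based position where the tag pattern first matches (0 if it is absent).
set @start = patindex('%' + @tag + '%', @s)
-- Position of the first '>' at or after the tag start, i.e. the end of the tag.
set @end = charindex('>', @s, @start)

select @start            as tag_start,
       @end              as tag_end,
       @end - @start + 1 as tag_length,
       -- Same STUFF call as in the update: cut the tag out of the string.
       stuff(@s, @start, @end - @start + 1, '') as string_after_one_pass
```

Under a case-insensitive collation this should report the tag starting at position 1 and ending at position 23, leaving 'Some text</HTML>' after the pass. The closing </HTML> would be removed on a later pass, when the join picks the '</html>' row. Under a case-sensitive collation the pattern would not match at all, so the tags in #html would need to be stored in the same case as the HTML.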


Submitted By : Nayan Patel  (Member Since : 5/26/2004 12:23:06 PM)

Job Description : He is the moderator of this site and currently works as an independent consultant. He works with VB.net/ASP.net, SQL Server and other MS technologies. He is MCSD.net, MCDBA and MCSE certified. In his free time he likes watching funny movies and oil painting.
View all (893) submissions by this author  (Birth Date : 7/14/1981 )



© 2008 BinaryWorld LLC. All rights reserved.