1

I have a column named "Message". In this column there is a data which is HTML code. I need to parse this HTML in SQL then split it 5 different column "Name" - "Surname" - "Email" - "Telephone" - "Message". Here is the HTML format that I need to parse;

<html>
   <body>
      <br><br>
      <table>
         <tr>
            <td>NameSurname</td>
            <td>kaydi peldi sord</td>
         </tr>
         <tr>
            <td>Email</td>
            <td>[email protected]</td>
         </tr>
         <tr>
            <td>Telephone</td>
            <td>535...5464</td>
         </tr>
         <tr>
            <td colspan=2>Message</td>
         </tr>
         <tr>
            <td colspan=2>Benfica-Fenerbahçe</td>
         </tr>
      </table>
   </body>
</html>

First, split NameSurname to Name and Surname. The rule is split from last space (in this sample, it should "Name : ejder mehmet" , "Surname : sıkık", then insert other columns directly. How can I do that? Thanks for answers!

6
  • This isnt really a "request to your hearts desire website", its to help users who are actually willing to learn. stackoverflow.com/questions/8816194/… Commented Feb 4, 2015 at 9:43
  • As JSG suggests, just don't. Either get it out and parse it in something designed to parse HTML, such as PHP, or, better, split it up and store the various parts when you insert the Message into the database. Commented Feb 4, 2015 at 9:47
  • It is not for website. It is SSIS package. We retrieve data from other company. So I have to do it in SQL. Commented Feb 4, 2015 at 9:47
  • Create a code step in your SSIS package and do it in VB/C# Commented Feb 4, 2015 at 9:47
  • Yeah have something else (like PHP) take the data you get, explode it into the proper sections and as it does so, have it write individual queries into a text file, line by line. Then, once that's done, run the queries in your database and have it get updated. Let's hope your raw data has some sort of delimiter between fields. Commented Feb 4, 2015 at 9:53

1 Answer 1

1

I'm a year late, it's not pretty, and it's definitely not 100% safe, but this does the job for me on the rare occasions I need to parse HTML. Create this function first.

CREATE FUNCTION dbo.StringBetweenTwoPatterns (@PrePattern varchar(max) @PostPattern varchar(max), @string varchar(max)) 
RETURNS varchar(Max)
AS 
BEGIN   
DECLARE @WildPre VARCHAR(MAX) = '%' + @PrePattern + '%'
DECLARE @WildPost VARCHAR(MAX) = '%' + @PostPattern + '%'
IF PATINDEX(@WildPre, @String) > 0
    AND PATINDEX(@WildPost, @String) > 0
BEGIN
    DECLARE @RIGHT VARCHAR(MAX) = SUBSTRING(@string, PATINDEX(@WildPre,@string) + LEN(@PrePattern), LEN(@string))
    RETURN LEFT(@RIGHT,(PATINDEX(@WildPost,@RIGHT) - 1))
END
RETURN NULL
END
GO

When you call this function, you have to keep full formatting and white space in the search strings, so it's going to look like this:

SELECT [NameSurname] = StringBetweenTwoPatterns('<td>NameSurname</td>
        <td>','</td>',[Message]

Splitting Name and Surname is something you should be able to extrapolate from the the substring, right, left, and patindex examples above. Or just google some other answers for that.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.