0

I have the following text field in a SQL Server table:

1!1,3!0,23!0,288!0,340!0,521!0,24!0,38!0,26!0,27!0,281!0,19!0,470!0,568!0,601!0,2!1,251!0,7!2,140!0,285!0,11!2,33!0 
  1. I would like to retrieve only the part before the exclamation mark (!). So for 1!1 I want only 1, for 3!0 only 3, and for 23!0 only 23.

  2. I would also like to retrieve only the part after the exclamation mark (!). So for 1!1 I want only 1, for 3!0 only 0, and for 23!0 only 0.

The results of point 1 and point 2 should be inserted into separate columns of a SQL Server table.

4
  • 3
    You shouldn't be storing delimited values in a single column in the first place. Commented Jan 10, 2013 at 15:14
  • 2
    Is that entire string a single record, or is 1!1 a record, 3!0 another record, and so on? Commented Jan 10, 2013 at 15:15
  • 1
    Please, spend the time to normalize this. Commented Jan 10, 2013 at 15:15
  • I have a question: Do people also use the wrong end of the hammer to hit the nails and wonder why it is inefficient? Or is it just the DB topic that brings out this phenomenon? Commented Jan 10, 2013 at 15:17

3 Answers

1

I LOVE SQL Server's XML capabilities. It is a great way to parse data. Try this one out:

    --Load the original string
    DECLARE @string nvarchar(max) = '1!2,3!4,5!6,7!8,9!10';

    --Turn it into XML
    SET @string = REPLACE(@string,',','</SecondNumber></Pair><Pair><FirstNumber>') + '</SecondNumber></Pair>';
    SET @string = '<Pair><FirstNumber>' + REPLACE(@string,'!','</FirstNumber><SecondNumber>');

    --Show the new version of the string
    SELECT @string AS XmlIfiedString;

    --Load it into an XML variable
    DECLARE @xml XML = @string;

    --Now, First and Second Number from each pair...
    SELECT Pairs.Pair.value('FirstNumber[1]','nvarchar(1024)') AS FirstNumber,
           Pairs.Pair.value('SecondNumber[1]','nvarchar(1024)') AS SecondNumber
    FROM @xml.nodes('//*:Pair') Pairs(Pair);

The above query turned the string into XML like this:

<Pair><FirstNumber>1</FirstNumber><SecondNumber>2</SecondNumber></Pair> ... 

Then parsed it to return a result like:

    FirstNumber | SecondNumber
    ----------- | ------------
    1           | 2
    3           | 4
    5           | 6
    7           | 8
    9           | 10

0

I completely agree with the guys complaining about this sort of data. The fact, however, is that we often have no control over the format of our sources.

Here's my approach...

First you need a tokeniser. This one is very efficient (probably the fastest non-CLR splitter). It comes from http://www.sqlservercentral.com/articles/Tally+Table/72993/

    CREATE FUNCTION [dbo].[DelimitedSplit8K]
    --===== Define I/O parameters
            (@pString VARCHAR(8000), @pDelimiter CHAR(1))
    --WARNING!!! DO NOT USE MAX DATA-TYPES HERE! IT WILL KILL PERFORMANCE!
    RETURNS TABLE WITH SCHEMABINDING AS
     RETURN
    --===== "Inline" CTE Driven "Tally Table" produces values from 1 up to 10,000...
         -- enough to cover VARCHAR(8000)
      WITH E1(N) AS (
                     SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
                     SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
                     SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
                    ),                          --1E+1 or 10 rows
           E2(N) AS (SELECT 1 FROM E1 a, E1 b), --1E+2 or 100 rows
           E4(N) AS (SELECT 1 FROM E2 a, E2 b), --1E+4 or 10,000 rows max
     cteTally(N) AS (--==== This provides the "base" CTE and limits the number of rows right up front
                         -- for both a performance gain and prevention of accidental "overruns"
                     SELECT TOP (ISNULL(DATALENGTH(@pString),0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
                    ),
    cteStart(N1) AS (--==== This returns N+1 (starting position of each "element" just once for each delimiter)
                     SELECT 1 UNION ALL
                     SELECT t.N+1 FROM cteTally t WHERE SUBSTRING(@pString,t.N,1) = @pDelimiter
                    ),
    cteLen(N1,L1) AS(--==== Return start and length (for use in substring)
                     SELECT s.N1,
                            ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,s.N1),0)-s.N1,8000)
                       FROM cteStart s
                    )
    --===== Do the actual split. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found.
     SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY l.N1),
            Item       = SUBSTRING(@pString, l.N1, l.L1)
       FROM cteLen l
    ;
    GO

Then you consume it like so...

    DECLARE @Wtf VARCHAR(1000) = '1!1,3!0,23!0,288!0,340!0,521!0,24!0,38!0,26!0,27!0,281!0,19!0,470!0,568!0,601!0,2!1,251!0,7!2,140!0,285!0,11!2,33!0';

    SELECT LEFT(Item, CHARINDEX('!', Item)-1)
          ,RIGHT(Item, CHARINDEX('!', REVERSE(Item))-1)
    FROM [dbo].[DelimitedSplit8K](@Wtf, ',');

The splitter function and the parsing logic can, of course, be combined into a single function.
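One way that combination might look is sketched below. It is only a sketch: the name dbo.SplitPairs and the output column names FirstPart and SecondPart are hypothetical, and it assumes dbo.DelimitedSplit8K has already been created as shown above. NULLIF is used so that an item without a '!' yields NULLs rather than an invalid-length error from LEFT.

```sql
-- Hypothetical wrapper (dbo.SplitPairs is an illustrative name, not from the article).
-- Assumes dbo.DelimitedSplit8K exists and was created WITH SCHEMABINDING.
CREATE FUNCTION dbo.SplitPairs (@pString VARCHAR(8000))
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
    SELECT  s.ItemNumber,
            -- Part before the '!'; NULL if the item contains no '!'
            FirstPart  = LEFT(s.Item, NULLIF(CHARINDEX('!', s.Item), 0) - 1),
            -- Part after the '!'; NULL if the item contains no '!'
            SecondPart = SUBSTRING(s.Item, NULLIF(CHARINDEX('!', s.Item), 0) + 1, 8000)
    FROM    dbo.DelimitedSplit8K(@pString, ',') AS s;
GO

-- Example call: one row per pair, in the order the pairs appear in the string.
SELECT FirstPart, SecondPart
FROM   dbo.SplitPairs('1!1,3!0,23!0')
ORDER BY ItemNumber;
```

Keeping it as an inline table-valued function means the optimizer can expand it into the calling query, which is the same reason the splitter itself is written that way.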


0

I agree that normalizing the data is the best way. However, here is an XML solution to parse the data:

    DECLARE @str VARCHAR(1000) = '1!1,3!0,23!0,288!0,340!0,521!0,24!0,38!0,26!0,27!0,281!0,19!0,470!0,568!0,601!0,2!1,251!0,7!2,140!0,285!0,11!2,33!0'
        ,@xml XML

    SET @xml = CAST('<row><col>' + REPLACE(REPLACE(@str,'!','</col><col>'),',','</col></row><row><col>') + '</col></row>' AS XML)

    SELECT line.col.value('col[1]', 'varchar(1000)') AS col1
        ,line.col.value('col[2]', 'varchar(1000)') AS col2
    FROM @xml.nodes('/row') AS line(col)
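Since the question asks for the two parts to end up in separate columns of a table, the same XML approach could feed an INSERT. The following is a rough sketch only: the table variable @Pairs, its column names, and the INT column types are assumptions made for illustration, and a shortened input string is used.

```sql
-- Hypothetical target table; the name and the INT column types are assumptions.
DECLARE @Pairs TABLE (FirstPart INT, SecondPart INT);

DECLARE @str VARCHAR(1000) = '1!1,3!0,23!0';

-- Same technique as above: turn each "a!b" pair into <row><col>a</col><col>b</col></row>
DECLARE @xml XML = CAST('<row><col>'
        + REPLACE(REPLACE(@str, '!', '</col><col>'), ',', '</col></row><row><col>')
        + '</col></row>' AS XML);

-- Shred the XML and insert each part into its own column
INSERT INTO @Pairs (FirstPart, SecondPart)
SELECT  line.col.value('col[1]', 'int'),
        line.col.value('col[2]', 'int')
FROM    @xml.nodes('/row') AS line(col);

-- Three rows for this input: (1, 1), (3, 0), (23, 0)
SELECT FirstPart, SecondPart FROM @Pairs;
```

If the values are not guaranteed to be numeric, keeping the columns as varchar, as in the query above, would avoid conversion errors.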
