String Operation to Extract Column Name - Loop

Question

I would kindly request some help on T-SQL coding. I have a column with a list of conditions (WHERE statements). I need to extract a distinct list of column names with the aliases.

Example Data: ISNULL(ABC.Premium, 0) < 0,ISNULL(ABC.Date, 101) < 19

Result: ABC.Premium, ABC.Date

I am looking for a string operation that will: 1. search for each '.' 2. extract the name on either side of it.

I am not sure how to get the LOOP to search for multiple '.' and extract the column names.

    CREATE DATABASE TEST
    GO
    USE [Test]
    GO

    CREATE TABLE [dbo].[StringRetrival](
        [ID] [int] NOT NULL,
        [Condition] [varchar](4000) NULL
    ) ON [PRIMARY]

    GO

    INSERT [dbo].[StringRetrival] ([ID], [Condition]) VALUES (1, N'ISNULL(ABC.Premium, 0) < 0,ISNULL(ABC.Date, 101) < 19')
    GO
    INSERT [dbo].[StringRetrival] ([ID], [Condition]) VALUES (2, N'ISNULL(DEF.ColB, 101) < 25,ISNULL(DEF.ColB, 101) < 25,ISNULL(XYZ.ColB, 101) > 5, MSN.ColA < 5')
    GO
    INSERT [dbo].[StringRetrival] ([ID], [Condition]) VALUES (3, N'RTY.ColA')
    GO

I would appreciate your help on this.

Thank you

This problem is much more complicated than you are giving it credit for. What if part of your where condition is RTY.ColA = '.Text.MoreText.EvenMoreText.'?
– iamdave, Commented Feb 23, 2017 at 10:31
Yes, that is true, it could get more complex, but for the moment, as a POC, we are looking at data like that in the examples. It is not the best way of approaching this, but this is the direction I have been asked to take.
– Aarion, Commented Feb 23, 2017 at 10:45
Btw: I appreciate your effort to create a stand-alone test scenario. +1 from my side!
– Gottfried Lesigang, Commented Feb 23, 2017 at 14:10

Community · Accepted Answer · 2020-06-20 09:12:55Z

Okay, so, this is highly inadvisable to actually use in any kind of production environment, but it was a fun little challenge and works for your test data. I strongly recommend you look for an alternative solution to the underlying problem that has led you to hold where clauses in a database table.

This will not work for `where` clauses that have `.` characters within text strings, nor will any other solution that relies on splitting the string by `.` characters, at least not without a lot of effort to check whether each `.` is part of a string value.
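As a quick illustration of that limitation, here is a minimal sketch. The variable name and the sample condition are made up for this example, with the literal modelled on the one in the comment under the question. Counting the dots shows how many split points a dot-based approach would see:

```sql
-- Hypothetical sample: one real column reference plus a string literal containing dots.
declare @Condition varchar(4000) = 'RTY.ColA = ''.Text.MoreText.''';

-- Count the '.' characters by comparing the length before and after removing them.
select len(@Condition) - len(replace(@Condition, '.', '')) as DotCount;  -- 4
```

Only the first of those four dots separates an alias from a column name; the other three sit inside the string literal and would each produce a spurious piece.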

Utilising Jeff Moden's string splitting function you can do the following:

declare @StringRetrival table(ID int,Condition varchar(4000));
insert into @StringRetrival(ID,Condition) values
 (1,N'ISNULL(ABC.Premium, 0) < 0,ISNULL(ABC.Date, 101) < 19')
,(2,N'ISNULL(DEF.ColB, 101) < 25,ISNULL(DEF.ColB, 101) < 25,ISNULL(XYZ.ColB, 101) > 5, MSN.ColA < 5')
,(3,N'RTY.ColA');

with s1 as
(
    select r.ID
            ,r.Condition
            ,s.ItemNumber
            ,max(s.ItemNumber) over (partition by r.ID) as MaxItemNumber
            ,reverse(s.Item) as Item
    from @StringRetrival r
        cross apply dbo.DelimitedSplit8K(r.Condition,'.') s
),s2 as
(
    select r.ID
            ,r.Condition
            ,s.ItemNumber
            ,max(s.ItemNumber) over (partition by r.ID) as MaxItemNumber
            ,reverse(s.Item) as Item
    from @StringRetrival r
        cross apply dbo.DelimitedSplit8K(reverse(r.Condition),'.') s
)
select distinct s1.ID
                ,reverse(left(s1.Item,patindex('%[^a-zA-Z]%',s1.Item + ',')-1)) + '.' + left(s2.Item,patindex('%[^a-zA-Z]%',s2.Item + ',')-1) as Col
from s1
    join s2
        on s1.ID = s2.ID
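            -- Piece N of the forward split ends with the alias before dot N; the column after
            -- that dot starts piece (MaxItemNumber - N) of the reversed split.
            and s2.ItemNumber = s1.MaxItemNumber - s1.ItemNumber
where s1.ItemNumber <> s1.MaxItemNumber
order by s1.ID;

Which will output:

+----+-------------+
| ID |     Col     |
+----+-------------+
|  1 | ABC.Date    |
|  1 | ABC.Premium |
|  2 | DEF.ColB    |
|  2 | MSN.ColA    |
|  2 | XYZ.ColB    |
|  3 | RTY.ColA    |
+----+-------------+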

The SQL to create the splitting function:

CREATE FUNCTION [dbo].[DelimitedSplit8K]
--===== Define I/O parameters
        (@pString VARCHAR(8000), @pDelimiter CHAR(1))
--WARNING!!! DO NOT USE MAX DATA-TYPES HERE!  IT WILL KILL PERFORMANCE!
RETURNS TABLE WITH SCHEMABINDING AS
 RETURN
--===== "Inline" CTE Driven "Tally Table" produces values from 1 up to 10,000...
     -- enough to cover VARCHAR(8000)
  WITH E1(N) AS (
                 SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
                 SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
                 SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
                ),                          --10E+1 or 10 rows
       E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
       E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
 cteTally(N) AS (--==== This provides the "base" CTE and limits the number of rows right up front
                     -- for both a performance gain and prevention of accidental "overruns"
                 SELECT TOP (ISNULL(DATALENGTH(@pString),0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
                ),
cteStart(N1) AS (--==== This returns N+1 (starting position of each "element" just once for each delimiter)
                 SELECT 1 UNION ALL
                 SELECT t.N+1 FROM cteTally t WHERE SUBSTRING(@pString,t.N,1) = @pDelimiter
                ),
cteLen(N1,L1) AS(--==== Return start and length (for use in substring)
                 SELECT s.N1,
                        ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,s.N1),0)-s.N1,8000)
                   FROM cteStart s
                )
--===== Do the actual split. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found.
 SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY l.N1),
        Item       = SUBSTRING(@pString, l.N1, l.L1)
   FROM cteLen l

GO

I voted yours up, because it is a good answer. Funny how similarly we both solved this and yet how differently we coded it. I love SO especially for this!
@Shnugo How do you find the performance of the XML based splitting across larger datasets? I've seen you use it before, but never anywhere else.
In this great article several approaches are compared... XML is astonishingly fast, but not the best. With SQL Server 2016 there will be built-in support for this and no more need for such hacks. Just splitting, let's say, a comma-separated list of INTs does not need further effort to deal with special characters. In my answer I use one additional FOR XML PATH('') in order to force entity escaping. But this takes extra time...
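For reference, the built-in support mentioned here is `STRING_SPLIT`, available from SQL Server 2016 when the database compatibility level is 130 or higher. Below is a minimal sketch of the simple case described in the comment, a comma-separated list of integers; the sample list is made up for illustration:

```sql
-- Requires SQL Server 2016+ and database compatibility level 130 or higher.
-- STRING_SPLIT returns one row per element in a column named value;
-- the order of the returned rows is not guaranteed.
select cast(s.value as int) as N
from string_split('10,20,30', ',') as s;
```

In SQL Server 2016 this function does not return the position of each element, so on its own it is not a drop-in replacement for the ItemNumber column that the DelimitedSplit8K query relies on.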

Lali · Accepted Answer · 2017-02-23 11:21:48Z

2

-- select * from [dbo].[SplitDelimiterString]('this is some sample string', ' ') where item like '%.%'

ALTER FUNCTION [dbo].[SplitDelimiterString](@StringWithDelimiter VARCHAR(8000), @Delimiter VARCHAR(8))
RETURNS @ItemTable TABLE (Item VARCHAR(8000))
AS
BEGIN
    DECLARE @StartingPosition INT;
    DECLARE @ItemInString VARCHAR(8000);

    SELECT @StartingPosition = 1;
    --Return if string is null or empty
    IF DATALENGTH(@StringWithDelimiter) = 0 OR @StringWithDelimiter IS NULL RETURN; 

    WHILE @StartingPosition > 0
    BEGIN
        --Get starting index of delimiter. If the string
        --doesn't contain any delimiter, then it will return 0
        SET @StartingPosition = CHARINDEX(@Delimiter,@StringWithDelimiter);

        --Get item from string
        IF @StartingPosition > 0
            SET @ItemInString = SUBSTRING(@StringWithDelimiter,0,@StartingPosition)
        ELSE
            SET @ItemInString = @StringWithDelimiter;
        --If item isn't empty, then add it to the return table
        IF( DATALENGTH(@ItemInString) > 0)
            INSERT INTO @ItemTable(Item) VALUES (@ItemInString);            

        --Remove inserted item from string
        SET @StringWithDelimiter = SUBSTRING(@StringWithDelimiter,@StartingPosition +  
                     DATALENGTH(@Delimiter), DATALENGTH(@StringWithDelimiter) - @StartingPosition)

        --Break loop if string is empty
        IF DATALENGTH(@StringWithDelimiter) = 0 BREAK;
    END

    RETURN
END

The function above splits a string on the delimiter you provide; you can then filter the returned items with a WHERE clause, as in the commented-out example at the top.

answered Feb 23, 2017 at 11:21


3 Comments

iamdave Over a year ago

This will fail if your column name has a space in it, such as [schema].[Column Name].

Lali Over a year ago

Yes, you can modify it a little more to handle that.

Gottfried Lesigang Over a year ago

Splitting approaches with loops are outdated. There are better solutions around...

Gottfried Lesigang · Accepted Answer · 2017-02-24 07:56:19Z

2

The following code will

split the strings on dots
cut each part from the left and from the right at the first non-simple character (located with PATINDEX)
concatenate the start of each part (column name) with the end of the previous part (table name)
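The trimming and concatenation steps can be sketched for a single dot as follows. This is an illustration only: the two variables are hypothetical and hold the text on either side of the first dot in the first sample row. The pattern is written without commas inside the brackets, because a comma there is matched literally; that is a choice made for this sketch.

```sql
-- Hypothetical fragments on either side of the first dot in
-- 'ISNULL(ABC.Premium, 0) < 0,ISNULL(ABC.Date, 101) < 19'.
declare @Before varchar(100) = 'ISNULL(ABC';
declare @After  varchar(100) = 'Premium, 0) < 0,ISNULL(ABC';

select
    -- Table alias: the trailing run of letters/digits in the fragment before the dot.
    right(@Before, patindex('%[^a-zA-Z0-9]%', reverse(@Before) + ' ') - 1)
    + '.'
    -- Column name: the leading run of letters/digits in the fragment after the dot.
    + left(@After, patindex('%[^a-zA-Z0-9]%', @After + ' ') - 1) as Col;  -- ABC.Premium
```

Appending a space before calling PATINDEX guarantees a match even when a fragment consists only of letters and digits, which avoids a special case for a zero result.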

Try this

WITH Splitted AS
(
    SELECT ID
          ,CAST('<x>' + REPLACE((SELECT Condition AS [*] FOR XML PATH('')),'.','</x><x>') + '</x>' AS XML) AS Part
    FROM StringRetrival
)
,AllParts AS
(
    SELECT ID
          ,ROW_NUMBER() OVER(PARTITION BY ID ORDER BY (SELECT NULL)) AS Nr
          ,p.value('.','nvarchar(max)') AS OnePart
    FROM Splitted
    CROSS APPLY Part.nodes('x') AS A(p)
)
,Parsed AS
(
    SELECT ROW_NUMBER() OVER(ORDER BY ID,Nr) AS SortInx,*
    FROM
    (
        SELECT ID,Nr,OnePart,NULL AS ColumnName
                ,RIGHT(OnePart,CASE WHEN Position.BreakingChar<1 THEN 999 ELSE Position.BreakingChar END) AS TableName 
        FROM AllParts 
        CROSS APPLY(SELECT PATINDEX('%[^a-zA-Z0-9]%',REVERSE(OnePart))-1 AS BreakingChar) AS Position
        WHERE Nr=1
        UNION ALL 
        SELECT ID,Nr,OnePart
                ,LEFT(OnePart,CASE WHEN Position.FirstBreakingChar<1 THEN 999 ELSE Position.FirstBreakingChar END)  
                ,RIGHT(OnePart,CASE WHEN Position.SecondBreakingChar<1 THEN 999 ELSE Position.SecondBreakingChar END) 
        FROM AllParts
        -- No commas inside the brackets: a comma there is matched literally and would count as part of a name
        CROSS APPLY(SELECT PATINDEX('%[^a-zA-Z0-9]%',OnePart)-1 AS FirstBreakingChar
                            ,PATINDEX('%[^a-zA-Z0-9]%',REVERSE(OnePart))-1 AS SecondBreakingChar) AS Position
        WHERE Nr>1
    ) AS tbl
)
SELECT DISTINCT
        p1.ID
        ,ISNULL(p2.TableName + '.','') + p1.ColumnName AS ColumnName
FROM Parsed AS p1
INNER JOIN Parsed AS p2 ON p1.ID=p2.ID AND p2.Nr=p1.Nr-1;

The result

ID  ColumnName
1   ABC.Date
1   ABC.Premium
2   DEF.ColB
2   MSN.ColA
2   XYZ.ColB
3   RTY.ColA

edited Feb 24, 2017 at 7:56

answered Feb 23, 2017 at 11:06


2 Comments

Aarion Over a year ago

Thank you so much - This is perfect. Amazing coding skills, much appreciated

Gottfried Lesigang Over a year ago

@Aarion, I'm glad to read this! Please allow me one tiny hint: Thx for accepting the answer! It would be nice to additionally vote it up. Voting and acceptance are two independent steps on SO. Acceptance marks the question as solved, while votes count for privileges and badges and are more a measure of quality. Thx and happy coding!
