
I have a column containing a long string. The data needs to be split into columns, but the strings vary in length and do not always contain the same number of columns. I'm not exactly sure how to do this, so I'm looking for some advice here.

Let's say I have this string:

VS5~MedCond1~35.4|VS4~MedCond2~16|VS1~MedCond3~155|VS2~MedCond4~70|SPO2~MedCond5~100|VS3~MedCond6~64|FiO2~MedCond7~21|MAP~MedCond8~98|

In some cases the string might not have all the medical conditions, just some of them.

I need to split it into columns where the column name is the text between the tildes (e.g. MedCond1) and the value is the text to the right of the second tilde but before the pipe, ending up like this:

  MedCond1 MedCond2 MedCond3 MedCond4 MedCond5 MedCond6 MedCond7 MedCond8
  ======== ======== ======== ======== ======== ======== ======== ========
  35.4     16       155      70       100      64       21       98

I need to do this for a lot of rows within a large table. As I said, not all the columns are always present, but the names never vary: one row might have MedCond1 through MedCond8, while another has only MedCond3, MedCond4 and MedCond7.

Here is a query I created that does roughly what I want, but it is not dynamic, so it picks up the values along with some extra bits of the string:

select MainCol, case when charindex('MedCond1', MainCol) > 0 then 
substring(MainCol, charindex('MedCond1', MainCol) + 9, 4) end as [MedCond1] 
from MedTable

Will return

MedCond1
========
35.3
40.2
33.6
33|V  <--- Problem

As you can see, the numeric value is sometimes picked up with an extra part of the string because the length passed to substring is hard-coded. The value is sometimes 4 characters long with a decimal place, sometimes 2 characters long with no decimal place. I would like to make this dynamic. The pipe defines the end of the data I need, and the start is defined by the tilde at the end of the column name.

Thanks for any thoughts on making this dynamic

Andrew

  • Too broad. Please specify which tool you plan to use to split these strings Commented Mar 14, 2017 at 18:24
  • Hope to be using SQL Server 2012 Commented Mar 15, 2017 at 7:48
  • Is anyone able to suggest anything, as I am trying to meet a deadline to get this done? I will add a SQL statement I have written that does roughly what I want but is not dynamic, so it picks up other bits of the string. Commented Mar 15, 2017 at 11:44
  • 1) Find whoever broke first normal form and have him fix his bug? Because that's what it is. You should NOT store multiple values in a single column. You wouldn't have to worry about deadlines if the most basic rule of database design was followed. Commented Mar 15, 2017 at 11:56
  • 2) You are asking about string splitting. There are a lot of duplicate questions. Aaron Bertrand even ran a series of articles comparing the various options. The fastest and most scalable is to use a SQLCLR method. Jeff Moden's solution comes second. You can also use XML manipulation, but only if the data is XML-safe, i.e. it doesn't contain unfortunate characters like <, >. Commented Mar 15, 2017 at 12:03

2 Answers


This data looks like a table itself. It could have been stored in SQL Server as xml. SQL Server supports xml fields and allows querying them. In fact, one could try to convert this string to XML, then try to query it:

declare @medTable table (item nvarchar(2000))
insert into @medTable
values ('VS5~MedCond1~35.4|VS4~MedCond2~16|VS1~MedCond3~155|VS2~MedCond4~70|SPO2~MedCond5~100|VS3~MedCond6~64|FiO2~MedCond7~21|MAP~MedCond8~98|');

-- Step 1: Replace `|` with <item> tags and `~` with `tag` tags
-- This will return an xml value for each medTable row
with items as (
    select xmlField= cast('<item><tag>' 
                           + replace( 
                                     replace(item,'|','</tag></item><item><tag>'),
                                    '~','</tag><tag>' )
                           + '</tag></item>' as xml) 
    from @medTable 
)
-- Step 2: Select different tags and display them as fields
select 
    y.item.value('(tag/text())[1]','nvarchar(20)'),
    y.item.value('(tag/text())[2]','nvarchar(20)'),
    y.item.value('(tag/text())[3]','nvarchar(20)')
from items outer apply xmlField.nodes('item') as y(item)

The result is :

-------------------- -------------------- -------
VS5                  MedCond1             35.4
VS4                  MedCond2             16
VS1                  MedCond3             155
VS2                  MedCond4             70
SPO2                 MedCond5             100
VS3                  MedCond6             64
FiO2                 MedCond7             21
MAP                  MedCond8             98
NULL                 NULL                 NULL
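To get from these name/value rows to the one-column-per-condition layout the question asks for, the rows can be regrouped with conditional aggregation. The following is only a minimal sketch under some assumptions: the `id` column, the second sample row, and the names `pairs`, `condName` and `condValue` are hypothetical and not part of the original table. Conditions that are absent from a string come back as NULL, and the empty item produced by the trailing pipe is simply ignored by the aggregates.

```sql
-- Hypothetical sample data; the id column is an assumption so that
-- the split rows can be grouped back to their source row.
declare @medTable table (id int identity(1,1), item nvarchar(2000));
insert into @medTable (item)
values ('VS5~MedCond1~35.4|VS4~MedCond2~16|VS1~MedCond3~155|VS2~MedCond4~70|SPO2~MedCond5~100|VS3~MedCond6~64|FiO2~MedCond7~21|MAP~MedCond8~98|'),
       ('VS1~MedCond3~150|VS2~MedCond4~72|FiO2~MedCond7~21|');

with items as (
    -- Same string-to-XML conversion as above, keeping the row id
    select id,
           xmlField = cast('<item><tag>'
                           + replace(replace(item, '|', '</tag></item><item><tag>'), '~', '</tag><tag>')
                           + '</tag></item>' as xml)
    from @medTable
),
pairs as (
    -- One row per condition: second tag is the name, third is the value
    select i.id,
           condName  = y.item.value('(tag/text())[2]', 'nvarchar(20)'),
           condValue = y.item.value('(tag/text())[3]', 'nvarchar(20)')
    from items i
    cross apply i.xmlField.nodes('item') as y(item)
)
-- Pivot: one output row per source row, NULL where a condition is missing
select id,
       MedCond1 = max(case when condName = 'MedCond1' then condValue end),
       MedCond2 = max(case when condName = 'MedCond2' then condValue end),
       MedCond3 = max(case when condName = 'MedCond3' then condValue end),
       MedCond4 = max(case when condName = 'MedCond4' then condValue end),
       MedCond5 = max(case when condName = 'MedCond5' then condValue end),
       MedCond6 = max(case when condName = 'MedCond6' then condValue end),
       MedCond7 = max(case when condName = 'MedCond7' then condValue end),
       MedCond8 = max(case when condName = 'MedCond8' then condValue end)
from pairs
group by id
order by id;
```

Because the set of condition names is fixed, the column list can be written out by hand; the values stay as text here and could be cast to a numeric type if needed.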

It would be better to perform this conversion when loading the data though. It's easier for example, to make the replacements in C# or SSIS and store a complete xml value in the database.

You can modify this query too, to generate the xml value and store it in the database:

-- The semicolon is required: a statement preceding WITH must be terminated
declare @medTable2 table (xmlField xml);

with items as (
    select xmlField= cast('<item><tag>' + replace(replace(item,'|','</tag></item><item><tag>'),'~','</tag><tag>' ) + '</tag></item>' as xml) 
    from @medTable 
)
insert into @medTable2
select items.xmlField
from items 

-- Query the new table from now on 
select 
    y.item.value('(tag/text())[1]','nvarchar(20)'),
    y.item.value('(tag/text())[2]','nvarchar(20)'),
    y.item.value('(tag/text())[3]','nvarchar(20)')
from @medTable2 outer apply xmlField.nodes('item') as y(item)

1 Comment

Very slick, and better than my answer which requires more steps!

OK, let me take a stab at this. The solution I'm outlining is not purely SQL Server, however: it uses a round trip via a text file.

The approach uses the following steps:

  1. Unpivot the data delimited by the pipe symbols (to create more than one line of output for each line of input)
  2. Round-trip the data from SQL Server to a text file and back
  3. Separate the data into columns on the tilde ~ symbol delimiter
  4. Pivot the data back into columns

The key benefit of this approach is the unpivot operation, which allows you to handle missing columns like MedCond2 naturally by the absence of an equivalent row. It also eliminates nearly all string manipulation, save for the one REPLACE function in step 1 below.

Given a single row's contents like the following:

VS5~MedCond1~35.4|VS4~MedCond2~16|VS1~MedCond3~155|VS2~MedCond4~70|SPO2~MedCond5~100|VS3~MedCond6~64|FiO2~MedCond7~21|MAP~MedCond8~98| 

Step 1 (Unpivot): Find and replace all instances of the pipe symbol with a newline character. CHAR(13) is a carriage return; substitute CHAR(10) or CHAR(13) + CHAR(10) if your tool expects a line feed or CR+LF. So, REPLACE(column, '|', CHAR(13)) will give you the following lines of text (i.e. multiple lines of text in a single database row) for a single input row:

VS5~MedCond1~35.4
VS4~MedCond2~16
VS1~MedCond3~155
VS2~MedCond4~70
SPO2~MedCond5~100
VS3~MedCond6~64
FiO2~MedCond7~21
MAP~MedCond8~98
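As a quick way to check the output of step 1, the replacement can be tried on a single value. This is just a sketch: the variable `@row` is hypothetical, and CR+LF is used as the newline on the assumption that the text will be viewed or exported on Windows.

```sql
-- Hypothetical single value standing in for one row of the column
declare @row nvarchar(2000) = N'VS5~MedCond1~35.4|VS4~MedCond2~16|VS1~MedCond3~155|';

-- Replace each pipe with CR+LF; PRINT shows the line breaks,
-- which a results grid would normally hide
print replace(@row, N'|', char(13) + char(10));
```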

Step 2 (Round-trip): Write the above output to a text file, using your tool of choice (SSIS, SQLCMD, etc.) and ensure that the newline character defined is the same as that used in the REPLACE command in step 1.

The purpose of this step is to put the multiple lines held within each database row into one file alongside the lines from all the other rows, so that every condition becomes its own line.

Note that step 1 can be eliminated by defining the row delimiter for steps 2 & 3 as the pipe symbol. I've put in the additional step 1 using newlines only to make it easier to understand and debug.

Step 3 (Separate columns): Import the text file back into SQL Server using the same tool, and define the column delimiter as the tilde ~ symbol, row delimiter same as in steps 1/2.

ColA   MedCondTitle  MedCondValue 
------ ------------- ------------- 
VS5    MedCond1      35.4
VS4    MedCond2      16
VS1    MedCond3      155
VS2    MedCond4      70
SPO2   MedCond5      100
VS3    MedCond6      64
FiO2   MedCond7      21
MAP    MedCond8      98
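If the import is done with T-SQL rather than SSIS, step 3 might look roughly like the following. Treat it as a sketch under assumptions: the staging table name `#MedStaging` and the file path are placeholders, and the row terminator has to match whatever was actually written in step 2.

```sql
-- Hypothetical staging table matching the three tilde-separated fields
create table #MedStaging (
    ColA         varchar(20),
    MedCondTitle varchar(20),
    MedCondValue varchar(20)
);

-- The file path is a placeholder; point it at the file written in step 2.
-- Use rowterminator = '|' instead if the pipes were left in place.
bulk insert #MedStaging
from 'C:\temp\medcond.txt'
with (
    fieldterminator = '~',
    rowterminator   = '\n'
);

select ColA, MedCondTitle, MedCondValue
from #MedStaging;
```

As written, the file carries no key identifying which source row each line came from; if the later pivot has to produce one output row per source row, such a key would need to be exported as an extra field.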

Step 4 (Pivot): Now you'd have a trivially simple step of pivoting rows to columns, which can be achieved with a statement of the form:

SUM(CASE WHEN MedCondTitle = 'MedCond1' THEN MedCondValue ELSE 0 END) AS MedCond1
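A fuller version of the pivot could look like the sketch below. Several things here are assumptions rather than part of the steps above: the table variable `@staging` and its sample rows are made up, and the `RowId` column presumes that a key for the source row was carried through the round trip. It also swaps in MAX with no ELSE branch in place of SUM ... ELSE 0, so that a condition missing from a row shows as NULL rather than 0, and it casts the imported text to a number explicitly.

```sql
-- Hypothetical staging data; RowId is an assumed key for the source row
declare @staging table (
    RowId        int,
    ColA         varchar(20),
    MedCondTitle varchar(20),
    MedCondValue varchar(20)
);
insert into @staging (RowId, ColA, MedCondTitle, MedCondValue)
values (1, 'VS5',  'MedCond1', '35.4'),
       (1, 'VS4',  'MedCond2', '16'),
       (1, 'VS1',  'MedCond3', '155'),
       (2, 'VS1',  'MedCond3', '150'),
       (2, 'FiO2', 'MedCond7', '21');

-- One output row per source row; conditions not present stay NULL
select RowId,
       max(case when MedCondTitle = 'MedCond1' then cast(MedCondValue as decimal(9, 2)) end) as MedCond1,
       max(case when MedCondTitle = 'MedCond2' then cast(MedCondValue as decimal(9, 2)) end) as MedCond2,
       max(case when MedCondTitle = 'MedCond3' then cast(MedCondValue as decimal(9, 2)) end) as MedCond3,
       max(case when MedCondTitle = 'MedCond4' then cast(MedCondValue as decimal(9, 2)) end) as MedCond4,
       max(case when MedCondTitle = 'MedCond5' then cast(MedCondValue as decimal(9, 2)) end) as MedCond5,
       max(case when MedCondTitle = 'MedCond6' then cast(MedCondValue as decimal(9, 2)) end) as MedCond6,
       max(case when MedCondTitle = 'MedCond7' then cast(MedCondValue as decimal(9, 2)) end) as MedCond7,
       max(case when MedCondTitle = 'MedCond8' then cast(MedCondValue as decimal(9, 2)) end) as MedCond8
from @staging
group by RowId
order by RowId;
```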
