Get only numeric part from column data SQL

Question

I have a business requirement to keep only the numeric values and set the rest to NULL, as shown below.

Result
>4.2
<6.0
5.0 is max
dup
1

Expected output:

Result
4.2
6.0
5.0
NULL
1

In T-SQL? Not simple. With a language that supports regex replacement? Trivial. If you can do this in the application layer that inserts the data, that would be the best place.
– Thom A ♦, Commented Jul 2, 2021 at 15:07
Then I would suggest investing in creating some CLR functions that provide regex support. A search will give you a lot of results.
– Thom A ♦, Commented Jul 2, 2021 at 15:12

Yitzhak Khabinsky · Accepted Answer · 2021-07-02 16:31:37Z

Please try the following solution.

SQL

-- DDL and sample data population, start
DECLARE @tbl TABLE (ID INT IDENTITY PRIMARY KEY, tokens VARCHAR(100));
INSERT INTO @tbl (tokens) VALUES
('>4.2'),
('<6.0'),
('5.0 is max'),
('dup'),
('1');
-- DDL and sample data population, end

DECLARE @separator CHAR(1) = SPACE(1);

SELECT ID, tokens 
    , c.query('<root> 
       { 
           for $x in /root/r 
           return if (xs:decimal($x) instance of xs:decimal) then $x
                 else () (: filter out non-decimals :) 
       }
       </root>').value('(/root/r/text())[1]','VARCHAR(10)') as result
FROM @tbl
CROSS APPLY (SELECT TRY_CAST('<root><r><![CDATA[' + 
      REPLACE(REPLACE(REPLACE(tokens,'>',''),'<',''), @separator, ']]></r><r><![CDATA[') + 
      ']]></r></root>' AS XML)) AS t(c);

Output

+----+------------+--------+
| ID |   tokens   | result |
+----+------------+--------+
|  1 | >4.2       | 4.2    |
|  2 | <6.0       | 6.0    |
|  3 | 5.0 is max | 5.0    |
|  4 | dup        | NULL   |
|  5 | 1          | 1      |
+----+------------+--------+

Gordon Linoff · Accepted Answer · 2021-07-02 15:13:55Z

SQL Server is not optimal for this, but this logic should do what you want:

select *,
       left(v2.str, patindex('%[^.0-9]%', v2.str + ' ') - 1)
from (values ('5.0 is max')) v(str) cross apply
     (values (stuff(v.str, 1, patindex('%[0-9]%', v.str) - 1, ''))) v2(str);

Here is a db<>fiddle.
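
To run the same logic against a column rather than a hard-coded value, the values constructor can be swapped for a table. The following is a minimal sketch, assuming a hypothetical table variable @t with a Result column loaded from the question's sample rows; the names @t and NumericPart are illustrative and not part of the answer.

```sql
-- Hypothetical sample table; @t and its column name are assumptions for illustration
declare @t table (Result varchar(100));
insert into @t (Result) values ('>4.2'), ('<6.0'), ('5.0 is max'), ('dup'), ('1');

select t.Result,
       -- step 2: keep everything up to the first character that is not a digit or a dot
       left(v2.str, patindex('%[^.0-9]%', v2.str + ' ') - 1) as NumericPart
from @t t cross apply
     -- step 1: drop everything before the first digit
     (values (stuff(t.Result, 1, patindex('%[0-9]%', t.Result) - 1, ''))) v2(str);
```

For a row with no digits, such as 'dup', patindex returns 0, so stuff receives a negative length and returns NULL, which then carries through left as NULL.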

4 Comments

Vivan_J Over a year ago

Correct! This is the result I expected. Thank you, but how do I use a column name instead of hard-coded values?

Arulkumar Over a year ago

@Vivan_J Use a column alias: left(v2.str, patindex('%[^.0-9]%', v2.str + ' ') - 1) AS ResultValue

John Cappelletti Over a year ago

@Vivan_J Sample of Gordon's answer against a table dbfiddle.uk/…

Vivan_J Over a year ago

Thank you John.. really helpful.. appreciated much!

Stu · Accepted Answer · 2021-07-02 15:28:25Z

One approach you could try is using a user-defined function to strip out the numbers only from your strings.

It won't be performant over a very large dataset, but is simple to implement and might work for you depending on your requirements.

create or alter function [dbo].[NumbersOnly](@str varchar(100))
returns varchar(100)
as
begin
    -- Substring positions are 1-based, so the scan starts at 1
    declare @len smallint=Len(@str), @i smallint=1, @result varchar(100)=''

    while @i <=@len
    begin
        -- keep the character if it is '.' (46) or a digit '0'-'9' (48-57)
        if Ascii(Substring(@str,@i,1)) in (46,48,49,50,51,52,53,54,55,56,57)
            set @result=@result + Substring(@str,@i,1)
        set @i=@i+1
    end

    return @result

end
go

Then just use it against your data:

select NullIf(dbo.NumbersOnly(Result),'') as Result
from YourTable -- placeholder: replace with your table name

See Example Fiddle

edited Jul 2, 2021 at 15:28

answered Jul 2, 2021 at 15:23

2 Comments

Vivan_J Over a year ago

Very useful! Sorry to ask more, but is this not possible without a function, using an inline query alone?

Stu Over a year ago

Well, tbh it's possible to create something that can work inline as part of a cross apply() using string_split(). Unfortunately, SQL Server doesn't provide a method to guarantee the ordering of its output, so a solution using this method also requires a custom set-based string split method, which is just a lot of gumph! Hence why I said this might be useful to you if it works for your particular use case; I use it in various projects where it's required on no more than 1-2k rows with no issues.

Alan Burstein · Accepted Answer · 2021-07-02 22:55:23Z

You can use patReplace8K, which makes this stuff easy. Here I'm saying "remove anything that matches this pattern: [^0-9.]", i.e. anything that is not a digit or a dot. This is the fastest function for this type of thing.

--==== Sample Data
DECLARE @table TABLE (SomeString VARCHAR(20));
INSERT @table VALUES('>4.2'),('<6.0'),('5.0 is max'),('dup'),('1');

--==== Solution using Patreplace
SELECT      t.SomeString, pr.NewString
FROM        @table AS t 
CROSS APPLY samd.patReplace8K(t.SomeString,'[^0-9.]','') AS pr;

Results:

SomeString    NewString
------------- ---------------
>4.2          4.2
<6.0          6.0
5.0 is max    5.0
dup           NULL
1             1
