3

I have a column with inconsistent data. The column is named ID, and it can hold values such as:

0897546321
ABC,0876455321
ABC,XYZ,0873647773
ABC,
99756
test only

The SQL query should fetch only IDs that are exactly 10 digits long, begin with 08, are not null, and do not consist entirely of letters. For values that mix digits and letters, such as ABC,XYZ,0873647773, it should fetch only the 0873647773. Nothing in these values is fixed: in place of ABC and XYZ there can be anything, of any length.

The column Id is of varchar type.

My attempt: I tried the following query:

select id
from table
where id is not null
and id not like '%[^0-9]%'
and id like '[08]%[0-9]'
and len(id)=10

I am still not sure how to deal with values like ABC,XYZ,0873647773.

P.S. I have no control over the database, so I can't change its values.

1
  • The real problem here appears to be that you are storing delimited data in your RDBMS. What you should really be doing is fixing your design. Commented Dec 10, 2020 at 10:14

2 Answers

3

SQL Server generally has poor support for regular expressions, but in this case a judicious use of PATINDEX is viable:

SELECT SUBSTRING(id, PATINDEX('%,08[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9],%', ',' + id + ','), 10) AS number
FROM yourTable
WHERE ',' + id + ',' LIKE '%,08[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9],%';
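To see why wrapping id in commas works, here is a small sketch that shows the padded string and the PATINDEX result for each row. The table variable @yourTable and the extra NULL row are assumptions added for illustration; the other values are the samples from the question.

```sql
-- Hypothetical sample data: the example values from the question, plus a NULL.
DECLARE @yourTable TABLE (id varchar(100) NULL);
INSERT INTO @yourTable (id)
VALUES ('0897546321'), ('ABC,0876455321'), ('ABC,XYZ,0873647773'),
       ('ABC,'), ('99756'), ('test only'), (NULL);

-- Padding with a comma on both sides means every element, including the
-- first and the last, is surrounded by delimiters, so one pattern covers
-- all positions. match_pos is where the leading comma sits in the padded
-- string, which is the same offset as the first digit in the unpadded id.
SELECT id,
       ',' + id + ',' AS padded,
       PATINDEX('%,08[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9],%', ',' + id + ',') AS match_pos
FROM @yourTable;
```

With the default CONCAT_NULL_YIELDS_NULL behaviour, the NULL row produces a NULL padded string, so the LIKE predicate in the query above excludes it without a separate IS NOT NULL check. Rows with no qualifying element get a match_pos of 0.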



4 Comments

@a_naq . . . This only fetches one value from each row. Nothing in the question states that you want only one value per row. In fact, the phrasing suggests that you want all of them, strongly suggesting you could have multiple matches within a single row.
@GordonLinoff ... in that case, I would have just thrown in my hat and told the OP to handle this from Java, C#, etc.; SQL is not the right tool for that problem.
. . The comment really isn't directed to your answer which is correct for what it does. It is for the OP because the question is quite misleading, if this is the correct answer.
@GordonLinoff In my table, it is only the column named ID which has inconsistent data like this and I only want to fetch Ids which are present in that column only.
2

If you normalise your data by splitting the delimited values into parts, you can achieve this somewhat more easily:

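SELECT SS.value
FROM dbo.YourTable YT
     CROSS APPLY STRING_SPLIT(YT.YourColumn,',') SS
WHERE LEN(SS.value) = 10             -- exactly 10 characters
  AND SS.value LIKE '08%'            -- must begin with 08, as the question requires
  AND SS.value NOT LIKE '%[^0-9]%';  -- digits only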

If you're on an older version of SQL Server (STRING_SPLIT requires SQL Server 2016 or later, with database compatibility level 130 or higher), you'll have to use an alternative string splitter, such as an XML splitter or a user-defined inline table-valued function; there are plenty of examples of these on Stack Overflow already.
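As a rough sketch of the XML splitter approach mentioned above: each comma is replaced with a closing and opening tag, the result is cast to xml, and nodes() shreds it into one row per element. The table variable @YourTable, the element name i, and the aliases X, N, and P are made up for illustration, and the filter includes the question's "begins with 08" requirement. This assumes the data never contains XML-special characters such as & or <; if it might, the CAST will fail and those characters need escaping first.

```sql
-- Hypothetical sample data: the example values from the question.
DECLARE @YourTable TABLE (YourColumn varchar(100) NULL);
INSERT INTO @YourTable (YourColumn)
VALUES ('0897546321'), ('ABC,0876455321'), ('ABC,XYZ,0873647773'),
       ('ABC,'), ('99756'), ('test only');

SELECT P.part
FROM @YourTable YT
     -- 'ABC,XYZ' becomes <i>ABC</i><i>XYZ</i>
     CROSS APPLY (SELECT CAST('<i>' + REPLACE(YT.YourColumn, ',', '</i><i>') + '</i>' AS xml)) AS X(doc)
     -- one row per <i> element
     CROSS APPLY X.doc.nodes('/i') AS N(item)
     -- pull the element text out as a varchar
     CROSS APPLY (SELECT N.item.value('.', 'varchar(100)')) AS P(part)
WHERE LEN(P.part) = 10
  AND P.part LIKE '08%'
  AND P.part NOT LIKE '%[^0-9]%';
```

Because every element is returned as its own row, a value holding several qualifying numbers yields one output row per number, unlike the PATINDEX approach, which extracts only the first match.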

