Regular Expression Oracle SQL : handling multiple cases

Question

I am using REGEXP_SUBSTR to extract values that start with 'p' and end before '-'. The 'p' is followed by 6 digits.

My Code :

 code,REGEXP_SUBSTR(CODE,'^[p][^-]+')

`CODE`	`REGEXP_SUBSTR(CODE,'^[P][^-]+')`
p700401-	p700401
p791701-	p791701
100-,p788001-,	null

This is the result, but I am struggling to handle cases like the one in the 3rd row:

100-,p788001-

Can someone please guide me on how to handle such cases?

MT0 · Accepted Answer · 2022-11-03 10:57:01Z

2

If you want to match complete terms in a comma-delimited string then you can use:

SELECT code,
       REGEXP_SUBSTR(code, '(^|,)(p\d{6})-(,|$)', 1, 1, NULL, 2) AS result
FROM   table_name;

Which, for the sample data:

CREATE TABLE table_name (code) as
  SELECT 'p700401-' FROM DUAL UNION ALL
  SELECT 'p791701-' FROM DUAL UNION ALL
  SELECT '100-,p788001-,' FROM DUAL UNION ALL
  SELECT '123-,p456789-xyz,p987654-' FROM DUAL UNION ALL
  SELECT 'p111111-,p222222-not_this,p333333-,p444444-' FROM DUAL;

Outputs:

CODE	RESULT
p700401-	p700401
p791701-	p791701
100-,p788001-,	p788001
123-,p456789-xyz,p987654-	p987654
p111111-,p222222-not_this,p333333-,p444444-	p111111
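The six-argument form of REGEXP_SUBSTR used above takes, in order, the source string, the pattern, the start position, the occurrence, the match parameter and the capture group to return. As a minimal sketch on a single literal from the sample data (the column aliases are made up for illustration), compare returning the whole match with returning only the second group:

```sql
-- Sub-expression 0 returns the whole match, including the delimiters;
-- sub-expression 2 returns only the (p\d{6}) capture group.
SELECT REGEXP_SUBSTR('100-,p788001-,', '(^|,)(p\d{6})-(,|$)', 1, 1, NULL, 0) AS whole_match,
       REGEXP_SUBSTR('100-,p788001-,', '(^|,)(p\d{6})-(,|$)', 1, 1, NULL, 2) AS second_group
FROM   DUAL;
```

This should return ,p788001-, for whole_match and p788001 for second_group, which is why the query above asks for group 2.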

Displaying multiple terms

If you want to remove the non-matching terms from the string then:

SELECT code,
       LTRIM(
         REGEXP_REPLACE(
           ',' || REPLACE(code, ',', ',,') || ',',
           '((,p\d{6})-,)|,.*?,',
           '\2'
         ),
         ','
       ) AS result
FROM   table_name;

Which outputs:

CODE	RESULT
p700401-	p700401
p791701-	p791701
100-,p788001-,	p788001
123-,p456789-xyz,p987654-	p987654
p111111-,p222222-not_this,p333333-,p444444-	p111111,p333333,p444444
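The delimiters are doubled because each match consumes the commas on both sides of a term, so with single delimiters two adjacent matching terms would compete for the same comma. A minimal sketch of the difference, assuming a made-up two-term literal and using REGEXP_COUNT:

```sql
-- With single delimiters the comma after p111111- is consumed by the first
-- match, so p333333- has no leading comma left to match against.
SELECT REGEXP_COUNT(',' || 'p111111-,p333333-' || ',', ',p\d{6}-,') AS single_delims,
       REGEXP_COUNT(',' || REPLACE('p111111-,p333333-', ',', ',,') || ',', ',p\d{6}-,') AS doubled_delims
FROM   DUAL;
```

Here single_delims should be 1 and doubled_delims should be 2.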

And if you want to split the list into rows then:

SELECT t.code,
       i.*
FROM   (
         SELECT code,
                ',' || REPLACE(code, ',', ',,') || ',' AS double_delims
         FROM   table_name
       ) t
       INNER JOIN LATERAL (
         SELECT LEVEL AS item,
                REGEXP_SUBSTR(double_delims, ',(p\d{6})-,|,(.*?),', 1, LEVEL, NULL, 1)
                  AS value
         FROM   DUAL
         CONNECT BY LEVEL <= REGEXP_COUNT(double_delims, ',(p\d{6})-,|,(.*?),')
       ) i
       ON (i.value IS NOT NULL);

Which outputs:

CODE	ITEM	VALUE
p700401-	1	p700401
p791701-	1	p791701
100-,p788001-,	2	p788001
123-,p456789-xyz,p987654-	3	p987654
p111111-,p222222-not_this,p333333-,p444444-	1	p111111
p111111-,p222222-not_this,p333333-,p444444-	3	p333333
p111111-,p222222-not_this,p333333-,p444444-	4	p444444

fiddle

edited Nov 3, 2022 at 10:57

answered Nov 3, 2022 at 9:46

4 Comments

JvdV Over a year ago

+ But did you leave the first number out of the results in line 4 on purpose?

MT0 Over a year ago

@JvdV The first line states "If you want to match complete terms" and the regular expression would not match the complete term for ,p456789-xyz,; so, yes, that row was included to show that it would match the correct term when matching complete terms and it should match the last term in the delimited list and not the middle term.

JvdV Over a year ago

Thanks, it's clear how you interpreted the data then. I borrowed how you created the schema but could have misinterpreted the desired outcome then.

Leverage Over a year ago

Thanks, MT0, for the thorough answer. Always appreciate the input.

Littlefoot · Accepted Answer · 2022-11-03 08:58:35Z

1

For the sample data you posted, this returns the result you wanted (i.e. it takes "p" followed by 6 digits):

SQL> with test (code) as
  2    (select 'p700401-' from dual union all
  3     select 'p791701-' from dual union all
  4     select '100-,p788001-,' from dual
  5    )
  6  select code,
  7         regexp_substr(code, 'p\d{6}') result
  8  from test;

CODE           RESULT
-------------- --------------
p700401-       p700401
p791701-       p791701
100-,p788001-, p788001

SQL>
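One caveat worth noting: the pattern p\d{6} is not anchored to the delimiters, so it also matches the first six digits of a longer run. A minimal sketch with a hypothetical value that is not in the sample data:

```sql
-- 'p1234567-' is a made-up value with seven digits after the p.
SELECT REGEXP_SUBSTR('p1234567-', 'p\d{6}') AS unanchored,
       REGEXP_SUBSTR('p1234567-', '(^|,)(p\d{6})-', 1, 1, NULL, 2) AS anchored
FROM   DUAL;
```

The unanchored pattern should return p123456, while the anchored variant (an illustration, not part of the answer above) requires the hyphen straight after the sixth digit and should return NULL. If the data never contains such values, the simpler pattern is enough.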

edited Nov 3, 2022 at 8:58

answered Nov 3, 2022 at 8:53

6 Comments

JvdV Over a year ago

Can I ask you a question @Littlefoot? Would something like SELECT code, REGEXP_REPLACE (code, '(?:\b[^p,-]\w*-,|,$)', '') result FROM test; work? I was thinking what if data is something like 100-,p788001-,100-,p788002-, but I have no means of testing this.

Littlefoot Over a year ago

It wouldn't work, @JvdV, I tested it. That regex returns the "original" code, doesn't "replace" anything.

JvdV Over a year ago

Thanks for the feedback. I wonder why. Maybe part of the regex would not be supported syntax. The idea would be to return something like this. Maybe the non-capturing group is supposed to be a regular capture group...

Littlefoot Over a year ago

I'm not that good at regex, @JvdV; I only know some Oracle syntax. Wiktor Stribiżew is the regex expert here, I guess he'd be able to explain it.

MT0 Over a year ago

@JvdV Oracle does not support non-capturing groups (?: ) or word boundaries \b, so you cannot use that method.

JvdV · Accepted Answer · 2022-11-03 10:24:57Z

Right, my two cents is to use REGEXP_REPLACE():

CREATE TABLE tst (code) as
  SELECT 'p700401-' FROM DUAL UNION ALL
  SELECT 'p791701-' FROM DUAL UNION ALL
  SELECT '100-,z123456' FROM DUAL UNION ALL
  SELECT '100-,p788001-,' FROM DUAL UNION ALL
  SELECT 'p788001-,100-' FROM DUAL UNION ALL
  SELECT '123-,p456789-xyz,p987654-' FROM DUAL;

SELECT code,
       REGEXP_REPLACE(
         REGEXP_REPLACE(code, '(p\d{6})-|.', '\1'),
         '(\d)(p)',
         '\1,\2'
       ) AS result
FROM   tst;

Results in:

CODE	RESULT
p700401-	p700401
p791701-	p791701
100-,z123456	null
100-,p788001-,	p788001
p788001-,100-	p788001
123-,p456789-xyz,p987654-	p456789,p987654

It's a nested statement because Oracle lacks support for some handy regex syntax, as per the given link.

The 1st regex pattern is supposed to remove anything other than what you are after (see an online demo). The 2nd one is there to insert commas back to separate these values (see the demo).
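
To see what each pass contributes, the two steps can be selected side by side. A minimal sketch on one of the sample strings (the step_1 and step_2 aliases are made up for illustration):

```sql
-- step_1 keeps only the captured p + 6 digits and drops every other character;
-- step_2 re-inserts a comma wherever a digit is directly followed by a p.
SELECT REGEXP_REPLACE(code, '(p\d{6})-|.', '\1') AS step_1,
       REGEXP_REPLACE(REGEXP_REPLACE(code, '(p\d{6})-|.', '\1'), '(\d)(p)', '\1,\2') AS step_2
FROM   (SELECT '123-,p456789-xyz,p987654-' AS code FROM DUAL);
```

This should give p456789p987654 for step_1 and p456789,p987654 for step_2, matching the last row of the result above.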
