I'm trying to extract data from a string made up of multiple notes that are appended to a varchar(4000) column. I'm using a mixture of regex and functions in the query, along with CONNECT BY LEVEL and regexp_count, because I have no idea whether there will be one note or several. When I ran it, I noticed lots of duplicate rows in the results. I believe this is purely due to the CONNECT BY, which I haven't had to use before, so I think I've missed something.

Here is the query:

    select
      id,
      substr(regexp_substr(VALUE,'^LOCKED BY USER: +(.*)',1,level,'m'),17) as LOCKUSER,
      substr(regexp_substr(VALUE,'^LOCKED ENTITY: +(.*)',1,level,'m'),16) as LOCKED_ENTITY,
      TO_DATE(LTRIM(regexp_substr(VALUE,'^LOCKED AT: ([[:digit:]]{2}/[[:digit:]]{2}/[[:digit:]]{4}\.?)',1,level,'m'),'LOCKED AT: '),'DD/MM/YYYY') as Dates_Locked,
      substr(regexp_substr(VALUE,'^LOCK NOTES: +(.*)',1,level,'m'),13) as LOCK_NOTES,
      'LOCK' as ACTION
    from TABLE
    where regexp_substr(VALUE,'^LOCK NOTES: +(.*)',1,level,'m') IS NOT NULL
      AND TO_DATE(LTRIM(regexp_substr(VALUE,'LOCKED AT: ([[:digit:]]{2}/[[:digit:]]{2}/[[:digit:]]{4}\.?)',1,level,'m'),'LOCKED AT: '),'DD/MM/YYYY') >= (SYSDATE -365)
    connect by level <= regexp_count(VALUE,CHR(10)||CHR(13));

If I run this against a table with 10K records, I never get any results back, which I assume is due to the sheer volume of duplicate rows it returns. Is there a way for me to prevent this?

Many Thanks

  • To confirm, I have already tried SELECT DISTINCT, without success. (Commented Nov 7, 2019 at 14:53)

1 Answer

Currently, your CONNECT BY only limits the hierarchical level, and doesn't provide any condition for matching child rows to parent rows. This means that in a table with multiple rows, every row is a child of every other row. This is going to produce a massive result set.
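To make the blow-up concrete, here is a small sketch against a made-up three-row inline table (the name t and its contents are assumptions for illustration, not your data). With only a level limit in the CONNECT BY, every row at one level becomes the parent of every row at the next, so the row count multiplies by the table size at each level:

```sql
-- Hypothetical three-row table, built inline so the example is self-contained.
WITH t AS (
  SELECT 1 AS id FROM dual
  UNION ALL
  SELECT 2 AS id FROM dual
  UNION ALL
  SELECT 3 AS id FROM dual
)
SELECT level,
       COUNT(*) AS rows_at_level
FROM t
CONNECT BY level <= 3          -- no PRIOR condition: any row can follow any row
GROUP BY level
ORDER BY level;
-- 3 rows at level 1, 9 at level 2, 27 at level 3 (39 rows from a 3-row table)
```

The same multiplication applied to 10K rows is why the original query never finishes.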

If I understand correctly, you are trying to use the hierarchical functionality to pull multiple values from each individual row. So you really want each row to be parent and child to itself. I suggest trying:

    CONNECT BY id = PRIOR id
    AND PRIOR sys_guid() IS NOT NULL
    AND level <= regexp_count(VALUE,CHR(10)||CHR(13))

Thanks to @kfinity for pointing out the need for the sys_guid() to prevent a CONNECT BY LOOP.
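As a rough end-to-end sketch of the pattern, the query below runs against made-up sample data (the notes table, its two rows, and the user names are all hypothetical). It also makes two changes that are my own assumptions rather than part of the original query: it counts occurrences of the 'LOCKED BY USER:' marker instead of counting CHR(10)||CHR(13), and it uses the subexpression argument of regexp_substr in place of the substr offset.

```sql
-- Hypothetical sample data standing in for the real table and its VALUE column.
WITH notes AS (
  SELECT 1 AS id,
         'LOCKED BY USER: alice' || CHR(10) || 'LOCKED BY USER: bob' AS value
  FROM dual
  UNION ALL
  SELECT 2 AS id,
         'LOCKED BY USER: carol' AS value
  FROM dual
)
SELECT id,
       level AS occurrence,
       -- The trailing 1 asks for the first captured group only,
       -- so no SUBSTR offset is needed to strip the label.
       regexp_substr(value, '^LOCKED BY USER: +(.*)', 1, level, 'm', 1) AS lockuser
FROM notes
CONNECT BY id = PRIOR id                  -- a row can only be its own child
       AND PRIOR sys_guid() IS NOT NULL   -- makes each parent look unique, avoiding ORA-01436
       AND level <= regexp_count(value, '^LOCKED BY USER:', 1, 'm')
ORDER BY id, level;
```

With this sample data the query should return three rows: alice and bob for id 1, and carol for id 2, with no cross-row duplicates.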

3 Comments

Thanks Dave, I just gave it a try, but unfortunately it complains: ORA-01436: CONNECT BY loop in user data.
If you get ORA-01436 with this, you might also need to add "and prior sys_guid() is not null".
Perfect! You pre-empted my question just as I posted it! I've just added the "and prior sys_guid() is not null" and it now works great! Thank you for your help!
