I'm trying to extract data from a string made up of multiple notes that have been appended to a varchar(4000) column. The query uses a mixture of regular expressions and functions, along with CONNECT BY LEVEL driven by regexp_count, because I have no idea whether a row will hold one note or several. When the results came back I noticed lots of duplicate rows. I believe this is purely down to the CONNECT BY, which I haven't had to use before, so I think I've missed something.
Here is the query:
select
id,
substr(regexp_substr(VALUE,'^LOCKED BY USER: +(.*)',1,level,'m'),17) as LOCKUSER,
substr(regexp_substr(VALUE,'^LOCKED ENTITY: +(.*)',1,level,'m'),16) as LOCKED_ENTITY,
TO_DATE(LTRIM(regexp_substr(VALUE,'^LOCKED AT: ([[:digit:]]{2}/[[:digit:]]{2}/[[:digit:]]{4}\.?)',1,level,'m'),'LOCKED AT: '),'DD/MM/YYYY') as Dates_Locked,
substr(regexp_substr(VALUE,'^LOCK NOTES: +(.*)',1,level,'m'),13) as LOCK_NOTES,
'LOCK' as ACTION
from TABLE
where regexp_substr(VALUE,'^LOCK NOTES: +(.*)',1,level,'m') IS NOT NULL
AND TO_DATE(LTRIM(regexp_substr(VALUE,'LOCKED AT: ([[:digit:]]{2}/[[:digit:]]{2}/[[:digit:]]{4}\.?)',1,level,'m'),'LOCKED AT: '),'DD/MM/YYYY') >= (SYSDATE -365)
connect by level <= regexp_count(VALUE,CHR(10)||CHR(13));
If I run this against a table with 10K records I never get any results back, which I assume is due to the sheer volume of duplicate rows it generates. Is there a way for me to prevent this?
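From what I've read, the duplication seems to come from the CONNECT BY clause having no condition that ties a child row to its parent, so at each level every row gets connected to every row in the table. Below is a minimal sketch of the workaround I've seen suggested. It assumes `id` is unique per row. `notes_table` is a hypothetical stand-in for the real table name, and counting `LOCK NOTES:` lines rather than line breaks is my own assumption about how to count the notes.

```sql
-- Sketch only: notes_table is a placeholder name, and id is assumed unique.
select id,
       level as note_no,
       -- level-th line starting with "LOCK NOTES: ", 'm' = multiline anchors
       regexp_substr(value, '^LOCK NOTES: +(.*)', 1, level, 'm') as lock_notes_line
from   notes_table
-- one level per note found in this row's text
connect by level <= regexp_count(value, '^LOCK NOTES: ', 1, 'm')
       -- keep each hierarchy inside its own source row
       and prior id = id
       -- non-deterministic PRIOR term so Oracle does not report a CONNECT BY loop
       and prior sys_guid() is not null;
```

If that is right, each row should expand only into as many levels as it has notes, instead of multiplying against the whole table at every level.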
Many Thanks