I need to remove rows from a table where a two-column combination has the same value. For example, in the sample table below, there should be only one row with the combination (48983, 2018-05-01).

ID      CertID   DueDate
676790  48983   2018-05-03
678064  48983   2018-05-02
678086  48983   2018-05-01
678107  48983   2018-05-01
678061  48983   2018-05-01

I tried to get the list of duplicate entries, but what I get back is the entire table. This is what I used:

WITH A   -- Get a list of unique combinations of ResponseDueDate and CertificateID
AS  (
   SELECT Distinct
          ID,       
          ResponseDueDate,
          CertID
   FROM  FacCompliance
)
,   B  -- Get a list of all those CertID values that have more than one ResponseDueDate associated
AS  (
    SELECT CertID
    FROM   A
    GROUP BY
           CertID
    HAVING COUNT(*) > 1
)
SELECT  A.ID,
        A.ResponseDueDate,
        A.FacCertificateID
FROM    A
    JOIN B
        ON  A.CertID = B.CertID
order by CertID, ResponseDueDate;

What is wrong with the query I am using, and is it possible to remove the extra rows (in the example above, keep one instance of the (48983, 2018-05-01) combination and remove the rest)? I am using SQL Server 2016.

  • Do you have a preference on which row to keep, eg smallest ID? Commented May 18, 2018 at 13:19
  • You can use ROW_NUMBER() OVER (PARTITION BY CertId, DueDate ORDER BY ID), then delete the rows where the row number is > 1. Commented May 18, 2018 at 13:21
  • No, as long as I end up with one row instead of three. Commented May 18, 2018 at 13:43

2 Answers

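Use ROW_NUMBER():

WITH A AS  (
   SELECT 
          ID,       
          ResponseDueDate,
          CertID,
          -- Number the rows inside each (CertID, ResponseDueDate) group.
          -- Ordering by ID makes the survivor deterministic: the lowest ID gets lp = 1.
          ROW_NUMBER() over (partition by CertID, ResponseDueDate order by ID) lp
   FROM  FacCompliance
)
-- Deleting through the CTE removes the underlying FacCompliance rows.
delete a
where lp <> 1
;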
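
Before running the DELETE, it may be worth previewing which rows it would remove. The following is only a sketch: @FacCompliance is a hypothetical table variable standing in for the real table, loaded with the sample rows from the question, and it assumes ID is unique and that keeping the lowest ID per group is acceptable.

```sql
-- Hypothetical stand-in for FacCompliance, filled with the question's sample rows.
DECLARE @FacCompliance TABLE (ID INT PRIMARY KEY, CertID INT, ResponseDueDate DATE);
INSERT INTO @FacCompliance (ID, CertID, ResponseDueDate) VALUES
    (676790, 48983, '2018-05-03'),
    (678064, 48983, '2018-05-02'),
    (678086, 48983, '2018-05-01'),
    (678107, 48983, '2018-05-01'),
    (678061, 48983, '2018-05-01');

WITH A AS (
    SELECT ID,
           CertID,
           ResponseDueDate,
           -- Same numbering as the delete, with ORDER BY ID (an assumption) as the tie-breaker.
           ROW_NUMBER() OVER (PARTITION BY CertID, ResponseDueDate ORDER BY ID) AS lp
    FROM   @FacCompliance
)
-- Rows with lp > 1 are the ones the delete would remove.
SELECT ID, CertID, ResponseDueDate, lp
FROM   A
WHERE  lp > 1
ORDER BY CertID, ResponseDueDate, ID;
```

With the sample data this should list IDs 678086 and 678107, leaving 678061 as the surviving row for (48983, 2018-05-01).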

Also, if ID is unique, you can do it without window functions. This keeps the row with the smallest ID in each group:

delete fc
from  FacCompliance fc
where exists (
    select 1
    from FacCompliance ref
    where ref.ResponseDueDate = fc.ResponseDueDate
        and ref.CertID = fc.CertID
        and ref.ID < fc.ID
)
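
As for why the query in the question returns the whole table: ID is part of the SELECT DISTINCT list, and since ID appears to be unique per row, DISTINCT removes nothing. B then groups by CertID alone, so any CertID with more than one row qualifies. A minimal sketch of listing only the duplicated pairs is below; @FacCompliance is a hypothetical table variable standing in for the real table, loaded with the sample rows from the question.

```sql
-- Hypothetical stand-in for FacCompliance, filled with the question's sample rows.
DECLARE @FacCompliance TABLE (ID INT PRIMARY KEY, CertID INT, ResponseDueDate DATE);
INSERT INTO @FacCompliance (ID, CertID, ResponseDueDate) VALUES
    (676790, 48983, '2018-05-03'),
    (678064, 48983, '2018-05-02'),
    (678086, 48983, '2018-05-01'),
    (678107, 48983, '2018-05-01'),
    (678061, 48983, '2018-05-01');

-- Group on the pair itself (ID left out) and keep only pairs that occur more than once.
SELECT CertID,
       ResponseDueDate,
       COUNT(*) AS RowsInGroup,
       MIN(ID)  AS LowestID
FROM   @FacCompliance
GROUP BY CertID, ResponseDueDate
HAVING COUNT(*) > 1;
```

For the sample data this should return a single row for (48983, 2018-05-01) with three rows in the group.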

You can number the rows within each (CertID, DueDate) group, ordered by ID, and then delete every row after the first in each group.

DECLARE @T TABLE (ID INT,CertID INT, DueDate DATE)
INSERT INTO @T(ID,CertID,DueDate) SELECT 676790,48983,'2018-05-03'
INSERT INTO @T(ID,CertID,DueDate) SELECT 678064,48983,'2018-05-02'
INSERT INTO @T(ID,CertID,DueDate) SELECT 678086,48983,'2018-05-01'
INSERT INTO @T(ID,CertID,DueDate) SELECT 678107,48983,'2018-05-01'
INSERT INTO @T(ID,CertID,DueDate) SELECT 678061,48983,'2018-05-01'


DELETE t
FROM @T t
INNER JOIN (
    SELECT
        *
        ,Row_number() OVER(PARTITION BY CertID,DueDate ORDER BY ID ASC) AS [Row]
    FROM @T
) Ordered ON Ordered.ID=t.ID
WHERE [Row]<>1

SELECT * FROM @T
