
The database I'm working in is Snowflake. I have to work through an interface and can't edit the SQL query directly. I'm trying to add a JOIN to a FROM clause that connects tables from two schemas on one key that both tables share.

The goal is to show everything I am selecting in one row per person. The issue is that my query returns up to four rows for the same person.

DW.DTBL_PERSON

Person_Key  Person_ID  Person_Name
1           4500       Person A
2           4501       Person B
3           4502       Person C

REPORTING.MTBL_CONTACTS

Person_Key  Contact_Priority_Order  Contact_First_Name
1           1                       Anna
1           2                       Steve
2           1                       Joseph
3           1                       Mimi
3           2                       Mitchell
3           3                       Chris

SELECT DISTINCT
    dtbl_person.person_key AS "KEY",
    dtbl_person.person_name AS "Person Name",
    CASE
        WHEN T2.CONTACT_PRIORITY_ORDER = 1 THEN T2.CONTACT_FIRST_NAME
    END AS "First Contact First Name",
    CASE
        WHEN T2.CONTACT_PRIORITY_ORDER = 2 THEN T2.CONTACT_FIRST_NAME
    END AS "Second Contact First Name",
    CASE
        WHEN T2.CONTACT_PRIORITY_ORDER = 3 THEN T2.CONTACT_FIRST_NAME
    END AS "Third Contact First Name"

FROM
    DW.DTBL_PERSON
INNER JOIN
    REPORTING.MTBL_CONTACTS AS T1 ON DTBL_PERSON.PERSON_KEY = T1.PERSON_KEY
INNER JOIN
    REPORTING.MTBL_CONTACTS AS T2 ON CONCAT(T2.PERSON_KEY, T2.CONTACT_PRIORITY_ORDER) = CONCAT(T1.PERSON_KEY, T1.CONTACT_PRIORITY_ORDER)

ORDER BY
    dtbl_person.person_key

What it's doing

KEY  Person Name  First Contact First Name  Second Contact First Name  Third Contact First Name
1    Person A     NULL                      Steve                      NULL
1    Person A     Anna                      NULL                       NULL
2    Person B     Joseph                    NULL                       NULL
3    Person C     NULL                      NULL                       Chris
3    Person C     Mimi                      NULL                       NULL
3    Person C     NULL                      Mitchell                   NULL

What I want it to do

KEY  Person Name  First Contact First Name  Second Contact First Name  Third Contact First Name
1    Person A     Anna                      Steve                      NULL
2    Person B     Joseph                    NULL                       NULL
3    Person C     Mimi                      Mitchell                   Chris

Right now the query pulls all the information, but it creates a new row for each contact. I want everything for the same key to show on one row. The second INNER JOIN is meant to act as a self join, which I hoped would collapse the multiple rows into one row per key.
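The symptom can be reproduced outside Snowflake. The following is a minimal sketch, assuming SQLite (through Python's built-in sqlite3 module) as a stand-in for Snowflake, with the schema prefixes dropped and the illustrative aliases p and c. It also leaves out the self join because, for the sample data, that join only matches each contact row to itself. A CASE expression is evaluated once per joined row, so each contact stays on its own row.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for Snowflake; table contents are the
# sample rows from the question, without the DW. / REPORTING. schema prefixes.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dtbl_person (person_key INTEGER, person_id INTEGER, person_name TEXT);
    CREATE TABLE mtbl_contacts (person_key INTEGER, contact_priority_order INTEGER,
                                contact_first_name TEXT);
    INSERT INTO dtbl_person VALUES
        (1, 4500, 'Person A'), (2, 4501, 'Person B'), (3, 4502, 'Person C');
    INSERT INTO mtbl_contacts VALUES
        (1, 1, 'Anna'), (1, 2, 'Steve'), (2, 1, 'Joseph'),
        (3, 1, 'Mimi'), (3, 2, 'Mitchell'), (3, 3, 'Chris');
""")

# No aggregation: the join produces one row per contact, and each CASE
# fills at most one of the three contact columns on that row.
rows = conn.execute("""
    SELECT p.person_key,
           p.person_name,
           CASE WHEN c.contact_priority_order = 1 THEN c.contact_first_name END,
           CASE WHEN c.contact_priority_order = 2 THEN c.contact_first_name END,
           CASE WHEN c.contact_priority_order = 3 THEN c.contact_first_name END
    FROM dtbl_person p
    INNER JOIN mtbl_contacts c ON p.person_key = c.person_key
    ORDER BY p.person_key, c.contact_priority_order
""").fetchall()

for row in rows:
    print(row)
```

The six printed rows (one per contact) mirror the "What it's doing" table: the join itself is fine, but nothing in the query tells the database to combine the rows that share a key.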

Any help on this issue would be much appreciated.

  • I don't have a Snowflake environment, but maybe this sample fiddle helps? Commented Nov 5 at 19:40
  • That is amazing and exactly what I want it to look like. However, I get the following error from including MAX(): Query error: SQL compilation error: [DTBL_Person.Person_NAME] is not a valid group by expression Commented Nov 5 at 19:53
  • @RachelHelm Thanks for the further information. I agree with Dale on both points: edit your question to add this explanation and write your own answer. Then we can remove our comments. Commented Nov 6 at 5:51
  • Any tips on what should go into the explanation? Also, surprisingly, MIN produces the same result. My assumption is that the key plus priority order identifies each cell uniquely, so MIN and MAX see a single value and return the same thing. Commented Nov 6 at 18:48
  • Whether you use MAX, MIN, or even FIRST_VALUE or similar functions does not really matter in your case. Aggregating just "collapses" the separate rows into one per group. Commented Nov 8 at 16:41
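The MIN versus MAX point from the comments can be checked with a small sketch. It assumes SQLite (through Python's sqlite3) in place of Snowflake and a hypothetical helper named pivot; it only uses the contacts sample data. The equivalence holds because each (Person_Key, Contact_Priority_Order) pair occurs at most once, so every aggregate sees at most one non-NULL value.

```python
import sqlite3

# Assumption: SQLite stands in for Snowflake; rows are the question's sample contacts.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mtbl_contacts (person_key INTEGER, contact_priority_order INTEGER,
                                contact_first_name TEXT);
    INSERT INTO mtbl_contacts VALUES
        (1, 1, 'Anna'), (1, 2, 'Steve'), (2, 1, 'Joseph'),
        (3, 1, 'Mimi'), (3, 2, 'Mitchell'), (3, 3, 'Chris');
""")

def pivot(agg):
    """Pivot contacts into one row per key using the given aggregate name.

    Hypothetical helper: agg is pasted into the SQL text, so pass only
    trusted literals such as "MIN" or "MAX".
    """
    sql = f"""
        SELECT person_key,
               {agg}(CASE WHEN contact_priority_order = 1 THEN contact_first_name END),
               {agg}(CASE WHEN contact_priority_order = 2 THEN contact_first_name END),
               {agg}(CASE WHEN contact_priority_order = 3 THEN contact_first_name END)
        FROM mtbl_contacts
        GROUP BY person_key
        ORDER BY person_key
    """
    return conn.execute(sql).fetchall()

# Both aggregates skip NULLs and each group holds one candidate per column.
print(pivot("MIN") == pivot("MAX"))
```

If two contacts ever shared the same key and priority, MIN and MAX would pick different names, so the choice of aggregate only stops mattering while that pair stays unique.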

1 Answer

After much help from @JonasMetzler and @DaleK, the following query worked for Snowflake.

My goal was to get all contacts to display on one row instead of multiple rows for each person_key.

SELECT
    p.PERSON_KEY AS "KEY",
    p.PERSON_NAME AS "Person Name",
    MAX(CASE WHEN c.CONTACT_PRIORITY_ORDER = 1 THEN c.CONTACT_FIRST_NAME END) AS "First Contact First Name",
    MAX(CASE WHEN c.CONTACT_PRIORITY_ORDER = 2 THEN c.CONTACT_FIRST_NAME END) AS "Second Contact First Name",
    MAX(CASE WHEN c.CONTACT_PRIORITY_ORDER = 3 THEN c.CONTACT_FIRST_NAME END) AS "Third Contact First Name"
FROM DTBL_PERSON p
LEFT JOIN MTBL_CONTACTS c
       ON p.PERSON_KEY = c.PERSON_KEY
GROUP BY p.PERSON_KEY, p.PERSON_NAME
ORDER BY p.PERSON_KEY;

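Because multiple contacts can be attached to each key, I used the contact priority order to decide which contact goes in which column. Each CASE expression returns the contact's first name on the row with the matching priority and NULL on every other row. MAX ignores NULLs, so within each group it returns the single non-null name for that column, or NULL when the person has no contact at that priority. GROUP BY defines those groups, here one per person, which delivers the desired single row for each key. Every selected column that is not inside an aggregate also has to appear in GROUP BY; otherwise Snowflake reports "is not a valid group by expression".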
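For anyone without a Snowflake account, here is a minimal sketch of the same conditional aggregation run against the sample data. It assumes SQLite (through Python's sqlite3) as a stand-in for Snowflake, drops the schema prefixes, and omits the column aliases for brevity, so treat it as an illustration of the technique rather than the exact query I run.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for Snowflake; data is the question's sample.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dtbl_person (person_key INTEGER, person_id INTEGER, person_name TEXT);
    CREATE TABLE mtbl_contacts (person_key INTEGER, contact_priority_order INTEGER,
                                contact_first_name TEXT);
    INSERT INTO dtbl_person VALUES
        (1, 4500, 'Person A'), (2, 4501, 'Person B'), (3, 4502, 'Person C');
    INSERT INTO mtbl_contacts VALUES
        (1, 1, 'Anna'), (1, 2, 'Steve'), (2, 1, 'Joseph'),
        (3, 1, 'Mimi'), (3, 2, 'Mitchell'), (3, 3, 'Chris');
""")

# MAX(CASE ...) keeps the one non-NULL name per priority within each person group.
pivoted = conn.execute("""
    SELECT p.person_key,
           p.person_name,
           MAX(CASE WHEN c.contact_priority_order = 1 THEN c.contact_first_name END),
           MAX(CASE WHEN c.contact_priority_order = 2 THEN c.contact_first_name END),
           MAX(CASE WHEN c.contact_priority_order = 3 THEN c.contact_first_name END)
    FROM dtbl_person p
    LEFT JOIN mtbl_contacts c ON p.person_key = c.person_key
    GROUP BY p.person_key, p.person_name
    ORDER BY p.person_key
""").fetchall()

for row in pivoted:
    print(row)
```

The LEFT JOIN means a person with no contacts would still appear, with NULL in all three contact columns.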


2 Comments

That's the same code as you claimed didn't work for you, in a now deleted answer.
At the time it wasn't working because I didn't have the GROUP BY, which is not easy to add in the interface I'm working in and also requires me to delete other lines from my actual query. Not sure why. I added the answer on the advice of those who helped me understand why it wasn't working. I don't know what happened to the previous answer, but someone said it was likely AI generated. Apologies if anything I did was incorrect. I'm not looking for upvotes; I just wanted help with my issue.
