1

I am trying to add a column in a SQL query to use as an alternative ID. The data has this format:

UserID | Value
---------------
1      | 23
2      | 10
1      | 45

I'd like to create another column that holds a different number but still respects the ID uniqueness, like so:

MaskedID | Value
---------------
9      | 23
8      | 10
9      | 45

I've tried using a subquery which creates a table with a random number, but the random number is not staying the same for the first ID:

Select b.Masked, a.value
from table a
left join 
(select distinct(UserID), dbms_random.value(1,100000) as Masked from table) b on a.UserID=b.UserID

But that results in:

MaskedID | Value
---------------
7      | 23
8      | 10
9      | 45

The number of UserIDs may change over time, so the mapping is not something that should be predefined. Would a CTE keep the random numbers from being regenerated in the final table?

  • There are lots of options. But what's the purpose of this new column? Commented Dec 8, 2023 at 2:32
  • DISTINCT is not a function. It acts on an entire row. Commented Dec 8, 2023 at 10:05
  • @MT0 We had this at my job. People kept doing it, so I wrote an email: "... I had a good laugh. The thing about SQL is that you can have as many parentheses as you wish. So if I do SELECT DISTINCT(UG.USERID), (UG.USERDESC), (UG.EMAIL), (UG.ACL) FROM USERS UG it will be just as good. Parentheses act as a separator in SQL, and therefore SELECT DISTINCT(UG.USERID), is the same as SELECT DISTINCT UG.USERID," ... So potentially OP just separated DISTINCT and USERID using parentheses instead of a space. So, who is getting caught here, OP or you? This is a contest. Commented Dec 8, 2023 at 14:10
  • @MT0 it runs just fine for me boss Commented Dec 8, 2023 at 15:21
  • @PaulW I need the information that a specific user may have many rows, but I do not want to be able to trace the userID Commented Dec 8, 2023 at 15:22

4 Answers

1

Try a hash function. It will always return the same value for the same input, but cannot simply be inverted to recover the original value. The oldest and simplest is ORA_HASH:

select ORA_HASH(USERID) Masked, a.value
from table a

This produces a number, but with enough values you could get a collision. Even better is the newer STANDARD_HASH function, which is far less likely to give a collision but produces an alphanumeric (hexadecimal) output:

select STANDARD_HASH(USERID) Masked, a.value
from table a
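If the UserID values come from a small range (sequential integers, say), someone who knows which hash function was used could hash every candidate ID and match the results. A minimal sketch of one way to make that harder, assuming a secret salt string is appended before hashing. The literal 'my-secret-salt' is a made-up placeholder, and table_name / userid / value are the names used in the sample data elsewhere on this page:

```sql
-- Sketch only: 'my-secret-salt' is a hypothetical placeholder; keep the real value secret.
-- TO_CHAR lets the same expression work whether userid is numeric or text.
-- 'SHA256' selects the hash method; STANDARD_HASH uses SHA1 when it is omitted.
SELECT STANDARD_HASH(TO_CHAR(userid) || 'my-secret-salt', 'SHA256') AS maskedid,
       value
FROM   table_name;
```

As long as the salt stays the same, every row for a given user still gets the same masked value on every run.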

1

You can generate a GUID for the first row of each UserId and then use analytic functions to apply the same value to all the other rows with the same UserId:

SELECT MAX(maskedid) OVER (PARTITION BY userid) AS maskedid,
       value
FROM   (
  SELECT CASE ROW_NUMBER() OVER (PARTITION BY userid ORDER BY ROWNUM)
         WHEN 1
         THEN SYS_GUID()
         END AS maskedid,
         userid,
         value
  FROM   table_name
);

Which, for the sample data:

CREATE TABLE table_name (UserID, Value) AS
SELECT 1, 23 FROM DUAL UNION ALL
SELECT 2, 10 FROM DUAL UNION ALL
SELECT 1, 45 FROM DUAL;

May output:

MASKEDID                           | VALUE
------------------------------------------
0x0BFDC951F97909BCE06502163E386F05 | 23
0x0BFDC951F97909BCE06502163E386F05 | 45
0x0BFDC951F97A09BCE06502163E386F05 | 10

You could use any function rather than SYS_GUID() such as DBMS_RANDOM.VALUE:

SELECT MAX(maskedid) OVER (PARTITION BY userid) AS maskedid,
       value
FROM   (
  SELECT CASE ROW_NUMBER() OVER (PARTITION BY userid ORDER BY ROWNUM)
         WHEN 1
         THEN DBMS_RANDOM.VALUE(0, 1e6)
         END AS maskedid,
         userid,
         value
  FROM   table_name
);

Which may randomly output:

MASKEDID                                | VALUE
-----------------------------------------------
84046.38920350070167103410156941056149  | 23
84046.38920350070167103410156941056149  | 45
297835.63547525613908362909300172303338 | 10
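If a whole-number ID like the one in the question is wanted, the random value can be truncated. A sketch of that variation, reusing the 1 to 100000 range from the question; note that with a random integer two different users could by chance draw the same number, so this does not guarantee uniqueness:

```sql
-- Same pattern as above, but TRUNC drops the fractional part.
-- DBMS_RANDOM.VALUE(1, 100000) returns a value >= 1 and < 100000,
-- so the masked ID is an integer from 1 to 99999.
SELECT MAX(maskedid) OVER (PARTITION BY userid) AS maskedid,
       value
FROM   (
  SELECT CASE ROW_NUMBER() OVER (PARTITION BY userid ORDER BY ROWNUM)
         WHEN 1
         THEN TRUNC(DBMS_RANDOM.VALUE(1, 100000))
         END AS maskedid,
         userid,
         value
  FROM   table_name
);
```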


0

Why don't you simply take one static "random" number and do some math on it to get a unique, persistent ID, as follows:

Select 987654321 - a.UserID AS MaskedID, a.value
from table a;

You can also use more complex math, such as multiplication or division, to create the MaskedID (use division with caution, and avoid it if you can).
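A sketch of the multiplication idea, assuming a numeric UserID. The constants 7919 and 104729 are arbitrary values picked for illustration, and table_name is the sample table name used elsewhere on this page:

```sql
-- Sketch only: the constants are arbitrary illustrative values.
-- Multiplying by a non-zero constant and adding an offset maps distinct
-- numeric IDs to distinct results, so uniqueness per user is preserved.
SELECT a.userid * 7919 + 104729 AS maskedid,
       a.value
FROM   table_name a;
```

Anyone who learns the constants can invert the calculation, so this only obscures the ID rather than protecting it, and it does not apply if the ID is a text field.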

1 Comment

It may or may not be a text field. Forgot to mention

0

This will generate a new unique ID for every row, every time the query runs.

select col1, col2, SYS_GUID() uniqueId from table1

I believe you can also use Oracle's ROWID, which is unique to each row:

select col1, col2, ROWID uniqueId from table1

Oracle also has ROWNUM, but it will change with the row order:

select col1, col2, ROWNUM uniqueId from table1

But I believe what you REALLY need here is to take two columns whose combination ("something" + value) is unique, and hash it: select ORA_HASH(col1 || col2)
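Written out as a full query, that suggestion might look like the sketch below. col1, col2 and table1 are the placeholder names from this answer. The '|' delimiter is an addition here, on the assumption that it never occurs in the data: without a delimiter, the pairs ('ab', 'c') and ('a', 'bc') concatenate to the same string and would get the same hash.

```sql
-- Sketch only: assumes '|' does not appear in col1 or col2.
-- ORA_HASH returns a 32-bit number, so collisions remain possible on large tables.
SELECT col1,
       col2,
       ORA_HASH(col1 || '|' || col2) AS uniqueId
FROM   table1;
```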

4 Comments

This will create unique IDs for every row (like my last example) no?
@Rickerz will your new ID need to be persistent for each row? or it can change every time? Do you need to map a particular row to the same unique id?
It can change every time; I only need to be able to 'hide' the original ID while keeping the information that a user has said values. So yes, mapping. My solution's problem was that the random part was somehow being regenerated even for the same ID, where I thought using the subquery would get around that
@Rickerz I believe, what you REALLY need here is to take 2 columns that make "something"+value unique, and hash it - select ORA_HASH(col1 || col2)
