0

I am currently checking a large number of tables for linking keys(all with different names) between datasets. My current code checks for how many records exist and for how many records have a 0 key value. My code looks like this:

%let table = table1;
proc sql;
create table linking_keys as
select * 
from work.my_datasets
where Table_Name = "&table" 
and linking_key NE 'N';
quit;

proc sql;
select Column_Name
into :lnkkeycol separated by ','
from linking_keys;
quit;

%put &lnkkeycol;
proc sql;
create table lnk_key_only as
select count(*)
from library1.&table;
quit;

proc sql;
create table lnk_key_only_2 as
select count(&lnkkeycol)
from library1.&table
where &lnkkeycol = 0;
quit;

This works fine when there is only 1 linking key in the table, but when there is multiple, it falls over. I have tried grouping by macro variables but not had much success with the below:

proc sql;
create table lnk_key_only_2 as
select count(&lnkkeycol)
from library1.&table
where &lnkkeycol = 0
group by &lnkkeycol;
quit;

I am not entirely sure where I am going wrong.

1 Answer 1

1

You need to specify each link key column separately. If you have multiple link key columns, you are passing this into the count() function:

count(col1, col2, col3)

Which is invalid for the count() function. Instead, create a separate macro variable that counts each link key col only when its value is 0.

proc sql;
    select cats('sum( (', column_name, '=0) ) as count_', column_name)
    into :lnkkeycol separated by ','
    from linking_keys;
quit;

This will create individual sum() functions that will only count the condition where column_name = 0. (boolean conditon) is a shortcut in SAS that creates a 1/0 value if the condition is true or false. If we encapsulate it in a sum function, sum( (boolean condition) ), we can sum only the 1's.

proc sql;
    create table lnk_key_only_2 as
    select &lnkkeycol
    from library1.&table;
quit;

This translates into:

proc sql;
    create table lnk_key_only_2 as
       select sum( (col1=0) ) as count_col1
            , sum( (col2=0) ) as count_col2
            , sum( (col3=0) ) as count_col3
    from library1.&table;
quit;
Sign up to request clarification or add additional context in comments.

4 Comments

That would force SAS to invent names for the three count() function result variables. Also if you are counting only the observations that have COL1=0 and COL2=0 and COL3=0 then then all three of the COUNT() results will be the exact same number, the number of observations with all three variables equal to zero.
Probably should change it from AND to OR? I need some sample data to try it out.
Updated it to sum only the boolean condition (coln=0)
Thanks for the solution Stu, I actually figured out a slightly different method a bit before you replied but I appreciate your help again.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.