This might be related to this question, but in my current SAS Viya Stable 2024.12 environment I encounter an empty error message, and the SAS Studio session context breaks, when I try to run a SELECT DISTINCT in a PROC SQL step.
My table was loaded from a Postgres DB into WORK with SELECT *. It has 1.18 million rows; col1 and col3 are of type varchar(64), col2 is of type varchar(256), and the other 27 columns are not of interest. (The .sas7bdat file exported from WORK is about 283 MB.)
If I run
PROC SQL _method;
CREATE TABLE WORK.OUTPUT AS SELECT
t1.col1, t1.col2, t1.col3
FROM WORK.INPUT AS t1;
QUIT;
it works like a charm and takes about 3 seconds, with the following log:
NOTE: SQL execution methods chosen are:
sqxcrta
sqxsrc( WORK.INPUT(alias = T1) )
NOTE: Compressing data set WORK.OUTPUT decreased size by 95.76 percent.
Compressed is 1,192 pages; un-compressed would require 28,144 pages.
NOTE: Table WORK.OUTPUT created, with 1,182,028 rows and 3 columns.
NOTE: PROCEDURE SQL used (Total process time):
real time 1.55 Seconds
user cpu time 1.47 Seconds
system cpu time 0.12 Seconds
memory 6,032.03k
OS Memory 30,448.00k
Timestamp ...
Step Count 12 Switch Count 0
Page Faults 0
Page Reclaims 244
Page Swaps 0
Voluntary Context Switches 29
Involuntary Context Switches 8
Block Input Operations 0
Block Output Operations 153,360
But if I use "SELECT DISTINCT", it raises an empty error and the log stops at:
NOTE: SQL execution methods chosen are:
sqxcrta
sqxunqs
sqxsrc( WORK.INPUT(alias = T1) )
This also happens if I SELECT DISTINCT on only t1.col2 (or on a combination of four other varchar(64) columns...).
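As a sketch, the reduced single-column variant described above looks like this (the output table name OUTPUT_MIN is illustrative):

```sas
/* Reduced reproduction: DISTINCT on only the single wide varchar column.
   On the 2024.12 environment this fails with an empty error message
   right after the "SQL execution methods chosen" note. */
PROC SQL _method;
CREATE TABLE WORK.OUTPUT_MIN AS SELECT DISTINCT
    t1.col2
FROM WORK.INPUT AS t1;
QUIT;
```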
Apparently, the loaded varchar columns have quadrupled in size (256 → 1024), and I observed the following behavior:
- SELECT DISTINCT t1.col2 - Fails
- SELECT DISTINCT PUT(t1.col2, $256.) - Succeeds
- SELECT DISTINCT PUT(t1.col2, $1024.) - Fails
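For reference, the stored column lengths can be checked via DICTIONARY.COLUMNS, and the PUT-based workaround above can be applied inside the full query. This is only a sketch, assuming the real data still fits in 256 bytes:

```sas
/* Check the byte lengths SAS actually stored for the loaded columns;
   on a UTF-8 session a Postgres varchar(256) may come in as 1024 bytes. */
PROC SQL;
    SELECT name, type, length
    FROM dictionary.columns
    WHERE libname = 'WORK' AND memname = 'INPUT'
      AND lowcase(name) IN ('col1', 'col2', 'col3');
QUIT;

/* Workaround sketch: truncate col2 back to 256 bytes before DISTINCT.
   PUT(..., $256.) yields a 256-byte character value, so the DISTINCT
   key stays below whatever limit the 1024-byte version exceeds. */
PROC SQL _method;
    CREATE TABLE WORK.OUTPUT AS SELECT DISTINCT
        t1.col1,
        PUT(t1.col2, $256.) AS col2,
        t1.col3
    FROM WORK.INPUT AS t1;
QUIT;
```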
I tried monitoring and raising the memory limits, as shown below, and I also executed the query on an older preliminary environment with Viya Stable 2024.08, where it works:
| Environment | DISTINCT status | MEMSIZE | MAXMEMQUERY |
|---|---|---|---|
| PreEnv Viya | Succeeds | 2,147,483,648 | 268,435,456 |
| Current Viya Dev | Fails | 4,294,967,296 | 268,435,456 |
| Current Viya Test | Fails | 21,474,836,480 | 268,435,456 |
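The MEMSIZE values in the table can be read out from a session without admin access; a minimal sketch using PROC OPTIONS and GETOPTION (MAXMEMQUERY may be a site-specific setting, so it is left out here):

```sas
/* Print the memory-related system options for the current session. */
PROC OPTIONS OPTION=(MEMSIZE REALMEMSIZE SORTSIZE); RUN;

/* Or grab a single value programmatically: */
%PUT MEMSIZE is %SYSFUNC(GETOPTION(MEMSIZE));
```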
On the 2024.08 environment, the DISTINCT query succeeds with the following log:
NOTE: SQL execution methods chosen are:
sqxcrta
sqxunqs
sqxsrc( WORK.INPUT(alias = T1) )
NOTE: Table WORK.OUTPUT created, with 158457 rows and 3 columns.
NOTE: PROCEDURE SQL used (Total process time):
real time 38.33 Seconds
user cpu time 3.20 Seconds
system cpu time 7.98 Seconds
memory 1,052,242.14k
OS Memory 1,082,636.00k
Timestamp ...
Step Count 16 Switch Count 0
Page Faults 0
Page Reclaims 309,992
Page Swaps 0
Voluntary Context Switches 55,732
Involuntary Context Switches 4,160
Block Input Operations 10,254,992
Block Output Operations 5,322,656
So it does not seem to be a memory issue; maybe it is related to the data compression? I'm not an admin, so I have limited insight into the environment configuration.
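To test the compression hypothesis without admin rights, one could rebuild the input uncompressed and retry the failing query; a sketch (COMPRESS=NO is a standard data set option, and the table name INPUT_NC is illustrative):

```sas
/* Rebuild the staging table with compression explicitly disabled,
   then retry the failing DISTINCT against the uncompressed copy. */
DATA WORK.INPUT_NC (COMPRESS=NO);
    SET WORK.INPUT;
RUN;

PROC SQL _method;
    CREATE TABLE WORK.OUTPUT AS SELECT DISTINCT
        t1.col1, t1.col2, t1.col3
    FROM WORK.INPUT_NC AS t1;
QUIT;
```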
I know I could just use DATA steps or similar, but this is only a minimal example; the error occurs across different tables/flows/steps. In addition, we normally 'program' with SAS Studio flows, which generate their code automatically. However, that generated code also fails when copied into a standalone SAS program, which is what the example represents.
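For completeness, the non-SQL alternative alluded to above could look like this; PROC SORT with NODUPKEY should produce the same result as SELECT DISTINCT on these three columns:

```sas
/* Equivalent deduplication without PROC SQL: NODUPKEY keeps the
   first row per unique (col1, col2, col3) combination. */
PROC SORT DATA=WORK.INPUT (KEEP=col1 col2 col3)
          OUT=WORK.OUTPUT NODUPKEY;
    BY col1 col2 col3;
RUN;
```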
We also expect the data volume to keep growing, so this issue needs to be solved rather than worked around.
Does anyone have an idea which setting/config/property could be causing this issue, and what can be done to prevent or eliminate it?