
What I encountered

I'm programming with Flows in SAS Studio on SAS Viya (cloud). While trying to load large tables from a PostgreSQL database, I encountered the following error combination:

Two error pop-ups appear: one stating that a problem with the SAS session was detected, and a blank one titled "Run Flow".

I narrowed it down to overly large Postgres tables conflicting with the SAS WORK library, as the problem also appears in 'normal' SAS programs with mundane queries.

Basic example - Try it yourself

I tried different things, but it narrows down to these simple steps:

  • With Python I generated data for a dummy DB table with 20 columns (1 ID, 1 date, 3 character, 15 numeric (precision 8 where not integer)) and 2,000,000 rows of random data (a SAS alternative is sketched after this list)
  • I connected the PostgreSQL database to a SAS Studio library using SAS/ACCESS to PostgreSQL
    • I'm no admin, so I can't view detailed logs in the DB and have no details about the environment/database configuration
  • In SAS Studio I created a new SAS program and successfully tested the connection with a PROC SQL:
/* One ID, 10 observations - successful */
PROC SQL OUTOBS=10;
    CREATE TABLE WORK.Test_Select_All AS
    SELECT t1.*
    FROM POSTGRES.RAND_DUMMY_DATA t1
    WHERE t1.id <= 1;
QUIT;
  • Afterwards I removed the filter, which led to the described error/problem:
/* All rows, 10 observations - fails */
PROC SQL OUTOBS=10;
    CREATE TABLE WORK.Test_Select_All AS
    SELECT t1.*
    FROM POSTGRES.RAND_DUMMY_DATA t1;
QUIT;
  • If I use this PROC SQL to create a new table not in WORK but in Postgres itself, it works
  • When only 14 or fewer columns are selected, it works
    • If I select one half (5 info cols with 7 value cols) and then the other half (5 info cols with the other 8 value cols), it works for both (see the second sketch after this list)
  • With 15 or more columns it fails
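
Since the dummy data was originally generated with Python, here is a hedged SAS alternative for the "try it yourself" step. It writes the random table straight into Postgres through the libref; it assumes the POSTGRES library is writable and matches the column layout shown in the PROC CONTENTS below, and the seed, character patterns, and value ranges are purely illustrative:

/* Sketch: 2,000,000 rows of random dummy data, written through the libref.
   Assumes write access to the POSTGRES library; adjust ranges as needed. */
data POSTGRES.RAND_DUMMY_DATA;
    call streaminit(42);                          /* reproducible randomness */
    length char_1 char_2 $4 char_3 $12;
    array val[15] val_1-val_15;
    format date datetime25.6;
    do id = 1 to 2000000;
        /* random datetime within roughly four years */
        date = dhms('01JAN2020'd + rand('integer', 0, 1460), 0, 0,
                    rand('integer', 0, 86399));
        char_1 = cats('A', put(rand('integer', 0, 999), z3.));
        char_2 = cats('B', put(rand('integer', 0, 999), z3.));
        char_3 = cats('GRP_', put(rand('integer', 0, 99999999), z8.));
        do _i = 1 to 15;
            val[_i] = round(rand('uniform') * 1e6, 1e-8);  /* 'precision 8' assumption */
        end;
        output;
    end;
    drop _i;
run;

Writing two million rows through the libname engine row by row can be slow; the bulk-load option mentioned below may speed this up.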
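
And a sketch of the split test from the last bullets, using the column names from the PROC CONTENTS output below:

/* Half 1: 5 info columns + val_1-val_7 (12 columns) - works */
PROC SQL OUTOBS=10;
    CREATE TABLE WORK.Test_Half_1 AS
    SELECT t1.id, t1.date, t1.char_1, t1.char_2, t1.char_3,
           t1.val_1, t1.val_2, t1.val_3, t1.val_4, t1.val_5, t1.val_6, t1.val_7
    FROM POSTGRES.RAND_DUMMY_DATA t1;
QUIT;

/* Half 2: 5 info columns + val_8-val_15 (13 columns) - works */
PROC SQL OUTOBS=10;
    CREATE TABLE WORK.Test_Half_2 AS
    SELECT t1.id, t1.date, t1.char_1, t1.char_2, t1.char_3,
           t1.val_8, t1.val_9, t1.val_10, t1.val_11, t1.val_12,
           t1.val_13, t1.val_14, t1.val_15
    FROM POSTGRES.RAND_DUMMY_DATA t1;
QUIT;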

Regarding the connection:

  • I tried the 'Use DBMS specific tool/interface to retrieve data' and 'Enable bulk load' options; neither helped
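
For reference, a hedged sketch of an equivalent coded LIBNAME statement: server, database, and credentials are placeholders, and mapping the 'Enable bulk load' checkbox to the BULKLOAD= engine option is my assumption ('Use DBMS specific tool/interface to retrieve data' has no single obvious option equivalent, so it is omitted):

/* Assumed code equivalent of the UI connection options (placeholders throughout) */
libname POSTGRES postgres
    server="your-pg-host" port=5432 database="your_db"
    user="your_user" password="your_pw"
    bulkload=yes;  /* 'Enable bulk load'; did not help here either */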

PROC CONTENTS of DB Table

Data Set Name        POSTGRES.RAND_DUMMY_DATA    Observations             .
Member Type          DATA                        Variables               20
Engine               POSTGRES                    Indexes                  0
Created              .                           Observation Length       0
Last Modified        .                           Deleted Observations     0
Protection                                       Compressed              NO
Data Set Type                                    Sorted                  NO
Label
Data Representation  Default
Encoding             Default
 #  Variable  Type  Len  Format        Informat      Label
 2  char_1    Char    4  $4.           $4.           char_1
 3  char_2    Char    4  $4.           $4.           char_2
 4  char_3    Char   12  $12.          $12.          char_3
 1  date      Num     8  DATETIME25.6  DATETIME25.6  date
20  id        Num     8                              id
 5  val_1     Num     8                              val_1
 6  val_2     Num     8                              val_2
 7  val_3     Num     8                              val_3
 8  val_4     Num     8                              val_4
 9  val_5     Num     8                              val_5
10  val_6     Num     8                              val_6
11  val_7     Num     8                              val_7
12  val_8     Num     8                              val_8
13  val_9     Num     8                              val_9
14  val_10    Num     8                              val_10
15  val_11    Num     8                              val_11
16  val_12    Num     8                              val_12
17  val_13    Num     8                              val_13
18  val_14    Num     8                              val_14
19  val_15    Num     8                              val_15

What I expected

I expected the DB table to be loaded into the WORK library without any errors, since runs with less data worked fine.

Update - Connection Issue?

After I added CONOPTS="UseDeclareFetch=1;FetchSize=250000" to my LIBNAME statement, the minimal example I posted works. For unrelated reasons I currently can't test the more complex/original flows.
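
For reference, a sketch of the full LIBNAME statement with that option added; the connection details are placeholders:

/* Batched fetching via ODBC-style connection options */
libname POSTGRES postgres
    server="your-pg-host" port=5432 database="your_db"
    user="your_user" password="your_pw"
    CONOPTS="UseDeclareFetch=1;FetchSize=250000";

With UseDeclareFetch=1 the driver retrieves rows through a cursor in FetchSize-row batches instead of materializing the whole result set at once, which fits the memory hypothesis below.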

I suspect it's a RAM issue (or similar) in the workspace, and that I need to tune the DB connection further so it doesn't kill the process. Or request more RAM, I guess. I'll keep you updated.

Comments
  • If you click CANCEL, do you get a SAS log with error messages from the PROC SQL step? Commented Jan 6 at 17:59
  • Run proc contents on your Postgres table (POSTGRES.RAND_DUMMY_DATA). Is Postgres one of those databases that defines variable length character variables that can hold up to tens or hundreds of thousands of bytes of data? What is the LENGTH for the character variables as SAS sees them? Also what is your current setting for the COMPRESS system option? Commented Jan 6 at 18:31
  • Have you tested whether it is a specific column that is causing the problem? Your test data has 19 columns. If you read 10 columns in one step and then read the remaining 9 columns in another step, they both run fine? Commented Jan 6 at 21:33
  • Turn on SAS trace debug options to see what is being attempted by the library engine, something like options sastrace='d,,d,d' sastraceloc=saslog nostsuffix;. You might be able to use some of the information in a different tool (such as DBeaver) to explore what might be causing SAS issues. Commented Jan 7 at 0:31
  • Thank you for your comments! # C1 from Quentin # Unfortunately no. There are preprocessing/data step comments of /* region: Generated Preambel */, but no further log/info from the SAS Studio compute context. # C2 from Tom # Thanks for that, I added the info to the post. # C3 from Quentin # I tried that; both work. # C4 from Richard # Thanks for the info. I now get a bit more info: the log stops with these last rows: POSTGRES: pggbfie, Set statement's SQL_BIND_TYPE to 344 POSTGRES: Enter pggeti, table is RAND_DUMMY_DATA, statement 0, connection 7 POSTGRES: ENTER SQLExecute Commented Jan 7 at 11:33

1 Answer


I found a fix for this, even if I don't quite understand the details.

It seems to be a memory/caching problem: with our default settings, everything was fetched at once. The following helped us prevent this from happening:

1. Connection Options

Add the connection options that fetch the data in batches to the library definition, like:

libname YOURNAME libdef='defSource' CONOPTS="UseDeclareFetch=1;FetchSize=500000";

2. Environment Variables

Have your admin set the fetch behavior in the environment variables SAS_PG_USEDECLAREFETCH and SAS_PG_FETCHSIZE, with values like the ones above.
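
If an admin change isn't immediately possible, a per-session workaround may be to set the same variables with OPTIONS SET= before assigning the library. This is a sketch under the assumption that the engine reads these variables at libref assignment time; verify it in your deployment:

/* Hedged sketch: set the engine's environment variables for the current session */
options set=SAS_PG_USEDECLAREFETCH 1;
options set=SAS_PG_FETCHSIZE 500000;
/* then assign (or re-assign) the Postgres library so the engine picks them up */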
