Database: Oracle 11g. Server: GNU/Linux, Bash shell.

I have developed a shell script that uses sqlplus to connect to the database, select each row from a table, and update a column with a value.

I designed it this way because the table held very little data, but it has now grown to 500K rows. Selecting and updating one record at a time will obviously take a long time for 500K rows.

Is there a way to run the script in parallel so that each instance picks up a unique record and updates it, without two parallel instances updating the same row?

2
  • In theory, you should be able to do this with a single query, although it's hard to say because you didn't share your data structure or current script. Commented Oct 24, 2013 at 16:34
  • @CharlieMartin I will try to post the code; however, the logic is simple: select dd_no from staging_table where seq_num = &1; then do something on dd_no and call update staging_table set dd_no = '${dd_no}' where seq_num = &1; Hope this answers your question. Waiting to hear from you. Commented Oct 24, 2013 at 17:56

2 Answers

2

You could have one script that takes in one or more parameters and updates one row. You could then have another script that calls the first script iteratively in the background. For instance:

updateRow.sh

#!/bin/bash
firstParameter=$1
secondParameter=$2
# ...and so on

# Update table based on input

updateTable.sh

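#!/bin/bash
# N is the number of rows (or batches) to process
for i in $(seq 1 "$N")
do
    # redirect output first, then send the job to the background
    "$WORKING_DIR"/updateRow.sh <param1> <param2> >> /path/to/log/file 2>&1 &
done
# block until every background job has finished
wait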

You could of course come up with different logic to do the same thing. Be careful that the script instances running in parallel do not attempt to update the same row.
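One way to keep parallel instances off each other's rows is to give each one its own slice of the key. The sketch below is only an illustration, assuming seq_num runs from 1 to the row count without large gaps; the function name partition_ranges and the worker count are made up for this example. It prints one non-overlapping "start end" pair per worker, which a worker could use in a WHERE seq_num BETWEEN start AND end clause.

```shell
#!/bin/bash
# Sketch: split seq_num values 1..total into contiguous, non-overlapping
# chunks, one per worker. Names and numbers are illustrative assumptions.

# Usage: partition_ranges TOTAL_ROWS WORKERS
# Prints one "start end" pair per line.
partition_ranges() {
    local total=$1 workers=$2
    local chunk=$(( (total + workers - 1) / workers ))   # ceiling division
    local start end
    for (( start = 1; start <= total; start += chunk )); do
        end=$(( start + chunk - 1 ))
        if (( end > total )); then
            end=$total
        fi
        echo "$start $end"
    done
}

# Example: 500000 rows shared between 4 workers.
partition_ranges 500000 4
```

If seq_num has gaps, a predicate such as MOD(seq_num, workers) = k (with k from 0 to workers - 1) gives the same guarantee that no two workers touch the same row.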


1 Comment

To avoid overloading the server and mixing of output you can use GNU Parallel: parallel $WORKING_DIR/updateRow.sh arg1 {} ::: {1..100}
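Where GNU Parallel is not installed, xargs -P can throttle the number of concurrent jobs in a similar way. This is a minimal sketch under assumptions: process_row is a placeholder for updateRow.sh and does no database work, and MAX_JOBS is an arbitrary limit chosen for the example.

```shell
#!/bin/bash
# Sketch: run at most MAX_JOBS workers at a time with xargs -P.
# process_row stands in for updateRow.sh (hypothetical placeholder).

MAX_JOBS=4

process_row() {
    # A real worker would call sqlplus here; this one only reports its input.
    echo "processed seq_num=$1"
}
export -f process_row   # make the function visible to the child bash processes

# Usage: run_throttled FIRST LAST
run_throttled() {
    local first=$1 last=$2
    seq "$first" "$last" |
        xargs -P "$MAX_JOBS" -I{} bash -c 'process_row "$1"' _ {}
}

# Jobs finish in no particular order, so sort for a readable report.
run_throttled 1 8 | sort -t= -k2 -n
```

Unlike GNU Parallel, xargs does not group each job's output, so long outputs from different jobs can interleave.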
0

One of the nice things about Oracle databases is that you can use PL/SQL (Procedural Language/SQL), which is well suited to bulk data changes like this. I'm not positive that I completely understand your example, but I think your script would look something like this:

spool name-of-log.log

SET SERVEROUTPUT ON
SET DEFINE OFF
SET SCAN OFF

-- Output the current schema and execution time for logging purposes
SELECT USER
  || ' @ '
  || GLOBAL_NAME
  || '    '
  || TO_CHAR(SYSDATE, 'dd-MON-yy hh24:MI:ss') AS ENVIRONMENT
FROM global_name;

-- now your procedure..
DECLARE
  -- declare any necessary variables (none needed in this example)
BEGIN
  FOR i IN
  (SELECT dd_no, seq_num
  FROM staging_table)
  LOOP
  -- do something on i.dd_no, then..
  -- static SQL binds i.dd_no and i.seq_num, so string values need no quoting
     UPDATE staging_table SET dd_no = i.dd_no WHERE seq_num = i.seq_num;
  END LOOP;
END;
/

spool off;

Then just execute your script with sqlplus from your shell script, or run it at the SQL*Plus prompt:

SQL> @my-script-name.sql

In theory, this will be faster than calling multiple shell scripts, because everything runs in one database session instead of one sqlplus connection per row.
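The single-session idea can still apply when the per-row work has to happen outside the database: export the rows once, transform them in the shell, and write all the updates into one SQL file. The sketch below assumes the rows were exported as "seq_num|dd_no" lines; decrypt_value is a hypothetical stand-in (it only applies ROT13) for whatever external tool does the real work, and generate_updates is a name invented for this example.

```shell
#!/bin/bash
# Sketch: turn "seq_num|dd_no" lines into one SQL script so that all
# updates run in a single sqlplus session. Helper names are illustrative.

# Hypothetical stand-in for the external call; ROT13 is NOT real decryption.
decrypt_value() {
    printf '%s' "$1" | tr 'A-Za-z' 'N-ZA-Mn-za-m'
}

# Read "seq_num|value" lines on stdin, write UPDATE statements to stdout.
generate_updates() {
    local seq_num value plain q="'"
    while IFS='|' read -r seq_num value; do
        # Skip lines whose key is not a plain integer.
        [[ $seq_num =~ ^[0-9]+$ ]] || continue
        plain=$(decrypt_value "$value")
        plain=${plain//$q/$q$q}   # double single quotes for a SQL literal
        printf "UPDATE staging_table SET dd_no = '%s' WHERE seq_num = %s;\n" \
            "$plain" "$seq_num"
    done
    echo "COMMIT;"
}

# Example input: two exported rows.
printf '%s\n' '101|uryyb' "102|b'k" | generate_updates
```

In practice the output would be redirected to a file (for example generate_updates < rows.txt > updates.sql) and run once with @updates.sql. SET DEFINE OFF, as in the script above, keeps SQL*Plus from treating an & inside a value as a substitution variable.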

8 Comments

There is a problem here! I could have done that in PL/SQL, but the problem lies at the line "do something on i.dd_no". I cannot do anything with i.dd_no in SQL, because I have to decrypt dd_no, which is not possible directly in PL/SQL. I have to call a third-party application from the shell script to decrypt it; the result is stored in $dd_no and used in the update statement.
Sorry, I didn't realize the 'do something' was something that can't be done in PL/SQL. I wouldn't underestimate PL/SQL though. There are lots of PL/SQL functions out there for encryption/decryption, plus you can always write your own if you have to. You can easily find algorithmic examples of the more popular methods. This was the first result in a quick Google search: oracleflash.com/41/…
I agree with you, but this is application-level encryption. Although it uses AES, it cannot be done in SQL. I double-checked with the application architect here, and they said we have to call APIs to decrypt. Perl would have helped, but Perl DBI is not installed on production.
Another factor is that I'm not good at Java :). I'm good at C, but running update statements from C would mean using Pro*C or OCI, which is time-consuming, so I fell back to a shell script, and now I see this performance issue.
This means you'll be hitting the API endpoint 500k times in the matter of seconds that your script is running. Maybe they'll make it possible when you start crashing their server :)