I have an array stored in the site options in WordPress which will potentially have 3000 email addresses in it. Every night I need to process this list with some checks on those email addresses and I'm worried that if it fails I could be in trouble.

I decided for safety I should do this in batches. At midnight I create a scheduled event which runs every 2 minutes in batches of 50.

Currently, in the function, this is what happens. I make a new array of 50 items which I then use in a loop to run the various processing bits I need to do.

function tw_process_daily_list_chunk(){

   $tw = get_option("tw_settings");
   $daily_list_chunk = array_splice($tw["daily_list"], -50, count($tw["daily_list"]));

   foreach ($daily_list_chunk as $playeremail){
      do_various_things();
   }

   if (count($tw["daily_list"]) == 0 ){
      do_stop_processing_stuff();
   }
}

I'm now concerned that if this fails, the next run of the scheduled event will assume the last batch was processed when it might not have been. Is there any downside to working directly on the larger array from get_option, and then updating that option as each item is processed instead?



function tw_process_daily_list_chunk(){

   $tw = get_option("tw_settings");
   $chunksize = 50;
   $iteration = 0;

   foreach ($tw["daily_list"] as $playeremail){
      do_various_things();

      // Remove the processed address and save the shorter list straight away.
      $key = array_search($playeremail, $tw["daily_list"]);
      if ($key !== false) {
         unset($tw["daily_list"][$key]);
      }
      update_option("tw_settings", $tw);

      $iteration++;
      if ($iteration == $chunksize){
         break;
      }
   }

   if (count($tw["daily_list"]) == 0 ){
      do_stop_processing_stuff();
   }
}
  • "and then also updating updating that option instead?" - what exactly do you want to update in that array? Are you keeping the info whether an individual item was already (correctly) processed in there, or what? Or are you talking about reducing the array, by removing the processed items? (If so, then who will fill that array again, before the run on the next day?) Commented Jan 12, 2024 at 14:07
  • The array is populated via other mechanisms on the site. It starts as a full list of users subscribed to an event; if they haven't done a specific action that day, then we have to take some action on them (remove them from a mailing list, remove a life or take them off the scheme altogether). Once that list is fully processed, it's refreshed for the following day. I mean updating the array in its original location via update_option. I need to know if there's a cost for accessing and updating that data in a loop. Commented Jan 12, 2024 at 14:13
  • Of course there is a bit of "cost", because any update requires a database operation. Commented Jan 12, 2024 at 14:28

1 Answer

You just need to change what you are doing ever so slightly, and then you can process the entire array efficiently at once, assuming do_various_things() isn't too involved.

Do not call update_option inside the loop!

Instead, before the loop, create an empty array (maybe called $processed_array). Inside the loop, add the cleaned values to this array (or skip values that shouldn't be kept). AFTER the loop completes, call update_option once and pass in your processed array.

After this slight rewrite to your function, if you are worried about this process timing out, you should comment out the update_option line and test it to see how long it takes. If do_various_things() is just some string comparisons and if/then statements, this should run very, very quickly.
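
As a rough sketch of the "collect in the loop, write once afterwards" idea, something like the following might work. The function name tw_process_chunk_once, the $handler callback and the $chunk_size parameter are made up for illustration, and the small get_option/update_option stand-ins are only there so the snippet can run outside WordPress.

```php
<?php
// Stand-ins for the WordPress option functions (assumption: only used when
// running outside WordPress; inside WordPress the real ones are used).
if (!function_exists('get_option')) {
    $GLOBALS['demo_options'] = [];
    function get_option($name, $default = false) {
        return $GLOBALS['demo_options'][$name] ?? $default;
    }
    function update_option($name, $value) {
        $GLOBALS['demo_options'][$name] = $value;
        return true;
    }
}

// Hypothetical helper: processes up to $chunk_size addresses and writes the
// option back once. $handler stands in for do_various_things() and is
// assumed to return true when an address was handled successfully.
function tw_process_chunk_once(callable $handler, $chunk_size = 50) {
    $tw   = get_option('tw_settings', ['daily_list' => []]);
    $list = array_values($tw['daily_list']);

    // Everything beyond this batch is carried over untouched.
    $remaining = array_slice($list, $chunk_size);

    foreach (array_slice($list, 0, $chunk_size) as $email) {
        if (!$handler($email)) {
            // Keep addresses that could not be handled so a later run retries them.
            $remaining[] = $email;
        }
    }

    // Single database write, after the loop.
    $tw['daily_list'] = $remaining;
    update_option('tw_settings', $tw);

    return count($remaining);
}
```

One trade-off to be aware of: if the request dies partway through the loop, nothing is saved, so the next run starts the same batch again. That is only safe if handling an address twice does no harm.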

5 Comments

Thanks. The problem is I can't really test in advance. The processing does call other APIs as well, so I can't really be assured of the speed of that either!
Calling other APIs? Ooof! Action Scheduler is a friend you need to make: wordpress.org/plugins/action-scheduler This allows you to trigger hooks with unique data, so you can trigger multiple hooks and send a portion of the data to be processed for each hook.
Yes, that's exactly what it does. I've set batches of 50 but it could be any size. I "decided for safety I should do this in batches. At midnight I create a scheduled event which runs every 2 minutes in batches of 50."
To answer your question then, there is no downside to accessing the larger array via get_option. There is a downside to calling update_option inside of the for loop, so I would update the array from the get_option call inside the loop and then call update_option once, after the loop, as I described. If you are worried about a batch failing to complete successfully, add logic to detect and log the error, delete all pending actions, and then either retry/reschedule the failed action and cancelled ones, or do nothing and investigate the failure.
A flow that might benefit you here is to create an action only for the first batch of 50, and then in the callback function for this action, if everything has gone well, you create the action for the next 50, scheduled to begin shortly.
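
The chained flow from the last comment could look roughly like this. It is only a sketch: the hook name tw_process_batch, the function tw_run_batch_and_chain, the $handler callback and the 120-second delay are assumptions, and the stand-in functions exist only so the snippet runs outside WordPress. as_schedule_single_action() is Action Scheduler's function for queuing a one-off action.

```php
<?php
// Stand-ins so the sketch runs without WordPress or Action Scheduler loaded.
if (!function_exists('get_option')) {
    $GLOBALS['demo_options'] = [];
    function get_option($name, $default = false) {
        return $GLOBALS['demo_options'][$name] ?? $default;
    }
    function update_option($name, $value) {
        $GLOBALS['demo_options'][$name] = $value;
        return true;
    }
}
if (!function_exists('as_schedule_single_action')) {
    $GLOBALS['demo_scheduled'] = [];
    function as_schedule_single_action($timestamp, $hook, $args = [], $group = '') {
        $GLOBALS['demo_scheduled'][] = [$timestamp, $hook, $args];
        return count($GLOBALS['demo_scheduled']);
    }
}

const TW_BATCH_HOOK = 'tw_process_batch'; // hypothetical hook name

// Processes one batch, saves the shortened list, and only then queues the
// next batch. Returns true if another batch was scheduled.
function tw_run_batch_and_chain(callable $handler, $chunk_size = 50, $delay = 120) {
    $tw   = get_option('tw_settings', ['daily_list' => []]);
    $list = array_values($tw['daily_list']);

    foreach (array_slice($list, 0, $chunk_size) as $email) {
        // An exception thrown here stops the run before anything is saved
        // or scheduled, so the chain halts instead of skipping a batch.
        $handler($email);
    }

    $tw['daily_list'] = array_slice($list, $chunk_size);
    update_option('tw_settings', $tw);

    if (count($tw['daily_list']) > 0) {
        as_schedule_single_action(time() + $delay, TW_BATCH_HOOK);
        return true;
    }
    return false; // list is empty: nothing more to schedule
}
```

In WordPress the callback would be attached to the hook with add_action, and the first batch would be queued once at midnight. Because the next action is only created after a batch finishes cleanly, a failed batch stops the chain and leaves the list as it was for investigation or a retry.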
