
I have converted the WildflowerSearch.org website, which runs on Google App Engine, from Python 2.7 to Python 3.11. Google requires the conversion to Python 3 by February of 2024 in order to continue to develop the website. The conversion was painful but successful. However, the Python 3 version of the website crashes after running for about an hour. The crash is nonsensical and not repeatable, and it takes about an hour of user input to trigger, which makes it very difficult for me to discover the problem. I have been working on this problem for weeks and do not know how to proceed.

The exact crash message is "UnboundLocalError: cannot access local variable xxx where it is not associated with a value". The variable is a 16k-long bytes string, and it is not always the same string. After the variable is assigned a value, I have code that tests it to be sure it has been assigned by accessing a byte at both ends of the bytes string. Then the program enters a loop where each byte is accessed. Nothing inside this loop assigns to the variable. The loop adds items to a list, which uses memory. While in the loop, the variable changes from a defined bytes string to having no value, but only after the website has been running for about an hour.
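For reference, here is a minimal sketch of two ordinary ways Python 3 raises this exact error for a local that was previously assigned. It is not taken from the website's code, and the function and variable names are hypothetical. The first case is a Python 2 to Python 3 difference: a name bound by `except ... as name` is deleted when the except clause ends, even if the same name held a value before the `try`. Whether either case applies here is unknown.

```python
def unbound_after_except():
    # Hypothetical example: `data` is assigned, then reused as the
    # target of `except ... as data`.
    data = b"\x00" * 16
    try:
        raise ValueError("boom")
    except ValueError as data:
        pass
    # Python 3 implicitly deletes `data` when the except clause ends,
    # so the local is unbound again at this point.
    return data[0]


def unbound_after_del():
    # Hypothetical example: an explicit `del` unbinds the local.
    data = b"\x00" * 16
    del data
    return data[0]


def raises_unbound(func):
    """Return True if calling func raises UnboundLocalError."""
    try:
        func()
    except UnboundLocalError:
        return True
    return False


if __name__ == "__main__":
    print(raises_unbound(unbound_after_except))
    print(raises_unbound(unbound_after_del))
```

In both functions the variable passes any check made right after the assignment and only fails later, which is similar on the surface to the behavior described above.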

The symptoms suggest that the use of memory by the loop triggers garbage collection. The garbage collection routine clobbered the bytes string because at some time in the past the free memory was goofed up.

Before entering the loop the program acquires a few 16k long bytes strings from NDB memory. The Python 2 version uses asynchronous NDB access which I removed in case the bug was related to the asynchronous operation. I also added code to add a byte to the end of each bytes string so that while in the loop the original NDB string is not being used. I added code to verify that each bytes string could be accessed before entering the loop. (While in the loop the bytes strings are not modified.) None of these changes eliminated the problem.

I disabled parts of the program that were not necessary for the normal operation of the website. This assured me that the problem was not due to a bug in those parts of the program.

Periodically, the website is bombarded with nonsense from malicious programs. I have not noticed this sort of activity prior to the website crashing.

I looked for a way to access the gc() garbage collection routines in Google's Python 3 but could not discover how this is done. Forcing garbage collection each time the website is accessed might expose the problem more quickly once free memory is goofed up, which would help locate the URL that causes the problem.
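In standard CPython, garbage collection is exposed through the standard-library `gc` module. The following is a minimal sketch, assuming the App Engine Python 3 runtime behaves like a local CPython 3.11 in this respect (not verified here). The helper name `force_collection` is hypothetical.

```python
import gc


def force_collection():
    """Run a full garbage collection and report basic statistics."""
    # gc.collect() runs a full collection and returns the number of
    # unreachable objects it found.
    unreachable = gc.collect()
    return {
        "unreachable": unreachable,
        # Current collection counts for generations 0, 1 and 2.
        "counts": gc.get_count(),
        "enabled": gc.isenabled(),
    }


if __name__ == "__main__":
    print(force_collection())
```

In a Flask app, one way to run this on every access would be to call the helper from a function registered with `@app.before_request` and log the result. That wiring is an assumption about how the site is structured.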

It seems most likely that the problem is related either to Flask or NDB and possibly how my program interacts with those modules.

The Python 2 version never crashes.

I changed the version numbers in the requirements.txt file to access the latest versions. This did not fix the problem.

Another thing I tried was to reload, from NDB, the contents of the variable that suddenly became undefined. I used "try:" and added the code that reloaded the variable from NDB in the "except:" section. After around an hour the program crashed, first with the variable going undefined and then again in the except: section because the NDB key had also gone undefined. In my experience, when free memory gets corrupted lots of variables get trashed and everything goes to pot.

At this time of year there are around 5,000 search requests to the website per day. It seems likely that around 200 search requests happen before the website crashes. Unlike simple problems, repeating the search requests that occurred before the crash does not cause a crash, and those search requests are completed correctly.

  • This could be due to 'new' instances being created (and is dependent on how you wrote your code). See if this stackoverflow response gives you an idea of how to fix yours. Python 3 typically spawns more instances than Python 2, which could be why you didn't see the issue in Python 2. Commented Dec 25, 2023 at 21:45
  • Thank you for thinking about my problem. app.yaml has max_instances: 1. Also, the dashboard shows a single instance. But even with multiple instances the website should not crash. Commented Dec 25, 2023 at 22:10
  • You said the error is UnboundLocalError which implies it's about variable scoping. If you read the response I linked to, it explains how spawning of a new instance can lead to 'issues' in state of variables depending on your code. However, you've indicated that your app.yaml is set to a max_instance of 1 so that should rule it out. Do you have an entrypoint in your app.yaml file? If so, what is it? Commented Dec 25, 2023 at 23:55
  • I don't have an entrypoint. This is what Google says: "If you do not specify entrypoint for the Python 3 runtime, App Engine configures and starts the Gunicorn webserver." Therefore, Gunicorn (whatever that is) is started. Anyway, the website works. Commented Dec 26, 2023 at 0:21
  • The best way to narrow down your problem is for you to post the code snippet which covers the variable that is erroring out. Commented Dec 26, 2023 at 1:46
