Yesterday and today we further investigated the last issue reported in #57352, which stated:
even the new, Python-based hwsAna software interrupts the images acquisition and/or the wavefronts generation after one or two iterations, for reasons that are not clear (especially now that Matlab is out of the picture); this has to be studied offline as well.
Our tests did not involve Metatron, which was already working (the other issue reported in that entry has not been addressed yet); they involved only the chain PyHWS (online VPM process) -> hwsAna.py (local script).
We found that PyHWS called the hwsAna script through a p = subprocess.Popen call that, although supposedly asynchronous and run in a separate shell (shell=True), ended abruptly after a number of seconds, killing the child process (hwsAna). We first added a p.wait() instruction to the cm handler, so as to wait explicitly for the child process to end before moving on; this let hwsAna survive but killed the PyHWS process, since p.wait() is clearly a blocking instruction.
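The blocking nature of p.wait() can be reproduced in isolation. The following is only a minimal sketch, not the actual PyHWS code: the child is a stand-in that sleeps for half a second (a hypothetical replacement for hwsAna.py), and the names CHILD and start_child are illustrative assumptions.

```python
import subprocess
import sys

# Stand-in child process (assumption): sleeps briefly instead of running hwsAna.py
CHILD = [sys.executable, "-c", "import time; time.sleep(0.5)"]


def start_child():
    """Start the child without waiting for it; Popen returns immediately."""
    return subprocess.Popen(CHILD)


p = start_child()

# poll() returns None while the child is still running
running_after_start = p.poll() is None

# wait() blocks the caller until the child exits, then returns its exit code
exit_code = p.wait()

# After wait(), the child has terminated and poll() reports its exit code
finished_after_wait = p.poll() is not None
```

While wait() is pending the caller cannot do anything else, which is consistent with PyHWS no longer surviving once the instruction was added to its handler.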
In the end, we changed the cm handler of PyHWS, which now uses a more basic os.system() call, with the script explicitly launched in the background (using the standard '&'). With this change, both PyHWS and hwsAna survive.
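For reference, the background launch described above can be sketched as follows. This is an illustration under stated assumptions, not the actual handler: the helper names build_background_command and launch_in_background are hypothetical, as is the way the script path is passed in. It assumes a POSIX shell, where a trailing '&' makes the shell return at once.

```python
import os
import shlex


def build_background_command(argv):
    """Join argv into a shell command line ending with '&' (background job)."""
    quoted = " ".join(shlex.quote(part) for part in argv)
    return quoted + " &"


def launch_in_background(argv):
    """Launch argv through os.system() without waiting for it to finish.

    Because of the trailing '&', the shell returns immediately, so the
    value returned here is the shell's status, not the child's exit code.
    """
    return os.system(build_background_command(argv))


# Hypothetical usage (the script path is an assumption):
# launch_in_background(["python3", "hwsAna.py"])
```

A consequence of this approach is that the caller no longer has a handle on the child: its exit code and its lifetime have to be checked by other means (for instance by looking at the process list).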
We ran a few tests and all were successful except the last, long one, where a new, different kind of issue arose: we got up to wavefront acquisition #90, where the HWS (the DET one) stopped acquiring images at number 42 (of 100), and then neither generated the wavefront nor resumed the acquisition. Despite this new issue, the hwsAna process was still alive on servertcs1. At the same time, the HWS was not visible online (via the tcsImage.py script).
We decided to kill the hwsAna process, but even after that the HWS was still not visible online.
We will continue to study this issue offline.