In the last few days we have had (again) three occurrences of a rather obscure problem, where the management of ITF_LOCK by ITF_CONDITIONS (the Autorelock & Autoscience node) conflicts with a leftover request from the last time ITF_LOCK was managed by CALI; in these cases the leftover is the last state requested during Monday's calibration shift, LOCKED_PRWI.
These spurious requests appear after an unlock, since that is when something is requested from the node again; as stated, they are not the cause of the unlock itself.
Looking at the unlock from this morning, the SDB1 fast shutter closes for some reason at 06:10:09.93 UTC; the signals based on SDB* are cut off immediately, while the other ones stay alive a little longer. Everything is supposedly locked, and both the automation and the fast global control are up at the time. The fast LSC_ENABLE flag goes down at 06:10:10.03 UTC, so apparently not as a consequence of a cm command. ITF_LOCK enters the DOWN state at 06:10:11 UTC sharp from the DAQ/SMS point of view, but actually at 06:10:13.00 UTC according to the logs.
Looking at the zoomed plots of the event, we can see that the fast shutter behaviour was triggered by a glitch in either DARM or the DARM/end-mirror correction signals, preceding the B1 saturation and the shutter closing.
From the automation point of view, the weird request is seen by ITF_LOCK, and I suspect it happens more frequently than expected; we only notice it when it arrives last, i.e. when it is the request that sticks, while the first one (the good one, in this case) is overwritten:
2025-07-03T06:10:13.008Z ITF_LOCK JUMP target: DOWN
2025-07-03T06:10:13.008Z ITF_LOCK [LOW_NOISE_3.exit]
2025-07-03T06:10:13.008Z ITF_LOCK STALLED
2025-07-03T06:10:13.537Z ITF_LOCK JUMP: LOW_NOISE_3->DOWN
2025-07-03T06:10:13.537Z ITF_LOCK calculating path: DOWN->LOW_NOISE_3
2025-07-03T06:10:13.544Z ITF_LOCK new target: FMODERR_TUNED
2025-07-03T06:10:13.554Z ITF_LOCK executing state: DOWN (1)
2025-07-03T06:10:13.555Z ITF_LOCK [DOWN.enter]
2025-07-03T06:10:13.575Z USERINPUT REQUEST: olserver52.virgo.infn.it virgorun old=LOW_NOISE_3 new=LOW_NOISE_3
2025-07-03T06:10:13.585Z USERINPUT REQUEST: olserver52.virgo.infn.it virgod old=LOW_NOISE_3 new=LOCKED_PRWI
2025-07-03T06:10:13.659Z ITF_LOCK in DOWN main
2025-07-03T06:10:13.674Z ITF_LOCK ['ITF_LOCK', 'DRMI_3F', 'CARM_SET_STEP1', 'CARM_SET_STEP2', 'CARM_SET_STEP3', 'CARM_NULL_3F', 'CARM_NULL_1F', 'DITHERING', 'SHUTTER', 'DC_READOUT', 'LOW_NOISE_1', 'LOW_NOISE_2', 'LOW_NOISE_3']
2025-07-03T06:10:14.008Z ITF_LOCK REQUEST: LOCKED_PRWI
It is also seen by CALI, and this is what should not happen: in the DOWN state of CALI we explicitly release ITF_LOCK from management (nodes.release()), and apparently the release does take place, at least from the GUI point of view (see the sketch after this log excerpt):
2025-07-03-06h10m13-UTC>INFO...-Unstalling ITF_LOCK
2025-07-03-06h10m13-UTC>WARNING-[DOWN.run] ezca: V1:GRD-ITF_LOCK_REQUEST => LOCKED_PRWI
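For reference, the structure involved is roughly the following; this is a minimal Guardian-style sketch, not the actual CALI code, and everything except ITF_LOCK and LOCKED_PRWI is an illustrative assumption:

from guardian import GuardState, NodeManager

# ITF_LOCK is managed by CALI only for the duration of the calibration shift
nodes = NodeManager(['ITF_LOCK'])

class DOWN(GuardState):
    index = 1

    def main(self):
        # Hand ITF_LOCK back, so that ITF_CONDITIONS
        # (Autorelock/Autoscience) can manage it again
        nodes.release()
        return True

    def run(self):
        # Hypothesis under investigation: if a stall-recovery path still
        # acts here on a stale cached request (LOCKED_PRWI), it would
        # produce the "ezca: V1:GRD-ITF_LOCK_REQUEST => LOCKED_PRWI"
        # lines seen in the CALI log above
        return True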
In some cases, like yesterday, this happens with CALI complaining about connection errors:
2025-07-02-11h52m19-UTC>INFO...-Memory Used increase, cur. 2411364.00(KB), inc. 8.00(KB)
2025-07-02-12h56m41-UTC>INFO...-Unstalling ITF_LOCK
2025-07-02-12h56m41-UTC>WARNING-[DOWN.run] ezca: V1:GRD-ITF_LOCK_REQUEST => LOCKED_PRWI
2025-07-02-12h56m41-UTC>WARNING-[DOWN.run] USERMSG 0: EZCA CONNECTION ERROR: Any Other Error: Invalid REQUEST: LOCKED_PRWI
2025-07-02-12h59m41-UTC>INFO...-Memory Used increase, cur. 2411368.00(KB), inc. 4.00(KB)
2025-07-02-12h59m43-UTC>INFO...-Memory Used increase, cur. 2411520.00(KB), inc. 152.00(KB)
I will look at the problem in more detail as soon as possible. One possibility is the way the "unstalling" of nodes works, as it could be the actor that tries to re-instate a management that no longer exists, but this is still unclear; a possible mechanism is sketched below. Another possibility is that we perform the un-management incorrectly in some way (this is the only place where we un-manage nodes).
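To make the first hypothesis concrete, the suspected failure mode would be something like the following. This is a hypothetical reconstruction, not the real Guardian internals: node.STALLED, cached_request and the channel naming are assumptions for illustration only.

def unstall(node, ezca, cached_request):
    # Re-assert the manager's request on a stalled node.
    # If this runs after nodes.release() but still holds the old cached
    # request (LOCKED_PRWI), it would write it back to the node's REQUEST
    # channel even though CALI no longer manages ITF_LOCK.
    if node.STALLED:
        # would correspond to the "Unstalling ITF_LOCK" line in the CALI log
        ezca['GRD-%s_REQUEST' % node.name] = cached_request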