Virgo Logbook

AdV-DET (Commissioning)

flaminio - 8:45 Tuesday 04 May 2021 (51633)

SDB2 - SDB1 issue

Yesterday I was informed by Matteo Tacca that the SDB2 suspension is stuck.
I checked the data on the VIM and I found that the SDB2 suspension underwent a large motion on April the 29th.
I also noticed that the data from the SDB1 suspension had disappeared the previous day on April the 28th. These data reappeared on the following day.
The shock on the SDB2 suspension coincides with the reappearance of the data from the SDB1 suspension.
I informed Romain Gouaty about this coincidence.
We concluded that the shock on SDB2 is likely due to the SDB2 - SDB1 tracking system being confused by something happening on the SDB1 suspension.

Comments to this report:

bertolini, bonnand, bulten, tacca - 9:34 Tuesday 04 May 2021 (51635)

The suspension underwent large oscillations until the loops opened (see plot). Yesterday we tried to recover it but we did not succeed.
Here is a list of the findings:

-Motor#0 appears to be at the end of its range: it cannot move further LEFT but can come back RIGHT
-Motor#1 appears to be jammed at the end of its range: it cannot move further LEFT or RIGHT
-Motor#2 appears to be at the end of its range: it cannot move further LEFT but can come back RIGHT

All correction springs appear to be completely pulled to one side in the attempt to correct for the large TY of F0, which stays >+4500urad.
This is quite surprising, considering that before last week's accident the TY free oscillation position was -60 urad.

We managed to close all loops at:
LC-Z=-280um (nominal is -190um)
LC-X=+380um (nominal is ~0)
LC-Y=-8 (nominal but not stable, see below)
LC-TY=1100urad (nominal)
LC-TX=-220urad (nominal)
LC-TZ=-100urad (right now not very stable and significantly off from the original setpoint=+24urad)

Right now it is not possible to retrieve the original setpoint in X; the top stage TY setpoint is not good.
In the open-loop spectrum of LVDT_F0_y there is a clear peak slightly above 2Hz which is not normal; this indicates
that something wrong (possibly a mechanical short) is happening at the level of F1.

We have decided to vent the minitower and inspect bench and F1. The intervention is already ongoing.

Images attached to this comment

ruggi - 9:35 Tuesday 04 May 2021 (51636)

The disappearance of the data from the SDB1 suspension is one event in a long list of 'similar' problems that occurred on the suspension electronics in the last period. For most of them, the common source appeared to be bad data received from outside. In some cases, the trigger was known. In others, like the events that occurred on SDB1, PR, WI last week, they were considered simply a side effect of the break for the installations, and of the consequent increased activity on the DAQ.

This is the point of view of a user, ignorant of electronics, who recovered some of those pathological situations by removing the NAN with a download. A deeper analysis is recommended.

bertolini, tacca, vardaro, vacuum - 12:37 Tuesday 04 May 2021 (51638)

The SDB2 minitower has been vented early this morning. The suspension has been inspected and an issue with F1 was found (cable touching a stop) and fixed. The jammed motor#1 has been manually unblocked and it's now working correctly. All horizontal correction springs have been re-centered at mid-range. Motor#5 (yaw DC positioning) has been used to reduce the correction on the LC-TY. Excess torque on F0_TY has also been recovered.
Tx and Tz have been balanced by slightly shifting the ballast masses on the bench. LC controls have been tested in air and work correctly with low correction torques. SBE loops have also been closed in air and operate correctly as well. Therefore we decided to close the tank again and proceed with pumping to recover normal operation.

bertolini, tacca, vardaro, vacuum - 15:25 Tuesday 04 May 2021 (51641)

The SDB2 minitower was pumped down over lunch time and the suspension was rebalanced right after. All loops closed smoothly at the original set-points. The bench is now available for commissioning activities. Meanwhile, additional automatic safety procedures will be implemented to prevent accidents like the April 24th one from happening again.

bonnand, masserot - 18:10 Tuesday 04 May 2021 (51649)

When the SDB2 tower was opened, glitches occurred on the 100MHz Timing channel used to slave the SDB2 Daq boxes, triggering timing errors. To recover the running conditions (timing errors at zero), the SDB2 Daq boxes were reconfigured at 2021-05-04-15h52m20-UTC and the rtpc5 DAC driver servers were restarted.

bertolini, bonnand, tacca - 22:04 Tuesday 04 May 2021 (51651)

Unfortunately the issues with SDB2 are not completely over. The vertical correction started saturating quickly, revealing a problem at the level of F0 (see plot).
F0 looks stuck vertically at its current position. Investigation is ongoing to understand the cause. After some inconclusive tests we decided to temporarily disable the vertical control. We left the bench controlled in all DoF except Y. A new venting of the minitower is necessary to fix this problem; the time of this action will be decided compatibly with the planned ISC activities.

Images attached to this comment

bertolini, gherardini, magazzu, tacca - 14:48 Thursday 06 May 2021 (51684)

The minitower was vented last night and opened this morning. Visual inspection of the suspension chain did not show any issue. Then, when pulling the F0-F1 wire up and down, we found that the mechanical range of the F0 keystone was limited on the upper side to ~+850 um instead of the expected +3500um; in the downward direction the keystone could move freely down to -3500um (end-stop). We carefully inspected the LVDT/actuator block of F0 without spotting anything wrong. By shaking the F0 keystone we managed to recover part of the positive mechanical range, which now reaches +1600um. The vertical axis is freely moving and oscillating over the whole -3500um to +1600um range, which is more than sufficient for the correct operation of the suspension. Further investigation of the limited vertical range would require disassembling the LVDT/actuator block in place, which is a major and very risky intervention. We therefore decided to re-close the tank and operate the suspension in the current condition. Meanwhile we will monitor the operation of the suspension and get suitable tooling ready for an intervention, should it become necessary. In-vacuum, closed-loop operation was restored around 1PM.

flaminio - 7:43 Sunday 09 May 2021 (51656)

My previous entry contains some errors on dates, so let me correct it.
I will take this occasion to add more information that I collected since Tuesday.

First the dates.
The data from SAT/SDB1 disappeared on April the 27th night (see plot).
They re-appeared on April the 28th afternoon (see plot).

The cause of the disappearance of the data is not known yet. The elog entry (https://logbook.virgo-gw.eu/virgo/?r=51636) suggested that this might be caused by some bad data received by the SAT/SDB1 control system. So I asked Alain about this. He told me that no data are sent to SAT/SDB1 from the global control.

Instead, the re-appearance of the data on April the 28th is now clear. It is due to a reboot of the SAT/SDB1 crate performed at that time by Valerio. Unfortunately this manual reboot was not reported in the elog and the operators were not informed about it.

The re-appearance of the SAT/SDB1 data caused the shock to the SBE/SDB2 suspension because the SBE/SDB2 suspension tracks the SAT/SDB1 suspension. The guardian on the SBE/SDB2 suspension was not able to deal with it.
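
The mechanism described above (a tracked signal that vanishes and later comes back at a different value) can be illustrated with a minimal sketch. This is not the actual SBE guardian or tracking code: the function name `guarded_reference`, the `max_step` parameter and the sample values are all hypothetical. They only show how holding the last valid value and limiting the step per sample would turn a sudden jump into a ramp.

```python
import math

def guarded_reference(samples, max_step):
    """Return a slew-rate-limited copy of a tracked signal.

    Missing samples (None) and non-finite values hold the last output;
    when data come back, the output moves by at most max_step per sample.
    Hypothetical illustration, not the real tracking code.
    """
    out = []
    last = None
    for s in samples:
        valid = s is not None and math.isfinite(s)
        if last is None:
            # No output yet: start from the first sample if valid, else 0.0
            last = s if valid else 0.0
        elif valid:
            # Clamp the change between consecutive outputs to +/- max_step
            step = max(-max_step, min(max_step, s - last))
            last = last + step
        # Invalid sample: 'last' is left unchanged (hold)
        out.append(last)
    return out

# Made-up samples: data disappear, then return at a very different value
tracked = [0.0, 1.0, None, None, float("nan"), 50.0, 50.0, 50.0]
print(guarded_reference(tracked, max_step=10.0))
```

With these made-up numbers the output holds at 1.0 while the data are missing and then ramps in steps of 10 instead of jumping to 50.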

In conclusion there were several problems. Among these:
1) The failure in the SAT/SDB1 control system whose cause remains to be investigated.
2) The breaking of the rule according to which interventions on the detector have to be reported to the operators before they are done and on the elog after they are done.
3) The SBE/SDB2 guardian did not manage to avoid the shock to the suspension.

The recovery of SBE/SDB2 was completed on May the 6th (https://logbook.virgo-gw.eu/virgo/?r=51684)

Images attached to this comment

ruggi - 14:31 Sunday 09 May 2021 (51711)

In my previous comment I wrote 'bad data received from outside'. Again, for this specific event this mechanism is not demonstrated (it was for other events, similar in the final effect). But it is possible, given the architecture of the suspension controls. The SDB1 top stage control is involved in the complex network of Global Inverted Pendulum Control. It receives data 'from outside', namely from the PR top stage master board via the TOLM system. PR receives data from several boards, and in particular data which are connected to the global control output towards the suspensions (GIPC is the control architecture which puts the locking force in loop for top stage controls).

If a NAN enters a board in one position of this network, it is able in principle to reach boards quite far away, even if GIPC is not turned on, because a switch at zero is not able to stop it (NAN*0=NAN, in our codes). This is not simply theoretically possible: it happened several times.
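
The NAN*0=NAN behaviour is standard IEEE 754 floating-point arithmetic and can be reproduced in a few lines. The sketch below is only an illustration in Python, not the suspension DSP code; the function names `switched_output` and `guarded_output` are hypothetical, and the guard shown is just one possible way of stopping a NAN at a switch.

```python
import math

def switched_output(value, switch):
    # Plain multiplicative switch: switch=0 is meant to block the signal
    return value * switch

def guarded_output(value, switch):
    # Same switch, but a non-finite input is replaced by 0.0 first
    # (hypothetical guard, not taken from the suspension codes)
    if not math.isfinite(value):
        value = 0.0
    return value * switch

nan = float("nan")
print(switched_output(1.5, 0))  # a finite value is blocked: 0.0
print(switched_output(nan, 0))  # a NAN passes through the switch: nan
print(guarded_output(nan, 0))   # the guard stops it: 0.0
```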

Now I must say again that there is no evidence that this is the mechanism that put the SDB1 suspension control board in a state that required a reboot. I'm just saying that, if somebody wants to investigate that event, the bad data received from outside has to be considered a trail to follow.

masserot - 16:45 Wednesday 12 May 2021 (51764)

The SUSP_SBE_LC server propagates some Sa channels to some SBE controls (SNEB, SWEB and SDB2). Here is the logfile content of this server:

when the Sa_OB data were lost:
2021-04-27-03h33m44-UTC>WARNING-AcAdcChCheck> Sa_OB_F0_Z - start delayed or missing at GPS1303529641-970704850
when the Sa_OB data came back:
2021-04-28-14h24m08-UTC>WARNING-AcAdcChCheck> Sa_OB_F0_Z - Sa_OB_F0_Z delayed or missing from GPS1303529641-970704850 for more or less 125424(s) - nLoop 1254230294@10000Hz
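
The numbers in these two log lines can be cross-checked against each other. The sketch below is an illustration only: it assumes that the leading timestamps are UTC (as their suffix says), that nLoop counts cycles at the quoted 10000Hz, and that GPS time is ahead of UTC by 18 s, which is the offset valid since 2017. The variable names are chosen here and do not come from the server.

```python
from datetime import datetime, timedelta, timezone

# Timestamps copied from the two log lines (UTC)
FMT = "%Y-%m-%d-%Hh%Mm%S"
lost = datetime.strptime("2021-04-27-03h33m44", FMT).replace(tzinfo=timezone.utc)
back = datetime.strptime("2021-04-28-14h24m08", FMT).replace(tzinfo=timezone.utc)
gap_s = (back - lost).total_seconds()

# Loop counter and rate quoted in the second log line
n_loop = 1254230294
loop_rate_hz = 10000
gap_from_loops_s = n_loop / loop_rate_hz

# GPS seconds from the log converted to UTC (assumed GPS-UTC offset: 18 s)
GPS_EPOCH = datetime(1980, 1, 6, tzinfo=timezone.utc)
GPS_MINUS_UTC_S = 18
gps_start = 1303529641
utc_start = GPS_EPOCH + timedelta(seconds=gps_start - GPS_MINUS_UTC_S)

print(gap_s)             # gap between the two log timestamps, in seconds
print(gap_from_loops_s)  # gap implied by the loop counter, in seconds
print(utc_start)         # UTC time of the integer GPS second in the log
```

Under these assumptions the timestamp difference is 125424 s, matching the duration quoted in the log, and the loop counter gives about 125423 s.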

The following plots show the LSC_{PR,BS,NI,WI,NE,WE}_CORR channels sent by the LSC_Acl server to each ITF tower and the forwarded ones, if any, as SC_{PR,BS,NE,WE}_MIR_LSC_CORR. The same data sent to the Sc DSP are sent to the DAQ too.

when the Sa_OB data disappeared
when the Sa_OB went back

During these events the ITF was unlocked; as a consequence the corrections were at zero and the corrections forwarded by the Sc DSPs remained at zero, so no NAN was forwarded by the Sc DSPs to the DAQ.

Images attached to this comment

ruggi - 18:41 Wednesday 12 May 2021 (51767)

It could be useful to analyse another event as well, which occurred on April 28 and affected the PR and WI top stage controls at the same time. In this case the data were not permanently lost as for SDB1, so it could be something totally different. In that case, the PR and WI loops stopped working at the same time, and their re-activation required the removal of a NAN from some variable. The source is not known, but I think we can say that, at least for one of the two, the problem arrived from outside. This doesn't mean 'global control' (LSC_CORR appeared to be zero also in that case), or DAQ. Thinking about the data distribution was a simple answer I gave to myself, given that something similar (in its effect) happened days before, in coincidence with a timing problem triggered by an ADC failure at WE (in that case Alain gave an advance warning about possible drawbacks on GIPC data). The second hint was the fact that a DAQ activity was ongoing (according to the operator).

I'm not making any diagnosis: I am just giving information which I forgot to write in the logbook at the right moment.

Images attached to this comment