2026-04-02
The DAQ recovery, after the computing shutdown, started on 2026-04-02-18h28m45-UTC, once the CmNames server and the VPM server had been restarted and all the expected hosts (rtpc, olserver and stol) were alive.
We first restarted all the storage and data collection parts
- raw stream on stol01 and stol02 hosts
- rawfull, rds and trend streams on stol03 host
- the data collection parts running on
  - the olserver52 and olserver53 hosts
  - the rtpcs (TolmFrameBuilder)
- operations performed between 2026-04-02-18h28m48-UTC and 2026-04-02-18h38m44-UTC
As everything seemed to run correctly, we continued by restarting all the servers involved
- in the data access part
- in the Detector Monitoring System
- in the Automation part
  - The PyHVAC server was unable to restart due to
    - the missing /dev/shm/zFbsIng input SHM
    - and later the missing input channels V1:HVAC_INJ_TE_OUT_SET
    - the server fully recovered on 2026-04-07-08h27m37-UTC
- in the Image readout
- operations performed between 2026-04-02-18h39m15-UTC and 2026-04-02-19h08m44-UTC
We left the whole DAQ running for 1 hour to check that everything ran smoothly, thanks to the availability of the DMS and VIM web pages.
Then we continued by
- reconfiguring all the Tolm devices and restarting all the rtpc servers without closing the loops, except for those that are automatically closed by the ACL server.
- restarting the Telescreen servers
- restarting the environment servers in the Detector's Environment Monitoring part
- restarting the Newtonian Noise servers
- operations performed between 2026-04-02-20h10m49-UTC and 2026-04-02-21h28m35-UTC
We left everything running overnight to perform a more thorough check the next day.
2026-04-03
We found that many Ethernet devices (lnfs, sqz, power supplies) were no longer reachable.
For the DAQ/DET Ethernet devices
- the DaqBoxes were reachable
- the SBD power supplies were not reachable.
- the SDB internal switches were reachable
- the SWEB QD RD482-ETH bridge was not reachable, meaning the B8_QD{1,2} shutters and Vbias could no longer be remotely driven or monitored
Thanks to the local support, remote monitoring/driving of the SBD power supplies was recovered.
To recover remote monitoring/driving of B8_QD{1,2},
- the SWEB_PowerUnit12 was turned off/on, and fortunately this was enough to recover the device.
- as the internal Timing mezzanine was switched OFF/ON too, the SWEB bench DBoxes were reconfigured
- to recover the correct running conditions, the SWEB rack DBoxes were reconfigured too and the Acl SWEB_rtpc servers were restarted
- operations performed between 2026-04-03-09h19m33-UTC and 2026-04-03-09h37m46-UTC
Easter break
During the Easter break, olserver115 became unreachable around 2026-04-03-19h34m38-UTC. The MdVim server was temporarily restarted on olserver118 at 2026-04-04-04h20m30-UTC.
The olserver115 host has been recovered by the IT department, and the MdVim server has been running on its own host again since 2026-04-07-06h32m25-UTC.
2026-04-07
All the Fbs servers were restarted with a new release v8r24p0
- operations performed between 2026-04-07-15h03m20-UTC and 2026-04-07-15h08m29-UTC
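For readers who want to know how long each recovery step took, the operation windows quoted in this entry can be turned into durations. The sketch below is only an illustration: it assumes the timestamps always follow the YYYY-MM-DD-HHhMMmSS-UTC pattern used above, and the names parse_stamp, window_duration and the labels in WINDOWS are hypothetical, not part of any DAQ tool.

```python
from datetime import datetime, timedelta, timezone

# Assumed pattern of the logbook timestamps, e.g. 2026-04-02-18h28m45-UTC
STAMP_FORMAT = "%Y-%m-%d-%Hh%Mm%S-UTC"


def parse_stamp(stamp: str) -> datetime:
    """Convert a logbook timestamp into a timezone-aware UTC datetime."""
    return datetime.strptime(stamp, STAMP_FORMAT).replace(tzinfo=timezone.utc)


def window_duration(start: str, end: str) -> timedelta:
    """Return the elapsed time between two logbook timestamps."""
    return parse_stamp(end) - parse_stamp(start)


# Operation windows copied from this entry; the labels are illustrative only.
WINDOWS = {
    "storage and data collection": ("2026-04-02-18h28m48-UTC", "2026-04-02-18h38m44-UTC"),
    "data access, DMS, automation, image readout": ("2026-04-02-18h39m15-UTC", "2026-04-02-19h08m44-UTC"),
    "Tolm, rtpc, Telescreen, environment, Newtonian Noise": ("2026-04-02-20h10m49-UTC", "2026-04-02-21h28m35-UTC"),
    "SWEB recovery": ("2026-04-03-09h19m33-UTC", "2026-04-03-09h37m46-UTC"),
    "Fbs restart": ("2026-04-07-15h03m20-UTC", "2026-04-07-15h08m29-UTC"),
}

if __name__ == "__main__":
    for label, (start, end) in WINDOWS.items():
        print(f"{label}: {window_duration(start, end)}")
```

The datetimes are made timezone-aware so that they can be compared safely with timestamps coming from other UTC sources.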