As all the Automation servers and the Reconstruction servers are stopped, only a few servers remain that provide data to the frame merger FbmAlp:
- 3 Slow frame builders: FbsDet, FbsAlp and FbsVac
- 2 python servers: PyDAQ and PyHVAC
- and FbmMain
This plot shows the FbmMain latency and those of its frame providers, where we can see that its latency is mainly due to the FbsDet one and sometimes to the PyDAQ one.
PyDAQ computes the latency of the different data streams (RAW, RAW_FULL, RDS, TREND, RAW_BACK). It uses the DAQ only as a trigger to perform the stream latency computation and to make the result available in the DAQ. To reduce its latency, it can be connected to an upstream server:
- FbmFE from 2020-04-07-14h20 to 2020-04-07-15h05
- FbsMoni since 2020-04-07-15h05. The FbsMoni configuration has been modified to provide a frame stream output in the /dev/shm/VirgoOnline directory.
The improvement in the PyDAQ latency can be seen in this plot.
This plot shows the relationship between the FbsDet latency and the FbmAlp one. Thanks to the monitoring performed by the Fbs server, the culprits were easily identified: the SDB power monitoring servers, the SIB2, SDB2 and SNEB ones.
Using the advance request facility for these servers, it was possible to reduce the FbsDet latency by 1 s and to make the FbmAlp latency independent of the FbsDet one. The operation was done around 2020-04-06-21h00UTC (see this plot).
During this operation the RdsFbm_20 server crashed and was not restarted quickly; as a consequence, 1020 s of data are missing in the RDS stream:
2020-04-06-21h11m25-UTC>WARNING-FdIOGetFrame: miss 1020 seconds between 1270241640.0 and 1270242660.0
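The size of the gap can be cross-checked directly from the GPS times in the warning above. The following is a minimal sketch: the regular expression and the `parse_gap` helper are hypothetical, written only to match the single log line quoted here, and are not part of any Fd library tool.

```python
import re

# The warning exactly as quoted in this entry.
LOG_LINE = ("2020-04-06-21h11m25-UTC>WARNING-FdIOGetFrame: "
            "miss 1020 seconds between 1270241640.0 and 1270242660.0")

# Hypothetical pattern: it only assumes the "miss N seconds between A and B" wording.
GAP_RE = re.compile(
    r"miss (\d+) seconds between (\d+(?:\.\d+)?) and (\d+(?:\.\d+)?)"
)


def parse_gap(line):
    """Return (reported_seconds, gps_start, gps_end), or None if no gap is reported."""
    match = GAP_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)), float(match.group(2)), float(match.group(3))


reported, gps_start, gps_end = parse_gap(LOG_LINE)
computed = gps_end - gps_start  # gap length recomputed from the GPS boundaries
print(f"reported gap: {reported} s, computed gap: {computed:.0f} s")
```

Both values agree at 1020 s, i.e. 17 minutes of missing RDS data.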
This plot shows the latency of the different FbmMain frame providers, mainly the 50Hz processing servers, where one can see that the FbmMain latency is due to the 50Hz processing of the Injection channels (INJ_50Hz server).
It appears that the INJ_50Hz server processes a lot of channels sampled at high frequency (Fs > 100 kHz), whose names end with the "_FS" suffix.
Removing the channels stored only in the RAW_FULL stream from the online 50Hz computation reduces the latency at the FbmMain level by 0.2 s.
This operation was done on 2020-04-07-08h50UTC (plot).
This plot shows:
- the online latency close to 2s at FbmAlp level without Automation servers running
- the online data flow, around 63MB/s
- the RAW_FULL stream flow at 42MB/s, and not 83MB/s as reported in the logbook entry 48896
- the RAW stream flow at 18MB/s
As a consequence, for the online disk buffers:
- raw_back: 36 TB, so 24 days with an input data flow of 18 MB/s
- raw_full: 50 TB, so 14 days with an input data flow of 42 MB/s
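The buffer depths above follow from dividing the buffer capacity by the input data flow. A minimal sketch of that arithmetic is given below; it assumes decimal units (1 TB = 10^6 MB) and a constant input rate, and the `retention_days` helper is hypothetical.

```python
SECONDS_PER_DAY = 86_400


def retention_days(capacity_tb, rate_mb_per_s):
    """Days of data a buffer can hold at a constant input rate."""
    capacity_mb = capacity_tb * 1_000_000  # assumption: decimal units, 1 TB = 10^6 MB
    return capacity_mb / rate_mb_per_s / SECONDS_PER_DAY


# (capacity in TB, input flow in MB/s), values taken from the list above
buffers = {"raw_back": (36, 18), "raw_full": (50, 42)}

for name, (capacity_tb, rate) in buffers.items():
    print(f"{name}: {retention_days(capacity_tb, rate):.1f} days")
```

With these assumptions the sketch gives roughly 23 days for raw_back and 14 days for raw_full; the exact figures depend on the unit convention used for the disk capacity and on the actual mean data flow.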
For the processing servers before the Automation level, if they do not use the 50Hz channels, it is better that they take their data from FbmFE or from an upstream server with a /dev/shm facility.
Maybe all the PyALP servers in the Automation VPM subsystem could use at least FbmFE instead of FbmMain.
On 2020-04-07-12h04UTC the PyHVAC server was restarted with the FbsMoni /dev/shm as frame input.