On-call intervention (General)
Oncall-system - 9:18 Friday 22 August 2025 (67536)
On-call intervention

The following report has been submitted to the On-call interface.

On-call events -> Network

Title: EGO firewall issue

Author(s): kraja, cortese

Called at: 00:50, 22-08-2025, by: Sposito
Remote intervention: Started: 00:50, 22-08-2025; Ended: 03:45, 22-08-2025
On-site intervention: Started: ; Ended:
Status: Action pending
Operator when issue resolved: Sposito

Details:

The call was originated by the operator on duty (G. Sposito) at approximately 00:50 LT, who reported a problem with the connection to EGO services (Captive portal, DMS site, Mail, Operators) from remote.
After an initial analysis, S. Cortese was also involved (called at approximately 01:10 LT).
The analysis revealed that the problem affects the EGO firewall, more precisely one of its two nodes. Once we were certain that only one node was involved, we relocated the services to the second node, and the situation returned to normal at approximately 03:40 LT.
Tests have revealed that the problem on the first node persists, so further investigations will be carried out in the coming days to understand the type of problem encountered.
During this issue, only external connections were affected, including the various EGO/Virgo websites (e.g., logbook, vmd).
There was no impact on the interferometer network. In fact, for the entire duration of the issue, the interferometer remained locked in science mode.
In addition, the operator was able to connect via VPN and open ThinLinc sessions as usual for the normal duty activities.

* Note that any files attached to this report are available in the On-call interface.

Comments to this report:
carbognani, seder - 16:10 Friday 22 August 2025 (67538)

Among the external network connections, the Low Latency Data Transfer via Kafka was also affected, resulting in Virgo data missing on the CIT side.

Early signs of the problem were reported by the Iciga2 monitoring at the top level, which flagged the Virgo Low Latency machines as not pingable:

lowlatency-virgo is DOWN

Host check output:

PING CRITICAL - Packet loss = 80%, RTA = 8635.61 ms

Notification type: PROBLEM
Date time: Thu Aug 21 20:18:11 2025 UTC

Then the process in the LowLatencyAnalysis VPM dedicated to monitoring the Cascina -> CIT link (V1KafkaCITIn) accurately reported this data loss:

2025-08-22-01h23m36-UTC>WARNING-Miss 11408 seconds between 1439849472 and 1439860880
2025-08-22-01h23m36-UTC>INFO...-CfgReachState> Active(Active) Ok

At the moment, V1KafkaCITIn reports these errors as warnings (typically we can have interruptions of a few seconds that are managed by the internal Kafka mechanism). We intend to modify the process so that it goes into an error state (and triggers DMS notifications, for better monitoring) when the interruption lasts longer than a certain time, to be set as a parameter of the process.
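The planned warning-to-error escalation could be sketched as below. This is a minimal illustration, not the actual V1KafkaCITIn implementation: the threshold name and the `classify_gap` helper are hypothetical, and the threshold value is an arbitrary placeholder for the parameter to be tuned.

```python
# Hypothetical sketch of the proposed escalation logic: short Kafka-stream
# interruptions stay at WARNING (handled by Kafka's internal mechanism),
# while gaps longer than a configurable threshold become ERROR and would
# trigger a DMS notification. Names and threshold are illustrative only.

GAP_ERROR_THRESHOLD_S = 60  # process parameter, value to be tuned


def classify_gap(gps_start: int, gps_end: int,
                 threshold_s: int = GAP_ERROR_THRESHOLD_S) -> str:
    """Classify a data gap between two GPS times as WARNING or ERROR."""
    gap_s = gps_end - gps_start
    return "ERROR" if gap_s > threshold_s else "WARNING"


# The 11408 s gap reported in this incident would be classified as ERROR:
print(classify_gap(1439849472, 1439860880))  # -> ERROR
```

With such a parameter, the few-second interruptions mentioned above would still be logged as warnings, while an outage of this scale would immediately surface on the DMS.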

Note that the data loss also occurred for the incoming data, but only on the LLO link, as reported by the writing process L1KafkaCasIn:

2025-08-22 01h21m13 UTC FdIOGetFrame: miss 11413 seconds between 1439849472.0 and 1439860885.0

2025-08-22 01h21m13 UTC Input frames are back; gps=1439860885 latency=6.6
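The "miss N seconds between A and B" lines above are internally consistent (the reported count equals the GPS interval). A small hedged helper, assuming only the log format shown in the excerpts, could parse and cross-check such lines:

```python
import re

# Illustrative parser for the "miss N seconds between A and B" lines
# emitted by the frame-reading processes (format taken from the log
# excerpts above; the helper name is hypothetical).

GAP_RE = re.compile(
    r"[Mm]iss\s+(\d+)\s+seconds\s+between\s+([\d.]+)\s+and\s+([\d.]+)"
)


def parse_gap(line: str):
    """Return (missed_s, gps_start, gps_end, consistent) or None.

    `consistent` is True when the reported missed-seconds count matches
    the GPS interval end - start.
    """
    m = GAP_RE.search(line)
    if m is None:
        return None
    missed = int(m.group(1))
    start, end = float(m.group(2)), float(m.group(3))
    return missed, start, end, missed == int(end - start)


line = ("2025-08-22 01h21m13 UTC FdIOGetFrame: "
        "miss 11413 seconds between 1439849472.0 and 1439860885.0")
print(parse_gap(line))  # (11413, 1439849472.0, 1439860885.0, True)
```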

The LHO link was not affected. This may be due to the complexity of the network outage generated by the firewall.
