IT infrastructure (Data Storage)
cortese, kraja - 14:47 Tuesday 31 January 2023 (58640)
Test of main fileserver fs01 performance during HA actions on underlying VMware hypervisor

To find the cause of the few seconds of data loss last Wednesday ( https://logbook.virgo-gw.eu/virgo/?r=58547 ) on the olserver52 DAQ framebuilder, we repeated the actions on the VMware Fault Tolerance (FT) infrastructure that we had performed last Tuesday and Wednesday.

The filesystems that could in principle be affected are: /olusers, /virgoData, /virgoLog, /virgoApp, /virgoDev.

Chronology - baseline status is with FT paused

9:25 LT: Fault Tolerance resume started (a replica of the fs01 disk image is built transparently)

...  no appreciable effect observed on NFS client performance

9:57 LT: Fault Tolerance active (the disk image replica is complete and a standby replica of the fs01 VM is synchronized in lockstep in real time); a glitch occurs in the olserver52 network output flux

...  the load on fs01 increases by a factor of 5, the olserver53 CPU iowait increases from 0 to 1.25%, and there is no effect on the stolxx writing servers

10:45 LT: Fault Tolerance is paused and left in this state (the fs01 standby VM and disk replica are destroyed); a second glitch appears in the olserver52 network output flux
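
For reference, a CPU iowait percentage such as the one quoted for olserver53 can be derived on any Linux host from two samples of the aggregate "cpu" line of /proc/stat. The sketch below is only an illustration: the report does not state which monitoring tool produced the 1.25% figure, and the function name iowait_percent and the sample counter values are hypothetical.

```python
def iowait_percent(before: str, after: str) -> float:
    """Percentage of CPU time spent in iowait between two samples of the
    aggregate 'cpu' line of /proc/stat (hypothetical helper, not the
    tool used for the plots in this report)."""

    def fields(line: str) -> list[int]:
        parts = line.split()
        if not parts or parts[0] != "cpu":
            raise ValueError("expected the aggregate 'cpu' line of /proc/stat")
        return [int(x) for x in parts[1:]]

    a, b = fields(before), fields(after)
    delta = [y - x for x, y in zip(a, b)]
    # Field order: user nice system idle iowait irq softirq steal guest guest_nice.
    # guest and guest_nice are already counted in user and nice, so only
    # the first eight fields are summed to get the total elapsed jiffies.
    total = sum(delta[:8])
    if total <= 0:
        return 0.0
    return 100.0 * delta[4] / total


# Made-up counter values, chosen so that iowait is 100 of 8000 jiffies.
sample_before = "cpu 1000 0 500 8000 0 0 0 0 0 0"
sample_after = "cpu 1500 0 700 15200 100 0 0 0 0 0"
print(iowait_percent(sample_before, sample_after))
```

On a live host the two strings would be the first line of /proc/stat read a few seconds apart; the value is an average over all cores for that interval.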

 

Images attached to this report
Comments to this report:
masserot - 18:23 Tuesday 31 January 2023 (58650)

During today's tests no frame was lost: there is no grey period in any of the plots, and only a few SMS channels were not collected (see first plot).

See logbook entry 58457 for last week's tests.

On the second plot, the green rectangle marks the time period of the tests

  • the first one (cyan rectangle) refers to the "Fault Tolerance resuming" period, where no variation is present
  • the second one (red rectangle) refers to the "Fault Tolerance active" period, where one can see an increase of the iowait of olserver53 due to the activity of the PyDAQ servers, which scan some disk repositories (/virgoLog, ..)
Images attached to this comment