Since midnight today, many physical devices began disconnecting from the network, apparently at random, until by this morning the majority were unreachable (see this entry). Both the experiment network and the general network (thin/VDI clients and phones) were affected.
The problem was related to the centralized infrastructure that assigns devices to VLANs/IP networks according to their Ethernet MAC address: it has been traced to the primary RADIUS server, which started to deny access requests at random.
After the LAN switches were reconfigured to point to the secondary RADIUS server, NAC assignment was working again around 13AM LT.
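The primary-to-secondary choice above can be illustrated as a simple health check. This is only a sketch under stated assumptions: the server names, the MAC address, and the `probe` callback are hypothetical, and the probe is simulated rather than sending real RADIUS requests. A server is considered usable only if it accepts a known-good MAC address on every attempt, so a server that answers but rejects at random is skipped.

```python
import itertools
from typing import Callable, Optional, Sequence

# Hypothetical probe signature: (server, mac) -> True (accept),
# False (reject) or None (no answer). Not a real RADIUS client.
Probe = Callable[[str, str], Optional[bool]]


def pick_server(servers: Sequence[str], probe: Probe,
                known_good_mac: str, attempts: int = 5) -> Optional[str]:
    """Return the first server that accepts known_good_mac on every attempt."""
    for server in servers:
        results = [probe(server, known_good_mac) for _ in range(attempts)]
        if all(r is True for r in results):
            return server
    return None  # no server behaved consistently


def make_flaky_probe() -> Probe:
    """Simulated probe: the (hypothetical) primary rejects every other request."""
    counter = itertools.count()

    def probe(server: str, mac: str) -> Optional[bool]:
        if server == "radius-primary":
            return next(counter) % 2 == 0
        return True  # the simulated secondary always accepts

    return probe


chosen = pick_server(["radius-primary", "radius-secondary"],
                     make_flaky_probe(), "00:11:22:33:44:55")
print(chosen)
```

Requiring several consecutive accepts, rather than a single answer, is what distinguishes an intermittently denying server from a healthy one in this sketch.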
Since the current SL6.x RTPCs depend on this infrastructure for their mobility, they were also affected and had to be rebooted.
Some other devices whose DHCP client is not resilient to server outages had to be either restarted manually by Franco (Netcom eth/serial bridges) or have their interface flapped remotely from the switches (Hameg PSUs, env monitoring boxes).
(In NEB, the link of a device, probably ipcam05, has been disabled since it floods the network with multiple source Ethernet addresses, causing load on the RADIUS servers; at the moment it is not possible to identify it as the cause of the NAC problem.)
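A port that presents many source addresses, as described above, can be spotted by counting distinct MAC addresses per switch port. The following is a minimal sketch, assuming the switch MAC table has already been exported as (port, MAC) pairs; the port names, addresses, and threshold are hypothetical illustration values, not data from this incident.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def ports_with_many_macs(entries: Iterable[Tuple[str, str]],
                         threshold: int = 3) -> Dict[str, int]:
    """Return {port: distinct MAC count} for ports exceeding threshold."""
    macs_by_port = defaultdict(set)
    for port, mac in entries:
        # Normalise case so the same address is not counted twice.
        macs_by_port[port].add(mac.lower())
    return {port: len(macs)
            for port, macs in macs_by_port.items()
            if len(macs) > threshold}


# Hypothetical sample: one access port showing five source addresses.
sample = [
    ("Gi1/0/1", "00:11:22:33:44:55"),
    ("Gi1/0/2", "02:00:00:00:00:01"),
    ("Gi1/0/2", "02:00:00:00:00:02"),
    ("Gi1/0/2", "02:00:00:00:00:03"),
    ("Gi1/0/2", "02:00:00:00:00:04"),
    ("Gi1/0/2", "02:00:00:00:00:05"),
    ("Gi1/0/3", "00:AA:BB:CC:DD:EE"),
    ("Gi1/0/3", "00:aa:bb:cc:dd:ee"),
]
suspects = ports_with_many_macs(sample)
print(suspects)
```

An access port serving a single device would normally show one address, so any port above the threshold is a candidate for closer inspection.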
The primary RADIUS server will stay offline until the investigation is completed.