Gigabit NICs
Got a question for all you networking gurus out there.
I have a customer who has a system using dual Gigabit NICs. The configuration is somewhat complex, but it goes like this:
There are three servers. Each server has two NICs. One NIC connects to a dedicated network shared just between the three servers. This is used for internal data replication between the servers, so that if one goes down, the others have the data necessary to take the load. The other NIC is used to connect the server to the outside world.
The NICs in question are Intel EtherPro/1000 using the 82465 chipset.
The customer is complaining that the system is intermittently slow to respond. In this particular case, I've got a custom web server application I wrote which implements a very small portion of the W3C standard. I use it because the mainstream web server apps are just too fat. When I look in my application's logs, everything appears normal. However, because my application is working at the TCP layer, I don't see anything from the network until the TCP connection is completed. In other words, if there is something going on with the NIC driver or the network itself that is resulting in delays or retries, it would be invisible to my app. The symptoms are that when this problem occurs, the client app makes a request and several seconds (sometimes several tens of seconds) later it gets a response. During this period the server does not show unusual disk, CPU, or network traffic.
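One way to narrow this down from the client side is to time the TCP connect separately from the wait for the first response byte. The following is only a rough sketch: `probe` is a hypothetical helper, not part of the application described above, and it assumes the server answers a plain HTTP/1.0 GET. A delay that lands in the connect interval points below the application (driver, switch, lost SYNs), while a delay in the second interval covers everything after the handshake, including time spent waiting in the accept queue.

```python
import socket
import time


def probe(host, port, path="/", timeout=60.0):
    """Time the TCP connect and the wait for the first response byte.

    Returns (connect_seconds, first_byte_seconds).
    """
    start = time.monotonic()
    # create_connection returns only after the TCP handshake has completed,
    # so any SYN retries on the wire show up in this first interval.
    with socket.create_connection((host, port), timeout=timeout) as sock:
        connected = time.monotonic()
        # Assumption: the server accepts a minimal HTTP/1.0 GET request.
        request = (
            f"GET {path} HTTP/1.0\r\n"
            f"Host: {host}\r\n"
            "\r\n"
        ).encode("ascii")
        sock.sendall(request)
        # Block until the server sends at least one byte (or closes).
        sock.recv(1)
        first_byte = time.monotonic()
    return connected - start, first_byte - connected
```

Calling `probe("192.0.2.10", 80)` in a loop (the address is a placeholder) and logging both numbers alongside a timestamp would show which phase the multi-second stalls fall into.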
I am about to deploy a laptop with a gigabit adapter and a copy of Ethereal to sniff the network so that I can try to see what is going on, but I have the potential to end up digging through a LOT of data without finding anything.
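To keep the amount of data manageable, one option is to look only at connection attempts and flag those whose SYN was sent more than once. The sketch below assumes the capture has already been reduced to rows of `(timestamp, src_ip, src_port, dst_ip, dst_port, seq)` for packets with SYN set and ACK clear; that row format and the name `find_retried_syns` are made up for illustration and are not something Ethereal produces by itself. A retransmitted SYN carries the same addresses, ports, and sequence number as the original, so grouping on those fields is enough.

```python
from collections import defaultdict


def find_retried_syns(syn_packets):
    """Report connection attempts whose SYN appears more than once.

    syn_packets: iterable of
        (timestamp, src_ip, src_port, dst_ip, dst_port, seq)
    Returns a list of (key, [timestamps]) ordered by first timestamp,
    where key is (src_ip, src_port, dst_ip, dst_port, seq).
    """
    attempts = defaultdict(list)
    for ts, src, sport, dst, dport, seq in syn_packets:
        # Same 4-tuple and same sequence number means the same SYN resent.
        attempts[(src, sport, dst, dport, seq)].append(ts)

    retried = [
        (key, sorted(times))
        for key, times in attempts.items()
        if len(times) > 1
    ]
    retried.sort(key=lambda item: item[1][0])
    return retried
```

The gaps between the timestamps in each result give the retry intervals, which can then be lined up against the times the client reported a slow response.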
Have any of you seen anything like this before?
Later, Gizmo