Causes of Linux UDP packet drops

I have a Linux C++ application which receives sequenced UDP packets. Because of the sequencing, I can easily determine when a packet is lost or re-ordered, i.e. when a "gap" is encountered. The system has a recovery mechanism to handle gaps, but it is best to avoid them in the first place. Using a simple libpcap-based packet sniffer, I have determined that there are no gaps in the data at the hardware level. However, I am seeing a lot of gaps in my application. This suggests the kernel is dropping packets, which is confirmed by the /proc/net/snmp file: whenever my application encounters a gap, the Udp InErrors counter increases.
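For reference, here is a minimal sketch of how the Udp counters can be read programmatically rather than by eye. The function names parse_udp_counters and read_udp_counters are hypothetical, not part of my application. The sketch assumes the usual /proc/net/snmp layout: one "Udp:" line of column names followed by one "Udp:" line of values. Newer kernels also expose a RcvbufErrors column there, which may not be present on a kernel as old as mine.

```cpp
#include <cstddef>
#include <fstream>
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: parse the "Udp:" header/value line pair from text in
// /proc/net/snmp format and return a map of counter name -> value.
std::map<std::string, unsigned long long> parse_udp_counters(std::istream& in) {
    std::map<std::string, unsigned long long> counters;
    std::vector<std::string> names;
    std::string line;
    while (std::getline(in, line)) {
        // "UdpLite:" lines do not match this prefix, so they are skipped.
        if (line.compare(0, 4, "Udp:") != 0) continue;
        std::istringstream fields(line.substr(4));
        if (names.empty()) {
            // The first "Udp:" line holds the column names.
            std::string name;
            while (fields >> name) names.push_back(name);
        } else {
            // The second "Udp:" line holds the values, in the same order.
            unsigned long long value = 0;
            for (std::size_t i = 0; i < names.size() && (fields >> value); ++i)
                counters[names[i]] = value;
            break;
        }
    }
    return counters;
}

// Convenience wrapper that reads the live file; returns an empty map if the
// file cannot be opened.
std::map<std::string, unsigned long long> read_udp_counters() {
    std::ifstream in("/proc/net/snmp");
    return parse_udp_counters(in);
}
```

Calling read_udp_counters() before and after a gap and comparing the "InErrors" entries gives the same information as inspecting the file manually.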

At the system level, we have increased the max receive buffer:

# sysctl net.core.rmem_max
net.core.rmem_max = 33554432

At the application level, we have increased the receive buffer size:

int sockbufsize = 33554432;
int ret = setsockopt(my_socket_fd, SOL_SOCKET, SO_RCVBUF,
        (char *)&sockbufsize, sizeof(sockbufsize));
// check return code
sockbufsize = 0;
socklen_t size = sizeof(sockbufsize);  // in/out: buffer length for getsockopt()
ret = getsockopt(my_socket_fd, SOL_SOCKET, SO_RCVBUF,
        (char *)&sockbufsize, &size);
// print sockbufsize

After the call to getsockopt(), the printed value is always 2x what it is set to (67108864 in the example above), but I believe that is to be expected.
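The doubling is documented in socket(7): Linux doubles the value passed to setsockopt() to leave room for bookkeeping overhead, and getsockopt() reports the doubled value. The request is also capped at net.core.rmem_max, so a reported value below 2x the request would suggest the cap is in effect. A minimal sketch for checking this on a throwaway socket follows; the helper name effective_rcvbuf is hypothetical.

```cpp
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: request a receive buffer size on a fresh UDP socket
// and return what getsockopt() reports afterwards, or -1 on any failure.
int effective_rcvbuf(int requested) {
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0) return -1;

    int result = -1;
    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF,
                   &requested, sizeof(requested)) == 0) {
        int actual = 0;
        socklen_t len = sizeof(actual);
        if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &actual, &len) == 0)
            result = actual;
    }
    close(fd);
    return result;
}
```

For example, comparing effective_rcvbuf(33554432) against 67108864 shows whether the full request was honored on a given machine.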

I know that failure to consume data quickly enough can result in packet loss. However, all this application does is check the sequencing, then push the data into a queue; the actual processing is done in another thread. Furthermore, the machine is modern (dual Xeon X5560, 8 GB RAM) and very lightly loaded. We have literally dozens of identical applications receiving data at a much higher rate that do not experience this problem.
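To make the sequencing check concrete, here is a minimal sketch of that kind of gap tracking. It is not my actual code: the class name GapDetector is hypothetical, and it assumes 64-bit sequence numbers that increase by exactly one per packet and never wrap.

```cpp
#include <cstdint>

// Hypothetical gap tracker. Assumes sequence numbers increase by one per
// packet and do not wrap around.
class GapDetector {
public:
    // Returns how many packets were skipped between the highest sequence
    // number seen so far and seq. Returns 0 for in-order packets and for
    // late (duplicate or re-ordered) packets.
    std::uint64_t on_packet(std::uint64_t seq) {
        if (!started_) {
            // The first packet only establishes the starting point.
            started_ = true;
            next_ = seq + 1;
            return 0;
        }
        if (seq < next_) {
            // Older than expected: a duplicate or a re-ordered packet.
            // Note that this does not subtract from the lost count.
            ++late_;
            return 0;
        }
        std::uint64_t missing = seq - next_;
        lost_ += missing;
        next_ = seq + 1;
        return missing;
    }

    std::uint64_t lost() const { return lost_; }  // total skipped packets
    std::uint64_t late() const { return late_; }  // total late arrivals

private:
    bool started_ = false;
    std::uint64_t next_ = 0;  // next expected sequence number
    std::uint64_t lost_ = 0;
    std::uint64_t late_ = 0;
};
```

The check is a handful of integer comparisons per packet, which is why I do not believe the receiving thread itself is the bottleneck.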

Besides a too-slow consuming application, are there other reasons why the Linux kernel might drop UDP packets?

FWIW, this is on CentOS 4, with kernel 2.6.9-89.0.25.ELlargesmp.

asked by Matt, 6 May 2011 at 15:06