
Libtrace Native Linux Capturing Improvements Using PACKET_MMAP

30 Nov 2012

Introduction

In an attempt to increase the speed of libtrace's native Linux capture, I've implemented PACKET_MMAP socket capturing. PACKET_MMAP uses a ring buffer mapped into shared memory that both the kernel and a user program can access directly, which is ideally suited to packet capturing. It allows libtrace to check the status of a frame in the ring buffer without making a system call, and it is implemented as zero copy within libtrace.
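To illustrate the mechanism, here is a simplified C sketch (not libtrace's actual implementation) that sets up a 128-frame RX_RING, maps it into user space and then checks frame status words directly, only falling back to poll() when the ring is empty. The interface name "eth1", the frame size and the block size are assumptions chosen for the example; it needs root privileges to run.

/* rxring.c -- minimal PACKET_MMAP (TPACKET_V1) RX_RING sketch.
 * Build: gcc -o rxring rxring.c */
#include <stdio.h>
#include <string.h>
#include <arpa/inet.h>
#include <sys/socket.h>
#include <sys/mman.h>
#include <net/if.h>
#include <linux/if_packet.h>
#include <linux/if_ether.h>
#include <poll.h>

int main(void) {
    /* Raw socket capturing all protocols */
    int fd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
    if (fd < 0) { perror("socket"); return 1; }

    /* Bind the socket to the capture interface (assumption: eth1) */
    struct sockaddr_ll ll;
    memset(&ll, 0, sizeof(ll));
    ll.sll_family   = AF_PACKET;
    ll.sll_protocol = htons(ETH_P_ALL);
    ll.sll_ifindex  = if_nametoindex("eth1");
    if (bind(fd, (struct sockaddr *)&ll, sizeof(ll)) < 0) { perror("bind"); return 1; }

    /* Ask the kernel for a ring of 128 frames (2 frames per 4 KB block) */
    struct tpacket_req req;
    req.tp_frame_size = 2048;     /* space for one captured packet */
    req.tp_block_size = 4096;     /* must be a multiple of the page size */
    req.tp_block_nr   = 64;
    req.tp_frame_nr   = (req.tp_block_size / req.tp_frame_size) * req.tp_block_nr;
    if (setsockopt(fd, SOL_PACKET, PACKET_RX_RING, &req, sizeof(req)) < 0) {
        perror("setsockopt(PACKET_RX_RING)"); return 1;
    }

    /* Map the ring: this memory is shared between the kernel and this process */
    size_t ring_len = (size_t)req.tp_block_size * req.tp_block_nr;
    char *ring = mmap(NULL, ring_len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (ring == MAP_FAILED) { perror("mmap"); return 1; }

    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    unsigned int frame = 0;

    for (;;) {
        struct tpacket_hdr *hdr =
            (struct tpacket_hdr *)(ring + (size_t)frame * req.tp_frame_size);

        if (hdr->tp_status & TP_STATUS_USER) {
            /* Frame belongs to user space: no system call needed to read it */
            char *pkt = (char *)hdr + hdr->tp_mac;   /* zero-copy packet data */
            printf("captured %u bytes (wire length %u)\n", hdr->tp_snaplen, hdr->tp_len);
            (void)pkt;

            hdr->tp_status = TP_STATUS_KERNEL;       /* hand the frame back */
            frame = (frame + 1) % req.tp_frame_nr;
        } else {
            poll(&pfd, 1, -1);   /* only block when the ring is empty */
        }
    }
}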

This should be included in the next version of libtrace as the 'ring:' URI input/output format. The current 'int:' URI will remain unchanged.
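From an application's point of view only the URI changes. A minimal sketch using the ordinary libtrace API (this assumes a libtrace build that includes the new 'ring:' format, and "eth1" is just an example interface):

/* count_ring.c -- read packets from the new 'ring:' input and count them. */
#include <libtrace.h>
#include <stdio.h>
#include <inttypes.h>

int main(void) {
    libtrace_t *trace = trace_create("ring:eth1");   /* previously "int:eth1" */
    libtrace_packet_t *packet = trace_create_packet();
    uint64_t count = 0;

    if (trace_is_err(trace) || trace_start(trace) == -1) {
        trace_perror(trace, "ring:eth1");
        return 1;
    }

    /* Read packets until the capture is interrupted or fails */
    while (trace_read_packet(trace, packet) > 0)
        count++;

    printf("%" PRIu64 " packets\n", count);
    trace_destroy_packet(packet);
    trace_destroy(trace);
    return 0;
}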

PACKET_MMAP is supported in recent Linux kernels.
Linux kernel 2.6.31.1 or higher supports both RX_RING (reading) and TX_RING (writing).

Hardware

An Ixia traffic generator sending a stream of packets across a 1 Gbit link connected directly to the testing machine.

Running libtrace on a clean Debian install, Linux kernel 2.6.32-5-amd64.
Intel(R) Core(TM)2 Duo CPU E6750 @ 2.66GHz, 4 GB RAM, Intel Corporation 82574L Gigabit Network Connection.

Method

I used the libtrace tool tracestats to count packets and report the number dropped, and the time utility (/usr/bin/time) to provide other useful statistics.
root@machine5:~# /usr/bin/time -v tracestats ring:eth1
This is ideal since tracestats only counts packets and does no additional processing, so it benchmarks the libtrace library itself without the overhead of any per-packet processing.

I also monitored the system's average CPU usage every 5 seconds using:
root@machine5:~# sar -u 5

Each test consisted of 20 seconds of data transmission. In this way CPU times are comparable even if tracestats runs longer in one test than another.
Since packets are sent for 20 seconds, I've taken the middle three 5-second intervals from sar to calculate the average CPU usage during packet capture.

Test 1:
Comparing the current capture method, recv() on a socket, to the new PACKET_MMAP method.
Traffic was generated at full link speed (1 Gbit/s) and the packet size was varied from 64 bytes (1488095 packets/second) to 1518 bytes (81274 packets/second). A ring buffer of 128 frames was used, where a single frame holds a single captured packet; this is fairly small, and if packet loss is an issue the buffer can easily be made 100 times that size.
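For reference, those packet rates are simply the gigabit line rate divided by the on-wire size of each frame, which adds 8 bytes of preamble/SFD and a 12-byte inter-frame gap to the frame itself. A small sketch of the calculation:

/* Theoretical line-rate packet counts on gigabit Ethernet. */
#include <stdio.h>

int main(void) {
    const double link_bps = 1e9;     /* 1 Gbit/s */
    const int overhead = 8 + 12;     /* preamble/SFD + inter-frame gap, in bytes */
    int sizes[] = { 64, 1518 };

    for (int i = 0; i < 2; i++) {
        double pps = link_bps / ((sizes[i] + overhead) * 8.0);
        printf("%4d-byte frames: %.0f packets/second\n", sizes[i], pps);
    }
    return 0;
}
/* Prints roughly 1488095 for 64-byte frames and 81274 for 1518-byte frames,
 * matching the rates quoted above. */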

Test 2:
Comparing different PACKET_MMAP buffer sizes.
Traffic was generated at full link speed (1 Gbit/s) and the packet size was randomly picked between 64 and 1518 bytes. This averaged 153940 packets/second.

Results

Test 1:

Recv() graph
PACKET_MMAP graph
On the whole, CPU usage is much lower when using PACKET_MMAP, leaving more CPU time for processing the packets. The kernel itself started to lose packets (without even reporting them as dropped) at packet sizes of 150 bytes and below. Interesting things happen to PACKET_MMAP at the very small packet size of 64 bytes, no doubt caused by the extremely high packet rate.

Test 2:

PACKET_MMAP buffer sizes graph

Here 'int' refers to the current recv() capture method, while all the other numbers refer to the PACKET_MMAP buffer size in frames. Anything over 8 frames seems to have almost no loss, at least at this traffic rate of 153940 packets/second with random packet sizes. CPU usage is consistent across all buffer sizes.

I've attached all the graphs and the raw output from sar and tracestats.

Attachments:
PACKET_MMAP Graphs.xls (85.5 KB)
PACKET_MMAP raw data.xls (55 KB)