Distributed, Search Optimized Full Packet Capture System: PCAPDB

2017-06-02T18:36:43
ID N0WHERE:171654
Type n0where
Reporter N0where
Modified 2017-06-02T18:36:43

Description


PcapDB is a distributed, search-optimized, open source packet capture system. It was designed to replace expensive commercial appliances with off-the-shelf hardware and a free, easy-to-manage software system. Captured packets are reorganized during capture by flow (an indefinite-length sequence of packets with the same source/destination IPs, source/destination ports, and transport protocol), indexed by flow, and searched (again) by flow. The indexes for the captured packets are relatively tiny (typically less than 1% of the size of the captured data).
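To make the notion of a flow concrete, here is a minimal Python sketch that groups packets by the key described above. It is only an illustration, not PcapDB's indexing code: the `Packet` record, `flow_key`, and `group_by_flow` are hypothetical names, and the sketch assumes each direction of a conversation counts as its own flow.

```python
from collections import defaultdict
from typing import NamedTuple


class Packet(NamedTuple):
    """Hypothetical, simplified packet record (not a PcapDB type)."""
    timestamp: float
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    proto: str
    length: int


def flow_key(pkt: Packet) -> tuple:
    """Flow key: src/dst IPs, src/dst ports, and transport protocol."""
    return (pkt.src_ip, pkt.dst_ip, pkt.src_port, pkt.dst_port, pkt.proto)


def group_by_flow(packets):
    """Group packets by flow key, keeping each flow's packets in arrival order."""
    flows = defaultdict(list)
    for pkt in packets:
        flows[flow_key(pkt)].append(pkt)
    return dict(flows)


# Made-up sample traffic: two packets of one TCP flow and one UDP packet.
packets = [
    Packet(0.00, "10.0.0.1", "10.0.0.2", 40000, 443, "tcp", 60),
    Packet(0.01, "10.0.0.3", "10.0.0.2", 40001, 53, "udp", 80),
    Packet(0.02, "10.0.0.1", "10.0.0.2", 40000, 443, "tcp", 1500),
]
flows = group_by_flow(packets)
for key, pkts in flows.items():
    print(key, len(pkts), "packets,", sum(p.length for p in pkts), "bytes")
```

Because lookups are keyed on the flow rather than on individual packets, an index only needs one entry per flow, which is one plausible reason the indexes can stay so small relative to the captured data.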

A PcapDB installation consists of a Search Head and one or more Capture Nodes. The Search Head can also be a Capture Node, or it can be a VM somewhere else. Wherever it is, the Search Head must be accessible by the Capture Nodes, but there’s no need for the Capture Nodes to be visible to the Search Head.

Hardware Requirements

Hardware requirements depend greatly on the peak time capture rates of the network being monitored.

Capture rates listed below are for the peak-time-average traffic. For example, a deployment on a 10 Gb link sees an average of 3 Gb/s of traffic during the busiest part of the work week, though it does have momentary spikes of up to 10 Gb/s of traffic. For our purposes, this is a 3 Gb/s network.

Core System

  • 6 dedicated cores for the OS and search, plus an additional core for every 1 Gb/s of capture.
    • Disabling Hyperthreading is recommended.
  • 128 GB system memory, minimum (ECC preferred)
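
The core-count rule above can be written as a small calculation. This is a sketch only; `required_cores` is a hypothetical helper, and rounding a fractional capture rate up to the next whole core is an assumption rather than something the rule states.

```python
import math


def required_cores(peak_avg_gbps: float) -> int:
    """6 cores for the OS and search, plus one core per 1 Gb/s of capture.

    Fractional rates are rounded up to a whole core (an assumption).
    """
    if peak_avg_gbps < 0:
        raise ValueError("capture rate must be non-negative")
    return 6 + math.ceil(peak_avg_gbps)


# The 10 Gb link from the example above counts as a 3 Gb/s network.
print(required_cores(3))  # prints 9
```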

Capture Card

While PcapDB can technically run (in libpcap mode) using just about any network card, for best performance you’ll want a card compatible with the PFring ZC library and its custom drivers. Currently, the Intel X520 line of server adaptors is the most affordable option at around $300. Myricom cards are another, albeit more expensive, option.

Storage

  • Physical ‘Capture Disks’ scaled to the raw capture history needs of the site.
  • A ‘Capture Disk’ is a logical unit dedicated entirely to storing captured packets.
  • Capture disks can be essentially any logical disk unit, as long as it can offer sustained writes proportional to twice its share of the capture rate. A single 9 disk RAID 5 (using 7500 RPM Nearline SAS disks) can easily handle 1 Gb/s.
  • For each day of capture history, you typically need around 5.5 TB of disk per 1 Gb/s of capture rate.
  • The physical disks may be directly attached to the system, as components of a JBOD (software RAIDs can be managed via the PcapDB interface), or as separate logical units. Capture will be balanced across all capture storage devices according to size.
  • A separate ‘Index’ device. A single 7500 RPM Nearline SAS disk, or a pair in RAID 1, works fine. You typically need about 1% of your total capture disk as index disk.
  • Disks as needed for the OS.
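
The storage rules of thumb above translate into a quick sizing estimate. The sketch below is illustrative only: the function names are hypothetical, the figures (5.5 TB per day per 1 Gb/s, about 1% for the index, sustained writes of twice a disk's share) are taken from the list above, and it assumes a disk's share of the capture rate is simply its fraction of total capture capacity.

```python
TB_PER_DAY_PER_GBPS = 5.5   # rule of thumb from the list above
INDEX_FRACTION = 0.01       # index disk is about 1% of capture disk


def capture_storage_tb(peak_avg_gbps: float, days: float) -> float:
    """Capture disk needed for `days` of history at the given capture rate."""
    return TB_PER_DAY_PER_GBPS * peak_avg_gbps * days


def index_storage_tb(capture_tb: float) -> float:
    """Index disk needed for a given amount of capture disk."""
    return capture_tb * INDEX_FRACTION


def required_write_gbps(disk_sizes_tb, peak_avg_gbps: float):
    """Sustained write rate each disk should offer: twice its share of capture.

    Assumes capture is split across disks in proportion to their size.
    """
    total = sum(disk_sizes_tb)
    return [2 * (size / total) * peak_avg_gbps for size in disk_sizes_tb]


# Example: one week of history on a 3 Gb/s (peak-time-average) network.
capture_tb = capture_storage_tb(3, 7)
print(f"capture disk: {capture_tb:.1f} TB")  # prints capture disk: 115.5 TB
print(f"index disk:   roughly {index_storage_tb(capture_tb):.1f} TB")

# Three capture disks of 40, 40, and 20 TB sharing that 3 Gb/s.
for size, rate in zip([40, 40, 20], required_write_gbps([40, 40, 20], 3)):
    print(f"{size} TB disk should sustain about {rate:.1f} Gb/s of writes")
```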

Requirements

PcapDB is designed to work on Linux servers only. It was developed on both Red Hat Enterprise Linux and Debian systems, but its primary testbed has so far been Red Hat based. While it has been verified to work (with packages from non-default repositories) on RHEL 6, a more bleeding-edge system (like RHEL/CentOS 7, or the latest Debian/Ubuntu LTS) will greatly simplify the process of gathering dependencies.

sys_requirements.md contains a list of the packages required to build and run PcapDB. They are easiest to install on modern Debian-based machines.

requirements.txt contains python/pip requirements. They will automatically be installed via ‘make install’.

Installing

To build and install everything in /var/pcapdb/, run one of:

    make install-search-head
    make install-capture-node
    make install-monolithic

  • As with most Makefiles, you can set the DESTDIR variable to specify where to install the system: make install-search-head DESTDIR=/var/mypcaplocation
  • This includes installing in place: make install-capture-node DESTDIR=$(pwd). In this case, PcapDB won’t install system scripts for various needed components; you will have to run them manually (see below).
  • If you’re behind a proxy, you’ll need to specify a proxy connection string using PROXY=host:port as part of the make command.

To make your life easier, however, you should first make sure the indexing code builds cleanly by running ‘make’ in the ‘indexer/’ directory.

PostgreSQL may install in a strange location, as noted in the ‘indexer/README’. This can cause build failures in certain pip-installed packages. Add PATH=$PATH:<pgsql_bin_path> to the end of your ‘make install’ command to fix this. For me, it is: make install PATH=$PATH:/usr/pgsql-9.4/bin

Setup

After running ‘make install’, there are a few more steps to perform.

Post-Install script

The core/bin/post-install.sh script will handle the vast majority of the system setup for you.

  • It does so idempotently, so it can be run multiple times without breaking anything.
  • Run without arguments to get the usage information.
  • Basically, you want to give it arguments based on whether you’re setting up a search head (-s), a capture node (-c), or a monolithic install (-c -s).
  • You’ll also have to give it the search head’s IP.

    /var/pcapdb/core/bin/post-install.sh [-c] [-s] <search_head_ip>

This will set up the databases and rabbitmq.
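
The flag choices described above can be summarized in a short helper that builds the command line for each role. This is a sketch under stated assumptions: `post_install_command` is a hypothetical function, the `destdir` parameter assumes the script lives under the install location used at build time, and 192.0.2.10 is a placeholder address. It only prints the command; it does not run it.

```python
import shlex


def post_install_command(search_head_ip, search_head=False,
                         capture_node=False, destdir="/var/pcapdb"):
    """Build the post-install.sh argument list for the chosen role.

    -s marks a search head, -c a capture node; a monolithic install uses both.
    """
    if not (search_head or capture_node):
        raise ValueError("choose search_head, capture_node, or both")
    cmd = [f"{destdir}/core/bin/post-install.sh"]
    if capture_node:
        cmd.append("-c")
    if search_head:
        cmd.append("-s")
    cmd.append(search_head_ip)
    return cmd


# Print the command for each kind of install (placeholder IP).
for role in ({"search_head": True},
             {"capture_node": True},
             {"search_head": True, "capture_node": True}):
    print(shlex.join(post_install_command("192.0.2.10", **role)))
```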

DESTDIR/etc/pcapdb.cfg

This is the main PcapDB config file. PcapDB will not run at all until you set a few values in it manually:

  • (On capture nodes) The search head db password
  • (On capture nodes) The rabbitmq password
    • Both of the above should be in the search head’s pcapdb.cfg file.
  • (On search head) The local mailserver.
    • If you don’t have one, I’d start with installing Postfix. It even has selectable install settings that will configure it as a local mailserver for you.

Add an admin user (Search Head Only)

You’ll need to create an admin user.

    sudo su - capture
    ./bin/python core/manage.py add_user <username> <first_name> <last_name> <email>

  • This will email you a link to use to set that user’s password.
  • (This is why email had to be set up).
  • root@localhost is a reasonable email address, if you need it.
  • Note that manage.py also has a createsuperuser command, which shouldn’t be used.

That should be it.

You should be able to log in with your admin account.

Things that can, and have, gone wrong

  • If your host doesn’t have a host name in DNS, you can set an IP in the ‘search_head_host’ variable in the pcapdb.cfg file.

pfring-zc drivers

One more thing. You should install the drivers specific to your capture card for pfring-zc. The packages from NTOP actually build the drivers for your kernel on the fly when installed, though you may have to reinstall that package whenever you do a kernel update. Building and installing from source is also fairly straightforward.
