Where’s my network traffic?

Using Graphite and nPulse CPX API to map out the network

It was late. The lab administrators were gone, and I work 95 miles away from our data center. We’re setting up a new and improved QA/testing rack of equipment, and I was trying to run my automated tests. Unfortunately, I had misread a memo and didn’t know where the data was going.

For testing, we have a custom replay appliance that exposes its operations via a RESTful API. Our CPX platform does as well; more on that in a second. When I passed some commands to the replay box, I didn’t get the data I expected. I tried again. Nothing. Hmmm... No one was there to help troubleshoot, so I had to figure it out remotely.


The newest tool in my tool chain is:
Graphite (Graphite Web, Carbon, Whisper): http://goo.gl/UnqbN
combined with our CPX platform: http://goo.gl/qqnLx
and old faithful, Tornado’s HTTP client: http://goo.gl/O4kHH


I have access to a bunch of machines in our development and test lab, so that helps. Using ‘my’ general-purpose virtual machine (Debian 6 Linux), I set up a graphite-web installation. More on that later; it’s kind of a bear to get installed on Debian.

I whipped up a quick script that loops through our CPX boxes to watch their stats. We have a pretty simple RESTful API to get capture statistics. The plan is to grab the stats, create some entries in the Whisper database, and then watch a graph to see where the traffic spikes. (From now on, I’m just going to say Graphite for the entire system, as in “I put data in Graphite.” Really, the data goes to Carbon, which puts it in Whisper, which is then served and visualized by Graphite-Web.)

The format of the data is:

name.spaced.attribute value timestamp

In Python:

"%s %d %d" % (name, value, timestamp)
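To make the format concrete, here is a minimal sketch of a helper that builds one such line. The function name `graphite_line` is my own invention for illustration, and it formats the value with %s rather than %d so that fractional values are not truncated.

```python
import time


def graphite_line(name, value, timestamp=None):
    """Format one metric as 'name.spaced.attribute value timestamp'."""
    if timestamp is None:
        # Default to the current time in whole Unix epoch seconds.
        timestamp = int(time.time())
    return "%s %s %d" % (name, value, timestamp)


# The metric path follows the naming scheme used later in this post;
# the value and timestamp are made up.
print(graphite_line("cpx.capturestats.taylor.mbps.total", 412, 1350000000))
```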

The CPX capture statistics endpoint lives at a URL of this form and returns a JSON structure:

'https://%s/api/channel/capture?polling=true' % cpx['url']
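For orientation, here is a rough sketch of how such a response can be summed per stat. The sample body is hypothetical: it only contains the keys my script reads ('feeds', 'feed', and the stat names), and the values and their types are made up. The helper name `total_stats` is also just for illustration.

```python
import json
from collections import defaultdict

# Hypothetical response body; a real response may carry more fields.
SAMPLE_BODY = (
    '{"feeds": ['
    '{"feed": 0, "mbps": "12.7", "frames": 100}, '
    '{"feed": 1, "mbps": "3.2", "frames": 40}'
    ']}'
)


def total_stats(body, stats):
    """Sum each stat of interest across all feeds in one response."""
    totals = defaultdict(int)
    for feed in json.loads(body)["feeds"]:
        for stat in stats:
            # Values may be strings or floats; a missing stat counts as 0.
            totals[stat] += int(float(feed.get(stat, 0)))
    return dict(totals)


print(total_stats(SAMPLE_BODY, ["mbps", "frames", "dropped"]))
```

Note that truncating each feed's value to an integer before summing loses the fractional part, which is fine for spotting spikes but not for precise accounting.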

So, to set up my Python list of CPXs:

cpxs = []
cpxs.append({'url':'localhost:1443', 'name':'taylor', 'username':'cpx','password':'cpx'})
cpxs.append({'url':'localhost:2443', 'name':'hhext1', 'username':'cpx','password':'cpx'})
cpxs.append({'url':'localhost:3443', 'name':'harrison', 'username':'cpx','password':'cpx'})
cpxs.append({'url':'localhost:4443', 'name':'pierce', 'username':'cpx','password':'cpx'})
cpxs.append({'url':'localhost:5443', 'name':'ike', 'username':'cpx','password':'cpx'})

Then, simply enough, I loop over the CPXs, build my URL, make a Tornado request, and get the data back.
Then I loop through the stats of interest, build the appropriately formatted Graphite string, append it to my buffer, and send it away.

import json
import sys
import time
from collections import defaultdict

from tornado import httpclient

stats = ['errors','mbps','octets','sliced','mfps','frames','violations','dfps','dropped']

while True:
    # Keep doing it. There's enough delay in each HTTP request so nothing gets overwhelmed.
    lines = []
    for cpx in cpxs:
        now = int(time.time())
        name = cpx['name']
        # I'm keeping a total per CPX. This is a quick way to aggregate the data to simplify some visualizations.
        totals = defaultdict(int)
        url = 'https://%s/api/channel/capture?polling=true' % cpx['url']
        print('%s : %d' % (name, now))
        requ = httpclient.HTTPRequest(url, auth_username=cpx['username'],
                                      auth_password=cpx['password'], validate_cert=False)
        client = httpclient.HTTPClient()
        try:
            response = client.fetch(requ)
            if response.error:
                print(response.error)
            else:
                responsebody = response.body
                try:
                    data = json.loads(responsebody)
                    # Each CPX load balances traffic across virtual 'feeds'
                    # for improved performance and data localization.
                    for feed in data['feeds']:
                        feednum = feed['feed']
                        for stat in stats:
                            totals[stat] += int(float(feed.get(stat, 0)))
                            # You'll see here, I'm pumping in data per feed.
                            lines.append('cpx.capturestats.%s.%s.%d %s %d' % (name, stat, feednum, feed.get(stat, 0), now))
                except Exception:
                    print('Error: %s' % str(sys.exc_info()))
                    print('Unable to parse: %s' % responsebody)
            # Also pushing the 'totals' after each CPX.
            for stat in stats:
                lines.append('cpx.capturestats.%s.%s.total %s %d' % (name, stat, totals.get(stat, 0), now))
        except Exception:
            print('Error getting a URL: %s' % cpx['url'])
            print(sys.exc_info())
    # 'lines' now holds this pass's metrics, ready to be sent to Graphite.
## My error handling could be a lot better, but hey, this is a small utility. It'll be fine
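The listing above stops at filling the `lines` buffer. One common way to send it away is Carbon's plaintext protocol: open a TCP connection and write one metric per line. Here is a minimal sketch of that, assuming Carbon is listening on its default plaintext port, 2003, on localhost. The function name `send_to_carbon` and its defaults are my own choices for illustration, not necessarily what my original script did.

```python
import socket


def send_to_carbon(lines, host="localhost", port=2003):
    """Send metric lines to Carbon's plaintext listener, one per line.

    Returns the number of bytes written (0 if there was nothing to send).
    """
    if not lines:
        return 0
    # Every metric must end with a newline, including the last one.
    payload = ("\n".join(lines) + "\n").encode("ascii")
    sock = socket.create_connection((host, port), timeout=5)
    try:
        sock.sendall(payload)
    finally:
        sock.close()
    return len(payload)
```

In the loop above, calling something like send_to_carbon(lines) after the `for cpx in cpxs` loop finishes would flush one full pass per connection, which keeps the number of connections to Carbon low.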

Graphite Web

The simplest way I found to chart exactly what I wanted was the ‘render’ API that graphite-web provides. Essentially, it’s a URL that outputs a PNG based on its parameters. It even takes a wildcard, so in one fell swoop I can get a PNG showing the total ‘mbps’ per CPX.
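As a rough sketch, a render URL for that wildcard chart can be built like this. The host name graphite.example.com, the helper name `render_url`, and the time window and image size are placeholders I picked for illustration; `target`, `from`, `width`, `height`, and `format` are parameters of the render API. The snippet uses Python 3's urllib.parse.

```python
from urllib.parse import urlencode


def render_url(base, target, since="-1h", width=800, height=400):
    """Build a graphite-web render URL that returns a PNG for `target`."""
    params = [
        ("target", target),   # metric path; '*' matches every CPX name
        ("from", since),      # relative start time, e.g. the last hour
        ("width", width),
        ("height", height),
        ("format", "png"),
    ]
    return "%s/render?%s" % (base.rstrip("/"), urlencode(params))


# One line per CPX, showing each box's total mbps.
print(render_url("http://graphite.example.com", "cpx.capturestats.*.mbps.total"))
```

Pasting the resulting URL into a browser, or into an img tag on a dashboard page, is enough to get the chart.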


The chart looks like this for our ‘steady state’ traffic:

CPX Traffic Monitoring Steady State

Not replaying any traffic from my source. These boxes have data from elsewhere.

Then, after some experimentation with our replay endpoint, I can watch the Graphite charts to see which CPX is getting traffic, based on different parameters. Pretty slick! Now I know where my traffic is going!

CPX Traffic Monitoring Spikes

Each spike shows which CPX received traffic.


