Roll Your Own Tools.. Real-time Graphing and Round Robin Data Storage

I have spent a lot of time playing around with graphics libraries and toolkits for integrating real-time graphs within my own testing and monitoring tools. It seems like there are many open source tools available in the world of performance testing and system monitoring. And lots of people roll their own tools in whatever programming language they are into... but many lack graphics capabilities.

Two of the toolkits/libraries I end up using often for my own homebrew test tools are RRDTool and JRobin.

From the RRDTool site:
"RRD is the Acronym for Round Robin Database. RRD is a system to store and display time-series data (i.e. network bandwidth, machine-room temperature, server load average). It stores the data in a very compact way that will not expand over time, and it can create beautiful graphs. It can be used via simple shell scripts or as a perl module."

So...
RRDTool is a really good back-end for storing time-series data, which is pretty much all we care about when we are doing performance testing. It has bindings for various scripting languages, or it can be invoked from the command line. If you are developing tools that need a data repository and graphing capabilities, it provides both. You create an RRD and then begin inserting data values at regular intervals. You then call the graphing API to render a graph. The cool thing about this data storage is its “round robin” nature. You define various time spans and the granularity at which you want each one stored. A fixed-size binary file is created, and it never grows over time. As you insert more data, it is inserted into each span. As results are collected, they are averaged and rolled into successive time spans. This makes for a much more efficient system than using your own complex object structures, a relational database, or file system storage.
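
To make the "round robin" idea concrete, here is a minimal Python sketch of a fixed-size store that averages incoming samples into several time spans. It is a toy model only: the class names, the (rows, steps_per_row) spec, and the in-memory lists are my own assumptions for illustration, not RRDTool's file format or API.

```python
class RoundRobinArchive:
    """One time span: a fixed number of rows, each the average of N samples.

    Toy model for illustration; not RRDTool's actual storage format.
    """

    def __init__(self, rows, steps_per_row):
        self.rows = rows
        self.steps_per_row = steps_per_row
        self.values = [None] * rows  # allocated once, never grows
        self.next_slot = 0           # slot that the next row will overwrite
        self.pending = []            # samples waiting to be averaged

    def update(self, value):
        self.pending.append(value)
        if len(self.pending) == self.steps_per_row:
            average = sum(self.pending) / len(self.pending)
            self.values[self.next_slot] = average
            # Wrap around: the oldest row is overwritten when the ring is full.
            self.next_slot = (self.next_slot + 1) % self.rows
            self.pending = []

    def fetch(self):
        """Return stored rows, oldest first, skipping slots not yet filled."""
        ordered = self.values[self.next_slot:] + self.values[:self.next_slot]
        return [v for v in ordered if v is not None]


class RoundRobinDatabase:
    """Feeds every sample to each archive (each time span)."""

    def __init__(self, archive_specs):
        self.archives = [RoundRobinArchive(rows, steps)
                         for rows, steps in archive_specs]

    def update(self, value):
        for archive in self.archives:
            archive.update(value)


# Two spans: the last 5 raw samples, and 3 rows that each average 4 samples.
rrd = RoundRobinDatabase([(5, 1), (3, 4)])
for sample in range(1, 13):
    rrd.update(float(sample))

fine, coarse = rrd.archives
print(fine.fetch())    # [8.0, 9.0, 10.0, 11.0, 12.0]
print(coarse.fetch())  # [2.5, 6.5, 10.5]
```

After twelve samples the fine-grained span has already thrown away the first seven values, while the coarse span still covers the whole run at lower resolution. Storage stays at eight slots no matter how long it runs.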

You will probably recognize the graphs it creates, as RRDTool is integrated in many popular monitoring tools (it is Free/Open Source, GPL License). I have built many tools around RRDTool, and it is really a nice system.

If you are in the Java world, there is a very cool project named JRobin. JRobin is a clone of RRDTool in pure Java, so you can create RRDs directly from your Java code... and all in memory if you want to!

Some days I pretend to be a Java programmer, so I had to build a tool using JRobin. As a proof of concept, I wrote a small network latency monitoring tool. It shows off some of JRobin's capabilities. It pings a host at a given interval and records the latency. A graph of the network latency is rendered in real-time onto a Swing panel.

Here is my network latency monitoring tool: NetPlot (includes Java source code, GPL Licensed)

The tool itself is just a trivial example, and really isn't the point. But you could easily adapt this code or create your own to develop real-time graphs of your own time-series data.
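
If you want to roll your own collector, the sampling loop is the easy part. Below is a rough Python sketch of the "probe a host at a given interval and record the latency" idea. It is not the NetPlot code: the function names are hypothetical, and it times a TCP connect as a stand-in for a real ICMP ping (which usually needs raw sockets and elevated privileges).

```python
import socket
import time


def tcp_connect_latency_ms(host, port=80, timeout=2.0):
    """Time a TCP connect in milliseconds; return None if it fails.

    Assumption: connect time is used as a rough substitute for ping latency.
    """
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            pass
    except OSError:
        return None
    return (time.perf_counter() - start) * 1000.0


def sample_latency(probe, count, interval_s, sleep=time.sleep, clock=time.time):
    """Call probe() count times, interval_s apart.

    Returns a list of (timestamp, latency) pairs. The sleep and clock
    parameters exist so the loop can be exercised without real waiting.
    """
    samples = []
    for i in range(count):
        samples.append((clock(), probe()))
        if i < count - 1:
            sleep(interval_s)
    return samples
```

A call such as sample_latency(lambda: tcp_connect_latency_ms("example.com"), count=10, interval_s=1.0) would collect ten timestamped readings, which could then be inserted into whatever time-series store you are using.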

(hmm.. I wonder if I could hook this into JMeter? probably..)

(How freaking ironic... I've been using this thing for a while now, but I decided to check the JRobin web site while writing this, and the developer just ceased development of the project and turned over all related rights to OpenNMS. Can someone reading this please take over JRobin maintenance? Erm, seriously.)

-Corey Goldberg
www.goldb.org

Comments

At WOPR6 I announced that I am working on an application for remote system monitoring. The tool I mention here is NOT that. That is a much more robust system for remote agentless monitoring of system stats and service response times. It is written in Python and can use RRDTool as a back-end for data storage and graphing. The project is named PyMeter. Look for it sometime soon, hosted at www.openqa.org.

So "storing time-series data" is all performance testers care for? I didn't hear you mention that at WOPR6 :)

Anyway, thanks for the info, "storing time-series data" is ONE of the things I care about...

Julian Harty

>So "storing time-series data" is all
>performance testers care for?

True, true... I was just being dramatic. It should have been phrased something like:

"during a performance test, data presented in a time-series is most useful for analysis" .. or something to that effect.