
Administrating JFFNMS

Apart from configuration, there are some tasks that the administrator may need or want to do with JFFNMS. This chapter describes these tasks, when they are needed and what to do.

Reporting

JFFNMS can report on the availability of a system made up of a set of interfaces, or on the individual interfaces themselves. The reports are found at the Administration menu item Reports => State & Availability. The interface selector is like the one used on the interface administration screen and the performance screen.

For a report, you use the interface selector to choose the interfaces you want to report on, and then choose the time period the report should cover. You can also choose which event types count as an interface being up or down.

In the top right corner, there is a view details option box. Clicking this box will show the details of the alarms, including their start and stop times plus their duration.

For each interface selected, the report shows round trip time and packet loss (where relevant), unavailable time, and availability. Unavailable time is the total time that the interface was considered to be in the down state during the selected time period. The availability is:

(total selected time - unavailable time) * 100 / total selected time
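The formula above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not code taken from JFFNMS; the function name and the choice of seconds as the unit are assumptions.

```python
def availability_percent(total_seconds: float, unavailable_seconds: float) -> float:
    """Return availability as a percentage of the selected time period.

    Hypothetical helper: implements
    (total selected time - unavailable time) * 100 / total selected time
    """
    if total_seconds <= 0:
        raise ValueError("selected time period must be positive")
    return (total_seconds - unavailable_seconds) * 100 / total_seconds


# An interface that was down for 4 hours of a 24-hour report period.
print(round(availability_percent(24 * 3600, 4 * 3600), 2))
```

Any time unit works, provided both arguments use the same one.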

Right at the bottom of the list is the Total Unavailable Time and Total Availability. These figures are for the "system" of the selected interfaces. It's important to know they are a system availability and not just an average.

How JFFNMS calculates Total Availability

JFFNMS does not just do a simple (and often incorrect) average of system availability for the Total availability. It doesn't do a multiplication of the individual availabilities either. The next few paragraphs describe how Total Availability is calculated.

JFFNMS scans the database for all Ups and Downs of the selected interfaces in the requested time interval and orders them chronologically (by the time the event occurred). It then scans the ordered list with an "interfaces down" counter that starts at 0, is incremented for each Down alarm and decremented for each Up. When the counter leaves 0, the system notes the time of that Down alarm. When the counter returns to 0 (meaning no interfaces are down at that time), the difference between the time of this Up alarm and the time of the Down alarm that took the counter off 0 is added to the Total Unavailable Time.

For example, suppose the selected time period is 00:00 to 24:00 and the events are those listed in the table "Example list of interface events". The consolidator processes these events into the series of alarms shown in the table "Processed alarms from example events".

Example list of interface events

Time      Interface  State
10:00hs   A          Down
11:00hs   B          Down
12:00hs   B          Up
13:00hs   C          Down
14:00hs   A          Up
15:00hs   C          Up

Processed alarms from example events

Start   Stop    Duration (hours)  Interface
10:00   14:00   4                 A
11:00   12:00   1                 B
13:00   15:00   2                 C
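As a rough illustration of how events become alarms, the sketch below pairs each Down with the next Up on the same interface. It is not the JFFNMS consolidator itself; the function name, the tuple layout and the use of whole hours are assumptions made for this example.

```python
def pair_events(events):
    """Pair each Down with the next Up on the same interface.

    events: list of (hour, interface, state) tuples (hypothetical layout).
    Returns a sorted list of (start, stop, duration_hours, interface) tuples.
    """
    open_downs = {}  # interface -> hour it went down
    alarms = []
    for hour, interface, state in sorted(events):
        if state == "Down":
            # Keep the earliest Down if several arrive before an Up.
            open_downs.setdefault(interface, hour)
        elif state == "Up" and interface in open_downs:
            start = open_downs.pop(interface)
            alarms.append((start, hour, hour - start, interface))
    return sorted(alarms)


EVENTS = [
    (10, "A", "Down"),
    (11, "B", "Down"),
    (12, "B", "Up"),
    (13, "C", "Down"),
    (14, "A", "Up"),
    (15, "C", "Up"),
]

for start, stop, duration, interface in pair_events(EVENTS):
    print(f"{start:02d}:00  {stop:02d}:00  {duration}  {interface}")
```

Run on the example events, this yields the same three alarms as the table of processed alarms.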

When the report works out the Total Availability, it ignores which particular interface was affected by each Up or Down event and instead uses a counter of down interfaces. A count of 0 means the system is available. The table "Interface State Calculation" shows how the outage times are calculated.

Interface State Calculation

Time      State  Counter  Notes
00:00hs   --     0        Initially assume all interfaces up
10:00hs   Down   1        Counter was 0, so record time as start of outage
11:00hs   Down   2
12:00hs   Up     1
13:00hs   Down   2
14:00hs   Up     1
15:00hs   Up     0        Counter is now 0, so record time as stop of this outage

Some important boundary conditions and assumptions apply here and are useful to know. The first is that only alarms that both start and stop within the selected time period are considered. As a side effect, all interfaces are assumed to be up at the start of the report and must be up again by the end of it (otherwise the alarm's stop time falls past the end of the time period).

The table above shows a single system outage 5 hours long, starting at 10:00 when interface A went down and finishing at 15:00 when interface C came up. This method gives a Total Unavailable Time of 5 hours rather than 7, which is the sum of the three individual outage times. 5 hours is correct, because it is the time during which one or more of the selected interfaces was down.
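The counter method described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the JFFNMS source: the function name, the (hour, state) tuples and the use of whole hours are all hypothetical, and all interfaces are assumed up at the start of the period.

```python
def total_unavailable(events):
    """Sum the time during which at least one interface was down.

    events: list of (hour, state) tuples, state being "Down" or "Up".
    The interface name is deliberately left out; only the count matters.
    """
    down_count = 0       # the "interfaces down" counter
    outage_start = None  # time of the Down that moved the counter off 0
    total = 0
    for hour, state in sorted(events):
        if state == "Down":
            if down_count == 0:
                outage_start = hour
            down_count += 1
        elif state == "Up" and down_count > 0:
            down_count -= 1
            if down_count == 0:
                total += hour - outage_start
    return total


EXAMPLE = [(10, "Down"), (11, "Down"), (12, "Up"),
           (13, "Down"), (14, "Up"), (15, "Up")]

unavailable = total_unavailable(EXAMPLE)
print(unavailable)                              # hours the system was unavailable
print(round((24 - unavailable) * 100 / 24, 2))  # Total Availability in percent
```

Because overlapping outages are merged by the counter, the example gives 5 hours, not the 7 hours obtained by adding the individual alarm durations.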

The Total Availability is now easily worked out from the Total Unavailable Time and the Total Selected Time:

(total selected time - total unavailable time) * 100 / total selected time

In our example, it is

(24h - 5h) * 100 / 24h = 19 * 100 / 24 = 79.17%

So the Total Unavailable Time for our report is 5 hours and the Total Availability is 79.17%.


JFFNMS Manual, last changed March 29, 2008

