Cheshire Cat Computing http://www.steveshipway.org/forum/ 

Desperately Seeking Trending http://www.steveshipway.org/forum/viewtopic.php?f=17&t=90 
Page 1 of 2 
Author:  cracraft [ Sat Dec 06, 2003 10:47 am ] 
Post subject:  Desperately Seeking Trending 
Hi, it seems that MRTG/RRDTOOL/ROUTERS.CGI would be a good base on which to build a trending system for each graph: show N bars beyond the current bar, for each graph, as a predicted trend.

The way I envision this, apart from adding support code to the existing graphs (and I doubt that solution would fly), would be one extra trending graph for each regular graph. It would hold the two normal data points, but the whole thing would be shifted one day, one week, etc. into the future. If you can't shift the X axis into the future on a per-graph basis, don't shift it; just label the legend "one period in the future" to remind the viewer. Then insert the two data points normally for the entire period, but based on a mathematical extrapolation of some kind that indicates the likely trend from the last period of the same duration. Best fit, polynomial, whatever seems decent.

Is there trending already for MRTG/RRDTOOL/ROUTERS.CGI, or would the above work? It uses the same ideas that are already in the tools and just throws in more graphs and time-shifted data, which they can already handle. The only issue would be deriving an equation that provides a good fit, extending it out past the end of the previous data, taking sample points every 5 minutes, feeding those to MRTG instead, and voila.

Stuart
Author:  cracraft [ Sat Dec 06, 2003 12:13 pm ] 
Post subject:  trending, reply 
Upon rereading my note, I realize it looks absurd. My point was obviously not to trend by shifting an existing graph to the right on the X axis. The point is to use the appropriate math to plot a best fit, sample N data points 5 minutes apart beyond the end of the existing curve, and fit those in. The main thing I see missing in MRTG/RRDTOOL/ROUTERS.CGI is trending. Perhaps it is there, or someone has done it, and I simply don't have the knowledge yet. Stuart
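The idea in this post (fit a line, then sample N points 5 minutes apart past the end of the data) can be sketched as follows. This is only a minimal illustration, assuming the history is already available as (timestamp, value) pairs; the names `linear_fit`, `extrapolate` and `STEP` are made up for this example and are not part of MRTG, RRDtool or routers.cgi.

```python
from typing import List, Sequence, Tuple

STEP = 300  # seconds between samples (the 5-minute interval discussed here)


def linear_fit(points: Sequence[Tuple[float, float]]) -> Tuple[float, float]:
    """Ordinary least-squares line y = slope * t + intercept."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points to fit a line")
    mean_t = sum(t for t, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Centre on the means: epoch timestamps are large numbers, and this
    # keeps the sums numerically well behaved.
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in points)
    if sxx == 0:
        raise ValueError("all timestamps are identical")
    slope = sxy / sxx
    return slope, mean_y - slope * mean_t


def extrapolate(points: Sequence[Tuple[float, float]],
                n_future: int,
                step: int = STEP) -> List[Tuple[float, float]]:
    """Return n_future predicted (timestamp, value) pairs after the last point."""
    slope, intercept = linear_fit(points)
    last_t = points[-1][0]
    return [(last_t + k * step, slope * (last_t + k * step) + intercept)
            for k in range(1, n_future + 1)]


# Hypothetical history: one hour of samples rising by 2 units per interval.
history = [(1070000000 + i * STEP, 10.0 + 2.0 * i) for i in range(12)]
future = extrapolate(history, 3)
```

A straight line is the simplest possible choice; whether it is a sensible model for a given data set is a separate question.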
Author:  stevesh [ Sun Dec 07, 2003 9:36 am ] 
Post subject:  
This is actually incredibly difficult to do. The hardest part is the 'some mathematical function' you mention. Also, the RRDTool graphing libraries just don't have the ability to add this sort of extension line to the graph.

With trending, the main problems are:
1) What shape of line to predict? A straight (first-order) line, a quadratic, etc.?
2) How far back to look at existing data when predicting?
3) Should we weight past data?
4) How far into the future to predict? Should we give a range or a line?
5) Should we look at duplicating time/day-of-week patterns?

While it would be easy to predict (e.g.) disk usage, which probably increases in a linear, constant and regular fashion, predicting network activity on a daily basis would be very hard, as it has a daily and weekly pattern. There is no easy way to work out which sort of prediction method should be used for a given data set. I could write in a linear daily prediction based on the last week's data (for example), but it would be useless for 90% of data sets, and there would still be no way to get it into the RRD graph function.

If anyone has some ideas or working code, then I'd be very interested to see it...

Steve
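For question 5, one very simple way to reuse a weekly pattern is to predict a future sample as the average of the values seen at the same time of week in earlier weeks. The sketch below is only an illustration of that idea and not something the thread settles on; `seasonal_forecast` and its parameters are hypothetical, and it assumes the samples sit on exact, aligned timestamps.

```python
from typing import Dict, Optional

WEEK = 7 * 24 * 3600  # seconds in one week


def seasonal_forecast(samples: Dict[int, float],
                      target_ts: int,
                      period: int = WEEK,
                      max_periods: int = 4) -> Optional[float]:
    """Average the values recorded at the same offset in earlier periods.

    Returns None when no earlier period has a sample at that offset.
    """
    seen = []
    for k in range(1, max_periods + 1):
        value = samples.get(target_ts - k * period)
        if value is not None:
            seen.append(value)
    if not seen:
        return None
    return sum(seen) / len(seen)


# Hypothetical data: two earlier weeks with a sample at the same time of week.
samples = {9 * WEEK: 100.0, 8 * WEEK: 120.0, 9 * WEEK + 300: 5.0}
prediction = seasonal_forecast(samples, 10 * WEEK)
```

This captures a repeating pattern but no growth at all, which is the opposite weakness from a straight-line fit.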
Author:  cracraft [ Sun Dec 07, 2003 9:54 am ] 
Post subject:  Trending 
Instead of embedding it in the current graph, create a whole new graph that represents the trend itself. Then display that graph on the same physical screen as the original non-trended data. One graph for the original data. One graph for the trend without any original data on it (just the trend, as if it were itself data).

For example, assume the fetch feature is used to gather the last 30 days' worth of datapoints. Now use best-fit to calculate a matching formula for the data from fetch. Next, take this formula and, for each time (x axis), output the y (trend). Take these two and stuff them into an RRD database. Do this all "really fast" to build up the RRD, and it will be graphed by MRTG and your ROUTERS.CGI. Then create a new window using your CGI view that displays that graph plus the graph it was derived from on the same physical screen. Voila! You have limited trending suitable for some things. Add more complex formulae later as a trend-formula menu on the side.

It looks like rrdtool update should be able to do the business of storing the y-axis values calculated by the trend formula for the x-axis time periods from the original graph.

Stuart
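The fetch, fit, update pipeline described in this post could look roughly like the sketch below. It assumes the usual text layout of `rrdtool fetch` output (a header line of data source names, a blank line, then `timestamp: value value` rows) and the `timestamp:value` argument form accepted by `rrdtool update`. The helper names, the sample text and the `predict` callable are invented for illustration, and nothing here actually runs rrdtool.

```python
import math
from typing import Callable, List, Tuple


def parse_fetch(text: str) -> List[Tuple[int, List[float]]]:
    """Turn rrdtool-fetch-style text into (timestamp, [values]) rows."""
    rows = []
    for line in text.splitlines():
        if ":" not in line:
            continue  # header line with data source names, or a blank line
        stamp, _, values = line.partition(":")
        # float() also accepts "nan", which marks unknown samples.
        rows.append((int(stamp), [float(v) for v in values.split()]))
    return rows


def update_args(rows: List[Tuple[int, List[float]]],
                predict: Callable[[int], float]) -> List[str]:
    """Build one 'timestamp:value' argument per row for a separate trend RRD."""
    return [f"{ts}:{predict(ts):.4f}" for ts, _ in rows]


# Made-up sample in the fetch text layout (not real measurements).
SAMPLE = """\
                          ds0                 ds1

1070668800: 1.0000000000e+02 2.0000000000e+02
1070669100: nan nan
1070669400: 1.2000000000e+02 2.4000000000e+02
"""

rows = parse_fetch(SAMPLE)
args = update_args(rows, lambda ts: 5.0)  # stand-in for a real fitted formula
```

Each string in `args` would then be passed to `rrdtool update` against the trend RRD, in increasing timestamp order.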
Author:  stevesh [ Sun Dec 07, 2003 4:37 pm ] 
Post subject:  
It sounds as if what you are basically saying is to create a temporary rrd file so as to be able to use the RRD library. I hadn't thought of that... It would probably be easier to output the temporary data as XML and then load this into a tmp.rrd file, so as to create only the one required RRA. This would be a handy way of doing it, if we had the data. We could even use some extra datasets so as to be able to plot predicted data in a different colour.

However, we still have the primary problem of how to meaningfully "use bestfit to calculate a matching formula for the data from fetch." This is by no means simple: is the best fit a curve? A line? A repeating pattern? I could relatively simply generate a best-fit line, but this would be useless for most data sets. All we have to go on is a sample set of data, and we want to generate a formula.

Steve
Author:  cracraft [ Wed Dec 10, 2003 6:59 am ] 
Post subject:  new idea 
Just thought of a new idea...

For each data graph, you have two data points, A and B, that are graphed. Assume now that one of them is completely blank, say B. A is the real data. B is simply zero, and is being supplied as zero for each 5-minute interval. In such a case, have the CGI curve-fit the non-zero data of the A/B pair and replot the curve-fit as data point B. Do this with rrdtool update every 5 minutes for all data points in the rrd file that includes the B data. When your CGI script displays the A/B data, the effect will be real data and a curve-fit, for every plot!

As regards what type of formula to use, start out with a simple basic linear fit to get it going, and we can worry about polynomials and other levels later.

The router script can ensure that the B data is zero by checking that it is not an "error count" (i.e. one that has the string "err" anywhere in its cfg description or associated names) and then scanning the last N 5-minute entries to ensure they are zero.

Conversely, if you don't like that way of identifying it, require that the B datapoint always be some magic value, say 1, or 777, or something unlikely to turn up too often, and then replace all those occurrences with the value from the curve-fit for the given time quantum.

Stuart
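The magic-value variant can be sketched as a pure data transformation: wherever column B holds the agreed marker, substitute the fitted value for that timestamp. The row layout, the name `fill_trend` and the choice of 777 as marker are assumptions made for this example only.

```python
from typing import Callable, List, Tuple

Row = Tuple[int, float, float]  # (timestamp, A value, B value)


def fill_trend(rows: List[Row],
               predict: Callable[[int], float],
               sentinel: float = 777.0) -> List[Row]:
    """Replace B with the fitted value wherever B equals the sentinel."""
    return [(ts, a, predict(ts) if b == sentinel else b)
            for ts, a, b in rows]


# Hypothetical rows: the second one carries real B data and is left alone.
rows = [(0, 10.0, 777.0), (300, 12.0, 7.0), (600, 14.0, 777.0)]
filled = fill_trend(rows, lambda ts: ts / 100.0)
```

Two caveats, as far as I know: `rrdtool update` only accepts timestamps newer than the last update, so rewriting older B values in place would need a dump/restore or a separate RRD; and an exact match on the marker is only likely to work for values stored as-is (for example GAUGE data on aligned timestamps), since counters are stored as rates.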
Author:  stevesh [ Wed Dec 10, 2003 1:23 pm ] 
Post subject:  
Now that there is a workable way to use the RRD graphing libraries to show a predicted line, we still have the (major) problem of how to generate a 'best fit' line or curve, given a set of datapoints in an RRA. I have a possible way of doing it using decayed standard deviations, but it is computationally heavy and only generates a first-order curve (i.e., a straight line), which is not really appropriate in most cases. There is also the question of what decay parameters to use and how many datapoints to consider.

I will look into this (it is a very interesting challenge), but if anyone else wants to try it (as a routers.cgi*Extension script?) then please let me know so we can share notes.

Steve
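The thread does not spell out the decayed-standard-deviation method, so the following is not that algorithm. It is a different, standard technique shown only to illustrate what a decay parameter does: a least-squares line in which each older sample counts for a fixed fraction less than the one after it. The function name and the default `decay` value are arbitrary choices for this sketch.

```python
from typing import Sequence, Tuple


def weighted_linear_fit(points: Sequence[Tuple[float, float]],
                        decay: float = 0.99) -> Tuple[float, float]:
    """Least-squares line where older samples carry exponentially less weight.

    points must be in time order; the newest point gets weight 1, the one
    before it gets decay, then decay**2, and so on. decay=1 is a plain fit.
    """
    if not 0 < decay <= 1:
        raise ValueError("decay must be in (0, 1]")
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    weights = [decay ** (n - 1 - i) for i in range(n)]
    total = sum(weights)
    mean_t = sum(w * t for w, (t, _) in zip(weights, points)) / total
    mean_y = sum(w * y for w, (_, y) in zip(weights, points)) / total
    sxx = sum(w * (t - mean_t) ** 2 for w, (t, _) in zip(weights, points))
    sxy = sum(w * (t - mean_t) * (y - mean_y)
              for w, (t, y) in zip(weights, points))
    if sxx == 0:
        raise ValueError("all timestamps are identical")
    slope = sxy / sxx
    return slope, mean_y - slope * mean_t


# Hypothetical series: flat for ten samples, then rising by 1 per sample.
series = [(float(i), 0.0) for i in range(10)] + \
         [(float(i), float(i - 9)) for i in range(10, 20)]
plain_slope, _ = weighted_linear_fit(series, decay=1.0)
recent_slope, _ = weighted_linear_fit(series, decay=0.5)
```

With a strong decay the fitted slope follows the recent rise much more closely than the plain fit does, which shows why the choice of decay and of how many datapoints to consider matters.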
Author:  cracraft [ Wed Dec 10, 2003 1:40 pm ] 
Post subject:  trending 
I think that if you can get a reasonable approximation, even something halfway reasonable, the usefulness of MRTG/RRD/ROUTERS.CGI goes up exponentially. For example, half of the commercial competition (TeamQuest, Sysload, HP Performance Agent, etc.) just falls away, since trending is their major advantage, and all management worth their salt need trending to predict, build budgets and procure the iron. It would be super to have a strategic view (trending) in addition to the shorter-range, current, troubleshooting view.

Also, I don't see why a basic linear fit isn't a good one to start off with, just to prove the technology. The trend-builder would only rebuild the trend for any given RRD occasionally anyway (say once a day, or settable?), so it won't be a terrible load on the system. It could be made separate to keep it out of MRTG's way and not slow it down.

I'd suggest starting with just a basic linear fit, keeping the trend-builder separate from the MRTG processes, instituting a locking mechanism to prevent the two from trampling each other, trending in column B of any RRD file found to have N recent 5-minute column B entries all set to some standard number (1 for instance), etc. Then work on improved curve-fitting later.

If you start with curve-fitting first, you're starting with by far the hardest issue. The rest is an afternoon hack session. Picking the curve-fitting algorithm is a much longer process, best suited to the occasional kaizen satori rather than a heavy research approach. Also, once you announce a basic linear fit version, the mathematicians out there will be all over you with suggestions for various curve-fitting algorithms.

Stuart
Author:  stevesh [ Wed Dec 10, 2003 10:58 pm ] 
Post subject:  
Well, I've started to put together a prototype for a trending module that works using the routers.cgi*Extension[] interface. I think it should make a linear trend based on the yearly data (since a daily trend analysis is pretty useless). Maybe it should also highlight when/if the value reaches MaxBytes or 0?

So far, I've still to write the code to generate the graph (but I should be able to steal this from routers.cgi), the .cfg file parser (again stealable) and, of course, the trending code (although I have a nifty algorithm on paper). I'm trying to make it sufficiently modular that it will be easy to plug in different trending functions in the future.

This will be something for me to work on in the office while I man the phones over the Christmas break. Although maybe I should concentrate on getting v2.13 out!

Steve
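Once a linear trend is available, working out when it would reach MaxBytes or 0 is a matter of solving the line equation for time. The sketch below assumes the trend is given as a slope and intercept over timestamps; `limit_crossing` is a hypothetical helper, not part of the prototype mentioned in this post.

```python
from typing import Optional, Tuple


def limit_crossing(slope: float,
                   intercept: float,
                   max_value: float,
                   now: float) -> Optional[Tuple[str, float]]:
    """Return which limit the trend line hits next, and when.

    A rising line is checked against max_value, a falling line against 0.
    Returns None for a flat line or when the crossing is not in the future.
    """
    if slope == 0:
        return None
    if slope > 0:
        name, target = "MaxBytes", max_value
    else:
        name, target = "zero", 0.0
    # Solve slope * t + intercept = target for t.
    when = (target - intercept) / slope
    if when <= now:
        return None
    return name, when


rising = limit_crossing(slope=2.0, intercept=0.0, max_value=100.0, now=10.0)
falling = limit_crossing(slope=-1.0, intercept=30.0, max_value=100.0, now=10.0)
```

The returned timestamp could then be shown in the graph legend or used to draw a marker.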
Author:  stevesh [ Fri Dec 12, 2003 11:42 am ] 
Post subject:  
I now have a very early prototype version of this for trial. Anyone who wants a copy can email me, but it is a long way from general release yet...
All times are UTC + 12 hours [ DST ]