The primary reason to use a monitoring system is to inform your support teams when technology is not working the way it is expected to. But monitoring systems can bring much more value to an organization if you can find ways to reuse the data. Below are four possibilities.
1. Operations Excellence Initiatives. I am a very big proponent of synthetic monitoring solutions. And so was one of my previous CTOs, who not only sponsored an entire team to create an effective synthetic web monitoring solution but also used the stats from that monitoring system to determine the bonuses for the entire IT division.
Every day the stats from that tool were reported to IT management. Every hiccup of a website was investigated. And it seemed like every week I had at least one web development team asking me to help with an investigation (or to prove that my monitoring solution was accurate).
Lo and behold, the frequent outages that had plagued our web services for years disappeared. (And people got their bonuses.)
2. Inventory Management. When I took over Microsoft System Center Operations Manager (SCOM) at my last employer, the inventory of Windows servers maintained by the sys admins was completely out of date (and we were running thousands of servers). I worked with the sys admins to automate a report that showed the difference between what SCOM was monitoring and which servers in their inventory were flagged for monitoring (basically, only production servers). I then told them that a server would be monitored only if inventory showed it was flagged for monitoring. Thus alerts for non-production servers (which they loathed) and missed alerts (which they got dinged for by management) drove them every day to update the inventory. Within a year, their inventory was pretty much in sync with reality.
Of course, this process primarily kept the production server inventory up to date. However, by the time production inventory was straightened out, the sys admins had become so accustomed to updating inventory that they built inventory update tasks into their provisioning and decommissioning processes for all servers.
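The difference report described above boils down to comparing two sets of server names. Here is a minimal sketch in Python, not the report we actually built: the function name `reconcile` and the sample host names are made up for illustration, and it assumes both systems can export a plain list of host names.

```python
def reconcile(monitored, flagged_in_inventory):
    """Compare what the monitoring tool watches with what inventory says it should watch.

    Both arguments are iterables of host names (hypothetical exports).
    Names are normalized so that 'WEB01 ' and 'web01' count as the same server.
    """
    monitored = {name.strip().lower() for name in monitored}
    flagged = {name.strip().lower() for name in flagged_in_inventory}
    return {
        # Monitored but not flagged: likely noise alerts from non-production servers.
        "monitored_not_flagged": sorted(monitored - flagged),
        # Flagged but not monitored: likely missed alerts.
        "flagged_not_monitored": sorted(flagged - monitored),
    }


# Made-up sample data standing in for the two exports.
report = reconcile(
    monitored=["WEB01", "web02", "test07"],
    flagged_in_inventory=["web01", "web02", "db03"],
)
for category, hosts in report.items():
    print(category, hosts)
```

Each list in the result maps to one of the two pain points: the first to unwanted alerts, the second to missed ones.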
3. Data Model of Your IT Systems. Many monitoring tools provide some degree of topology mapping, which is essentially a data model of your IT systems. Network monitoring tools will tell you what your network looks like right now. Web Application Performance Management (APM) tools will tell you how your services are composed. This is extremely valuable information that can be used to supplement your CMDB (or to create your own if you like) or to help your event management systems perform event correlation and root cause isolation effectively.
Unfortunately, this topology data is typically isolated within the monitoring system that generates it. If you really want to make a data model of your IT systems, you are going to need to bring your data architecture skills and a lot of engineering ingenuity to extract or federate the data. But it may be worth it to avoid “all hands” calls, reduce mean time to repair (MTTR), and cut the large numbers of incident tickets generated during event storms.
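To make the idea of federating topology data a little more concrete, here is a rough sketch in Python. It assumes each tool can export its topology as simple (component, dependency) pairs, which real products may or may not support; the function names and sample components are hypothetical. It merges the exports into one dependency graph and then looks for what a group of alerting components have in common, which is a very simplified form of root cause isolation.

```python
from collections import defaultdict


def build_graph(*edge_lists):
    """Merge (component, dependency) pairs from several tools into one graph."""
    depends_on = defaultdict(set)
    for edges in edge_lists:
        for component, dependency in edges:
            depends_on[component].add(dependency)
    return depends_on


def upstream(graph, node):
    """Return everything that node depends on, directly or indirectly."""
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        for dependency in graph.get(current, ()):
            if dependency not in seen:
                seen.add(dependency)
                stack.append(dependency)
    return seen


def common_upstream(graph, alerting):
    """Components shared by the dependency chains of every alerting node."""
    chains = [upstream(graph, node) | {node} for node in alerting]
    return set.intersection(*chains) if chains else set()


# Hypothetical exports: one from a network tool, one from an APM tool.
network_edges = [("web01", "switch-a"), ("web02", "switch-a")]
apm_edges = [("checkout", "web01"), ("cart", "web02")]

graph = build_graph(network_edges, apm_edges)
print(common_upstream(graph, ["checkout", "cart"]))
```

In this made-up example, two alerting services share a single switch, so an event management system could raise one incident for the switch instead of one per service.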
4. Capacity Planning. Monitoring systems collect a lot of data for use in troubleshooting. Much of that same data can be fed into analytics tools, such as Excel and R, to perform capacity studies. These studies often involve correlating service counters (e.g., number of users, number of hits or queries) with system counters (e.g., CPU utilization, memory utilization), which may require extracting data from multiple monitoring systems. This extra elbow grease may be worth it if you consider the cost of reaching capacity during the next major marketing campaign, or if you can identify many under-utilized servers that can be repurposed.
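As an illustration of that kind of study, here is a minimal sketch in Python that fits a straight line between a service counter and a system counter and then estimates where the system runs out of headroom. The sample numbers are invented, the 80% CPU ceiling is an assumed threshold, and a linear relationship is itself an assumption that real data may not support.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def users_at_cpu_limit(slope, intercept, cpu_limit=80.0):
    """Estimate the user count at which CPU utilization reaches cpu_limit percent."""
    return (cpu_limit - intercept) / slope


# Invented samples: concurrent users (service counter) vs. CPU % (system counter).
users = [100, 200, 300, 400]
cpu_percent = [15, 25, 35, 45]

slope, intercept = fit_line(users, cpu_percent)
print(f"CPU rises about {slope:.2f} points per user")
print(f"Estimated users at 80% CPU: {users_at_cpu_limit(slope, intercept):.0f}")
```

The same fit, run per server, can also point out machines whose utilization stays far below the ceiling and that are candidates for repurposing.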
In conclusion, monitoring systems can provide more than just alerts and diagnostic data. That data can be reused for management performance scorecards, IT system data modeling, and capacity planning, and it can be leveraged to improve data elsewhere, such as in inventory systems and CMDBs.