A guide to some of the terms we use around monitoring
Basics of Monitoring in Opsview Cloud
When getting started with Opsview Cloud, you will often come across “Hosts” and “Services”. These are the most basic components of monitoring within Opsview Cloud.
A Host is a monitored system. In a physical environment, this would be a router, switch, server, etc. In a virtualized environment, hosts also include each virtual machine with a unique IP address. For example, if you had one server running ESXi that hosted 5 virtual machines, your total host count would be 6 (1 for the ESXi server and 5 for the virtual machines).
Opsview Cloud licensing is based upon the number of monitored hosts.
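The host count from the ESXi example above can be worked through in a few lines. This is only a sketch: the inventory structure and the `count_hosts` helper are hypothetical, and it assumes that every hypervisor and every virtual machine with its own IP address counts as one host.

```python
# Hypothetical inventory: one ESXi server hosting five virtual machines.
inventory = {
    "esxi-01": ["vm-01", "vm-02", "vm-03", "vm-04", "vm-05"],
}


def count_hosts(inventory):
    """Count each hypervisor plus each of its VMs as one monitored host."""
    return sum(1 + len(vms) for vms in inventory.values())


print(count_hosts(inventory))  # 6
```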
Services are the things we monitor on the hosts. These could be almost anything; common examples are CPU, memory, disk space or interface throughput. Within Opsview Cloud you can monitor as many services per host as you require without any additional licensing costs.
How to Monitor
There are a few different ways of monitoring hosts and services in Opsview Cloud: you can use the Opsview Agent, agentless checks or SNMP. Understanding the differences between these methods is important for finding the right checks and getting them set up.
The Opsview Cloud Agent runs directly on the monitored host and allows Opsview Cloud to run plugins (scripts) on the host and then pass the results back to the monitoring system. The advantage of the agent is that it has local access to many of the machine's resources and does not rely on exposed services or APIs.
The Opsview Cloud Agent is available for Windows and Linux systems. You can find more about the available agents in our Download area.
Agentless checks run plugins on Opsview Cloud itself, which then query the host remotely. These are normally designed for services that are available remotely, like applications with an API or HTTP interface, or technologies like Windows WMI.
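To make the idea of a plugin more concrete, here is a minimal sketch of a disk-space check of the kind an agent might run locally. It assumes the Nagios-style plugin convention (one line of output, with exit code 0 for OK, 1 for WARNING, 2 for CRITICAL and 3 for UNKNOWN), which this guide does not itself confirm. The function names, thresholds and message format are illustrative and are not taken from the Opsview Cloud Agent.

```python
import shutil

# Assumed Nagios-style exit codes; illustrative, not confirmed by this guide.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_disk(used_percent, warn=80.0, crit=90.0):
    """Map a disk usage percentage to an (exit_code, message) pair."""
    if used_percent >= crit:
        return CRITICAL, f"DISK CRITICAL - {used_percent:.1f}% used"
    if used_percent >= warn:
        return WARNING, f"DISK WARNING - {used_percent:.1f}% used"
    return OK, f"DISK OK - {used_percent:.1f}% used"


def run_check(path="/"):
    """Measure local disk usage for `path` and evaluate it."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        return UNKNOWN, f"DISK UNKNOWN - {exc}"
    return check_disk(100.0 * usage.used / usage.total)


code, message = run_check("/")
print(code, message)
```

A real plugin following this convention would print the message and exit with the code, for example via `sys.exit(code)`, so that the monitoring system can read both.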
SNMP is most common on network devices but is also available on some other systems. Opsview Cloud is able to query SNMP devices with SNMP polls and accept SNMP traps, where the device sends the SNMP data directly to Opsview Cloud.
Opsview Cloud comes with some pre-built SNMP Service Checks, and it can also quickly monitor many other SNMP-enabled devices: you provide a MIB file and then run an SNMP Walk to find all of the available data. You can find instructions on doing this here.
Information on setting up SNMP traps in Opsview Cloud can be found here.
Using Your Monitoring
Now that you are collecting lots of information about the status of your hosts and services, you will need ways to present and report on it easily and clearly. Opsview Cloud makes this straightforward with a few easy-to-understand tools and views.
Hashtags are a simple way of breaking up hosts and services into logical groups that mean something to you and your organization. They allow you to tag systems into any grouping you choose. For example, you could have Hashtags for locations, technologies, system responsibility or departments, but they can be for any grouping that makes sense within your environment.
How do Hashtags work?
Hashtags are made up of the intersection of the hosts and services that you select. Put another way, you select the combination of hosts and services you require for a Hashtag.
For example, if I was setting up some Hashtags based upon location, to set up a “London” Hashtag I would select all of the hosts in my London office and then select all services for these hosts because all of those hosts and services are in London.
If I was setting up a “Database” Hashtag as part of a group of Technology Hashtags, I would select all of the hosts that include a database system and then select all of the database-related services. This means that even if these hosts include Operating System service checks alongside the database ones, the Hashtag will include only the database information.
If I was interested in the CPU of all my hosts, I could also set up a Hashtag that selects just the CPU service checks and includes all hosts.
Hashtags require a bit of understanding of how they break up hosts and services, but once you understand them, they are a very quick, easy and powerful way to break up your monitoring information into easy-to-use data and visualizations.
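The selection logic described above can be sketched as follows. This is an illustrative model only: the `Hashtag` class, the host names and the service names are hypothetical, and the sketch does not reflect how Opsview Cloud stores Hashtags internally.

```python
from dataclasses import dataclass

# Hypothetical monitored objects, as (host, service) pairs.
monitored = [
    ("lon-db-01", "MySQL Connections"),
    ("lon-db-01", "CPU"),
    ("lon-web-01", "HTTP"),
    ("lon-web-01", "CPU"),
    ("nyc-db-01", "MySQL Connections"),
    ("nyc-db-01", "CPU"),
]


@dataclass(frozen=True)
class Hashtag:
    name: str
    hosts: frozenset      # hosts selected for this Hashtag
    services: frozenset   # services selected for this Hashtag

    def members(self, monitored):
        """Return the pairs whose host AND service were both selected."""
        return [(h, s) for h, s in monitored
                if h in self.hosts and s in self.services]


all_hosts = frozenset(h for h, _ in monitored)
all_services = frozenset(s for _, s in monitored)

# "London": the London hosts, with all of their services.
london = Hashtag("London", frozenset({"lon-db-01", "lon-web-01"}), all_services)
# "Database": the database hosts, but only the database-related services.
database = Hashtag("Database", frozenset({"lon-db-01", "nyc-db-01"}),
                   frozenset({"MySQL Connections"}))
# "CPU": every host, but only the CPU service check.
cpu = Hashtag("CPU", all_hosts, frozenset({"CPU"}))

print(database.members(monitored))
# [('lon-db-01', 'MySQL Connections'), ('nyc-db-01', 'MySQL Connections')]
```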
What can I use Hashtags for?
Hashtags have their own view which lets you see exactly what is going on in your custom Hashtag groupings; any outage on any Hashtag host or service will show that Hashtag as down. Hashtags also appear in a few different dashlets in Opsview Cloud dashboards.
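The roll-up rule described above (one failing member marks the whole Hashtag as down) can be sketched like this. The function name and the boolean up/down model are assumptions made for illustration; real monitoring states are typically richer than a simple boolean.

```python
def hashtag_is_up(member_up):
    """A Hashtag is up only if every (host, service) member is up."""
    return all(member_up.values())


# Hypothetical member states for a "London" Hashtag.
london_states = {
    ("lon-db-01", "CPU"): True,
    ("lon-web-01", "HTTP"): False,  # a single outage...
}

print(hashtag_is_up(london_states))  # False: ...marks the whole Hashtag as down
```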
Hashtags are also very useful for alerts; more on that later.
Many of Opsview Cloud's reports are based upon the information provided by Hashtags. There is more information about that in the Reporting section of this guide.
Business Service Monitoring (BSM)
Business Service Monitoring (BSM) functionality allows you to create groupings of hosts and services, a bit like Hashtags, but it goes further by allowing you to define Business Service dependencies and redundancies. Within Business Services you can define Components, which are sets of related service checks, and then set the Operational Zone for each one. If you have a clustered pair of Exchange servers, you can add them both to a Component and set a 50% Operational Zone, meaning that only one of the two needs to be up (available) at any one time for this part of the Business Service to be operational. You can then group multiple Components into a Business Service.
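The Operational Zone logic can be sketched as below. It assumes a Component is operational when the percentage of its members that are up is at least the Operational Zone, which matches the Exchange example above. The function and host names are hypothetical, and this is not Opsview Cloud's actual implementation.

```python
def component_operational(member_up, operational_zone_percent):
    """True when the share of members that are up meets the Operational Zone."""
    if not member_up:
        return False
    up_percent = 100.0 * sum(member_up.values()) / len(member_up)
    return up_percent >= operational_zone_percent


# Clustered pair of Exchange servers with one member down.
exchange = {"exchange-01": True, "exchange-02": False}

print(component_operational(exchange, 50))   # True: one of two is enough
print(component_operational(exchange, 100))  # False: both would be required
```

Contrast this with a Hashtag, where a single failing host or service is enough to show the whole Hashtag as down.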
What can I use Business Services for?
Business services allow you to easily see the overall status of your high-level services. This gives a much broader view of your overall service status than just viewing individual host status. You can see the status of Business Services in the dedicated BSM view or using dashlets within the dashboards.
Just like Hashtags, Business Services can be used in the reporting module to generate reports. Because Business Services understand the system dependencies and relationships, they can generate real SLA reports, with no need for any manual calculations.
Reporting
The Opsview Reporting Module allows you to generate a variety of reports. It is an optional extra in the SMB plan but is included in the Enterprise Plans.
When generating reports, the data comes from the ODW (Opsview Data Warehouse) database. There are settings within Opsview Cloud that allow you to configure what data is stored in the ODW. You can find the ODW settings from the main menu under Configuration > My System, then go to the ODW tab.
The “Enable ODW Import” option controls the import of basic monitoring information, such as state changes and summarized statistics. These are used in the availability and event reports, and use a relatively small amount of storage space because the information is summarized and aggregated.
The “Enable extended import” option allows you to store all performance data returned by plugins, which is used by performance and trend reports. This greatly increases the storage required for the database, as it will store all service check results and performance data points for the specified retention period.
It is worth noting that data is stored in the ODW only from the point at which the ODW settings are configured and the Business Service or Hashtag is created. So if you create a Hashtag today, you will be able to run a daily report starting from tomorrow.
Notifications are an important way to ensure that users are alerted to events within the monitoring system. Opsview Cloud includes many different notification methods, from email to service desk connectors. You can find details of them here.
The most notable and useful scalability feature of Opsview Cloud is Distributed Monitoring using additional Collectors.
Collectors can be located in a different network or data center; they use just a single channel of encrypted communication. This reduces the amount of configuration required to monitor hosts within another network or data center, as you do not have to open up multiple firewall rules, and it also improves security by reducing complexity.
Network outages between data centers can be a common issue, and monitoring across these links can be unreliable if the network is unreliable or heavily loaded. Collectors solve these problems: they are able to buffer data during network drops, and the checks run locally, so they are not slowed down by anything external. Notifications can also be sent directly from cluster systems, so that even if the Opsview Cloud Instances cannot be contacted, your users will still be notified of detected problems.
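The buffering behaviour can be pictured with the following sketch. It is a conceptual model only, not Opsview Cloud code: the `Collector` class, its methods, the buffer size and the `send` callback are all hypothetical.

```python
from collections import deque


class Collector:
    """Conceptual collector that queues check results while the link is down."""

    def __init__(self, send, max_buffer=1000):
        self._send = send  # callable that raises ConnectionError on failure
        self._buffer = deque(maxlen=max_buffer)  # oldest results drop when full

    def submit(self, result):
        """Queue a locally produced check result, then try to forward it."""
        self._buffer.append(result)
        return self.flush()

    def flush(self):
        """Forward buffered results in order; stop at the first failure."""
        while self._buffer:
            try:
                self._send(self._buffer[0])
            except ConnectionError:
                return False
            self._buffer.popleft()
        return True

    @property
    def pending(self):
        return len(self._buffer)


# Simulated link to the central system (hypothetical).
link = {"up": False}
received = []


def send(result):
    if not link["up"]:
        raise ConnectionError("link down")
    received.append(result)


collector = Collector(send)
collector.submit("cpu ok")      # link is down: result is buffered
collector.submit("disk ok")
pending_during_outage = collector.pending

link["up"] = True               # link restored
collector.flush()
print(pending_during_outage, collector.pending, received)
# 2 0 ['cpu ok', 'disk ok']
```

Checks keep running and their results keep their order during the outage; nothing is forwarded until the link returns.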
Opsview Cloud Collectors fully support clustering, allowing you to deploy two or more Collector systems in a group which can cope with the failure of one of its members without interrupting monitoring.
Updated about 3 years ago