Saturday, May 16, 2015

Distributed Monitoring with Proxies - Part Two

In part one of this series we went through the process of installation, setup, and initial communication of a Zabbix Proxy.  Overall, it was a pretty easy process to get going, but simply HAVING a proxy doesn't accomplish anything.  We need to configure the Zabbix server to use the proxy to do monitoring before it is of any use.

Assigning hosts to proxies

There are several ways to assign hosts to a proxy.

Single Host Configuration 

First, let's take a look at assigning a single host to a proxy.
  1. In the Zabbix web UI go to "Configuration"->"Hosts".
  2. Click on the name of a host in the "Name" column that should be monitored by a proxy.
  3. Notice the "Monitored by proxy" drop-down at the bottom of the host configuration.  By default, it is set to "(no proxy)".  In the drop-down select the name of the proxy.
  4. Click the "Update" button.

Mass-Update Host Configuration

Alright, here's a way to configure multiple hosts at the same time to be monitored by the same proxy.
  1. In the Zabbix web UI go to "Configuration"->"Hosts".
  2. Check the check-boxes next to all the hosts that should be monitored by a proxy.
  3. In the drop-down on the bottom-left corner of the host list, select the "Mass update" option and then click the "Go" button next to the drop-down.
  4. In the new page that comes up make sure the "Host" tab is on (this is the default).
  5. Select the "Monitored by proxy" check-box and then choose the appropriate proxy name in the drop-down box that appears.
  6. Click the "Update" button.

Multi-Host Configuration through Proxy Configuration

And.....surprise!  Here is yet another way to configure multiple hosts to be monitored by a proxy at the same time.
  1. Go to "Administration"->"Proxies".
    1. In Zabbix 2.0 and Zabbix 2.2 this is actually "Administration"->"DM".  Make sure that the drop-down in the upper-right corner of the page (far right side of the "CONFIGURATION OF PROXIES" bar) has "Proxies" selected.
  2. Click on the name of the proxy that should monitor other hosts.
  3. Notice that there are two "Hosts" boxes on this page.
    1. The "Proxy hosts" box contains the hosts that are being monitored by this proxy.
    2. The "Other hosts" box contains the hosts that are NOT being monitored by this proxy.
  4. Select the hosts in the "Other hosts" box that should be monitored by this proxy and then click the "<<" button.  That will move the hosts into the "Proxy hosts" box.
  5. Click the "Update" button.
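
If clicking through the UI gets old, the same assignment can probably be scripted against the Zabbix JSON-RPC API.  What follows is only a sketch, assuming the 2.4-era "host.massupdate" method and the "proxy_hostid" host property; the helper name, the host/proxy IDs, the auth token, and the URL in the usage comment are all made up for illustration.

```shell
# Hypothetical helper: build a host.massupdate request body that assigns
# the given host IDs to one proxy. All IDs and the token are placeholders.
massupdate_payload() {
    token=$1
    proxy_id=$2
    shift 2
    hosts=""
    for id in "$@"; do
        # Join the host objects with commas.
        hosts="${hosts}${hosts:+,}{\"hostid\":\"${id}\"}"
    done
    printf '{"jsonrpc":"2.0","method":"host.massupdate","params":{"hosts":[%s],"proxy_hostid":"%s"},"auth":"%s","id":1}\n' \
        "$hosts" "$proxy_id" "$token"
}

# Possible usage (URL is an assumption, adjust to your frontend):
# curl -s -H 'Content-Type: application/json-rpc' \
#      -d "$(massupdate_payload "$TOKEN" 10105 10084 10085)" \
#      http://zabbix.example.com/zabbix/api_jsonrpc.php
```

Building the payload separately from the curl call makes it easy to eyeball the request before anything is changed on the server.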

Agent Configuration

Ta-da!  Zabbix is now configured to monitor the selected hosts through the proxy.  Oh wait....  Remember that agents have a parameter called "Server" in the zabbix_agentd.conf file?  Let's take a look at that parameter.

Configuration Options

### Option: Server
#       List of comma delimited IP addresses (or hostnames) of Zabbix servers.
#       Incoming connections will be accepted only from the hosts listed here.
#       If IPv6 support is enabled then '127.0.0.1', '::127.0.0.1', '::ffff:127.0.0.1' are treated equally.
#
# Mandatory: no
# Default:
# Server=

Server=my.server.fqdn.or.ip
Notice the comment that says "List of comma delimited IP addresses (or hostnames) of Zabbix servers"? That comment really should say something like "List of comma delimited IP addresses (or hostnames) of Zabbix servers/proxies". Notice I added "proxies" at the very end. If the host for this agent is configured to have any passive items, then the "Server" parameter MUST have the Proxy in it that is configured to monitor the agent!

Another parameter to keep in mind is the "ServerActive" parameter. Let's take a look:
### Option: ServerActive
#       List of comma delimited IP:port (or hostname:port) pairs of Zabbix servers for active checks.
#       If port is not specified, default port is used.
#       IPv6 addresses must be enclosed in square brackets if port for that host is specified.
#       If port is not specified, square brackets for IPv6 addresses are optional.
#       If this parameter is not specified, active checks are disabled.
#       Example: ServerActive=127.0.0.1:20051,zabbix.domain,[::1]:30051,::1,[12fc::1]
#
# Mandatory: no
# Default:
# ServerActive=

ServerActive=my.server.fqdn.or.ip
Just like the "Server" parameter discussed earlier, the "ServerActive" parameter has a comment that says this: "List of comma delimited IP:port (or hostname:port) pairs of Zabbix servers for active checks". It should say "pairs of Zabbix servers/proxies for active checks". Notice that I added "proxies" there. If the host for this agent is configured to have any active items, then the "ServerActive" parameter MUST have the proxy listed here!  Active items will not work if this parameter does not point to the proxy that monitors the agent.

Big massive caveat: Unless this agent is monitored in multiple Zabbix environments (meaning, multiple Zabbix servers get data from this agent), you do NOT want to have multiple entries in the "ServerActive" parameter. You will only want the IP or FQDN for the proxy that is configured to monitor this agent. If you don't understand what I'm talking about, just believe me and only put the IP or FQDN of the proxy monitoring this agent in the "ServerActive" parameter.

After having made any necessary configuration changes to the agent, go ahead and restart it.
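
For anyone with more than a handful of agents to repoint, here is a minimal sketch of how the two edits above could be scripted.  The function name is hypothetical, and it assumes the file already has active (uncommented) "Server=" and "ServerActive=" lines like the ones shown above; it will not add them if they are missing, and it does not restart the agent.

```shell
# Hypothetical helper: point Server and ServerActive at a single proxy.
# Usage: point_agent_at_proxy /etc/zabbix/zabbix_agentd.conf 192.168.56.102
point_agent_at_proxy() {
    conf=$1
    proxy=$2
    tmp=$(mktemp) || return 1
    # Rewrite only the active lines; commented documentation lines such as
    # "# Server=" start with "#" and are left untouched.
    sed -e "s|^Server=.*|Server=${proxy}|" \
        -e "s|^ServerActive=.*|ServerActive=${proxy}|" \
        "$conf" > "$tmp" || { rm -f "$tmp"; return 1; }
    # Copy the content back so the file keeps its owner and permissions.
    cat "$tmp" > "$conf"
    rm -f "$tmp"
}
```

Note that this replaces whatever was in both parameters with the one proxy address, which matches the caveat above about keeping a single entry in "ServerActive".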

Ready to rock and roll....or so we thought

Ahhh, now the Zabbix server is configured to receive/pull data from a proxy, hosts are configured to be monitored by a proxy, and the agent has been restarted.  Funny enough, we're not done yet.  If the agent is configured to do active checks (meaning that the "ServerActive" parameter isn't blank), then a line like this will show up in the agent log (may be slightly different for Zabbix 2.0/2.2 agents):
2133:20150516:205905.015 no active checks on server [192.168.56.102:10051]: host [test host] not found
Wait, what?!  Note that the "no active checks on server" sentence can be translated to be "no active checks on proxy".  Zabbix uses the words server/proxy interchangeably in some places.  Feel free to double-check, but the host was configured in the Zabbix UI to be monitored by the proxy and the agent config is pointing to the proxy.  Even after that the proxy claims that it knows nothing about the host.  To understand what's going on, it's necessary to learn about something called the "Configuration cache".

If a passive proxy is being used, there will be an utter lack of item data from this particular agent.  That's the only way to notice this particular problem.
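
To spot the active-check version of this problem without scrolling through the log by hand, something like the following could work.  It is a sketch that assumes the 2.4-style message format shown above; the function name is made up, and the log path in the usage comment is only the usual package location, so check your own "LogFile" setting.

```shell
# Hypothetical helper: print each host name that the server/proxy reported
# as "not found" in an agent log, one per line, without duplicates.
hosts_not_found() {
    sed -n 's/.*no active checks on server \[[^]]*]: host \[\(.*\)] not found.*/\1/p' "$1" | sort -u
}

# Possible usage (path is an assumption):
# hosts_not_found /var/log/zabbix/zabbix_agentd.log
```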

The Configuration Cache

For performance reasons, both the Zabbix server and all proxies have something called a "configuration cache".  This is an internal cache in which Zabbix stores all information about hosts/items/triggers/etc. that are configured.  When talking about the Zabbix server itself, it updates its configuration cache every 60 seconds by default.  This means that every minute it will read the Zabbix database and find all changes that have been made by someone in the UI and then store them internally.  Only after the configuration cache has updated itself will the Zabbix server become aware of any changes that have been made in the UI.

For Zabbix proxies, the idea is similar to the Zabbix server configuration cache, except with proxies, the default update interval is every 3600 seconds.  Yes, one hour.  Let's discuss what this means a bit more in detail.


The Configuration Cache Waiting Game

Let's use the example of the agent we recently configured to be monitored by a proxy.  First, we updated the host in the Zabbix UI to indicate it should be monitored by a proxy.  Second, we updated the zabbix_agentd.conf file to allow the agent to communicate with the proxy.  Now here is what happened:
  • The change in the Zabbix UI updated the Zabbix database to show the host should be monitored by a given proxy.
  • Within one minute of that change (assuming the server default was never changed) the Zabbix server updated its configuration cache by re-reading the database.  It found the configuration change and updated its internal cache.
  • While that went on, the zabbix_agentd.conf file was updated and the agent was restarted.
  • In the case of an active agent (meaning the "ServerActive" parameter is not blank), the agent attempted to get its list of items from the Proxy, but the proxy had not yet updated its configuration cache.
  • The proxy responded to the agent that it had no information about the agent and therefore we saw the log entry mentioned earlier.
If we were willing to wait up to one hour, the proxy would update its configuration cache and then the agent would be able to successfully pull its item list from the Proxy.  I'm not that patient.

Changing Configuration Cache Intervals

There is a place to change the frequency at which the proxy updates its configuration cache.  The location of that configuration option depends on which type of proxy was created: active or passive.

If one stops to think about it for a moment, that location is easy to figure out.  Remember that active proxies initiate communication with the server, so on an active proxy the configuration option is in zabbix_proxy.conf.  On the other hand, passive proxies wait for the Zabbix server to do all the talking.  With a passive proxy, the configuration option is in zabbix_server.conf on the Zabbix server.

Active Proxy Configuration Cache Interval

Open up the zabbix_proxy.conf file in your favorite editor and look for the "ConfigFrequency" parameter.
### Option: ConfigFrequency
#       How often proxy retrieves configuration data from Zabbix Server in seconds.
#       For a proxy in the passive mode this parameter will be ignored.
#
# Mandatory: no
# Range: 1-3600*24*7
# Default:
# ConfigFrequency=3600
Uncomment the configuration option and set it to something more sane.  In the real world this should probably be something like five minutes or more (depending on the size of the Zabbix environment).  For the sake of testing, set the "ConfigFrequency" option to "60".  This means that the Zabbix proxy will pull its configuration from the Zabbix server every minute and will then update its configuration cache.

Restart the Zabbix proxy after making this change.

Passive Proxy Configuration Cache Interval

Open up the zabbix_server.conf file in your favorite editor and look for the "ProxyConfigFrequency" parameter.
### Option: ProxyConfigFrequency
#       How often Zabbix Server sends configuration data to a Zabbix Proxy in seconds.
#       This parameter is used only for proxies in the passive mode.
#
# Mandatory: no
# Range: 1-3600*24*7
# Default:
# ProxyConfigFrequency=3600
Uncomment the configuration option and set it to something more sane.  In the real world this should probably be something like five minutes or more (depending on the size of the Zabbix environment).  For the sake of testing, set the "ProxyConfigFrequency" option to "60".  This means that the Zabbix server will push the configuration information to the proxy every minute and the proxy will then update its configuration cache.

Restart the Zabbix server after making this change.
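
Both edits ("ConfigFrequency" on an active proxy, "ProxyConfigFrequency" on the server) follow the same pattern, so here is a rough sketch of a helper that could apply either one.  The function name is hypothetical.  Rather than uncommenting the documented default, it leaves the comments alone and appends an active line if none exists, which is simply one way to do it.

```shell
# Hypothetical helper: set KEY=VALUE in a Zabbix-style config file.
# Usage: set_conf_option /etc/zabbix/zabbix_proxy.conf ConfigFrequency 60
set_conf_option() {
    conf=$1
    key=$2
    value=$3
    tmp=$(mktemp) || return 1
    if grep -q "^${key}=" "$conf"; then
        # An active line already exists: replace its value in place.
        sed "s|^${key}=.*|${key}=${value}|" "$conf" > "$tmp"
    else
        # Only commented defaults (or nothing) exist: append an active line.
        cat "$conf" > "$tmp"
        printf '%s=%s\n' "$key" "$value" >> "$tmp"
    fi
    # Copy the content back so the file keeps its owner and permissions.
    cat "$tmp" > "$conf"
    rm -f "$tmp"
}

# Possible usage:
# set_conf_option /etc/zabbix/zabbix_proxy.conf  ConfigFrequency      60   # active proxy
# set_conf_option /etc/zabbix/zabbix_server.conf ProxyConfigFrequency 60   # passive proxy
```

The relevant service still has to be restarted afterwards, exactly as described above.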

Ready to rock and roll, for real this time

Now that the proxy will update its configuration cache at a much faster interval, testing with a proxy will be much, much easier.  If the agent being monitored is using active items, go ahead and restart the Zabbix agent.  Assuming that it has been at least one minute since the active/passive proxy/server service restart, the agent will no longer show any message like the log entry we saw before.  In fact, the proxy should be happily monitoring active and passive items on this agent now without any problem.

Forcing Configuration Cache Updates

The Zabbix server and Active Zabbix proxies can be forced to update their configuration cache, regardless of the configured interval.  Here are the relevant commands for both a proxy and a server:
zabbix_server -R config_cache_reload
zabbix_proxy -R config_cache_reload
This can be used when changes need to take effect sooner rather than later.  Just be sure to update the server's configuration cache BEFORE the proxy's.  This is because the server gives the proxy its configuration based on what the server has in its configuration cache.

There is no way to force a configuration cache update for a passive proxy.
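
Since the order matters, the two commands could be wrapped up so the server always goes first.  This is only a sketch under a few assumptions: it runs on the Zabbix server, the active proxy is reachable over SSH, and a short pause is enough for the server to finish re-reading the database.  The function name, the proxy host name, and the 10-second default are all invented for the example.

```shell
# Hypothetical helper: reload the server's configuration cache first,
# then the active proxy's.
# Usage: reload_config_caches proxy.example.com [wait_seconds]
reload_config_caches() {
    proxy_host=$1
    wait_seconds=${2:-10}   # arbitrary pause; an assumption, not a Zabbix default
    zabbix_server -R config_cache_reload || return 1
    # Give the server a moment to rebuild its cache before the proxy asks for it.
    sleep "$wait_seconds"
    ssh "$proxy_host" zabbix_proxy -R config_cache_reload
}
```

On a large environment the server may need longer than the pause used here, so treat the wait as something to tune rather than a known-good value.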

Tidying Up

At this point everything should be just fine with agents being monitored by a proxy.  There are just a few things to keep in mind after this exercise:
  • Configuration changes made in the Zabbix UI are NOT immediately seen.  The Zabbix server must first update its configuration cache before it knows about them.
  • By default, the Zabbix server updates its configuration cache every 60 seconds.
  • By default, Zabbix proxies update their configuration cache every 3600 seconds.
  • When an active proxy queries the Zabbix server for its configuration, that configuration is pulled from the Zabbix server's configuration cache.  This means that proxies will only know about what the Zabbix server has read from its own database!
  • Due to caching, monitoring changes can take quite some time to take effect.  In fact, the maximum amount of time it takes is equal to: (Zabbix server update interval) + (Zabbix proxy update interval).  In the case of active agents, the value for the "RefreshActiveChecks" parameter in zabbix_agentd.conf also needs to be added in.
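
To put a number on that last point, here is the arithmetic as a tiny sketch.  The function name is made up; the 60 and 3600 are the defaults discussed above, and the 120 assumes "RefreshActiveChecks" was left at its default in zabbix_agentd.conf.

```shell
# Hypothetical helper: worst-case seconds before a UI change takes effect.
# Arguments: server interval, proxy interval, optional RefreshActiveChecks.
max_config_delay() {
    echo $(( $1 + $2 + ${3:-0} ))
}

max_config_delay 60 3600 120   # defaults, active agent: prints 3780
max_config_delay 60 60 120     # proxy interval lowered to 60: prints 240
```

In other words, with everything at its default an active item can take a little over an hour to start collecting, and lowering the proxy interval to 60 seconds brings that down to about four minutes.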

Thursday, April 30, 2015

Distributed Monitoring with Proxies - Part One


In the interest of getting this tutorial published more quickly, I've decided to split it into multiple parts.  This part (part one) covers the basic installation of proxies and how to get them to communicate with the Zabbix Server.  Part two covers agent configuration and additional configuration for proxies.

While it is not the intent of this article to fully explain the ins and outs of Zabbix Proxies, here's a brief overview of them.

Proxies provide the ability to distribute monitoring.  There are many use cases for Proxies, some of which include:
  1. Remote environments where it would be impossible to give all the monitored devices access to the Zabbix Server.
  2. Environments with strict security requirements.
  3. Large environments.
There are two types of Zabbix Proxies: active and passive.

An active Proxy periodically connects to the Zabbix Server to request its configuration (the timing is configurable).  It will then collect data and then send it on to the Zabbix Server every so often (also configurable).

A passive Proxy on the other hand does not make any connections to the Zabbix Server on its own.  The Zabbix Server will periodically send the proxy its configuration (timing is configurable).  The Zabbix Server will also pull data from the Proxy at a configured interval.

For a little more description about Proxies and how they work, take a look at the documentation.

Based on how proxies are used, it makes little to no sense to install one on a Zabbix Server.  This tutorial is based on installing it on a separate device.  I make the assumption that you are already SSH'd into the server that will become a Zabbix Proxy.

Now onto the tutorial!

Installation

If you really want to install Zabbix from source, be my guest, but I assume you don't want to.  The commands to install the Zabbix Proxy really depend on what flavor of Linux you're running and are pulled directly from the documentation.  If you happen to be running Debian/Ubuntu or some other distro, take a look at the Zabbix documentation for specific instructions for your case.

RedHat 6/CentOS 6/<Derivatives>

As you probably have guessed, you'll add the Zabbix Yum Repository to your server and then use Yum to install the proxy.  Run these commands to do that:
rpm -ivh http://repo.zabbix.com/zabbix/2.4/rhel/6/x86_64/zabbix-release-2.4-1.el6.noarch.rpm
yum install zabbix-proxy-sqlite3

Done!  You may have noticed that we explicitly installed the SQLite3 proxy binary.  That's because in the vast majority of cases the SQLite Proxy database will suffice.  Unless you are pushing a LOT of data through a Proxy (I believe the technical term would be "crap-ton") the SQLite database will work just fine and requires less maintenance/configuration than using MySQL or PostgreSQL.

Configuration File

While it's nice to get the Proxy installed so quickly, it does require some specific configuration.  If you installed from Linux packages (i.e. RPM/deb) then your configuration files will be in /etc/zabbix.  The Proxy configuration file is most likely located at /etc/zabbix/zabbix_proxy.conf.  Go ahead and open up that file in your favorite text editor.

First we'll take a look at two options that need to be configured on all proxies.

### Option: ProxyMode
#       Proxy operating mode
#       0 - proxy in the active mode
#       1 - proxy in the passive mode
#
# Mandatory: no
# Default:
# ProxyMode=0
The "ProxyMode" option defines what type of proxy this is.  By default proxies are set to be "active".  If you want a "passive" proxy, go ahead and uncomment this option and set the value to "1".

### Option: DBName 
#       Database name. 
#       For SQLite3 path to database file must be provided. DBUser and DBPassword are ignored. 
#       Warning: do not attempt to use the same database Zabbix server is using. 
#
# Mandatory: yes
# Default: 
# DBName=

DBName=zabbix
Make sure you read the comment for the "DBName" parameter.  Notice that for SQLite (which we're using) this parameter is the absolute path to the SQLite database the proxy will use.  Something neat about SQLite proxies is that the proxy will automatically create the database if it doesn't exist.

By default the configuration file is not set up with a path in which to write the SQLite database.  The "DBName" option should point to a valid location.  A location that will most likely work on a RedHat/CentOS box is "/var/lib/zabbix/proxy_db.sqlite3".

Caveat: Make sure that SELinux and friends do not prevent the proxy from writing to whatever location you give it!  If the proxy has issues creating the database, you will see errors like the last one shown here in the proxy log file:
7063:20150324:173741.311 cannot open database file "/var/lib/zabbix/proxy_db.sqlite3": [2] No such file or directory
7063:20150324:173741.312 creating database ...
7063:20150324:173741.312 [Z3002] cannot create database '/var/lib/zabbix/proxy_db.sqlite3': [0] unable to open database file

Of course, normal Unix permissions apply here too, so it could be simply that the user the proxy runs as ("zabbix" by default) does not have rights to write files in whatever path you gave.
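
For the plain Unix permissions side of this, a minimal sketch of preparing the directory might look like the following.  The function name is hypothetical, the 750 mode is just one reasonable choice, and this does nothing about SELinux, which still has to be checked separately.

```shell
# Hypothetical helper: make sure the directory that will hold the SQLite
# database exists and is writable by the user the proxy runs as.
# Usage (as root): prepare_proxy_db_dir /var/lib/zabbix zabbix
prepare_proxy_db_dir() {
    dir=$1
    owner=${2:-zabbix}
    mkdir -p "$dir" || return 1
    chown "$owner" "$dir" || return 1
    # Owner gets full access; group may read and enter; others get nothing.
    chmod 750 "$dir"
}
```

The proxy only needs the directory to be writable; it creates the database file itself on first start, as mentioned above.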

The next important configuration options really depend on the type of Proxy that you wish to set up.  As such, the following two sections cover the relevant options for each type of proxy.

Active Proxy

### Option: Hostname 
#       Unique, case sensitive Proxy name. Make sure the Proxy name is known to the server! 
#       Value is acquired from HostnameItem if undefined.
#
# Mandatory: no
# Default: 
# Hostname= 

Hostname=Zabbix proxy
The importance of the "Hostname" option cannot be overstated.  When an active proxy talks to the Zabbix Server, it will pass this value to the server to identify which proxy is talking.  The value here MUST match exactly with another one that will be discussed later.  Suffice it to say, just make sure you remember what you put here.

### Option: Server 
#       IP address (or hostname) of Zabbix server. 
#       Active proxy will get configuration data from the server. 
#       For a proxy in the passive mode this parameter will be ignored. 
#
# Mandatory: yes (if ProxyMode is set to 0)
# Default:
# Server=

Server=127.0.0.1
The "Server" option indicates the address (IP/FQDN) of the Zabbix Server.  The proxy uses this option to know from where to retrieve its configuration.
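
Before starting an active proxy, it can be handy to confirm that the three options set in this walkthrough ("DBName", "Server", and "Hostname") actually have values.  The check below is a rough sketch with a made-up function name.  It is stricter than Zabbix itself, since "Hostname" can fall back to "HostnameItem", and it does not apply to passive proxies, which ignore "Server".

```shell
# Hypothetical helper: report which of the options used in this walkthrough
# have no active, non-empty line in an active proxy's config file.
# Returns 0 if all three are set, 1 otherwise.
check_proxy_conf() {
    conf=$1
    missing=0
    for key in DBName Server Hostname; do
        # "^KEY=." requires at least one character after the equals sign.
        if ! grep -q "^${key}=." "$conf"; then
            echo "missing or empty: ${key}"
            missing=1
        fi
    done
    return "$missing"
}

# Possible usage:
# check_proxy_conf /etc/zabbix/zabbix_proxy.conf && echo "looks complete"
```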

Passive Proxy 

In the case of a passive proxy, there isn't anything additional to set up on the proxy itself.  You're done!


Starting the service

Go ahead and start up the proxy.  Assuming RPM/Deb packages were used, run this command:
service zabbix-proxy restart

We do a restart here for consistency's sake.  This will make sure none of you somehow had some random proxy processes already running that cause issues :).

Take a look at the proxy log file.  It probably exists at /var/log/zabbix/zabbix_proxy.log.  Note that this output is from Zabbix 2.4.x!  Depending on your version of Zabbix, the output could be different.  You should see output like this:
1588:20150326:094411.616 Starting Zabbix Proxy (active) [Zabbix proxy]. Zabbix 2.4.4 (revision 52341). 
1588:20150326:094411.617 **** Enabled features **** 
1588:20150326:094411.617 SNMP monitoring:       YES 
1588:20150326:094411.617 IPMI monitoring:       YES
1588:20150326:094411.617 WEB monitoring:        YES 
1588:20150326:094411.617 VMware monitoring:     YES 
1588:20150326:094411.617 ODBC:                  YES 
1588:20150326:094411.617 SSH2 support:          YES 
1588:20150326:094411.617 IPv6 support:          YES
1588:20150326:094411.617 ************************** 
1588:20150326:094411.617 using configuration file: /etc/zabbix/zabbix_proxy.conf 
1588:20150326:094411.634 current database version (mandatory/optional): 02040000/02040000 
1588:20150326:094411.634 required mandatory version: 02040000
1588:20150326:094411.644 proxy #0 started [main process]
1590:20150326:094411.645 proxy #1 started [configuration syncer #1]
1591:20150326:094411.645 proxy #2 started [heartbeat sender #1] 
1592:20150326:094411.645 proxy #3 started [data sender #1]
1593:20150326:094411.646 proxy #4 started [poller #1]
1591:20150326:094411.647 sending heartbeat message to server failed: error:"negative response: "failed"", info:"proxy "Zabbix proxy" not found"
1594:20150326:094411.651 proxy #5 started [poller #2]
1595:20150326:094411.664 proxy #6 started [poller #3] 
1596:20150326:094411.647 proxy #7 started [poller #4]
1598:20150326:094411.659 proxy #9 started [unreachable poller #1]
1599:20150326:094411.650 proxy #10 started [trapper #1]
1600:20150326:094411.663 proxy #11 started [trapper #2]
1602:20150326:094411.669 proxy #13 started [trapper #4] 
1603:20150326:094411.659 proxy #14 started [trapper #5]
1604:20150326:094411.664 proxy #15 started [icmp pinger #1] 
1605:20150326:094411.651 proxy #16 started [housekeeper #1]
1606:20150326:094411.670 proxy #17 started [http poller #1] 
1609:20150326:094411.651 proxy #20 started [history syncer #2]
1610:20150326:094411.670 proxy #21 started [history syncer #3] 
1612:20150326:094411.663 proxy #23 started [self-monitoring #1]
1608:20150326:094411.669 proxy #19 started [history syncer #1]
1601:20150326:094411.670 proxy #12 started [trapper #3]
1611:20150326:094411.670 proxy #22 started [history syncer #4]
1597:20150326:094411.670 proxy #8 started [poller #5]
1590:20150326:094411.719 cannot obtain configuration data from server: proxy "Zabbix proxy" not found
1607:20150326:094411.727 proxy #18 started [discoverer #1]

Notice the log entries that show the message "proxy 'Zabbix proxy' not found". Remember that "Hostname" parameter in zabbix_proxy.conf?  An active Zabbix proxy passes that value to the server (as mentioned earlier).  In this case, we haven't finished setting everything up for the proxy to work right, so these log messages are expected.

The log messages from 2.0.x are slightly different.  The log file will not indicate which configuration file it read and any errors talking to the server will look like this and mean the same thing:
1953:20150326:193529.347 Heartbeat message failed
1952:20150326:193529.425 Cannot obtain configuration data from server. Proxy host name might not be matching that on the server.

Configuration (UI)

Go ahead and fire up a browser and load the Zabbix UI.
  • For Zabbix 2.4+, go to "Administration" -> "Proxies".
  • For Zabbix 2.0 or 2.2, go to "Administration" -> "DM".  Once the page loads, ensure that the dropdown box in the upper-right corner has "Proxies" selected and not "Nodes".
Now go ahead and click the "Create proxy" button so that we can create the configuration for the proxy.

The manner in which to configure the proxy depends on the type of proxy that is wanted.

Active Proxy

  1. Remember that "Hostname" parameter in the zabbix_proxy.conf file?  This is where it must match exactly.  In the "Proxy name" box,  make 100% certain that the value entered matches that "Hostname" parameter EXACTLY (beware of white spaces in here).
  2. Verify that the "Proxy mode" option is set to "active".
  3. Click "Save".
On the proxy itself, restart the proxy service by running:
service zabbix-proxy restart

After the restart, you should now see a line similar to this:
1862:20150330:213908.399 received configuration data from server, datalen 2594

If you see that log entry, congrats! Your proxy will now retrieve its configuration from the server on a given interval.
If you still see entries indicating a failure (like the ones mentioned before in the "Starting the service" section), then most likely your "Hostname" configuration option and the "Proxy name" entry in the UI don't match.  Another thing to check could be the "Server" parameter in zabbix_proxy.conf to be sure that it points to the correct Zabbix server.
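
When chasing a name mismatch, it helps to see the configured value exactly as written.  The snippet below is a small sketch (the function name is made up) that prints the active "Hostname" value wrapped in square brackets, so the exact characters, case, and spacing are easy to compare against the "Proxy name" field in the UI.

```shell
# Hypothetical helper: print the active Hostname value inside brackets.
# The default path is the usual package location used in this tutorial.
show_proxy_hostname() {
    sed -n 's/^Hostname=\(.*\)$/[\1]/p' "${1:-/etc/zabbix/zabbix_proxy.conf}"
}

# Possible usage:
# show_proxy_hostname /etc/zabbix/zabbix_proxy.conf
```

If nothing is printed, there is no active "Hostname" line at all and the proxy is falling back to "HostnameItem", as the config comment describes.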

Passive Proxy

  1. The "Proxy name" field can be anything.  Make sure it's something that you'll recognize later on though!
  2. Set "Proxy mode" to "passive".
  3. In the "interface" section only one of the following is required:
    1. "IP Address" - Set the IP address for the proxy and then click the "IP" button under the "Connect to" option.
    2. "DNS" - Set the DNS for the proxy and then click the "DNS" button under the "Connect to" option.
  4. Change the port number if necessary.  By default proxies listen on port 10051, so probably no change is necessary.
  5. Click "Save".
You might have to wait a while for the Zabbix Server to send the Proxy its configuration data (up to one hour by default).  When it does send the proxy its configuration, you'll see a line like this in the Zabbix server log (/var/log/zabbix/zabbix_server.log):
1901:20150430:212155.013 sending configuration data to proxy "my_cool_proxy", datalen 2588

Tidying up

At this point the proxy can communicate with the Zabbix server. Congratulations!  In part two we'll discuss how to change the frequency at which the proxy is given/gets its configuration.  We'll also discuss how to change hosts to be monitored by a proxy.

If it appears your proxy is still unable to talk to the Zabbix server, make sure to look for these gotchas:
  • If using an active proxy, can that proxy talk to the Zabbix server on the port it listens on (port 10051 by default)?
  • If using a passive proxy, can the Zabbix server talk to the proxy on the port it listens on (port 10051 by default)?
  • If the proxy log indicates a failure to create the SQLite database, did you change the "DBName" parameter in zabbix_proxy.conf to point to a valid location?
    • Does the user the Zabbix proxy runs as ("zabbix" by default) have write permissions to this path?
    • Is SELinux (or other programs like it) preventing the Zabbix proxy process from writing to this path?
  • If using an active proxy, does the "Hostname" parameter in zabbix_proxy.conf EXACTLY match the proxy name given in the Zabbix UI?
  • In the Zabbix UI, does the proxy configuration correctly indicate the kind of proxy being used (Active/Passive)?
  • If using a passive proxy, does the Zabbix UI show that the proxy is configured with the correct IP or FQDN?
    • Make sure that the appropriate "IP" or "DNS" box is selected!