You have Imperva SecureSphere Database Activity Monitoring (DAM) up and running. You’ve deployed the system and configured your business audit policies. So, what’s next?
In a previous post I discussed the capacity management challenges of database monitoring solutions. In this post I’ll elaborate on the tools SecureSphere offers for managing and resolving those challenges. I’ll explain why managing your DAM capacity over time matters, review how you can discover capacity issues, and share ways to mitigate any problems.
As discussed in the do's and don’ts post on capacity estimation, it is very difficult to get an accurate estimate of the expected capacity per database. It’s better to estimate for the entire deployment.
The problem with estimates is, well, they’re estimates. Things can change or perform differently than you expect. You could find, for instance, that your 20-core MySQL database has less activity than your 8-core MSSQL database. Why? It could be due to several reasons, such as:
Even if you did a great job estimating your capacity needs when you first deployed, it's just a matter of time before that estimate loses accuracy. Whether you upgrade your database, change your applications, add more users, or add more databases to your deployment, your capacity requirements WILL change over time. These changes affect not only your overall capacity requirements, but also the capacity required per database.
Before I drill down into SecureSphere solutions for capacity management, there are a few terms that would be helpful to understand. The basic operation of DAM (usually) involves agents, which are installed on the database servers, and gateways, which are used to process the database activity sent from the agents. This means that every agent needs at least one connection to at least one gateway.
There are two primary methods to manage multiple gateways:
Every gateway has a maximum capacity measured in IPU (Imperva Performance Units), and every agent places a relative load on its gateway, which is also measured in IPU. Measuring both in the same unit lets you compare agents and estimate the capacity impact of assigning them to gateways.
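To make the IPU idea concrete, here is a minimal sketch of how a shared unit lets you express gateway utilization. The function name, the agent names, and every number below are hypothetical and chosen only for illustration; they are not SecureSphere values or part of any SecureSphere API.

```python
def utilization(gateway_max_ipu: float, agent_ipus: list[float]) -> float:
    """Fraction of a gateway's IPU capacity consumed by its assigned agents."""
    if gateway_max_ipu <= 0:
        raise ValueError("gateway capacity must be positive")
    return sum(agent_ipus) / gateway_max_ipu


# Hypothetical figures: one gateway and the IPU load of three agents.
gw_capacity = 1000.0
agents = {"mysql-20core": 150.0, "mssql-8core": 400.0, "oracle-erp": 300.0}

used = utilization(gw_capacity, list(agents.values()))
print(f"{used:.0%} of gateway capacity in use")
```

Because the agent load and the gateway capacity share a unit, the same arithmetic works regardless of database type or core count.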
With the basic terminology down, let's find out if you have capacity issues.
The best way to identify most capacity problems in SecureSphere DAM is through the health monitoring feature, which displays alarms on various issues. There are a few that indicate an overload problem, either at the gateway level or at the cluster level (see Figure 1):
Figure 1: SecureSphere displaying current status of SecureSphere components
There are also alarms for special scenarios. For example, if an agent load is more than your gateway can handle (depending on the gateway model) you will receive an alarm with a recommendation to scale out.
These alarms are based on real time measurements of the overall load of each gateway and the relative load of each of its corresponding agents. Each alarm contains a detailed explanation with recommendations for mitigation.
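As a rough illustration of the difference between a gateway-level and a cluster-level overload, the sketch below flags both from measured loads. The 90% threshold, the data layout, and the function name are assumptions made for this example; SecureSphere's actual alarm logic is not described here.

```python
def find_overloads(gateways, threshold=0.9):
    """Return (level, name) pairs for overloaded gateways and for the cluster.

    gateways: {gateway_name: (max_ipu, {agent_name: measured_ipu})}
    threshold: hypothetical utilization above which we call it an overload.
    """
    alarms = []
    total_capacity = 0.0
    total_load = 0.0
    for name, (max_ipu, agent_loads) in gateways.items():
        load = sum(agent_loads.values())
        total_capacity += max_ipu
        total_load += load
        if load > threshold * max_ipu:
            alarms.append(("gateway", name))
    # The cluster is overloaded only if the combined load exceeds combined capacity.
    if total_capacity > 0 and total_load > threshold * total_capacity:
        alarms.append(("cluster", "all"))
    return alarms


# Hypothetical measurements: gw1 is hot, but the cluster as a whole has room.
cluster = {
    "gw1": (1000.0, {"erp-db": 600.0, "crm-db": 350.0}),
    "gw2": (1000.0, {"hr-db": 300.0}),
}
alarms = find_overloads(cluster)
print(alarms)
```

A gateway alarm without a cluster alarm suggests the load can be redistributed; a cluster alarm suggests there is not enough capacity overall.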
You can be more proactive by analyzing the current agent load and total gateway(s) load via the cluster management feature (Figure 2), which displays detailed information for all types of clusters and gateway groups. You can see which agents are assigned to which gateways, the capacity information, versions, status, etc.
Figure 2: SecureSphere DAM cluster management feature enabling cluster maintenance
There are four ways to resolve SecureSphere DAM capacity issues. Let’s take a look at each one.
You can choose to analyze your deployment and manually change the assignment of agents to gateways. It is also possible to set a threshold to prevent assigning an agent to a gateway if that gateway is overloaded according to its current assignment. It is important to note that this threshold is based on the estimates given to each agent upon initial assignment and not according to real time measurements.
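The assignment threshold described above can be pictured as a simple guard. This is a sketch under assumptions: the function name, the 80% default, and the numbers are hypothetical, and the check deliberately uses each agent's estimated IPU rather than a live measurement, as the text describes.

```python
def can_assign(gateway_max_ipu: float,
               assigned_estimates: list[float],
               threshold: float = 0.8) -> bool:
    """Allow a new assignment only if the gateway is not already overloaded.

    assigned_estimates holds the IPU estimate recorded for each agent when it
    was first assigned (not a real-time measurement).
    """
    estimated_load = sum(assigned_estimates)
    return estimated_load <= threshold * gateway_max_ipu


# Hypothetical gateway with 1000 IPU of capacity.
print(can_assign(1000.0, [300.0, 200.0]))   # estimated load 500, under the 800 limit
print(can_assign(1000.0, [500.0, 350.0]))   # estimated load 850, over the 800 limit
```

Since the guard relies on estimates, a gateway can pass this check and still be overloaded in practice, which is why the real-time alarms remain important.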
Another way to solve a load balancing issue is to let SecureSphere do it automatically. The automatic load balancing feature (Figure 3) keeps the cluster optimized for the long term. Its aim is NOT to solve a momentary peak load, but to balance the cluster over time. It doesn't change an agent's assignment unless the change improves the overall load scenario.
Figure 3: Configuring automatic load balancing in SecureSphere
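The principle of "move an agent only if it improves the overall load" can be sketched with a simple greedy loop. This is an illustrative stand-in, not SecureSphere's actual balancing algorithm; the function name, data layout, and numbers are all assumptions. The loads are meant to be long-term averages, in keeping with the goal of ignoring momentary peaks.

```python
def rebalance(capacity, assignment, load):
    """Greedily move agents from the busiest to the idlest gateway.

    capacity:   {gateway: max_ipu}
    assignment: {agent: gateway}   (updated in place)
    load:       {agent: long-term average IPU}
    A move is made only if it lowers the peak utilization of the two
    gateways involved. Returns the list of (agent, source, target) moves.
    """
    def utilizations():
        used = {g: 0.0 for g in capacity}
        for agent, gw in assignment.items():
            used[gw] += load[agent]
        return {g: used[g] / capacity[g] for g in capacity}

    moves = []
    while True:
        util = utilizations()
        src = max(util, key=util.get)   # busiest gateway
        dst = min(util, key=util.get)   # idlest gateway
        best = None
        for agent, gw in assignment.items():
            if gw != src:
                continue
            new_src = util[src] - load[agent] / capacity[src]
            new_dst = util[dst] + load[agent] / capacity[dst]
            new_peak = max(new_src, new_dst)
            # Keep the candidate only if it strictly improves on the current peak.
            if new_peak < util[src] and (best is None or new_peak < best[0]):
                best = (new_peak, agent)
        if best is None:
            return moves                # no move helps, so leave things alone
        assignment[best[1]] = dst
        moves.append((best[1], src, dst))


# Hypothetical cluster: gw1 carries most of the load.
capacity = {"gw1": 1000.0, "gw2": 1000.0}
assignment = {"a": "gw1", "b": "gw1", "c": "gw1", "d": "gw2"}
load = {"a": 400.0, "b": 300.0, "c": 200.0, "d": 100.0}

moves = rebalance(capacity, assignment, load)
print(moves)
```

In this example a single move evens out the two gateways, after which the loop stops because no further move would lower the peak.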
In other scenarios, you might discover that you need more gateways (scale out), or more powerful gateways (scale up) (Figure 4). This means that load balancing is not a relevant solution – you simply don't have enough capacity. The alarms will guide you with recommendations, but in some cases it will still be beneficial to contact support to make the best decision.
Figure 4: Scale out (add more gateways) versus scale up (add more powerful gateways)
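A quick back-of-the-envelope check shows when load balancing cannot help: if the total load exceeds what the existing gateways can carry, you need more capacity. The sketch below estimates how many additional gateways of a given size a scale-out would require. The function name, the 80% target utilization, and the figures are hypothetical, and real sizing decisions should follow the alarm recommendations or support's guidance.

```python
import math


def extra_gateways_needed(total_load_ipu: float,
                          gateway_max_ipu: float,
                          current_gateways: int,
                          target_utilization: float = 0.8) -> int:
    """Additional same-size gateways needed to carry the load at the target utilization."""
    usable_per_gateway = gateway_max_ipu * target_utilization
    required = math.ceil(total_load_ipu / usable_per_gateway)
    return max(0, required - current_gateways)


# Hypothetical deployment: 2500 IPU of load on two 1000-IPU gateways.
print(extra_gateways_needed(2500.0, 1000.0, current_gateways=2))
```

The scale-up alternative corresponds to rerunning the same calculation with a larger `gateway_max_ipu`.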
There are a few special scenarios which can lead to capacity-related alarms. One of them is discovering that a certain agent’s required capacity is larger than a "full gateway", and that multiple gateways are required to handle this single database. In this scenario, you will see the appropriate alarm with a recommendation to create a large server cluster. The large server cluster is designed for monitoring very large databases (a database with 128 or more cores might be considered large, depending on various factors).
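The "larger than a full gateway" condition is easy to express in IPU terms. The sketch below lists agents whose load exceeds one gateway's capacity, along with the minimum number of gateways each would need. The names and numbers are hypothetical, and the way a large server cluster actually splits traffic is not modeled here.

```python
import math


def oversized_agents(agent_loads: dict[str, float],
                     gateway_max_ipu: float) -> dict[str, int]:
    """Map each agent that exceeds one gateway's capacity to the gateways it needs."""
    return {
        agent: math.ceil(ipu / gateway_max_ipu)
        for agent, ipu in agent_loads.items()
        if ipu > gateway_max_ipu
    }


# Hypothetical loads measured against a 1000-IPU gateway.
loads = {"dw-128core": 2300.0, "crm-db": 400.0}
large = oversized_agents(loads, 1000.0)
print(large)
```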
As you can see, SecureSphere DAM supplies multiple tools to manage capacity before there's a problem and to mitigate problems that already exist. It is highly recommended not to wait for the next alarm to pop up, but to follow these guidelines:
By utilizing these solutions and best practices you will improve the foundation for future growth plans and keep the capacity management overhead of your SecureSphere DAM deployment to a minimum.