Login Register Actian.com  

Actian Community Blogs

Rate this Entry

Scalable Load Balancing “On a Dime” with .NET Connection Pooling

Submit "Scalable Load Balancing “On a Dime” with .NET Connection Pooling" to Digg Submit "Scalable Load Balancing “On a Dime” with .NET Connection Pooling" to del.icio.us Submit "Scalable Load Balancing “On a Dime” with .NET Connection Pooling" to StumbleUpon Submit "Scalable Load Balancing “On a Dime” with .NET Connection Pooling" to Google
Posted 2012-10-17 at 02:20 PM by teresa
Updated 2012-10-19 at 09:18 AM by teresa

Got a big data project coming up? Does it have lots of data, needs to be scalable, potentially with many users who will access the system at any time of the day? Better go shopping for an expensive cluster solution or load balancing software, right? Not necessarily. Another approach may work for you, as explained in this article.

A recent Vectorwise project required support for a large number of users running complex (select) queries against large, mostly static databases. Due to the large number of users and the complexity of the queries, the customer wanted to balance the queries across multiple instances of the same logical database running on different servers. This would provide users reliable and consistent response times. A simple but effective solution was implemented with Ingres/Vectorwise—without using any expensive third-party load balancing product or clustering.

Goal: Split the Load

The goal was to split the load from the users evenly across the databases and allow individual databases to be brought up and down for maintenance with minimal interruption to users. Because these databases of information are provided as a service to the client's customers, it was crucial that response times be consistent and the data be available at all times. As volume or processing requirements increase, the solution is scalable because more servers can be added fairly easily. This article discusses how this system was implemented for .NET applications using a number of features native to Ingres/Vectorwise.

Duplicating the Database

The first step was to set up multiple copies of each database, each of which ran on different servers in Ingres/Vectorwise installations. For example purposes, we will refer to the servers as Server-1, Server-2, and Server-3 with a single database DB-A, which exists (identical copy) in each of the three Vectorwise server installations. If more databases are desired, they can be implemented on the same or separate servers, or a combination thereof. Using multiple servers provides a way to scale up, as needed, by adding more servers. Also, though not the focus of this article, the workload within an individual Ingres/Vectorwise installation can be tuned and balanced in numerous ways, such as starting more server processes for the Data Access Server (DAS), Communications, and/or DBMS (Ingres and/or Vectorwise). Since the latter is not the focus of this article, we’ll just assume one server process of each type in each installation, which is often sufficient.

The application servers run on machines separate from the DBMS servers. So the next question was how to split the queries coming from the application(s) to the various database servers so that the workload is evenly distributed.

Specifying Target Servers in the Connection String

The first part of the solution was to specify all of the target servers in the connection string. The application in this case was .NET, though a similar approach could be used for JDBC or other types of applications using merged vnodes in Ingres/Vectorwise Net (not to be confused with .NET). Each server has its own listen port and server machine name (or TCP/IP address). For .NET, the connection string from the application must specify each host name or address and DAS port for the target database. For example:

Server=(Server-1:VW7;Server-2:VW7;Server-3:VW7);Database=DB-A


This example assumes the installation ID of Vectorwise on each machine is “VW”, which defaults to “VW7” as the default listen port for the DAS, which is what .NET (and JDBC) applications connect to. Obviously, the installation ID and port could be different on any or all of the servers. Database DB-A is the duplicated database on each server. Further information regarding configuring multiple DAS servers can be found in the Ingres Connectivity Guide and at Ingres 9.3 and 10.0 .NET enhancements.

The Ingres/Vectorwise .NET Data Provider (.NETDP) randomly selects a server from the list to connect to. This provides, over many connections, a roughly even distribution across the servers. If the connection fails, then one of the other servers in the list will be tried, continuing until either a connection is successful or the list has been exhausted. This provides a degree of failover and is transparent to the application. Thus, when one of the servers is down (for example, for maintenance), new connection requests will continue to work but be redirected to one of the servers that is up. Note that there is some minor overhead in this approach because connections to the inactive servers will fail (when attempted) so the connection setup will take slightly longer while a good server is located; that is, inactive servers are not automatically removed from the list. However, the advantage to this approach is that, when the server is brought back up, it will quickly begin servicing new connections.

Setting Parameters for Optimization

Another tool to minimize time spent connecting to an unresponsive server is the Connection Timeout connection string parameter. This parameter specifies the time (in seconds) to wait for an attempted connection to time out if the connection has not completed; the default is 15 seconds. Setting this to a lower value can speed up the switching to try another server; of course, setting it too low could also cause a good connection to fail before it gets a chance to complete.

Even queries in .NET can be timed out if desired. Timeout values can be specified with the Command Timeout property of the IngresCommand class (not available on the connection string). This defaults to zero (meaning no timeout), but can be specified by setting the IngresCommand.CommandTimeout property to a certain number of seconds in which the results should be returned. Again, caution is advised if setting this. However, if a reasonable maximum can be determined, this option can avoid excessively resource-intensive (or problematic) queries from tying up a system too long. A fix was recently made to the .NETDP to support this.

Using Connection Pooling

Another aspect of the solution is the use of connection pooling in the .NETDP, which is on by default. Connection pooling greatly improves response time because it saves the significant overhead of reconnecting to the database on subsequent connect requests from the application. More time is spent on processing the next query and less time on setting up and terminating connections across the network to the database. That is, when the application disconnects, the connection to the database is merely put into the .NETDP connection pool without actually disconnecting from the database. Then, when other connection requests are received from the application, if the connection string matches, the connection is merely handed off to the application and the query can be sent (connection setup is almost immediate). The .NETDP maintains separate pools for each unique connection string, so there is a separate pool for each target type of database (such as DB-A and DB-B). Connection pooling in the .NETDP is documented in the Ingres Connectivity Guide.

Connection pooling has the most benefit in applications that connect and disconnect frequently. This is common in applications that service information requests from many users, such as from the web, and each request (which may involve multiple SQL statements) is wrapped with a connect/disconnect to the database. For applications where a user typically stays logged in on one connection all day, periodically making requests on the same connection, connection pooling will not provide much benefit. It is the former type of application that was involved in this case, so connection pooling was very beneficial.

Connections in the pool are processed in last-in, first-out sequence. That is, the last session to be disconnected by an application and thus put into the pool will be the first session handed off to the next connection request. This has the benefit of reusing the most recently used resources as well as redirecting the next query to the server that most recently finished a query (and thus likely has resources available to service a new request). This may cause a database server to get more total requests than another, but it provides a simple but effective way of balancing resource usage. If a server becomes tied up (for instance processing one or more complex queries), other servers will be given more of the new requests.

The size of the connection pool can be controlled with the connection string parameter Min Pool Size=n (default is zero). If the number of sessions in the pool exceeds the minimum, sessions in the pool will be closed after 60 seconds of inactivity, until the minimum is reached. This helps reduce system resources held for inactive sessions. The goal is to set this parameter to a value that ensures enough inactive sessions are kept in the pool to satisfy most new connection requests, while not making it so large that excessive system resources are tied up unnecessarily. Similarly, connection string parameter Max Pool Size may be specified to place an upper limit on the number of inactive sessions (and associated resources) kept in the pool.

The connection pooling logic in the .NETDP also has some built-in provisions to help ensure bad connections are removed from the pool and not handed off to the application. First, it does a quick check on the network connection (like a “ping”) before providing a connection from the pool to the application. This tests the TCP/IP connection from the application/.NETDP to the DAS in the server installation. Though it does not test the full route to the DBMS, it is a very quick test and does a good job of verifying the network between the application and the DAS server, as well as that the DAS itself is up.

Some connection pooling enhancements were recently made that can be optionally turned on (configured). Some of these are beneficial for handling failover and starting or stopping servers. Pool Check Interval and Pool Check Timeout poll the underlying session to the database to check if the servers are still responsive; this complements the existing ping on the network connection by testing all the way through to the DBMS. Connection Lifetime limits the life of a pooled inactive connection. The new Ingres/Vectorwise .NET data provider connection string keywords are:
  • Pool Check Interval – This keyword controls the periodic testing of pooled connections waiting for work and removes any bad connection from the Connection Pool. In a separate asynchronous thread, inactive connections in the pool that are waiting for work beyond the specified interval are checked with a simple query to test the responsiveness of the associated DBMS server. This query (“SELECT 1”) tests if the session is still healthy and responsive. Non-responsive connections are closed and removed from the pool. Specifying a value too low generates needless query traffic to the DBMS. Specifying a value too high delays detection of non-responsive or “shut down” servers. An interval of 300 seconds (5 minutes) might be chosen as a starting value depending on server reliability. Interval is in seconds. Default is 0 (no checking).
  • Pool Check Timeout – This is the time for an inactive connection in the pool to wait for a non-responsive server before timing out and disconnecting. This keyword has no effect if Pool Check Interval is 0. Timeout is in seconds. Default is 5 seconds.
  • Connection Lifetime – An inactive connection with a total lifespan that is longer than the specified interval will be removed from the connection pool (or not be returned to the pool), and will be closed. This parameter increases the number of physical connection setups required over the life of the system, but allows the system administrator to release connections and system resources, or balance servers regardless of other pool parameters such as Min Pool Size. Lifetime interval is in seconds. Default is 0 (no checking).

Minimizing the Impact of Downtime

Another aspect of the solution, besides load balancing, is how to minimize the impact to the user if a server becomes unresponsive, fails, or is brought down intentionally such as for maintenance or for syncing the database with more current data. As noted earlier, there are several mechanisms in the .NETDP connection pooling logic that check the health of the associated server. This minimizes the likelihood of wasting time attempting to connect to a bad or non-running server. The network ping check and the DBMS health check (Pool Check Interval and Pool Check Timeout) fall into this category.

For syncing the data in the database, it may also be advantageous to simply bring down the associated DAS (ingstop –iigcd), rather than the DBMS. All .NET applications must go through the DAS, so if all application access is via .NET, then shutting down the DAS will make it safe to reload or update the database in some other (non-.NET) manner without recycling it. When the database is ready, the associated DAS can be restarted, making the database available again to process .NET application requests. While the DAS is down, all the connections in the pool to that DAS will get flushed as each connection is determined to be bad.

When the DAS is restarted (ingstart –iigcd), the .NETDP connection pool will initially have no connections to the restarted DAS/database. As new connections are needed (and can’t be satisfied from the pool to the other servers), the pool will acquire connections to the restarted DAS/database. To speed up the process of achieving a more equal distribution of connections in the pool, the connection string parameter Connection Lifetime or a small Min Pool Size and/or Max Pool Size can be helpful. The lifetime parameter will flush existing connections to all servers out of the pool more quickly, which will then cause more new connections to be needed, giving an opportunity to set up new connections to the restarted server. The pool size parameters will keep the pool smaller and thereby increase the chance of needing a new physical connection. With any of these parameters, the downside is that more new connections must be set up during normal processing, rather than utilizing existing connections in the pool.

An alternative to using ingstop and ingstart to recycle the DAS is to merely close and then reopen the DAS via iimonitor. Specifically, the commands set server closed and set server open would be issued to accomplish this in an iimonitor session to the DAS.

Implementing Connection Pooling for Other Drivers

Note that, unlike the .NETDP, there is no stand-alone, built-in connection pooling in other front-end interfaces such as JDBC. However, if running in an application or web server, connection pooling is likely available with it and should be used instead, as the JDBC driver supports standard JDBC connection pooling. The DAS also provides connection pooling (for JDBC and .NET) if not available in the application server. The application itself could also provide its own connection pooling, but that likely would require fairly complex logic. The ODBC driver supports standard ODBC connection pooling as implemented in the Windows ODBC Driver Manager and in the Ingres/Vectorwise ODBC CLI (for non-Windows platforms). Also, the support for multiple target servers, which can be configured in .NET and JDBC connection strings, is available for other types of applications as well, such as ODBC, by defining multiple connection entries for an Ingres/Vectorwise Net virtual node (aka, “merged” vnodes). Thus, the overall approach described in this article is not restricted to .NET applications.

Summary

To summarize, connections can be distributed across multiple servers by including each of the server names and ports in the connection string. On each connection request, the server is chosen randomly from the list if there is no existing connection in the .NETDP connection pool to satisfy the request. If a connection request fails, another server in the list is tried, providing some degree of transparent failover at connection time. Utilizing connections from the pool speeds up processing by eliminating the overhead of setting up new physical connections to the database each time. Connections in the pool are validated in several ways before being handed off to the application, thereby minimizing delays caused by unresponsive or failed servers. Individual servers can be shut down and restarted with minimal impact to the application. Various connection string parameters, mostly related to the connection pool, are now available in the latest .NETDP to adjust the behavior of the pool.

-------
Many thanks to Bruce Lunsford and Dave Thole for their extensive contributions to the article.
Views 1177 Comments 0 Edit Tags

« Prev     Main     Next »

Total Comments 0

Comments

 
© 2011 Actian Corporation. All Rights Reserved