Tuning SQL Relay


Connect To The Database Using a Local Connection

One of the most common mistakes that people make when using SQL Relay is running SQL Relay on the same machine as the database, but not using a "local connection" to the database. It's an easy mistake to make, but when remedied, it can result in a substantial improvement in performance.

Most databases support "remote connections" (usually over an inet socket) and "local connections" (usually over a unix socket). If you're running SQL Relay on a seperate machine from the database, you have to configure it to connect to the database using a remote connection. However, if you're running SQL Relay on the same machine as the database, then you should configure it to connect to the database using a local connection if possible.

MySQL

MySQL supports local connections over a unix socket.

By default, MySQL is usually configured to accept local connections. To make sure MySQL is configuerd to accept local connections, edit the MySQL configuration file (usually /etc/my.cnf) and add a line like the following to the [mysqld] section if it's not already there:

socket=/var/lib/mysql/mysql.sock

Now restart MySQL.

To configure SQL Relay to connect to MySQL using the local connection, edit the sqlrelay.conf file and make sure that the connectstring for the MySQL instance uses the socket parameter, as follows:

user=testuser;password=testpassword;db=testdb;socket=/var/lib/mysql/mysql.sock

as opposed to:

user=testuser;password=testpassword;db=testdb;host=localhost;port=5432
PostgreSQL

PostgreSQL supports local connections over a unix socket.

By default, PostgreSQL is configured to accept local connections. I'm not sure you can even turn it off. By default, PostgreSQL listens on /tmp/.s.PGSQL.5432. The directory can be overridden by modifying the PostgreSQL configuration file (usually /var/lib/pgsql/data/postgresql.conf). The following parameters affect which file PostgreSQL listens on and that file's properties. By default these parameters are commented out. To modify the parameters, remove the leading # and set the parameter. By default, the parameters look like this:

#unix_socket_directory = ''
#unix_socket_group = ''
#unix_socket_permissions = 0777 # octal

#port = 5432

The socket file that PostgreSQL will listen on will be found in /tmp by default, or in whatever directory is specified by unix_socket_directory. The file always start with .s.PGSQL. and end with whatever port is specified with the port parameter. Note that the port parameter also determines which inet port PostgreSQL will listen on. By default it will have 777 permissions and be owned by whatever user and group the postmaster process is running as. You can override the permissions using the unix_socket_permissions parameter and the group using the unix_socket_group parameter.

The postmaster process also takes several command line parameters. The -k parameter overrides the unix_socket_directory parameter and the -p parameter overrides the port parameter.

After you modify postgresql.conf, restart PostgreSQL.

To configure SQL Relay to connect to PostgreSQL using the local connection, edit the sqlrelay.conf file and make sure that the connectstring for the PostgreSQL instance uses the host and port parameters as follows:

user=testuser;password=testpassword;host=/tmp;port=5432;

as opposed to:

user=testuser;password=testpassword;host=localhost;port=5432;

Ordinarily, the host parameter tells SQL Relay what host to connect to, but if it starts with a / then it indicates what directory to look for the socket file in. Likewise, the port parameter ordinarily tells SQL Relay what port to connect to, but if host starts with a / then it indicates that SQL Relay should look for a file in the directory specifid by the host parameter starting with .s.PGSQL. and ending in the port number.

mSQL

mSQL supports local connections over a unix socket.

By default, mSQL is usually configured to accept local connections. To make sure mSQL is configuerd to accept local connections, edit the mSQL configuration file (usually /usr/local/Hughes/msql.conf) and add a line like the following to the [general] section if it's not already there:

UNIX_Port = %I/msql2.sock

The full pathname of the file can be specified, or %I can be used to refer to the path specified in the Inst_Dir parameter (usually /usr/local/Hughes).

Now restart mSQL.

To configure SQL Relay to connect to mSQL using the local connection, edit the sqlrelay.conf file and make sure that the connectstring for the mSQL instance doesn't contain a host parameter, as follows:

db=testdb

as opposed to:

host=localhost;db=testdb

Leaving out the host parameter causes SQL Relay to read the msql.conf file and connect to whatever unix socket is specified in that file. It is important that msql.conf be readable by whatever user SQL Relay is configured to run as.

Oracle

Oracle supports local connections over IPC using shared memory.

By default, Oracle isn't usually set up to listen for IPC connections. To set up Oracle to listen for IPC connections, edit $ORACLE_HOME/network/admin/listener.ora and look for a section like this:

LISTENER =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS_LIST =
        (ADDRESS = (PROTOCOL = TCP)(HOST = localhost.localdomain)(PORT = 1521))
      )
    )
  )

Add an additional ADDRESS section as follows:

LISTENER =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS_LIST =
        (ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC))
        (ADDRESS = (PROTOCOL = TCP)(HOST = localhost.localdomain)(PORT = 1521))
      )
    )
  )

Now Oracle's listener is configured to listen for both IPC and inet socket connections. Restart Oracle

SQL Relay looks up the ORACLE_SID in the $ORACLE_HOME/network/admin/tnsnames.ora file and uses the configuration parameters there to connect to the database. By default, the connection to the database is configured to use an inet socket. To set up an ORACLE_SID to use IPC instead, edit $ORACLE_HOME/network/admin/tnsnames.ora and look for a section for the ORACLE_SID like this:

ORA1 =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = localhost.localdomain)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = ora1)
    )
  )

Modify it as follows:

ORA1 =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = ora1)
    )
  )

SQL Relay requires no specific modifications to connect to the database using IPC instead of an inet socket. Just configure the connect string to use an SID that uses IPC to connect to the database. For example:

user=testuser;password=testpassword;oracle_sid=ora1;
SQLite

SQLite is different from other databases. SQL Relay's sqlite connection uses the sqlite library to access the database file directly. Effectively, the database connection is always local.

IBM DB2

As far as I can tell, DB2 supports local connections over IPC using shared memory, semaphores and message queues (though I can't find any documentation that specifically says that).

By default, clients connect to DB2 databases using local connections. In fact, to access a remote instance, you have to set up a stub local instance which relays to the remote instance and connect to the local instance.

SQL Relay requires no specific modifications to connect to the database. Just configure the connect string to use the local database. For example:

db=testdb
Sybase

As far as I can tell, Sybase only supports client connections over inet sockets. If anyone knows differently, send mail to dmuse@firstworks.com.

Lago

As far as I can tell, Lago only supports client connections over inet sockets. If anyone knows differently, send mail to dmuse@firstworks.com.

Interbase/Firebird

As far as I can tell, Interbase/Firebird only supports client connections over inet sockets. If anyone knows differently, send mail to dmuse@firstworks.com.


Connect To SQL Relay Using a Unix Socket

SQL Relay supports inet and unix sockets. If you are runing the SQL Relay client (your program) and the SQL Relay server on the same machine, make sure that they are talking to each other over the unix socket.

First make sure that SQL Relay is configured to listen on a unix socket. In the sqlrelay.conf file, the instance that is talking to mysql should have a socket attribute:

In the above example, the SQL Relay server will listen on inet port 8006 and unix socket /tmp/mysqltest.socket.

Now make sure that your app is using that socket as well. When you create a new sqlrconnection, make sure you tell it to use the unix socket:

sqlrconnection *con=new sqlrconnection("localhost",8006,"/tmp/mysqltest.socket","user","password",0,1);

as opposed to:

sqlrconnection *con=new sqlrconnection("localhost",8006,NULL,"user","password",0,1);

(for non-C++ api's the syntax is similar)

The client will first try the unix socket, then if it can't connect, it will try the host/port.

Make sure to type the filename of the unix socket correctly :) I've seen several cases where people were really confused why they were getting bad performance even though they were using unix sockets. They had just typed the unix socket file name wrong and it was trying it, failing, then falling back to host/port.


Use File Descriptor Passing

SQL Relay has a listener process that waits for incoming clients, then hands them off to a seperate process that is already logged into the database when one of those processes is available. The handoff can be done using file descriptor passing or by causing the client to disconnect from the listener and reconnect to the other process. File descriptor passing is much faster and uses fewer system resources.

In the sqlrelay.conf file, for the instance that is talking to mysql, make sure the handoff attribute is set to "pass":

<instance id="mysqltest" port="8006" ... handoff="pass" ... >

as opposed to:

<instance id="mysqltest" port="8006" ... handoff="reconnect" ... >

Tweak Various System Parameters

Certain system parameters can be tuned to get better performance out of SQL Relay.

TIME_WAIT timeout

The first paramater that comes to mind is the TIME_WAIT timeout. When a TCP client disconnects from a server, the socket that the client was connected on goes into a TIME_WAIT state for typically between a minute and 4 minutes.

For servers serving data over the the unreliable internet, this is probably reasonable. For internal servers, dedicated to serving other internal servers on a reliable network, reducing the length of the timeout is probably OK.

Here's why it helps...

The kernel keeps a list of sockets in the TIME_WAIT state. When the list is full, failures start to occur. On my test machine (running a linux 2.4 kernel), I can have about 1000 sockets in the TIME_WAIT state before running into problems.

If your server is getting new client connections faster than it can bleed off sockets in the TIME_WAIT state, the list will ultimately get full. Decreasing the timeout increases the bleed-off rate.

The following instructions illustrate how to change the timeout rate for Linux and Solaris. Note that I got these instructions off of the web and have not tried all of them myself.

For Linux, set the timeout by executing the following command. In this example, the timeout is set to 30 seconds. You should put this command in a system startup file so it will be executed at boot time.

echo 30 > /proc/sys/net/ipv4/tcp_fin_timeout

For Solaris, the parameter can be modified while the system is running using the ndd command to set the number of milliseconds to wait. These examples set the timeout to 30 seconds. You should put these commands in a system startup file so they'll be executed at boot time.

For Solaris 2.6 and earlier: ndd -set /dev/tcp tcp_close_wait_interval 30000

For Solaris 2.7 and later: ndd -set /dev/tcp tcp_time_wait_interval 30000

Port range

Another paramter that you may want to tweak is the range of available ports. On Linux 2.2 kernels, it defaults to ports 1024 through 4999. You can display the range by running:

/sbin/sysctl net.ipv4.ip_local_port_range

You can increase this to range from 1024 to 65535 by running the following command:

/sbin/sysctl -w net.ipv4.ip_local_port_range="1024 65535"

You should put this command in a system startup file so it'll be executed at boot time.

I'm not sure what the default port range is or how to change it on other operating systems.


Don't Get Unnecessary Column Info

If you run select queries but don't need to know the names/types/sizes of the columns, rather you can just refer to them by number, then you can call dontGetColumnInfo() before running your query and much less data will be transmitted from the SQL Relay server to the client. This is really useful for speeding up 1 column queries like "select count(*) from mytable".


Use Result Set Buffer Size

By default, SQL Relay fetches the entire result set of a query into memory. SQL Relay was originally written for use with web-based apps, and most web pages display small result sets so generally it's faster if the entire result set is buffered. But, if you run select queries that return really long result sets, you should use setResultSetBufferSize() to fetch rows in groups instead of all at once or 1 at a time. The value to use tends to be query-specific though so you usually have to do some benchmarking to get it optimized. If you fetch too many rows at once, memory has to be allocated which slows things down (and in really bad cases can cause swapping) but if you fetch too few at once, it causes too many round trips to the database. Each query has a sweet spot, finding it may dramatically improve that query's performance.