Workarounds for Memory Leak in Some Versions of Microsoft Windows afd.sys file

By: Quinn Wildman

Abstract: afd.sys, which is part of the TCP/IP stack in Microsoft Windows, contains a memory leak in many versions of the driver.

afd.sys, which is part of the TCP/IP stack in Microsoft Windows, contains a memory leak in many versions of the driver. Under certain circumstances when InterBase is connected, this leak can cause Windows to consume more and more memory in its non-paged pool.

Background

InterBase uses the SO_KEEPALIVE option when it opens a socket. This instructs the host TCP/IP stack to check the state of the connection at fixed intervals. These fixed intervals are governed by kernel configuration settings for Linux/Unix and Registry settings for Windows.

Traditionally, these intervals have defaulted to 1 or 2 hours on their respective operating systems. The settings can be modified by users, but the new setting affects sockets used by all applications, not just InterBase. Making this system-wide change isn't trivial, and what is desirable for InterBase might not be optimal for other applications.
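For reference, enabling SO_KEEPALIVE on a socket can be sketched as follows. This is a minimal Python illustration of the socket option itself, not InterBase's actual code; the function name open_keepalive_socket is hypothetical. The probe interval is not set here, so it comes from the system-wide settings described above.

```python
import socket


def open_keepalive_socket() -> socket.socket:
    """Create a TCP socket with SO_KEEPALIVE enabled (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Ask the host TCP/IP stack to probe the connection when it is idle.
    # How often it probes is decided by the operating system settings.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    return sock


if __name__ == "__main__":
    s = open_keepalive_socket()
    enabled = s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
    print("SO_KEEPALIVE enabled:", enabled)
    s.close()
```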

This motivated the introduction of the DUMMY_PACKET_INTERVAL config parameter. This parameter is symmetric in nature: the client checks the server's viability by sending a dummy packet to the server while awaiting a result set, and the server checks the client by sending a dummy packet to an idle client.

Since an open transaction can disable garbage collection on a database or cause other client transactions to wait indefinitely on updates, the 1 or 2 hour wait for the SO_KEEPALIVE socket option was considered excessive. Customers with high-transaction databases who set DUMMY_PACKET_INTERVAL too high risk the same performance degradation from inhibited garbage collection, and have to weigh that risk against the probability of non-paged pool consumption on the client.

The Problem

We have a couple of probable explanations for why this happens.

InterBase has a DUMMY_PACKET_INTERVAL parameter (an ibconfig parameter). Its value is set on the client side and is 60 seconds by default. For a TCP/IP socket connection, the parameter tells the InterBase server to send dummy packets (a small dummy opcode) to the client if the socket connection is idle for the specified duration.

Now consider this...

  1. The client application has a connection that is idle for more than 60 seconds.
  2. The InterBase server gets a timeout on the connection and sends a dummy packet.
  3. Since the client application is not doing any database activity, it does not read from the socket, and so it does not yet read the dummy packet sent by the server. The client will only read the pending dummy packet when it next reads from the socket (when it needs to do some database work).

In this situation, many client programs may have idle connections, and the dummy packets sent by the server to each client application may not be read immediately. According to Microsoft's KB article, the bug in afd.sys copies the socket data to non-paged pool (which is limited to physical memory). This non-paged pool is released only when the database connection is closed (via the socket), at which point the system can release all the non-paged pool for that socket. Although Microsoft claims this is fixed as of SP3, our testing shows otherwise: we were able to reproduce the problem in afd.sys on a Win2K SP3 system.
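The buffering described in steps 1 to 3 can be illustrated with a small simulation. This is only a sketch using a local socket pair in Python; it does not reproduce the afd.sys leak, and the names DUMMY_PACKET and simulate_idle_client, as well as the one-byte payload, are assumptions made for illustration (the real dummy opcode is not shown in this article).

```python
import socket

# Hypothetical one-byte stand-in for InterBase's dummy opcode.
DUMMY_PACKET = b"\x00"


def simulate_idle_client(packets_sent: int) -> int:
    """Send dummy packets to a client that stays idle, then count what
    the client finds waiting when it finally reads from the socket."""
    server, client = socket.socketpair()
    try:
        # The "server" times out repeatedly and sends one packet each time.
        # The idle "client" does not read, so the operating system has to
        # hold the data on the receiving side until it is asked for.
        for _ in range(packets_sent):
            server.sendall(DUMMY_PACKET)

        # The client finally does some database work and reads the socket:
        # every pending dummy packet is still there.
        received = 0
        while received < packets_sent:
            chunk = client.recv(4096)
            if not chunk:
                break
            received += len(chunk)
        return received
    finally:
        server.close()
        client.close()


if __name__ == "__main__":
    print("bytes pending at first read:", simulate_idle_client(10))
```

The longer the client stays idle, the more unread data accumulates for that socket, which is the data the article describes as being held in non-paged pool on affected versions of afd.sys.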

Possible workarounds/resolutions

  1. The DUMMY_PACKET_INTERVAL timeout value can be set for each client application individually via the DPB parameter. If you want to set it for all clients on that system, you can do this in one place by modifying the parameter value in the IBCONFIG file. Do not forget to uncomment the parameter line in the file.
    Dummy packets were introduced in an earlier version of InterBase as a mechanism to identify clients that have terminated abnormally without the server knowing about it. Such clients lead to socket stagnation, where sockets stay in use on the server without active connections. Before dummy packets were introduced, the server would not know that the client was gone, so the dead connection kept using up one of the licensed connections allowed.
    If you are fairly sure that your clients are not going to terminate or go away without doing a proper disconnect from the database, you can set DUMMY_PACKET_INTERVAL to a very high value, say 3600 (60 minutes). The server will then send a dummy packet only once every hour, and only if the connection is idle for that long.
  2. Alternatively, you can make sure that some database activity happens before the dummy packet timeout is triggered for the connection.
    For example: if the interval is set to 3600 seconds, make sure you have some communication with the server on the live database socket connection before that interval elapses. This could be a simple request for an info parameter (say, the server version or the database ODS). This informs the server that the connection is fine, and the server resets the timeout for the client socket connection.
  3. Make sure that your client applications do not leave database connections idle for long periods of time. This has the dual benefit of making proper use of licensed connections and preventing any dummy packet transmission.
  4. Set DUMMY_PACKET_INTERVAL to 0. This will cause InterBase to never send a dummy packet in the first place, thereby avoiding the problem. We are planning to make this the default in a patch release soon to be made available to registered 7.1 users. This patch release will also have a new feature in the TMP$ATTACHMENTS table. Executing

    UPDATE TMP$ATTACHMENTS SET TMP$STATE = 'KEEPALIVE' [WHERE ...]
    COMMIT

    will cause dummy packets to be sent to all remote TCP/IP database connections that meet the optional WHERE clause. This allows a user to clear a dead connection immediately whenever there is a strong suspicion that a dead connection is lingering somewhere in the world.

  5. Obtain the latest service pack; see "Windows 2000 Non-Paged Pool Is Exhausted by Afd.sys" at http://support.microsoft.com/default.aspx?kbid=296265.
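
Workaround 2 can be sketched on the client side as a small idle tracker that triggers some trivial server communication before the dummy packet interval elapses. This is a minimal Python sketch under stated assumptions: the class IdlePinger, its margin parameter, and the ping callable are hypothetical and not part of any InterBase API. In a real application, ping would be whatever cheap request your driver offers, such as asking for the server version.

```python
import time
from typing import Callable


class IdlePinger:
    """Calls `ping` once the connection has been idle for a chosen
    fraction of the dummy packet interval (hypothetical helper)."""

    def __init__(self, ping: Callable[[], None], interval_s: float,
                 margin: float = 0.5,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ping = ping
        # Ping well before the server's timeout; 0.5 means at half of it.
        self._threshold = interval_s * margin
        self._clock = clock
        self._last_activity = clock()

    def touch(self) -> None:
        """Record that the application just talked to the server."""
        self._last_activity = self._clock()

    def poll(self) -> bool:
        """Ping if the connection has been idle too long; return True
        when a ping was sent."""
        if self._clock() - self._last_activity >= self._threshold:
            self._ping()
            self.touch()
            return True
        return False


if __name__ == "__main__":
    pinger = IdlePinger(lambda: print("ping sent"), interval_s=3600)
    print("ping needed right away:", pinger.poll())
```

The application would call touch() after each real database request and poll() periodically, for example from a timer. The clock parameter exists only so the logic can be exercised without waiting for real time to pass.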
