Connections to various PingFederate endpoints may intermittently fail.

Published: 11/05/2018

Problem Description:

If an application is using PingFederate (running on Linux) as an OAuth Authorization Server, communication may fail intermittently when the application tries to reach various PingFederate Endpoints. The PingFederate logs will show no evidence of these connections, which often leads administrators to think this is a networking issue such as incorrect DNS records or a firewall, proxy, or load balancer blocking or failing to pass traffic.

Depending on the network configuration, the application may report different error messages.

If the application connects directly to the PingFederate server, it will likely record a "Connection timed out" type of error, although the issue itself is less likely to occur over a direct connection.
If the application is behind a NAT, or a load balancer or other network appliance sits between the application and PingFederate, the application might see an "An existing connection was forcibly closed by the remote host" type of error. This happens because a load balancer that gets no response from the destination (PingFederate in this case) closes the connection that the application had established with it.
 

Solution:

This issue can be caused by incorrect settings in the /etc/sysctl.conf file on the operating system. The settings that can cause this issue are:
 
net.ipv4.tcp_tw_recycle
net.ipv4.tcp_tw_reuse

The default for both of these settings is "0" (disabled). Enabling them can improve performance in some environments, but they can also cause issues that are difficult to track down. The Linux documentation (the tcp(7) man page) describes these settings as follows:
 
tcp_tw_recycle (Boolean; default: disabled; Linux 2.4 to 4.11)
              Enable fast recycling of TIME_WAIT sockets.  Enabling this
              option is not recommended as the remote IP may not use
              monotonically increasing timestamps (devices behind NAT,
              devices with per-connection timestamp offsets).  See RFC 1323
              (PAWS) and RFC 6191.
tcp_tw_reuse (Boolean; default: disabled; since Linux 2.4.19/2.6)
              Allow to reuse TIME_WAIT sockets for new connections when it
              is safe from protocol viewpoint.  It should not be changed
              without advice/request of technical experts.
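
As a quick first check before capturing traffic, the contents of /etc/sysctl.conf can be scanned for the two settings. The following is a minimal Python sketch, not an official Ping Identity tool: the function name find_enabled_tw_settings and the sample fragment are made up for illustration, and the parser assumes the simple "key = value" form with "#" or ";" comments.

```python
# Hypothetical helper: flags the two TIME_WAIT settings if a config sets them non-zero.
RISKY_KEYS = ("net.ipv4.tcp_tw_recycle", "net.ipv4.tcp_tw_reuse")


def find_enabled_tw_settings(sysctl_text):
    """Return {key: value} for the risky keys whose last assignment is not "0"."""
    values = {}
    for raw in sysctl_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line[0] in "#;":
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue
        # A later assignment of the same key overrides an earlier one.
        values[key.strip()] = value.strip()
    return {k: v for k, v in values.items() if k in RISKY_KEYS and v != "0"}


# Illustrative fragment only; in practice pass in the text of /etc/sysctl.conf.
sample = """
# example fragment
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_tw_reuse=0
"""
print(find_enabled_tw_settings(sample))  # {'net.ipv4.tcp_tw_recycle': '1'}
```

An empty result only means this one file does not enable the settings. On many distributions, files under /etc/sysctl.d/ can set the same keys, so it may also be worth checking the running values with "sysctl net.ipv4.tcp_tw_recycle net.ipv4.tcp_tw_reuse". On kernels newer than 4.11 the tcp_tw_recycle key no longer exists, as the version range quoted above indicates.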

To help determine whether this is the cause of the problem, capture a network trace on the PingFederate server and look at the SYN packets coming from the source (either the application or the network appliance). It's important to take the capture on the PingFederate server itself, because a capture taken on the client or the network appliance cannot show whether the SYN packet actually reaches the PingFederate server.

If these settings are causing the issue, you will see multiple SYN packets arrive on the PingFederate server, but no SYN,ACK sent back to the source. The source then retransmits the SYN packet, so the easiest way to spot the problem is to search the packet capture for "TCP Retransmission". If you find this pattern, change the values of both net.ipv4.tcp_tw_recycle and net.ipv4.tcp_tw_reuse to 0 and restart the PingFederate Server.
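
The pattern described above (repeated SYNs with no SYN,ACK in reply) can also be checked programmatically once the capture has been exported. The following is a rough Python sketch under stated assumptions: the packet record layout (source IP, source port, destination IP, destination port, flags), the function name unanswered_syn_flows, and the addresses and ports are all invented for illustration. Exporting a real capture into this shape is left to whatever capture tool is in use.

```python
from collections import Counter


def unanswered_syn_flows(packets):
    """Return {flow: syn_count} for flows with retransmitted SYNs and no SYN,ACK.

    Each packet is a tuple (src_ip, src_port, dst_ip, dst_port, flags), where
    flags is "S" for a bare SYN and "SA" for a SYN,ACK (assumed record format).
    """
    syn_counts = Counter()
    answered = set()
    for src, sport, dst, dport, flags in packets:
        if flags == "S":
            syn_counts[(src, sport, dst, dport)] += 1
        elif flags == "SA":
            # A SYN,ACK travels server -> client, so flip it to the client's flow key.
            answered.add((dst, dport, src, sport))
    # More than one SYN on the same flow means the source retransmitted it.
    return {flow: n for flow, n in syn_counts.items()
            if n > 1 and flow not in answered}


# Made-up records: the first flow is retried and never answered, the second is answered.
packets = [
    ("10.0.0.5", 40001, "10.0.0.9", 9031, "S"),
    ("10.0.0.5", 40001, "10.0.0.9", 9031, "S"),
    ("10.0.0.5", 40002, "10.0.0.9", 9031, "S"),
    ("10.0.0.9", 9031, "10.0.0.5", 40002, "SA"),
]
print(unanswered_syn_flows(packets))
```

Any flow this returns matches the symptom described in this article, provided the records come from a capture taken on the PingFederate server.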
Category: OAuth, General