Eilfa we teach you learn










Solaris 10 TCP/IP tuning


In this Solaris 10 TCP/IP tuning article I have collected the most essential TCP/IP parameters for achieving better throughput. Which values are appropriate depends, of course, on the situation being examined.


Please Note:

Where a command is marked Temporary, the setting is lost after a reboot. Where it is marked Permanently, the setting survives a reboot.

Fanout the incoming TCP/IP connections

ip_soft_rings_cnt determines the number of squeues used to fan out the incoming TCP/IP connections.
The incoming traffic is placed on one of the rings. If the ring is overloaded, packets are dropped.
Default - 2
Range - 0 to nCPUs, where nCPUs is the maximum number of CPUs in the system
Dynamic? - No. The interface should be plumbed again when changing this parameter.
When to change - Consider setting this parameter to a value greater than 2 on systems that have 10 Gbps NICs and many CPUs.
Use these commands to make the change temporarily or permanently:

Temporary - ndd -set /dev/tcp ip_soft_rings_cnt <value>

Permanently insert into /etc/system - set ip:ip_soft_rings_cnt=16


ip_squeue_fanout determines how TCP/IP connections are associated with squeues. A value of 0 associates a new TCP/IP connection with the CPU that creates the connection. A value of 1 associates the connection with multiple squeues that belong to different CPUs. The number of squeues used to fan out the connection is based upon ip_soft_rings_cnt.
Consider setting this parameter to 1 to spread the load across all CPUs in certain situations. For example, when the number of CPUs exceeds the number of NICs, and one CPU is not capable of handling the network load of a single NIC, change this parameter to 1.

Temporary - ndd -set /dev/tcp ip_squeue_fanout 1

Permanently insert into /etc/system - set ip:ip_squeue_fanout=1




In Solaris, the available range of TCP/IP ports is 0 to 65535. However, some restrictions apply:
Ports in the range 0 to 1023 are reserved for privileged (root) services, such as telnetd, ftpd, and so on.
Ports in the range 1024 to tcp_smallest_anon_port-1 are used for user services such as the NFS server daemon, the font server, and so on.
With the default settings, this leaves the range 32768 to 65535 available for general TCP/IP connections. To limit the range of port numbers allocated for general use, the following two ndd(1M) parameters can be used:
tcp_smallest_anon_port - the smallest TCP port number that may be used for an anonymous connection. The default value is 32768, so Solaris allocates anonymous ports from 32768 upward.
tcp_largest_anon_port - the largest TCP port number that may be used for anonymous connections. The default value is 65535.
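
As a rough sketch of how the anonymous port range might be narrowed, the helper below checks a proposed range and prints the matching ndd commands instead of running them. The function name anon_port_cmds and the example range are hypothetical, and a POSIX shell (ksh or bash rather than the legacy Solaris 10 /bin/sh) is assumed.

```shell
# anon_port_cmds LOW HIGH
# Hypothetical helper: validates a proposed anonymous port range and prints
# the ndd commands that would apply it. Nothing is changed on the system.
anon_port_cmds() {
    low=$1
    high=$2
    # Reject anything that is not a plain decimal number.
    for p in "$low" "$high"; do
        case "$p" in
            ''|*[!0-9]*) echo "error: ports must be numeric" >&2; return 1 ;;
        esac
    done
    # Stay above the privileged ports (0 to 1023) and within 16 bits.
    if [ "$low" -lt 1024 ] || [ "$high" -gt 65535 ] || [ "$low" -gt "$high" ]; then
        echo "error: need 1024 <= low <= high <= 65535" >&2
        return 1
    fi
    echo "ndd -set /dev/tcp tcp_smallest_anon_port $low"
    echo "ndd -set /dev/tcp tcp_largest_anon_port $high"
}
```

Calling anon_port_cmds 40000 60000 prints the two commands for review. Once the output looks right, it can be piped to sh as root.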


kernel sockets

The kernel keeps a list of sockets in the TIME_WAIT state. When the list is full, failures start to occur. If your server is getting new client connections faster than it can bleed off sockets in the TIME_WAIT state, the list will ultimately get full. Decreasing the timeout increases the bleed-off rate.

Default - 60000 (milliseconds)

Temporary - ndd -set /dev/tcp tcp_time_wait_interval 3000

Setting this did not give better performance for a lighty web server.
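
To get a feel for how the interval affects the size of the TIME_WAIT list, a steady-state estimate can be sketched as close rate multiplied by time spent in TIME_WAIT. The function name and the rate of 500 closes per second are assumptions for illustration only, and a POSIX shell is assumed.

```shell
# time_wait_estimate RATE_PER_SEC INTERVAL_MS
# Steady-state estimate of sockets sitting in TIME_WAIT:
# connections closed per second x seconds each one stays in TIME_WAIT.
time_wait_estimate() {
    echo $(( $1 * $2 / 1000 ))
}

# Hypothetical load of 500 actively closed connections per second.
echo "default 60000 ms: $(time_wait_estimate 500 60000) sockets"
echo "tuned    3000 ms: $(time_wait_estimate 500 3000) sockets"
```

Under that assumed load, the default interval keeps about 30000 sockets in TIME_WAIT, while 3000 ms keeps about 1500. The estimate only applies to connections closed by this host, because TIME_WAIT is held by the side that closes first.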



TCP hash table size


Both parameters are set in /etc/system and take effect at the next boot:

tcp_conn_hash_size controls the hash table size in the TCP module for all TCP connections (default 512).
Permanently insert into /etc/system - set tcp:tcp_conn_hash_size=8192

ipc_tcp_conn_hash_size controls the hash table size in the IP module for all active (in ESTABLISHED state) TCP connections (default 512).

Permanently insert into /etc/system - set ip:ipc_tcp_conn_hash_size=8192


Congestion window

tcp_slow_start_initial - the maximum initial congestion window (cwnd) size, in MSS, of a TCP connection (default 4).

Temporary - ndd -set /dev/tcp tcp_slow_start_initial 2
Temporary - ndd -set /dev/tcp tcp_slow_start_initial 1
When to change?
Do not change the value.
If the initial cwnd size causes network congestion under special circumstances, decrease the value.

tcp_slow_start_after_idle - the congestion window size, in MSS, of a TCP connection after it has been idle (no segment received) for a period of one retransmission timeout.

Temporary - ndd -set /dev/tcp tcp_slow_start_after_idle 1
When to change?
For more information, see tcp_slow_start_initial.


TIME_WAIT ports

These ensure that TIME_WAIT ports either get reused or closed fast. On Linux these are sysctl settings; the Solaris counterpart, where one exists, is noted next to each.

Linux net.ipv4.tcp_fin_timeout = 1 - in Solaris, tcp_time_wait_interval
Linux net.ipv4.tcp_tw_recycle = 1

TCP memory
Linux net.core.rmem_max = 16777216
Linux net.core.rmem_default = 16777216
Linux net.core.netdev_max_backlog = 262144 - in Solaris, tcp_conn_req_max_q
Linux tcp_slow_start_after_idle = 262144



SYN cookies protection

Linux net.ipv4.tcp_syncookies = 1. Solaris has SYN flood protection enabled by default. (SYN cookies violate the TCP spec, so Solaris uses its own mechanism.)


Linux net.ipv4.tcp_max_orphans = 262144
Linux net.ipv4.tcp_max_syn_backlog = 262144 - in Solaris, tcp_conn_req_max_q0

Linux net.ipv4.tcp_synack_retries and net.ipv4.tcp_syn_retries = 2 - in Solaris, retransmission is controlled by these timers (values in milliseconds):

tcp_rexmit_interval_min 400

tcp_rexmit_interval_max 60000

tcp_ip_abort_interval 480000

tcp_rexmit_interval_initial 3000



You shouldn't be using conntrack on a heavily loaded Linux server anyway, but these values are suitably high for our uses, ensuring that if conntrack gets turned on, the box doesn't die:
 net.ipv4.netfilter.ip_conntrack_max = 1048576
 net.nf_conntrack_max = 1048576
In Solaris, use a DTrace script to track connections.



TCP/IP connection control blocks

tcp_time_wait_interval tells TCP/IP how long to keep the control blocks of closed connections. After the applications complete the TCP/IP connection, the control blocks are kept for the specified time. When high connection rates occur, a large backlog of TCP/IP connections accumulates and can slow server performance. The server can stall during certain peak periods. If the server stalls, the netstat command shows that many of the sockets opened to the HTTP server are in the CLOSE_WAIT or FIN_WAIT_2 state. Visible delays can occur for up to four minutes, during which time the server does not send any responses, but CPU utilization stays high, with all of the activity in system processes.
Default - 60000 (milliseconds)

Temporary - ndd -set /dev/tcp tcp_time_wait_interval 3000




FIN_WAIT_2 state timer interval

tcp_fin_wait_2_flush_interval specifies how long a connection is allowed to remain in the FIN_WAIT_2 state before it is flushed. When high connection rates occur, a large backlog of TCP/IP connections accumulates and can slow server performance. The server can stall during peak periods. If the server stalls, the netstat command shows that many of the sockets opened to the HTTP server are in the CLOSE_WAIT or FIN_WAIT_2 state. Visible delays can occur for up to four minutes, during which time the server does not send any responses, but CPU utilization stays high, with all of the activity in system processes.
Default - 675000 (milliseconds)
Temporary - ndd -set /dev/tcp tcp_fin_wait_2_flush_interval 67500



TCP keepalive

TCP keepalive is a feature provided by many TCP implementations, including Solaris, as a way to clean up idle connections whose remote end is no longer responding. Applications must enable this feature with the SO_KEEPALIVE socket option via the setsockopt(3SOCKET) socket call. Once enabled, a keepalive probe packet is sent to the other end of the socket, provided the connection has remained in the ESTABLISHED state and has been idle for the specified time frame. This time frame is the value of the TCP tunable tcp_keepalive_interval.

A keepalive probe packet is handled just like any other TCP packet that requires an acknowledgment (ACK) from the other end of the socket connection. It is retransmitted per the standard retransmission backoff algorithm. If no response is received within the time specified by the other TCP tunable, tcp_ip_abort_interval, the connection is terminated, as would be the case for any other unacknowledged packet. Hence the actual maximum idle time of a connection that uses TCP keepalive and has no responding peer is:

tcp_keepalive_interval + tcp_ip_abort_interval
Default values, respectively: 7200000 and 480000 milliseconds (2 hours and 8 minutes).
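
The sum above can be worked through with a small sketch. The function names max_idle_ms and ms_to_min are made up for this example, the 300000 ms figure is the keepalive value used in this section's ndd command, and a POSIX shell is assumed.

```shell
# max_idle_ms KEEPALIVE_MS ABORT_MS
# Worst-case lifetime, in milliseconds, of an idle connection whose peer
# has vanished: tcp_keepalive_interval + tcp_ip_abort_interval.
max_idle_ms() {
    echo $(( $1 + $2 ))
}

# ms_to_min MS : whole minutes, rounded down.
ms_to_min() {
    echo $(( $1 / 60000 ))
}

# Solaris defaults quoted in the text.
default_total=$(max_idle_ms 7200000 480000)
echo "defaults: ${default_total} ms ($(ms_to_min "$default_total") minutes)"

# tcp_keepalive_interval lowered to 300000 ms, abort interval left alone.
tuned_total=$(max_idle_ms 300000 480000)
echo "tuned: ${tuned_total} ms ($(ms_to_min "$tuned_total") minutes)"
```

With the defaults, a dead peer can hold a connection open for 128 minutes. Lowering only the keepalive interval to 300000 ms brings that down to 13 minutes.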

The above parameters are global and will affect the entire system. Keep in mind that TCP keepalive probes have no effect on inactive connections as long as the remote host is still responding to probes. However care should be taken to ensure the above parameters remain at a high enough value to avoid unnecessary traffic and other issues such as prematurely closing active connections in situations where a few packets have gone missing.

Temporary - ndd -set /dev/tcp tcp_keepalive_interval 300000




 Backlog Queue

The backlog queue is a large memory structure used to handle incoming packets with the SYN flag set until the moment the three-way handshake process is completed.
An operating system allocates part of the system memory for every incoming connection. We know that every TCP port can handle a defined number of incoming requests. The backlog queue controls how many half-open connections can be handled by the operating system at the same time. When a maximum number of incoming connections is reached, subsequent requests are silently dropped by the operating system.
When a lot of connections are seen in the SYN RECEIVED state, the host is probably under a SYN flooding attack. Moreover, the source IP addresses of these incoming packets can be spoofed. To limit the effects of SYN attacks we should enable some built-in protection mechanisms. Additionally, we can sometimes use techniques such as increasing the backlog queue size and minimizing the total time a pending connection is kept in allocated memory (in the backlog queue).
Run this command to see how many half-open connections have been dropped from the backlog queue so far:

netstat -s -P tcp | grep tcpHalfOpenDrop

In Sun Solaris there are two parameters which control the maximum number of connections.

The first parameter, tcp_conn_req_max_q, controls the maximum number of fully established connections waiting to be accepted by a listener.

The second parameter, tcp_conn_req_max_q0, defines how many half-open connections are allowed before incoming requests are dropped. In Sun Solaris 8, the default value is 1024. Using the ndd command we can modify this value.
It is pretty simple really: never change these parameters unless connections are refused because the values are too low.

The only way to determine this empirically is to use 'netstat -s | fgrep -i listendrop'.

If tcpListenDrop is non-zero, increase tcp_conn_req_max_q. If tcpListenDropQ0 is non-zero, increase tcp_conn_req_max_q0.
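
The rule above can be sketched as a small filter over the netstat statistics. The function names are hypothetical, and the parser assumes the usual "name = value" layout with several counters per line. It also assumes a POSIX awk, which on Solaris 10 means nawk or /usr/xpg4/bin/awk rather than /usr/bin/awk.

```shell
# counter_value NAME
# Reads `netstat -s -P tcp` style text on stdin and prints the value of
# counter NAME. Assumes "name = value" pairs, several per line.
counter_value() {
    awk -v name="$1" '{
        gsub(/=/, " = ")   # values are sometimes glued to the "=" sign
        for (i = 1; i <= NF - 2; i++)
            if ($i == name && $(i + 1) == "=") { print $(i + 2); exit }
    }'
}

# backlog_advice
# Reads the same text on stdin and prints which tunable, if any, the
# listen-drop counters suggest raising.
backlog_advice() {
    stats=$(cat)
    q=$(printf '%s\n' "$stats" | counter_value tcpListenDrop)
    q0=$(printf '%s\n' "$stats" | counter_value tcpListenDropQ0)
    [ "${q:-0}" -gt 0 ] && echo "increase tcp_conn_req_max_q"
    [ "${q0:-0}" -gt 0 ] && echo "increase tcp_conn_req_max_q0"
    [ "${q:-0}" -eq 0 ] && [ "${q0:-0}" -eq 0 ] && echo "no change needed"
    return 0
}
```

A typical call would be netstat -s -P tcp | backlog_advice. The counters are cumulative since boot, so comparing two runs taken some minutes apart gives a better picture than a single reading.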

Temporary - ndd -set /dev/tcp tcp_conn_req_max_q 262144 (default is 128)
Temporary - ndd -set /dev/tcp tcp_conn_req_max_q0 30000 (default is 1024)




Outgoing connection establishment timeout

Some systems allow you to configure how long a system waits for an outgoing connection to be established. When set too high, establishing outgoing connections to destination servers such as replicas not responding quickly can cause long delays.

Temporary - ndd -set /dev/tcp tcp_ip_abort_cinterval 10000 (default is 180000)

tcp_ip_abort_interval specifies the default total retransmission timeout value for a TCP connection. If TCP has been retransmitting for the tcp_ip_abort_interval period and has not received any acknowledgment from the other endpoint during this period, TCP closes the connection.

Temporary - ndd -set /dev/tcp tcp_ip_abort_interval 60000 (default is 480000)


TCP/IP statistics

This set of commands presents some of the TCP/IP statistics you will need in order to follow the effect of every change you make to your TCP/IP stack.

netstat -nP tcp | grep WAIT | wc -l;netstat -nP tcp |wc -l;netstat -s -P tcp | grep -E "tcpL"


netstat -I bnx0 10 


iostat -xn 10 
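
For a finer breakdown than the WAIT count, the connection list can be grouped by state. This is only a sketch: the function name state_counts is invented here, and it assumes the state is the last column of each connection row, as in the IPv4 output of netstat -nP tcp. A POSIX awk (nawk on Solaris 10) is assumed.

```shell
# state_counts
# Reads `netstat -nP tcp` output on stdin and prints one "STATE count"
# line per TCP state, sorted by state name.
state_counts() {
    # Header and separator rows are skipped because their last field is
    # not an upper-case state name such as ESTABLISHED or TIME_WAIT.
    awk 'NF >= 7 && $NF ~ /^[A-Z][A-Z0-9_]*$/ { n[$NF]++ }
         END { for (s in n) print s, n[s] }' | sort
}
```

Running netstat -nP tcp | state_counts before and after a parameter change shows whether the TIME_WAIT or FIN_WAIT_2 population actually shrank.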


TCP/IP script

This script will assist you in configuring the TCP/IP parameters on your system.


#######Start of TCP/IP script#############

###Disable source routing and forwarding of directed broadcasts
ndd -set /dev/ip ip_forward_src_routed 0 #(Default value already set)
ndd -set /dev/tcp tcp_rev_src_routes 0 #(Default value already set)
ndd -set /dev/ip ip_forward_directed_broadcasts 0 #(Default value already set)
###Enlarge the half-open (q0) and completed (q) connection queues
ndd -set /dev/tcp tcp_conn_req_max_q0 4096 #(Default value 1024)
ndd -set /dev/tcp tcp_conn_req_max_q 1024 #(Default value 128)

###Prevent System responding to ICMP timestamp requests
ndd -set /dev/ip ip_respond_to_timestamp 0 #(Default value already set)
###Prevent System responding to ICMP timestamp Broadcast
ndd -set /dev/ip ip_respond_to_timestamp_broadcast 0 #(Default value already set)
ndd -set /dev/ip ip_respond_to_address_mask_broadcast 0 #(Default value already set)
ndd -set /dev/ip ip_respond_to_echo_broadcast 0
###Shorten the ARP cache lifetimes (milliseconds)
ndd -set /dev/arp arp_cleanup_interval 60000
ndd -set /dev/ip ip_ire_arp_interval 60000
###Ignore and do not send ICMP redirects, enforce strict multihoming
ndd -set /dev/ip ip_ignore_redirect 1
ndd -set /dev/ip ip_strict_dst_multihoming 1
ndd -set /dev/ip ip_send_redirects 0

########END of TCP/IP script###############




Eilfa Team