CLOSE_WAIT problem
A socket enters the CLOSE_WAIT state when the remote end has closed the
connection but the local application has not yet closed its own end.
CLOSE_WAITs can build up for many reasons, and the actual
cause has to be tracked down case by case.
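Before drilling into CLOSE_WAIT specifically, it can help to see how the sockets on the host are spread across all TCP states. The following is a minimal sketch, not part of the original procedure; the function name count_tcp_states is made up for illustration, and it assumes the state is the last field of each TCP line in netstat -an output.

```shell
#!/bin/sh
# Tally TCP sockets by state from `netstat -an` output read on stdin.
# Assumption: the state is the last field of each TCP line, as in
# typical Linux and AIX netstat output. Non-TCP lines are ignored.
count_tcp_states() {
    awk '$1 ~ /^tcp/ { states[$NF]++ }
         END { for (s in states) print states[s], s }' | sort -k1,1nr -k2,2
}

# Typical use on a live host:
#   netstat -an | count_tcp_states
```

A CLOSE_WAIT count that is large compared with ESTABLISHED is usually a sign that it is worth continuing with the steps below.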
Log in to the web server that shows the CLOSE_WAIT problem and run the command below:
netstat -an | egrep 'CLOSE_WAIT' | awk '{print $5}'| sort | uniq -c | sort -nr
Alternatively, save it as a shell script at /usr/local/adm/bin/count_closewait and run count_closewait from then on.
The content of the count_closewait script is shown below:
**begin
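#!/bin/sh
# Print a snapshot of CLOSE_WAIT sockets, grouped by remote IP+port.
clear
echo "Here is the current Snapshot of CLOSE_WAIT ... "
echo
echo "Count Application (IP+Port) "
echo "===== ===================== "
# Field 5 of netstat -an is the foreign (remote) address; count each one, highest first.
netstat -an | egrep 'CLOSE_WAIT' | awk '{print $5}'| sort | uniq -c | sort -nr
echo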
**end
Below is a sample of the output:
Count Application (IP+Port)
===== =====================
120 <<ipaddress1>>.9081
110 <<ipaddress1>>.9087
90 <<ipaddress1>>.9086
85 <<ipaddress1>>.9088
83 <<ipaddress1>>.9083
88 <<ipaddress2>>.9082
73 <<ipaddress2>>.9087
71 <<ipaddress2>>.9084
60 <<ipaddress2>>.9083
From the above output we can read off the number of CLOSE_WAITs towards each
server and port. The IP+port with the most CLOSE_WAITs is listed at the top.
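The next steps need the IP and the port as separate values. A small sketch of how the "count address" lines printed by the pipeline above could be split is given here; split_peer is a hypothetical helper name, and it assumes the port is whatever follows the last "." or ":" in the address, which covers both the host.port style shown in the sample and the host:port style printed by Linux netstat.

```shell
#!/bin/sh
# Read "count address" lines on stdin and print "count host port".
# Assumption: the port is the text after the last "." or ":" in the
# address. Lines without such a separator are skipped.
split_peer() {
    awk '{
        addr = $2
        n = length(addr)
        # Walk backwards to the last separator character.
        while ((n > 0) && (substr(addr, n, 1) !~ /[.:]/)) n--
        if (n > 0) print $1, substr(addr, 1, n - 1), substr(addr, n + 1)
    }'
}

# Typical use on a live host (first line is the busiest peer):
#   netstat -an | egrep 'CLOSE_WAIT' | awk '{print $5}' | sort | uniq -c | sort -nr | split_peer | head -1
```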
- Out of all the outgoing connections from that server,
note the IP and the corresponding port number with the highest number of
CLOSE_WAITs.
- After identifying the IP and port number, do an nslookup
to find the name of that server.
nslookup <ipaddress>
Now log in to the server found in the above step.
- Now find the process ID of the process listening on
that port number on the server against which the CLOSE_WAITs were detected. This
can be done using the lsof command:
lsof | grep <port no>
The corresponding Java/JVM process and its process ID will be listed, for example:
java 38863002 was ..... 0t0 TCP *:9083 (LISTEN)
ps -ef | grep <<pid>>
In the above case that is ps -ef | grep 38863002, which shows the JVM instance.
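Grepping lsof output for a port number can also match PIDs, device numbers, or longer ports that happen to contain the same digits. The sketch below is one way to pull out only the listening process; listener_pid is a hypothetical helper, and it assumes the usual lsof column layout where the PID is the second column and a listening socket line ends in "(LISTEN)".

```shell
#!/bin/sh
# Print the PID(s) of processes listening on a given TCP port, reading
# `lsof` output on stdin.
# Assumptions: PID is column 2, the line ends with "(LISTEN)", and the
# field before it ends with ":<port>" (for example "*:9083").
listener_pid() {
    port=$1
    awk -v port="$port" '
        $NF == "(LISTEN)" && $(NF - 1) ~ (":" port "$") { print $2 }
    ' | sort -u
}

# Typical use on a live host:
#   lsof -i TCP:9083 | listener_pid 9083
```

The resulting PID can then be passed to ps -ef | grep as described above.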
NOTE: Alternatively, you can look up the port number above in the web server's plugin-cfg.xml, which will give you the application server instance it refers to.
Now check the SystemOut logs of that JVM for the cause of the CLOSE_WAITs.
- Sometimes the CLOSE_WAITs are the result of
network connectivity issues. In that case, contact the Networks team to
resolve the issue.
- If there is no problem with network connectivity, check
the applications running on that JVM. These can be found using
the info_app command:
info_app | egrep '<JVM>|Server'
- If any application has a problem, the
CLOSE_WAITs may be due to that application not responding.
Contact the support team for that application and, in the meantime,
recycle the web server to keep other applications from being affected.
If all the applications are running fine, go ahead and recycle the web
server.
Notes
- At any point, if the number of CLOSE_WAITs has grown too high,
recycle the web server. That should clear the issue.
- Usually, after one web server is recycled, the number of
CLOSE_WAIT sockets on the other web server also increases because of the
extra load it takes on. It is therefore advisable to recycle the other
web server after the first one.
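What counts as "too high" is a judgment call for each site. As a rough sketch, the check could be scripted along these lines; check_closewait is a hypothetical helper name, and the default limit of 500 is only a placeholder, not a value recommended by this procedure.

```shell
#!/bin/sh
# Count CLOSE_WAIT sockets in `netstat -an` output read on stdin and
# report whether the total exceeds a limit.
# Assumption: the default limit (500) is a placeholder; pick a value
# that suits the server. Returns 1 when the limit is exceeded.
check_closewait() {
    limit=${1:-500}
    # grep -c prints 0 and exits non-zero when nothing matches.
    total=$(grep -c 'CLOSE_WAIT' || true)
    if [ "$total" -gt "$limit" ]; then
        echo "HIGH $total"
        return 1
    fi
    echo "OK $total"
}

# Typical use on a live host:
#   netstat -an | check_closewait 300
```

Because the function returns a non-zero status when the limit is exceeded, it could be called from a scheduled job that alerts whoever decides on the recycle.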