8

The server seems to be limited to ~32720 sockets. I have tried every setting I know of to raise this limit, but the server stays capped at 32720 open sockets, even though there is still 4 GB of free memory and the CPU is 80% idle.

Here's the configuration:

    ~# ulimit -a
    core file size          (blocks, -c) 0
    data seg size           (kbytes, -d) unlimited
    scheduling priority             (-e) 0
    file size               (blocks, -f) unlimited
    pending signals                 (-i) 63931
    max locked memory       (kbytes, -l) 64
    max memory size         (kbytes, -m) unlimited
    open files                      (-n) 798621
    pipe size            (512 bytes, -p) 8
    POSIX message queues     (bytes, -q) 819200
    real-time priority              (-r) 0
    stack size              (kbytes, -s) 2048
    cpu time               (seconds, -t) unlimited
    max user processes              (-u) 63931
    virtual memory          (kbytes, -v) unlimited
    file locks                      (-x) unlimited

    net.netfilter.nf_conntrack_max = 999999
    net.ipv4.netfilter.ip_conntrack_max = 999999
    net.nf_conntrack_max = 999999

Any thoughts?

10
  • Just so it's said: If you need more than 32000 sockets at once, you have bigger problems than just that number being too low. A normal server doesn't ever have more than a few hundred sockets (maybe even a couple thousand, for a busy server) open at once. Commented Aug 7, 2010 at 12:52
  • 1
    A few hundred sockets? Where did you get that number? Commented Aug 7, 2010 at 13:24
  • @TheSquad: do you have some security framework loaded, that limits the number of fd's and/or connections? Commented Aug 7, 2010 at 13:39
  • Experience. Even extremely busy web sites rarely serve more than a couple thousand simultaneous clients -- once they get to that point, they're clustered or otherwise distributed to reduce load. And the QuakeNet IRC network, the best example i could think of for mass long-lived TCP client/server stuff, has maybe 80k simultaneous users spread over 40+ servers. That's about 2k per. Commented Aug 7, 2010 at 13:44
  • @mvds: The limit is most likely not due to security stuff -- security would kick in WAY before 32k sockets. Commented Aug 7, 2010 at 13:45

8 Answers

7

If you're dealing with openssl and threads, go check your /proc/sys/vm/max_map_count and try to raise it.
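To see why this setting could produce a ceiling near 32K, here is a rough sketch in C. It assumes a thread-per-connection server (the question does not say this) and that each thread costs about two memory mappings, one for its stack and one for the guard region. The helper names and the baseline figure are hypothetical; 65530 is the usual kernel default for vm.max_map_count.

```c
#include <stdio.h>

/* Read a single integer from a /proc/sys file; returns -1 on failure. */
long read_sysctl_long(const char *path)
{
    FILE *f = fopen(path, "r");
    long value = -1;

    if (f == NULL)
        return -1;
    if (fscanf(f, "%ld", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}

/*
 * Rough ceiling on the thread count imposed by vm.max_map_count.
 * ASSUMPTION: every thread costs maps_per_thread mappings (2 is a common
 * figure: stack plus guard region) and the rest of the process already
 * uses baseline_maps mappings. Both numbers are illustrative only.
 */
long estimate_max_threads(long max_map_count, long maps_per_thread,
                          long baseline_maps)
{
    if (maps_per_thread <= 0 || max_map_count <= baseline_maps)
        return 0;
    return (max_map_count - baseline_maps) / maps_per_thread;
}
```

Calling `estimate_max_threads(read_sysctl_long("/proc/sys/vm/max_map_count"), 2, 90)` on a machine with the default setting lands in the same range as the limit in the question, which is why raising max_map_count is worth trying before anything else.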

4

In IPv4, the TCP header has 16 bits for the destination port and 16 bits for the source port.

See http://en.wikipedia.org/wiki/Transmission_Control_Protocol

Since your limit is about 32K, I suspect you are actually hitting the limit on outbound TCP connections. With 16-bit port numbers you can have at most 65K ports per local IP address (this is the protocol limit), and every outgoing connection to a given destination address and port needs its own local port. Fortunately, a listening socket uses only one port, no matter how many incoming connections it accepts. But if you generate the test connections from the same machine, you can only have 65K outgoing connections in total per source address (for TCP). To test a larger number of incoming connections, you will need multiple computers.

Note: you can call socket(AF_INET,...) up to the number of file descriptors available, but you cannot bind them without increasing the number of ports available. To increase the range, do this:

    echo "1024 65535" > /proc/sys/net/ipv4/ip_local_port_range

(cat the file to see what you currently have; the default is 32768 to 61000.)
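To check how many outgoing ports a given range actually provides, a small C helper can parse the same file. This is a minimal sketch; the function names are made up for illustration and only the /proc path comes from the answer.

```c
#include <stdio.h>

/*
 * Parse the two numbers found in ip_local_port_range ("low high").
 * Returns the number of ports in the inclusive range, or -1 if the
 * text is malformed.
 */
int port_range_size(const char *text)
{
    int low, high;

    if (sscanf(text, "%d %d", &low, &high) != 2)
        return -1;
    if (low < 1 || high > 65535 || low > high)
        return -1;
    return high - low + 1;
}

/* Read the live setting from /proc; returns -1 if it cannot be read. */
int current_port_range_size(void)
{
    char buf[64];
    FILE *f = fopen("/proc/sys/net/ipv4/ip_local_port_range", "r");

    if (f == NULL)
        return -1;
    if (fgets(buf, sizeof(buf), f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);
    return port_range_size(buf);
}
```

With the default quoted above, `port_range_size("32768 61000")` gives 28233 ports, and widening the range to "1024 65535" gives 64512.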

Perhaps it is time for a new TCP-like protocol that allows 32 bits for the source and destination ports? But how many applications really need more than 65 thousand outbound connections?

The following will allow 100,000 incoming connections on Linux Mint 16 (64-bit). You must run it as root to set the limits.

    #include <stdio.h>
    #include <sys/time.h>
    #include <sys/resource.h>
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <netinet/in.h>

    // Print the current soft and hard limits on open file descriptors.
    void ShowLimit()
    {
        rlimit lim;
        int err = getrlimit(RLIMIT_NOFILE, &lim);
        printf("%d limit: %lu,%lu\n", err,
               (unsigned long)lim.rlim_cur, (unsigned long)lim.rlim_max);
    }

    int main()
    {
        ShowLimit();

        // Raise both limits to 100,000 (raising the hard limit requires root).
        rlimit lim;
        lim.rlim_cur = 100000;
        lim.rlim_max = 100000;
        int err = setrlimit(RLIMIT_NOFILE, &lim);
        printf("set returned %d\n", err);
        ShowLimit();

        // Listen on port 80 on all interfaces.
        int sock = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
        sockaddr_in maddr = {};
        maddr.sin_family = AF_INET;
        maddr.sin_port = htons(80);
        maddr.sin_addr.s_addr = INADDR_ANY;
        err = bind(sock, (sockaddr *)&maddr, sizeof(maddr));
        if (err < 0) { perror("bind"); return 1; }
        err = listen(sock, 1024);
        if (err < 0) { perror("listen"); return 1; }

        // Accept connections forever, keeping every socket open, and count them.
        int sockets = 0;
        while (true)
        {
            sockaddr_in raddr;
            socklen_t rlen = sizeof(raddr);
            err = accept(sock, (sockaddr *)&raddr, &rlen);
            if (err >= 0)
            {
                ++sockets;
                printf("%d sockets accepted\n", sockets);
            }
        }
    }

2

Which server are you talking about? It might have a hardcoded maximum, or run into other limits (max threads, running out of address space, etc.).

http://www.metabrew.com/article/a-million-user-comet-application-with-mochiweb-part-1 describes some of the tuning needed to handle a lot of connections, but it doesn't help if the server application itself limits them in some way.

15 Comments

I'm talking about a Core i7 with 16 GB of RAM and a 160 GB SSD, running Debian... Good article you posted, by the way. Not sure it will fix the issue, but good to know. I'll let you know how it goes...
Sorry, I didn't get what you were asking at first... The server application is software we wrote ourselves, with no built-in limit.
A custom server app doesn't get 32k simultaneous clients unless it's made by a noteworthy company or does something shady. In the first case, you wouldn't need help -- someone who didn't understand scaling issues wouldn't have gotten hired.
Then call bullshit; I don't have to justify myself to get an answer from you... This issue is known to be tricky, and I'm not even sure anyone has a good answer to this question (how many sockets can a server handle at most?). The fact is that we have a lot more than 32K connections going on; it's just that each server is limited to 32K. Right now, across all the clustered servers, we have more than 1 million connections. We are looking for ways to reduce the number of servers. That's it!
Umm, yeah, you do have to justify yourself to get an answer from me. SO doesn't pay me -- i'm here because i like solving problems. However, i'm not into helping people solve the wrong problem -- and so far, the problem seems more to be this supposed requirement for 32k+ simultaneous long-lived connections on one box, rather than a kernel and/or runtime limit that hardly anyone but stress testers even know exists. So unless i see that that's necessary, i'm going to continue to say "use fewer sockets".
2

Check the real limits of the running process with:

cat /proc/{pid}/limits 
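The same check can be done from inside the server itself. The sketch below pulls the "Max open files" row out of the limits text; the function names are hypothetical, and rows reporting "unlimited" are not handled.

```c
#include <stdio.h>
#include <string.h>

/*
 * Extract the soft and hard "Max open files" values from the text of a
 * /proc/<pid>/limits file. Returns 0 on success, -1 if the row is
 * missing or not numeric ("unlimited" is not handled in this sketch).
 */
int parse_max_open_files(const char *limits_text, long *soft, long *hard)
{
    const char *key = "Max open files";
    const char *line = strstr(limits_text, key);

    if (line == NULL)
        return -1;
    if (sscanf(line + strlen(key), "%ld %ld", soft, hard) != 2)
        return -1;
    return 0;
}

/* Read /proc/self/limits and report the limits of the calling process. */
int own_max_open_files(long *soft, long *hard)
{
    char buf[4096];
    size_t n;
    FILE *f = fopen("/proc/self/limits", "r");

    if (f == NULL)
        return -1;
    n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';
    return parse_max_open_files(buf, soft, hard);
}
```

Logging the result of `own_max_open_files()` at startup shows the limits the process really runs with, which can differ from what `ulimit -a` prints in an interactive shell.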

The system-wide maximum number of open files is set by the kernel. Running the following as root raises it to 100,000 "files", i.e. 100k concurrent connections (CC):

echo 100000 > /proc/sys/fs/file-max 

To make it permanent, edit /etc/sysctl.conf:

fs.file-max = 100000 

You then need the server to ask for more open files. How to do this differs from server to server. In nginx, for example, you set

worker_rlimit_nofile 100000; 

Restart nginx and check /proc/{pid}/limits.

To test this you need 100,000 sockets in your client, and in testing you are limited by the number of TCP ports available per IP address.

To increase the local port range to maximum...

echo "1024 65535" > /proc/sys/net/ipv4/ip_local_port_range 

This gives you ~64000 ports to test with.

If that is not enough, you need more IP addresses. When testing on localhost you can bind the source/client to an IP other than 127.0.0.1 / localhost.

For example you can bind your test clients to IPs randomly selected from 127.0.0.1 to 127.0.0.5

Using apache-bench you would set

-B 127.0.0.x 

Node.js sockets would use

localAddress 
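In a hand-written C test client, the equivalent of these options is to call bind() on the socket before connect(). A minimal sketch, with a hypothetical helper name:

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Create a TCP socket whose source address is fixed to local_ip.
 * Port 0 lets the kernel pick a free ephemeral port for that address.
 * Returns the socket descriptor, or -1 on error. The caller then
 * calls connect() as usual.
 */
int tcp_socket_from(const char *local_ip)
{
    struct sockaddr_in src;
    int fd;

    memset(&src, 0, sizeof(src));
    src.sin_family = AF_INET;
    src.sin_port = htons(0);
    if (inet_pton(AF_INET, local_ip, &src.sin_addr) != 1)
        return -1;

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    if (bind(fd, (struct sockaddr *)&src, sizeof(src)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Rotating the argument through 127.0.0.1 to 127.0.0.5 gives each source address its own pool of local ports, so the client is no longer capped by a single port range.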

/etc/security/limits.conf configures PAM: it's usually irrelevant for a server.

If the server is proxying requests using TCP, using upstream or mod_proxy for example, the server is limited by ip_local_port_range. This could easily be the 32,000 limit.

1

If you're considering an application where you believe you need to open thousands of sockets, you will definitely want to read about The C10k Problem. That page discusses many of the issues you will face as you scale up your number of client connections to a single server.

5 Comments

The C10K problem is from 2003... With 32000 clients connected, the server still performs very well. It can handle much more, believe me!
Do you really think a seven-year-old problem is still relevant with today's Core i7, 8 GB of RAM, and two 1 Gb network links? As I said in my first post, with 32720 clients connected, CPU usage is still under 10% and there is more than enough free memory (4 GB) to open more connections. Here are some ifstat rows:

           eth0
     KB/s in  KB/s out
       89.22    145.37
      126.97    136.15
      104.11    158.18
       84.17    123.62
       90.64    106.47
       93.17    125.98
       97.21    130.69
@TheSquad, aha, and most TCP stacks were written 30 years ago. Gigs of RAM have nothing to do with this, it's the client port range. You obviously have no clue, so do yourself a favor and listen to what experienced people have to say.
do yourself a favor and read this : metabrew.com/article/… experienced one...
I pointed out the RAM because every connection uses SSL, and SSL sessions take RAM...
0

On GNU+Linux, the maximum is what you wrote. This number is (probably) stated somewhere in the networking standards. I doubt you really need so many sockets. You should optimize the way you use sockets instead of creating dozens all the time.

2 Comments

No, socket is just a limited resource. Clients are using sockets. It is not true that socket = connected client or each client needs his own socket. It depends on protocol. For example, TCP needs such an association (1 socket - 1 client) but UDP does not. Even when using TCP, who said that connection must be continuous?
I meant that in our software a client = a socket... We use SSL, so UDP is out of the question, and the connection needs to be continuous...
0

In net/socket.c the fd is allocated in sock_alloc_fd(), which calls get_unused_fd().

Looking at linux/fs/file.c, the only limit to the number of fd's is sysctl_nr_open, which is limited to

    int sysctl_nr_open_max = 1024 * 1024; /* raised later */

    /// later...

    sysctl_nr_open_max = min((size_t)INT_MAX, ~(size_t)0/sizeof(void *)) &
                         -BITS_PER_LONG;

and can be read using sysctl fs.nr_open which gives 1M by default here. So the fd's are probably not your problem.

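Edit: you probably checked this as well, but would you care to share the output of

    #include <stdio.h>
    #include <sys/time.h>
    #include <sys/resource.h>

    int main(void)
    {
        struct rlimit limit;

        /* Query the soft (cur) and hard (max) limits on open file descriptors. */
        if (getrlimit(RLIMIT_NOFILE, &limit) != 0) {
            perror("getrlimit");
            return 1;
        }
        printf("cur: %lu, max: %lu\n",
               (unsigned long)limit.rlim_cur, (unsigned long)limit.rlim_max);
        return 0;
    }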

with us?

7 Comments

Yeah, the fds are fine; that was the first thing I checked... I'm more concerned about ports, but even there, there should be ~32000 more ports available.
Ports should be fine too. If you're running a server, it should be listening on one port, and all the clients would be connected to that same port number. Only a few protocols work differently -- with FTP being the only one i can come up with right off -- and that's because it uses a separate socket for data transfer.
your question was on sockets, and those don't seem to be the problem. "ports" cannot be a problem if you're the server and clients connect to you. Otherwise, it can be and you may have to increase net.ipv4.ip_local_port_range. Please be a little more specific on the situation; what exactly fails, giving what return value?
cHao: the incoming port is the same, but connections are two-way; the outgoing port is different for each client.
If you need one port per client, increase the port range, and use multiple ip's. There's only 64k ports in 2 bytes ;-)
0

Generally, having too many live connections is a bad thing. However, everything depends on the application and the way it communicates with its clients.

I suppose there are cases where clients have to stay permanently connected, asynchronously, and that is the only way a distributed solution can work.

Assuming there are no memory/CPU/network bottlenecks at the current load, and keeping in mind that leaving idle connections open may be the only way for a distributed application to consume fewer resources (say, connection time and overall/peak memory), overall OS network performance might be higher than with the best practices we all know.

It's a good question and it needs a solution. The problem is that nobody here can answer it. I would suggest a divide-and-conquer approach; when you have found the bottleneck, come back to us.

Take your application apart on a testbed and you will find the bottleneck.

1 Comment

Hi, this question is 2 years old, and of course I have gone past this limit many times since... Actually, the real limit you will hit in the Linux kernel is the number of file descriptors (open files); that is the only limit. The limit on sockets is actually 32768 per IP. Right now I have a working server (using Erlang) with close to 2 million users connected.
