by Jeffrey Hill » Tue Oct 01, 2013 7:49 pm
Short version: I've made some more configuration changes that will hopefully put a complete stop to the out-of-memory incidents that have caused the server to go down intermittently. Let me know if you notice the site getting significantly slower or anything like that.
Long, semi-technical version: Today when I got home from work the site wasn't responding, but I was able to log in remotely (although it was incredibly slow), which usually didn't work when the site went down in the past. I dug around trying to figure out what was going on, and all of a sudden the server rebooted. It turns out that the server was in such a bad state, constantly swapping memory to and from disk, that it was adversely affecting the servers of other customers running on the same physical host, and so the host initiated a hard reboot (which I would have done immediately had I not been able to log in remotely). Needless to say, I got a warning from my host to figure out what caused this.
I think I had the server configured to accept way too many simultaneous connections and some bot tried to make tons of requests at once. (The access logs appear to indicate this was the case, and I have blocked the offending IP.) A while back I had already reduced the maximum number of clients, but it apparently wasn't enough. I think that reduction may have actually contributed to the server going into a constant swap state rather than crashing entirely. So, I have slashed the number of simultaneous clients even further. Hopefully this won't adversely affect your experience on this board, moaca.org, and moqba.org, but if you notice that the sites are significantly slower or anything else doesn't seem to be working right, let me know.
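For anyone curious about the reasoning behind the client limit, here is a rough sketch of the arithmetic in Python. It is only an illustration: the function name and all of the memory figures are made up for the example and are not the actual numbers from this server. The idea is that each simultaneous client costs some amount of RAM, so the limit has to be low enough that a full load still fits in physical memory without swapping.

```python
def estimate_max_clients(total_ram_mb, reserved_mb, per_client_mb):
    """Estimate how many simultaneous clients fit in RAM without swapping.

    total_ram_mb  -- physical memory on the server (hypothetical figure)
    reserved_mb   -- memory set aside for the OS, database, etc. (assumption)
    per_client_mb -- rough memory cost of one client connection (assumption)
    """
    if per_client_mb <= 0:
        raise ValueError("per_client_mb must be positive")
    usable_mb = total_ram_mb - reserved_mb
    if usable_mb <= 0:
        # Nothing left over for clients at all.
        return 0
    # Round down: a partial client slot would push the server into swap.
    return usable_mb // per_client_mb


# Hypothetical example: 1024 MB of RAM, 256 MB reserved, 32 MB per client.
print(estimate_max_clients(1024, 256, 32))  # (1024 - 256) // 32 = 24
```

If the configured limit is well above an estimate like this, a burst of requests (such as the bot mentioned above) can use more memory than the machine has, which would match the swapping behaviour I saw.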
I hope this finally fixes the problem, because if this exact situation happens again, it sure sounds like it's going to be a lot more urgent for me to solve it once and for all.