Thin fails to stop, deletes pid file anyway
Reported by JPL | January 6th, 2009 @ 03:57 PM | in 1.1
Running thin 1.0.0 under ubuntu 8.04. During development we've
noticed that thin often fails to stop when we issue
/etc/init.d/thin stop
. To reproduce this bug, I booted
the system, stopped thin (worked ok), started thin, then tried to
stop it (failed). Process is still running but the pid files have
all been cleared out, so it can't terminated the running
process.
If I can provide any additional info, let me know.
Sequence of commands after first (successful) stop is below, along with log file and config file.
art@art:~$ /etc/init.d/thin start
[start] /etc/thin/art.yml ...
Starting server on 0.0.0.0:5000 ...
Starting server on 0.0.0.0:5001 ...
Starting server on 0.0.0.0:5002 ...
Starting server on 0.0.0.0:5003 ...
art@art:~$ ps -ef |grep thin
art 5012 1 83 15:20 ? 00:00:01 thin server (0.0.0.0:5000)
art 5017 1 59 15:20 ? 00:00:01 thin server (0.0.0.0:5001)
art 5022 1 50 15:20 ? 00:00:01 thin server (0.0.0.0:5002)
art 5027 1 72 15:20 ? 00:00:00 thin server (0.0.0.0:5003)
art 5029 4945 0 15:20 pts/0 00:00:00 grep thin
art@art:~$ /etc/init.d/thin stop
[stop] /etc/thin/art.yml ...
Stopping server on 0.0.0.0:5000 ...
Sending QUIT signal to process 5012 ...
>> Stopping ...
>> Thin web server (v1.0.0 codename That's What She Said)
>> Maximum connections set to 1024
>> Listening on 0.0.0.0:5000, CTRL+C to stop
Timeout!
Stopping server on 0.0.0.0:5001 ...
Sending QUIT signal to process 5017 ...
>> Exiting!
Stopping server on 0.0.0.0:5002 ...
Sending QUIT signal to process 5022 ...
>> Exiting!
Stopping server on 0.0.0.0:5003 ...
Sending QUIT signal to process 5027 ...
>> Exiting!
art@art:~$ ps -ef |grep thin
art 5012 1 0 15:20 ? 00:00:07 thin server (0.0.0.0:5000)
art@art:~$ /usr/bin/thin stop --all /etc/thin
[stop] /etc/thin/art.yml ...
Stopping server on 0.0.0.0:5000 ...
Can't stop process, no PID found in /home/art/tmp/pids/thin.5000.pid
Stopping server on 0.0.0.0:5001 ...
Can't stop process, no PID found in /home/art/tmp/pids/thin.5001.pid
Stopping server on 0.0.0.0:5002 ...
Can't stop process, no PID found in /home/art/tmp/pids/thin.5002.pid
Stopping server on 0.0.0.0:5003 ...
Can't stop process, no PID found in /home/art/tmp/pids/thin.5003.pid
art@art:~$ ls /home/art/tmp/pids/
art@art:~$
------ tail of /home/art/log/thin.5000.log ---
>> Writing PID to /home/art/tmp/pids/thin.5000.pid
>> Changing process privilege to art:art
>> Using rails adapter
>> Stopping ...
>> Thin web server (v1.0.0 codename That's What She Said)
>> Maximum connections set to 1024
>> Listening on 0.0.0.0:5000, CTRL+C to stop
------ config file /etc/thin/art.yml ------------
---
pid: /home/art/tmp/pids/thin.pid
user: art
adapter: rails
log: /home/art/log/thin.log
timeout: 180
max_conns: 1024
port: 5000
environment: production
chdir: /home/art
max_persistent_conns: 512
address: 0.0.0.0
servers: 4
group: art
daemonize: true
Comments and changes to this ticket
-
macournoyer January 6th, 2009 @ 04:15 PM
- Milestone changed from Future to 1.1
- State changed from new to open
hi jpl,
So only the server on port 5000 is not being stopped properly? I see it times out when it sends the QUIT signal.
Is that server still responding to http request before the stop on port 5000 ?
-
macournoyer January 6th, 2009 @ 05:16 PM
Those are warning because you don't send the correct number of args to a proc in index.html.erb.
So this time it worked? I see all processes were stopped.
-
todd (at snappl) January 11th, 2009 @ 03:58 AM
I'm seeing exactly the same behaviour on ubuntu 8.04:
sudo invoke-rc.d thin restart [restart] /etc/thin/s_snap.yml ... Stopping server on 0.0.0.0:8110 ... Sending QUIT signal to process 26380 ... Timeout! Stopping server on 0.0.0.0:8111 ... Sending QUIT signal to process 6244 ...
Exiting!
Stopping server on 0.0.0.0:8112 ... Sending QUIT signal to process 2538 ...
Exiting!
Stopping server on 0.0.0.0:8113 ... Sending QUIT signal to process 2797 ...
Exiting!
Stopping server on 0.0.0.0:8114 ... Sending QUIT signal to process 3092 ... Timeout! Stopping server on 0.0.0.0:8115 ... Sending QUIT signal to process 1817 ... Timeout! Stopping server on 0.0.0.0:8116 ... Sending QUIT signal to process 26410 ... Timeout! Stopping server on 0.0.0.0:8117 ... Sending QUIT signal to process 2124 ... Timeout!
have to manually go in and kill all the running processes.
-
todd (at snappl) January 11th, 2009 @ 04:00 AM
Oops, sorry about the formatting boo-boo:
sudo invoke-rc.d thin restart [restart] /etc/thin/s_snap.yml ... Stopping server on 0.0.0.0:8110 ... Sending QUIT signal to process 26380 ... Timeout! Stopping server on 0.0.0.0:8111 ... Sending QUIT signal to process 6244 ... >> Exiting!
Stopping server on 0.0.0.0:8112 ... Sending QUIT signal to process 2538 ... >> Exiting!
Stopping server on 0.0.0.0:8113 ... Sending QUIT signal to process 2797 ... >> Exiting!
Stopping server on 0.0.0.0:8114 ... Sending QUIT signal to process 3092 ... Timeout! Stopping server on 0.0.0.0:8115 ... Sending QUIT signal to process 1817 ... Timeout! Stopping server on 0.0.0.0:8116 ... Sending QUIT signal to process 26410 ... Timeout! Stopping server on 0.0.0.0:8117 ... Sending QUIT signal to process 2124 ... Timeout!
-
macournoyer January 11th, 2009 @ 03:37 PM
@todd: seems all your servers are timing out. oye! Do you have this error when just running 1 server?
-
macournoyer January 12th, 2009 @ 09:26 AM
Also, does this happen only when the servers are started at boot time?
If you remove thin from the /etc/init.d items and you start and stop, do you see the same behaviour?
-
todd (at snappl) January 13th, 2009 @ 01:11 PM
@macournoyer: No, it doesn't happen with just one. I'll qualify that:
-
Started from /etc/init.d/thin - always happens, although not always with all instances;
-
Started from command line in the app. One instance: does not seem to happen at all.
-
Started from command line in the app with three instances (production runs 8): does not seem to happen.
I'm attaching a file with a backtrace (so as to not completely mangle the comment again--preview mode please, lighthouse!)
-
-
Alex Arnell January 16th, 2009 @ 02:22 PM
I am having the same issue as well. Except in my it happens on CentOS. We've spend nearly 2 days digging into this trying to re-create the problem in small scripts without any luck.
We're using Thin to server a cluster of Merb processes. On my local OS 10.5.6 machine, it works perfect, starts, stops, restarts all fine. On the production CentOS machines they won't restart or stop. Note: Merb traps HUP for restarts and ABRT kills forked child workers (thin).
We attached strace to one of our workers and I think the problem is related to EventMachine on linux systems with epoll.
epoll_wait(10, {}, 65536, 50) = 0 epoll_wait(10, {}, 65536, 50) = 0 epoll_wait(10, 7fffe93262c0, 65536, 50) = -1 EINTR (Interrupted system call) --- SIGHUP (Hangup) @ 0 (0) --- rt_sigreturn(0x1) = -1 EINTR (Interrupted system call) select(0, [], [], [], {0, 4998}) = 0 (Timeout) select(0, [], [], [], {0, 326}) = 0 (Timeout) select(0, [], [], [], {0, 0}) = 0 (Timeout) epoll_wait(10, {}, 65536, 50) = 0 epoll_wait(10, {}, 65536, 50) = 0
It just eats the interupt and starts its loop up again.
-
macournoyer January 16th, 2009 @ 02:33 PM
@alex try w/ the "no_epoll: true" option to see if that really is the problem and I'll open a ticket for EM ppl at http://rubyeventmachine.com
-
Alex Arnell January 16th, 2009 @ 03:57 PM
Yup that proved it, strace output below.
select(12, [7 10], [], [], {0, 89998}) = 0 (Timeout) select(12, [7 10], [], [], {0, 0}) = 0 (Timeout) select(12, [7 10], [], [], {0, 89998}) = ? ERESTARTNOHAND (To be restarted) --- SIGHUP (Hangup) @ 0 (0) --- rt_sigreturn(0x1) = -1 EINTR (Interrupted system call) rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0 rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0 rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
And the process behaved like it should.
BTW, I had to monkey patch Merb's thin adapter to configure the no_epoll explicitly on the thin server instance.
Thin's command line --no-epoll results in an error when starting as cluster:
thin start -A merb -s 2 -p 5000 --no-epoll Starting server on 0.0.0.0:5000 ... /usr/lib/ruby/1.8/optparse.rb:1443:in `complete': invalid option: --no_epoll (OptionParser::InvalidOption) from /usr/lib/ruby/1.8/optparse.rb:1441:in `catch' from /usr/lib/ruby/1.8/optparse.rb:1441:in `complete' from /usr/lib/ruby/1.8/optparse.rb:1254:in `parse_in_order' from /usr/lib/ruby/1.8/optparse.rb:1247:in `catch' from /usr/lib/ruby/1.8/optparse.rb:1247:in `parse_in_order' from /usr/lib/ruby/1.8/optparse.rb:1241:in `order!' from /usr/lib/ruby/1.8/optparse.rb:1332:in `permute!' from /usr/lib/ruby/1.8/optparse.rb:1353:in `parse!' from /usr/lib/ruby/gems/1.8/gems/thin-1.0.0/lib/thin/runner.rb:130:in `parse!' from /usr/lib/ruby/gems/1.8/gems/thin-1.0.0/lib/thin/runner.rb:47:in `initialize' from /usr/lib/ruby/gems/1.8/gems/thin-1.0.0/bin/thin:6:in `new' from /usr/lib/ruby/gems/1.8/gems/thin-1.0.0/bin/thin:6 from /usr/bin/thin:31:in `load' from /usr/bin/thin:31
-
macournoyer January 16th, 2009 @ 04:08 PM
@alex I just fixed the option parsing error on git and will notify the EM ppl about the epoll issue.
Thx for reporting this. I will let you know about any progress on EM side.
-
Andrew Otto January 18th, 2009 @ 04:55 PM
@alex
How did you monkey-patch merb to use no_epoll? I'm having the same trouble on Gentoo using the default thin in portage and would rather not have to install thin from source to pick up macournoyer's fix.
Thanks!
-
Alex Arnell January 19th, 2009 @ 12:59 PM
@Andrew
I just subclassed Merb's Thin adapter. See the attached file. I placed thin in [merb-app-root]/merb/rack/ and then required it in my init.rb.
You could place the file anywhere really, just make sure to require it as somewhere in Merb's boot process before it tries to construct the adapter.
-
Alex Arnell January 19th, 2009 @ 01:14 PM
Oh and I forget to mention. Start your merb app with
merb -a nethin
ormerb -a ne
. -
Daniel Parker March 16th, 2009 @ 01:16 PM
I'm having this same problem, even after applying what I can see was fixed here so far: 1) Updated to eventmachine-0.12.6 2) Required the following file in my merb init.rb:
module Merb module Rack class Thin class << self alias :with_epoll_new_server :new_server def new_server(port) server = with_epoll_new_server(port) server.no_epoll = true server.config server end end end # no-epoll Thin end # Rack end # Merb
Thin Config
--- user: www-data group: www-data pid: /tmp/pids/thin.pid address: 0.0.0.0 timeout: 30 port: 6011 log: log/thin.log max_conns: 1024 require: []
environment: production max_persistent_conns: 512 daemonize: true servers: 2 chdir: /home/dcparker/app/
Bash Walk-through
$ tail -f app/log/thin.601.log & #<- (watch file while performing new commands) $ sudo thin start --config SiteConfigs/app-thin.yml*
Starting server on 0.0.0.0:6011 ... Starting server on 0.0.0.0:6012 ... ==> app/log/thin.6011.log <==
Writing PID to /tmp/pids/thin.6011.pid Changing process privilege to www-data:www-data Using merb adapter Loading init file from /home/dcparker/app/config/init.rb Loading /home/dcparker/app/config/environments/production.rb Logging to file at /home/dcparker/app/log/production.log Thin web server (v1.0.0 codename That's What She Said) Maximum connections set to 1024 Listening on 0.0.0.0:6011, CTRL+C to stop
==> app/log/thin.6012.log <==
Writing PID to /tmp/pids/thin.6012.pid Changing process privilege to www-data:www-data Using merb adapter Loading init file from /home/dcparker/app/config/init.rb Loading /home/dcparker/app/config/environments/production.rb Logging to file at /home/dcparker/app/log/production.log Thin web server (v1.0.0 codename That's What She Said) Maximum connections set to 1024 Listening on 0.0.0.0:6012, CTRL+C to stop
# At this point the service is running fine... the pid file at /tmp/pids/thin.6011.pid points to one of the parent processes running merb.
$ sudo thin stop --config SiteConfigs/app-thin.yml Stopping server on 0.0.0.0:6011 ... Sending QUIT signal to process 5492 ...
Stopping ...
==> app/log/thin.6011.log <==
Stopping ... Timeout! Stopping server on 0.0.0.0:6012 ... Sending QUIT signal to process 5497 ... Stopping ...
==> app/log/thin.6012.log <==
Stopping ... Timeout! $ ps -Af | grep merb
www-data 5492 1 0 17:00 ? 00:00:00 merb : merb : master www-data 5497 1 0 17:00 ? 00:00:00 merb : merb : master www-data 5498 5492 0 17:00 ? 00:00:00 merb : merb : master www-data 5499 5497 0 17:00 ? 00:00:00 merb : merb : master <br/>
# At this point the pid files are gone, but there are still merb process running. I can't stop them with kill -QUIT or kill -TERM, but kill -INT seems to do the job correctly (same as Ctrl+C):
$ sudo kill -INT 5492
==> apps/gopennyless.com/log/thin.6011.log <==
Exiting! Exiting!
$ sudo kill -INT 5497
==> apps/gopennyless.com/log/thin.6012.log <==
Exiting! Exiting!
So...
What seems to be the problem? It'd be nice if "thin stop" would stop the processes nicely - any ideas? Does it have to do with the user/group I'm running it as?
-
macournoyer March 16th, 2009 @ 08:47 PM
Seems like only the merb master processes are still there. I don't know how this works, but can you disable the master-process feature and try again?
-
Daniel Parker March 29th, 2009 @ 11:57 PM
In my example, the processes I kill manually are the process numbers listed in the PID files. I don't know how I have merb configured for this app, but the merb processes listed there are the only processes there in the first place. They just seem to be left untouched, then the PID files are deleted. But I can send kill -INT to them and they terminate just fine.
-
Alexey Chebotar May 5th, 2009 @ 04:53 AM
- Tag set to daemon, ruby19
Hello Marc. On ruby 1.9.1_p0 force_kill method does not work. To reproduce: Change method to (removed 'rescue nil'):
def force_kill(pid_file) Logging.log "Force KILL!" Process.kill("KILL", File.read(pid_file)) File.delete(pid_file) if File.exist?(pid_file) end
Try to stop Thin from daemonized mode you will see error:
# thin stop -P /var/run/onm.pid -t 10 Sending QUIT signal to process 8602 ... Timeout! Force KILL! /usr/lib/ruby/gems/1.9.1/gems/thin-1.0.0/lib/thin/daemonizing.rb:134:in `kill': can't convert String into Integer (TypeError) from /usr/lib/ruby/gems/1.9.1/gems/thin-1.0.0/lib/thin/daemonizing.rb:134:in `force_kill' from /usr/lib/ruby/gems/1.9.1/gems/thin-1.0.0/lib/thin/daemonizing.rb:124:in `rescue in send_signal' from /usr/lib/ruby/gems/1.9.1/gems/thin-1.0.0/lib/thin/daemonizing.rb:111:in `send_signal' from /usr/lib/ruby/gems/1.9.1/gems/thin-1.0.0/lib/thin/daemonizing.rb:100:in `kill' from /usr/lib/ruby/gems/1.9.1/gems/thin-1.0.0/lib/thin/controllers/controller.rb:87:in `block in stop' from /usr/lib/ruby/gems/1.9.1/gems/thin-1.0.0/lib/thin/controllers/controller.rb:128:in `tail_log' from /usr/lib/ruby/gems/1.9.1/gems/thin-1.0.0/lib/thin/controllers/controller.rb:86:in `stop' from /usr/lib/ruby/gems/1.9.1/gems/thin-1.0.0/lib/thin/runner.rb:173:in `run_command' from /usr/lib/ruby/gems/1.9.1/gems/thin-1.0.0/lib/thin/runner.rb:139:in `run!' from /usr/lib/ruby/gems/1.9.1/gems/thin-1.0.0/bin/thin:6:in `<top (required)>' from /usr/bin/thin:19:in `load' from /usr/bin/thin:19:in `<main>'
To solve this problem you can change this method to:
def force_kill(pid_file) if File.file?(pid_file) && pid = File.read(pid_file) pid = pid.to_i Logging.log "Sending KILL signal to process #{pid} ... " Process.kill("KILL", pid) File.delete(pid_file) if File.exist?(pid_file) else puts "Can't stop process, no PID found in #{pid_file}" end end
Stopping thin will looks like:
# thin stop -P /var/run/onm.pid -t 10 Sending QUIT signal to process 8602 ... Timeout! Sending KILL signal to process 8602 ...
-
macournoyer May 21st, 2009 @ 04:24 PM
- State changed from open to resolved
(from [b1908c0c38e4b7b253975492abc463cddf631dde]) Fix force kill under 1.9 and name version [#92 state:resolved] http://github.com/macournoyer/thin/commit/b1908c0c38e4b7b253975492a...
Please Sign in or create a free account to add a new ticket.
With your very own profile, you can contribute to projects, track your activity, watch tickets, receive and update tickets through your email and much more.
Create your profile
Help contribute to this project by taking a few moments to create your personal profile. Create your profile ยป
People watching this ticket
Attachments
Referenced by
- 92 Thin fails to stop, deletes pid file anyway (from [b1908c0c38e4b7b253975492abc463cddf631dde]) Fix for...