#92 ✓resolved
JPL

Thin fails to stop, deletes pid file anyway

Reported by JPL | January 6th, 2009 @ 03:57 PM | in 1.1

Running thin 1.0.0 under ubuntu 8.04. During development we've noticed that thin often fails to stop when we issue /etc/init.d/thin stop. To reproduce this bug, I booted the system, stopped thin (worked ok), started thin, then tried to stop it (failed). Process is still running but the pid files have all been cleared out, so it can't terminated the running process.

If I can provide any additional info, let me know.

Sequence of commands after first (successful) stop is below, along with log file and config file.

art@art:~$ /etc/init.d/thin start
[start] /etc/thin/art.yml ...
Starting server on 0.0.0.0:5000 ... 
Starting server on 0.0.0.0:5001 ... 
Starting server on 0.0.0.0:5002 ... 
Starting server on 0.0.0.0:5003 ... 
art@art:~$ ps -ef |grep thin
art       5012     1 83 15:20 ?        00:00:01 thin server (0.0.0.0:5000)                                                                                                                                                                                                                                                                         
art       5017     1 59 15:20 ?        00:00:01 thin server (0.0.0.0:5001)                                                                                                                                                                                                                                                                         
art       5022     1 50 15:20 ?        00:00:01 thin server (0.0.0.0:5002)                                                                                                                                                                                                                                                                         
art       5027     1 72 15:20 ?        00:00:00 thin server (0.0.0.0:5003)                                                                                                                                                                                                                                                                         
art       5029  4945  0 15:20 pts/0    00:00:00 grep thin
art@art:~$ /etc/init.d/thin stop
[stop] /etc/thin/art.yml ...
Stopping server on 0.0.0.0:5000 ... 

Sending QUIT signal to process 5012 ... 
>> Stopping ...
>> Thin web server (v1.0.0 codename That's What She Said)
>> Maximum connections set to 1024
>> Listening on 0.0.0.0:5000, CTRL+C to stop
Timeout!
Stopping server on 0.0.0.0:5001 ... 
Sending QUIT signal to process 5017 ... 
>> Exiting!

Stopping server on 0.0.0.0:5002 ... 
Sending QUIT signal to process 5022 ... 
>> Exiting!

Stopping server on 0.0.0.0:5003 ... 
Sending QUIT signal to process 5027 ... 
>> Exiting!

art@art:~$ ps -ef |grep thin
art       5012     1  0 15:20 ?        00:00:07 thin server (0.0.0.0:5000)                                                                                                                                                                                                                                                                         

art@art:~$ /usr/bin/thin stop --all /etc/thin
[stop] /etc/thin/art.yml ...
Stopping server on 0.0.0.0:5000 ... 
Can't stop process, no PID found in /home/art/tmp/pids/thin.5000.pid
Stopping server on 0.0.0.0:5001 ... 
Can't stop process, no PID found in /home/art/tmp/pids/thin.5001.pid
Stopping server on 0.0.0.0:5002 ... 
Can't stop process, no PID found in /home/art/tmp/pids/thin.5002.pid
Stopping server on 0.0.0.0:5003 ... 
Can't stop process, no PID found in /home/art/tmp/pids/thin.5003.pid
art@art:~$ ls /home/art/tmp/pids/
art@art:~$

------ tail of /home/art/log/thin.5000.log ---
>> Writing PID to /home/art/tmp/pids/thin.5000.pid
>> Changing process privilege to art:art
>> Using rails adapter
>> Stopping ...
>> Thin web server (v1.0.0 codename That's What She Said)
>> Maximum connections set to 1024
>> Listening on 0.0.0.0:5000, CTRL+C to stop



------ config file /etc/thin/art.yml ------------
--- 
pid: /home/art/tmp/pids/thin.pid
user: art
adapter: rails
log: /home/art/log/thin.log
timeout: 180
max_conns: 1024
port: 5000
environment: production
chdir: /home/art
max_persistent_conns: 512
address: 0.0.0.0
servers: 4
group: art
daemonize: true

Comments and changes to this ticket

  • macournoyer

    macournoyer January 6th, 2009 @ 04:15 PM

    • Milestone changed from Future to 1.1
    • State changed from “new” to “open”

    hi jpl,

    So only the server on port 5000 is not being stopped properly? I see it times out when it sends the QUIT signal.

    Is that server still responding to http request before the stop on port 5000 ?

  • macournoyer

    macournoyer January 6th, 2009 @ 05:16 PM

    Those are warning because you don't send the correct number of args to a proc in index.html.erb.

    So this time it worked? I see all processes were stopped.

  • todd (at snappl)

    todd (at snappl) January 11th, 2009 @ 03:58 AM

    I'm seeing exactly the same behaviour on ubuntu 8.04:

    sudo invoke-rc.d thin restart [restart] /etc/thin/s_snap.yml ... Stopping server on 0.0.0.0:8110 ... Sending QUIT signal to process 26380 ... Timeout! Stopping server on 0.0.0.0:8111 ... Sending QUIT signal to process 6244 ...

    Exiting!

    Stopping server on 0.0.0.0:8112 ... Sending QUIT signal to process 2538 ...

    Exiting!

    Stopping server on 0.0.0.0:8113 ... Sending QUIT signal to process 2797 ...

    Exiting!

    Stopping server on 0.0.0.0:8114 ... Sending QUIT signal to process 3092 ... Timeout! Stopping server on 0.0.0.0:8115 ... Sending QUIT signal to process 1817 ... Timeout! Stopping server on 0.0.0.0:8116 ... Sending QUIT signal to process 26410 ... Timeout! Stopping server on 0.0.0.0:8117 ... Sending QUIT signal to process 2124 ... Timeout!

    have to manually go in and kill all the running processes.

  • todd (at snappl)

    todd (at snappl) January 11th, 2009 @ 04:00 AM

    Oops, sorry about the formatting boo-boo:

    sudo invoke-rc.d thin restart [restart] /etc/thin/s_snap.yml ... Stopping server on 0.0.0.0:8110 ... Sending QUIT signal to process 26380 ... Timeout! Stopping server on 0.0.0.0:8111 ... Sending QUIT signal to process 6244 ... >> Exiting!

    Stopping server on 0.0.0.0:8112 ... Sending QUIT signal to process 2538 ... >> Exiting!

    Stopping server on 0.0.0.0:8113 ... Sending QUIT signal to process 2797 ... >> Exiting!

    Stopping server on 0.0.0.0:8114 ... Sending QUIT signal to process 3092 ... Timeout! Stopping server on 0.0.0.0:8115 ... Sending QUIT signal to process 1817 ... Timeout! Stopping server on 0.0.0.0:8116 ... Sending QUIT signal to process 26410 ... Timeout! Stopping server on 0.0.0.0:8117 ... Sending QUIT signal to process 2124 ... Timeout!

  • macournoyer

    macournoyer January 11th, 2009 @ 03:37 PM

    @todd: seems all your servers are timing out. oye! Do you have this error when just running 1 server?

  • macournoyer

    macournoyer January 12th, 2009 @ 09:26 AM

    Also, does this happen only when the servers are started at boot time?

    If you remove thin from the /etc/init.d items and you start and stop, do you see the same behaviour?

  • todd (at snappl)

    todd (at snappl) January 13th, 2009 @ 01:11 PM

    @macournoyer: No, it doesn't happen with just one. I'll qualify that:

    • Started from /etc/init.d/thin - always happens, although not always with all instances;

    • Started from command line in the app. One instance: does not seem to happen at all.

    • Started from command line in the app with three instances (production runs 8): does not seem to happen.

    I'm attaching a file with a backtrace (so as to not completely mangle the comment again--preview mode please, lighthouse!)

  • Alex Arnell

    Alex Arnell January 16th, 2009 @ 02:22 PM

    I am having the same issue as well. Except in my it happens on CentOS. We've spend nearly 2 days digging into this trying to re-create the problem in small scripts without any luck.

    We're using Thin to server a cluster of Merb processes. On my local OS 10.5.6 machine, it works perfect, starts, stops, restarts all fine. On the production CentOS machines they won't restart or stop. Note: Merb traps HUP for restarts and ABRT kills forked child workers (thin).

    We attached strace to one of our workers and I think the problem is related to EventMachine on linux systems with epoll.

    
    epoll_wait(10, {}, 65536, 50)           = 0
    epoll_wait(10, {}, 65536, 50)           = 0
    epoll_wait(10, 7fffe93262c0, 65536, 50) = -1 EINTR (Interrupted system call)
    --- SIGHUP (Hangup) @ 0 (0) ---
    rt_sigreturn(0x1)                       = -1 EINTR (Interrupted system call)
    select(0, [], [], [], {0, 4998})        = 0 (Timeout)
    select(0, [], [], [], {0, 326})         = 0 (Timeout)
    select(0, [], [], [], {0, 0})           = 0 (Timeout)
    epoll_wait(10, {}, 65536, 50)           = 0
    epoll_wait(10, {}, 65536, 50)           = 0
    

    It just eats the interupt and starts its loop up again.

  • macournoyer

    macournoyer January 16th, 2009 @ 02:33 PM

    @alex try w/ the "no_epoll: true" option to see if that really is the problem and I'll open a ticket for EM ppl at http://rubyeventmachine.com

  • Alex Arnell

    Alex Arnell January 16th, 2009 @ 03:57 PM

    Yup that proved it, strace output below.

    
    select(12, [7 10], [], [], {0, 89998})  = 0 (Timeout)
    select(12, [7 10], [], [], {0, 0})      = 0 (Timeout)
    select(12, [7 10], [], [], {0, 89998})  = ? ERESTARTNOHAND (To be restarted)
    --- SIGHUP (Hangup) @ 0 (0) ---
    rt_sigreturn(0x1)                       = -1 EINTR (Interrupted system call)
    rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
    rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
    rt_sigprocmask(SIG_BLOCK, NULL, [], 8)  = 0
    

    And the process behaved like it should.

    BTW, I had to monkey patch Merb's thin adapter to configure the no_epoll explicitly on the thin server instance.

    Thin's command line --no-epoll results in an error when starting as cluster:

    
    thin start -A merb -s 2 -p 5000 --no-epoll
    Starting server on 0.0.0.0:5000 ... 
    /usr/lib/ruby/1.8/optparse.rb:1443:in `complete': invalid option: --no_epoll (OptionParser::InvalidOption)
    	from /usr/lib/ruby/1.8/optparse.rb:1441:in `catch'
    	from /usr/lib/ruby/1.8/optparse.rb:1441:in `complete'
    	from /usr/lib/ruby/1.8/optparse.rb:1254:in `parse_in_order'
    	from /usr/lib/ruby/1.8/optparse.rb:1247:in `catch'
    	from /usr/lib/ruby/1.8/optparse.rb:1247:in `parse_in_order'
    	from /usr/lib/ruby/1.8/optparse.rb:1241:in `order!'
    	from /usr/lib/ruby/1.8/optparse.rb:1332:in `permute!'
    	from /usr/lib/ruby/1.8/optparse.rb:1353:in `parse!'
    	from /usr/lib/ruby/gems/1.8/gems/thin-1.0.0/lib/thin/runner.rb:130:in `parse!'
    	from /usr/lib/ruby/gems/1.8/gems/thin-1.0.0/lib/thin/runner.rb:47:in `initialize'
    	from /usr/lib/ruby/gems/1.8/gems/thin-1.0.0/bin/thin:6:in `new'
    	from /usr/lib/ruby/gems/1.8/gems/thin-1.0.0/bin/thin:6
    	from /usr/bin/thin:31:in `load'
    	from /usr/bin/thin:31
    
  • macournoyer

    macournoyer January 16th, 2009 @ 04:08 PM

    @alex I just fixed the option parsing error on git and will notify the EM ppl about the epoll issue.

    Thx for reporting this. I will let you know about any progress on EM side.

  • macournoyer
  • Andrew Otto

    Andrew Otto January 18th, 2009 @ 04:55 PM

    @alex

    How did you monkey-patch merb to use no_epoll? I'm having the same trouble on Gentoo using the default thin in portage and would rather not have to install thin from source to pick up macournoyer's fix.

    Thanks!

  • Alex Arnell

    Alex Arnell January 19th, 2009 @ 12:59 PM

    @Andrew

    I just subclassed Merb's Thin adapter. See the attached file. I placed thin in [merb-app-root]/merb/rack/ and then required it in my init.rb.

    You could place the file anywhere really, just make sure to require it as somewhere in Merb's boot process before it tries to construct the adapter.

  • Alex Arnell

    Alex Arnell January 19th, 2009 @ 01:14 PM

    Oh and I forget to mention. Start your merb app with merb -a nethin or merb -a ne.

  • Daniel Parker

    Daniel Parker March 16th, 2009 @ 01:16 PM

    I'm having this same problem, even after applying what I can see was fixed here so far: 1) Updated to eventmachine-0.12.6 2) Required the following file in my merb init.rb:

    
    module Merb
      module Rack
        class Thin
          class << self
            alias :with_epoll_new_server :new_server
    
            def new_server(port)
              server = with_epoll_new_server(port)
              server.no_epoll = true
              server.config
              server
            end
          end
        end # no-epoll Thin
      end # Rack
    end # Merb
    

    Thin Config

    --- user: www-data group: www-data pid: /tmp/pids/thin.pid address: 0.0.0.0 timeout: 30 port: 6011 log: log/thin.log max_conns: 1024 require: []

    environment: production max_persistent_conns: 512 daemonize: true servers: 2 chdir: /home/dcparker/app/

    Bash Walk-through

    $ tail -f app/log/thin.601.log & #<- (watch file while performing new commands) $ sudo thin start --config SiteConfigs/app-thin.yml* Starting server on 0.0.0.0:6011 ... Starting server on 0.0.0.0:6012 ... ==> app/log/thin.6011.log <==

    Writing PID to /tmp/pids/thin.6011.pid Changing process privilege to www-data:www-data Using merb adapter Loading init file from /home/dcparker/app/config/init.rb Loading /home/dcparker/app/config/environments/production.rb Logging to file at /home/dcparker/app/log/production.log Thin web server (v1.0.0 codename That's What She Said) Maximum connections set to 1024 Listening on 0.0.0.0:6011, CTRL+C to stop

    ==> app/log/thin.6012.log <==

    Writing PID to /tmp/pids/thin.6012.pid Changing process privilege to www-data:www-data Using merb adapter Loading init file from /home/dcparker/app/config/init.rb Loading /home/dcparker/app/config/environments/production.rb Logging to file at /home/dcparker/app/log/production.log Thin web server (v1.0.0 codename That's What She Said) Maximum connections set to 1024 Listening on 0.0.0.0:6012, CTRL+C to stop

    # At this point the service is running fine... the pid file at /tmp/pids/thin.6011.pid points to one of the parent processes running merb.

    $ sudo thin stop --config SiteConfigs/app-thin.yml Stopping server on 0.0.0.0:6011 ... Sending QUIT signal to process 5492 ...

    Stopping ...

    ==> app/log/thin.6011.log <==

    Stopping ... Timeout! Stopping server on 0.0.0.0:6012 ... Sending QUIT signal to process 5497 ... Stopping ...

    ==> app/log/thin.6012.log <==

    Stopping ... Timeout! $ ps -Af | grep merb www-data 5492 1 0 17:00 ? 00:00:00 merb : merb : master www-data 5497 1 0 17:00 ? 00:00:00 merb : merb : master www-data 5498 5492 0 17:00 ? 00:00:00 merb : merb : master www-data 5499 5497 0 17:00 ? 00:00:00 merb : merb : master <br/>

    # At this point the pid files are gone, but there are still merb process running. I can't stop them with kill -QUIT or kill -TERM, but kill -INT seems to do the job correctly (same as Ctrl+C):

    $ sudo kill -INT 5492 ==> apps/gopennyless.com/log/thin.6011.log <==

    Exiting! Exiting!

    $ sudo kill -INT 5497 ==> apps/gopennyless.com/log/thin.6012.log <==

    Exiting! Exiting!

    So...

    What seems to be the problem? It'd be nice if "thin stop" would stop the processes nicely - any ideas? Does it have to do with the user/group I'm running it as?

  • macournoyer

    macournoyer March 16th, 2009 @ 08:47 PM

    Seems like only the merb master processes are still there. I don't know how this works, but can you disable the master-process feature and try again?

  • Daniel Parker

    Daniel Parker March 29th, 2009 @ 11:57 PM

    In my example, the processes I kill manually are the process numbers listed in the PID files. I don't know how I have merb configured for this app, but the merb processes listed there are the only processes there in the first place. They just seem to be left untouched, then the PID files are deleted. But I can send kill -INT to them and they terminate just fine.

  • Alexey Chebotar

    Alexey Chebotar May 5th, 2009 @ 04:53 AM

    • Tag set to daemon, ruby19

    Hello Marc. On ruby 1.9.1_p0 force_kill method does not work. To reproduce: Change method to (removed 'rescue nil'):

    
          def force_kill(pid_file)
            Logging.log "Force KILL!"
            Process.kill("KILL", File.read(pid_file))
            File.delete(pid_file) if File.exist?(pid_file)
          end
    

    Try to stop Thin from daemonized mode you will see error:

    
    # thin stop -P /var/run/onm.pid -t 10
    Sending QUIT signal to process 8602 ...
    Timeout!
    Force KILL!
    /usr/lib/ruby/gems/1.9.1/gems/thin-1.0.0/lib/thin/daemonizing.rb:134:in `kill': can't convert String into Integer (TypeError)
     from /usr/lib/ruby/gems/1.9.1/gems/thin-1.0.0/lib/thin/daemonizing.rb:134:in `force_kill'
     from /usr/lib/ruby/gems/1.9.1/gems/thin-1.0.0/lib/thin/daemonizing.rb:124:in `rescue in send_signal'
     from /usr/lib/ruby/gems/1.9.1/gems/thin-1.0.0/lib/thin/daemonizing.rb:111:in `send_signal'
     from /usr/lib/ruby/gems/1.9.1/gems/thin-1.0.0/lib/thin/daemonizing.rb:100:in `kill'
     from /usr/lib/ruby/gems/1.9.1/gems/thin-1.0.0/lib/thin/controllers/controller.rb:87:in `block in stop' 
     from /usr/lib/ruby/gems/1.9.1/gems/thin-1.0.0/lib/thin/controllers/controller.rb:128:in `tail_log'  
     from /usr/lib/ruby/gems/1.9.1/gems/thin-1.0.0/lib/thin/controllers/controller.rb:86:in `stop'
     from /usr/lib/ruby/gems/1.9.1/gems/thin-1.0.0/lib/thin/runner.rb:173:in `run_command' 
     from /usr/lib/ruby/gems/1.9.1/gems/thin-1.0.0/lib/thin/runner.rb:139:in `run!' 
     from /usr/lib/ruby/gems/1.9.1/gems/thin-1.0.0/bin/thin:6:in `<top (required)>' 
     from /usr/bin/thin:19:in `load'  
     from /usr/bin/thin:19:in `<main>'
    

    To solve this problem you can change this method to:

    
          def force_kill(pid_file)
            if File.file?(pid_file) && pid = File.read(pid_file)
              pid = pid.to_i
              Logging.log "Sending KILL signal to process #{pid} ... "
              Process.kill("KILL", pid)
              File.delete(pid_file) if File.exist?(pid_file)
            else
              puts "Can't stop process, no PID found in #{pid_file}"
            end
          end
    

    Stopping thin will looks like:

    
    # thin stop -P /var/run/onm.pid -t 10
    Sending QUIT signal to process 8602 ...
    Timeout!
    Sending KILL signal to process 8602 ...
    
  • macournoyer

    macournoyer May 21st, 2009 @ 04:24 PM

    • State changed from “open” to “resolved”

Please Sign in or create a free account to add a new ticket.

With your very own profile, you can contribute to projects, track your activity, watch tickets, receive and update tickets through your email and much more.

New-ticket Create new ticket

Create your profile

Help contribute to this project by taking a few moments to create your personal profile. Create your profile ยป

People watching this ticket

Attachments

Tags

Referenced by

Pages