Blogged by Ujihisa. Standard methods of programming and thoughts including Clojure, Vim, LLVM, Haskell, Ruby and Mathematics written by a Japanese programmer. github/ujihisa

Friday, March 5, 2010

All About Spawn

Spawn = Fork + Exec, but works also on platforms which don't support fork.

The usage of spawn is often said to be cryptic. Here I'll describe common cases.

  • system('ls')

    pid = spawn(['ls', 'ls'])
    Process.wait(pid)
    
  • system('ls .')

    pid = spawn('ls', '.')
    Process.wait(pid)
    
  • system('ls . &')

    pid = spawn('ls', '.')
    
  • system('ls . > a.txt')

    pid = spawn('ls', '.', :out => 'a.txt')
    Process.wait(pid)
    
  • system('ls . >> a.txt')

    pid = spawn('ls', '.', :out => ['a.txt', 'a'])
    Process.wait(pid)
    
  • system('ls . >& a.txt')

    pid = spawn('ls', '.', [:out, :err] => ['a.txt', 'w'])
    Process.wait(pid)
    
  • IO.popen('cat a.txt') {|io| p io.read

    i, o = IO.pipe
    spawn('cat a.txt', :out => o)
    o.close
    p i.read
    
  • system('make all &')

    spawn('make', 'all)
    
  • Dir.chdir('a') { system 'make all &' }

    spawn('make', 'all', :chdir => 'a')
    
  • Passing ENV:

    • Shell: $ AAA=1 make all &
    • Ruby: ENV['AAA'] = '1'; system('make all &')
    • Ruby with Spawn: spawn({'AAA' => '1'}, 'make', 'all')

Further information can be available in process.c in ruby trunk. Here the documentation from the file:

/*
* call-seq:
* spawn([env,] command... [,options]) => pid
* Process.spawn([env,] command... [,options]) => pid
*
* spawn executes specified command and return its pid.
*
* This method doesn't wait for end of the command.
* The parent process should
* use <code>Process.wait</code> to collect
* the termination status of its child or
* use <code>Process.detach</code> to register
* disinterest in their status;
* otherwise, the operating system may accumulate zombie processes.
*
* spawn has bunch of options to specify process attributes:
*
* env: hash
* name => val : set the environment variable
* name => nil : unset the environment variable
* command...:
* commandline : command line string which is passed to the standard shell
* cmdname, arg1, ... : command name and one or more arguments (no shell)
* [cmdname, argv0], arg1, ... : command name, argv[0] and zero or more arguments (no shell)
* options: hash
* clearing environment variables:
* :unsetenv_others => true : clear environment variables except specified by env
* :unsetenv_others => false : don't clear (default)
* process group:
* :pgroup => true or 0 : make a new process group
* :pgroup => pgid : join to specified process group
* :pgroup => nil : don't change the process group (default)
* resource limit: resourcename is core, cpu, data, etc. See Process.setrlimit.
* :rlimit_resourcename => limit
* :rlimit_resourcename => [cur_limit, max_limit]
* current directory:
* :chdir => str
* umask:
* :umask => int
* redirection:
* key:
* FD : single file descriptor in child process
* [FD, FD, ...] : multiple file descriptor in child process
* value:
* FD : redirect to the file descriptor in parent process
* string : redirect to file with open(string, "r" or "w")
* [string] : redirect to file with open(string, File::RDONLY)
* [string, open_mode] : redirect to file with open(string, open_mode, 0644)
* [string, open_mode, perm] : redirect to file with open(string, open_mode, perm)
* [:child, FD] : redirect to the redirected file descriptor
* :close : close the file descriptor in child process
* FD is one of follows
* :in : the file descriptor 0 which is the standard input
* :out : the file descriptor 1 which is the standard output
* :err : the file descriptor 2 which is the standard error
* integer : the file descriptor of specified the integer
* io : the file descriptor specified as io.fileno
* file descriptor inheritance: close non-redirected non-standard fds (3, 4, 5, ...) or not
* :close_others => false : inherit fds (default for system and exec)
* :close_others => true : don't inherit (default for spawn and IO.popen)
*
* If a hash is given as +env+, the environment is
* updated by +env+ before <code>exec(2)</code> in the child process.
* If a pair in +env+ has nil as the value, the variable is deleted.
*
* # set FOO as BAR and unset BAZ.
* pid = spawn({"FOO"=>"BAR", "BAZ"=>nil}, command)
*
* If a hash is given as +options+,
* it specifies
* process group,
* resource limit,
* current directory,
* umask and
* redirects for the child process.
* Also, it can be specified to clear environment variables.
*
* The <code>:unsetenv_others</code> key in +options+ specifies
* to clear environment variables, other than specified by +env+.
*
* pid = spawn(command, :unsetenv_others=>true) # no environment variable
* pid = spawn({"FOO"=>"BAR"}, command, :unsetenv_others=>true) # FOO only
*
* The <code>:pgroup</code> key in +options+ specifies a process group.
* The corresponding value should be true, zero or positive integer.
* true and zero means the process should be a process leader of a new
* process group.
* Other values specifies a process group to be belongs.
*
* pid = spawn(command, :pgroup=>true) # process leader
* pid = spawn(command, :pgroup=>10) # belongs to the process group 10
*
* The <code>:rlimit_</code><em>foo</em> key specifies a resource limit.
* <em>foo</em> should be one of resource types such as <code>core</code>.
* The corresponding value should be an integer or an array which have one or
* two integers: same as cur_limit and max_limit arguments for
* Process.setrlimit.
*
* cur, max = Process.getrlimit(:CORE)
* pid = spawn(command, :rlimit_core=>[0,max]) # disable core temporary.
* pid = spawn(command, :rlimit_core=>max) # enable core dump
* pid = spawn(command, :rlimit_core=>0) # never dump core.
*
* The <code>:chdir</code> key in +options+ specifies the current directory.
*
* pid = spawn(command, :chdir=>"/var/tmp")
*
* The <code>:umask</code> key in +options+ specifies the umask.
*
* pid = spawn(command, :umask=>077)
*
* The :in, :out, :err, a fixnum, an IO and an array key specifies a redirection.
* The redirection maps a file descriptor in the child process.
*
* For example, stderr can be merged into stdout as follows:
*
* pid = spawn(command, :err=>:out)
* pid = spawn(command, 2=>1)
* pid = spawn(command, STDERR=>:out)
* pid = spawn(command, STDERR=>STDOUT)
*
* The hash keys specifies a file descriptor
* in the child process started by <code>spawn</code>.
* :err, 2 and STDERR specifies the standard error stream (stderr).
*
* The hash values specifies a file descriptor
* in the parent process which invokes <code>spawn</code>.
* :out, 1 and STDOUT specifies the standard output stream (stdout).
*
* In the above example,
* the standard output in the child process is not specified.
* So it is inherited from the parent process.
*
* The standard input stream (stdin) can be specified by :in, 0 and STDIN.
*
* A filename can be specified as a hash value.
*
* pid = spawn(command, :in=>"/dev/null") # read mode
* pid = spawn(command, :out=>"/dev/null") # write mode
* pid = spawn(command, :err=>"log") # write mode
* pid = spawn(command, 3=>"/dev/null") # read mode
*
* For stdout and stderr,
* it is opened in write mode.
* Otherwise read mode is used.
*
* For specifying flags and permission of file creation explicitly,
* an array is used instead.
*
* pid = spawn(command, :in=>["file"]) # read mode is assumed
* pid = spawn(command, :in=>["file", "r"])
* pid = spawn(command, :out=>["log", "w"]) # 0644 assumed
* pid = spawn(command, :out=>["log", "w", 0600])
* pid = spawn(command, :out=>["log", File::WRONLY|File::EXCL|File::CREAT, 0600])
*
* The array specifies a filename, flags and permission.
* The flags can be a string or an integer.
* If the flags is omitted or nil, File::RDONLY is assumed.
* The permission should be an integer.
* If the permission is omitted or nil, 0644 is assumed.
*
* If an array of IOs and integers are specified as a hash key,
* all the elements are redirected.
*
* # stdout and stderr is redirected to log file.
* # The file "log" is opened just once.
* pid = spawn(command, [:out, :err]=>["log", "w"])
*
* Another way to merge multiple file descriptors is [:child, fd].
* \[:child, fd] means the file descriptor in the child process.
* This is different from fd.
* For example, :err=>:out means redirecting child stderr to parent stdout.
* But :err=>[:child, :out] means redirecting child stderr to child stdout.
* They differs if stdout is redirected in the child process as follows.
*
* # stdout and stderr is redirected to log file.
* # The file "log" is opened just once.
* pid = spawn(command, :out=>["log", "w"], :err=>[:child, :out])
*
* \[:child, :out] can be used to merge stderr into stdout in IO.popen.
* In this case, IO.popen redirects stdout to a pipe in the child process
* and [:child, :out] refers the redirected stdout.
*
* io = IO.popen(["sh", "-c", "echo out; echo err >&2", :err=>[:child, :out]])
* p io.read #=> "out\nerr\n"
*
* spawn closes all non-standard unspecified descriptors by default.
* The "standard" descriptors are 0, 1 and 2.
* This behavior is specified by :close_others option.
* :close_others doesn't affect the standard descriptors which are
* closed only if :close is specified explicitly.
*
* pid = spawn(command, :close_others=>true) # close 3,4,5,... (default)
* pid = spawn(command, :close_others=>false) # don't close 3,4,5,...
*
* :close_others is true by default for spawn and IO.popen.
*
* So IO.pipe and spawn can be used as IO.popen.
*
* # similar to r = IO.popen(command)
* r, w = IO.pipe
* pid = spawn(command, :out=>w) # r, w is closed in the child process.
* w.close
*
* :close is specified as a hash value to close a fd individually.
*
* f = open(foo)
* system(command, f=>:close) # don't inherit f.
*
* If a file descriptor need to be inherited,
* io=>io can be used.
*
* # valgrind has --log-fd option for log destination.
* # log_w=>log_w indicates log_w.fileno inherits to child process.
* log_r, log_w = IO.pipe
* pid = spawn("valgrind", "--log-fd=#{log_w.fileno}", "echo", "a", log_w=>log_w)
* log_w.close
* p log_r.read
*
* It is also possible to exchange file descriptors.
*
* pid = spawn(command, :out=>:err, :err=>:out)
*
* The hash keys specify file descriptors in the child process.
* The hash values specifies file descriptors in the parent process.
* So the above specifies exchanging stdout and stderr.
* Internally, +spawn+ uses an extra file descriptor to resolve such cyclic
* file descriptor mapping.
*
* See <code>Kernel.exec</code> for the standard shell.
*/
view raw gistfile1.txt hosted with ❤ by GitHub

No comments:

Post a Comment

Followers