Blogged by Ujihisa. Standard methods of programming and thoughts including Clojure, Vim, LLVM, Haskell, Ruby and Mathematics written by a Japanese programmer. github/ujihisa

Sunday, March 28, 2010

How to Get Each Character

STDIN.getc returns a buffered character from STDIN. How do I get a character from STDIN immediately without buffering?

This code looks fine, but it doesn't work as you might expect. The input characters are still buffered.

STDIN.sync = true
p STDIN.getc

Curses.getch

require 'curses'
p Curses.getch

This works, but there is an undesired side-effect. Curses.getch clears the screen.

stty raw

orig_stty = `stty -g`
system 'stty raw echo'
p STDIN.getc
system 'stty', orig_stty

This works, but some platforms don't support the stty command.
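Here is one more sketch, under the assumption that your Ruby ships the io/console standard library (it is newer than this post). It provides IO#getch, which reads one character in raw mode and restores the terminal afterwards. The helper name read_char_immediately is made up for illustration, and it falls back to the buffered getc when the input is not a terminal.

```ruby
require 'io/console'
require 'stringio'

# Hypothetical helper: read one character without waiting for Enter
# when the input is a terminal, otherwise fall back to buffered getc.
def read_char_immediately(io = STDIN)
  if io.respond_to?(:getch) && io.tty?
    io.getch   # raw, unbuffered read provided by io/console
  else
    io.getc    # pipes, files and StringIO are read normally
  end
end

# Usage with a non-terminal input; on a real terminal you would call
# read_char_immediately with no argument.
p read_char_immediately(StringIO.new('abc'))
```

Like the stty approach, this depends on platform support, so treat it as an option rather than a universal answer.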

Conclusion

Getting a character immediately looks easy, but it actually isn't easy.

Sunday, March 21, 2010

Spawn On JRuby

As earlier posts on this blog have described, ruby 1.9's Kernel.spawn is very useful, and I have partly ported it to ruby 1.8. How about JRuby?

Currently there are two ways of using spawn on JRuby, but both of them have problems.

fork + exec + spawn-for-legacy library

I made a wrapper library spawn-for-legacy which provides ruby 1.9 style spawn, using fork and exec.

If JRuby has fork and exec, you can use spawn-for-legacy. Unfortunately, to use fork on JRuby you have to pass a command-line option to the JRuby interpreter.

If you kill pid0, will the internal fork be automatically killed?

pid0 = fork {
  pid1 = fork {
    sleep 2
    puts 'bah!'
  }
  sleep 1
}
Process.kill 'HUP', pid0

Try the previous code. It doesn't print 'bah!', so it seems that the inner fork is killed automatically. But that's wrong. The code doesn't print 'bah!' only because the inner fork hadn't started yet when Process.kill was called.

Try waiting until both blocks have certainly been forked.

pid0 = fork {
  pid1 = fork {
    sleep 2
    puts 'bah!'
  }
  sleep 1
}
sleep 0.1
Process.kill 'HUP', pid0

This code prints 'bah!' after the main routine has finished!

So, let's think about how to kill the internal process.

  • The value of pid1 is not available in the outer main routine
  • The external command ps is not available on Windows
  • Usually the process IDs pid0 and pid1 are sequential, but relying on that is dangerous

Here is a solution:

pid0 = fork {
  pid1 = fork {
    sleep 2
    puts 'bah!'
  }
  at_exit { Process.kill 'HUP', pid1 }
  sleep 1
}
sleep 0.1
Process.kill 'HUP', pid0

I added an at_exit line. This is a little cumbersome, but safe and easy.

With Spawn

The previous discussion was actually just preparation. Windows doesn't have fork. We can use spawn instead of fork in most cases.

This succeeded!

One More Thing

When you have to start a Ruby process with spawn, for example when rewriting fork as spawn, how do you write it? Simply rewriting

fork { p Dir.pwd }

to

spawn 'ruby', '-e', 'p Dir.pwd'

is not correct. The ruby that is running the code is not always 'ruby'. It can be ruby19, jruby, rbx or another ruby. In that case, use the following approach.

require 'rbconfig'
spawn RbConfig::CONFIG['ruby_install_name'] + RbConfig::CONFIG['EXEEXT'], '-e', 'p Dir.pwd'
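The line above still relies on the interpreter being found through PATH. A minimal sketch of a stricter variant, assuming RbConfig::CONFIG['bindir'] points at the directory of the running interpreter, builds the full path instead. The helper names current_ruby and spawn_ruby are made up for illustration.

```ruby
require 'rbconfig'

# Hypothetical helper: full path of the interpreter running this script.
def current_ruby
  File.join(
    RbConfig::CONFIG['bindir'],
    RbConfig::CONFIG['ruby_install_name'] + RbConfig::CONFIG['EXEEXT']
  )
end

# Hypothetical helper: run a snippet of Ruby code in a new process of the
# same interpreter and return its pid. options is passed on to spawn.
def spawn_ruby(code, options = {})
  spawn(current_ruby, '-e', code, options)
end

pid = spawn_ruby('p Dir.pwd')
Process.wait(pid)
```

Newer rubies also offer RbConfig.ruby, which returns the same kind of path.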

Compatible Array#flatten

Array#flatten with depth argument is useful.

[[1], [2], [[3], 4]].flatten(1)
#=> [1, 2, [3], 4]

It has been available since ruby 1.8.7, but older rubies such as ruby 1.8.6 don't support it.

Here is a pure-ruby implementation of Array#flatten with a depth argument. Use it if you need it.

#!/usr/bin/ruby
#!/opt/local/bin/ruby for 1.8.7
#!ruby191
#!ruby192
if RUBY_VERSION <= '1.8.6'
  class Array
    alias orig_flatten flatten
    def flatten(depth = -1)
      if depth < 0
        orig_flatten
      elsif depth == 0
        self
      else
        inject([]) {|m, i|
          Array === i ? m + i : m << i
        }.flatten(depth - 1)
      end
    end
  end
end
p RUBY_VERSION
p [1, 2, 3].flatten == [1, 2, 3]
p [1, 2, [3]].flatten == [1, 2, 3]
p [[1], [2], [3, 4]].flatten == [1, 2, 3, 4]
p [[1], [2], [[3], 4]].flatten == [1, 2, 3, 4]
p [1, 2, 3].flatten(1) == [1, 2, 3]
p [1, 2, [3]].flatten(1) == [1, 2, 3]
p [[1], [2], [3, 4]].flatten(1) == [1, 2, 3, 4]
p [[1], [2], [[3], 4]].flatten(1) == [1, 2, [3], 4]
p [[1, 2], [[[3], 4]]].flatten(1) == [1, 2, [[3], 4]]
p [[1], [2], [[3], 4]].flatten(2) == [1, 2, 3, 4]
p [[1, 2], [[[3], 4]]].flatten(2) == [1, 2, [3], 4]
p [[1, 2], [[[3], 4]]].flatten(3) == [1, 2, 3, 4]
p [[1, 2], [[[3], 4]]].flatten(-1) == [1, 2, 3, 4]
p [[1, 2], [[[3], 4]]].flatten(0) == [[1, 2], [[[3], 4]]]

I used it for my library spawn-for-legacy to support ruby 1.8.6.

Rails programmers don't need this code. Rails now only supports ruby 1.8.7 or newer.

Sunday, March 7, 2010

How To Run An External Command Asynchronously

This issue looks easy, but is actually complex.

In a shell, you can run a command asynchronously just by adding & at the end of the command.

$ sleep 10 &
[1] 8048

You can obtain the process ID, so you can stop the command.

$ kill 8048

How can we run an external command asynchronously from Ruby?

system()

The easiest way to run an external command is system().

system 'sleep 10 &'

system() returns true if the command exits with status 0, and false or nil otherwise; it returns neither the exit code itself nor the process ID. In this case system() returns true as soon as the shell has put sleep into the background. To obtain a process ID, you can use $?, which holds a Process::Status object.

system 'sleep 10 &'
p $?.pid #=> 8050

But unfortunately that is not the process ID of sleep; it is the process ID of the shell, sh -c 'sleep 10 &'. There is a cushion in between, so you cannot obtain the process ID of sleep directly.

One more problem: the & notation is only available on UNIX-based platforms.
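To make the roles of the return value and $? concrete, here is a small sketch. The helper name run_and_report is made up for illustration, and RbConfig.ruby is used only so that the example doesn't depend on a particular external command.

```ruby
require 'rbconfig'

# Hypothetical helper: run a command synchronously and collect what
# system() and $? tell us about it.
def run_and_report(*command)
  succeeded = system(*command)   # true, false or nil; not the exit code
  status = $?                    # Process::Status of the last child
  { :succeeded => succeeded, :pid => status.pid, :exitstatus => status.exitstatus }
end

p run_and_report(RbConfig.ruby, '-e', 'exit 3')
```

Because the arguments are passed separately, no shell is involved here, so the pid is that of the command itself.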

system() without shell

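system 'sleep 10' goes through the shell, but system 'sleep', '10' doesn't; to bypass the shell, pass the arguments separately. But then system 'sleep', '10', '&' does not work, because & is shell syntax and would be passed to sleep as a plain argument.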

Thread

To emulate the '&' feature, how about using Ruby's asynchronous features? Thread is the easiest way to do that.

t = Thread.start do
  system 'sleep', '10'
end
t.kill

Looks good, but it doesn't work. The thread certainly stops, but the external command started by the thread stays alive.

exec()

How about using another way of starting an external command? There is exec(), which roughly means the combination of system() and exit().

exec 'ls'

roughly means

system 'ls'
exit

So you might think that the following code will work well.

t = Thread.start do
  exec 'sleep', '10'
end
t.kill

Unfortunately it doesn't work. You cannot use exec() in a child thread.

fork()

Let's look for another way to get asynchronous behavior in Ruby: fork(). fork() copies the Ruby process itself.

pid = fork do
  exec 'sleep', '10'
end
Process.kill 'TERM', pid

It works! fork() is awesome!

But there is bad news. Thread works on all platforms, but fork works on only some platforms. I'm sorry, Windows users and NetBSD users.

spawn()

spawn = fork + exec. spawn() works on all platforms! It's perfect! But... ruby 1.8 doesn't have spawn()...

Summary

  • Ruby 1.9 + UNIX: Use fork and exec, or use spawn.
  • Ruby 1.8 + UNIX: Use fork and exec.
  • Ruby 1.9 + Windows: Use spawn.
  • Ruby 1.8 + Windows: Use another OS or another Ruby, or consider using this library
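The summary can be sketched as a small dispatcher that checks at run time what the current Ruby offers. The helper name start_command is made up for illustration, and the sketch assumes a Ruby where Process.respond_to?(:fork) is false on platforms without fork.

```ruby
require 'rbconfig'

# Hypothetical helper: start a command asynchronously with whatever
# mechanism is available, and return the pid of the new process.
def start_command(*command)
  if Process.respond_to?(:spawn)
    Process.spawn(*command)        # ruby 1.9 and later
  elsif Process.respond_to?(:fork)
    fork { exec(*command) }        # ruby 1.8 on UNIX
  else
    raise NotImplementedError, 'neither spawn nor fork is available'
  end
end

pid = start_command(RbConfig.ruby, '-e', 'exit 0')
Process.wait(pid)
```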

Saturday, March 6, 2010

How To Avoid The Worst Case

I broke an expensive unopened wine bottle. The white wine was spilt all over my kitchen.

Yesterday I bought a bottle of wine and a box of beer at a liquor store far from my home. I bought them because they were on sale. To save money, I walked home, saving $1.75. Today I ate lunch and drank a sip of beer. After lunch, I reached for a snack. The wine bottle was in front of the snack. My elbow hit the bottle and broke it. The delicious, expensive wine is gone. $14 and the hard work are gone.

While I was wiping the floor and smelling the wine, I thought about what the cause was and what I should do from now on. Nobody else was involved in this incident. I'm the only culprit. So, how can I avoid such a tragedy in the future?

There are two causes: I was a little drunk, and the wine was in a slightly unsafe place. Neither is critical on its own. But the incident certainly happened.

The lesson can be summarized like this: assume that people will sometimes be in bad condition. People are often impaired by drowsiness, alcohol, anger or depression. Plan for that condition. Never put glass items on an edge, or in a place people can easily touch.

Friday, March 5, 2010

All About Spawn

Spawn = Fork + Exec, but works also on platforms which don't support fork.

The usage of spawn is often said to be cryptic. Here I'll describe common cases.

  • system('ls')

    pid = spawn(['ls', 'ls'])
    Process.wait(pid)
    
  • system('ls .')

    pid = spawn('ls', '.')
    Process.wait(pid)
    
  • system('ls . &')

    pid = spawn('ls', '.')
    
  • system('ls . > a.txt')

    pid = spawn('ls', '.', :out => 'a.txt')
    Process.wait(pid)
    
  • system('ls . >> a.txt')

    pid = spawn('ls', '.', :out => ['a.txt', 'a'])
    Process.wait(pid)
    
  • system('ls . >& a.txt')

    pid = spawn('ls', '.', [:out, :err] => ['a.txt', 'w'])
    Process.wait(pid)
    
  • IO.popen('cat a.txt') {|io| p io.read }

    i, o = IO.pipe
    spawn('cat a.txt', :out => o)
    o.close
    p i.read
    
  • system('make all &')

    spawn('make', 'all')
    
  • Dir.chdir('a') { system 'make all &' }

    spawn('make', 'all', :chdir => 'a')
    
  • Passing ENV:

    • Shell: $ AAA=1 make all &
    • Ruby: ENV['AAA'] = '1'; system('make all &')
    • Ruby with Spawn: spawn({'AAA' => '1'}, 'make', 'all')
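Several of the cases above can be combined. The following is a minimal sketch that passes an environment hash, redirects stdout into a pipe and waits for the child. The helper name capture_output is made up for illustration; everything else is plain spawn, IO.pipe and Process.wait.

```ruby
require 'rbconfig'

# Hypothetical helper: run a command without a shell, with extra
# environment variables, and return what it wrote to stdout.
def capture_output(env, *command)
  reader, writer = IO.pipe
  pid = spawn(env, *command, :out => writer)
  writer.close            # the parent must close its write end to see EOF
  output = reader.read
  reader.close
  Process.wait(pid)       # reap the child to avoid a zombie
  output
end

p capture_output({'AAA' => '1'}, RbConfig.ruby, '-e', 'print ENV["AAA"]')
```

Unlike assigning to ENV in the parent, the env hash only affects the child process.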

Further information is available in process.c in ruby trunk. Here is the documentation from that file:

/*
* call-seq:
* spawn([env,] command... [,options]) => pid
* Process.spawn([env,] command... [,options]) => pid
*
* spawn executes specified command and return its pid.
*
* This method doesn't wait for end of the command.
* The parent process should
* use <code>Process.wait</code> to collect
* the termination status of its child or
* use <code>Process.detach</code> to register
* disinterest in their status;
* otherwise, the operating system may accumulate zombie processes.
*
* spawn has bunch of options to specify process attributes:
*
* env: hash
* name => val : set the environment variable
* name => nil : unset the environment variable
* command...:
* commandline : command line string which is passed to the standard shell
* cmdname, arg1, ... : command name and one or more arguments (no shell)
* [cmdname, argv0], arg1, ... : command name, argv[0] and zero or more arguments (no shell)
* options: hash
* clearing environment variables:
* :unsetenv_others => true : clear environment variables except specified by env
* :unsetenv_others => false : don't clear (default)
* process group:
* :pgroup => true or 0 : make a new process group
* :pgroup => pgid : join to specified process group
* :pgroup => nil : don't change the process group (default)
* resource limit: resourcename is core, cpu, data, etc. See Process.setrlimit.
* :rlimit_resourcename => limit
* :rlimit_resourcename => [cur_limit, max_limit]
* current directory:
* :chdir => str
* umask:
* :umask => int
* redirection:
* key:
* FD : single file descriptor in child process
* [FD, FD, ...] : multiple file descriptor in child process
* value:
* FD : redirect to the file descriptor in parent process
* string : redirect to file with open(string, "r" or "w")
* [string] : redirect to file with open(string, File::RDONLY)
* [string, open_mode] : redirect to file with open(string, open_mode, 0644)
* [string, open_mode, perm] : redirect to file with open(string, open_mode, perm)
* [:child, FD] : redirect to the redirected file descriptor
* :close : close the file descriptor in child process
* FD is one of follows
* :in : the file descriptor 0 which is the standard input
* :out : the file descriptor 1 which is the standard output
* :err : the file descriptor 2 which is the standard error
* integer : the file descriptor of specified the integer
* io : the file descriptor specified as io.fileno
* file descriptor inheritance: close non-redirected non-standard fds (3, 4, 5, ...) or not
* :close_others => false : inherit fds (default for system and exec)
* :close_others => true : don't inherit (default for spawn and IO.popen)
*
* If a hash is given as +env+, the environment is
* updated by +env+ before <code>exec(2)</code> in the child process.
* If a pair in +env+ has nil as the value, the variable is deleted.
*
* # set FOO as BAR and unset BAZ.
* pid = spawn({"FOO"=>"BAR", "BAZ"=>nil}, command)
*
* If a hash is given as +options+,
* it specifies
* process group,
* resource limit,
* current directory,
* umask and
* redirects for the child process.
* Also, it can be specified to clear environment variables.
*
* The <code>:unsetenv_others</code> key in +options+ specifies
* to clear environment variables, other than specified by +env+.
*
* pid = spawn(command, :unsetenv_others=>true) # no environment variable
* pid = spawn({"FOO"=>"BAR"}, command, :unsetenv_others=>true) # FOO only
*
* The <code>:pgroup</code> key in +options+ specifies a process group.
* The corresponding value should be true, zero or positive integer.
* true and zero means the process should be a process leader of a new
* process group.
* Other values specifies a process group to be belongs.
*
* pid = spawn(command, :pgroup=>true) # process leader
* pid = spawn(command, :pgroup=>10) # belongs to the process group 10
*
* The <code>:rlimit_</code><em>foo</em> key specifies a resource limit.
* <em>foo</em> should be one of resource types such as <code>core</code>.
* The corresponding value should be an integer or an array which have one or
* two integers: same as cur_limit and max_limit arguments for
* Process.setrlimit.
*
* cur, max = Process.getrlimit(:CORE)
* pid = spawn(command, :rlimit_core=>[0,max]) # disable core temporary.
* pid = spawn(command, :rlimit_core=>max) # enable core dump
* pid = spawn(command, :rlimit_core=>0) # never dump core.
*
* The <code>:chdir</code> key in +options+ specifies the current directory.
*
* pid = spawn(command, :chdir=>"/var/tmp")
*
* The <code>:umask</code> key in +options+ specifies the umask.
*
* pid = spawn(command, :umask=>077)
*
* The :in, :out, :err, a fixnum, an IO and an array key specifies a redirection.
* The redirection maps a file descriptor in the child process.
*
* For example, stderr can be merged into stdout as follows:
*
* pid = spawn(command, :err=>:out)
* pid = spawn(command, 2=>1)
* pid = spawn(command, STDERR=>:out)
* pid = spawn(command, STDERR=>STDOUT)
*
* The hash keys specifies a file descriptor
* in the child process started by <code>spawn</code>.
* :err, 2 and STDERR specifies the standard error stream (stderr).
*
* The hash values specifies a file descriptor
* in the parent process which invokes <code>spawn</code>.
* :out, 1 and STDOUT specifies the standard output stream (stdout).
*
* In the above example,
* the standard output in the child process is not specified.
* So it is inherited from the parent process.
*
* The standard input stream (stdin) can be specified by :in, 0 and STDIN.
*
* A filename can be specified as a hash value.
*
* pid = spawn(command, :in=>"/dev/null") # read mode
* pid = spawn(command, :out=>"/dev/null") # write mode
* pid = spawn(command, :err=>"log") # write mode
* pid = spawn(command, 3=>"/dev/null") # read mode
*
* For stdout and stderr,
* it is opened in write mode.
* Otherwise read mode is used.
*
* For specifying flags and permission of file creation explicitly,
* an array is used instead.
*
* pid = spawn(command, :in=>["file"]) # read mode is assumed
* pid = spawn(command, :in=>["file", "r"])
* pid = spawn(command, :out=>["log", "w"]) # 0644 assumed
* pid = spawn(command, :out=>["log", "w", 0600])
* pid = spawn(command, :out=>["log", File::WRONLY|File::EXCL|File::CREAT, 0600])
*
* The array specifies a filename, flags and permission.
* The flags can be a string or an integer.
* If the flags is omitted or nil, File::RDONLY is assumed.
* The permission should be an integer.
* If the permission is omitted or nil, 0644 is assumed.
*
* If an array of IOs and integers are specified as a hash key,
* all the elements are redirected.
*
* # stdout and stderr is redirected to log file.
* # The file "log" is opened just once.
* pid = spawn(command, [:out, :err]=>["log", "w"])
*
* Another way to merge multiple file descriptors is [:child, fd].
* \[:child, fd] means the file descriptor in the child process.
* This is different from fd.
* For example, :err=>:out means redirecting child stderr to parent stdout.
* But :err=>[:child, :out] means redirecting child stderr to child stdout.
* They differs if stdout is redirected in the child process as follows.
*
* # stdout and stderr is redirected to log file.
* # The file "log" is opened just once.
* pid = spawn(command, :out=>["log", "w"], :err=>[:child, :out])
*
* \[:child, :out] can be used to merge stderr into stdout in IO.popen.
* In this case, IO.popen redirects stdout to a pipe in the child process
* and [:child, :out] refers the redirected stdout.
*
* io = IO.popen(["sh", "-c", "echo out; echo err >&2", :err=>[:child, :out]])
* p io.read #=> "out\nerr\n"
*
* spawn closes all non-standard unspecified descriptors by default.
* The "standard" descriptors are 0, 1 and 2.
* This behavior is specified by :close_others option.
* :close_others doesn't affect the standard descriptors which are
* closed only if :close is specified explicitly.
*
* pid = spawn(command, :close_others=>true) # close 3,4,5,... (default)
* pid = spawn(command, :close_others=>false) # don't close 3,4,5,...
*
* :close_others is true by default for spawn and IO.popen.
*
* So IO.pipe and spawn can be used as IO.popen.
*
* # similar to r = IO.popen(command)
* r, w = IO.pipe
* pid = spawn(command, :out=>w) # r, w is closed in the child process.
* w.close
*
* :close is specified as a hash value to close a fd individually.
*
* f = open(foo)
* system(command, f=>:close) # don't inherit f.
*
* If a file descriptor need to be inherited,
* io=>io can be used.
*
* # valgrind has --log-fd option for log destination.
* # log_w=>log_w indicates log_w.fileno inherits to child process.
* log_r, log_w = IO.pipe
* pid = spawn("valgrind", "--log-fd=#{log_w.fileno}", "echo", "a", log_w=>log_w)
* log_w.close
* p log_r.read
*
* It is also possible to exchange file descriptors.
*
* pid = spawn(command, :out=>:err, :err=>:out)
*
* The hash keys specify file descriptors in the child process.
* The hash values specifies file descriptors in the parent process.
* So the above specifies exchanging stdout and stderr.
* Internally, +spawn+ uses an extra file descriptor to resolve such cyclic
* file descriptor mapping.
*
* See <code>Kernel.exec</code> for the standard shell.
*/
