Standards

Monday, April 2, 2012

Concurrent PI-Calculator in Clojure

Inspired by a tutorial about akka tutorial in Scala about making a concurrent PI calculator, I made similar on in Clojure without actor model but just with thread and atom.

(ns pi.core
  (:import [java.util.concurrent Executors])
  (:gen-class))

(def sum (partial reduce +))

(defn f [n]
  (/ (if (even? n) 1 -1) (inc (* 2 n))))

(defn solver1 [n]
  (double (* 4 (sum (map f (range 0 n))))))

(defn solver2 [n]
  (def a (atom 0))
  (def tasks
    (let [unit (/ n 4)]
      (for [m (range 0 4)]
        (fn []
          (swap! a (partial + (sum (map f (range (* unit m) (* unit (inc m)))))))))))

  (let [pool (Executors/newFixedThreadPool (count tasks))]
    (doseq [f (.invokeAll pool tasks)]
      (.get f))
    (.shutdown pool))
  (double (* 4 @a)))

(defn -main []
  (let [n 2000]
    (prn (solver1 n))
    (prn (solver2 n))))

Commenting out the line of solver1/solver2, I got the following results on my 2-core with hyper-threading machine; virtually it's 4 core machine.

solver1

real 3.80
user 4.66
sys 0.09
3.141092653621043

solver2

real 2.01
user 3.44
sys 0.05
3.141092653621043

Comparing to the sequential version, the concurrent version was almost 2 times faster.

Sunday, February 5, 2012

IBus-Mozc for Kana Input on US Keyboard Layout

There are two major input methods in Japanese language; roma-input and kana-input (*1). Roma-input layouts in popular implementations are same while kana-input layouts are different among implementations for some reason when it's on US Keyboard layout.

I'm a kana typer and a user of US Keyboard Layout. Kana layout of Mac OS X Kotoeri layout and kana layout of ibus-mozc are very different. I had used Kotoeri for a long time, so I decided to change the layout of mozc, but mozc doesn't have an interface to change the layout.

I wrote a patch to make mozc same to Kotoeri's.

kana-layout.patch:

Index: unix/ibus/key_translator.cc
===================================================================
--- unix/ibus/key_translator.cc (revision 94)
+++ unix/ibus/key_translator.cc (working copy)
@@ -254,100 +254,100 @@
   // to Right Shift).
   { '\\', "", "" },
 }, kana_map_us[] = {
-  { '`' , "\xe3\x82\x8d", "\xe3\x82\x8d" },  // "ろ", "ろ"
-  { '~' , "\xe3\x82\x8d", "\xe3\x82\x8d" },  // "ろ", "ろ"
-  { '1' , "\xe3\x81\xac", "\xe3\x81\xac" },  // "ぬ", "ぬ"
-  { '!' , "\xe3\x81\xac", "\xe3\x81\xac" },  // "ぬ", "ぬ"
-  { '2' , "\xe3\x81\xb5", "\xe3\x81\xb5" },  // "ふ", "ふ"
+  { '$' , "\xe3\x81\x85", "\xe3\x81\x85" },  // "ぅ", "ぅ"
+  { '(' , "\xe3\x82\x87", "\xe3\x82\x87" },  // "ょ", "ょ"
+  { ',' , "\xe3\x81\xad", "\xe3\x81\xad" },  // "ね", "ね"
+  { '0' , "\xe3\x82\x8f", "\xe3\x82\x8f" },  // "わ", "わ"
+  { '4' , "\xe3\x81\x86", "\xe3\x81\x86" },  // "う", "う"
+  { '8' , "\xe3\x82\x86", "\xe3\x82\x86" },  // "ゆ", "ゆ"
+  { '<' , "\xe3\x80\x81", "\xe3\x80\x81" },  // "、", "、"
   { '@' , "\xe3\x81\xb5", "\xe3\x81\xb5" },  // "ふ", "ふ"
-  { '3' , "\xe3\x81\x82", "\xe3\x81\x81" },  // "あ", "ぁ"
-  { '#' , "\xe3\x81\x82", "\xe3\x81\x81" },  // "あ", "ぁ"
-  { '4' , "\xe3\x81\x86", "\xe3\x81\x85" },  // "う", "ぅ"
-  { '$' , "\xe3\x81\x86", "\xe3\x81\x85" },  // "う", "ぅ"
-  { '5' , "\xe3\x81\x88", "\xe3\x81\x87" },  // "え", "ぇ"
-  { '%' , "\xe3\x81\x88", "\xe3\x81\x87" },  // "え", "ぇ"
-  { '6' , "\xe3\x81\x8a", "\xe3\x81\x89" },  // "お", "ぉ"
-  { '^' , "\xe3\x81\x8a", "\xe3\x81\x89" },  // "お", "ぉ"
-  { '7' , "\xe3\x82\x84", "\xe3\x82\x83" },  // "や", "ゃ"
-  { '&' , "\xe3\x82\x84", "\xe3\x82\x83" },  // "や", "ゃ"
-  { '8' , "\xe3\x82\x86", "\xe3\x82\x85" },  // "ゆ", "ゅ"
-  { '*' , "\xe3\x82\x86", "\xe3\x82\x85" },  // "ゆ", "ゅ"
-  { '9' , "\xe3\x82\x88", "\xe3\x82\x87" },  // "よ", "ょ"
-  { '(' , "\xe3\x82\x88", "\xe3\x82\x87" },  // "よ", "ょ"
-  { '0' , "\xe3\x82\x8f", "\xe3\x82\x92" },  // "わ", "を"
-  { ')' , "\xe3\x82\x8f", "\xe3\x82\x92" },  // "わ", "を"
-  { '-' , "\xe3\x81\xbb", "\xe3\x83\xbc" },  // "ほ", "ー"
-  { '_' , "\xe3\x81\xbb", "\xe3\x83\xbc" },  // "ほ", "ー"
-  { '=' , "\xe3\x81\xb8", "\xe3\x81\xb8" },  // "へ", "へ"
-  { '+' , "\xe3\x81\xb8", "\xe3\x81\xb8" },  // "へ", "へ"
-  { 'q' , "\xe3\x81\x9f", "\xe3\x81\x9f" },  // "た", "た"
-  { 'Q' , "\xe3\x81\x9f", "\xe3\x81\x9f" },  // "た", "た"
-  { 'w' , "\xe3\x81\xa6", "\xe3\x81\xa6" },  // "て", "て"
-  { 'W' , "\xe3\x81\xa6", "\xe3\x81\xa6" },  // "て", "て"
-  { 'e' , "\xe3\x81\x84", "\xe3\x81\x83" },  // "い", "ぃ"
-  { 'E' , "\xe3\x81\x84", "\xe3\x81\x83" },  // "い", "ぃ"
-  { 'r' , "\xe3\x81\x99", "\xe3\x81\x99" },  // "す", "す"
-  { 'R' , "\xe3\x81\x99", "\xe3\x81\x99" },  // "す", "す"
+  { 'D' , "\xe3\x81\x97", "\xe3\x81\x97" },  // "し", "し"
+  { 'H' , "\xe3\x81\x8f", "\xe3\x81\x8f" },  // "く", "く"
+  { 'L' , "\xe3\x82\x8a", "\xe3\x82\x8a" },  // "り", "り"
+  { 'P' , "\xe3\x81\x9b", "\xe3\x81\x9b" },  // "せ", "せ"
+  { 'T' , "\xe3\x81\x8b", "\xe3\x81\x8b" },  // "か", "か"
+  { 'X' , "\xe3\x81\x95", "\xe3\x81\x95" },  // "さ", "さ"
+  { '\\' , "\xe3\x81\xb8", "\xe3\x81\xb8" },  // "へ", "へ"
+  { '`' , "\xef\xbd\x80", "\xef\xbd\x80" },  // "｀", "｀"
+  { 'd' , "\xe3\x81\x97", "\xe3\x81\x97" },  // "し", "し"
+  { 'h' , "\xe3\x81\x8f", "\xe3\x81\x8f" },  // "く", "く"
+  { 'l' , "\xe3\x82\x8a", "\xe3\x82\x8a" },  // "り", "り"
+  { 'p' , "\xe3\x81\x9b", "\xe3\x81\x9b" },  // "せ", "せ"
   { 't' , "\xe3\x81\x8b", "\xe3\x81\x8b" },  // "か", "か"
-  { 'T' , "\xe3\x81\x8b", "\xe3\x81\x8b" },  // "か", "か"
-  { 'y' , "\xe3\x82\x93", "\xe3\x82\x93" },  // "ん", "ん"
-  { 'Y' , "\xe3\x82\x93", "\xe3\x82\x93" },  // "ん", "ん"
-  { 'u' , "\xe3\x81\xaa", "\xe3\x81\xaa" },  // "な", "な"
-  { 'U' , "\xe3\x81\xaa", "\xe3\x81\xaa" },  // "な", "な"
-  { 'i' , "\xe3\x81\xab", "\xe3\x81\xab" },  // "に", "に"
-  { 'I' , "\xe3\x81\xab", "\xe3\x81\xab" },  // "に", "に"
-  { 'o' , "\xe3\x82\x89", "\xe3\x82\x89" },  // "ら", "ら"
+  { 'x' , "\xe3\x81\x95", "\xe3\x81\x95" },  // "さ", "さ"
+  { '|' , "\xe3\x82\x8d", "\xe3\x82\x8d" },  // "ろ", "ろ"
+  { '#' , "\xe3\x81\x81", "\xe3\x81\x81" },  // "ぁ", "ぁ"
+  { '\'' , "\xe3\x81\x91", "\xe3\x81\x91" },  // "け", "け"
+  { '+' , "\xe3\x80\x8c", "\xe3\x80\x8c" },  // "「", "「"
+  { '/' , "\xe3\x82\x81", "\xe3\x82\x81" },  // "め", "め"
+  { '3' , "\xe3\x81\x82", "\xe3\x81\x82" },  // "あ", "あ"
+  { '7' , "\xe3\x82\x84", "\xe3\x82\x84" },  // "や", "や"
+  { ';' , "\xe3\x82\x8c", "\xe3\x82\x8c" },  // "れ", "れ"
+  { '?' , "\xe3\x83\xbb", "\xe3\x83\xbb" },  // "・", "・"
+  { 'C' , "\xe3\x81\x9d", "\xe3\x81\x9d" },  // "そ", "そ"
+  { 'G' , "\xe3\x81\x8d", "\xe3\x81\x8d" },  // "き", "き"
+  { 'K' , "\xe3\x81\xae", "\xe3\x81\xae" },  // "の", "の"
   { 'O' , "\xe3\x82\x89", "\xe3\x82\x89" },  // "ら", "ら"
-  { 'p' , "\xe3\x81\x9b", "\xe3\x81\x9b" },  // "せ", "せ"
-  { 'P' , "\xe3\x81\x9b", "\xe3\x81\x9b" },  // "せ", "せ"
+  { 'S' , "\xe3\x81\xa8", "\xe3\x81\xa8" },  // "と", "と"
+  { 'W' , "\xe3\x81\xa6", "\xe3\x81\xa6" },  // "て", "て"
   { '[' , "\xe3\x82\x9b", "\xe3\x82\x9b" },  // "゛", "゛"
-  { '{' , "\xe3\x82\x9b", "\xe3\x82\x9b" },  // "゛", "゛"
-  { ']' , "\xe3\x82\x9c", "\xe3\x80\x8c" },  // "゜", "「"
-  { '}' , "\xe3\x82\x9c", "\xe3\x80\x8c" },  // "゜", "「"
-  { '\\', "\xe3\x82\x80", "\xe3\x80\x8d" },  // "む", "」"
-  { '|' , "\xe3\x82\x80", "\xe3\x80\x8d" },  // "む", "」"
-  { 'a' , "\xe3\x81\xa1", "\xe3\x81\xa1" },  // "ち", "ち"
-  { 'A' , "\xe3\x81\xa1", "\xe3\x81\xa1" },  // "ち", "ち"
+  { '_' , "\xe3\x81\xbb", "\xe3\x81\xbb" },  // "ほ", "ほ"
+  { 'c' , "\xe3\x81\x9d", "\xe3\x81\x9d" },  // "そ", "そ"
+  { 'g' , "\xe3\x81\x8d", "\xe3\x81\x8d" },  // "き", "き"
+  { 'k' , "\xe3\x81\xae", "\xe3\x81\xae" },  // "の", "の"
+  { 'o' , "\xe3\x82\x89", "\xe3\x82\x89" },  // "ら", "ら"
   { 's' , "\xe3\x81\xa8", "\xe3\x81\xa8" },  // "と", "と"
-  { 'S' , "\xe3\x81\xa8", "\xe3\x81\xa8" },  // "と", "と"
-  { 'd' , "\xe3\x81\x97", "\xe3\x81\x97" },  // "し", "し"
-  { 'D' , "\xe3\x81\x97", "\xe3\x81\x97" },  // "し", "し"
-  { 'f' , "\xe3\x81\xaf", "\xe3\x81\xaf" },  // "は", "は"
+  { 'w' , "\xe3\x81\xa6", "\xe3\x81\xa6" },  // "て", "て"
+  { '{' , "\xe3\x80\x8d", "\xe3\x80\x8d" },  // "」", "」"
+  { '"' , "\xe3\x82\x8d", "\xe3\x82\x8d" },  // "ろ", "ろ"
+  { '&' , "\xe3\x82\x83", "\xe3\x82\x83" },  // "ゃ", "ゃ"
+  { '*' , "\xe3\x82\x85", "\xe3\x82\x85" },  // "ゅ", "ゅ"
+  { '.' , "\xe3\x82\x8b", "\xe3\x82\x8b" },  // "る", "る"
+  { '2' , "\xe3\x81\xb5", "\xe3\x81\xb5" },  // "ふ", "ふ"
+  { '6' , "\xe3\x81\x8a", "\xe3\x81\x8a" },  // "お", "お"
+  { ':' , "\xe3\x82\x8c", "\xe3\x82\x8c" },  // "れ", "れ"
+  { '>' , "\xe3\x80\x82", "\xe3\x80\x82" },  // "。", "。"
+  { 'B' , "\xe3\x81\x93", "\xe3\x81\x93" },  // "こ", "こ"
   { 'F' , "\xe3\x81\xaf", "\xe3\x81\xaf" },  // "は", "は"
-  { 'g' , "\xe3\x81\x8d", "\xe3\x81\x8d" },  // "き", "き"
-  { 'G' , "\xe3\x81\x8d", "\xe3\x81\x8d" },  // "き", "き"
-  { 'h' , "\xe3\x81\x8f", "\xe3\x81\x8f" },  // "く", "く"
-  { 'H' , "\xe3\x81\x8f", "\xe3\x81\x8f" },  // "く", "く"
-  { 'j' , "\xe3\x81\xbe", "\xe3\x81\xbe" },  // "ま", "ま"
   { 'J' , "\xe3\x81\xbe", "\xe3\x81\xbe" },  // "ま", "ま"
-  { 'k' , "\xe3\x81\xae", "\xe3\x81\xae" },  // "の", "の"
-  { 'K' , "\xe3\x81\xae", "\xe3\x81\xae" },  // "の", "の"
-  { 'l' , "\xe3\x82\x8a", "\xe3\x82\x8a" },  // "り", "り"
-  { 'L' , "\xe3\x82\x8a", "\xe3\x82\x8a" },  // "り", "り"
-  { ';' , "\xe3\x82\x8c", "\xe3\x82\x8c" },  // "れ", "れ"
-  { ':' , "\xe3\x82\x8c", "\xe3\x82\x8c" },  // "れ", "れ"
-  { '\'', "\xe3\x81\x91", "\xe3\x81\x91" },  // "け", "け"
-  { '\"', "\xe3\x81\x91", "\xe3\x81\x91" },  // "け", "け"
-  { 'z' , "\xe3\x81\xa4", "\xe3\x81\xa3" },  // "つ", "っ"
-  { 'Z' , "\xe3\x81\xa4", "\xe3\x81\xa3" },  // "つ", "っ"
-  { 'x' , "\xe3\x81\x95", "\xe3\x81\x95" },  // "さ", "さ"
-  { 'X' , "\xe3\x81\x95", "\xe3\x81\x95" },  // "さ", "さ"
-  { 'c' , "\xe3\x81\x9d", "\xe3\x81\x9d" },  // "そ", "そ"
-  { 'C' , "\xe3\x81\x9d", "\xe3\x81\x9d" },  // "そ", "そ"
-  { 'v' , "\xe3\x81\xb2", "\xe3\x81\xb2" },  // "ひ", "ひ"
+  { 'N' , "\xe3\x81\xbf", "\xe3\x81\xbf" },  // "み", "み"
+  { 'R' , "\xe3\x81\x99", "\xe3\x81\x99" },  // "す", "す"
   { 'V' , "\xe3\x81\xb2", "\xe3\x81\xb2" },  // "ひ", "ひ"
+  { 'Z' , "\xe3\x81\xa3", "\xe3\x81\xa3" },  // "っ", "っ"
+  { '^' , "\xe3\x81\x89", "\xe3\x81\x89" },  // "ぉ", "ぉ"
   { 'b' , "\xe3\x81\x93", "\xe3\x81\x93" },  // "こ", "こ"
-  { 'B' , "\xe3\x81\x93", "\xe3\x81\x93" },  // "こ", "こ"
+  { 'f' , "\xe3\x81\xaf", "\xe3\x81\xaf" },  // "は", "は"
+  { 'j' , "\xe3\x81\xbe", "\xe3\x81\xbe" },  // "ま", "ま"
   { 'n' , "\xe3\x81\xbf", "\xe3\x81\xbf" },  // "み", "み"
-  { 'N' , "\xe3\x81\xbf", "\xe3\x81\xbf" },  // "み", "み"
+  { 'r' , "\xe3\x81\x99", "\xe3\x81\x99" },  // "す", "す"
+  { 'v' , "\xe3\x81\xb2", "\xe3\x81\xb2" },  // "ひ", "ひ"
+  { 'z' , "\xe3\x81\xa4", "\xe3\x81\xa4" },  // "つ", "つ"
+  { '~' , "\xe3\x80\x9c", "\xe3\x80\x9c" },  // "〜", "〜"
+  { '!' , "\xe3\x81\xac", "\xe3\x81\xac" },  // "ぬ", "ぬ"
+  { '%' , "\xe3\x81\x87", "\xe3\x81\x87" },  // "ぇ", "ぇ"
+  { ')' , "\xe3\x82\x92", "\xe3\x82\x92" },  // "を", "を"
+  { '-' , "\xe3\x81\xbb", "\xe3\x81\xbb" },  // "ほ", "ほ"
+  { '1' , "\xe3\x81\xac", "\xe3\x81\xac" },  // "ぬ", "ぬ"
+  { '5' , "\xe3\x81\x88", "\xe3\x81\x88" },  // "え", "え"
+  { '9' , "\xe3\x82\x88", "\xe3\x82\x88" },  // "よ", "よ"
+  { '=' , "\xe3\x82\x9c", "\xe3\x82\x9c" },  // "゜", "゜"
+  { 'A' , "\xe3\x81\xa1", "\xe3\x81\xa1" },  // "ち", "ち"
+  { 'E' , "\xe3\x81\x83", "\xe3\x81\x83" },  // "ぃ", "ぃ"
+  { 'I' , "\xe3\x81\xab", "\xe3\x81\xab" },  // "に", "に"
+  { 'M' , "\xe3\x82\x82", "\xe3\x82\x82" },  // "も", "も"
+  { 'Q' , "\xe3\x81\x9f", "\xe3\x81\x9f" },  // "た", "た"
+  { 'U' , "\xe3\x81\xaa", "\xe3\x81\xaa" },  // "な", "な"
+  { 'Y' , "\xe3\x82\x93", "\xe3\x82\x93" },  // "ん", "ん"
+  { ']' , "\xe3\x82\x80", "\xe3\x82\x80" },  // "む", "む"
+  { 'a' , "\xe3\x81\xa1", "\xe3\x81\xa1" },  // "ち", "ち"
+  { 'e' , "\xe3\x81\x84", "\xe3\x81\x84" },  // "い", "い"
+  { 'i' , "\xe3\x81\xab", "\xe3\x81\xab" },  // "に", "に"
   { 'm' , "\xe3\x82\x82", "\xe3\x82\x82" },  // "も", "も"
-  { 'M' , "\xe3\x82\x82", "\xe3\x82\x82" },  // "も", "も"
-  { ',' , "\xe3\x81\xad", "\xe3\x80\x81" },  // "ね", "、"
-  { '<' , "\xe3\x81\xad", "\xe3\x80\x81" },  // "ね", "、"
-  { '.' , "\xe3\x82\x8b", "\xe3\x80\x82" },  // "る", "。"
-  { '>' , "\xe3\x82\x8b", "\xe3\x80\x82" },  // "る", "。"
-  { '/' , "\xe3\x82\x81", "\xe3\x83\xbb" },  // "め", "・"
-  { '?' , "\xe3\x82\x81", "\xe3\x83\xbb" },  // "め", "・"
+  { 'q' , "\xe3\x81\x9f", "\xe3\x81\x9f" },  // "た", "た"
+  { 'u' , "\xe3\x81\xaa", "\xe3\x81\xaa" },  // "な", "な"
+  { 'y' , "\xe3\x82\x93", "\xe3\x82\x93" },  // "ん", "ん"
+  { '}' , "\xe3\x83\xbc", "\xe3\x83\xbc" },  // "ー", "ー"
 };

 }  // namespace
Index: unix/ibus/mozc_engine.cc
===================================================================
--- unix/ibus/mozc_engine.cc    (revision 94)
+++ unix/ibus/mozc_engine.cc    (working copy)
@@ -542,8 +542,7 @@
 #endif

   // TODO(yusukes): use |layout| in IBusEngineDesc if possible.
-  const bool layout_is_jp =
-      !g_strcmp0(ibus_engine_get_name(engine), "mozc-jp");
+  const bool layout_is_jp = 0;

   commands::KeyEvent key;
   if (!key_translator_->Translate(

Here I also made an ebuild for the latest ibus-mozc on gentoo portage. You can just use it if you are a Gentoo user.

http://ujihisa.shiracha.net/static/ibus-mozc-1.3.975.102.tgz

*1 There are more variety of other input methods like thumb-shift or kyuuri-kai.

Realtek RTL8191S WLAN Adapter on Gentoo

I bought a wireless network adapter for my Gentoo desktop computer. It's a usb dongle and it says it's Linux-compatible.

RealTek 8191SU

How to use:

Device Drivers ->
  Staging drivers ->
    RealTek RTL8712U (RTL8192SU) Wireless LAN NIC driver

In the menuconfig.

Sunday, January 22, 2012

AutoKey Rocks

Mac OS X has a built-in keyboard shortcut configuration tool in Preferences. You can set both global key bindings and application-specific key bindings.

I had used this to use Cmd key more often than Ctrl key. For example Google Chrome has default key mappings to go to the next tab by <ctrl-tab> and to the previous tab by <shift-ctrl-tab>. They break your left pinky easily. I mapped them ad <Cmd-j> and <Cmd-k>. That helped the finger.

But there is no such great built-in tool in the Linux world neither in Gnome nor KDE. There is a Keyboard Shortcuts in Preference of Gnome, but it's only for Gnome apps.

AutoKey

AutoKey is a similar tool to the Mac OS X built-in keybinding configuration tool. The differences are (1) AutoKey doesn't map a key to a function in the application's menu and (2) AutoKey enables you to run arbitrary scripts.

It's very easy to install on Debian family distributions such as Ubuntu. Gentoo, on the other hand, has an overlay that has autokey-gtk package, but it's broken. You have to get the source code and build it. It also requires you to install some dependencies manually, but it works.

autokey-gtk on gentoo

(Added on Jan 24 2012)

There is an issue in autokey that you cannot exclude a window easily. See the end of this discussion to solve it. http://groups.google.com/group/autokey-users/browse_thread/thread/658ad02cfbde8788

Monday, December 26, 2011

Compiling a Language to Whitespace

Whitespace is a widely-known instruction set on a stack-machine virtual machine.

The below is a sample hello world in Whitespace intermediate language. I added some notes that begin with #.

Push 0
Push 72
Store
# storing the value 72 to 0th heap
Push 1
Push 101
Store
# storing the value 101 to 1st heap
Push 2
Push 108
Store
Push 3
Push 108
Store
Push 4
Push 111
Store
Push 5
Push 44
Store
Push 6
Push 32
Store
Push 7
Push 119
Store
Push 8
Push 111
Store
Push 9
Push 114
Store
Push 10
Push 108
Store
Push 11
Push 100
Store
Push 12
Push 32
Store
Push 13
Push 111
Store
Push 14
Push 102
Store
Push 15
Push 32
Store
Push 16
Push 115
Store
Push 17
Push 112
Store
Push 18
Push 97
Store
Push 19
Push 99
Store
Push 20
Push 101
Store
Push 21
Push 115
Store
# (cont.d...)

Push 22
Push 33
Store
Push 23
Push 0
Store
Push 0
Call "\t \t  \t\t   \t \t\t\t \t  \t \t\t  \t  \t\t\t \t\t\t \t\t\t "
# Jumping to the ☃ mark with remembering this place to return later
Call "\t \t  \t\t  \t\t\t \t\t \t  \t \t\t   \t\t \t\t \t\t\t \t\t\t \t \t  \t\t  \t\t\t \t\t "
# Jumping to the ☁ mark with remembering this place to return later
End
# Terminating program.
Label "  \t  \t\t   \t  \t\t \t    \t\t "
Infix Plus
Return
Label "\t \t  \t\t   \t \t\t\t \t  \t \t\t  \t  \t\t\t \t\t\t \t\t\t "
# ☃
Dup
# Copying the top value of the stack. It's 0 (if you see here for the first time.)
Retrieve
# Getting the 0th value of the heap. It's 72 (if you see here for the first time.)
Dup
If Zero "  \t  \t\t  \t\t\t \t\t \t \t  \t\t \t\t\t\t\t \t \t \t  \t\t   \t \t\t\t \t  \t \t\t  \t  \t\t\t \t\t\t \t\t\t "
OutputChar
Push 1
Infix Plus
Jump "\t \t  \t\t   \t \t\t\t \t  \t \t\t  \t  \t\t\t \t\t\t \t\t\t "
Label "  \t  \t\t  \t\t\t \t\t \t \t  \t\t \t\t\t\t\t \t \t \t  \t\t   \t \t\t\t \t  \t \t\t  \t  \t\t\t \t\t\t \t\t\t "
Discard
Discard
Return
# Ending the function call and going back to the previous place.
Label "  \t  \t\t \t    \t\t \t \t  \t\t  \t  \t\t\t "
Dup
Dup
ReadChar
Retrieve
Dup
Push 10
Infix Minus
If Zero "  \t  \t\t  \t\t\t \t\t \t \t  \t\t \t\t\t\t\t \t   \t  \t\t \t    \t\t \t \t  \t\t  \t  \t\t\t "
Discard
Push 1
Infix Plus
Jump "  \t  \t\t \t    \t\t \t \t  \t\t  \t  \t\t\t "
Label "  \t  \t\t  \t\t\t \t\t \t \t  \t\t \t\t\t\t\t \t   \t  \t\t \t    \t\t \t \t  \t\t  \t  \t\t\t "
Discard
Push 1
Infix Plus
Push 0
Store
Return
Label "\t \t  \t\t  \t\t\t \t\t \t  \t \t\t   \t\t \t\t \t\t\t \t\t\t \t \t  \t\t  \t\t\t \t\t "
# ☁
Push 10
Push 13
OutputChar
OutputChar
Return

And the result is

Hello, world of spaces!

You can run it with a little bit fixes of the official whitespace implementation wspace 0.3 written in Haskell.

A Language

I made an experimental small language and it's compiler.

(begin
  (def 0 98)
  (f 90 7)
  (putc (ref 0))
  (end)
  (defn f (x y)
        (putc (g x y)))
  (defn g (x y)
        (+ x y)))

Syntax: an expression begins with "(" and a name, arguments for it, and end with ")". If an expression has a value, you have to use the value with enclosing another expression. Otherwise the behavior is undefined. If an expression does not have a value, you must not enclose it as an argument. Otherwise the behavior is undefined.

begin
- combines some expressions that don't have return values
def {index} {value}
- assigns the {value} to {index}th slot of heap
ref {index} -> {value}
- returns the {value} of {index}th slot of heap
putc {value}
- outputs a character which ASCII code is {value}
end
- terminates program
defn {name} ({args}) {body}
- defines a function
- if a program come to defn without using call, the behavior is undefined.

* + {value} {value} -> {value} * obviously

You can "call" a function you made just with ({name} {arg1} {arg2}). You can use arguments of the function just by an identifier like x.

Sample code

(begin
  (putc (+ 50 40))
  (end))

That is compiled to

Push 50
Push 40
Infix Plus
OutputChar
End

and shows 'Z'.

I little bit complicated example

(begin
  (def 0 98)
  (f 90 7)
  (putc (ref 0))
  (end)
  (defn f (x y)
        (putc (g x y)))
  (defn g (x y)
        (+ x y)))

compiled to

Push 0
Push 98
Store
Push 90
Push 7
Call "f"
Push 0
Retrieve
OutputChar
End
Label "f"
Push (-1)
Swap
Store
Push (-2)
Swap
Store
Push (-1)
Retrieve
Push (-2)
Retrieve
Call "g"
OutputChar
Return
Label "g"
Push (-3)
Swap
Store
Push (-4)
Swap
Store
Push (-3)
Retrieve
Push (-4)
Retrieve
Infix Plus
Return

and shows "ab".

You may have noticed that the function argument is in negative number of heaps. If you updates them like by (def -3 100) it may result in breaking something, but since this implementation doesn't support negative literals, it remains safe.

The compiler is below, written in Haskell.

import qualified VM as V
import qualified Text.Parsec as P
import Control.Applicative ((<|>), (<$>))
import qualified Control.Monad.State as S
import qualified Data.Map as M
import Data.Maybe (fromJust)

data Intermediate = Comment String
  | Inst V.Instruction
  | Paramdef String
  | Paramref String
  deriving Show

type LispParser = P.ParsecT String () (S.State String)
type ParamMap = M.Map String Integer

main :: IO ()
main = do
  code <- readFile "hworld.lisp"
  --mapM_ print $ parse code
  let runtime = compile (parse code)
  mapM_ print runtime
  putStrLn "--"
  V.vm (V.VM runtime (V.Stack []) (V.Stack []) M.empty 0)

parse :: String -> [Intermediate]
parse str = either (error . show) id $
  S.evalState (P.runPT parseExpr () "lisp" str) "toplevel"

parseExpr :: LispParser [Intermediate]
parseExpr = P.try parseInt
         <|> parseDefn
         <|> parseBuiltin
         <|> parseApply
         <|> parseVar

parseInt :: LispParser [Intermediate]
parseInt = do
  x <- P.many1 P.digit
  return [Inst $ V.Push $ read x]

parseAtom :: LispParser String
parseAtom = P.many1 $ P.noneOf " \t\n()"

parseDefn :: LispParser [Intermediate]
parseDefn = do
  P.try $ do
    ignoringSpaces $ P.char '('
    ignoringSpaces $ P.string "defn"
  fname <- requireSpaces parseAtom
  S.lift $ S.put fname

  ignoringSpaces $ P.char '('
  names <- ignoringSpaces $ parseAtom `P.sepBy` P.skipMany1 P.space
  ignoringSpaces $ P.char ')'

  body <- ignoringSpaces parseExpr
  ignoringSpaces $ P.char ')'
  S.lift $ S.put "toplevel"
  return $
    Comment "(defn" :
    Inst (V.Label fname) :
    map (Paramdef . ((fname ++ "/") ++)) names ++
    body ++ [Inst V.Return] ++ [Comment ")"]

parseBuiltin :: LispParser [Intermediate]
parseBuiltin = P.try $ do
  (fname, xs) <- atomAndArgs
  x <- case (fname, length xs) of
       ("+", 2) -> return [Inst $ V.Infix V.Plus]
       ("putc", 1) -> return [Inst V.OutputChar]
       ("def", 2) -> return [Inst V.Store]
       ("ref", 1) -> return [Inst V.Retrieve]
       ("end", 0) -> return [Inst V.End]
       ("begin", _) -> return []
       _ -> fail "omg"
  return $ Comment ('(' : fname) : concat xs ++ x ++ [Comment ")"]

parseApply :: LispParser [Intermediate]
parseApply = do
  (fname, xs) <- atomAndArgs
  return $ concat xs ++ [Inst $ V.Call fname]

atomAndArgs :: LispParser (String, [[Intermediate]])
atomAndArgs = do
  ignoringSpaces $ P.char '('
  fname <- ignoringSpaces parseAtom
  xs <- ignoringSpaces $ parseExpr `P.sepBy` P.many1 P.space
  P.char ')'
  return (fname, xs)

parseVar :: LispParser [Intermediate]
parseVar = do
  name <- ignoringSpaces $ P.many1 $ P.noneOf " \t\n()"
  fname <- S.lift S.get
  return [Paramref $ fname ++ '/' : name]

ignoringSpaces :: LispParser a -> LispParser a
ignoringSpaces f = P.skipMany P.space >> f

requireSpaces :: LispParser a -> LispParser a
requireSpaces f = P.skipMany1 P.space >> f

compile :: [Intermediate] -> [V.Instruction]
compile inters = concat $ S.evalState (mapM compile' inters) M.empty

compile' :: Intermediate -> S.State ParamMap [V.Instruction]
compile' (Comment _) = return []
compile' (Inst x) = return [x]
compile' (Paramdef name) = do
  idx <- pred . negate . fromIntegral . M.size <$> S.get
  S.modify $ M.insert name idx
  return [V.Push idx, V.Swap, V.Store]
compile' (Paramref name) = do
  idx <- fromJust . M.lookup name <$> S.get
  return [V.Push idx, V.Retrieve]

This code depends on VM.hs from wspace-0.3 to share the data structure of VM Instruction and to execute the compiled program. If you only want to compile given programs, you don't need VM.hs but just to add the following definition.

data Instruction =
       Push Integer
     | Dup
     | Ref Int
     | Slide Int
     | Swap
     | Discard
     | Infix Op
     | Store
     | Retrieve
     | Label Label
     | Call Label
     | Jump Label
     | If Test Label
     | Return
     | OutputChar
     | OutputNum
     | ReadChar
     | ReadNum
     | End
   deriving (Show,Eq)

By the way wspace-0.3 had an issue that it can only handle sequential indices of heap. You can store values in 0th, 1st and 2nd slots of heap, but you cannot store in 100th without completing all indices between 0 to 100. I wrote a patch to allow any index. Feel free to use it.

diff --git VM.hs VM.hs
index c9e96ab..bb74374 100644
--- VM.hs
+++ VM.hs
@@ -1,6 +1,8 @@
 module VM where

 import IO
+import qualified Data.Map as M
+import Data.Maybe (fromJust)

 {- Stack machine for running whitespace programs -}

@@ -35,7 +37,7 @@ type Loc = Integer

 type Program = [Instruction]
 newtype Stack = Stack [Integer]
-type Heap = [Integer]
+type Heap = M.Map Integer Integer

 data VMState = VM {
         program :: Program,
@@ -130,13 +132,7 @@ findLabel' m (_:xs) i = findLabel' m xs (i+1)
 -- Heap management

 retrieve :: Integer -> Heap -> IO Integer
-retrieve x heap = return (heap!!(fromInteger x))
+retrieve x heap = return $ fromJust $ M.lookup x heap

 store :: Integer -> Integer -> Heap -> IO Heap
-store x 0 (h:hs) = return (x:hs)
-store x n (h:hs) = do hp <- store x (n-1) hs
-             return (h:hp)
-store x 0 [] = return (x:[])
-store x n [] = do hp <- store x (n-1) [] 
-         return (0:hp)
-
+store x n h = return $ M.insert n x h

Thursday, December 22, 2011

Continuous-Passing Conversion in Haskell

http://en.wikipedia.org/wiki/Continuation-passing_style

Convert from

(+ (f 0 (g 1)) 2)

(g' (lambda (r0) (f' (lambda (r1) (+ r1 2)) 0 r0)) 1)

where data structure internally in Haskell is like

data AST = Node [AST] | Leaf Value
data Value = IntVal Int | Plus | Atom String | Lambda [String]

Implementation and description

import qualified Control.Monad.State as S

data AST = Node [AST] | Leaf Value
instance Show AST where
  show (Node xs) = "(" ++ unwords (map show xs) ++ ")"
  show (Leaf v) = show v

data Value = IntVal Int | Plus | Atom String | Lambda [String]
instance Show Value where
  show (IntVal i) = show i
  show Plus = "+"
  show (Atom name) = name
  show (Lambda names) = "lambda (" ++ unwords names ++ ")"

-- (+ (f 0 (g 1)) 2)
-- (g' (lambda (r0) (f' (lambda (r1) (+ r1 2)) 0 r0)) 1)
program :: AST
program = Node [Leaf Plus,
  Node [Leaf (Atom "f"), Leaf (IntVal 0), Node [Leaf (Atom "g"), Leaf (IntVal 1)]],
  Leaf (IntVal 2)]

main = do
  print program
  print $ cps program

cps :: AST -> AST
cps ast =
  let (newAst, modifiers) = S.runState (cps' ast) [] in
      foldl (flip ($)) newAst modifiers

cps' :: AST -> S.State [AST -> AST] AST
cps' (Node (Leaf (Atom f) : xs)) = do
  xs' <- mapM cps' xs
  n <- length `fmap` S.get
  let name = 'r' : show n
  append $ \root -> Node $
    (Leaf . Atom $ f ++ "'") :
    Node [Leaf (Lambda [name]), root] :
    xs'
  return $ Leaf (Atom name)
cps' (Node xs) = Node `fmap` mapM cps' xs
cps' c@(Leaf _) = return c

append x = S.modify (x :)

This converts correctly.

I used State Monad to modify given tree. The function cps starts state and the actual function cps' traverses given AST subtrees recursively.

(+ (f 0 (g 1)) 2)
   ^^^^^^^^^^^

When cps' sees this subtree, oh yes the first item of the list is a user-defined function and it's not tail-call, so cps' wants to replace the part with a new variable (say r), and enclose whole tree with new function f' and the arguments.

(f' (lambda (r) ((+ r 2) 0 (g 1))))
^^^^^^^^^^^^^^^^^   ^   ^^^^^^^^^^^

It's easy to change subtree but it's not trivial to change outside the subtree. But fortunately we already know that we only have to enclose something around the whole tree, so you can just save a function in state.

After cps' process is done, you apply all functions that the state has accumulatively to enclose trees. That will complete the job.

Tuesday, December 20, 2011

Unlambda Interpreter in Haskell

Unlambda is a minimal, "nearly pure"[1] functional programming language invented by David Madore. It is based on combinatory logic, a version of the lambda calculus that omits the lambda operator. It relies mainly on two built-in functions (s and k) and an "apply" operator (written `, the backquote character). These alone make it Turing-complete, but there are also some I/O functions to make it possible to interact with the user, some shortcut functions and a function for lazy evaluation. There are no variables in the language.

http://en.wikipedia.org/wiki/Unlambda

import qualified Text.Parsec as P
import Control.Applicative ((*>), (<$>), (<*>))

data AST = Apply AST AST | Val Value
instance Show AST where
  show (Apply a b) = "(" ++ show a ++ " " ++ show b ++ ")"
  show (Val (Dot c)) = "put-" ++ [c]
  show (Val (Builtin c)) = [c]

data Value = Dot Char
  | Builtin Char
  | PendingK Value
  | PendingS1 Value
  | PendingS2 Value Value
  deriving Show

main = do
  let helloworld = "`r```````````.H.e.l.l.o. .w.o.r.l.di"
  let fibonacci = "```s``s``sii`ki`k.*``s``s`ks``s`k`s`ks``s``s`ks``s`k`s`kr``s`k`sikk`k``s`ksk"
  print $ desugar $ parse helloworld
  eval $ desugar $ parse helloworld
  --eval $ desugar $ parse fibonacci

parse :: String -> AST
parse = either (error . show) id . P.parse parse' "unlambda"

parse' = P.try (P.char '`' *> (Apply <$> parse' <*> parse'))
         P.<|>
         P.try (P.char '.' *> (Val . Dot <$> P.anyChar))
         P.<|>
         P.try (Val . Builtin <$> P.anyChar)

desugar :: AST -> AST
desugar (Apply a b) = Apply (desugar a) (desugar b)
desugar (Val (Builtin 'r')) = Val (Dot '\n')
desugar (Val (Builtin 'i')) = Apply (Apply (Val (Builtin 's')) (Val (Builtin 'k'))) (Val (Builtin 'k')) -- i = ``skk
desugar x = x

eval :: AST -> IO (Value)
eval (Apply a b) = do
  a' <- eval a
  b' <- eval b
  apply a' b'
eval (Val x) = return x

apply :: Value -> Value -> IO Value
apply (Dot c) x = putChar c >> return x
apply (Builtin 'k') x = return $ PendingK x
apply (Builtin 's') x = return $ PendingS1 x
apply (PendingK x) y = return $ x
apply (PendingS1 x) y = return $ PendingS2 x y
apply (PendingS2 x y) z = do
  a <- apply x z
  b <- apply y z
  apply a b

parse the given string to abstract syntax tree
desugar the ast; expanding macros like r or i.
interpreter evaluates all nodes!

AST

(put-\n (((((((((((put-H put-e) put-l) put-l) put-o) put- ) put-w) put-o) put-r) put-l) put-d) ((s k) k)))

Result of helloworld

Hello world

Result of fibonacci

*
*
**
***
*****
********
*************
*********************
**********************************
*******************************************************
*****************************************************************************************

(added at Tue Dec 20 23:57:12 PST 2011)

I also made a stackmachine-based virtual machine and a compiler for it.

https://gist.github.com/1505131

This was actually much simpler/easier than I thought. There's a difference between pure interpreter and this virtualmachine, but it's not very big.

For example very short program "hi" that shows "hi" is "``.h.ii" in unlambda. First this parser converts the text to AST.

((put-h put-i) ((s k) k))

Then the compiler converts the tree to sequence of instructions.

IPush (Dot 'h')
IPush (Dot 'i')
IApply
IPush (Builtin 's')
IPush (Builtin 'k')
IApply
IPush (Builtin 'k')
IApply
IApply

Then the virtualmachine runtime will run it.

hi