File index.dd
- 2010-11-24 13:22:04 - part of checkin [f9c31f3cd8] on branch trunk - Fixed the null dereference bug when directly wrote "case 1 when 2: 3" in REPL. It was due to null LexPosition in the AST. Now AST.pos !is null is an invariant of AST. (user: kinaba)
Ddoc
$(DDOC_AUTHORS k.inaba)
$(DDOC_LICENSE NYSL 0.9982 (http://www.kmonos.net/nysl/))

<p>
This file is a brief description of the language specification and related topics.
</p>
<p>
In addition, clicking the "Package" tab in the left sidebar takes you to the documentation of the implementation's source code.
</p>

$(DDOC_MEMBERS

$(SECTION Syntax, $(SECBODY
<p>
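About the grammar.
The lexer is fairly sloppy:
in a variable declaration you can use a number as the variable name and thereby create a variable that can never be referenced,
and reserved words can be used as ordinary variable names wherever they cannot be interpreted as reserved words.
As a result, some outrageous-looking source code occasionally passes the parser. Please don't worry about it and use the language as you see fit.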
</p>

$(DDOC_MEMBERS

$(SECTION Character Encoding, $(SECBODY
<p>
Only UTF-8 is supported.
</p>
))

$(SECTION Comments, $(SECBODY
<p>
A line comment runs from <tt>#</tt> to the end of the line.
</p>
<p>
There are no block comments.
</p>
))

$(SECTION BNF, $(SECBODY
<pre>
 ID    ::= any reasonably identifier-like string
 LAYER ::= "@" ID

 E ::=
   $(D_COMMENT # variable declarations)
     | ("var"|"let"|"def"|LAYER) ID "=" E (";"|"in") E
     | ("var"|"let"|"def"|LAYER) ID "(" PARAMS ")" "{" E "}" (";"|"in") E
     | ("var"|"let"|"def"|LAYER) ID "=" E
     | ("var"|"let"|"def"|LAYER) ID "(" PARAMS ")" "{" E "}"

   $(D_COMMENT # literals)
     | INTEGER                        $(D_COMMENT # non-negative integer)
     | STRING                         $(D_COMMENT # string enclosed in ""; the escapes \" and \\ are supported)
     | "{" ENTRYS "}"                 $(D_COMMENT # table)
     | "fun" "(" PARAMS ")" "{" E "}" $(D_COMMENT # anonymous function)
     |  "λ" "(" PARAMS ")" "{" E "}" $(D_COMMENT # anonymous function)

   $(D_COMMENT # function call)
     | E "(" ARGS ")"

         where    ARGS ::= E "," ... "," E
                PARAMS ::= (ID|LAYER)+ "," ... "," (ID|LAYER)+
                ENTRYS ::= ID ":" E    "," ... "," ID ":" E

   $(D_COMMENT # operators and such)
     | "(" E ")"                 $(D_COMMENT # plain parentheses)
     | E BINOP E                 $(D_COMMENT # assorted binary operators)
     | E "."  ID                 $(D_COMMENT # table field access)
     | E ".?" ID                 $(D_COMMENT # whether the table has the field)
     | E "{" ENTRYS "}"          $(D_COMMENT # table extension)
     | "if" E ("then"|":"|"then" ":") E
     | "if" E ("then"|":"|"then" ":") E "else" ":"? E

   $(D_COMMENT # pattern matching)
     | "case" E ("when" PATTERN ":" E )*

         where PATTERN ::= almost any expression can be written here, I believe

   $(D_COMMENT # execution in a specified layer)
     | LAYER "(" E ")"
</pre>
))

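$(SECTION Syntactic Sugar, $(SECBODY
<p>
There are no operators as such. Internally, everything is rewritten into function-call syntax, <tt>if</tt> included.
<br/>
Pattern matching is likewise rewritten entirely into function-call expressions built from <tt>if</tt>, <tt>==</tt>, <tt>&amp;&amp;</tt>,
<tt>.</tt> and <tt>.?</tt>;
explaining the exact rules would be tedious, so please use your imagination.
The other rewrites look like this.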
</p>
<pre>
    if E then E         ⇒ if( E, fun(){E}, fun(){} )
    if E then E else E  ⇒ if( E, fun(){E}, fun(){E} )
    E BINOP E           ⇒ BINOP(E, E)
    { ENTRIES }         ⇒ {}{ ENTRIES }
    {}                  ⇒ {}()
    E {ID:E, ...}       ⇒ .=(E, ID, E) { ... }
</pre>
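<p>
The first three rewrites above can be modelled with a minimal Python sketch. It assumes a hypothetical tuple encoding of syntax trees; the function name <tt>desugar</tt> and the node tags are illustrative and are not taken from the Polemy sources.
</p>

```python
# Hypothetical tuple encoding, for illustration only:
#   ("int", 1)  ("var", "x")  ("binop", "+", a, b)
#   ("if", cond, then_branch, else_branch_or_None)
#   ("fun", [params], body_or_None)  ("call", callee, [args])

def desugar(e):
    """Rewrite operators and if-expressions into plain function calls."""
    kind = e[0]
    if kind == "binop":
        # E BINOP E  becomes  BINOP(E, E)
        _, op, lhs, rhs = e
        return ("call", ("var", op), [desugar(lhs), desugar(rhs)])
    if kind == "if":
        # if E then E [else E]  becomes  if(E, fun(){E}, fun(){E or nothing})
        _, cond, then_e, else_e = e
        then_fn = ("fun", [], desugar(then_e))
        else_fn = ("fun", [], None if else_e is None else desugar(else_e))
        return ("call", ("var", "if"), [desugar(cond), then_fn, else_fn])
    if kind == "fun":
        _, params, body = e
        return ("fun", params, None if body is None else desugar(body))
    if kind == "call":
        _, callee, args = e
        return ("call", desugar(callee), [desugar(a) for a in args])
    return e  # literals and variables are left unchanged


# if x == 0 then 1
tree = ("if", ("binop", "==", ("var", "x"), ("int", 0)), ("int", 1), None)
result = desugar(tree)
print(result)
```

<p>
Both branches end up wrapped in zero-argument functions, which is what lets <tt>if</tt> be an ordinary function while still running only one branch.
</p>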
<p>
Variable declarations come in several forms, but <tt>let</tt>, <tt>var</tt> and <tt>def</tt> are treated identically,
and so are <tt>in</tt> and <tt>;</tt>. In other words,
</p>
<pre>
   let x = E in E
   var x = E in E
   def x = E in E
   let x = E ; E
   var x = E ; E
   def x = E ; E
</pre>
<p>
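all of the above mean the same thing, so use <tt>let in</tt> on days when you feel like writing in a functional style
and <tt>var ;</tt> on days when you feel like writing in a procedural style.
<tt>if then else</tt> likewise has minor variations, with or without colons; pick whichever you prefer.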
</p>
<p>
When declaring a function, you can omit <tt>fun</tt> or <tt>λ</tt>.
The following rewrite is applied.
</p>
<pre>
   def f( ARGS ) { E }; E   ⇒   def f = fun(ARGS){E}; E
</pre>
<p>
There are also various other rewrites that make code look more procedural.
</p>
<pre>
   fun () { E; E; E      }   ⇒   fun () { let _ = E in let _ = E in E }
   fun () { var x = 100  }   ⇒   fun () { var x = 100; x }
   fun () { var x = 100; }   ⇒   fun () { var x = 100; x }
   fun () { }                ⇒   fun () { "(empty function body)" }
</pre>
<p>
What an empty function body should return is arbitrary. For now it simply returns a placeholder string.
</p>
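<p>
A rough Python model of the function-body rewrites listed above. The statement encoding and the name <tt>rewrite_body</tt> are hypothetical; the sketch produces source text rather than real syntax trees.
</p>

```python
# Hypothetical statement encoding, for illustration only:
#   ("expr", "E")           a bare expression
#   ("var", "x", "100")     a declaration  var x = 100

def rewrite_body(stmts):
    """Turn a procedural statement list into one nested expression string."""
    if not stmts:
        # An empty body returns a placeholder string.
        return '"(empty function body)"'
    head, rest = stmts[0], stmts[1:]
    if head[0] == "var":
        _, name, init = head
        # A trailing declaration yields the variable it declares.
        tail = rewrite_body(rest) if rest else name
        return "var " + name + " = " + init + "; " + tail
    _, expr = head
    if not rest:
        return expr
    # E; rest  becomes  let _ = E in rest
    return "let _ = " + expr + " in " + rewrite_body(rest)


seq = rewrite_body([("expr", "E1"), ("expr", "E2"), ("expr", "E3")])
decl = rewrite_body([("var", "x", "100")])
empty = rewrite_body([])
print(seq)
print(decl)
print(empty)
```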
))

$(SECTION Variable Scoping Rules, $(SECBODY
<p>
Basically, let nests variable scopes in the way you would expect.
</p>
<pre>
   let x=21 in let x=x+x in x    $(D_COMMENT # 42)
</pre>
<p>
On the other hand, although there is no special syntax such as "let rec",
</p>
<pre>
   let f = fun(x) { if x==0 then 1 else x*f(x-1) } in f(10)  $(D_COMMENT # 3628800)
</pre>
<p>
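recursive function definitions and the like will probably work as intended.
For various reasons,
the scoping rules are internally magical and destructive,
but as long as you do not heavily reuse the same variable name,
things should behave more or less naturally. Probably.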
</p>
<p>
The one case that behaves mysteriously is the following.
</p>
<pre>
   let x = 1 in
   let f = fun() {x} in
   let x = 2 in
      f()    $(D_COMMENT # 2!!)
</pre>
<p>
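Only when let-in is chained vertically like this is a variable of the same name destructively overwritten
(this overwriting is why recursive function definitions "just work").
The reason for this design is that,
with the "layers" explained later,
<tt>let foo = ... in @lay foo = ... in ...</tt>
has to add a definition of the same name in another layer.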
</p>
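<p>
The overwriting behavior can be imitated in Python by letting a whole let-in chain share one mutable environment. This is only a toy model under that assumption, not the actual implementation; <tt>run_let_chain</tt> is a hypothetical helper.
</p>

```python
# A toy model, not the Polemy implementation: one mutable dict stands for the
# environment shared by a vertical chain of let-in declarations.

def run_let_chain(bindings, body):
    """bindings: list of (name, make_value); make_value receives the shared env."""
    env = {}
    for name, make_value in bindings:
        # A later declaration of the same name overwrites the earlier one in place.
        env[name] = make_value(env)
    return body(env)


# let x = 1 in let f = fun(){x} in let x = 2 in f()
shadowed = run_let_chain(
    [
        ("x", lambda env: 1),
        ("f", lambda env: (lambda: env["x"])),  # the closure keeps a reference to env
        ("x", lambda env: 2),
    ],
    lambda env: env["f"](),
)

# let f = fun(n){ if n==0 then 1 else n*f(n-1) } in f(10)
factorial = run_let_chain(
    [("f", lambda env: (lambda n: 1 if n == 0 else n * env["f"](n - 1)))],
    lambda env: env["f"](10),
)
print(shadowed, factorial)
```

<p>
In this model the closure sees the later <tt>x</tt>, and the recursive function finds itself, for the same reason: both look names up in the shared environment at call time.
</p>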
))
)
))




$(SECTION Basic Features, $(SECBODY
<p>
A brief summary of the less distinctive parts of the language.
</p>
<ul>
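  <li>There is no static type system.</li>
  <li>It is an "almost" purely functional language. Variables and table fields cannot be destructively updated.<br/>
      However, side effects exist in built-in functions (<tt>print</tt>) and in the magical corner of the variable scoping rules.</li>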
</ul>
<p>
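The absence of a static type system is intentional, but destructive assignment is missing simply because implementing it was a chore,
so something may be added in the future. Or maybe not.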
</p>
$(DDOC_MEMBERS
$(SECTION Data Types, $(SECBODY
<p>
The following data types exist.
</p>
<ul>
  <li>Integer:         <tt>0</tt>, <tt>123</tt>, <tt>456666666666666666666666666666666666666789</tt>, ...</li>
  <li>String:          <tt>"hello, world!"</tt>, ...</li>
  <li>Function:        <tt>fun(x){x+1}</tt></li>
  <li>Table:           <tt>{car: 1, cdr: {car: 2, cdr: {}}}</tt></li>
  <li>Undefined value: (undefined; created in special cases)</li>
</ul>
<p>
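Functions are so-called closures: they can access the enclosing environment under static scoping.
Tables have a so-called prototype chain,
so a lookup of a field that the table itself lacks is forwarded to its parent;
since fields cannot be updated, though, this may not mean much...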
</p>
<p>
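Lists are handled as so-called cons lists:
the empty list is <tt>{}</tt>, and a list with one or more elements is <tt>{car: first element, cdr: list of the remaining elements}</tt>.
Nothing forces you to represent lists this way,
but this shape gets special treatment, such as being pretty-printed by <tt>print</tt>.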
</p>
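<p>
A toy Python model of tables with a parent link and of cons lists built from them. The class <tt>Table</tt> and the helpers <tt>extend</tt>, <tt>cons</tt> and <tt>to_python_list</tt> are hypothetical names; Python's <tt>None</tt> stands in for the undefined value.
</p>

```python
# A toy model of tables, assuming each table is an immutable link that holds
# one field plus a reference to its parent. Names are hypothetical.

class Table:
    def __init__(self, parent=None, key=None, value=None):
        self.parent, self.key, self.value = parent, key, value

    def has(self, name):
        """Model of the field-existence test: 1 if found on the chain, else 0."""
        t = self
        while t is not None:
            if t.key == name:
                return 1
            t = t.parent
        return 0

    def get(self, name):
        """Model of field access: look here first, then ask the parents."""
        t = self
        while t is not None:
            if t.key == name:
                return t.value
            t = t.parent
        return None  # stands in for the undefined value


def extend(t, name, value):
    """Model of table extension: a new table whose parent is t; t is untouched."""
    return Table(t, name, value)


def cons(head, tail):
    return extend(extend(Table(), "car", head), "cdr", tail)


def to_python_list(lst):
    out = []
    while lst.has("car"):
        out.append(lst.get("car"))
        lst = lst.get("cdr")
    return out


nil = Table()
one_two = cons(1, cons(2, nil))
print(to_python_list(one_two))
```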
))
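$(SECTION Pattern Matching, $(SECBODY
<p>
There is a loosely implemented pattern matching facility.
Here is a function that adds the 2n-th and 2n+1-th elements of a list, halving its length: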
</p>
<pre>
    def adjSum(lst)
    {
      case lst
        when {car:x, cdr:{car: y, cdr:z}}: {car: x+y, cdr: adjSum(z)}
        when {car:x, cdr:{}}: lst
        when {}: {}
    }
</pre>
<p>
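At run time, the implementation expands this into the corresponding if-then-else chain.
The <tt>when</tt> clauses are tried from top to bottom, and the first one that matches is executed.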
</p>
<pre>
   PAT ::= "_"                                        $(D_COMMENT # wildcard)
         | ID                                         $(D_COMMENT # variable pattern)
         | "{" ID ":" PAT "," ... "," ID ":" PAT "}"  $(D_COMMENT # table pattern)
         | E                                          $(D_COMMENT # value pattern)
</pre>
<p>
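A variable pattern always matches and binds the value to that variable.
A wildcard also always matches but binds nothing.
A value pattern can be any expression; it matches if the matched value is <tt>==</tt> to the result of evaluating that expression.
A variable bound outside cannot be placed directly as a value pattern, so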
</p>
<pre>
   var x = 123;
   case foo
     when {val: x+0}: ... $(D_COMMENT # same as {val:123})
     when {val: x}:   ... $(D_COMMENT # always matches any foo for which foo.?val holds)
</pre>
<p>
turning it into a slightly more complex expression does the trick (a hack).
</p>
<p>
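A table pattern matches if all of the listed keys are present.
<tt>{a: _}</tt> matches anything that has <tt>.a</tt>,
so it also matches <tt>{a: 123, b: 456}</tt>.
Consequently, when writing patterns for lists, put the car/cdr cases first:
if <tt>when {}</tt> comes first, it matches everything. Be careful.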
</p>
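<p>
The matching rules described in this section can be sketched in Python as follows. The pattern encoding and the functions <tt>match</tt> and <tt>case</tt> are hypothetical, tables are modelled as plain dicts, and the treatment of non-table values against a table pattern is an assumption.
</p>

```python
# Hypothetical pattern encoding, for illustration only:
#   ("wild",)  ("var", name)  ("table", {key: pattern})  ("value", v)

def match(pat, value, binds):
    """Return True and extend binds if value matches pat, else return False."""
    kind = pat[0]
    if kind == "wild":
        return True                      # always matches, binds nothing
    if kind == "var":
        binds[pat[1]] = value            # always matches, binds the value
        return True
    if kind == "value":
        return value == pat[1]           # matches when the values are equal
    # Table pattern: every listed key must exist; extra keys are ignored.
    # Assumption: a non-table value never matches a table pattern.
    if not isinstance(value, dict):
        return False
    return all(k in value and match(p, value[k], binds) for k, p in pat[1].items())


def case(value, clauses):
    """Try the clauses top to bottom and run the first one that matches."""
    for pat, body in clauses:
        binds = {}
        if match(pat, value, binds):
            return body(binds)
    raise ValueError("no clause matched")


def cell(car, cdr):
    return {"car": car, "cdr": cdr}


def adj_sum(lst):
    pair = ("table", {"car": ("var", "x"),
                      "cdr": ("table", {"car": ("var", "y"), "cdr": ("var", "z")})})
    single = ("table", {"car": ("var", "x"), "cdr": ("table", {})})
    empty = ("table", {})
    return case(lst, [
        (pair, lambda b: cell(b["x"] + b["y"], adj_sum(b["z"]))),
        (single, lambda b: lst),
        (empty, lambda b: {}),
    ])


halved = adj_sum(cell(1, cell(2, cell(3, {}))))
print(halved)
```

<p>
Because the empty table pattern lists no keys, it matches every table in this model, which is why it has to come last.
</p>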
))
)
))





$(SECTION Layers, $(SECBODY
<pre>
[Layers :: Overview]

  Polemy's runtime environment has many "layer"s.
  Ordinary execution runs in the @value layer.

    >> 1 + 2
    3
    >> @value( 1 + 2 )
    3

  Here you can see that @LayerName( Expression ) executes the inner Expression in
  the @LayerName layer. Other than @value, one other predefined layer exists: @macro.

    >> @macro( 1+2 )
    {pos@value:{lineno@value:3, column@value:9, filename@value:<REPL>},
      is@value:app,
    args@value:{car@value:{pos@value:{lineno@value:3, column@value:9, filename@value:<REPL>},
                            is@value:int,
                          data@value:1},
                cdr@value:{
                  car@value:{pos@value:{lineno@value:3, column@value:11, filename@value:<REPL>},
                              is@value:int,
                            data@value:2},
                  cdr@value:{}}},
     fun@value:{pos@value:{lineno@value:3, column@value:10, filename@value:<REPL>},
                 is@value:var,
               name@value:+}}

  (Sorry, this pretty printing is not available on the actual interpreter...)
  This evaluates the expression 1+2 in the @macro layer. In this layer, the meaning of
  the program is its abstract syntax tree.

  You can interleave layers.
  The root node of the abstract syntax tree is function "app"lication.

    >> @value(@macro( 1+2 ).is)
    app



[Layers :: Defining a new layer]

  To define a new layer, you should first tell how to "lift" existing values to the new layer.
  Let us define the "@type" layer, where the meaning of programs is their static type.

    >> @@type = fun(x) {
    >>   if( _isint(x) ) { "int" } else {
    >>   if( _isfun(x) ) { x } else { "unknown" } }
    >> }
    (Note: the Polemy REPL may report an exception here; please ignore it)

  For simplicity, I deal only with integers here.
  _isint is a primitive function of Polemy that checks the dynamic type of a value.
  For functions, leaving them untouched works well in almost all layers.

    >> @type( 1 )
    int
    >> @type( 2 )
    int
    >> @type( "foo" )
    unknown

  Fine! Let's try to type 1+2.

    >> @type( 1 + 2 )
    ...\value.d(119): [<REPL>:6:8] only @value layer can call native function

  Note that the behavior of this program is
    - run 1+2 in the @type layer
  and NOT
    - run 1+2 in @value, obtain 3, and run 3 in @type.
  The problem is that the variable "+" is defined only in the @value layer.
  To carry out the computation in the @type layer, we need to define it
  in the @type layer as well.

  To define some variable in a specific layer, use @LayerName in place of
  (let|var|def)s.

    >> let x = 2
    >> @value x = 2
    >> @type x = "int"
    >> @hoge x = "fuga"

  For "+", do it like this.

    >> @type "+" = fun(x,y) {@value(
    >>   if( @type(x)=="int" && @type(y)=="int" ) { "int" } else { "typeerror" }
    >> )}
    polemy.value.native!(IntValue,IntValue,IntValue).native.__anonclass24

  It is just computing the return type from the input type.
  Note here that the intended "meaning" of if-then-else is runtime branching,
  and the meaning of "==" is value comparison. These are @value layer
  behaviors, so we have defined the function body inside the @value layer.
  But when we refer to the variables x and y, we need their @type layer meaning.
  Hence we use @type() there.

  Now we get it.

    >> @type( 1 + 2 )
    int

  Well, but do we have to define the @type layer meaning for every variable?
  No. After you have defined @type "+", you automatically get the following:

    >> def double(x) { x + x }
    (function:17e4740:1789720)

    >> @type( double(123) )
    int

  Every user-defined function is automatically "lift"ed to the appropriate layer.
  Only primitive functions like "+" require the @yourNewLayer annotation.



[Layers :: neutral-layer]

  let|var|def defines a variable in the "current" layer,
  which is not necessarily the @value layer.

    >> @value( let x = 1 in @value(x) )
    1

    >> @macro( let x = 1 in @value(x) )
    polemy.failure.RuntimeException: [<REPL>:14:29] variable x not found

    >> @macro( let x = 1 in @macro(x) )
    {pos@value:{lineno@value:15, ...



[Layers :: Layered-Parameters]

    >> def foo(x @macro @value) { {fst: x, snd: @macro(x)} }
    (function:1730360:1789720)

  If you annotate function parameters by @LayerNames, when you invoke the function...

    >> foo(1+2)
    {snd@value: {pos@value:{lineno@value:17, column@value:5, filename@value:<REPL>},
                  is@value:app, arg@value:{...
    /fst@value:3
    /}

  its corresponding arguments are evaluated in the layer and passed to it.
  If you specify multiple layers, the argument expression is run multiple times.
  If you do not specify any layer for a parameter, it works in the neutral layer.
</pre>
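<p>
The idea of running one program in several layers can be imitated with a very small evaluator. The Python sketch below is a toy model under simplifying assumptions, not a description of how Polemy is implemented: the tree encoding, the <tt>Closure</tt> class and the tables <tt>layers</tt>, <tt>lifts</tt> and <tt>user_functions</tt> are all hypothetical.
</p>

```python
# A toy evaluator, not the Polemy implementation. Hypothetical tree encoding:
#   ("int", n)   ("var", name)   ("app", callee, [args])

class Closure:
    """A user-defined function; its body can be re-run in any layer."""
    def __init__(self, params, body):
        self.params, self.body = params, body


layers = {
    "@value": {"+": lambda x, y: x + y},   # primitives exist only where defined
    "@type": {},
}
lifts = {"@type": lambda v: "int" if isinstance(v, int) else "unknown"}
user_functions = {}                        # shared by all layers


def evaluate(tree, layer, local=None):
    local = local or {}
    kind = tree[0]
    if kind == "int":
        # Literals are lifted into the current layer; @value keeps them as is.
        return lifts[layer](tree[1]) if layer in lifts else tree[1]
    if kind == "var":
        name = tree[1]
        for scope in (local, layers[layer], user_functions):
            if name in scope:
                return scope[name]
        raise LookupError("variable " + name + " is not set in layer " + layer)
    _, callee, args = tree
    fn = evaluate(callee, layer, local)
    values = [evaluate(a, layer, local) for a in args]
    if isinstance(fn, Closure):
        # The same body is evaluated again, but in the caller's layer.
        return evaluate(fn.body, layer, dict(zip(fn.params, values)))
    return fn(*values)


one_plus_two = ("app", ("var", "+"), [("int", 1), ("int", 2)])
value_result = evaluate(one_plus_two, "@value")

# @type "+" : a function on types rather than on numbers
layers["@type"]["+"] = lambda x, y: "int" if (x, y) == ("int", "int") else "typeerror"
type_result = evaluate(one_plus_two, "@type")

# def double(x) { x + x } needs no @type definition of its own
user_functions["double"] = Closure(["x"], ("app", ("var", "+"), [("var", "x"), ("var", "x")]))
double_type = evaluate(("app", ("var", "double"), [("int", 123)]), "@type")
print(value_result, type_result, double_type)
```

<p>
In this model only the primitive needs a per-layer definition; the user-defined function is lifted simply by re-evaluating its body in the other layer.
</p>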
))


$(SECTION Macro Layers, $(SECBODY
<p>
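Polemy has two built-in layers: <code>@value</code> and <code>@macro</code>.
(Internally there are a few more, but users cannot access them directly.)
<code>@value</code> is the layer that runs programs "normally", with the ordinary semantics.
<code>@macro</code> is in fact a layer that runs before <code>@value</code>,
and its semantics is that running a program returns its syntax tree.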
</p>
<pre>
    (example goes here)
</pre>
<p>
It works as follows.
</p>
<ol>
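<li>When a function is called (and when execution of the top-level environment starts),
	the code is first run in the <code>@macro</code> layer.</li>
<li>The resulting syntax tree is then run in the <code>@value</code> layer,
	or in the layer from which the function was called.</li>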
</ol>
<p>
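The <code>@macro</code> layer is, after all, just another layer,
so by registering functions and the like in the <code>@macro</code> layer in the way described above,
you can manipulate how the syntax tree is generated. That is exactly what a macro is.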
</p>

$(DDOC_MEMBERS
$(SECTION Usage, $(SECBODY
<pre>
   When a function is invoked, it first runs in the @macro layer, and after that
   it runs in the neutral layer. Here is an example.

     >> @macro twice(x) { x; x }
     >> def f() { twice(print("Hello")); 999 }
     (function:173b6a0:1789720)
     >> f()
     Hello
     Hello
     999

   When the interpreter evaluates f(), it first executes
     "twice(print("Hello")); 999"
   in the @macro layer. Basically, what it does is just construct its syntax tree.
   But since we have defined the "twice" function in the @macro layer, it is
   executed as a function. The resulting syntax tree is
     "print("Hello"); print("Hello"); 999"
   and this is executed in the neutral (in this example, @value) layer.
   This is the reason why you see two "Hello"s.

      [[quote and unquote]]

   Here is a more involved example of code generation.
   From "x", it generates "x*x*x*x*x*x*x*x*x*x".

     @macro pow10(x) {
       @value(
         def pow(x, n) {
           if( n == 1 ) { x }
           else {
             @macro( @value(x) * @value(pow(x,n-1)) )
           }
         }
         in
           pow(@macro(x),10)
       )
     };

   Here, x is a syntax tree but n is an actual integer. If you read carefully,
   you should get what is going on. Basically, @macro can be thought of as
   quasiquoting and @value as an escape from it.
</pre>
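<p>
The code generation performed by the pow10 example can be mirrored in Python, with a tuple standing for a syntax tree. This is only an analogy under a hypothetical tree encoding; <tt>pow_tree</tt> and <tt>show</tt> are illustrative names.
</p>

```python
# Hypothetical tuple encoding of syntax trees, for illustration only:
#   ("var", name)   ("app", callee, [lhs, rhs])

def pow_tree(x, n):
    """Return the tree for x * x * ... * x with n factors; x is a tree, n an int."""
    if n == 1:
        return x
    # Plays the role of @macro( @value(x) * @value(pow(x,n-1)) ):
    # build a "*" node whose operands are the already computed trees.
    return ("app", ("var", "*"), [x, pow_tree(x, n - 1)])


def show(tree):
    """Render a tree back to source text, ignoring grouping."""
    if tree[0] == "var":
        return tree[1]
    _, op, (lhs, rhs) = tree
    return show(lhs) + op[1] + show(rhs)


expanded = show(pow_tree(("var", "x"), 10))
print(expanded)
```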
<p>
For the table form in which syntax trees are passed, see the
<a href="http://www.kmonos.net/repos/polemy/doc/tip/doc/ast.html">polemy.ast</a>
page of the source documentation. For example, the <code>Var</code> class, which represents a variable name,
has two members, counting the inherited one:
<tt><a href="http://www.kmonos.net/repos/polemy/doc/tip/doc/failure.html">LexPosition</a> pos;</tt>
and <tt>string name;</tt>, so
</p>
<pre>
    { is:   "Var",
      pos:  {filename:"foo.pmy", lineno:123, column:45},
      name: "x" }
</pre>
<p>
it becomes a table like this.
The class name goes into the <tt>is</tt> field, and member variables are stored under their own names.
Array members arrive as cons lists.
</p>
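<p>
The mapping from syntax tree objects to tables can be sketched in Python with dataclasses. The classes below are hypothetical stand-ins for the real AST classes, and <tt>to_table</tt> is an illustrative helper; only the member names shown in the table above are taken from the text.
</p>

```python
from dataclasses import dataclass, fields, is_dataclass

# Hypothetical stand-ins for the AST classes, for illustration only.

@dataclass
class LexPosition:
    filename: str
    lineno: int
    column: int

@dataclass
class Var:
    pos: LexPosition
    name: str

@dataclass
class App:            # illustrative node with an array member
    pos: LexPosition
    fun: object
    args: list


def to_table(node):
    """Convert a node to nested dicts in the table layout described above."""
    if isinstance(node, list):
        # Array members become cons lists, built from the back.
        table = {}
        for item in reversed(node):
            table = {"car": to_table(item), "cdr": table}
        return table
    if isinstance(node, LexPosition):
        return {f.name: getattr(node, f.name) for f in fields(node)}
    if is_dataclass(node):
        table = {"is": type(node).__name__}   # class name goes into "is"
        for f in fields(node):
            table[f.name] = to_table(getattr(node, f.name))
        return table
    return node  # plain values such as strings and integers


pos = LexPosition("foo.pmy", 123, 45)
var_table = to_table(Var(pos, "x"))
print(var_table)
```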
))
$(SECTION Subtle Behavior, $(SECBODY
<pre>
   About the (rawmacro) layer

      [[limitations]]

   This @macro layer is a very primitive one, and not a perfect macro language.
   Two major limitations can be seen in the following "it" example.

     >> @macro LetItBe(x, y) { let it = x in y };

   The variable name is not hygienic, and so without any effort, the syntax tree "y"
   can access the outer variable "it".

     >> def foo() { LetItBe( 1+2+3, it*it ) }
     >> foo()
     36

   Of course, this is not just a limitation; it can sometimes allow us to write
   many interesting macros.

   The other problem is that the macro expansion is only done at function startup.
   So 

     >> LetItBe( 1+2+3, it*it )
     ...\value.d(173): [<REPL>:24:1] variable LetItBe is not set in layer @value

   you cannot directly use the macro in the same scope as the definition.
   You need to wrap it up in a function (like the foo() in the above example).
</pre>
))
)
))


$(SECTION Built-in Primitives, $(SECBODY
<p>
A list of built-in functions and variables.
</p>
$(DDOC_MEMBERS

$(SECTION Table Operations, $(SECBODY
  $(TABLE
    $(TR $(TH {}) $(TD ()) $(TD Creates an empty table))
    $(TR $(TH .) $(TD (t, s)) $(TD Gets the value of the field named s of table t, or <tt>undefined</tt> if there is none))
    $(TR $(TH .?) $(TD (t, s)) $(TD 1 if table t has a field named s, 0 otherwise))
    $(TR $(TH .=) $(TD (t, s, v)) $(TD Creates a table whose parent is t and whose field named s holds v))
  )
))
<br />

$(SECTION Control Flow, $(SECBODY
  $(TABLE
    $(TR $(TH if) $(TD (n, ft, fe)) $(TD Runs <tt>ft()</tt> if n is non-zero, <tt>fe()</tt> if n is 0))
  )
))
<br />

$(SECTION Operations, $(SECBODY
  $(TABLE
    $(TR $(TH +) $(TD (n, m)) $(TD Returns the sum of integers n and m))
    $(TR $(TH -) $(TD (n, m)) $(TD Integer subtraction))
    $(TR $(TH *) $(TD (n, m)) $(TD Integer multiplication))
    $(TR $(TH /) $(TD (n, m)) $(TD Integer division))
    $(TR $(TH %) $(TD (n, m)) $(TD Integer remainder))
    $(TR $(TH &amp;&amp;) $(TD (n, m)) $(TD 1 if integers n and m are both non-zero, 0 otherwise))
    $(TR $(TH ||) $(TD (n, m)) $(TD 1 if either of integers n and m is non-zero, 0 otherwise))
    $(TR $(TH ~) $(TD (a, b)) $(TD Converts a and b to strings and concatenates them))
    $(TR $(TH &lt;) $(TD (a, b)) $(TD Compares a and b))
    $(TR $(TH &lt;=) $(TD (a, b)) $(TD Compares a and b))
    $(TR $(TH &gt;) $(TD (a, b)) $(TD Compares a and b))
    $(TR $(TH &gt;=) $(TD (a, b)) $(TD Compares a and b))
    $(TR $(TH ==) $(TD (a, b)) $(TD Compares a and b))
    $(TR $(TH !=) $(TD (a, b)) $(TD Compares a and b))
  )
<p>
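Note that, as a matter of the author's taste, <tt>&amp;&amp;</tt> and <tt>||</tt> do not use short-circuit evaluation.
The set of integer operations is small because std.bigint in the D language does not support bitwise operations and the like.
Strings support only concatenation simply out of laziness.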
</p>
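<p>
The difference between a primitive that receives already evaluated operands and <tt>if</tt>, which receives its branches as functions, can be illustrated in Python. This is a toy model; <tt>noisy</tt>, <tt>and_</tt> and <tt>if_</tt> are hypothetical names.
</p>

```python
# A toy model: primitives are ordinary functions, so their arguments are
# evaluated before the call. The names below are hypothetical.

log = []

def noisy(tag, value):
    """Record that an operand was evaluated, then return it."""
    log.append(tag)
    return value

def and_(n, m):
    """Model of the logical-and primitive on integers."""
    return 1 if n != 0 and m != 0 else 0

def if_(n, ft, fe):
    """Model of the if primitive: branches arrive as zero-argument functions."""
    return ft() if n != 0 else fe()

both = and_(noisy("left", 0), noisy("right", 1))   # the right operand still runs
chosen = if_(0, lambda: noisy("then", "a"), lambda: noisy("else", "b"))
print(both, chosen, log)
```

<p>
If short-circuit behavior is wanted, it can be expressed with <tt>if</tt>, since only the selected branch function is called.
</p>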
))

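$(SECTION Interaction with the Outside World, $(SECBODY
  $(TABLE
    $(TR $(TH print) $(TD (a)) $(TD Converts a to a string and prints it to standard output followed by a newline))
    $(TR $(TH argv) $(TD ) $(TD Cons list of the argument strings passed to the script))
    $(TR $(TH gensym) $(TD ()) $(TD A pseudo-gensym. Returns a string that is unlikely to clash with other variable names))
    $(TR $(TH rand) $(TD (n)) $(TD Randomly generates a natural number of at most 31 bits that is at least 0 and less than n))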
  )
))
<br />

$(SECTION Data Type Predicates, $(SECBODY
  $(TABLE
    $(TR $(TH _isint) $(TD (a)) $(TD 1 if a is an integer, 0 otherwise))
    $(TR $(TH _isstr) $(TD (a)) $(TD 1 if a is a string, 0 otherwise))
    $(TR $(TH _isfun) $(TD (a)) $(TD 1 if a is a function, 0 otherwise))
    $(TR $(TH _istable) $(TD (a)) $(TD 1 if a is a table, 0 otherwise))
    $(TR $(TH _isundefined) $(TD (a)) $(TD 1 if a is the undefined value, 0 otherwise))
  )
))
)
))

)
Macros:
    TITLE=Polemy Reference Manual
    DOCFILENAME=index.html
    SECTION=$(DDOC_DECL $(DDOC_PSYMBOL $1)) $(DDOC_DECL_DD $2)
    SECBODY=$0