
Introducing Julia/Metaprogramming




What is metaprogramming?

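Metaprogramming is when you write Julia code to process and modify Julia code. With the metaprogramming tools, you can write Julia code that modifies other parts of your source files, and even control whether and when the modified code runs.

In Julia, the execution of raw source code takes place in two stages. (In reality there are more stages than this, but at this point we'll focus on just these two.)

Stage 1 is when your raw Julia code is parsed: converted into a form that is suitable for evaluation. You'll be familiar with this stage, because this is when all your syntax mistakes are noticed. The result is the Abstract Syntax Tree (AST), a structure that contains all your code, but in a format that is easier to manipulate than the human-friendly syntax normally used.

Stage 2 is when that parsed code is executed. Usually, when you type code into the REPL and press Enter, or when you run a Julia file from the command line, you don't notice the two stages, because they happen so quickly. However, with Julia's metaprogramming facilities, you can access the code after it has been parsed but before it is executed.

This lets you do things that you can't normally do. For example, you can convert simple expressions to more complicated expressions, or examine code before it runs and change it so that it runs faster. Any code that you intercept and modify using these metaprogramming tools is eventually executed in the usual way, running as fast as ordinary Julia code.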
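The two stages can be observed separately. The following is a minimal sketch, assuming a Julia 1.x session; Meta.parse is a standard function that this chapter does not otherwise use, and the variable names are only illustrative.

```julia
# Stage 1: parse source text into an expression. Nothing is executed yet.
src = "1 + 2 * 3"
parsed = Meta.parse(src)

# The parsed form is ordinary data that can be inspected before it runs.
parsed_type = typeof(parsed)    # Expr
top_operator = parsed.args[1]   # the outermost operation in the parsed tree

# Stage 2: evaluate the parsed expression.
result = eval(parsed)
```

Between the two stages, parsed can be examined or changed like any other Julia value.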


You've probably already used two existing examples of metaprogramming in Julia:

- the @time macro:

julia> @time [sin(cos(i)) for i in 1:100000];
0.102967 seconds (208.65 k allocations: 9.838 MiB)

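The @time macro inserts a "start the stopwatch" command before the expression you pass in, and adds a "stop the stopwatch" command after the code has finished. It then does some calculations to report the elapsed time and memory usage.

- the @which macro: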

julia> @which 2 + 2
+(x::T, y::T) where T<:Union{Int128, Int16, Int32, Int64, Int8, UInt128, UInt16, UInt32, UInt64, UInt8} in Base at int.jl:53

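This macro doesn't evaluate the expression 2 + 2 at all. Instead, it reports which method would be used for these particular arguments. It also tells you the source file and line number containing the method's definition.

Other uses for metaprogramming include automating tedious coding jobs by writing short pieces of code that produce larger chunks of code, and improving the performance of "standard" code by generating the faster code that you perhaps wouldn't want to write by hand.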

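Quoted expressions

For metaprogramming to be possible, Julia needs a way to store the unevaluated but parsed expression once the parsing stage has finished. This is the job of the ':' (colon) prefix operator: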

julia> x = 3
3

julia> :x
:x 

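In Julia, :x represents an unevaluated or quoted symbol.

(If you're unfamiliar with the use of quoted symbols in programming, think of how quotation marks are used in writing to distinguish ordinary use from special use. For example, in the sentence:

'Copper' contains six letters.

the quotation marks indicate that the word 'Copper' refers not to the metal but to the word itself. In the same way, in :x, the colon before the symbol makes you and Julia treat 'x' as an unevaluated symbol rather than as the value 3.)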
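As a small illustration of the difference between a symbol and the value it names, here is a sketch; the variable names are chosen only for this example.

```julia
x = 3

sym = :x                    # the quoted symbol, not the value 3
same_symbol = Symbol("x")   # a symbol can also be built from a string

is_symbol = sym isa Symbol
names_match = (sym == same_symbol)
value = eval(sym)           # evaluating the symbol looks up the global x
```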

To quote whole expressions rather than individual symbols, start with a colon and then enclose the Julia expression in parentheses:

julia> :(2 + 2)
:(2 + 2)

There's an alternative form of the :( ) construction that uses the quote ... end keywords to enclose and quote an expression:

quote
   2 + 2
end

which returns:

quote
    #= REPL[123]:2 =#
    2 + 2
end

and this expression:

expression = quote
   for i = 1:10
      println(i)
   end
end

returns:

quote
    #= REPL[124]:2 =#
    for i = 1:10
        #= REPL[124]:3 =#
        println(i)
    end
end

The expression object has the type Expr:

julia> typeof(expression)
Expr

It's parsed and ready for whatever comes next.
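The two quoting forms hold the same code; the quote ... end form just wraps it in a block with line-number information. The sketch below checks this. It assumes Base.remove_linenums!, an unexported helper in Base, is available to strip the line-number nodes.

```julia
short_form = :(2 + 2)
long_form = quote
    2 + 2
end

# The quote ... end form produces a block that also records line numbers.
long_head = long_form.head   # :block

# Strip the line-number nodes from a copy, leaving only the code itself.
stripped = Base.remove_linenums!(deepcopy(long_form))
same_code = (stripped.args[1] == short_form)
```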

Evaluating expressions

There's also a function, eval(), for evaluating an unevaluated expression:

julia> eval(:x)
3
julia> eval(:(2 + 2))
4
julia> eval(expression)
1
2
3
4
5
6
7
8
9
10

With these tools, it's possible to create any expression and store it without having it evaluated:

e = :(
    for i in 1:10
        println(i)
    end
)

which returns:

:(for i = 1:10 # line 2:
    println(i)
end)

and then evaluate it later:

julia> eval(e)
1
2
3
4
5
6
7
8
9
10

More usefully, it's possible to modify the contents of an expression before it's evaluated.
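Because expressions are ordinary values, they can be kept in a collection and evaluated whenever you choose. A minimal sketch (the Expr(...) constructor call is an alternative way of writing the quoted form :(10 - 4)):

```julia
# Store several unevaluated expressions in an array.
stored = [:(1 + 1), :(2 * 3), Expr(:call, :-, 10, 4)]

# Nothing has run yet; evaluate them all later, in order.
results = [eval(e) for e in stored]
```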

Inside expressions

Once you have Julia code inside an unevaluated expression, rather than as a piece of text in a string, you can do things with it.

Here's another expression:

P = quote
   a = 2
   b = 3
   c = 4
   d = 5
   e = sum([a,b,c,d])
end

which returns:

quote
    #= REPL[125]:2 =#
    a = 2
    #= REPL[125]:3 =#
    b = 3
    #= REPL[125]:4 =#
    c = 4
    #= REPL[125]:5 =#
    d = 5
    #= REPL[125]:6 =#
    e = sum([a, b, c, d])
end

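Notice the helpful line numbers that have been added to each line of the quoted expression. (Each label appears on the line before the statement it refers to.)

We can use the fieldnames() function to see what's inside the expression:

julia> fieldnames(typeof(P))
(:head, :args)

The head field is :block, and the args field is an array containing the expressions (including the line-number comments). We can examine these using the usual Julia techniques.

For example, what's the second subexpression?

julia> P.args[2]
:(a = 2)

Print them all out: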

for (n, expr) in enumerate(P.args)
    println(n, ": ", expr)
end
1: #= REPL[125]:2 =#
2: a = 2
3: #= REPL[125]:3 =#
4: b = 3
5: #= REPL[125]:4 =#
6: c = 4
7: #= REPL[125]:5 =#
8: d = 5
9: #= REPL[125]:6 =#
10: e = sum([a, b, c, d])

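As you can see, the expression P contains a number of subexpressions. We can modify this expression quite easily; for example, we can change the last line of the expression to use prod() rather than sum(), so that, when P is evaluated, it returns the product rather than the sum of the variables.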

julia> eval(P)
14

julia> P.args[end] = quote prod([a,b,c,d]) end
quote                  
   #= REPL[133]:1 =#  
   prod([a, b, c, d]) 
end                   

julia> eval(P)
120

Alternatively, starting again from the original P, you can target the sum() symbol directly by digging into the expression:

julia> P.args[end].args[end].args[1]
:sum

julia> P.args[end].args[end].args[1] = :prod
:prod

julia> eval(P)
120
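The same kind of change can be made without knowing where in the expression the symbol sits. The following sketch defines a hypothetical helper, replace_symbol (not part of Julia), that walks an expression and builds a copy with one symbol swapped for another.

```julia
# Anything that is not a Symbol or an Expr (numbers, line-number nodes) is kept as is.
replace_symbol(x, from, to) = x
replace_symbol(s::Symbol, from, to) = (s == from ? to : s)
# For an Expr, rebuild it with the same head and recursively processed arguments.
replace_symbol(ex::Expr, from, to) =
    Expr(ex.head, (replace_symbol(a, from, to) for a in ex.args)...)

original = quote
    a = 2
    b = 3
    c = 4
    d = 5
    e = sum([a, b, c, d])
end

swapped = replace_symbol(original, :sum, :prod)

sum_result = eval(original)   # the original still uses sum
prod_result = eval(swapped)   # the copy uses prod
```

Building a new expression rather than mutating the old one leaves the original available for reuse.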

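The Abstract Syntax Tree

This representation of code after it has been parsed is called the AST (Abstract Syntax Tree). It's a nested hierarchical structure that lets both you and Julia process and modify code easily.

The very useful dump function lets you visualize the hierarchical nature of an expression. For example, the expression :(1 * sin(pi/2)) is represented like this: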

julia> dump(:(1 * sin(pi/2)))
Expr
  head: Symbol call
  args: Array{Any}((3,))
    1: Symbol *
    2: Int64 1
    3: Expr
      head: Symbol call
      args: Array{Any}((2,))
        1: Symbol sin
        2: Expr
          head: Symbol call
          args: Array{Any}((3,))
            1: Symbol /
            2: Symbol pi
            3: Int64 2

You can see that the AST consists entirely of Exprs and atoms (such as symbols and numbers).
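Since the tree contains only Exprs and atoms, a short recursive function is enough to visit every node. As a sketch, the hypothetical helper collect_symbols! below gathers every symbol in an expression, in the order dump would show them.

```julia
# Walk the tree depth-first, recording each Symbol that is encountered.
function collect_symbols!(found, node)
    if node isa Symbol
        push!(found, node)
    elseif node isa Expr
        for arg in node.args
            collect_symbols!(found, arg)
        end
    end
    # Other atoms, such as numbers, are ignored.
    return found
end

syms = collect_symbols!(Symbol[], :(1 * sin(pi/2)))
```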

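Expression interpolation

In some ways, strings and expressions are similar: any Julia code they contain is usually unevaluated, but you can use interpolation to evaluate some of the code within a quoted expression. We've already met the string interpolation operator, the dollar sign ($). When it is used inside a string (possibly with parentheses enclosing the expression to be interpolated), the Julia code is evaluated and the resulting value is inserted into the string: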

julia> "the sine of 1 is $(sin(1))"
"the sine of 1 is 0.8414709848078965"

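In the same way, you can use the dollar sign to insert the result of executing some Julia code into an expression (code that would otherwise be quoted and left unevaluated):

julia> quote s = $(sin(1) + cos(1)); end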
quote  # none, line 1:
    s = 1.3817732906760363
end

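Even though this is a quoted expression, and therefore unevaluated, the sin(1) + cos(1) inside it was evaluated, and its value was inserted into the expression in place of the original code. This operation is called "splicing".

As with string interpolation, the parentheses are needed only when you want to insert the value of an expression; the value of a single symbol can be inserted with just $.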
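A short sketch of both forms, plus a case where the spliced value is itself a symbol that becomes a variable name in the resulting expression (the names x and name are only illustrative):

```julia
x = 3
name = :y

with_symbol_value = :(1 + $x)       # a single symbol: no parentheses needed
with_expression = :(1 + $(x * 2))   # an expression: parentheses required
assignment = :($name = $x)          # the symbol held in name becomes the target
```

Evaluating assignment would then define a global y with the value 3.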

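Macros

Now that you know how to create and handle unevaluated Julia expressions, you'll want to know how you can modify them. A macro is one way of generating a new output expression from an unevaluated input expression. When your Julia program runs, it first parses and evaluates the macro, and the code that the macro produces is then evaluated like an ordinary expression.

Here's the definition of a simple macro that prints out the contents of the thing you pass to it, and then returns the expression unchanged to the caller (here, the REPL). The syntax is very similar to the way you define functions: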

macro p(n)
    # n is the unevaluated argument; print its parts only if it is an expression
    if n isa Expr
        println(n.args)
    end
    return n
end

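You run a macro by prefixing its name with @. This macro expects a single argument, and you supply the unevaluated Julia code directly. You don't have to enclose the argument in parentheses, as you would for a function call.

First, try it with a number as the argument: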

julia> @p 3
3

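A number isn't an expression, so the if condition inside the macro is false, and the macro simply returns n. But if you pass an expression, the code inside the macro gets the opportunity to inspect or process the contents of the expression, via its .args field, before the expression is evaluated: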

julia> @p 3 + 4 - 5 * 6 / 7 % 8
Any[:-,:(3 + 4),:(((5 * 6) / 7) % 8)]
2.7142857142857144

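In this case, the if condition was true, and the arguments of the incoming expression were printed in their unevaluated form. So you can see the AST structure: the result of parsing the Julia expression, which is then evaluated to give the expression's value. You can also see that the parser takes the different precedence of the arithmetic operators into account. Notice that the top-level operator and the subexpressions are quoted with colons (:).

Also notice that the macro p returned the argument, which was then evaluated, hence the 2.7142857142857144. But it doesn't have to; it could return a quoted expression instead.

As an example, the built-in @time macro returns a quoted expression rather than using eval() to evaluate the expression inside the macro. The quoted expression returned by @time is evaluated in the calling context when the macro has done its work. Here's a simplified version of its definition (the version in Base does more, such as reporting memory allocation):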

macro time(ex)
    quote
        local t0 = time()
        local val = $(esc(ex))
        local t1 = time()
        println("elapsed time: ", t1-t0, " seconds")
        val
    end
end

Notice the $(esc(ex)) expression. This is the way that you 'escape' the code you want to time, which is in ex, so that the macro leaves it intact and its variables are resolved in the calling context, where the entire returned expression is executed. If this just said $ex, the expression would still be spliced in, but its variables would be treated as belonging to the macro: global names would be looked up in the module where the macro is defined, and any variables it assigns to would be renamed.
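To see that the escaped code really does run in the caller's context, here is a sketch of a variant macro, given the hypothetical name @timed_value, that returns the value together with the elapsed time. The expression passed to it refers to a local variable of the calling function.

```julia
macro timed_value(ex)
    quote
        local t0 = time()
        local val = $(esc(ex))   # the caller's code, run in the caller's scope
        (val, time() - t0)
    end
end

function sum_to(n)
    # The escaped expression can see the local variable n.
    @timed_value sum(1:n)
end

total, seconds = sum_to(10)
```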

If you want to pass a multi-line expression to a macro, use the begin ... end form:

@p begin
    2 + 2 - 3
end
Any[:( # none, line 2:),:((2 + 2) - 3)]
1

(You can also call macros with parentheses similar to the way you do when calling functions, using the parentheses to enclose the arguments:

julia> @p(2 + 3 + 4 - 5)
Any[:-,:(2 + 3 + 4),5]
4

This would allow you to define macros that accepted more than one expression as arguments.)
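As a sketch of a macro that takes two arguments, the hypothetical @swap below exchanges the values of two variables. The whole returned expression is escaped so that both names refer to the caller's variables.

```julia
macro swap(a, b)
    # Build the tuple assignment (a, b) = (b, a) from the two argument expressions.
    esc(:(($a, $b) = ($b, $a)))
end

function swapped_pair()
    p, q = 1, 2
    @swap(p, q)
    return (p, q)
end

pair = swapped_pair()
```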

eval() and @eval


There's an eval() function, and an @eval macro. You might be wondering what the difference between the two is.

julia> ex = :(2 + 2)
:(2 + 2) 

julia> eval(ex)
4

julia> @eval ex
:(2 + 2)

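The function version, eval(), evaluates its argument in the usual way, so ex is first replaced by the expression stored in it, :(2 + 2), and that expression is then evaluated. The macro version receives the code you typed, here just the symbol ex, without evaluating it first, so evaluating that code simply returns the value of ex, which is the stored expression. You can use the interpolation syntax to splice the stored expression into the code that the macro receives: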

julia> @eval $(ex)
4

In other words:

julia> @eval $(ex) == eval(ex)
true

Here's an example where you might want to create some variables using some automation. We'll create the first ten squares and ten cubes, first using eval():

for i in 1:10
   symbolname = Symbol("var_squares_$(i)")
   eval(quote $symbolname = $(i^2) end)
end

which creates a load of variables named var_squares_n, such as:

julia> var_squares_5
25

and then using @eval:

for i in 1:10
   symbolname = Symbol("var_cubes_$(i)")
   @eval $symbolname = $(i^3)
end

which similarly creates a load of variables named var_cubes_n, such as:

julia> var_cubes_5
125

Once you feel confident, you might prefer to write like this:

julia> [@eval $(Symbol("var_squares_$(i)")) = ($i^2) for i in 1:10]
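The same technique is often used to generate families of similar functions rather than variables. The sketch below defines degree-based versions of sin and cos; the names sin_deg and cos_deg are hypothetical and chosen only for this example.

```julia
for op in (:sin, :cos)
    fname = Symbol(op, "_deg")   # builds :sin_deg, then :cos_deg
    # Splice the new name and the underlying function into a method definition.
    @eval ($fname)(x) = ($op)(deg2rad(x))
end

right_angle_sine = sin_deg(90)
zero_cosine = cos_deg(0)
```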

Scope and context


When you use macros, you have to keep an eye out for scoping issues. In the @time example above, the $(esc(ex)) syntax was used to prevent the expression from being evaluated in the wrong context. Here's another contrived example to illustrate this point.

macro f(x)
    quote
        s = 4
        (s, $(esc(:s)))   # esc(:s) refers to s in the calling context
    end
end

This macro declares a variable s, and returns a quoted expression containing s and an escaped version of s.

Now, outside the macro, define a variable s:

julia> s = 0

Run the macro:

julia> @f 2
(4,0)

You can see that the macro returned different values for the symbol s: the first was the value inside the macro's context, 4, the second was an escaped version of s, that was evaluated in the calling context, where s has the value 0. In a sense, esc() has protected the value of s as it passes unharmed through the macro. For the more realistic @time example, it's important that the expression you want to time isn't modified in any way by the macro.
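The effect of esc() on assignment can be seen by comparing two otherwise identical macros. This is a sketch with hypothetical macro names; it runs inside a function so that y is an ordinary local variable.

```julia
macro assign_hygienic(val)
    :(y = $val)        # y belongs to the macro and is renamed on expansion
end

macro assign_escaped(val)
    esc(:(y = $val))   # y is the caller's y
end

function hygiene_demo()
    y = 0
    @assign_hygienic 1
    after_hygienic = y   # the caller's y was not touched
    @assign_escaped 2
    after_escaped = y    # the caller's y was assigned
    return (after_hygienic, after_escaped)
end

demo_result = hygiene_demo()
```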

Expanding macros


To see what the macro expands to just before it's finally executed, use the macroexpand() function. It expects a quoted expression containing one or more macro calls, which are then expanded into proper Julia code for you so that you can see what the macro would do when called.

julia> macroexpand(Main, quote @p 3 + 4 - 5 * 6 / 7 % 8 end)
Any[:-,:(3 + 4),:(((5 * 6) / 7) % 8)]
quote
   #= REPL[158]:1 =#
   (3 + 4) - ((5 * 6) / 7) % 8
end

(The #= REPL[158]:1 =# is a filename and line number reference that's more useful when used inside a source file than when you're using the REPL.)

Here's another example. This macro adds a dotimes construction to the language.

macro dotimes(n, body)
    quote
        for i = 1:$(esc(n))
            $(esc(body))
        end
    end
end

This is used as follows:

julia> @dotimes 3 println("hi there")
hi there
hi there
hi there

Or, less likely, like this:

julia> @dotimes 3 begin    
   for i in 4:6            
       println("i is $i")  
   end                     
end                        
i is 4
i is 5
i is 6
i is 4
i is 5
i is 6
i is 4
i is 5
i is 6

If you use macroexpand() on this, you can see what happens to the symbol names:

macroexpand(Main, # we're working in the Main module
    quote  
        @dotimes 3 begin
            for i in 4:6
                println("i is $i")
            end
        end
    end 
)

with the following output:

quote
    #= REPL[160]:3 =#
    begin
        #= REPL[159]:3 =#
        for #101#i = 1:3
            #= REPL[159]:4 =#
            begin
                #= REPL[160]:4 =#
                for i = 4:6
                    #= REPL[160]:5 =#
                    println("i is $(i)")
                end
            end
        end
    end
end

The i local to the macro itself has been renamed to #101#i, so as not to clash with the original i in the code we passed to it.
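The renaming can also be checked in code rather than by reading the printed output. The sketch below repeats the dotimes definition so that it is self-contained, expands a call in the current module (via @__MODULE__ rather than Main), and picks out the loop variable; the generated name will differ between sessions.

```julia
macro dotimes(n, body)
    quote
        for i = 1:$(esc(n))
            $(esc(body))
        end
    end
end

expanded = macroexpand(@__MODULE__, :(@dotimes 2 println("hi")))

# Find the for loop inside the expanded block, skipping line-number nodes.
loop = first(a for a in expanded.args if a isa Expr && a.head == :for)

# The first argument of a for expression is the assignment `variable = range`.
loop_variable = loop.args[1].args[1]
was_renamed = (loop_variable != :i)
```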

A more useful example: @until


Here's how to define a macro that is more likely to be useful in your code.

Julia doesn't have an until condition ... do some stuff ... end statement. Perhaps you'd like to type something like this:

until x > 100
    println(x)
end

You'll be able to write your code using the new until macro like this:

until <condition>
    <block_of_stuff>
end

but, behind the scenes, the work will be done by actual code with the following structure:

while true
    <block_of_stuff>
    if <condition>
        break
    end
end

This forms the body of the new macro, and it will be enclosed in a quote ... end block, like this, so that it executes when evaluated, but not before:

quote
    while true
        <block_of_stuff>
        if <condition>
            break
        end
    end
end

So the nearly-finished macro code is like this:

macro until(<condition>, <block_of_stuff>)
    quote
        while true
            <block_of_stuff>
            if <condition>
                break
            end
        end
    end
end

All that remains to be done is to work out how to pass in our code for the <block_of_stuff> and the <condition> parts of the macro. Recall that $(esc(...)) allows code to pass through 'escaped', that is, untouched by the macro's renaming of variables, so that it is evaluated in the calling context. We'll escape the condition and block code so that they refer to the caller's variables when the generated loop runs.

The final macro definition is therefore:

macro until(condition, block)
    quote
        while true
            $(esc(block))
            if $(esc(condition))
                break
            end
        end
    end
end

The new macro is used like this:

julia> i = 0
0

julia> @until i == 10 begin   
           global i += 1               
           println(i)          
       end                      
1
2
3
4
5
6
7
8
9
10

or

julia> x = 5
5

julia> @until x < 1 (println(x); global x -= 1)
5
4
3
2
1
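The global keyword in these examples is needed only because they run at the top level of the REPL. Inside a function, the escaped block updates the function's local variables directly, as in this sketch (count_up is a hypothetical function; the macro definition is repeated so that the example is self-contained):

```julia
macro until(condition, block)
    quote
        while true
            $(esc(block))
            if $(esc(condition))
                break
            end
        end
    end
end

function count_up(limit)
    n = 0
    @until n >= limit begin
        n += 1   # n is a local variable of count_up, so no `global` is needed
    end
    return n
end

final_count = count_up(5)
```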

Under the hood


If you want a more complete explanation of the compilation process than that provided here, visit the links shown in Further Reading, below.

Julia performs multiple 'passes' to transform your code to native assembly code. As described above, the first pass parses the Julia code and builds the 'surface-syntax' AST, suitable for manipulation by macros. A second pass lowers this high-level AST into an intermediate representation, which is used by type inference and code generation. In this intermediate AST format all macros have been expanded and all control flow has been converted to explicit branches and sequences of statements. At this stage the Julia compiler attempts to determine the types of all variables so that the most suitable method of a generic function (which can have many methods) is selected.
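You can glimpse the second pass from within Julia. The sketch below assumes that Meta.lower, which takes a module and an expression, is available as in Julia 1.x; the exact form of what it returns is an implementation detail and may vary between versions.

```julia
# The surface-syntax AST still has a `for` node at the top.
surface = :(for i in 1:3
    println(i)
end)
surface_head = surface.head

# Lowering turns the loop into explicit branches and a sequence of statements.
lowered = Meta.lower(@__MODULE__, surface)
```

Printing lowered in the REPL shows the individual statements and the goto-style branches that replace the loop.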

Further reading

  • Julia Introspects: an article about Julia's internal representation, written by Leah Hanson in 2013