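Macros
What is metaprogramming?
Metaprogramming is when you write Julia code to process and modify Julia code. With the metaprogramming tools, you can write Julia code that modifies other parts of your source files, and even control whether and when the modified code runs.
In Julia, the execution of raw source code takes place in two stages. (In reality there are more stages than this, but at this point we'll focus on just these two.)
Stage 1 is when your raw Julia code is parsed: converted into a form that is suitable for evaluation. You'll be familiar with this stage, because this is when all your syntax mistakes are noticed. The result is the abstract syntax tree, or AST, a structure that contains all your code, but in a format that is easier to manipulate than the human-friendly syntax you usually write.
Stage 2 is when that parsed code is executed. Usually, when you type code into the REPL and press Enter, or when you run a Julia file from the command line, you don't notice the two stages, because they happen so quickly. However, with Julia's metaprogramming facilities, you can access the code after it has been parsed but before it is executed.
This lets you do things that you can't normally do. For example, you can convert simple expressions to more complicated expressions, or examine code before it runs and change it so that it runs faster. Any code that you intercept and modify using these metaprogramming tools is eventually executed in the usual way, and runs as fast as ordinary Julia code.
You've probably already used two existing examples of metaprogramming in Julia: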
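- the @time macro:
julia> @time [sin(cos(i)) for i in 1:100000];
  0.102967 seconds (208.65 k allocations: 9.838 MiB)
The @time macro inserts a "start the stopwatch" command before the expression you pass in, and adds a "stop the stopwatch" command after it. The code is then evaluated, and the elapsed time and memory usage are reported.
- the @which macro:
julia> @which 2 + 2
+(x::T, y::T) where T<:Union{Int128, Int16, Int32, Int64, Int8, UInt128, UInt16, UInt32, UInt64, UInt8} in Base at int.jl:53
This macro doesn't evaluate the expression 2 + 2 at all. Instead, it reports which method would be used for these particular arguments. It also tells you the source file that contains the method's definition, and the line number.
Other uses for metaprogramming include automating tedious coding jobs by writing short pieces of code that generate larger chunks of code, and improving the performance of 'standard' code by generating the sort of faster code that you wouldn't want to write by hand.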
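Quoted expressions
For metaprogramming to be possible, Julia needs a way to store an expression that has been parsed but not yet evaluated. This is what the ':' (colon) prefix operator does:
julia> x = 3
3
julia> :x
:x
In Julia, :x is an unevaluated, or quoted, symbol.
(If you're unfamiliar with the use of quoted symbols in programming, think of how quotes are used in writing to distinguish ordinary use from special use. For example, in the sentence:
'Copper' contains six letters.
the quotes indicate that the word 'Copper' is not a reference to the metal, but to the word itself. In the same way, in :x, the colon before the symbol makes you and Julia think of 'x' as an unevaluated symbol rather than as the value 3.)
To quote a whole expression rather than a single symbol, start with a colon and then enclose the Julia expression in parentheses:
julia> :(2 + 2)
:(2 + 2)
There's an alternative form of the :( ) construction that uses the quote ... end keywords to enclose and quote an expression: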
quote
    2 + 2
end
returns:
quote
    #= REPL[123]:2 =#
    2 + 2
end
and this expression:
expression = quote
    for i = 1:10
        println(i)
    end
end
returns:
quote
    #= REPL[124]:2 =#
    for i = 1:10
        #= REPL[124]:3 =#
        println(i)
    end
end
The expression object is of type Expr:
julia> typeof(expression)
Expr
It has been parsed, and it's ready for whatever comes next.
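As a quick check of the quoting forms described above, here is a minimal sketch that compares them with parsing a string of source code. Meta.parse is the standard function for turning a string into an expression; the variable names are only illustrative.

```julia
# Two ways to obtain the same unevaluated expression.
ex_colon  = :(2 + 2)               # colon-and-parentheses quoting
ex_parsed = Meta.parse("2 + 2")    # parsing a string of source code

# Expr values compare structurally (head and args), so these are equal.
println(ex_colon == ex_parsed)     # true

# A quote ... end block wraps its contents in a :block expression
# that also carries line number nodes.
ex_block = quote
    2 + 2
end
println(ex_block.head)                        # block
println(typeof(:x), " ", typeof(ex_colon))    # Symbol Expr
```

The quote ... end form is usually the more convenient one for multi-line code, while :( ) is handy for short one-liners.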
Evaluating expressions
There's also a function, eval(), for evaluating an unevaluated expression:
julia> eval(:x)
3
julia> eval(:(2 + 2))
4
julia> eval(expression)
1
2
3
4
5
6
7
8
9
10
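With these tools, you can create and store any expression without evaluating it:
e = :(
    for i in 1:10
        println(i)
    end
)
which returns:
:(for i = 1:10 # line 2:
        println(i)
    end)
and then evaluate it later:
julia> eval(e)
1
2
3
4
5
6
7
8
9
10
More usefully, you can modify the contents of an expression before it is evaluated.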
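One detail worth knowing about eval() is that it always evaluates in the global scope of a module, never in the local scope of the function that calls it. The following sketch illustrates this; the function name shadowed is made up for the example.

```julia
x = 3                      # a global variable

function shadowed()
    x = 100                # a new local x; the global x is untouched
    return eval(:x)        # eval() looks x up in the module's global scope
end

println(shadowed())        # 3, not 100
```

This is one reason why macros normally return expressions for the caller to run, rather than calling eval() themselves.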
Inside expressions
Once you have Julia code in an unevaluated expression, rather than as a piece of text in a string, you can do things with it.
Here's another expression:
P = quote
    a = 2
    b = 3
    c = 4
    d = 5
    e = sum([a,b,c,d])
end
which returns:
quote
#= REPL[125]:2 =#
a = 2
#= REPL[125]:3 =#
b = 3
#= REPL[125]:4 =#
c = 4
#= REPL[125]:5 =#
d = 5
#= REPL[125]:6 =#
e = sum([a, b, c, d])
end
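Notice the helpful line numbers that have been added to the quoted expression. (Each label appears on its own line, just before the line of code it refers to.)
We can use the fieldnames() function to see what's inside the expression:
julia> fieldnames(typeof(P))
(:head, :args)
The head field is :block, and the args field is an array containing the sub-expressions (including the line number annotations). We can examine these using the usual Julia techniques. For example, what's the second sub-expression?
julia> P.args[2]
:(a = 2)
Print them all out:
for (n, expr) in enumerate(P.args)
    println(n, ": ", expr)
end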
1: #= REPL[125]:2 =#
2: a = 2
3: #= REPL[125]:3 =#
4: b = 3
5: #= REPL[125]:4 =#
6: c = 4
7: #= REPL[125]:5 =#
8: d = 5
9: #= REPL[125]:6 =#
10: e = sum([a, b, c, d])
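As you can see, the expression P contains a number of sub-expressions. We can modify this expression quite easily; for example, we can change the last line of the expression to use prod() rather than sum(), so that when P is evaluated it returns the product rather than the sum of the variables.
julia> eval(P)
14
julia> P.args[end] = quote prod([a,b,c,d]) end
quote
    #= REPL[133]:1 =#
    prod([a, b, c, d])
end
julia> eval(P)
120
Alternatively, starting again from the original P, you can target the sum() symbol directly by going down into the expression:
julia> P.args[end].args[end].args[1]
:sum
julia> P.args[end].args[end].args[1] = :prod
:prod
julia> eval(P)
120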
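Indexing into .args by hand works for a known shape, but it gets awkward for deeper expressions. As a sketch of a more general approach, the hypothetical helper replace_symbol below (not part of Julia) walks an expression recursively and swaps one symbol for another wherever it appears.

```julia
# Hypothetical helper: return a copy of the expression with every
# occurrence of the symbol `old` replaced by `new`.
replace_symbol(x, old, new) = x                             # numbers, line number nodes, ...
replace_symbol(s::Symbol, old, new) = s == old ? new : s    # a leaf symbol
replace_symbol(ex::Expr, old, new) =                        # recurse into sub-expressions
    Expr(ex.head, [replace_symbol(a, old, new) for a in ex.args]...)

original = :(e = sum([a, b, c, d]))
modified = replace_symbol(original, :sum, :prod)

println(modified)                               # e = prod([a, b, c, d])
a, b, c, d = 2, 3, 4, 5
println(eval(original), " ", eval(modified))    # 14 120
```

Because the helper builds a new Expr rather than mutating its argument, the original expression is left unchanged.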
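The abstract syntax tree (AST)
This representation of code after it has been parsed is called the AST (abstract syntax tree). It's a nested hierarchical structure that's designed to allow both you and Julia to process and modify the code easily.
The very useful dump function lets you visualize the hierarchical nature of an expression. For example, the expression :(1 * sin(pi/2)) is represented like this:
julia> dump(:(1 * sin(pi/2)))
Expr
  head: Symbol call
  args: Array{Any}((3,))
    1: Symbol *
    2: Int64 1
    3: Expr
      head: Symbol call
      args: Array{Any}((2,))
        1: Symbol sin
        2: Expr
          head: Symbol call
          args: Array{Any}((3,))
            1: Symbol /
            2: Symbol pi
            3: Int64 2
You can see that the AST consists entirely of Exprs and atoms (such as symbols and numbers).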
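Since an AST is just nested Expr objects, you can also build one by hand with the Expr constructor instead of quoting. This small sketch rebuilds the tree shown by dump(), innermost call first; the variable names are only illustrative.

```julia
# Each Expr takes a head followed by its arguments.
inner  = Expr(:call, :/, :pi, 2)          # pi / 2
middle = Expr(:call, :sin, inner)         # sin(pi / 2)
outer  = Expr(:call, :*, 1, middle)       # 1 * sin(pi / 2)

println(outer == :(1 * sin(pi/2)))        # true
println(eval(outer))                      # 1.0
```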
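Expression interpolation
In a way, strings and expressions are similar: any Julia code they contain is usually left unevaluated, but you can use interpolation to evaluate some of the code inside a quoted expression. We've already met the string interpolation operator, the dollar sign ($). When used inside a string (possibly with parentheses enclosing the expression to be interpolated), it evaluates the Julia code and inserts the resulting value into the string:
julia> "the sine of 1 is $(sin(1))"
"the sine of 1 is 0.8414709848078965"
In the same way, you can use the dollar sign to insert the result of running some Julia code into an expression (code that would otherwise be quoted rather than evaluated):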
julia> quote s = $(sin(1) + cos(1)); end
quote # none, line 1:
s = 1.3817732906760363
end
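Although this is a quoted expression, and therefore unevaluated, the sin(1) + cos(1) inside it was evaluated, and its value was inserted into the expression in place of the original code. This operation is called "splicing".
As with string interpolation, the parentheses are needed only when you want to insert the value of an expression; to insert the value of a single symbol, a bare $ is enough.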
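The following sketch shows the interpolation forms side by side: a bare $ for a single variable, $( ) for an expression, and the splatting form $(xs...) for splicing several arguments at once. All the variable names are made up for the example.

```julia
n = 5
name = :y
args = [:a, :b, :c]

ex1 = :(x + $n)              # a single variable's value needs no parentheses
ex2 = :($name = $(n * 2))    # an expression's value needs parentheses
ex3 = :(f($(args...)))       # splatting splices several arguments at once

println(ex1)   # x + 5
println(ex2)   # y = 10
println(ex3)   # f(a, b, c)
```

Note that interpolating a Symbol value, as with name here, inserts the symbol itself into the expression, which is how generated code gets its variable and function names.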
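Macros
Now that you know how to create and handle unevaluated Julia expressions, you'll want to know how you can modify them. A macro is one way of generating a new output expression from an unevaluated input expression.
When Julia processes your code, it first parses it and runs any macros it finds; the code produced by each macro is then evaluated like an ordinary expression.
Here's the definition of a simple macro that prints out the contents of the expression you pass to it, and then returns that expression unchanged to the caller (here, the caller is the REPL). The syntax is very similar to the way you define functions:
macro p(n)
    if typeof(n) == Expr
        println(n.args)
    end
    return n
end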
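You run a macro by prefixing its name with @. This macro expects a single argument, and you supply unevaluated Julia code directly; you don't have to enclose the argument in parentheses as you would for a function call.
First, try it with a numeric argument:
julia> @p 3
3
A number isn't an expression, so the if condition inside the macro is false, and the macro simply returns n. But if you pass an expression, the code inside the macro has the opportunity to inspect or process the contents of the expression, using its .args field, before the expression is evaluated:
julia> @p 3 + 4 - 5 * 6 / 7 % 8
Any[:-,:(3 + 4),:(((5 * 6) / 7) % 8)]
2.7142857142857144
In this case, the if condition was true, and the arguments of the incoming expression were printed in unevaluated form. So you can see the arguments as an AST: the result of parsing the Julia expression, before it is evaluated to produce a value.
You can also see how the parser takes account of the different precedence of the arithmetic operators. Notice that the top-level operator and the sub-expressions are quoted with colons (:).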
Also notice that the macro p returned its argument, which was then evaluated, hence the 2.7142857142857144. But it doesn't have to: it could return a quoted expression instead.
As an example, the built-in @time macro returns a quoted expression rather than using eval() to evaluate the expression inside the macro. The quoted expression returned by @time is evaluated in the calling context once the macro has done its work. Here's a simplified version of the definition (the real one in Base is more elaborate; it also reports allocations, for example):
macro time(ex)
    quote
        local t0 = time()
        local val = $(esc(ex))
        local t1 = time()
        println("elapsed time: ", t1-t0, " seconds")
        val
    end
end
Notice the $(esc(ex)) expression. This is the way that you 'escape' the code you want to time, which is in ex. Nothing in the returned expression is evaluated while the macro runs; the point of esc() is that the escaped code is left exactly as written, so that its names are resolved in the calling context when the returned expression is executed there. If this just said $ex, the expression would still be spliced in, but Julia's macro hygiene would resolve its global names in the module where the macro was defined and rename any variables it assigns, so the timed code could end up referring to the wrong variables.
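To see why the escaping matters, here is a sketch that uses the simplified timing macro from inside a function. The macro is renamed @mytime (a made-up name) to avoid clashing with the built-in @time, and demo is a hypothetical function.

```julia
# @mytime is a hypothetical name, chosen to avoid clashing with Base's @time.
macro mytime(ex)
    quote
        local t0 = time()
        local val = $(esc(ex))      # escaped: names in ex belong to the caller
        local t1 = time()
        println("elapsed time: ", t1 - t0, " seconds")
        val
    end
end

function demo()
    n = 1000                        # a local variable of demo()
    return @mytime sum(1:n)         # the timed code can see n because of esc()
end

println(demo())                     # prints the elapsed time, then 500500
```

If $(esc(ex)) were replaced by $ex, demo() should fail with an undefined variable error, because n would then be looked up as a global in the module that defines the macro rather than as the local variable in demo().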
If you want to pass a multi-line expression to a macro, use the begin ... end form:
@p begin
    2 + 2 - 3
end
Any[:( # none, line 2:),:((2 + 2) - 3)]
1
(You can also call macros with parentheses, similar to the way you call functions, using the parentheses to enclose the arguments:
julia> @p(2 + 3 + 4 - 5)
Any[:-,:(2 + 3 + 4),5]
4
With this form, the arguments of a macro that accepts more than one expression are separated by commas.)
eval() and @eval
There's an eval() function, and an @eval macro. You might be wondering what the difference is between the two.
julia> ex = :(2 + 2)
:(2 + 2)
julia> eval(ex)
4
julia> @eval ex
:(2 + 2)
The function version (eval()) receives the value of ex, which is the expression 2 + 2, and evaluates that. The macro version receives the code you typed, which is just the name ex, and evaluating that name gives the expression stored in ex. To make @eval evaluate the stored expression, use the interpolation syntax to splice it in:
julia> @eval $(ex)
4
In other words:
julia> @eval $(ex) == eval(ex)
true
Here's an example where you might want to create some variables using some automation. We'll create the first ten squares and ten cubes, first using eval():
for i in 1:10
    symbolname = Symbol("var_squares_$(i)")
    eval(quote $symbolname = $(i^2) end)
end
which creates a load of variables named var_squares_n, such as:
julia> var_squares_5
25
and then using @eval:
for i in 1:10
    symbolname = Symbol("var_cubes_$(i)")
    @eval $symbolname = $(i^3)
end
which similarly creates a load of variables named var_cubes_n, such as:
julia> var_cubes_5
125
Once you feel confident, you might prefer to write like this:
julia> [@eval $(Symbol("var_squares_$(i)")) = ($i^2) for i in 1:10]
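The same technique is often used to generate functions rather than variables. As a minimal sketch, assuming you want a family of near-identical functions, the loop below defines one per (name, factor) pair; the names double and triple are made up for the example.

```julia
# Generate one small function per (name, factor) pair.
for (name, factor) in [(:double, 2), (:triple, 3)]
    # $name supplies the function name, $factor the constant in its body.
    @eval $name(x) = $factor * x
end

println(double(4))   # 8
println(triple(4))   # 12
```

Each pass through the loop evaluates an ordinary method definition at the top level of the module, so the generated functions behave exactly like hand-written ones.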
Scope and context
When you use macros, you have to keep an eye out for scoping issues. In the @time example above, the $(esc(ex)) syntax was used to make sure the expression is evaluated in the right context. Here's another contrived example to illustrate this point.
macro f(x)
    quote
        s = 4
        (s, $(esc(:s)))
    end
end
This macro assigns to a variable s, and returns a quoted expression containing s and an escaped version of s. (Note the quoted symbol in esc(:s): an unquoted s there would be looked up while the macro itself runs, rather than being left as a name in the returned code.)
Now, outside the macro, define a variable s:
julia> s = 0
Run the macro:
julia> @f 2
(4, 0)
You can see that the macro returned different values for the symbol s: the first was the value inside the macro's context, 4; the second was the escaped version of s, which was evaluated in the calling context, where s has the value 0. In a sense, esc() has protected s as it passes unharmed through the macro. For the more realistic @time example, it's important that the expression you want to time isn't modified in any way by the macro.
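The difference between hygienic and escaped code is easiest to see with an assignment. In this sketch, both macro names are hypothetical; the first leaves its assignment to Julia's hygiene rules, and the second escapes the whole returned expression.

```julia
# Both macro names are hypothetical.
macro set_hygienic(val)
    quote
        counter = $val        # hygiene renames this counter; the caller's is untouched
    end
end

macro set_escaped(val)
    esc(quote
        counter = $val        # escaped: this is the caller's counter
    end)
end

counter = 0
@set_hygienic 10
println(counter)              # 0
@set_escaped 10
println(counter)              # 10
```

Escaping everything is convenient, but it gives up the protection against accidental name clashes, so it is usually better to escape only the parts that really belong to the caller.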
Expanding macros
To see what the macro expands to just before it's finally executed, use the macroexpand() function. It expects a module and a quoted expression containing one or more macro calls, which are then expanded into proper Julia code so that you can see what the macro would do when called.
julia> macroexpand(Main, quote @p 3 + 4 - 5 * 6 / 7 % 8 end)
Any[:-,:(3 + 4),:(((5 * 6) / 7) % 8)]
quote
    #= REPL[158]:1 =#
    (3 + 4) - ((5 * 6) / 7) % 8
end
(The #= REPL[158]:1 =# comment is a filename and line number reference that's more useful when used inside a source file than when you're using the REPL.)
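There is also a macro form, @macroexpand, which saves you from quoting the call and naming the module. The sketch below uses it on a small made-up macro, @twice, and then filters out the line number nodes to leave just the generated statements.

```julia
# @twice is a hypothetical macro used only for this illustration.
macro twice(ex)
    quote
        $(esc(ex))
        $(esc(ex))
    end
end

expanded = @macroexpand @twice println("hi")

# Drop the line number nodes to see just the generated statements.
statements = filter(a -> !(a isa LineNumberNode), expanded.args)
println(length(statements))    # 2
println(statements[1])         # println("hi")
```

Because the result is an ordinary Expr, you can inspect it with the same tools (dump, .head, .args) used earlier in this chapter.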
Here's another example. This macro adds a dotimes construction to the language.
macro dotimes(n, body)
    quote
        for i = 1:$(esc(n))
            $(esc(body))
        end
    end
end
This is used as follows:
julia> @dotimes 3 println("hi there")
hi there
hi there
hi there
Or, less likely, like this:
julia> @dotimes 3 begin
           for i in 4:6
               println("i is $i")
           end
       end
i is 4
i is 5
i is 6
i is 4
i is 5
i is 6
i is 4
i is 5
i is 6
If you use macroexpand() on this, you can see what happens to the symbol names:
macroexpand(Main, # we're working in the Main module
    quote
        @dotimes 3 begin
            for i in 4:6
                println("i is $i")
            end
        end
    end
)
with the following output:
quote
    #= REPL[160]:3 =#
    begin
        #= REPL[159]:3 =#
        for #101#i = 1:3
            #= REPL[159]:4 =#
            begin
                #= REPL[160]:4 =#
                for i = 4:6
                    #= REPL[160]:5 =#
                    println("i is $(i)")
                end
            end
        end
    end
end
The i local to the macro itself has been renamed to #101#i, so as not to clash with the original i in the code we passed to it.
A more useful example: @until
Here's how to define a macro that is more likely to be useful in your code.
Julia doesn't have an until condition ... do some stuff ... end statement. Perhaps you'd like to type something like this:
until x > 100
    println(x)
end
With the new @until macro, you'll be able to write your code like this:
@until <condition> begin
    <block_of_stuff>
end
but, behind the scenes, the work will be done by actual code with the following structure:
while true
    <block_of_stuff>
    if <condition>
        break
    end
end
(With this structure the block always runs at least once, because the condition is tested after it.)
This forms the body of the new macro, and it will be enclosed in a quote ... end block, like this, so that it executes when evaluated, but not before:
quote
    while true
        <block_of_stuff>
        if <condition>
            break
        end
    end
end
So the nearly-finished macro code is like this:
macro until(<condition>, <block_of_stuff>)
    quote
        while true
            <block_of_stuff>
            if <condition>
                break
            end
        end
    end
end
All that remains to be done is to work out how to pass in our code for the <block_of_stuff> and the <condition> parts of the macro. Recall that $(esc(...)) splices code into the quoted expression 'escaped', that is, exactly as written and untouched by the macro's renaming of variables. We'll escape both the condition and the block, so that the variables they mention are the ones in the calling code.
The final macro definition is therefore:
macro until(condition, block)
    quote
        while true
            $(esc(block))
            if $(esc(condition))
                break
            end
        end
    end
end
The new macro is used like this:
julia> i = 0
0
julia> @until i == 10 begin
           global i += 1
           println(i)
       end
1
2
3
4
5
6
7
8
9
10
or
julia> x = 5
5
julia> @until x < 1 (println(x); global x -= 1)
5
4
3
2
1
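The REPL examples need the global keyword because they update global variables from inside a loop. Inside a function that isn't necessary, as this sketch shows; it repeats the macro definition so that it is self-contained, and countdown is a hypothetical function.

```julia
macro until(condition, block)
    quote
        while true
            $(esc(block))
            if $(esc(condition))
                break
            end
        end
    end
end

# countdown is a hypothetical function for illustration.
function countdown(n)
    seen = Int[]
    @until n < 1 begin
        push!(seen, n)      # record the current value
        n -= 1              # n is a local of countdown, so no `global` is needed
    end
    return seen
end

println(countdown(3))   # [3, 2, 1]
println(countdown(0))   # [0]
```

The second call shows the "at least once" behaviour of this design: the block runs before the condition is first tested, so countdown(0) still records one value.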
Under the hood
If you want a more complete explanation of the compilation process than that provided here, visit the links shown in Further reading, below.
Julia performs multiple 'passes' to transform your code to native machine code. As described above, the first pass parses the Julia code and builds the 'surface-syntax' AST, suitable for manipulation by macros. A second pass lowers this high-level AST into an intermediate representation, which is used by type inference and code generation. In this intermediate AST format all macros have been expanded and all control flow has been converted to explicit branches and sequences of statements. At this stage the Julia compiler attempts to determine the types of all variables so that the most suitable method of a generic function (which can have many methods) is selected.
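You can look at some of these intermediate forms yourself. The sketch below uses Meta.parse for the surface-syntax AST, and the reflection functions code_lowered and code_typed for the later stages; f is a made-up function that exists only to give them something to inspect.

```julia
# f is a hypothetical function used only to have something to inspect.
f(x) = x + 1

ex = Meta.parse("f(2)")              # first pass: text to surface-syntax AST
println(ex.head, " ", ex.args)       # call Any[:f, 2]

lowered = code_lowered(f, (Int,))    # lowered form: control flow made explicit
typed   = code_typed(f, (Int,))      # after type inference

println(length(lowered))             # 1 (one matching method)
println(typed[1].second)             # the inferred return type (Int64 on a 64-bit machine)
```

Printing lowered[1] or typed[1].first shows the intermediate code itself; its exact appearance varies between Julia versions.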
Further reading
- Julia ASTs · The Julia Language: the section of the official Julia documentation on ASTs (abstract syntax trees)
- Julia Introspects: an article on Julia's internal representations, written by Leah Hanson in 2013