The author suggests that "the obvious way to implement a DSL as a macro, as we saw with if-match, hard-codes the form of the new syntax class". I disagree. That's not what I'd consider the obvious way at all.
I'd consider the most obvious approach would be to pass the macro onto a polymorphic function of some description:
(defmulti if-match*
(fn [pat _ _ _] (if (list? pat) (first pat) (type pat)))
(defmacro if-match [pat expr then else]
(if-match* pat expr then else))
Macros have all the same capabilities for extensibility as regular functions. In Clojure at least, macros are just functions with some metadata attached.However:
1. The code you give still isn't smart enough. It dispatches on the symbol at the head of the list, but that doesn't account for namespacing. So your pattern-macros will all end up in one giant namespace. You could probably invent something clever to account for this but...
2. My overall point[1] was that writing a macro-extensible macro shouldn't require cleverness or new code - it should be in the standard library! Indeed, ideally defining a "pattern-macro" should be accomplished via the same mechanism as defining an "expression-macro"; you shouldn't need separate, custom macro-defining-macros for each syntax class. I'd settle for it just being easy to define an extensible syntax class along with a macro-defining-macro for it, though.
[1] Admittedly, this point could have been far clearer.
The idea that there should be "pattern-macros" and "expression-macros" is confusing how macros are used with what macros are.
The namespacing issue can be solved in the usual way; by evaluating the symbol in some fashion. How that's done really depends on what you want the syntax to look like. The simplest mechanism is just to pass the macro on to another macro:
(defmacro if-match [pat expr then else]
(if (list? pat)
(list pat expr then else)
(if-match-on-type pat expr then else)))What I mean by a "macro" is: a convenient way to extend the syntax of a particular syntax class by telling my language how to translate the syntactic construct I want to add into constructs it already understands. In this sense one definitely can have expression-macros and pattern-macros. Is there some reason you think one shouldn't have this in a language?
Your new proposed `if-match' is strange. I'm not sure how I'd use it to, for example, implement a "list" pattern-macro, such that `(list p1 p2 p3)' matches any list of three elements, whose first element matches p1, second matches p2, and third matches p3. You seem to intend that `((list p1 p2 p3) expr then else)` should macro-expand to the code that performs this pattern match. This is difficult for two reasons:
1. I'd need to shadow the definition of `list' with a macro that usually behaves like the list function, but if used inside of if-match behaves differently. This seems difficult, and is conceptually ugly.
2. AIUI, to get `((list p1 p2 p3) expr then else)' to macro-expand specially, `(list p1 p2 p3)' has to macro-expand to a symbol naming a macro, call it MAGIC, such that `(MAGIC expr then else)` expands to the desired result. This is difficult, because MAGIC needs to know about p1, p2, and p3. So `(list p1 p2 p3)' will need to dynamically define a new macro (with a gensym'ed name) that knows specifically about p1, p2, and p3, and return that. This may be possible, but is gnarly as fuck.
This is certainly very far from being "the obvious way to do it" to anyone but a wizard.
Maybe I'm misunderstanding you, or you made a typo?
And for 2, even though what you say seems desirable on the surface, you still approaches the problem in a way that is too fuzzy, or abstract. Just as saying "we should write more secure code" and then failing to attack the problem directly.
No offense, but even though you may have a nice idea, your explanation is a little too handwavy.
Re 1:
In a.clj:
(ns a (:refer-clojure))
(defmulti if-match*
(fn [pat _ _ _] (if (list? pat) (first pat) (type pat))))
(defmacro if-match [pat expr then else]
(if-match* pat expr then else))
(defmethod if-match* clojure.lang.Symbol
[pat expr then else]
`(let [~pat ~expr] ~then))
In b.clj: (ns b (:refer-clojure) (:require a))
;; a 'foo pattern-macro that does negation
(defmethod a/if-match* 'foo [pat expr then else]
`(a/if-match ~(second pat) ~expr ~else ~then))
In c.clj: (ns c (:refer-clojure) (:require a))
;; a 'foo pattern-macro that is the identity pattern
(defmethod a/if-match* 'foo [pat expr then else]
`(a/if-match ~(second pat) ~expr ~then ~else))
Now, at the repl: user=> (use 'a)
nil
user=> (require 'b)
nil
user=> (if-match (foo x) 'yes x 'no)
no
user=> (require 'c)
nil
user=> (if-match (foo x) 'yes x 'no)
yes
Note that 'require doesn't actually import the symbols defined by b.clj or c.clj, and despite that we're able to use the pattern-macro 'foo they define, because they're modifying if-match's dispatch-table. So our pattern-macros aren't being scoped the same way our regular macros are. I think this is wrong. Moreover, our pattern-macros have a single namespace, so we get collisions between what b.clj defines 'foo to mean and what c.clj defines it to mean, which is why (if-match (foo x) 'yes x 'no) changes behavior after the (require 'c).That aside, clojure.core.match is implemented pretty much as you describe:
https://github.com/clojure/core.match/wiki/Extending-match-f...
I too was a bit confused about what the article was getting at.
I'm working on a Lisp variant that recognizes the difference between expressions and... non-expressions? But taking the opposite approach: I found a way to make it so that everything is an expression while allowing one to do everything you would do with a non-expression via expressions. To achieve this, there are two kinds of expression: mutations and functions.
There are two things to understand about this:
1. All expressions take in the environment. Most functions don't use it, while most mutations do.
2. All expressions run inside a trampoline that evaluates them. The difference between a mutation and a function is that when the trampoline evaluates a function, it places its result into the return register (where it can be picked up by something else). In contrast, when the trampoline evaluates a mutation, it replaces the environment with the result. This is why mutations typically use the environment--rather than destroying the environment, you usually want to build the new environment with most of the old environment.
Some examples:
((mut () env (assoc env :foo 1))) ; equivalent to (define foo 1)
((mut () env
(assoc env :my-define (mut (dest src) env (assoc env dest src)))))
; this is actually how `define` is defined
((mut () env (map)))
((+ 1 1)) ; throws exception "undefined symbol +" because previous line emptied the environment
The "everything takes env" bit is inspired by J. Shutt's paper on his Kernel programming language: https://www.wpi.edu/Pubs/ETD/Available/etd-090110-124904/unr... and a lot of what I'm working on is built on his work.Hopefully it doesn't make my research redundant. :)
I think "statements" is the usual term, with "sentences" being the general term which encompasses both (a "sentence" in this context is something recognized by your formal grammar).
Can your Lisp variant handle unrestricted continuations?
Here's what I'm doing with continuations:
At the bytecode level, call/cc is trivial to implement given my current design. All you need is the current instruction pointer, the current environment (environments right now are just immutable A-lists), and the current return pointer. But it's not obvious to me how to translate call/cc from the syntax to the bytecode.
Tail call elimination is the next thing I'm implementing, and unless there's something wrong with my design I haven't noticed yet, it won't be hard to do. Another step to take down that direction, though, is adding a pass transforming to continuation-passing style. I think this is going to be one of the last things I do, because I'd like to explore other optimizations so I can compare performance.
I'm not sure what you mean by sentences being "recognized" by the grammar. Keep in mind that there are languages where only a series of 'statements' is allowed, no freestanding expressions.
The fact that Lisp does not distinguish between statements and declarations is closely tied to the fact that Lisp is very much a dynamic language (in particular, it is dynamically typed). The article uses the example of Python declaration vs. statement; but actually Python declarations are statements, too. This is typical of dynamic languages.
On the other hand, in a statically typed language there is necessarily a distinction between code that is executed at runtime and (although we often don't talk about it this way) code that is executed at compile time. Declarations happen at compile time. Expressions and statements happen at runtime. The two categories almost always use very different syntax.
Among statically typed languages, Haskell is particularly interesting, because, while it necessarily makes a strong distinction between expressions and declarations, it has erased the distinction between expression and statement: the latter is represented by an expression that returns a list of side effects.
Another interesting take on this issue can be found in Daan Leijen's Koka programming language[1]. In Koka, whether a function has side effects is part of its type. So effect inference can be done. The result, if I understand things correctly, is that the expression-or-statement issue becomes more than just a yes/no thing. I think these ideas are worth further exploration.
Lastly: an extensible pattern set. My goodness, yes. That's the big lack I feel in Haskell; I want to define new kinds of patterns. I've read that F# has good support for this, but I know nothing about it; can anyone comment?
I'm not sure it's appropriate to say that declarations are "executed at" compile time in a statically-typed language. In SML, for example, there's a notion of "phase separation" by which you can split the meaning of a program into its compile-phase and its run-phase meanings. Many declarations end up having both compile- and run-time components. In Haskell you might say that declarations are executed at compile time, but this means nothing more than that name-resolution and type-checking happen at compile time.
Is the distinction or separation necessarily obvious? Statically typed languages which have a unified term- and type-level seem to get a bit subtle in this regard, since a type can be used as a value. And Idris passes proofs around, and has to use proof erasure in order to not let it infect the runtime: proofs can remain until run-time if you're not careful.
I am asking this as a question since I don't have enough experience with such languages to really know myself.
I don't think so, but in practice it usually is obvious, in the programming languages that actually get used.
OTOH, the notion that the compile-time language and the run-time language could be very similar is an idea that might be slowly catching on. We'll see ....
I may be confused about what the exact claim is here, but I don't see how this has anything to do with whether or not something is an expression. I don't quite understand how having "not everything is an expression" helps solve this problem.
syntax-rules is a form of pattern matching (mentioned in footnote 2). But it's only for matching on syntax, not on ordinary values. I can't write the fibonacci function using syntax-rules.
The idea behind "not everything is an expression" is that ordinary macros only extend the expressions in a language. But languages have more than expressions to them - they also have patterns, and possibly other syntax classes (loop formats, LINQ, monadic do-syntax). I think those syntax classes ought to be macro-extensible as well. That's what I mean when I say that ordinary macros don't acknowledge that not everything is an expression.
It's rather a roundabout way to say it, I guess.
[0] https://common-lisp.net/project/iterate/
[1] https://common-lisp.net/project/iterate/doc/Rolling-Your-Own...
(a) it's not a well-known technique
(b) it's not in the standard library of any Lisp I know of
(c) there are interesting open design questions to be answered in the implementation of such a system
I hope this article will get folks thinking about these issues.
For example, walking the AST and calling macroexpand will work, but then you can't have a macro that expands differently in different contexts - when interpreted as an expression versus as a pattern, for example. I think this is an important feature.
That sounds great, for lisps, when everything is an expression. I feel like you haven't fully embraced lisp, because the article just kind of asserts that not everything is an expression, without justifying it.
In lisp the point of expressions isn't that everything has a return value, the bigger gain is that everything fits together, like legos. One of the big wins is that you can use your same toolkit, functions & macros, on everything. The point of the article seems to desire restricting that.
Without justifying? The article gives some sample code (in a lisp) colored according to what syntax class each bit of code belongs to. The code is not all red (like it would be if it were only expressions).
I would say that declarations, ... are not syntax but semantic classes.
Too bad the conclusion does not offer a glimpse of what would the extension mechanism look like. Still, nice article.
And in this system, patterns are expressions. So "everything is an expression" isn't exactly wrong in this case.
In Scala, you are given full control over both how a name "applies" (i.e. what Foo(...) does when called as a sort of constructor or factory method) and how it matches, in the form of an "unapply" static method that you implement. It seems like a "match" macro in a LISP could desugar a pattern into a program that matches that pattern, the same way the Scala compiler generates code that calls "unapply".
The really important point, that moxy explored and that I think I didn't make clear in the article, is that patterns are just one example of where you want extensibility of a non-expression syntax class. Really you want to be able to define your own syntax classes and be able to extend them!
I only mentioned moxy in a footnote in the article because it's "research-quality" at the moment - it's not well documented and I'm still having second thoughts about its design. I've moved on to other thing and am busy with work, so I don't imagine I'll be working on it in the immediate future.
Declarations are definitely a syntactically distinct class in, for example, Haskell or SML or C. Not so much in Lisp, Python, Javascript, and so forth.
I think the question of what an extensible extension mechanism would look like is a tough one, which is why I punted on it. I have worked on it a bit myself, as mentioned in another comment, but not in Lisp. I definitely might try my hand at a Lisp version in future, though, and I'm interested to see what other people might come up with!
[0] http://www.amazon.com/Art-Metaobject-Protocol-Gregor-Kiczale...
(defmacro foo [x] :foo)
which always returns a keyword.
All lisps that I'm aware of only let macros dispatch on list evaluation, i.e. when you see (foo ...), call the function defined in (defmacro foo), but I'm not aware of any limitation preventing you from applying that to other types of data.
[1] technically, they run at macro-expansion time, which is after reading the expression, and before evaluating.
More fundamentally, being known to something else is not the direction "is a" works in.
In a language that doesn't allow that, we end up having to do some really weird things (https://github.com/yawaramin/lambdak).
Code:
fn sum(l: &[i64]) -> i64 {
match l {
[] => 0,
[x, xs..] => x + sum(xs)
}
}
Assembly: _ZN3sum20hf66fc5855a7cf5fc3aaE:
.cfi_startproc
cmpq %fs:112, %rsp
ja .LBB2_2
movabsq $24, %r10
movabsq $0, %r11
callq __morestack
retq
.LBB2_2:
pushq %rbx
.Ltmp10:
.cfi_def_cfa_offset 16
subq $16, %rsp
.Ltmp11:
.cfi_def_cfa_offset 32
.Ltmp12:
.cfi_offset %rbx, -16
movq 8(%rdi), %rcx
xorl %eax, %eax
testq %rcx, %rcx
je .LBB2_4
movq (%rdi), %rax
decq %rcx
movq (%rax), %rbx
addq $8, %rax
movq %rax, (%rsp)
movq %rcx, 8(%rsp)
leaq (%rsp), %rdi
callq _ZN3sum20hf66fc5855a7cf5fc3aaE
addq %rbx, %rax
.LBB2_4:
addq $16, %rsp
popq %rbx
retq
So... no. fn sum_tail(v : i64, l: &[i64]) -> i64 {
match l {
[] => v,
[x, xs..] => sum(v + x, xs)
}
}With respect to variable assignments or raising exceptions (though examples exist of the others...) aren't these by the author's definition statements?
(def a "apple")
or
(throw (Exception. "my exception message"))For instance, in JavaScript one can use ternary syntax in place of an if-statement. Is a ternary condition actually an expression? It seems more of an expression than a statement, but one could argue it's just a simplified syntax of a conditional statement.
I would have code his `if-mathch` in a simpler way in clojure:
(case (f data-structure)
0 (do-something)
1 (do-something-else)
(do-default))
Where `f` can be a function defined in a protocol or a multimethod, so you can actually implement your own `f` for any data structure you like.Now, what I am missing ?
One needs a registry, an interning function and a driving function. Below is just an example:
(defvar *patterns* (make-hash-table))
(defparameter *pattern-names* nil)
(defun intern-pattern (name if-pattern then-pattern)
(setf *pattern-names*
(append *pattern-names* (list name)))
(setf (gethash name *patterns*)
(list (compile nil `(lambda (pat)
,if-pattern))
(compile nil `(lambda (pat expr then else)
(declare (ignorable pat expr then else))
,then-pattern))))
name)
(defmacro if-match (pat expr then else)
(loop for name in *pattern-names*
for (if-part then-part) = (gethash name *patterns*)
when (funcall if-part pat)
do (return (funcall then-part pat expr then else))))
(intern-pattern 'variable
'(and pat (symbolp pat))
'`(let ((,pat ,expr)) ,then))
(intern-pattern 'literal-atom
'(atom pat)
'`(if (equalp ',pat ,expr) ,then ,else))
(intern-pattern 'cons
'(eq 'cons (car pat))
'(destructuring-bind (_ p-car p-cdr) pat
(declare (ignore _))
(let ((tmp (gensym)))
`(let ((,tmp ,expr))
(if (consp ,tmp)
(if-match ,p-car
(car ,tmp)
(if-match ,p-cdr
(cdr ,tmp)
,then
,else)
,else)
,else)))))
Writing the macro DEFPATTERN is then trivial...I help maintain an old Lisp-based web server, which was written in the mid-90s on the Symbolics Lisp Machine. It literally has zillions of these registry/intern/machinery/defining-macro combinations...
It's just: one has to program those. But it has been done many many many times.
There's lot of good work on programming languages that allow metasyntactic runtime extensions, going back to Brian Smith's work at PARC.
match compare x v with
| 0 -> ...
| 1 -> ...
| _ -> ...
with the minor wart that since compare returns an int, you need to match the last case with a default to prevent the compiler from warning you about possibly not matching other int values.