The author suggests that "the obvious way to implement a DSL as a macro, as we saw with if-match, hard-codes the form of the new syntax class". I disagree; that's not what I'd consider the obvious way at all.
The most obvious approach, to my mind, is to have the macro delegate to a polymorphic function of some description:
(defmulti if-match*
  (fn [pat _ _ _] (if (list? pat) (first pat) (type pat))))

(defmacro if-match [pat expr then else]
  (if-match* pat expr then else))
Macros have all the same capabilities for extensibility as regular functions. In Clojure, at least, macros are just functions with some metadata attached.

However:
1. The code you give still isn't smart enough. It dispatches on the symbol at the head of the list, but that doesn't account for namespacing, so your pattern-macros will all end up in one giant namespace. You could probably invent something clever to account for this, but...
2. My overall point[1] was that writing a macro-extensible macro shouldn't require cleverness or new code - it should be in the standard library! Indeed, ideally defining a "pattern-macro" should be accomplished via the same mechanism as defining an "expression-macro"; you shouldn't need separate, custom macro-defining-macros for each syntax class. I'd settle for it just being easy to define an extensible syntax class along with a macro-defining-macro for it, though.
[1] Admittedly, this point could have been far clearer.
The idea that there should be "pattern-macros" and "expression-macros" conflates how macros are used with what macros are.
The namespacing issue can be solved in the usual way: by evaluating the symbol in some fashion. How that's done really depends on what you want the syntax to look like. The simplest mechanism is just to pass the macro on to another macro:

(defmacro if-match [pat expr then else]
  (if (list? pat)
    (list pat expr then else)
    `(if-match-on-type ~pat ~expr ~then ~else)))

And for 2: even though what you say seems desirable on the surface, you still approach the problem in a way that is too fuzzy, too abstract. It's like saying "we should write more secure code" and then failing to attack the problem directly.
No offense, but even though you may have a nice idea, your explanation is a little too handwavy.
That aside, clojure.core.match is implemented pretty much as you describe:
https://github.com/clojure/core.match/wiki/Extending-match-f...
I too was a bit confused about what the article was getting at.
I'm working on a Lisp variant that recognizes the difference between expressions and... non-expressions? But taking the opposite approach: I found a way to make it so that everything is an expression while allowing one to do everything you would do with a non-expression via expressions. To achieve this, there are two kinds of expression: mutations and functions.
There are two things to understand about this:
1. All expressions take in the environment. Most functions don't use it, while most mutations do.
2. All expressions run inside a trampoline that evaluates them. The difference between a mutation and a function is that when the trampoline evaluates a function, it places its result into the return register (where it can be picked up by something else). In contrast, when the trampoline evaluates a mutation, it replaces the environment with the result. This is why mutations typically use the environment--rather than destroying the environment, you usually want to build the new environment with most of the old environment.
Some examples:
((mut () env (assoc env :foo 1))) ; equivalent to (define foo 1)
((mut () env
(assoc env :my-define (mut (dest src) env (assoc env dest src)))))
; this is actually how `define` is defined
((mut () env (map)))
((+ 1 1)) ; throws exception "undefined symbol +" because previous line emptied the environment
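Since the thread's Lisp is hypothetical, here is a minimal Python sketch of the trampoline idea described above. The names (`Mut`, `trampoline`) are made up for illustration; the point is the two evaluation rules: a function's result goes to the return register, while a mutation's result replaces the environment.

```python
class Mut:
    """A 'mutation': when evaluated, its result REPLACES the environment."""
    def __init__(self, fn):
        self.fn = fn  # fn(env) -> new env

def trampoline(exprs):
    env = {}      # the environment threaded through evaluation
    ret = None    # the return register for ordinary functions
    for e in exprs:
        if isinstance(e, Mut):
            env = e.fn(env)   # mutation: result becomes the new environment
        else:
            ret = e(env)      # function: result lands in the return register
    return env, ret

# (define foo 1) as a mutation that builds a new env from the old one
define_foo = Mut(lambda env: {**env, "foo": 1})
env, ret = trampoline([define_foo, lambda env: env["foo"] + 1])
print(env, ret)  # {'foo': 1} 2
```

Note how `define_foo` follows the pattern from the post: rather than destroying the environment, it builds the new one out of most of the old one.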
The "everything takes env" bit is inspired by J. Shutt's paper on his Kernel programming language: https://www.wpi.edu/Pubs/ETD/Available/etd-090110-124904/unr... and a lot of what I'm working on is built on his work. Hopefully it doesn't make my research redundant. :)
I think "statements" is the usual term, with "sentences" being the general term which encompasses both (a "sentence" in this context is something recognized by your formal grammar).
Can your Lisp variant handle unrestricted continuations?
Here's what I'm doing with continuations:
At the bytecode level, call/cc is trivial to implement given my current design. All you need is the current instruction pointer, the current environment (environments right now are just immutable A-lists), and the current return pointer. But it's not obvious to me how to translate call/cc from the syntax to the bytecode.
Tail call elimination is the next thing I'm implementing, and unless there's something wrong with my design I haven't noticed yet, it won't be hard to do. Another step to take down that direction, though, is adding a pass transforming to continuation-passing style. I think this is going to be one of the last things I do, because I'd like to explore other optimizations so I can compare performance.
I'm not sure what you mean by sentences being "recognized" by the grammar. Keep in mind that there are languages where only a series of 'statements' is allowed, no freestanding expressions.
The fact that Lisp does not distinguish between statements and declarations is closely tied to the fact that Lisp is very much a dynamic language (in particular, it is dynamically typed). The article uses the example of Python declaration vs. statement; but actually Python declarations are statements, too. This is typical of dynamic languages.
On the other hand, in a statically typed language there is necessarily a distinction between code that is executed at runtime and (although we often don't talk about it this way) code that is executed at compile time. Declarations happen at compile time. Expressions and statements happen at runtime. The two categories almost always use very different syntax.
Among statically typed languages, Haskell is particularly interesting, because, while it necessarily makes a strong distinction between expressions and declarations, it has erased the distinction between expression and statement: the latter is represented by an expression that returns a list of side effects.
Another interesting take on this issue can be found in Daan Leijen's Koka programming language[1]. In Koka, whether a function has side effects is part of its type. So effect inference can be done. The result, if I understand things correctly, is that the expression-or-statement issue becomes more than just a yes/no thing. I think these ideas are worth further exploration.
Lastly: an extensible pattern set. My goodness, yes. That's the big lack I feel in Haskell; I want to define new kinds of patterns. I've read that F# has good support for this, but I know nothing about it; can anyone comment?
I'm not sure it's appropriate to say that declarations are "executed at" compile time in a statically-typed language. In SML, for example, there's a notion of "phase separation" by which you can split the meaning of a program into its compile-phase and its run-phase meanings. Many declarations end up having both compile- and run-time components. In Haskell you might say that declarations are executed at compile time, but this means nothing more than that name-resolution and type-checking happen at compile time.
Is the distinction or separation necessarily obvious? Statically typed languages which have a unified term- and type-level seem to get a bit subtle in this regard, since a type can be used as a value. And Idris passes proofs around, and has to use proof erasure in order to not let it infect the runtime: proofs can remain until run-time if you're not careful.
I am asking this as a question since I don't have enough experience with such languages to really know myself.
I don't think so, but in practice it usually is obvious, in the programming languages that actually get used.
OTOH, the notion that the compile-time language and the run-time language could be very similar is an idea that might be slowly catching on. We'll see ....
I may be confused about what the exact claim is here, but I don't see how this has anything to do with whether or not something is an expression. I don't quite understand how having "not everything is an expression" helps solve this problem.
syntax-rules is a form of pattern matching (mentioned in footnote 2). But it's only for matching on syntax, not on ordinary values. I can't write the fibonacci function using syntax-rules.
The idea behind "not everything is an expression" is that ordinary macros only extend the expressions in a language. But languages have more than expressions to them - they also have patterns, and possibly other syntax classes (loop formats, LINQ, monadic do-syntax). I think those syntax classes ought to be macro-extensible as well. That's what I mean when I say that ordinary macros don't acknowledge that not everything is an expression.
It's rather a roundabout way to say it, I guess.
[0] https://common-lisp.net/project/iterate/
[1] https://common-lisp.net/project/iterate/doc/Rolling-Your-Own...
That sounds great, for lisps, when everything is an expression. I feel like you haven't fully embraced lisp, because the article just kind of asserts that not everything is an expression, without justifying it.
In lisp the point of expressions isn't that everything has a return value, the bigger gain is that everything fits together, like legos. One of the big wins is that you can use your same toolkit, functions & macros, on everything. The point of the article seems to desire restricting that.
I would say that declarations, ... are not syntax but semantic classes.
Too bad the conclusion does not offer a glimpse of what the extension mechanism would look like. Still, nice article.
And in this system, patterns are expressions. So "everything is an expression" isn't exactly wrong in this case.
In Scala, you are given full control over both how a name "applies" (i.e. what Foo(...) does when called as a sort of constructor or factory method) and how it matches, in the form of an "unapply" static method that you implement. It seems like a "match" macro in a LISP could desugar a pattern into a program that matches that pattern, the same way the Scala compiler generates code that calls "unapply".
The really important point, that moxy explored and that I think I didn't make clear in the article, is that patterns are just one example of where you want extensibility of a non-expression syntax class. Really you want to be able to define your own syntax classes and be able to extend them!
I only mentioned moxy in a footnote in the article because it's "research-quality" at the moment - it's not well documented and I'm still having second thoughts about its design. I've moved on to other things and am busy with work, so I don't imagine I'll be working on it in the immediate future.
Declarations are definitely a syntactically distinct class in, for example, Haskell or SML or C. Not so much in Lisp, Python, Javascript, and so forth.
I think the question of what an extensible extension mechanism would look like is a tough one, which is why I punted on it. I have worked on it a bit myself, as mentioned in another comment, but not in Lisp. I definitely might try my hand at a Lisp version in future, though, and I'm interested to see what other people might come up with!
[0] http://www.amazon.com/Art-Metaobject-Protocol-Gregor-Kiczale...
(defmacro foo [x] :foo)
which always returns a keyword.
All lisps that I'm aware of only let macros dispatch on list evaluation, i.e. when you see (foo ...), call the function defined in (defmacro foo), but I'm not aware of any limitation preventing you from applying that to other types of data.
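That dispatch-on-the-head-of-the-list mechanism can be sketched in a few lines of Python, with forms represented as plain lists. Everything here (`MACROS`, `defmacro`, `macroexpand`) is a toy stand-in, not a real library:

```python
# A toy macro expander: expansion dispatches on the first element of a list.
MACROS = {}

def defmacro(name):
    def register(expander):
        MACROS[name] = expander
        return expander
    return register

def macroexpand(form):
    # Keep expanding while the head of the list names a macro.
    while isinstance(form, list) and form and form[0] in MACROS:
        form = MACROS[form[0]](*form[1:])
    return form

@defmacro("foo")
def expand_foo(x):
    return ":foo"  # like (defmacro foo [x] :foo): always expands to a keyword

print(macroexpand(["foo", 42]))  # :foo
```

Non-list forms (and lists whose head isn't registered) pass through untouched, which is exactly the limitation being discussed: dispatch only happens on list evaluation.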
[1] technically, they run at macro-expansion time, which is after reading the expression, and before evaluating.
In a language that doesn't allow that, we end up having to do some really weird things (https://github.com/yawaramin/lambdak).
Code:
fn sum(l: &[i64]) -> i64 {
match l {
[] => 0,
[x, xs..] => x + sum(xs)
}
}
Assembly:
_ZN3sum20hf66fc5855a7cf5fc3aaE:
.cfi_startproc
cmpq %fs:112, %rsp
ja .LBB2_2
movabsq $24, %r10
movabsq $0, %r11
callq __morestack
retq
.LBB2_2:
pushq %rbx
.Ltmp10:
.cfi_def_cfa_offset 16
subq $16, %rsp
.Ltmp11:
.cfi_def_cfa_offset 32
.Ltmp12:
.cfi_offset %rbx, -16
movq 8(%rdi), %rcx
xorl %eax, %eax
testq %rcx, %rcx
je .LBB2_4
movq (%rdi), %rax
decq %rcx
movq (%rax), %rbx
addq $8, %rax
movq %rax, (%rsp)
movq %rcx, 8(%rsp)
leaq (%rsp), %rdi
callq _ZN3sum20hf66fc5855a7cf5fc3aaE
addq %rbx, %rax
.LBB2_4:
addq $16, %rsp
popq %rbx
retq
So... no.

fn sum_tail(v: i64, l: &[i64]) -> i64 {
    match l {
        [] => v,
        [x, xs..] => sum_tail(v + x, xs)
    }
}

With respect to variable assignments or raising exceptions (though examples exist of the others...) aren't these by the author's definition statements?
(def a "apple")
or
(throw (Exception. "my exception message"))

For instance, in JavaScript one can use ternary syntax in place of an if statement. Is a ternary conditional actually an expression? It seems more of an expression than a statement, but one could argue it's just a simplified syntax for a conditional statement.
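For comparison, Python draws that same line explicitly: `if` is a statement, but there is a separate conditional expression form that can appear in expression position, much like JavaScript's ternary. A small example (the `size_label` function is made up for illustration):

```python
# Python's conditional expression: an `if` usable in expression position.
def size_label(n: int) -> str:
    return "big" if n > 10 else "small"

print([size_label(n) for n in (3, 42)])  # ['small', 'big']
```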
I would have coded his `if-match` in a simpler way in Clojure:

(case (f data-structure)
  0 (do-something)
  1 (do-something-else)
  (do-default))

where `f` can be a function defined in a protocol or a multimethod, so you can actually implement your own `f` for any data structure you like. Now, what am I missing?
One needs a registry, an interning function and a driving function. Below is just an example:
(defvar *patterns* (make-hash-table))
(defparameter *pattern-names* nil)
(defun intern-pattern (name if-pattern then-pattern)
(setf *pattern-names*
(append *pattern-names* (list name)))
(setf (gethash name *patterns*)
(list (compile nil `(lambda (pat)
,if-pattern))
(compile nil `(lambda (pat expr then else)
(declare (ignorable pat expr then else))
,then-pattern))))
name)
(defmacro if-match (pat expr then else)
(loop for name in *pattern-names*
for (if-part then-part) = (gethash name *patterns*)
when (funcall if-part pat)
do (return (funcall then-part pat expr then else))))
(intern-pattern 'variable
'(and pat (symbolp pat))
'`(let ((,pat ,expr)) ,then))
(intern-pattern 'literal-atom
'(atom pat)
'`(if (equalp ',pat ,expr) ,then ,else))
(intern-pattern 'cons
'(eq 'cons (car pat))
'(destructuring-bind (_ p-car p-cdr) pat
(declare (ignore _))
(let ((tmp (gensym)))
`(let ((,tmp ,expr))
(if (consp ,tmp)
(if-match ,p-car
(car ,tmp)
(if-match ,p-cdr
(cdr ,tmp)
,then
,else)
,else)
,else)))))
Writing the macro DEFPATTERN is then trivial...

I help maintain an old Lisp-based web server, which was written in the mid-90s on the Symbolics Lisp Machine. It literally has zillions of these registry/intern/machinery/defining-macro combinations...
It's just that one has to program those. But it has been done many, many times.
There's lot of good work on programming languages that allow metasyntactic runtime extensions, going back to Brian Smith's work at PARC.
match compare x v with
| 0 -> ...
| 1 -> ...
| _ -> ...
with the minor wart that since compare returns an int, you need to match the last case with a default to prevent the compiler from warning you about possibly not matching other int values.