It was common in the 60s and 70s to have the hardware manufacturer ship all the OS and languages with their hardware. The languages were often designed for specific problem domains. The idea of general purpose languages (FORTRAN, PL/1, etc) was uncommon. You can see this in K&R (the original edition anyway) where they justify the idea of a general purpose language, even though C itself derived from prior general languages (B & BCPL) and they had gotten the idea from their experience on Multics (written in PL/1, a radical idea at the time). So a 20 year old idea was still barely diffused into the computing Zeitgeist.
Most Lisp development (since the early 70s at least) is writing a domain-specific representation (data structures and functions) and then writing your actual problem in it. I used both Lisp and Smalltalk this way at PARC in the early 80s.
More rigid languages (the more modern Algol-ish languages like Python, C++, Rust, C, JS, etc. -- almost every one of them) don't have these kinds of affordances but instead do the same via APIs. Every API is itself a "little language"
What are called little languages in the Bentley sense are simply a direct interface for domain experts. And after all, what was a language like, say, Macsyma but a (large) "little language"?
I came to this conclusion early in my career. It went something like this:
A - "To do this, just create this object, fill in these properties, and call these methods."
B - "Okay, I did that, but it crashed."
A - "Yeah, it's because you set the properties in the wrong order. This property relies on this other property under the hood. Set them in this order."
B - "Still crashes."
A - "Yeah, you called the methods in the wrong order. This method relies on that method. Call them in this order and it works."
My conclusion was that the Lisp philosophy of building a lot of little sub-languages was equivalent to what people were doing with OO in C#/Java. Either way you have to learn the "right" way to put things together, which is dictated by unseen forces behind the scenes.
Of course, I also concluded that most people work differently than I do. For most people, if the code "looks right" (ie recognizable syntax) then they're able to tell themselves a story that it's familiar and their intuition is able to pick up the slack for finding the right enough way to use most arbitrary APIs (just as long as they don't exceed some level of incomprehensibility). On the other hand, I have to understand the underlying logic or I use the API the wrong way pretty much every time.
So for most people lots of APIs is actually a much better cognitive way for them to work whereas for me API soup and lisp macros are the same conundrum.
IMO we need a language (or library?) that forces builders of an API to make incorrect behavior hard or impossible. Here's some scribbled out ideas: https://packetlost.dev/blog/lang-scribbles/
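One way to make incorrect behavior hard without a whole new language is to encode the required ordering in the construction path itself, so the "wrong order" from the dialogue above simply cannot be expressed. A minimal Python sketch of the idea (all names here are invented for illustration, not from any real API):

```python
# Sketch: instead of "set properties, then call methods in a secret order",
# require all interdependent values up front, so no invalid ordering exists.
# The Connection class and its fields are hypothetical.

class Connection:
    def __init__(self, host: str, port: int, timeout_s: float):
        # Validation happens once, at the only point of construction;
        # there is never a half-initialized object to crash on.
        if not host:
            raise ValueError("host must be non-empty")
        if not (0 < port < 65536):
            raise ValueError("port out of range")
        self._addr = (host, port)
        self._timeout_s = timeout_s

    def describe(self) -> str:
        host, port = self._addr
        return f"{host}:{port} (timeout={self._timeout_s}s)"

conn = Connection("db.example", 5432, timeout_s=2.5)
print(conn.describe())   # db.example:5432 (timeout=2.5s)
```

The dialogue's "set them in this order" bug is impossible here: the dependent values are all demanded at once, and validation runs before any usable object exists.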
I agree with much of the problems listed in the article. The author even manages to stumble onto some of the solutions (e.g. Dhall being a total language).
"Expressiveness is co-decidability" is the main theme of these things. The crux of the issue is that in our everyday programming tasks we have many levels of decidability, ranging from regular expressions all the way up to things that require full Turing completeness.
The majority of the work, however, lies in the middle. There are so many things that can be done with pushdown automata, or with deterministic finite automata. Most codebases don't actually use those, though. An issue is that there is a dearth of "mini" languages that support these things.
Another issue is that somehow we are enamoured with the idea that our languages must be able to express everything under the sun (up to TC/Recursively Enumerable). This seems to be more of an industry attitude than anything - there is this chase for the most powerful language (a lisp, clearly... everything else is a blub).
I've recently experimented with embedding an APL into my usual programming language, and it was a very interesting experience. It feels like having the power to do regular expression stuff, but with arrays. I want to do the same for the other levels of expressiveness.
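The "regular expressions, but with arrays" feel can be sketched even without a real APL embedding: a handful of composable array "verbs" replaces explicit loops the way a regex replaces a hand-written scanner. A toy Python sketch (the ASCII verb names are my own stand-ins for APL glyphs):

```python
from functools import reduce
import operator

# A few APL-style "verbs" as plain functions; composing them replaces
# explicit loops, much as a regex replaces a hand-written scanner.
iota = lambda n: list(range(n))                                 # APL's iota
fold = lambda f, xs: reduce(f, xs, 0)                           # f/
compress = lambda mask, xs: [x for m, x in zip(mask, xs) if m]  # mask/xs

# "sum of squares of the even numbers below 10" as a pipeline:
nums = iota(10)
evens = compress([x % 2 == 0 for x in nums], nums)
print(fold(operator.add, [x * x for x in evens]))   # 120
```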
As a corollary, such languages can't be Turing complete, since universal Turing machines can represent programs which don't terminate.
[1] https://en.wikipedia.org/wiki/Total_functional_programming
The article pulls Shell as an early example. Shell did not become the powerhouse it is because it's "great" (though some would argue it is; I'm not here to debate that); or because it's small; or because it's general-purpose; or because it's single-purpose. It became a powerhouse because it's Old and Omnipresent. See, the problem with inventing New Things is that they are, by definition, neither Old nor Omnipresent. New Things have to start somewhere, but you're starting in last place.
> Regular expressions and SQL won’t let you express anything but text search and database operations, respectively.
Oh mylanta. Did you know that after the addition of backreferences, regular expressions stopped being "regular" at all? (Matching with backreferences is NP-hard.) They are, functionally, a real little programming language of their own; well, except far more annoying to write. And naturally, SQL "won't let you express anything but database operations", which is to say nothing about "SELECT 1+1"... let alone the little corner of the language called "Stored Procedures".
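A classic folklore demonstration of how far past "text search" backreferences take you: a one-line primality test over unary strings.

```python
import re

def is_prime(n: int) -> bool:
    # a^n matches (aa+)\1+ exactly when n is composite (n >= 2):
    # the group captures some divisor-sized chunk and \1+ repeats it.
    return n >= 2 and not re.fullmatch(r"(aa+)\1+", "a" * n)

print([n for n in range(2, 20) if is_prime(n)])   # [2, 3, 5, 7, 11, 13, 17, 19]
```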
Like, the thing I find comical about the original article is: they reach for shell, regex, and SQL as the Prime Examples of "mini languages done right". Be more domain specific, look at this, it's already happening. All of those examples are kinda shit languages! They're popular because of their omnipresence. Maybe regex is "fine", though the line "try to solve a problem with a regex and now you have two problems" is well known for a reason.
But my broader opinion is: if you're building an application, service, tool, whatever in LANGUAGE_X, having to "dip out" of that language into an entirely different language should be viewed as, fundamentally, a Negative Thing. There may be reasons why you should; the good may outweigh the bad; but there is Bad there. There will always be an interpretation layer; extra tooling; that's More Things that can go wrong, have to be configured, statically analyzed, tested; it's a failure point. SQL injection is a thing. Why? Because, for a time, people had the thought "hey, it's a string, let's just template the string"; but that's no good, so now we have A Layer between the Java and the SQL to keep us safe. We can only support So Many Layers; we need to make things simpler, not more complex; there needs to be Fewer Things.
Its time is gone, but I'll always view Heroku Buildpacks as a paragon of system design. Consider: you could write a Node.js application in JavaScript, write a package.json in JSON (also JavaScript), and end-to-end get that thing on a URL on the internet, all in one language. Fantastic! Today, a typical app will have the app (say, Golang), a go.mod (different syntax than Go itself), Dockerfiles (language 3), Kubernetes YAMLs (language 4), maybe Helm or cdk8s (language 5), shell scripts, Makefiles, maybe you're also writing SQL... this isn't better, and it doesn't have to be like this. But right now it is, and to some degree I'm happy for it, because it's 80% of the reason why I get paid six figures.
In C#, I can pull in Roslyn, and compile a string on the fly as a C# script; but the way the .NET standard library is structured makes it pretty much unfeasible to prohibit outside interactions I don't want to allow (in my case, e.g: `DateTime.Now`, while allowing the Handling of `DateTime` values).
It's possible to embed the TypeScript compiler into a website, but running code on the fly with some simple sandboxing is not feasible without a serious pile of hacks.
I've recently read a forum thread about a library for compiling/running Elixir code as a script, but guess what: The runtime (apparently) makes sandboxing really hard.
And so on and so on. I just wish that the Lua approach of "if I don't give you a hook, you cannot do that" were simply the default. I've seen so many overcomplicated enterprise-y solutions that are basically just a plea for a well-designed, local and small scripting API…
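The shape of that Lua approach can be sketched in any host: run the guest expression against an empty environment plus only the hooks the embedder explicitly grants. A toy Python illustration (note: stripping `__builtins__` does NOT make CPython's `eval` actually secure; this only shows the API shape, not a real sandbox):

```python
# Toy illustration of "if I don't give you a hook, you cannot do that".
# WARNING: this is not real sandboxing for CPython; it only illustrates
# the capability-style shape of the Lua embedding model.

def run_guest(expr: str, hooks: dict):
    env = {"__builtins__": {}}      # no ambient authority
    env.update(hooks)               # only what the embedder grants
    return eval(expr, env)          # demo only

# The guest can use the granted hook...
print(run_guest("double(21)", {"double": lambda x: x * 2}))   # 42

# ...but names the embedder didn't grant simply don't exist:
try:
    run_guest("open('/etc/passwd')", {})
except NameError as e:
    print("denied:", e)
```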
Yes, because languages are still not capability-secure. Memory-safe languages are inherently secure up until you introduce mutable global state, and that's how they typically leak all kinds of authority. If you had no mutable global state, then you can eval() all the live-long day and you wouldn't be able to escape the sandbox of the parameters passed in.
Examples of mutable global state:
* APIs: you can make any string you like, but somehow you can access any file or directory object using only File.Open(madeUpString). This is called a "rights amplification" pattern, where your runtime permits you to somehow amplify the permissions granted by one object, into a new object that gives you considerably more permissions.
* Mutable global variables: as you point out, eval() can access any mutable global state it likes, thus easily escaping any kind of attempt to sandbox it.
If these holes are closed then memory-safe languages are inherently sandboxed at the granularity of individual objects.
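The rights-amplification hole can be made concrete: an ambient `File.Open(string)`-style API lets any code mint new authority from a made-up string, whereas a capability style only lets code use objects it was explicitly handed. A hypothetical Python sketch of the contrast:

```python
import io

# Ambient-authority style (the hole): any string can be amplified into
# file access, so trying to sandbox the strings is pointless.
def ambient_read(made_up_path: str) -> str:
    with open(made_up_path) as f:    # authority appears from nowhere
        return f.read()

# Capability style: the handle itself IS the permission.
class ReadCap:
    def __init__(self, fileobj):
        self._f = fileobj
    def read(self) -> str:
        return self._f.read()

def sandboxed_task(cap: ReadCap) -> int:
    # From here there is no way to name other files: no open(), only `cap`.
    return len(cap.read())

print(sandboxed_task(ReadCap(io.StringIO("hello"))))   # 5
```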
> Any interaction with the environment, such as I/O, access to resources, or operating system calls, can only be performed by invoking functions provided by the embedder and imported into a WebAssembly module
And IIRC, the core instruction set is reasonably compact.
We already have major headaches switching between JS, SQL, and {insert backend language here}. Introducing tens of little languages into a codebase may marginally increase readability of each chunk of code in isolation, but the amount of context-switching and required background knowledge it introduces would more than make up the difference.
In an abstraction strategy that's based around libraries, every library agrees on the same basic syntax and semantics. You interact with an embedded DSL via a well-defined interface that all libraries respect. The syntax may not be perfectly ideal for any one part of the project, but it's consistent throughout every part of the project. That has real value.
I think it's also a red herring to argue about lines of code: stringing together a bunch of little languages is not likely to lead to fewer lines of code than pulling in an equivalent number of third-party libraries, and it will almost certainly increase the total amount of code in your distribution, because each little language must not only implement the functionality desired, it also needs its own parser and interpreter. Take McIlroy's shell script: if you add up the C code to implement each of those little languages, you're at about 10k lines of code to make that bit of code golf possible.
I'm a huge fan of DSLs, and I like the analogy of modern programming as pyramid building. I just don't think independent, chained-together DSLs are the answer. I'd rather have a language like Kotlin that is designed around embedded DSLs that all respect the same rules and use the same runtime.
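That "embedded DSLs that all respect the same rules" idea isn't specific to Kotlin; any language with first-class functions can host one. A sketch of a miniature query-builder DSL in Python (the names are invented; the point is that it's all ordinary host syntax that the debugger and type checker still understand):

```python
# A miniature embedded "query DSL": plain method chaining, no new parser,
# no separate runtime. Class and method names are hypothetical.

class Query:
    def __init__(self, rows):
        self._rows = list(rows)

    def where(self, pred):
        return Query(r for r in self._rows if pred(r))

    def select(self, *cols):
        return Query({c: r[c] for c in cols} for r in self._rows)

    def to_list(self):
        return self._rows

people = [
    {"name": "ada", "age": 36},
    {"name": "bob", "age": 17},
]
adults = Query(people).where(lambda r: r["age"] >= 18).select("name").to_list()
print(adults)   # [{'name': 'ada'}]
```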
This is a fair point, but part of this has to be how battle tested the language in question is, right?
Bringing in a single language that's been run through its paces (Bourne shell in this case) for text processing seems like a much lower risk than bringing in a dozen different languages and hoping the places where they interface don't blow up (hopefully someone has tried that particular combination before).
I used to agree wholeheartedly with this, but now I pretty strongly disagree; in my decade or so of cumulative code-monkeying experience, having to understand the nuances of some "convert $BACKEND_LANGUAGE to JS and SQL" layer has been far more headache-prone than just, you know, writing JS and SQL. All about using the right tool for the job - and I know of very few languages that are the right tool for all three of those jobs (let alone the myriad other jobs that might pop up as soon as you expand beyond a simple CRUD app).
DSLs like SQL are the norm and you can see the problem of them in basically every project.
You either use ORMs or you end up hand rolling SQL rows into Structs or Classes.
The whole mapping usually looks like crap and contains a bunch of implicit corner cases, which eventually end up being a footgun for someone.
Usually the SQL server runs somewhere else, the ports are wrong, the language version is wrong, or a migration failed and a function is missing, yada yada...
The same is usually true for Regexp. There are a billion dialects and every single one of them is basically unreadable, incomplete or just weird.
The same is true for microservices with tons of config files for dev, staging, testing and production...
Everything has its own version, can be down, or mutates some random state somewhere while depending on other services.
It always breaks at the seams.
Increasing the amount of DSLs increases the amount of seams and thus makes software worse.
Regexes can be abstracted over in exactly this way, by building a more readable DSL on top of them.
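For instance, an internal DSL that compiles down to an ordinary regex keeps the host language's tooling while hiding the line noise. A rough sketch (the combinator names are invented):

```python
import re

# Tiny readable-regex combinators that compile down to ordinary re syntax.
lit   = lambda s: re.escape(s)
seq   = lambda *ps: "".join(ps)
alt   = lambda *ps: "(?:" + "|".join(ps) + ")"
many1 = lambda p: "(?:" + p + ")+"
digit = r"\d"

# "one or more digits, a dot, one or more digits" -- a decimal number
decimal = seq(many1(digit), lit("."), many1(digit))

print(bool(re.fullmatch(decimal, "3.14")))   # True
print(bool(re.fullmatch(decimal, "3.")))     # False
```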
Config files and the stuff accepted therein are small languages as well.
What are the alternatives to making little languages?
I think your points re: sql are true of straightforward crud apps, but not true at all of an analytics app. In those cases, the sql is often _very_ complex, and while the results of a query may be eventually mapped into a struct or something, the query generation is rarely a simple mapping of properties in an object to select columns.
> There are a billion dialects and every single one of them is basically unreadable, incomplete or just weird.
Sure, but in the vast majority of cases, one only has to deal with at most 3 dialects, and there's a good chance you won't be hitting the corner cases that make each dialect significantly different.
gawk compiled to WebAssembly would seem to fit the bill -- it just shifts gawk from an "external" DSL to an "internal" one. The orthogonality allows use of all the modern interface trappings without any retooling and without breaking the rules of the gawk language. It makes gawk one module in a group of customized modules, forming a general-purpose program out of DSLs/little languages.
> Hard to onboard new hires, code breaks because of lack of understanding of dependencies, and code changes become harder to manage.
If on-boarding requires knowing more languages that are less widely known or used then it will be harder.
Yeah, SQL is not a little language anymore.
It started as one, but a lot has been added to it, and the SQL flavors for Oracle or Postgres are anything but tiny. Windowing, nesting, JSON handling...
I think the author is kinda proving with this that a successful little language does not stay little, and hence little languages are not the future.
And don't get me started on DSL in general. Just lookup my username and "DSL" on hackernews for endless rambling.
Yes, SQL is bigger than it used to be but it is still a domain specific language, which is what the article is actually about.
The problem is that your DSL has to be understood by other people, including future you. Programming tasks are vast, combinatorially explosive state spaces full of weird potential interactions between features. Once you get above the complexity and universal familiarity of say, arithmetic, it's difficult for others to understand what's going on just by looking at 1-2 live examples. You have to heavily invest in proper docs and tooling (if your language doesn't provide it for free). By the time you've completed that your "little language" usually isn't such a little effort anymore.
If you don't, you've just made the next CMake. Congrats you monster.
That's why we have languages with functions now, because people didn't want to manually do a register dance in assembly.
That's why we have name spaces, because naming conventions only take you so far.
That's why we have map and filter (or equivalent) because that's what most loops are doing anyway.
Generation after generation, we discover that we all use common abstractions. We name them design patterns, then we integrate them in the language, and now they are primitives.
And the languages grow, grow, bigger and bigger. But they are better.
More productive. Less error prone. And code bases become standardized, simple problems have known solutions, jumping to a new project doesn't mean relearning the entire paradigm of the system in place.
Small languages either become big, or are replaced by things that are big, for the same reason most people prefer a car to a horse to go shopping.
Not that horse riding will totally disappear, but it will stay in its optimal niche; it is not "the future".
It's been around a long time. It's not general purpose. It's considered the best option there is if your setup allows you to use it.
So why are shell languages still around? Why are they not replaced by C#, C++, Java or another big (=general purpose) language?
I find your horse->car comparison more akin to the sh->bash->zsh transition. Zsh is not as small as sh, but it's still in the small league if you ask me.
Small does not mean w/o functions, without NS, without map/filter: it means "not general purpose".
1.) Performance. For example a run-time for a user-friendly little language that maps HTTP requests to SQL queries can be much faster than a language that does the same thing by plugging user-friendly APIs together. A custom run-time can parse an HTTP request string directly into SQLite op code while the JS developer is writing glue code that takes orders of magnitude more memory and CPU time.
2.) Static analysis. This means tools that are better at finding bugs, finding optimizations, visualizing structure, etc.
That's an important point. Maybe as an academic someone is more inclined in learning new languages for the sake of intellectual interest, but on the engineering side, having uniformity of language is a big plus.
I wonder, by the way, whether the author misses the point by not considering that all the code a programmer writes is translated into the machine language / byte code of elementary instructions: those instructions are the primitive language. The programmer uses a more elevated language because he wants something more expressive.
Which people though? If you make a DSL that non programmers in your organization use, I'm sure they will appreciate not having to learn the intricacies of Rust or whatever's in fashion this week.
but, i think you're probably right.
I certainly rather use CMake, even if I need an open book on the side, than Gradle, Blaze, autotools, yet another Python based build tool,....
However, most of the times, since my use of C++ is related to personal projects, IDE project files are more than enough, they have been serving me well for the last 30 years.
That is like learning the ins and outs of some SaaS application: you switch jobs, and that specific experience is not useful to you at all.
General purpose language on the other hand is useful even if you move from one country to another and take job in different business niche.
As a developer there is no upside for me in spending my time diving into some DSL I won't use in my next job.
As a business person there is no upside for me to spend my time learning DSL or specific application interface in and out that I won't use in next job or in different position.
Sometimes a DSL is really the right solution.
Doing these things in vanilla syntax of general purpose programing languages is not exactly great.
- can you even design it properly?
- is it tested?
- is it debuggable?
- how does it integrate with the rest of your program(s)? with the rest of your system(s)?
- what's the performance, and does it matter?
- is it documented?
- who is going to maintain it 1 year from now? 5 years from now?
But little languages could be a nice interface for the non- or semi-programming tasks. Do you really want your domain experts to fiddle with the core of your application or do you want your programmers to do that? A little language could be a great interface to encode specific business rules and domain logic.
The author gives SQL as an example of a little language and we do indeed already provide SQL interfaces to analysts and let them do their thing.
The Curse Of Almost: Your tool is great, it's almost perfect... except for that one little thing it can't do, which your users need to do, which, therefore, leads to masses of ugly hacks unless you provide access to an escape hatch where sufficiently motivated experts can drop down to a real language which doesn't have your DSL's limitations and get the job done.
It's the Curse Of Almost because, if it were too much worse of a fit for the problem, nobody would even think of using it to solve that problem. Getting someone 90% there and crapping out puts your users in a more awkward position, especially if they feel they've invested effort in whatever tool they have.
An example is Talend versus CSV: Talend is an ETL Solution which Extracts data from some source, Transforms it according to a graphical DAG of ideally stateless components, and Loads it into some other storage. It's also a happy, friendly GUI on top of Java, which is nice, because the Real World isn't kind to happy, friendly GUI solutions which expect CSV is going to conform to any of your syntax rules or other misguided preconceptions about files having structure. So, when you have to run a Talend pipeline on vaguely-comma-delimited text files which may once have been machine-readable, you can make your own component which is literally just a block of Java code to parse the file using the Zerg Rush Of Ad-Hoc Rules Technique, an oft-overlooked method for designing parsers. You can also use that kind of thing to make components which are tasteless enough to demand state variables other than the stereotyped kind Talend itself provides.
I also don’t think that anyone has ever suggested that making a custom language is a small endeavor.
Whatever the future of programming languages it will definitively not be popular at first and negative criticisms will be the top-rated commentary. And when the new paradigm comes I can almost guarantee that the majority of the HN crowd will be too old and set in its ways to make the transition. Why would the future be any different than the past with regards to paradigms shifts?
You might object and say, "but variable assignment and addition, that's a big language thing." It isn't, though; it's just an infix expression. And infix didn't pop out of nowhere; it had to be invented as part of the gradual creep upwards from machine level "coding" into a more abstract semantics. Infix parsers are small, and while the complete language is larger, what it's presenting is infix-compatible. "Regex" is the same way: there is a general definition of regular expressions, and then there are some common variants of regex, the implemented semantics.
The boundary between "the language needs its own compiler and runtime support" and "the language exists as an API call you pass a string into, which compiles into data structures visible to the host language" is a fluid one. And the most reasonable way of making little languages involves seeing the pattern you're making in your host and "harvesting" it. In the previous eras, there were severe performance penalties to trying to bootstrap in this way, and so generating a binary was essential to success. But nowadays, it's another form of glue overhead. If you define syntactic boundaries on your glue, it actually becomes easier to deal with over time.
Documentation-wise, it's the same: if the language is sufficiently small, it feels like documentation for a library, not a language.
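`re.compile` is the canonical example of that boundary: you pass a string, you get back a host-visible object. "Harvesting" your own pattern works the same way. A minimal sketch of a little path-matching language that compiles a string into plain host data (the syntax and names are invented):

```python
# A "little language" that is just a string -> data-structure compiler.
# Invented syntax: segments separated by '/', ':name' binds a variable.

def compile_route(pattern: str):
    """Compile a pattern like '/users/:id/posts' into a matcher closure."""
    segs = pattern.strip("/").split("/")

    def match(path: str):
        parts = path.strip("/").split("/")
        if len(parts) != len(segs):
            return None
        bound = {}
        for seg, part in zip(segs, parts):
            if seg.startswith(":"):
                bound[seg[1:]] = part      # variable segment
            elif seg != part:
                return None                # literal mismatch
        return bound

    return match

route = compile_route("/users/:id/posts")
print(route("/users/42/posts"))   # {'id': '42'}
print(route("/users/42"))         # None
```

The whole "compiler" is a dozen lines, and its output is an ordinary closure over ordinary lists: documentation for it really does read like documentation for a library.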
There was a cool "word-processor-like" demo (Franken?) created with a small number of lines; it should have been a huge success in the FOSS world, no? Well, no: nobody managed to make it work.
And then you want to modularize your code because it becomes too big, and you want to create libraries for code reuse.
You end up wanting static typing for the usual reasons, which eventually leads to needing parametric types and recursively defined types, and the type system becoming Turing-complete as well.
Or you keep working around the limitations of the little language, writing code generators and wrapping it in general-purpose language APIs.
I suppose all this leads me to the suspicion that little languages fill in for shortcomings in big languages. Big languages can absorb the things that work, thus negating the need for small languages in that sphere.
Although how far can that go? Can we keep making ever bigger languages? Or at some point does it crumble under its own weight?
* Add a builtin command [1] that takes a string or filename and calls the interpreter with any additional data you want to pass.
* Add a flow control command [1] that passes the inline block to the interpreter of your choice. You'd probably have to override cmFunctionBlocker as well for this.
Note that this can't fix the deep design issues in CMake like the insane string representation.
And no, I'm definitely not in therapy from CMake-induced PTSD.
[1] https://github.com/Kitware/CMake/blob/master/Source/cmComman...
Generating data files via templating languages was never a good idea.
Using data languages as, essentially, code is a similarly bad idea.
Ansible does both at once.
“The idea is that as you start to find patterns in your application, you can encode them in a little language—this language would then allow you to express these patterns in a more compact manner than would be possible by other means of abstraction. Not only could this buck the trend of ever-growing applications, it would actually allow the code base to shrink during the course of development!”
Functions, frameworks, little languages. It’s all abstractions on top of abstractions. You are shifting the knowledge of the abstraction for the more fundamental knowledge underneath that does the actual work.
You end up just sweeping the codebase growth under some other layer’s rug and blissfully forget about the woes of future maintainers. The code is still there, abstracted and exposed by the “little language”. Hiding this behind a cute moniker doesn’t seduce.
This isn’t the future of programming. This is already programming.
I would imagine that most of the value comes from being able to "refactor" thought patterns to match the best way to cleave the domain into composable concepts -- and it seems like we do this all the time (and in all programming languages?).
This is called a shallow embedding in the Haskell world.
Deep Embedding is more like writing a full blown interpreter.
Then there is tagless final (Oleg Kiselyov) - it feels shallow but is more flexible as simple library functions and it is optimisable like deep embedded DSLs.
With a library you just need to learn the new semantics. I much prefer this to a DSL.
Plus a library has all the debug tools the language has. DSLs usually don't have any debug tools except for printing when you are lucky
Whether or not you agree this philosophy is a good one, and whether or not you like Lisp specifically, I think we can all agree that macros (in whichever language) are a much better way to do it than creating a bunch of tiny languages from scratch. I was surprised not to see the word "macro" appear in the article at all.
A classic example comes from Peter Norvig's "Paradigms of Artificial Intelligence Programming", wherein he defines a subset of English grammar as a data structure [1]:
    '((sentence -> (noun-phrase verb-phrase))
      (noun-phrase -> (Article Noun))
      (verb-phrase -> (Verb noun-phrase))
      (Article -> the a)
      (Noun -> man ball woman table)
      (Verb -> hit took saw liked))
He then goes on to define a function "generate" that uses the above to create simplistic English sentences. Additional rules can be added by a non-programmer so long as they understand how their domain logic has been mapped to Lisp.
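The same generator fits in a few lines of any host language. A Python transcription of the idea (the dict mirrors the Lisp data structure above; the function shape follows Norvig's, though this port is mine):

```python
import random

# Norvig's toy grammar, transcribed from the Lisp rules above.
GRAMMAR = {
    "sentence":    [["noun-phrase", "verb-phrase"]],
    "noun-phrase": [["Article", "Noun"]],
    "verb-phrase": [["Verb", "noun-phrase"]],
    "Article":     [["the"], ["a"]],
    "Noun":        [["man"], ["ball"], ["woman"], ["table"]],
    "Verb":        [["hit"], ["took"], ["saw"], ["liked"]],
}

def generate(symbol):
    """Expand a grammar symbol into a flat list of words."""
    if symbol not in GRAMMAR:                    # terminal word
        return [symbol]
    expansion = random.choice(GRAMMAR[symbol])   # pick one right-hand side
    return [word for part in expansion for word in generate(part)]

print(" ".join(generate("sentence")))   # e.g. "the woman saw a ball"
```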
[1] https://github.com/norvig/paip-lisp/blob/main/docs/chapter2....
Yes, but over time very specific problems become bigger/different problems which the little language isn't ideal for, the original developers move on leaving someone new to figure out the problem and language which is probably poorly documented and very brittle. Application developers probably aren't suited to writing and maintaining language code.
The only caveat is an external system provided with its own language -- like RDB/SQL -- which is proven and well maintained. But it's hard to call SQL a little language.
Unless they are using Kubernetes and even then, you shall find a very complicated bunch of languages:
- shell scripts
- Dockerfiles
- Kubernetes YAML
- Makefiles
- Bazel
- Ansible
- python scripts
- Jenkins XML
- Groovy scripts
- Ruby scripts
- CloudFormation
- Terraform
- Fabric or other deployment scripts
It's very hard to fit together and understand from a high level.
The last thing they were working on at my previous company was a YAML format for defining a server, used to walk through the organisational structure of the company to manage computer systems.
Some people mentioned LISP in this comment thread. For me LISP is an intermediate language, I would never want to build a large system in LISP. It's not how I think about computation.
I wouldn't want to maintain or work on a large Clojure codebase written by other people. I've done that three times.
For reference, I think Python is easy to write and read and understand.
I wrote a simple toy multithreaded interpreter, and I've written part of a compiler that does codegen to target the imaginary interpreter. It's basic, but my AST is a tree that could be represented in LISP. I use a handwritten recursive descent Pratt parser. The language looks similar to JavaScript.
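For the curious, the core of a Pratt parser really is tiny. A sketch for infix arithmetic producing a Lisp-ish nested-tuple AST (simplified far below what a real compiler front end needs):

```python
import re

# Minimal Pratt-style expression parser: binding powers drive precedence.
def tokenize(src):
    return re.findall(r"\d+|[+*()-]|/", src)

BINDING = {"+": 10, "-": 10, "*": 20, "/": 20}

def parse(tokens, min_bp=0):
    tok = tokens.pop(0)
    if tok == "(":
        lhs = parse(tokens, 0)
        tokens.pop(0)                        # consume ')'
    else:
        lhs = int(tok)
    while tokens and tokens[0] in BINDING and BINDING[tokens[0]] >= min_bp:
        op = tokens.pop(0)
        rhs = parse(tokens, BINDING[op] + 1)  # +1 makes it left-associative
        lhs = (op, lhs, rhs)                  # nested tuples, post-order friendly
    return lhs

print(parse(tokenize("1 + 2 * 3")))   # ('+', 1, ('*', 2, 3))
```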
I know it can fixate your thinking if you think of it too much in this manner, but I think of modern computers as turing machines. They loop or iterate through over memory addresses which are data or instructions and execute them.
That said, my perspective is not traditional. I design and try implementing programming languages. I am interested in the structure of problems and code, asynchrony, coroutines, parallelism and multithreading more than anything else. Even more than type systems.
I think the expression problem is a huge problem that doesn't have good solutions for managing complexity.
I find other people's LISP code to be difficult to read whereas I can understand a Python, Java algorithm.
What am I trying to say? The structure of the program in the developer and compiler's head is different from the instructions actually executed by the computer. LISP is nearer to the instructions executed by the computer than what exists in my mind. In my mind exists relationships, ideas more complicated and not structured in post order traversal. A post order traversal of LISP is the codegen.
A more recent treatment is Matthew Butterick's book: https://beautifulracket.com/
It doesn't have to be a big standalone DSL with a separate compiler or preprocessor. It can also be an embedded little language, like when you sprinkle HTML templates throughout your normal general-purpose language, and as only a syntax extension: https://docs.racket-lang.org/html-template/
(Aside: I'm seeing tasteful Racket and Scheme influences in Rust, even though they're very-very different languages. I'm hoping to contribute a little more influences.)
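The same sprinkling works even without a syntax extension: represent the markup as host data and serialize it. A bare-bones sketch of the idea in Python (this is not the Racket library, just the pattern):

```python
from html import escape

# Represent HTML as nested tuples: (tag, attrs, *children).
# It's plain host data, so the general-purpose language's tooling sees it all.
def render(node):
    if isinstance(node, str):
        return escape(node)
    tag, attrs, *children = node
    attr_s = "".join(f' {k}="{escape(v)}"' for k, v in attrs.items())
    inner = "".join(render(c) for c in children)
    return f"<{tag}{attr_s}>{inner}</{tag}>"

page = ("p", {"class": "note"}, "hello ", ("b", {}, "world"))
print(render(page))   # <p class="note">hello <b>world</b></p>
```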
What is great about this approach for an individual is that it requires you to tighten your ideas. When you have to implement all of the functionality in a DSL, you really start thinking about what you truly need. A big language nudges you towards using all of its features, while a small language challenges you to consider what is truly essential.
Of course, DSLs always run the risk of being write-only and/or comprehensible only to the original author. Like any powerful tool, DSLs should be used judiciously and responsibly. Often that isn't the case, in part because I don't think the tooling for writing DSLs is generally very good. But I am betting that new tools that make DSL writing easy will have a profound effect on software development.
In fact the entire article seems to boil down to "DSLs are the future", which I'm sure I've seen articles about back when Ruby on Rails was dominating web technologies, Cucumber (and its various ports) created "BDD" testing fad and DevOps started gaining traction on top of various "Ruby DSLs" used as configuration formats.
I don't think DSLs are going to go away any time soon. But there is a trade-off between domain-specific "little languages" and general-purpose programming languages (or "DSLs" that are actually subsets of the latter). It can be fun to work with one little language; it's not so fun to work with dozens of them, each with different rules you have to memorize, instead of just being able to use the same language for everything. (In truth, this was the source of the Ruby DSL craze: developers were already using Ruby on Rails.)
for one, the author says knuth's program written in WEB was 10 pages long, discounting the fact that those 10 pages are HEAVILY annotated.
my other point is:
tr has 1917 LOC,
sort has 4856,
uniq has 663,
and sed is in its own package at around 10 MB,
all including comments and docs.
it's fine and good that you can use composition with shell utilities, but come on, write that example program in C99 and you'll be a not very happy coder at all. in general i find the comparison rather rude: Knuth was supposed to show *his* programming language WEB, and as a "critique" McIlroy farts out a shell script like "lmao first".
indeed, you do not often need to count word frequencies.
but what was this article supposed to be really about? software engineering 101 aka dont-reinvent-the-wheel/DRY?
or perhaps literate programming?
As you say, Knuth was asked to demonstrate his literate programming... In some ways this is a direct request for the non-pithy, articulated, first principles answer. I would more say Knuth was set up than that he was framed, but tomato-tomato. :)
It underlines all the points already made here about Lisp, and also includes a CL version of the original code. 67 lines including comments and empty lines. Endless possibilities.
'Official' small languages only make sense when the user base is large; this is why SQL exists but a mainstream language for manipulating Maxwell's equations does not.
For (a), a simple example is an application that reads information from a DAQ, stores it in a database in a compressed format, and later sends the data to a printer. We have a DSL that can easily implement the database, but we need to translate the data from the DAQ into a compatible input, so that requires a different language. Furthermore, the printer requires PostScript, so we need some other language or tool for that. Then we need to figure out how to glue it all together. In some cases, it becomes tempting to try to ‘hack’ the DSL into doing things it was never designed for.
Or we could just use a general purpose language and libraries.
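As a sketch of that alternative, here is a hypothetical stdlib-only Python version of the whole pipeline; the data format, the table name, and the trivial PostScript output are all invented for illustration:

```python
# Stdlib-only sketch of the 'general purpose language + libraries' glue:
# fake DAQ samples -> compressed rows in SQLite -> a trivial PostScript page.
# All names and the data format are made up for illustration.
import json, sqlite3, zlib

samples = [{"channel": 0, "value": v} for v in (1.5, 2.25, 3.0)]  # stand-in DAQ read

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE readings (blob BLOB)")
db.execute("INSERT INTO readings VALUES (?)",
           (zlib.compress(json.dumps(samples).encode()),))        # compressed storage

blob, = db.execute("SELECT blob FROM readings").fetchone()
restored = json.loads(zlib.decompress(blob))

# 'Printer' step: emit a minimal PostScript page listing the values.
ps = "%!PS\n" + "\n".join(
    f"72 {700 - 20 * i} moveto ({s['value']}) show" for i, s in enumerate(restored)
) + "\nshowpage\n"
```

One language, three libraries, no custom glue formats between the stages.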
Then again so is Lisp more or less.
I suspect that the next big thing in new programming abstractions will instead be AI code generators: they'll keep getting better, the tooling around them will improve, and they'll become the tool of choice for high-level needs.
While these AI tools may not be that accurate today, there is vast potential in improving the models and tooling. In the IDE it could be more like: you describe what you want, it generates some code and tests, runs the tests, and maybe a visual time-traveling debugger lets you see if it's doing the right thing.
Where do we draw the line on "enough DSLs" for example? And what happens to the gains from using several DSLs in tandem as opposed to a high-level language with libraries that accomplish the same thing?
It just has map/list access & creation, string manipulation & concatenation, basic arithmetic/comparison, lambdas as first-class values, and function calls. There is no way to do I/O; it's just data in, data out.
I needed some non-programmer friends to collaborate on the tiny data transforms, and thanks to its limited features it is quite easy to write a 1:1 text <> visual editor for the language, allowing you to go from visual to text or text to visual. The exercise in itself is interesting and worth it.
I think I wouldn't do that in a commercial context though.
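For flavor, a minimal interpreter for a language with roughly that feature set; this is an invented toy, not the commenter's actual language, and it represents programs as nested tuples rather than parsing text (strings are treated as variable names in this sketch):

```python
# Minimal interpreter for a tiny data-in/data-out expression language:
# arithmetic/comparison, map/list access, concatenation, first-class lambdas,
# function calls, and deliberately no I/O primitives at all.

def evaluate(expr, env):
    if isinstance(expr, str):                      # strings are variable names here
        return env[expr]
    if not isinstance(expr, tuple):                # literal number/list/dict
        return expr
    op, *args = expr
    if op == "lambda":                             # ('lambda', ('x',), body)
        params, body = args
        return lambda *vals: evaluate(body, {**env, **dict(zip(params, vals))})
    if op == "call":                               # ('call', fn, arg...)
        fn, *fargs = (evaluate(a, env) for a in args)
        return fn(*fargs)
    vals = [evaluate(a, env) for a in args]
    ops = {"+": lambda a, b: a + b, "get": lambda m, k: m[k],
           "concat": lambda a, b: a + b, "<": lambda a, b: a < b}
    return ops[op](*vals)

double = ("lambda", ("x",), ("+", "x", "x"))
result = evaluate(("call", double, 21), {})        # -> 42
```

Because programs are plain nested data and every construct is an expression, a 1:1 mapping to a visual editor is straightforward: each tuple becomes a node.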
You basically write a dumb, easy parser and AST interpreter for your language, and it will magically turn into a JIT-compiled dynamic language with state-of-the-art GCs and better performance than what you could likely come up with yourself. And the best thing is that it really unifies the computing model: you can pass a Python object to some JS library, effectively giving you every library ever written for any (implemented) language, which is the real productivity booster.
The problem is really: does the language the app uses work for the target audience that will be writing the DSL? If it does (Ruby makes a pretty decent one), job done. If it doesn't, we end up in the mess of making toy languages (usually because developers want to do something fun, and writing a new small language can be pretty fun) or plugging something like Lua into it.
> There are a few other names for these languages: Domain-specific languages (DSL:s), problem-oriented languages, etc. However, I like the term “little languages”, partially because the term “DSL” has become overloaded to mean anything from a library with a fluent interface to a full-blown query language like SQL, but also because “little languages” emphasizes their diminutive nature.
Ahem, so is SQL diminutive or not? Because SQL is NOT diminutive. SQL is Turing complete.
Another important consideration is that these languages are typically best when declarative as possible; if you can avoid Turing-completeness and stick entirely to something representing static data, then that's the ideal.
I like the idea of abstraction but in my mind it is very easy to have the power of a "little language" inside an all purpose language by using a package. E.g. SQLAlchemy or Pulumi as the alternative to the little languages of SQL and TF.
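A tiny sketch of the fluent-interface style such packages provide; the class and method names here are invented for illustration and are not SQLAlchemy's actual API:

```python
# Invented fluent query builder illustrating the 'little language inside a
# package' idea; NOT SQLAlchemy's real API, just the general shape.

class Query:
    def __init__(self, table):
        self.table, self.conds, self.order = table, [], None

    def where(self, cond):                 # each method returns self so calls chain
        self.conds.append(cond)
        return self

    def order_by(self, column):
        self.order = column
        return self

    def to_sql(self):                      # the 'compile' step down to SQL text
        sql = f"SELECT * FROM {self.table}"
        if self.conds:
            sql += " WHERE " + " AND ".join(self.conds)
        if self.order:
            sql += f" ORDER BY {self.order}"
        return sql

q = Query("users").where("age > 21").where("active = 1").order_by("name")
# q.to_sql() -> "SELECT * FROM users WHERE age > 21 AND active = 1 ORDER BY name"
```

The chained calls read like a little language, but it's all ordinary host-language code: you keep your editor, debugger, and type checker.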
Constructing Language Processors for Little Languages 1st Edition by Randy M. Kaplan [1].
It's no longer in print, but used copies go for $6 or less.
It's dated, from 1994, but it is a fun enjoyable discussion on the benefit of tiny specific languages. It also has a nice tutorial on the use of lex and yacc.
DSLs defined in a type/schema system atop JSON/YAML end up being far easier to write tools around than DSLs which require a custom parser (e.g. Dockerfile.)
That said, there are definitely a subset of languages like JSONPath that would not work written out as an AST.
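A small sketch of the first approach: the "program" is plain JSON, so any JSON tooling can load, validate, or pretty-print it without a custom parser. All field and operation names here are invented:

```python
# Sketch of a DSL carried as plain JSON (all field names invented): because the
# host format is JSON, validation and tooling need no custom parser.
import json

program = json.loads("""
[
  {"op": "filter", "field": "status", "equals": "active"},
  {"op": "pick",   "fields": ["name"]}
]
""")

def run(steps, rows):
    for step in steps:
        if step["op"] == "filter":
            rows = [r for r in rows if r[step["field"]] == step["equals"]]
        elif step["op"] == "pick":
            rows = [{k: r[k] for k in step["fields"]} for r in rows]
        else:
            raise ValueError(f"unknown op: {step['op']}")   # trivially lintable
    return rows

data = [{"name": "ada", "status": "active"}, {"name": "bob", "status": "idle"}]
# run(program, data) -> [{'name': 'ada'}]
```

A schema checker, formatter, or diff tool for this DSL is just a JSON tool, which is the tooling advantage described above.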
- more difficulty onboarding
- more difficulty adding new features
- more challenges with best practices/linting/code reviews
- different runtime behaviour between different languages
This all adds up to more complexity.
New languages are often really neat, and enjoyable for their own sake, but I'm not all that interested in maintaining a large swath of different languages for different tools.
The creator (BDFL?) of Elm, Evan, specifically avoids making it general purpose.
Hello ANTLR..
The web is designed using a variety of little languages. It could just be JavaScript with an API but it's not. And it's probably best as it is.
Now I have a folder of little one-offs and REPL scraps. A triumph of tactics over strategy that defies passing on to anyone who didn't author it.
Looking at the JVM and JSON, I wonder to what degree languages contribute some piece of goodness or an idea toward some final Tool To Rule Them All...
The only "little" languages I can think of that I'd reasonably ask people to use at work are lua, make(~), awk, and (ba)sh.
Why use many languages when you can do the same work in one? Even if the "big" language isn't specifically made for a certain job, doing that job in a little language requires time for the programmer to learn the little language.
The syntax may be different, so converting from big languages to little languages takes time. This is why this hasn't happened yet: people are too lazy to learn new languages, so they just learn the "big" languages that can complete all the tasks they require.
It seems to be making a comeback though! I prefer it too. It's not really more descriptive than DSL but it's not much less, and is less jargony and just cuter.
The AWK Programming Language, 1988, Chapter 6: Little Languages
https://archive.org/details/pdfy-MgN0H1joIoDVoIC7/page/n11/m...
And classic Knuth argument, of course. It's so hard to count words. General purpose language:
>>> from collections import Counter
>>> import re
>>> Counter(re.findall(r'\w+', 'boo foo boo +dfd zii')).most_common(1)
[('boo', 2)]

The OP even uses "Domain Specific Language" in the article.
"Racket is a Lisp dialect that’s specifically designed for creating new languages (a technique sometimes referred to as language-oriented programming). I haven’t had time to play around much with Racket myself, but it looks like a very suitable tool for creating 'little languages'."

DSLs are now little languages.