Universal configuration language

(github.com)

35 points, peterbotond, 11y ago, 28 comments

28 comments

lifthrasiir11y ago

Any sufficiently complicated configuration language contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Common Lisp. [1]

[1] A variation of https://en.wikipedia.org/wiki/Greenspun%27s_tenth_rule

CmonDev11y ago

Isn't it ironic how they are used more than Lisp though? Maybe they don't implement that other, worse part?

usrusr11y ago

Yet another tool to parse one large string into a map of smaller strings. Trouble with configuration files is rarely caused by insufficient syntax features, but insufficient schema validation seems to be a consistent source of wtf-moments: a mistyped key here, a duplicate key there, that's what is stealing our time (duplicate key: first definition wins? last definition wins? both get concatenated? doesn't matter, if I was aware of having two of them I would have fixed it in a second).

What I want from a configuration helper library is nothing less than an internal DSL for specifying the typed structure of allowed/expected keys, together with their default values and a short "what is this" description available at runtime. This would be enough to generate nice empty configuration templates and create warnings for unexpected keys (be it from typos or from unexpected duplicates).

Fancy syntax features for the configuration files themselves would be only secondary niceties. And candidates for "stupid" preprocessors ("stupid" in that they would not have to know about the application's configuration schema).
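
A minimal sketch of the kind of schema helper described above, in Python. The names `Key`, `SCHEMA`, `template` and `check` are hypothetical and not taken from any existing library, and the parsed configuration is assumed to be a flat dict.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Key:
    """One allowed configuration key: its type, default and description."""
    type: type
    default: Any
    doc: str


# Hypothetical schema, for illustration only.
SCHEMA = {
    "log_dir": Key(str, "/var/log/app", "where to write logs"),
    "workers": Key(int, 4, "number of worker processes"),
}


def template(schema):
    """Render an empty configuration template with defaults and descriptions."""
    lines = []
    for name, key in schema.items():
        lines.append(f"# {key.doc} ({key.type.__name__})")
        lines.append(f"{name} = {key.default!r}")
    return "\n".join(lines)


def check(schema, config):
    """Return (values with defaults applied, list of warnings)."""
    warnings = []
    values = {name: key.default for name, key in schema.items()}
    for name, value in config.items():
        if name not in schema:
            # Typos end up here instead of being silently ignored.
            warnings.append(f"unexpected key: {name}")
        elif not isinstance(value, schema[name].type):
            warnings.append(f"{name}: expected {schema[name].type.__name__}")
        else:
            values[name] = value
    return values, warnings
```

With this, a mistyped `wokers` next to a valid `workers` would come back as a warning rather than vanish, and `template(SCHEMA)` gives a commented starting file.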

cebka11y ago

UCL has support for json schema (http://json-schema.org/) so you can achieve some cool things with it. However, it is apparently not a full DSL.

moe11y ago

This looks terrible. An "almost-JSON" with... macros? Seriously?

What problem is this looking to solve?

Why not just use TOML or YAML?

espadrine11y ago

JSON offers data structures that everybody understands, so having a configuration language that has exactly those is what I want in a configuration language.

TOML isn't it (some JSON can't be put in TOML by design). YAML is definitely easier to edit than JSON, but just like XML its flaw is in its complexity (both in syntax and in specialized structures, such as references or types).

That's why I made dotset (http://espadrine.github.io/dotset/). It doesn't have macros™. And it's a YAML subset.

vidarh11y ago

YAML is horrible enough to make me avoid tools when I have other alternatives of functional parity.

Semantic whitespace is the second most horrible syntax "feature" after lisp-y parentheses. I even prefer XML over YAML.

cies11y ago

Also came here to mention TOML. It's still in development, but already it makes (IMHO) a better configuration language.

andridk11y ago

Currently, I prefer [toml](https://github.com/toml-lang/toml) for configuration. Both languages can be exported to JSON.

fibo11y ago

TOML is nice, but also YAML is Great! Both coming from Perl community btw

andrewaylett11y ago

I am, I'm sorry to say, unconvinced. I'm not keen on 'magic' behaviours like auto-converting of values to arrays -- while the behaviour of the document may be well specified globally, it's nice to be able to see what something's going to do based on immediate (or even no) context.

What I'm very much liking at the moment is the way Dropwizard uses Jackson's YAML (and by extension, JSON) parsing for configuration -- your configuration file maps 1-1 to a class in your application, and Jackson is configured to fail if you're either missing a field that's not marked as optional or you've got extra fields in your YAML that don't map to anything. Type-safety FTW!

On the subject of wanting to refer to earlier bits of configuration: if you need to use one value as part of another, your configuration system might not be exposing the right level of detail. Of course, you might not be able to change that.
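
This is not Dropwizard or Jackson code, just a rough Python analogue of the same idea: the configuration maps 1-1 onto a class, and loading fails on missing required fields or unknown extra ones. `ServerConfig` and `load_strict` are hypothetical names, and this sketch does not check value types.

```python
import dataclasses
import json


@dataclasses.dataclass
class ServerConfig:
    host: str          # required: no default
    port: int = 8080   # optional: has a default


def load_strict(cls, text):
    """Build cls from a JSON object, rejecting missing or extra fields."""
    data = json.loads(text)
    fields = {f.name: f for f in dataclasses.fields(cls)}

    extra = sorted(set(data) - set(fields))
    if extra:
        raise ValueError(f"unknown fields: {extra}")

    missing = sorted(
        name
        for name, f in fields.items()
        if name not in data
        and f.default is dataclasses.MISSING
        and f.default_factory is dataclasses.MISSING
    )
    if missing:
        raise ValueError(f"missing required fields: {missing}")

    return cls(**data)
```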

clarkm11y ago

This has some good ideas, but it's very similar to HOCON. I think it's more worthwhile to focus on improving the HOCON spec rather than building yet another configuration json superset.

https://github.com/typesafehub/config/blob/master/HOCON.md

bshimmin11y ago

The "Automatic arrays creation" bit rather worries me - it strikes me that very often if you have non-unique keys, it's because you've made a mistake, so automatically converting that object into an array would probably result in misconfiguration.
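
For comparison, a small sketch of the opposite policy: treating a repeated key as an error. This is plain JSON with Python's standard `json` module, not UCL; by default `json.loads` silently keeps the last value, and `object_pairs_hook` lets you override that. The function names are made up for the example.

```python
import json


def reject_duplicates(pairs):
    """object_pairs_hook that raises on a repeated key instead of merging."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError(f"duplicate key: {key!r}")
        seen[key] = value
    return seen


def load_no_duplicates(text):
    """Parse JSON, failing if any object repeats a key."""
    return json.loads(text, object_pairs_hook=reject_duplicates)
```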

_random_11y ago

Looks interesting, however the only reason JSON is generally used IMHO is because it's light and compatible. There is no reason to make the new language look like JS (a lame legacy language). It could look like YAML but still support JSON for legacy stuff.

moomin11y ago

Doesn't Lisp already exist?

krapp11y ago

A "configuration language" by definition (in my opinion) should not be Turing complete. A configuration language is for storing state - key/value pairs or simple structures of primitive types, nothing more (or as little more as necessary), nothing less.

Once you introduce enough complexity (branching, recursion) you've just created another application with global variables for the main application to access - a capability that will probably almost never be desired (see XML), and the ability to unserialize into functions or otherwise executable code, which will also almost never be desired (XML, YAML, probably a lot of things.)

chriswarbo11y ago

> A "configuration language" by definition (in my opinion) should not be Turing complete.

One of the key insights of LISP is that s-expressions are a simple, universal format. Yes, they can be used for code, but they can also be used for static configuration data. In fact, LISP originally used s-expressions solely for data; code was meant to be written with m-expressions ( http://en.wikipedia.org/wiki/M-expression ). Once `eval` was implemented, s-expressions could be used for code and data, so the idea of m-expressions was abandoned.

> A configuration language is for storing state - key/value pairs or simple structures of primitive types, nothing more (or as little more as necessary), nothing less.

The trouble with "universal" formats like this is that there's no universal agreement on what's a "primitive type" (what happens when I write `0.1`? Are booleans primitive, or should we use `0` and `1`?) and what's a "simple structure" (can I make a circular list?). That in itself wouldn't be too bad, but these languages tend to hard-code special syntax to particular types and structures, so any types or structures we may want to add must either be second-class citizens, or would require hacking the parser.
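
To make the `0.1` question concrete, here is how it plays out in Python's standard `json` module, used purely as an illustration: the same text yields a binary float by default, or an exact decimal if the caller asks for one.

```python
import json
from decimal import Decimal

text = '{"rate": 0.1}'

# Default: 0.1 becomes a binary floating-point number.
as_float = json.loads(text)

# Same document, but the caller decides that numbers with a
# fractional part are exact decimals.
as_decimal = json.loads(text, parse_float=Decimal)
```

The two results behave differently in arithmetic, so what `0.1` means is decided by the consumer, not by the file.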

Houshalter11y ago

The problem is there is a tendency for configuration languages to become Turing complete as they add more and more features. It would be preferable if they just started with something Turing complete. See here:

> Most projects seem to start out small with a few config items like where to write logs, where to look for data, user names and passwords, etc. But then they start to grow: features start to be able to be turned on or off, the timings and order of operations start to be controlled, and, inevitably, someone wants to start adding logic to it (e.g. use 10 if the machine is X and 15 if the machine is Y). At a certain point the config file becomes a domain specific language, and a poorly written one at that.

https://stackoverflow.com/questions/648246/at-what-point-doe...

I like the idea of using Lua as a config language because it's pretty simple, lightweight, and can be sandboxed easily.

sparkie11y ago

The problem is when you want to base the "value" part of any of these key/value pairs off some other value, you can't compute any new value - and you wind up duplicating variables in the configuration files (or worse, over several configuration files). This leads to someone inventing a new configuration-to-configuration converter to do what could be done in a macro.

Key/Value pairs work in a rather limited portion of software, but most configuration formats are calling out for the ability to compute. Because they rarely work in practice, everyone forks the format to add their pet features, until some committee comes along and suggests "I know, I'll add all of your pet features into a universal format" - this thinking brought us to XML. YAML, JSON and name-your-shitty-markup are continuations of this absurd line of thinking.

When TS says Lisp, he doesn't necessarily mean "configure the world in Common Lisp", but he's talking about S-expressions - which are a 'universal' way of encoding trees as text (without the element/attribute ambiguity), which you can choose to either treat as data or as code. The in-memory representation of parsed s-expressions is equivalent to their textual representation (homoiconicity), which means that you can write code to operate on these structures using only the knowledge of the text layout, and not some extra knowledge your programming language might use for encoding it (i.e., objects).

A configuration format using S-expressions need not be Turing complete, as you can specify what should be data and what wants evaluating as code, if anything. You can place limits on what you want to be able to compute, by validating the input before evaluating it. As others have stated, the focus of configuration formats needs shifting from "syntactic flavor of the year" to proper validation of input. And the quickest path to validation of input is one where the parsing is automated - because Lisp does it for you.
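
A rough sketch of "validate before evaluating", written in Python rather than Lisp so the reader has to be spelled out by hand. Everything here (`parse`, `ALLOWED`, `validate`, `evaluate`) is a hypothetical name, the grammar is limited to integers and two binary operators, and the input is assumed to have balanced parentheses.

```python
import operator


def parse(text):
    """Parse one s-expression into nested lists, ints and symbols (str)."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()

    def read(pos):
        tok = tokens[pos]
        if tok == "(":
            items, pos = [], pos + 1
            while tokens[pos] != ")":
                item, pos = read(pos)
                items.append(item)
            return items, pos + 1
        try:
            return int(tok), pos + 1
        except ValueError:
            return tok, pos + 1  # anything else is a symbol

    expr, _ = read(0)
    return expr


# The only operations a config value is allowed to compute with.
ALLOWED = {"+": operator.add, "*": operator.mul}


def validate(expr):
    """Accept only ints and (op a b) forms whose op is whitelisted."""
    if isinstance(expr, int):
        return
    if (
        isinstance(expr, list)
        and len(expr) == 3
        and isinstance(expr[0], str)
        and expr[0] in ALLOWED
    ):
        validate(expr[1])
        validate(expr[2])
        return
    raise ValueError(f"not allowed in configuration: {expr!r}")


def evaluate(expr):
    """Validate the whole tree first, then compute its value."""
    validate(expr)

    def run(e):
        if isinstance(e, int):
            return e
        return ALLOWED[e[0]](run(e[1]), run(e[2]))

    return run(expr)
```

Since the accepted forms contain no loops, recursion or definitions, every expression that passes `validate` is a finite tree and evaluation always terminates.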

valw11y ago

Would you agree to call Grunt a configuration tool?

I have found the fact that Grunt lets you write full-featured JavaScript quite useful. In 95% of cases, of course, you want to write your configuration in a declarative, JSON-like form, but I welcome the possibility of having full JS power in the few situations where non-trivial logic is needed. Another advantage, of course, is familiarity: I already know JavaScript.

However, what I really don't like is a declarative data language or some sort of DSL that starts adding some basic variable and control flow features. That's the best way to end up with a tool that's complicated, hard to reason about, and still unexpressive.

So to me, a good configuration language should be either a simple data language (such as JSON), OR a simple, powerful, well-known programming language with good data structure literals to encourage a declarative style.

eru11y ago

If you are very careful, you can introduce some forms of recursion and branching without getting full Turing completeness.

calibraxis11y ago

Presumably they mean something like EDN/Fressian.

espadrine11y ago

Do you mean Guile or s-expressions?

jermo11y ago

The Typesafe Config is somewhat similar in the JVM world.

https://github.com/typesafehub/config

sysk11y ago

http://xkcd.com/927/
