It's presented on https://publi.codes but unfortunately we haven't translated it yet. The language keywords themselves are in French, by design, to bridge the law and its official implementation. It's not yet used to compute taxes, just to simulate them on the official mon-entreprise.fr website.
The "code" expressed in YAML is parsed to build the computation model (in TypeScript), to document this model on the Web (each variable has a Web page) and to generate typeform-like forms.
It's in the https://github.com/betagouv/mon-entreprise monorepo, but it's also used to implement a model of our personal climate impact, here: https://github.com/betagouv/ecolab-data/tree/master/data
And bravo, Denis :)
Edit: in case you didn't know, the French administration's code must by law be made public. This is just the beginning, expect lots of similar projects! You can browse some repos here: https://code.etalab.gouv.fr
And this time, the code isn't printed on paper and sent by post, which is neat progress ;).
See https://www.nouvelobs.com/rue89/rue89-nos-vies-connectees/20... (in French) for what “making code public” looked like just a few years ago.
It depends, and varies greatly by location.
Anything created by the federal government is public domain by law. However, not all federal agencies make their code public. Some, understandably. Others, out of budget constraints or ignorance. In theory, you could file a FOIA request to get the code, assuming it's not classified.
Other levels of government can be problematic. In part, because cities and towns can copyright things they create, while the federal government cannot.
For example, the City of Chicago and some other cities have data portals open to the public. Their utility varies.
Smaller cities, however, are less likely to understand the importance or value of making data public.
Back when governments started switching to data processing a lot, I belonged to an organization called Investigative Reporters and Editors. It had lots of guides for extracting data from local governments. I remember lots of newspapers rushing out to buy computers and nine-track tape readers so they could sort through the information.
(I've at least seen DARPA-funded work become open source. That's a good step.)
The "source code" for the calculations is the set of paper forms that specify them, which makes the calculations transparent.
In the UK, HMRC (again, the equivalent of the IRS) makes worksheets available for computing tax, but these are not machine-readable and are not guaranteed to be correct! (Indeed, on some points government websites give incorrect information about the state pension. [1])
We did something similar to this approach but much simpler - we wrote a little arithmetic language specifying the tax rules, embedded in a spreadsheet for quick verification, and then translated this language into C++ using a Haskell compiler.
[1] https://www.thisismoney.co.uk/money/pensions/article-7100019...
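The "little arithmetic language translated into C++" approach described above can be sketched as follows. This is only an illustration, in Python rather than Haskell: the node types, the `to_cpp` helper, and the example rule (a 20% rate above a 12,500 allowance) are all hypothetical and not taken from any real tax code or from the project mentioned.

```python
from dataclasses import dataclass
from typing import Dict, Tuple, Union


@dataclass(frozen=True)
class Num:
    value: float


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class BinOp:
    op: str  # one of "+", "-", "*"
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Call:
    func: str  # "min" or "max", always with two arguments here
    args: Tuple["Expr", ...]


Expr = Union[Num, Var, BinOp, Call]


def evaluate(e: Expr, env: Dict[str, float]) -> float:
    """Interpret the rule directly, e.g. for quick spreadsheet-style checks."""
    if isinstance(e, Num):
        return e.value
    if isinstance(e, Var):
        return env[e.name]
    if isinstance(e, BinOp):
        l, r = evaluate(e.left, env), evaluate(e.right, env)
        return {"+": l + r, "-": l - r, "*": l * r}[e.op]
    if isinstance(e, Call):
        fn = {"min": min, "max": max}[e.func]
        return fn(*(evaluate(a, env) for a in e.args))
    raise TypeError(f"unknown node: {e!r}")


def to_cpp(e: Expr) -> str:
    """Emit a C++ expression; variables are assumed to be doubles."""
    if isinstance(e, Num):
        return repr(float(e.value))
    if isinstance(e, Var):
        return e.name
    if isinstance(e, BinOp):
        return f"({to_cpp(e.left)} {e.op} {to_cpp(e.right)})"
    if isinstance(e, Call):
        return f"std::{e.func}({', '.join(to_cpp(a) for a in e.args)})"
    raise TypeError(f"unknown node: {e!r}")


# Hypothetical rule: tax = 0.2 * max(income - 12500, 0)
rule = BinOp("*", Num(0.2),
             Call("max", (BinOp("-", Var("income"), Num(12500)), Num(0))))

cpp_source = to_cpp(rule)
tax_on_30k = evaluate(rule, {"income": 30000.0})
```

Having both an interpreter and a code generator over the same tree is what makes the quick verification step possible: the interpreted result can be checked by hand before the generated C++ is trusted.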
"Four years after the first publication by DGFIP, I have the pleasure of announcing that the source code permitting the calculation of taxes on revenue is finally reusable (recompilable by others)!
To use this algorithm in your application, follow this link...
It took us 1.5 years (with my coauthor Raphael Monat) to identify that which was missing in the published code in order for it to be reusable, and to fix this situation.
More or less, thanks to our project Mlang, a person can simulate IR's calculations without needing to interface with DGFIP.
The difficulty came from a constraint from DGFIP, who did not want us to publish (for security reasons) a part of the code that corresponds to a mechanism that handles "multiple liquidations". Raphael and I recreated this unpublished part in a new DSL.
DGFIP also didn't want to publish their internal test games (cases). We therefore proceeded to create a suite of random test cases, separate from the unpublished ones, so that the validation of Mlang can finally be reproduced outside of DGFIP."
The word "jeu" can indeed mean "game", but it can also mean a group of things. A better translation would be "test suites", "test sets" or similar.
"A little less than a year after the publication of [blog post], we have therefore found a compromise letting us respect both the obligation to publish the source code and the security constraints of DGFiP.
By letting us visit their operations site and confidentially access the source code they didn't want published, the DGFiP let us find alternative solutions that made the publication of the source code concrete and operational.
This compromise lets both parties come out on top, unlike what happened with the source code of CNAF [link], where the administration simply argued that it was too difficult and indefinitely postponed [1] it.
Letting those who ask for the source code see it under an NDA therefore appears to be a possible solution when the publication is delicate for technical reasons. Could this path be useful for the report of @ebothorel?"
[Note: translation here is somewhat more geared towards a natural English translation than a literal French translation.]
[1] "repouss[er] [...] aux calendes grecques" appears to be an idiom that's not in my dictionaries, but from context appears to mean "indefinitely postponed"
IMO (I am not part of this project) it is more interesting because it is language-agnostic, easy for everyone to use (it's based on YAML) and, more importantly, it is starting to be implemented in the government's actual tax computing system.
We are using it in a challenger bank I started.
I don't share the code because I'm not sufficiently confident that it's correct, don't want liability, and don't want an obligation to keep it up to date.
That said, it feels like the scope of the project would be manageable for a small nonprofit, and would be of great social value. One reflection from my work is that it would be particularly valuable to represent annual changes in the tax code as transformations of the code AST.
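To make the AST-transformation idea above concrete, here is a minimal sketch. It assumes a toy rule tree with named constants; the node types, the `set_const` and `scale_const` helpers, and all figures (rates, thresholds, the 2% indexation) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple, Union


@dataclass(frozen=True)
class Const:
    name: str    # named parameter, e.g. "rate" or "threshold"
    value: float


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Op:
    op: str  # "+", "-", "*" or "max" (two arguments each)
    args: Tuple["Node", ...]


Node = Union[Const, Var, Op]
Change = Callable[[Node], Node]


def transform(node: Node, fn: Change) -> Node:
    """Rebuild the tree bottom-up, applying fn to every node."""
    if isinstance(node, Op):
        node = Op(node.op, tuple(transform(a, fn) for a in node.args))
    return fn(node)


def set_const(name: str, value: float) -> Change:
    """A yearly change that replaces a named constant."""
    def fn(n: Node) -> Node:
        return Const(name, value) if isinstance(n, Const) and n.name == name else n
    return fn


def scale_const(name: str, factor: float) -> Change:
    """A yearly change that indexes a named constant by a factor."""
    def fn(n: Node) -> Node:
        if isinstance(n, Const) and n.name == name:
            return Const(name, n.value * factor)
        return n
    return fn


def evaluate(node: Node, env: Dict[str, float]) -> float:
    if isinstance(node, Const):
        return node.value
    if isinstance(node, Var):
        return env[node.name]
    a, b = (evaluate(x, env) for x in node.args)
    return {"+": a + b, "-": a - b, "*": a * b, "max": max(a, b)}[node.op]


# Hypothetical base-year rule: rate * max(income - threshold, 0)
rule_2020 = Op("*", (
    Const("rate", 0.20),
    Op("max", (Op("-", (Var("income"), Const("threshold", 10000.0))),
               Const("zero", 0.0))),
))

# The next year's law expressed as a list of tree transformations.
changes_2021 = [set_const("rate", 0.22), scale_const("threshold", 1.02)]

rule_2021 = rule_2020
for change in changes_2021:
    rule_2021 = transform(rule_2021, change)
```

Each year's rules are then derived from the previous year's tree plus a short, reviewable list of changes, which mirrors how amending legislation is usually written.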
https://github.com/betagouv/mon-entreprise/blob/master/mon-e...
It would be quite interesting to check that the French IRS implementation of the tax and benefit laws and the free software community's implementation (though most devs of the project were employed by the French administration) actually output the same results.
Personally, I work in the Language Engineering area, and it seems obvious that you want tax lawyers and accountants to interpret the tax code and translate it into “code”. It's just that you also want the “code” to be obvious to them and supported by proper tooling that catches inconsistencies.
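One common way to run such a comparison is differential testing: feed the same randomly generated cases to both implementations and report any disagreement. The sketch below assumes two stand-in functions with hypothetical brackets; neither corresponds to the real official or community code.

```python
import random
from typing import Callable, List, Tuple

# Hypothetical schedule: 0% up to 10,000, 20% up to 50,000, 30% above.
BRACKETS = [(10_000.0, 0.0), (50_000.0, 0.20), (float("inf"), 0.30)]


def reference_tax(income: float) -> float:
    """Stand-in for the 'official' implementation (closed form)."""
    return (0.20 * max(min(income, 50_000.0) - 10_000.0, 0.0)
            + 0.30 * max(income - 50_000.0, 0.0))


def candidate_tax(income: float) -> float:
    """Stand-in for an independent implementation (bracket loop)."""
    tax, lower = 0.0, 0.0
    for upper, rate in BRACKETS:
        if income > lower:
            tax += rate * (min(income, upper) - lower)
        lower = upper
    return tax


def buggy_tax(income: float) -> float:
    """Deliberately wrong: forgets the top bracket."""
    return 0.20 * max(income - 10_000.0, 0.0)


def find_mismatches(f: Callable[[float], float],
                    g: Callable[[float], float],
                    n_cases: int, seed: int,
                    tolerance: float = 0.01) -> List[Tuple[float, float, float]]:
    """Return (income, f(income), g(income)) for every disagreement."""
    rng = random.Random(seed)  # seeded so that a failure can be reproduced
    mismatches = []
    for _ in range(n_cases):
        income = round(rng.uniform(0.0, 200_000.0), 2)
        a, b = f(income), g(income)
        if abs(a - b) > tolerance:
            mismatches.append((income, a, b))
    return mismatches


agreeing = find_mismatches(reference_tax, candidate_tax, n_cases=1000, seed=0)
disagreeing = find_mismatches(reference_tax, buggy_tax, n_cases=1000, seed=0)
```

Uniform random incomes are the simplest possible generator; a real comparison would need cases that exercise every input variable and boundary of the legislation.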
I would also love to interview the author about this and about the work for mon-entreprise. While I understand French, I hold these interviews in English to reach more people.
ETA: Diving into my thoughts on this a little: really what I'm describing would require (1) a dumb numerical analysis algorithm or (2) some computer algebra system (CAS) features, my preference. I don't know all the keywords and concepts, but I think term rewriting and equation solving would get me towards the output I seek: a multivariate, piecewise equation with user-selected input variables and user-selected output variables: e.g., current year tax, n+1 year tax, etc. Seems too involved, but I have hope.
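Option (1), the "dumb numerical" route, can be sketched for a single input variable: treat the calculator as a black box, measure its slope on a grid, and merge neighbouring cells with the same slope into segments. The `tax` function and its brackets below are hypothetical, and breakpoints are only located to the resolution of the grid.

```python
from typing import Callable, List, Tuple


def tax(income: float) -> float:
    """Hypothetical black-box calculator (piecewise linear in income)."""
    return (0.20 * max(min(income, 50_000) - 10_000, 0)
            + 0.30 * max(income - 50_000, 0))


def piecewise_segments(f: Callable[[float], float],
                       lo: int, hi: int, step: int,
                       tol: float = 1e-6) -> List[Tuple[int, int, float]]:
    """Return (start, end, marginal_rate) segments of f over [lo, hi)."""
    segments = []
    start = lo
    rate = (f(lo + step) - f(lo)) / step
    for x in range(lo + step, hi, step):
        r = (f(x + step) - f(x)) / step  # slope over the cell [x, x + step)
        if abs(r - rate) > tol:
            # The slope changed, so close the current segment at x.
            segments.append((start, x, round(rate, 6)))
            start, rate = x, r
    segments.append((start, hi, round(rate, 6)))
    return segments


segments = piecewise_segments(tax, lo=0, hi=100_000, step=1_000)
```

The multivariate, symbolic version asked for above is a much harder problem; this only shows that a piecewise description of one output against one input can be recovered by sampling.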
> This work is based on a retro-engineering of the syntax and the semantics of M, from the codebase released by the DGFiP.
Sounds like an external re-implementation of the "original" release here: https://framagit.org/dgfip/ir-calcul
That original release says it's under a free license too.
Wonder why there's a re-implementation?
https://twitter.com/DMerigoux/status/1314531302079688709
> The difficulty arose from a constraint on the part of the DGFiP which did not wish to publish, for security reasons, part of the logic of the calculation corresponding to the "multiple liquidations" mechanism. Raphael and I recreated this unpublished part in a new DSL.
> The DGFiP also did not wish to publish its internal test sets. We therefore proceeded to create a completely random test set, separate from the unpublished one, in order to be able to reproduce the validation of Mlang outside the DGFiP.
> A little less than a year after the publication of https://blog.merigoux.ovh/en/2019/12/20/taxes-formal-proofs...., we therefore found a compromise allowing us to respect both the source code publication obligation and the security constraints of the DGFiP.
> By allowing us to go to its operating site and confidentially access the source code that it did not wish to publish, the DGFiP has enabled us to find alternative solutions that make the publication of the source code concrete and operational.
The M compiler reimplementation linked in this submission allows you to actually execute those rules and perform simulations.
This is fantastic! If interested, you may want to check out our program for Open Source users of GitLab: https://about.gitlab.com/handbook/marketing/community-relati....