Design deficiency
(This is unrelated to the choice of hash.)
Fossil stores blobs as-is. A file containing "hello world" will be stored as "hello world" and referenced as HASH("hello world").
Commits are stored as plain-text manifests, which are also referenced as HASH(manifest_contents). To distinguish between different types of artifacts (commit, file, wiki page, etc.), Fossil checks the contents of the blob.
See https://www.fossil-scm.org/index.html/doc/trunk/www/fileform... for a detailed description.
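To make the point concrete, here is a minimal sketch of untyped content addressing. The `looks_like_manifest` check is a deliberately crude, hypothetical stand-in and not Fossil's real parser; the helper names are mine. It only illustrates that the name depends on the raw bytes alone and that the type has to be guessed from them afterwards.

```python
import hashlib

def artifact_id(content: bytes) -> str:
    # The name is the hash of the raw bytes; no type tag is mixed in.
    return hashlib.sha3_256(content).hexdigest()

def looks_like_manifest(content: bytes) -> bool:
    # Hypothetical heuristic (NOT Fossil's parser): every line is
    # "<uppercase letter><space>...", and the last line is a "Z" card.
    try:
        lines = content.decode("utf-8").splitlines()
    except UnicodeDecodeError:
        return False
    if not lines:
        return False
    cards_ok = all(
        len(line) >= 2 and "A" <= line[0] <= "Z" and line[1] == " "
        for line in lines
    )
    return cards_ok and lines[-1].startswith("Z ")

def classify(content: bytes) -> str:
    # The type is inferred from the contents, so whoever controls the
    # bytes also controls what the blob is taken to be.
    return "manifest" if looks_like_manifest(content) else "file"

store = {}
for blob in (b"hello world", b"C forged\nD 2017\nZ abc\n"):
    store[artifact_id(blob)] = blob
    print(artifact_id(blob)[:12], classify(blob))
```

Nothing in the store records how a blob got there, which is the property the attack below relies on.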
This made possible the following attack:
* Clone repository.
* Modify some files, commit.
* Deconstruct repository.
* Attach the deconstructed artifacts with changes to a ticket in the original repository or to a wiki.
By doing this, you could make commits to the target repository by attaching files to tickets or wiki pages. These commits were visible only to people who cloned the repo until a rebuild, after which they would be visible to everyone.
This attack was prevented by compressing every attached file with gzip, making it impossible to attach a file that would be recognized as a commit, because gzip adds its own header.
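A small sketch of why the gzip wrapping helps, using Python's standard `gzip` module. The forged text and the `wrap_attachment` name are made up for illustration, and I have not checked how Fossil actually wraps attachments beyond what is described above.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream

def wrap_attachment(data: bytes) -> bytes:
    # Hypothetical helper: store the attachment compressed instead of as-is.
    return gzip.compress(data)

forged = b"C forged\nD 2017\nZ abc\n"  # made-up manifest-looking text
stored = wrap_attachment(forged)

# The stored bytes begin with the gzip header, not with a manifest card,
# so a content sniffer looking at the stored blob sees binary data.
print(stored[:2] == GZIP_MAGIC)
print(gzip.decompress(stored) == forged)
```

The original bytes are still recoverable, but the blob that gets hashed and stored no longer parses as a manifest.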
I think this design is deficient: instead, each blob should have a type indicator, that is, file artifacts should have some prefix. This is how Git works: each object has a prefix indicating its type. Also, Plan 9 had a filesystem called... also Fossil! It was based upon Venti content-addressable storage, which stored typed blobs.
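For comparison, this is roughly how Git (in its SHA-1 object format) names an object: the hash covers a header of the form `<type> <length>\0` followed by the content. The function name is mine; the header layout is Git's.

```python
import hashlib

def git_object_id(obj_type: str, content: bytes) -> str:
    # Header: "<type> <decimal length>" followed by a NUL byte.
    header = f"{obj_type} {len(content)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()

data = b"hello world"
print(git_object_id("blob", data))
print(git_object_id("commit", data))  # same bytes, different name
```

Because the type is part of the hashed input, the same bytes stored as a blob and as a commit get different names, so a file can never be mistaken for a commit.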
Unfortunately, changing this will break compatibility, and since the Fossil artifact format was built to last for ages, I don't think it will be changed.
SHA-1 claims
What made me rant about Fossil after congratulating them on switching to SHA3-256 is that they made false claims regarding their use of SHA-1, and the same documentation shows that these claims are false:
Quoting https://www.fossil-scm.org/index.html/doc/trunk/www/hashpoli...:
The SHA1 hash algorithm is used only to create names for artifacts in Fossil (and in Git, Mercurial, and Monotone). It is not used for security. Nevertheless, when the Shattered attack found two different PDF files with the same SHA1 hash, many users learned that "SHA1 is broken". They see that Fossil (and Git, Mercurial, and Monotone) use SHA1 and they therefore conclude that "Fossil is broken". This is not true, but it is a public relations problem. So the decision was made to migrate Fossil away from SHA1.
If you search the docs, you discover that they use SHA-1 for security:
* To store passwords (https://www.fossil-scm.org/index.html/doc/trunk/www/password...)
* In the client-server authentication protocol in an ad hoc MAC construction (https://www.fossil-scm.org/index.html/doc/trunk/www/sync.wik...)
Speaking of passwords, the automatically generated passwords are too short: I just created a repo with Fossil v2.1 and got "efc6f5" as initial password. It's 6 hex characters, or just 3 bytes — trivial to crack.
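To put a number on "trivial": six hex characters give 16^6 = 16,777,216 candidates, i.e. 24 bits. The sketch below enumerates such a keyspace against a bare SHA-1 digest. That digest is only a placeholder assumption to keep the example short; it is not Fossil's actual password-hashing scheme, and the function names are mine.

```python
import hashlib
import itertools
import math

HEX = "0123456789abcdef"

def keyspace(length: int) -> int:
    # Number of distinct lowercase-hex strings of the given length.
    return len(HEX) ** length

def brute_force(target_digest: str, length: int):
    # Try every hex string of this length against a plain SHA-1 digest
    # (placeholder scheme, assumed for illustration only).
    for combo in itertools.product(HEX, repeat=length):
        candidate = "".join(combo)
        if hashlib.sha1(candidate.encode("ascii")).hexdigest() == target_digest:
            return candidate
    return None

print(keyspace(6), math.log2(keyspace(6)))

# Demo on a 3-character password so it finishes instantly.
target = hashlib.sha1(b"e3c").hexdigest()
print(brute_force(target, 3))
```

The full six-character search is the same loop with `length=6`; it is only about 16.8 million hash evaluations.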
Finally, as I said, I really like Fossil even though I don't use it anymore for open source projects (I still use it for some private projects), and I have great respect for its author and other contributors. But in my opinion, it needs at least one fundamental but simple change in the storage format to introduce object types.
If something is unclear or you have questions, I'm happy to answer.
(My concern was that the tweet contained basically nothing of substance: no technical details, no link to a blog post, nothing which can be checked or verified... which makes it indistinguishable from insinuation; and so I had to dismiss what you said out of hand.)
(Regarding provocative language: I don't think publicly calling someone out as a liar, in so many words, especially on a medium like Twitter which doesn't really allow for an effective response, is a good way to produce a useful result...)
Anyway:
Re the manifest issue: to paraphrase, to check I'm understanding you correctly: because each manifest refers to its predecessor, and not vice versa, adding any blob which looks like a manifest implicitly adds that manifest to the tree. Normally Fossil trusts authenticated users to add blobs to the tree, because they're authenticated, but ticket attachments can be added by anyone, which effectively means that you can bypass the authentication and commits can be done by anyone. Is that correct?
In which case, yeah, I agree; that's very bad. I can't spot any holes in your reasoning. It is possible to positively identify attachments by looking at their parent manifest, as each one should be pointed at by an A record, so I suppose you could disallow manifests if they're referenced like this, but my gut tells me that's going to be horribly fragile... you're right, adding a type prefix is obviously the right way to go.
If you create a manifest and check it in as a normal file, so it's referenced by an F record, is it still treated as a manifest? If not, could this machinery be extended to attachments as well?
You did bring this up on the mailing list, right?
Users with commit privileges are granted more trust and do have the ability to forge manifests. But as before, there is an audit trail and rogue manifests (and the users that insert them) can be detected and dealt with after the fact.
Structural artifacts have a very specific and pedantic format. You can forge a structural artifact, but you will never generate one by accident during normal software development activities.