I know because I stumbled on the same page while following links from the blog of the author of another post that made the front page yesterday (https://news.ycombinator.com/item?id=45589156), liked the TernFS concept, submitted it, and got redirected to https://news.ycombinator.com/item?id=45290245
Agreed, more or less; this would be easy to work around naively. Still, duplicate detection shouldn't block reposts just because the anchor was removed, nor should the anchor portion be stripped automatically in general. Some sites unnecessarily set up as SPAs use the anchor portion for routing, so it is needed for direct links to the right article on those [1], and jumping directly to a specific section of a long page can also be useful.
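The distinction above can be sketched in a few lines: compare URLs with the fragment removed for duplicate detection, but store and display the submitted URL untouched so deep links into anchor-routed SPAs keep working. This is just an illustration of the idea, not HN's actual dedup logic:

```python
# Minimal sketch of anchor-aware duplicate detection (illustrative only,
# not HN's real implementation).
from urllib.parse import urldefrag

def dedup_key(url: str) -> str:
    """Key used only for duplicate comparison; drops the #fragment."""
    base, _fragment = urldefrag(url)
    return base

a = "https://example.com/post#section-3"
b = "https://example.com/post"
assert dedup_key(a) == dedup_key(b)  # treated as duplicates...
assert a.endswith("#section-3")      # ...while the stored URL keeps its anchor
```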
> nice karma trick posting the URL with the anchor to bypass the HN duplicates detector.
Karma jealousy is as unbecoming as karma whoring, so perhaps wind the ol' neck in a little there. Laziness/ineptitude is more common than malice, and this could have been accidental via a quick copy+paste.
A better way to respond in this situation is a more neutral “Already recently discussed at [link]”, as had been done some hours before your comment: https://news.ycombinator.com/item?id=45646691#45647047
----
[1] Yes, those sites are badly designed, but they are unlikely to change because of our technical preferences and breaking the ability to deep link into them would add an issue for HN while not being noticed by those sites at all.
Agreed, and sorry for that (even though my gut and not-so-gut feeling is that it was done on purpose rather than by mistake, though I might be wrong about this myself).
I think being able to identify what's worthy of reposting deserves upvotes, too. If a repost truly provided little to no value, then the number of upvotes would reflect that and it would never reach the front page. But in this case, many people, myself included, would never have found the post if it weren't for this repost.
Different batches of users are on HN at different times and on different days. Allowing reposts to collect karma would mean that every link's exposure is derived from the entire HN userbase's votes rather than a small subset of the users that happened to be online at the time of the post.
If it is decisively better than Lustre, I am happy to make the switch over at my sector in Argonne National Lab where we currently keep about 0.7 PB of image data and eventually intend to hold 3-5 PB once we switch over all 3 of our beamlines to using Dectris X-Ray detectors.
Contrary to what the non-computer scientists insist, we only need about 20 Gb/s of throughput in either direction, so robustness and simplicity are the only concerns we have.
Something like this [1] gets you 44 disks in 4U. You can probably fit 9 of those, plus a server with enough HBAs to interface with them, in a 42U rack. 9 × 44 × 20 TB = 7.92 PB, not quite 8. Adjust for redundancy and/or larger drives. If you go with SAS drives, you can have two servers connected to the drives, with failover. Or you can set up two of these racks in different locations and mirror the data (somehow).
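The back-of-envelope capacity math above can be checked in a few lines. The 10+2 parity layout at the end is my own hypothetical example of a redundancy adjustment, not something from the chassis spec or the parent comment:

```python
# Capacity arithmetic for the rack sketch above.
# Assumptions: 9 JBODs per 42U rack, 44 drives per 4U JBOD, 20 TB drives.
jbods = 9
drives_per_jbod = 44
drive_tb = 20

raw_tb = jbods * drives_per_jbod * drive_tb
print(f"raw: {raw_tb} TB = {raw_tb / 1000:.2f} PB")  # 7920 TB, "not quite 8 PB"

# Example redundancy adjustment: a hypothetical 10+2 erasure-coded layout
# leaves 10/12 of the raw capacity usable.
usable_tb = raw_tb * 10 / 12
print(f"usable after 10+2 parity: {usable_tb / 1000:.2f} PB")  # 6.60 PB
```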
[1] https://www.supermicro.com/en/products/chassis/4U/847/SC847E... (as an illustration, sas jbods aka disk shelves are widely available from server vendors)
However, you are right. Your bandwidth needs don't really require Lustre.
I'm not joking, I didn't ask this as a way to namedrop my experience and credentials (common 'round this neck o' the woods), I honestly don't know what all the much more competent organizations are doing and would really like to find out.
One of the pain points of scaling Zookeeper is that all writes must go through the leader (reads can be served by followers). I understand this is "leader of a shard" and not a "global leader," but it still means a skewed write load on a shard has to run through a single leader instance.
> given that horizontal scaling of metadata requires no rebalancing
This means a skewed load cannot be addressed via horizontal scaling (provisioning additional shards). To their credit, they acknowledge this later in the (very well-written) article:
> This design decision has downsides: TernFS assumes that the load will be
> spread across the 256 logical shards naturally.
https://docs.ceph.com/en/quincy/cephfs/index.html
Still not completely decoupled from host roles, but seems to work for some folks. =3
It would make for one heck of a FreeBSD development project grant, considering how superb their ZFS and their networking stack are separately.
P.S. Glad someone pointed this out tactfully. A lot of people would have pounced on the chance to mock the poor commenter who just didn't know what he didn't know. The culture associated with software development falsely equates being opinionated with being knowledgeable, so hopefully we get a lot more people reducing the stigma of not knowing and reducing the stigma of saying "I don't know".