It is referenced hundreds of times in many classic papers.
But, here's the thing. It doesn't exist.
Everyone cites Sarin, DeWitt & Rosenberg's paper, but no one has ever actually seen it. I've emailed dozens of academics, libraries, and archives - none of them has a copy.
So it blows my mind that something so influential is, effectively, a myth.
Sunil Sarin, Mark DeWitt, and Ronni Rosenberg, "Overview of SHARD: A System for Highly Available Replicated Data," Technical Report 162, Xerox Advanced Information Technology (May 1988).
EDIT:
OK, I think I get this now. I had read the Wikipedia blurb about CCA being acquired by Rocket earlier, but only just now did I keep reading further down to find this bit:
in 1984, CCA was purchased by Crowntek, a Toronto-based company.[8] Crowntek sold Computer Corporation of America's Advanced Information Technology division to Xerox Corporation in 1988.[9] The balance of CCA was acquired by Rocket Software, a Boston-based developer of enterprise infrastructure products,[2] in April 2010.
So it seems like the portion of CCA that would be of interest here, is probably the bit that sent to Xerox. Maybe somebody at Xerox could help turn up the missing document?
I doubt it will help, but I took a stab at pinging them on Twitter (now X).
I've gone ahead and sent them each a message asking if they might be able to make the paper available.
If you'd like to get in contact with them yourself and are having trouble finding their LinkedIn, shoot me an email and I'll be happy to provide you links.
> Yes, I was involved, 35 years ago! I believe it was an internal CCA paper. I don't have a copy and I have no idea how to get it. Sorry about that. It does seem to be the earliest reference to "shard" in the DB context. (The other early reference pointed to in Wikipedia is from much later, 1997.)
> Fortunately, you need not go back 35 years to read about sharding; it's easy to get current info. Cheers.
I've now sent a message to Andy Youniss, CEO of Rocket Software to see if he can help.
Of course if nobody can get their hands on a particular work, as seems to be the case here, that makes things kind of hard. But I'd expect most works you need to cite in a fast-moving field such as CS to be available at least somewhere, even if it takes a bit of effort.
It's kind of wild that LLMs and other models will be able to quickly suss out which papers have high levels of referential integrity.
https://shkspr.mobi/blog/2021/06/where-is-the-original-overv...
According to Google Scholar, it's cited a measly 11 times.
https://scholar.google.com/scholar?cites=1491448744595502026...
I've found that Google Scholar's coverage gets a little sketchy on older stuff, and since we're talking way back in the 1980's here, I don't think it's surprising that some things are missing.
Now I’m going to be bugged by this too! Great trivia, and also a heck of a mystery.
> John, I'll have to look in my hard copy archives, which are very disorganized and will take me some time. But I don't believe you need this exact paper. I can probably point you at other papers that were in fact published (and are not just company TRs).
> Furthermore, our concoction of the acronym SHARD is different from "database sharding" as currently used. We were referring to the network, not the data, being "sharded", i.e., partitioned, and ensuring high database availability (with some loss of consistency - see "The CAP Theorem").
Mark DeWitt shared:
> Hi John, I believe that was an unpublished report we wrote for our DARPA funding agency. I may still have a copy, but it's currently buried under some stuff in the garage. There is at least one published paper that describes the SHARD architecture for a partially replicated database. It's not entirely obvious because I think the SHARD acronym didn't start getting used until after the paper was published in 1985. By 1987, Sunil Sarin had published another paper with Nancy Lynch in which they refer to SHARD and reference the 1985 paper.
> Here are the two citations:
> S. K. Sarin and N. A. Lynch, "Discarding Obsolete Information in a Replicated Database System," in IEEE Transactions on Software Engineering, vol. SE-13, no. 1, pp. 39-47, Jan. 1987, doi: 10.1109/TSE.1987.232564.
> S. K. Sarin, B. T. Blaustein and C. W. Kaufman, "System architecture for partition-tolerant distributed databases," in IEEE Transactions on Computers, vol. C-34, no. 12, pp. 1158-1163, Dec. 1985, doi: 10.1109/TC.1985.6312213.
Wait, you mean people include papers they haven't even opened in their references?!
On a semi-related topic, I love mysteries like these - mystery songs, those Japanese kanji in Unicode that nobody knows what they mean or where they came from, paper towns on maps.
If anyone else has anything else to read along similar lines, please post it!
So a documented reference to sharding that's earlier than that would be interesting to see.
(Disagree? Instead of downvoting, consider posting a citation that actually resolves to a real paper.)
"SHARD" is the name of the software - it was common back then to name systems using acronyms. It's not clear whether the paper/report actually uses the term "shard" in the sense that it is now used in distributed systems, or even whether it uses it at all.
1. Not being available online doesn't mean the paper's existence is made up. It's a very bold claim that the authors cite fabricated work.
From the available information, this looks like a technical report by a probably now-defunct company back in the '80s. If that was its only form of publication, and it didn't appear in conference proceedings for example, it would only be found as a physical copy in select university libraries. But most important:
2. This paper isn't even as impactful as the parent comment states. Or if its proposed concept is, the original idea probably derives from some other paper, which is the one that is highly cited and most definitely available online.
The cumulative citation count from Google Scholar and IEEE Xplore doesn't exceed fifteen for this particular paper, though.
https://scholar.google.com/scholar?cites=1491448744595502026...
https://shkspr.mobi/blog/2021/06/where-is-the-original-overv...
I can only find the Oracle reference to Sharding, which might be the same thing or not. https://docs.oracle.com/en/database/oracle/oracle-database/1...
Along with the wikipedia reference. https://en.wikipedia.org/wiki/Shard_(database_architecture)
And a Science Direct reference. https://www.sciencedirect.com/topics/computer-science/shardi...
Along with Facebook's reference. https://engineering.fb.com/2020/08/24/production-engineering...
And Wolverhampton's reference to Oracle Sharding. http://ora-srv.wlv.ac.uk/oracle19c_doc/shard/sharding-overvi...
And Amazon's. https://aws.amazon.com/what-is/database-sharding/
So is the original paper a myth or was/is this demonstrating the closed circuit nature of the dissemination of knowledge?
How many different ways do you cut up the data?
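Two of the usual answers, as a minimal sketch (my own illustration, not tied to any of the products linked in this thread): hash sharding and range sharding.

```python
import hashlib

def hash_shard(key: str, n_shards: int) -> int:
    """Hash sharding: a stable hash spreads keys evenly across shards."""
    digest = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return digest % n_shards

def range_shard(key: int, boundaries: list) -> int:
    """Range sharding: shard i holds keys below boundaries[i] (sorted ascending)."""
    for i, bound in enumerate(boundaries):
        if key < bound:
            return i
    return len(boundaries)

# The same key always routes to the same shard:
assert hash_shard("user:123", 4) == hash_shard("user:123", 4)
assert range_shard(42, [10, 100, 1000]) == 1  # 10 <= 42 < 100
```

Hash sharding balances load but makes range scans expensive; range sharding keeps adjacent keys together at the cost of potential hot spots.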
I was in a similar situation before with some math paper from the 50s that's nowhere to be found (neither online nor in library indices) and you'd be surprised how many professors still use paper copies.
Very interesting either way!
My understanding of this work: A forward pass for a (fully-connected) layer of a neural network is just a dot product of the layer input with the layer weights, followed by some activation function. Both the input and the weights are vectors of the same, fixed size.
Let's imagine that the discrete values that form these vectors happen to be samples of two different continuous univariate functions. Then we can view the dot product as an approximation to the value of integrating the multiplication of the two continuous functions.
Now instead of storing the weights of our network, we store some values from which we can reconstruct a continuous function, and then sample it where we want (in this case some trainable interpolation nodes, which are convolved with a cubic kernel). This gives us the option to sample different-sized networks, but they are all performing (an approximation to) the same operation. After training with samples at different resolutions, you can freely pick your network size at inference time.
You can also take pretrained networks, reorder the weights to make the functions as smooth as possible, and then compress the network, by downsampling. In their experiments, the networks lose much less accuracy when being downsampled, compared to common pruning approaches.
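Here's a toy numpy sketch of that description (my own illustration, not the paper's code): store a handful of trainable nodes, reconstruct a continuous weight function with a cubic interpolation kernel, then sample it at whatever layer width you like.

```python
import numpy as np

def cubic_kernel(t, a=-0.5):
    # Keys' cubic convolution kernel (the same family used for bicubic resampling)
    t = np.abs(t)
    out = np.zeros_like(t)
    near, far = t <= 1, (t > 1) & (t < 2)
    out[near] = (a + 2) * t[near]**3 - (a + 3) * t[near]**2 + 1
    out[far] = a * t[far]**3 - 5 * a * t[far]**2 + 8 * a * t[far] - 4 * a
    return out

def sample_weights(nodes, n_out):
    """Reconstruct a continuous weight function from the stored interpolation
    nodes, then sample it at n_out evenly spaced points."""
    n_in = len(nodes)
    xs_nodes = np.linspace(0.0, 1.0, n_in)
    xs_out = np.linspace(0.0, 1.0, n_out)
    # distance from each output position to each node, in node-index units
    diffs = (xs_out[:, None] - xs_nodes[None, :]) * (n_in - 1)
    w = cubic_kernel(diffs)
    w /= w.sum(axis=1, keepdims=True)  # renormalize near the boundaries
    return w @ nodes

nodes = np.array([0.0, 1.0, 0.5, -0.5, 0.0])  # the only stored parameters
w16 = sample_weights(nodes, 16)  # weights for a "16-wide" layer
w64 = sample_weights(nodes, 64)  # weights for a "64-wide" layer, same function
```

Since both vectors sample the same continuous function, a dot product with an input sampled at the matching resolution (scaled by 1/n) approximates the same integral at either width.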
Paper: https://openaccess.thecvf.com/content/CVPR2023/papers/Solods...
Unfortunately I'm not deep enough into the topic to understand what their contribution to the theory part of it is. (they have some Supplementary Material in [INN Supp]). In the discussion of the Integral Neural Networks (INN) paper, there's this paragraph about an operator learning publication:
"In [24] the authors proposed deep neural networks with layers defined as functional operators. Such networks are designed for learning PDE solution operators, and its layers are continuously parameterized by MLPs only along the kernel dimensions. A re-discretization was investigated in terms of training on smaller data resolution and testing on higher input resolution. However, the proposed framework in [24] does not include continuous connections between filters and channels dimensions."
Also the weight permutation to perform the resampling on pretrained networks in INNs seems to be novel? And I guess it doesn't hurt that they're bringing new eyeballs to the topic, by providing examples of common networks and a PyTorch implementation.
[INN Supp]: https://openaccess.thecvf.com/content/CVPR2023/supplemental/...
[24]: Zongyi Li, Nikola Kovachki, et al. Neural operator: Graph kernel network for partial differential equations. arXiv preprint arXiv:2003.03485, 2020. https://arxiv.org/abs/2003.03485
* In fact, the INN concept opens up the possibility of applying differential analysis to DNN parameters. The concepts of sampling and integration can be combined with the Nyquist theorem (https://en.wikipedia.org/wiki/Nyquist%E2%80%93Shannon_sampli...). Analysing the FFT of the weights allows one to define a measure of a layer's capacity. Two DNNs of different sizes can be equivalent after conversion to INNs, because the maximum frequency is the same for both networks.
* Tuning the integration grid is actually a first step toward fast knowledge extraction. We have tested INNs on a discrete EDSR (super-resolution) network and pruned it without INN training in 1 minute. We can imagine a situation where a user fine-tunes GPT-4 for a custom task just by tuning the integration grid, simultaneously reducing the number of model parameters while keeping only the important slices along filters/rows/heads etc. Because of smooth parameter sharing, the new filters/rows/heads include the "knowledge" of their neighbours.
* Another interesting application is using integral layers for fast frame interpolation, since a conv2d in an INN can produce any number of output channels, i.e. frames.
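A rough sketch of the FFT/Nyquist point above (my own reading of it, with made-up numbers): if the reconstructed weight function is band-limited, its highest significant frequency bounds how many samples a layer actually needs.

```python
import numpy as np

# A band-limited "weight function": its spectrum lives entirely at frequency 3.
w = np.cos(2 * np.pi * 3 * np.linspace(0, 1, 256, endpoint=False))

spectrum = np.abs(np.fft.rfft(w))
max_freq = int(spectrum.argmax())  # dominant frequency bin
min_samples = 2 * max_freq + 1     # Nyquist-style lower bound on layer width
```

Under this reading, two layers of different widths sampled from the same band-limited function carry the same information once both exceed `min_samples`.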
You can stay tuned and check our Medium for INN progress and applications. A new Medium article is already available: https://medium.com/@TheStage_ai/unlocking-2x-acceleration-fo...
Now we just need an iterative solver over both the structure and the "weights", and we get both architecture search and training at the same time
https://krebsonsecurity.com/2023/05/re-victimization-from-po...
Researchers bought up a bunch of seized phones from police auction sites and found about 25% of them were trivially unlockable and still held sensitive data about suspects and victims.
Recent:
"How to Hack the Simulation?" Roman Yampolskiy https://www.researchgate.net/publication/364811408_How_to_Ha...
"On the Computational Practicality of Private Information Retrieval" Radu Sion, Bogdan Carbunar https://zxr.io/research/sion2007pir.pdf
(via "Explained from scratch: private information retrieval using homomorphic encryption," https://blintzbase.com/posts/pir-and-fhe-from-scratch/ )
Mr. Bayes and Mr. Price. “An Essay towards Solving a Problem in the Doctrine of Chances. By the Late Rev. Mr. Bayes, F. R. S. Communicated by Mr. Price, in a Letter to John Canton, A. M. F. R. S.” In: Philosophical Transactions of the Royal Society of London 53.0 (1763), pp. 370–418. DOI: 10.1098/rstl.1763.0053. URL: https://doi.org/10.1098/rstl.1763.0053
Not only is this publication amazing because it is the source of conditional probability, which is the basis for Bayes's Theorem, but a dated original of the document is available. The thing I find really beautiful is that it was submitted 2 years after the author's death, by his friend who found the material when going through his things. We will never know how much was changed prior to submission, but I find it a really beautiful tribute, and the co-author makes it clear that the attribution should go to his friend; that letter still exists today (and it is a lovely read). I can't think of anything similar which has been maintained 260 years later and is still a somewhat useful academic publication. I felt privileged to cite this reference in my MSc thesis.

https://www.science.org/doi/10.1126/sciadv.adg8993
The authors show that a biological type laboratory ultracentrifuge can efficiently function as a near-universal isotope separator. Any element that can be dissolved as a salt in water -- the entire periodic table, excepting the noble gases -- can be enriched according to its relative mass. This can reduce the cost of refining certain isotopes like calcium-48 by orders of magnitude compared to the previous best techniques.
Left unsaid, but implied by its universality: the new technique is also a new approach to producing enriched fissile materials for nuclear reactors and weapons. It requires less chemical engineering sophistication than current processes which require production and handling of gaseous uranium hexafluoride.
The other even more exotic possibility is to use something like this to enrich ordinary reactor grade plutonium to weapons grade plutonium-239. The amount of mass to process with plutonium is orders of magnitude less because spent fuel plutonium is already more than 50% Pu-239, versus 0.7% U-235 in natural uranium, and a bare sphere critical mass of Pu-239 is only 10 kg vs 52 kg for U-235:
https://en.wikipedia.org/wiki/Critical_mass#Critical_mass_of...
The United States considered enriching waste fuel plutonium to weapons grade in the 1980s, when it contemplated another big nuclear weapons buildup against the USSR, but the laser based separation technology to be used was much more complicated than centrifuge separation. The project ended shortly after the USSR dissolved. It was called the Special Isotope Separation Project.
The 1988 environmental impact statement for the project gives some background information:
https://www.energy.gov/sites/prod/files/2015/06/f24/EIS-0136...
https://www.sciencedirect.com/science/article/abs/pii/S13858...
Certain bacteria can directly assimilate a mixture of hydrogen, carbon dioxide, and nitrogen to produce protein. You could consider it an alternative to bacterial nitrogen fixation in root nodules with much higher productivity. Or you could consider it an alternative to the Haber-Bosch process with much milder reaction conditions -- ambient temperature and pressure. It's a way to turn intermittent electricity into protein with simple, robust equipment. I wouldn't be surprised if this or a related development ultimately supplants much of the current demand for synthetic nitrogen fertilizers.
Bacterial protein may trigger allergic reactions in people and bacterial biomass is purine-rich which can also be a problem for people prone to gout. It's possible that cell engineering, directed evolutionary selection, or additional post-growth processing can minimize these problems.
I personally think that the more likely path is using fast-growing bacteria as feed for animal agriculture or aquaculture. Solar panels are so efficient at sunlight conversion compared to plants that you could farm salmon protein starting from bacterial pellets grown on solar derived hydrogen with per-hectare productivity comparable to conventionally farming soy beans. But the solar farm can go on saline, dry, contaminated, or otherwise agriculturally useless land. And salmon has slightly greater nutritional value than soy protein plus significantly greater market value.
We've been using the same API to communicate with our NICs since 1994. That API severely limits network throughput and latency. By simply changing the API (no new NIC) you can get 6x higher throughput in some apps and 43% lower latency.
Code runs on FPGA NIC only for now: https://github.com/crossroadsfpga/enso
Won USENIX OSDI best paper award and best artifact award.
https://dl.acm.org/doi/10.1145/3591274
Deep learning took off precisely when the ImageNet paper dropped around 2010. Before nobody believed that backprop can be GPU-accelerated.
When I was doing my master's in 2004-06, I talked to a guy whose MSc thesis was about running NNs with GPUs. My thought was: you're going to spend a TON of time fiddling with hacky systems code like CUDA, to get basically a minor 2x or 4x improvement in training time, for a type of ML algorithm that wasn't even that useful: in that era the SVM was generally considered to be superior to NNs.
So it wasn't that people thought it couldn't be done, it's that nobody saw why this would be worthwhile. Nobody was going around saying, "IF ONLY we could spend 20x more compute training our NNs, then they would be amazingly powerful".
It's easy to see in retrospect, but hard in prospect: the original paper [1] on GPU acceleration of NNs reports a measly 20x speedup. Assuming a bit of cherry-picking on the authors' side to get the paper published, the 'real-world speedup' will have been assumed by the readership to be even less. But this triggered a virtuous cycle of continuous improvements at all levels that has been dubbed "winning the hardware lottery" [2].
[1] K.-S. Oh, K. Jung, GPU implementation of neural networks.
[2] S. Hooker, The Hardware Lottery. https://arxiv.org/abs/2009.06489
Hinton also addressed the contribution of hardware performance advances to practical deep neural net applications in his talks in the mid-2000s.
By contrast, there are no known polynomial time algorithms for program synthesis and the standard approach is to search some large combinatorial space [1]. That's the case for all the classical approaches: SMT, SAT, planning and scheduling, etc. At the same time there's very powerful heuristics for all the other classical problems that can solve many problem instances efficiently.
____________
[1] The one exception to this is Inductive Logic Programming, i.e. the inductive synthesis of logic programs, for which we do know a polynomial time algorithm (but that is my work so I'm not pimping it here).
How do you reconcile the NP-completeness result in [1] about training neural networks with your claim?
[1] A. L. Blum, R. L. Rivest, Training a 3-Node Neural Network is NP-Complete. https://proceedings.neurips.cc/paper/1988/file/3def184ad8f47...
What do you mean by this? Virtually all "classic" or "shallow" ML can be GPU-accelerated, from linear regression to SVM to GBM.
Modern GPUs are GP-GPUs: where GP means "general purpose": you can run any code on GPGPUs. But if you want to gain real speed-ups you will have to program in an awkward style ("data parallel"). I am not aware of GPU acceleration of the work-horses of symbolic AI, such as Prolog, or SMT solving. There has been a lot of work on running SAT-solvers on GPUs, but I don't think this has really succeeded so far.
> Deep learning took off precisely when the ImageNet paper dropped around 2010. Before nobody believed that backprop can be GPU-accelerated.
Deep learning kicked off with RBMs because you didn't have to do backprop; there was a training algorithm called "contrastive divergence". Each layer could be trained in turn, which meant you could stack them way deeper. In ~2008-2009 I implemented Hinton's paper on GPUs, which meant I could do in hours on a GPU the same scale of thing that was taking weeks in MATLAB, and then there were lots of GPUs available on the cluster at the uni. Lots of fun CUDA hacking (I'm just glad cuBLAS was around by then). The original published learning rates etc. are wrong if I remember right; they didn't match the code.
- Specification: what are you looking for?
- Search space: where are you looking?
- Search mechanism: how are you going through the search space?
Program synthesis is simply learning where the search space is syntax. In deep learning, taking the ImageNet paper as an example, the specification is a bunch of photos with annotations, the search space is multi-variate real functions (encoded as matrices of floats), and the search mechanism is gradient descent (implemented as backprop) with a loss function.
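To make that framing concrete, here's a toy enumerative synthesizer (my own illustration, unrelated to the paper's regex engine): the specification is input/output pairs, the search space is a tiny expression grammar, and the search mechanism is exhaustive enumeration by depth.

```python
# Specification: input/output examples; we want something equivalent to 2*x + 1.
spec = [(0, 1), (1, 3), (2, 5)]

LEAVES = ["x", "1", "2"]

def expressions(depth):
    """Search space: all expressions over x, 1, 2 with + and *, up to a depth."""
    if depth == 0:
        yield from LEAVES
        return
    yield from expressions(depth - 1)
    for op in ("+", "*"):
        for a in expressions(depth - 1):
            for b in expressions(depth - 1):
                yield f"({a} {op} {b})"

def synthesize(spec, max_depth=2):
    """Search mechanism: plain enumeration, checked against the spec."""
    for expr in expressions(max_depth):
        if all(eval(expr, {"x": x}) == y for x, y in spec):
            return expr
    return None

print(synthesize(spec))  # finds an expression equivalent to 2*x + 1
```

Even in this toy, the search space grows combinatorially with depth, which is exactly the scaling problem the classical approaches fight.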
I think this paper uses regular expressions as an example of how to search fast over syntax. It claims not to be tied to regular expressions.
The paper, in a modern context and based solely on the abstract and having been in the community, is chipping at the "uninteresting" part of the problem. Around that time, program synthesis started switching to SMT (satisfiability modulo theory) methods, meaning basically a super powerful & general SAT solver for the broad search ("write a wild python program") and then, for specialized subdomains, have a good way to call out to optimized domain solvers ("write a tiny bit of floating point math here"). The paper would solve what the regex callout looks like.. which is specialized. We can argue regex is one of the most minimal viable steps towards moving to general programming on GPUs. Except as a person who does SIMD & GPU computing, optimizing compute over finite automata is not general nor representative and I don't expect to change my thinking much about the broader computational classes. To be fair to the authors... back then, synthesizing regex & sql were hard in practice even for boring cases.
Separately, nowadays synthesis has shifted to neural (copilot, gpt), and more interesting to me, neurosymbolic in R&D land. We're doing a lot of (simple) neurosymbolic in louie.ai, and I'm excited if/when we can get the SMT solver side in. Making GPT call Z3 & Coq were some of the first programs I tried with it :) Till then, there's a lot more interesting low-hanging fruit from the AI/ML side vs solvers, but feels like just a matter of time.
Here's some recent papers I liked:
- https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8413749/ -- Lithium is used as a mood stabilizer in bipolar and other disorders. The form of lithium used in psychiatry is lithium carbonate, but other forms also exist. As a supplement there is lithium orotate, which some people use to help them sleep, deal with stress, and so on. This paper puts forward the idea that lithium orotate is preferable to lithium carbonate because lower quantities are needed for the same therapeutic results, resulting in fewer side effects.
- https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1525098/ -- In bipolar disorder it's known that there are abnormalities in the presence of brain-derived neurotrophic factor (BDNF). What's interesting is that treatments for bipolar help to increase BDNF, which may be of interest to those who are into nootropics.
The papers on this site are honestly some of the best written, in-depth, and accessible works I've come across anywhere. There's enough information here to live a better life if you're willing to sift through papers. No joke.
Best content + quality LLM + quality verifications => Useful health advice.
Not only did we get a whole new type of Chess engine, it was also interesting to see how the engine thought of different openings at various stages in its training. For instance, the Caro-Kann, which is my weapon of choice, was favored quite heavily by it for several hours and then seemingly rejected (perhaps it even refuted it?!) near the end.
The super cool thing about MuZero is that it learns the dynamics of the problem, i.e. you don't have to give it the rules of the game, which makes the algorithm very general. For example, DeepMind threw MuZero at video compression and found that it can reduce video sizes by 6.28% (massive for something like YouTube)[2][3].
Curious if anyone else knows examples of MuZero being deployed outside of toy examples?
[1] https://arxiv.org/pdf/1911.08265.pdf
[2] https://arxiv.org/pdf/2202.06626.pdf
[3] https://www.deepmind.com/blog/muzeros-first-step-from-resear...
(edit s/Google/DeepMind)
To be fair, it uses MCTS, which requires many simulations of the game. For this, it needs to know which moves are valid, and when a player wins or loses the game.
So it does need to know the rules of the game, but it doesn't need any prior knowledge about which moves are better than others.
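A minimal sketch of that point (my own toy example, far simpler than MuZero): pure Monte Carlo search needs only the rules - legal moves and a terminal test - and no prior notion of which moves are good. The game here: a pile of stones, players alternate taking 1 or 2, and whoever takes the last stone wins.

```python
import random

def legal_moves(stones):
    return [m for m in (1, 2) if m <= stones]

def random_playout(stones):
    """Play uniformly random moves; return True if the player to move wins."""
    to_move_wins = False
    mover = 0  # 0 = the player whose turn it is right now
    while stones > 0:
        stones -= random.choice(legal_moves(stones))
        if stones == 0:
            to_move_wins = (mover == 0)  # whoever just moved took the last stone
        mover ^= 1
    return to_move_wins

def best_move(stones, playouts=3000):
    """Score each legal move by random playouts from the resulting position."""
    scores = {}
    for m in legal_moves(stones):
        # after we take m it is the opponent's turn, so count opponent losses
        wins = sum(not random_playout(stones - m) for _ in range(playouts))
        scores[m] = wins / playouts
    return max(scores, key=scores.get)

random.seed(0)
print(best_move(4))  # taking 1 leaves the opponent a losing pile of 3
```

MCTS adds tree selection (UCB) on top of this, and MuZero replaces the hard-coded rules with a learned dynamics model, but the division of labor is the same.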
This one is interesting, "On the Influence of Gravitation on the Propagation of Light" you can read here: https://einsteinpapers.press.princeton.edu/vol3-trans/393. This was his initial stab at GR. It was a simple approach, later abandoned, that considered light to slow down in a gravity well rather than remaining constant and specifying that mass warps spacetime. I suppose I like the idea that he pursued an idea that didn't work, and was forced to do something far more complex. It's kinda relatable.
https://www.scientificamerican.com/article/placebo-effect-gr...
The most interesting thing is that "placebo responses are rising only in the United States."
In particular last week i read their FBiP2 paper https://www.microsoft.com/en-us/research/uploads/prod/2023/0...
I am slooooowwly working on my own language that is planned to use work from both you and the Effekt team, so i may send you all an email in the future when i start having questions.
OpenAI's content moderation paper, A Holistic Approach to Undesired Content Detection in the Real World.
https://arxiv.org/pdf/2208.03274.pdf
Lots of interesting facts are strewn around the paper.
——
The first paper that squarely talked about the language resource gap in CS/ML. Before this came out, it was hard to explain just how stark the gap between English and other languages was.
Lost in Translation: Large Language Models in Non-English Content Analysis
https://cdt.org/insights/lost-in-translation-large-language-...
——
This paper gets in for the title:
“I run the world’s largest historical outreach project and it’s on a cesspool of a website.” Moderating a public scholarship site on Reddit: A case study of r/AskHistorians
https://drum.lib.umd.edu/bitstream/handle/1903/25576/CSCW_Pa...
——
This was the first paper I ended up saving on online misinformation. The early attempts to find solutions.
The Spreading of Misinformation online, https://www.pnas.org/doi/10.1073/pnas.1517441113
What I liked here was the illustration of how messages cascade differently based on the networks the message is traveling through.
Also its impact https://www.nature.com/articles/s42254-022-00483-x
I mean, we could, with infinite computing power and enough time to look into every interesting phenomenon (and to evaluate the corresponding multi-particle Schrödinger equation numerically), but there are simply too many such phenomena, Schrödinger equations are tough to solve, and quantum mechanics is also not a great level of abstraction, for the reason you mentioned.
By McPartland and Small.
Moving on from Cannabis sativa indica and Cannabis sativa sativa to Cannabis sativa indica himalayensis and Cannabis sativa indica asperrima, depending on distribution from the original location of the extinct ancient cannabis wildtype.
Following this new classification, I believe there’s a third undocumented variety in North East Asia.
If anyone else has noticed the samesameification of cannabis strains and is wondering what the path forward is, this may be illuminating.
Half&Half: Demystifying Intel’s Directional Branch Predictors for Fast, Secure Partitioned Execution
https://matthias-research.github.io/pages/publications/PBDBo...
A new way of doing real-time physics that dramatically outperforms the state of the art by simply introducing a new algorithm. No crazy AI or incremental improvements to existing approaches.
> we present theoretical justification for the claim that the optimal form of some [combinational] circuits requires cyclic topologies. We exhibit families of cyclic circuits that are optimal in the number of gates, and we prove lower bounds on the size of equivalent acyclic circuits.
http://www.mriedel.ece.umn.edu/wiki/images/7/7a/Riedel_Cycli...
Gene linked to long COVID found in analysis of thousands of patients https://www.nature.com/articles/d41586-023-02269-2
Surfactants safely take down mosquitoes without using insecticides https://newatlas.com/science/surfactants-safely-take-down-mo...
This is what our Milky Way galaxy looks like when viewed with neutrinos https://arstechnica.com/science/2023/06/ghost-particles-have...
Curious - what would happen to a spider or an ant that ate a couple of mosquitoes with that spice on top? Will they also suffocate?
https://arxiv.org/abs/2306.09299
TokenFlow: Consistent Diffusion Features for Consistent Video Editing
https://huggingface.co/papers/2307.10373
Need to see code for second one.
https://arxiv.org/pdf/1901.05086.pdf
Analyzes a claim made in the 1950s by a prominent statistician: the frequency of interstate wars follows a simple Poisson arrival process and their severity follows a simple power-law distribution.
He was right, but it’s not clear why.
The paper is interesting because it shows how a new bit of knowledge creates a large number of known unknowns from previously unknown unknowns.
It also shows statistical magic was very much possible prior to computers.
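A quick simulation in the spirit of that claim (all parameter values here are made up for illustration, not taken from the paper): wars arrive as a Poisson process and severities follow a power law.

```python
import random

random.seed(1)
rate = 0.5      # assumed: one war every ~2 years on average
alpha = 1.7     # assumed power-law exponent for severity
x_min = 1000.0  # assumed minimum severity (battle deaths)

wars, t = [], 0.0
while True:
    t += random.expovariate(rate)  # exponential gaps <=> Poisson arrivals
    if t > 500:                    # 500 simulated years
        break
    # inverse-CDF sample from a power-law tail: P(X > x) = (x / x_min)**(1 - alpha)
    severity = x_min * (1.0 - random.random()) ** (-1.0 / (alpha - 1.0))
    wars.append((t, severity))

print(len(wars))  # roughly rate * 500 = 250 simulated wars
```

With alpha below 2 the severity distribution has infinite mean, which is why a few simulated conflicts dominate the total - the same heavy-tail behavior the paper tests against the historical record.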
Older one, but still very nice work.
By among others, the great Mike Stonebraker.
Also known as the Viterbi algorithm. Every digital communication device in existence today most likely has an implementation of it.
Later proved optimal by Forney [1]
0. https://www.essrl.wustl.edu/~jao/itrg/viterbi.pdf
1. https://www2.isye.gatech.edu/~yxie77/ece587/viterbi_algorith...
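For anyone who hasn't implemented it, the algorithm is a short dynamic program. Here's a sketch on a toy HMM (the classic rainy/sunny example with made-up probabilities, not anything from the papers above):

```python
import math

states = ("Rainy", "Sunny")
start_p = {"Rainy": 0.6, "Sunny": 0.4}
trans_p = {"Rainy": {"Rainy": 0.7, "Sunny": 0.3},
           "Sunny": {"Rainy": 0.4, "Sunny": 0.6}}
emit_p = {"Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
          "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1}}

def viterbi(obs):
    """Most likely hidden state sequence for the observations, via DP."""
    # V[t][s] = log-probability of the best path that ends in state s at time t
    V = [{s: math.log(start_p[s] * emit_p[s][obs[0]]) for s in states}]
    back = [{}]
    for t in range(1, len(obs)):
        V.append({})
        back.append({})
        for s in states:
            prev, best = max(
                ((p, V[t - 1][p] + math.log(trans_p[p][s])) for p in states),
                key=lambda pair: pair[1])
            V[t][s] = best + math.log(emit_p[s][obs[t]])
            back[t][s] = prev
    # trace the winning path backwards through the stored pointers
    state = max(V[-1], key=V[-1].get)
    path = [state]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return path[::-1]

print(viterbi(["walk", "shop", "clean"]))  # ['Sunny', 'Rainy', 'Rainy']
```

In a modem or phone the same DP runs over convolutional-code trellis states instead of weather, usually in fixed-point hardware, but the recurrence is identical.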
On why Oppenheimer didn't get the Nobel for his black hole paper, decades before someone else got it for a rediscovery.
"0/77 were randomized studies."
https://www.medrxiv.org/content/10.1101/2023.07.07.23292338v...
Here is a pdf.
https://www.medrxiv.org/content/10.1101/2023.07.07.23292338v...
Probably the biggest thing that blows my mind is the suppression of the Minnesota Coronary Study in the early 1960s. Literally half a century of dis/misinformation from the government, pharma, and medical industries, pushing a claim that was disproven long ago. Nothing higher quality or more definitive has been done since.
Basically, the whole "limit cholesterol and saturated fat intake in favor of more grains and seed oils" advice is based on a theory that was long disproven. And it's still pushed to this day. Why? There's big money/business in pharma and agriculture (corn, soy, wheat).