> The ongoing apparent failure of deep-learning based ink detection based on the fragments indicated to me that direct inspection of the actual data would be more fruitful, as it has been here.
> ...
> I found similar “cracked mud” and “flake” textures corresponding to known character ink, but only for perhaps 10% of the known characters. It’s been a long day, I can probably find more on closer inspection, but that does make one wonder about automated ink detection and what that is seeing.
These new images are much better than I hoped for, but still only in one small area, so I'm still pessimistic about more than an odd sentence being readable.
[1] https://scrollprize.org/img/tutorials/sem.png
[2] https://caseyhandmer.wordpress.com/2023/08/05/reading-ancien...
However, what I did was a bit different -- instead of looking for a crackle, I surmised that that 'crackling' effect actually is just of course slices of the data over different rifts in the parchment, and that the data of the ink lay on the manifold of that crackling and bending.
It would not be as clear to the human eye for all of the letters, I think, as there are many, many, many layers in the scanned image, and you can only start to see a pattern emerge over time as you cycle through the images.
I was working on code that minimized an objective that was basically the total variation loss, if I recall correctly: it interpolated each pixel column up and down bilinearly to 'align' the blocks of the image so that the crackle texture was flattened.
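For anyone wanting to picture the idea, here is a minimal sketch, not the commenter's actual code. It assumes the loss described above is the usual total variation (sum of absolute differences between neighbouring pixels), and it swaps in a simple greedy search over candidate shifts in place of whatever optimizer was really used. The function names (`shift_column`, `horizontal_tv`, `flatten_columns`) and the `max_shift`/`step` parameters are hypothetical.

```python
import numpy as np


def shift_column(col, shift):
    """Resample a 1D column at positions y + shift with linear interpolation.

    Positions falling outside the column are clamped to the edge values
    (this is how np.interp behaves by default).
    """
    y = np.arange(col.size, dtype=float)
    return np.interp(y + shift, y, col)


def horizontal_tv(img):
    """Total variation along x: sum of |differences| between adjacent columns."""
    return float(np.abs(np.diff(img, axis=1)).sum())


def flatten_columns(img, max_shift=5.0, step=0.5):
    """Greedily shift each column so it best matches the previous aligned column.

    Assumption: a greedy left-to-right search stands in for a proper joint
    optimization of all column shifts.
    """
    img = np.asarray(img, dtype=float)
    n_candidates = int(round(2 * max_shift / step)) + 1
    candidates = np.linspace(-max_shift, max_shift, n_candidates)

    out = np.empty_like(img)
    out[:, 0] = img[:, 0]
    shifts = np.zeros(img.shape[1])

    for x in range(1, img.shape[1]):
        # Cost of each candidate: L1 distance to the already-aligned neighbour.
        costs = [
            np.abs(shift_column(img[:, x], s) - out[:, x - 1]).sum()
            for s in candidates
        ]
        best = candidates[int(np.argmin(costs))]
        shifts[x] = best
        out[:, x] = shift_column(img[:, x], best)
    return out, shifts
```

On a synthetic cross-section with a wavy bright band, calling `flatten_columns(img)` should straighten the band and lower `horizontal_tv`. A real version would probably also smooth the shifts between neighbouring columns, since the deformation is described as locally consistent.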
From there I planned on using a rather optimized convolutional network on the 'flattened' image, which can I think be done rather efficiently as if you look at a cross section of the scroll you can see where it's like a tree in that the pinching and such seems to be somewhat locally consistent, so you might be able to get away with some interpolation.
I should probably share the code if this is of interest to anyone, since I'm not pursuing the competition at the moment.
Also, this is why I did not buy into 3D convolutions for this, at least. Ink that has been laid and dried should follow a semi-predictable pattern that a 2D convolution can detect. I do not know if a 3D convolution really brings us anything, as the invariances we desire can be structured up front more easily.
If there is interest in the code, let me know and I can do a little digging. Otherwise, it is a fun challenge, for sure.
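One way to read the 2D-versus-3D point: after flattening, the depth layers of a segment can be fed to a 2D convolution as input channels, so each kernel spans all layers at once instead of sliding along depth. Below is a minimal NumPy sketch of that single operation, assuming a `(depth, height, width)` volume. The function name is hypothetical, it computes cross-correlation (as deep learning frameworks do), and it is not the commenter's network.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def conv2d_depth_as_channels(volume, kernel):
    """'Valid' 2D cross-correlation with depth layers treated as channels.

    volume: array of shape (D, H, W), e.g. D surface layers of a flattened segment.
    kernel: array of shape (D, kh, kw); it covers every layer at once,
            so the output has no depth axis.
    Returns an array of shape (H - kh + 1, W - kw + 1).
    """
    volume = np.asarray(volume, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    d, kh, kw = kernel.shape
    if volume.shape[0] != d:
        raise ValueError("kernel depth must match the number of layers")

    # windows has shape (D, H - kh + 1, W - kw + 1, kh, kw)
    windows = sliding_window_view(volume, (kh, kw), axis=(1, 2))
    # Sum over layers and kernel offsets for every output position.
    return np.einsum("dyxij,dij->yx", windows, kernel)
```

A 3D convolution would instead use a kernel shallower than `D` and slide it along depth too, buying translation invariance in depth. If the layers are already aligned to the papyrus surface, that invariance has arguably been built in up front.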
I'd studied the problem to work on it, but didn't get as far as you. I agree that intuitively (not backed up by experiment) I expect that preprocessing to further flatten the segments and other hand-crafted features based on desired invariances should work well. It seems that a lot of people really have just used 2D convolutions, applied to just one or a few surface layers and then combined. So I'd also be interested in your code.
> I surmised that that 'crackling' effect actually is just of course slices of the data over different rifts in the parchment, and that the data of the ink lay on the manifold of that crackling and bending.
I'm afraid I can't follow this.
$700k is a life changing amount of money. I admit, it’s tempting to drop everything and go devote myself like a monk to the pursuit of ancient enlightenment via modern ML. I wonder where we’d start…
It’s also funny that the scroll might just be a laundry list.
the machine learning stuff is cool, but it's important not to discount the apparently pretty manual labour still involved in the virtual unwrapping:
> Early in the summer, a small team of annotators (the “segmentation team”) joined our effort. They began mapping the 3D structure of the scroll using tools initially built by EduceLab and improved by our community. By July we had segmented and “virtually flattened” hundreds of cm² of papyrus.
So, it sounds like it was about a month or two of work, for a single scroll. Although, it probably could be partially or fully automated too, with some effort. Already they developed some tools to help, and I guess it's the kind of task that gets easier after you do it the first time.
Apparently "What do you take me for" is an extremely old phrase. Funny how things stick around. I wonder if that's a result of translation though.
https://www.reddit.com/r/ReallyShittyCopper/
Also: https://xkcd.com/2758/
Most likely not, I believe they're starting with scrolls that were readable on the outside, which we know are minor works of Greek stoic philosophy. Also a laundry list would be written on a reusable wax tablet, rather than costly papyrus.
Similarly, various emperors far away in China enforced a color monopoly of their own, except it was on yellow.
https://en.wikipedia.org/wiki/Color_in_Chinese_culture#Yello...
Still interesting that they found that word. As far as I know the sea snail it comes from didn't inhabit the waters off Herculaneum.
Probably ~half of that will go to taxes?
Even if it were, a laundry list from 2000 years ago would be a fascinating read.
I think you'd be shocked how well LLMs translate cuneiform in the CDLI notation. What's hilarious is my first attempt included examples in-context and Claude prefaced the translation by stating that there's nothing in my example translations about "bulls", "horns" or "grabbing" and that it will ignore that translation. I looked it up word-by-word and realized Claude was right. Blew me away. Yet Assyriology subreddits were as excited about my findings as lawyer subreddits are about LLMs. Not sure why, either. Just a bunch of, "So what? Does that mean it's useful?".
Suggested Reading for beginners:
* Life of Pythagoras, by Iamblichus
* The Golden Ass, by Apuleius of Madauros (specifically, translation by Robert Graves)
* Life of Alexander by Plutarch
* Education of Cyrus by Xenophon
* Parmenides by Plato
Also, I have found SHWEP.net to be invaluable for a gentle yet rigorous guide through many classics, though it takes an esoteric bent (which I love)
Could you elaborate a little bit about what you think gives it these qualities? I've dabbled in some classical literature before but I've always found them to be very difficult reads, so I rarely have the motivation to finish them. I am wondering if there is something I am missing about the genre.
My experiences with ancient texts make me realize that there are so many remaining mysteries (that can be illuminated!), so much material that has never been “processed” by historians or philosophers, and so much that can be useful for the present day.
I’m working on an English translation for Marsilio Ficino’s 1497 publication of “De Mysteriis” — which includes 13 tracts, including Ficino’s own “Philosophy of Pleasure.”
Marsilio Ficino was hugely influential in the Florentine Renaissance of the 1460s-1500 because he was hired by the Medici to translate the old Greek classics (Plato, Plotinus, Hermetica, etc). He helped classical ideas spark the Renaissance! So the fact that his own book has never been translated is mindblowing, and I get to see where I can contribute.
But then in his actual book, I learn that it was fairly common to conceive of the soul, gods, demons etc as entities in the world of Nous or mind. Yet, he specifically says that the soul does not feel and that gods do not feel. That’s weird! Often times people associate soul with “the feeling part.” But there were multiple perspectives on this!
How does this relate to the present? We typically associate intellect and mind with consciousness— yet now AI developments force us to consider mind or intelligence without conscious experience. So, it gives a genuinely interesting framework for understanding “noetic reality” — the unconscious mathematical world of forms and information that seemingly preexists the material cosmos (ie perfect triangles or spheres can be conceived as a part of math that are eternal and timeless).
So that’s just one example, but there are a lot of them I could share, particularly as they relate to the history of science and ideas, but also fascinating social phenomena, like how hard the Romans came down on the Bacchae, or how important the Oracle of Delphi was to Greek colonization, etc.
My favorite quote:
> Yet world was not complete.
> It lacked a creature that had hints of heaven
> And hopes to rule the earth. So man was made.
> Whether He who made all things aimed at the best,
> Creating man from his own living fluid,
> Or if earth, lately fallen through heaven's aether,
> Took an immortal image from the skies,
> Held it in clay which son of Iapetus
> Mixed with the spray of brightly running waters —
> It had a godlike figure and was man.
> While other beasts, heads bent, stared at wild earth,
> The new creation gazed into blue sky;
> Then careless things took shape, change followed change
> And with it unknown species of mankind.
When I first started reading classical literature I was struck by an idea I found in Bruno Snell's The Discovery of the Mind, that Homer, apart from having no words corresponding to our "mind" or "soul," didn't even refer to the body as a single whole--more as a collection of limbs. The article here talks about this: https://intertheory.org/torrente.htm .
None of this makes ancient literature easier to read, though, unfortunately.
https://digitalcommons.unl.edu/cgi/viewcontent.cgi?article=1...
Here’s a short segment from the dialogue:
“Then the one cannot have parts, and cannot be a whole?
Why not?
Because every part is part of a whole; is it not?
Yes.
And what is a whole? would not that of which no part is wanting be a whole?
Certainly.
Then, in either case, the one would be made up of parts; both as being a whole, and also as having parts?
To be sure.
And in either case, the one would be many, and not one?
True.
But, surely, it ought to be one and not many?
It ought.
Then, if the one is to remain one, it will not be a whole, and will not have parts?
No.
But if it has no parts, it will have neither beginning, middle, nor end; for these would of course be parts of it…”
> If these words are indeed what we think they are, this papyrus scroll likely contains an entirely new text, unseen by the modern world.
Title: Herculaneum scrolls: A 20-year journey to read the unreadable
It goes a little into the technology of how this was done: deep learning finally cracked the code. They had the scans for a decade, but it took ML training to identify which parts were papyrus and which parts were the ink on top. As the video says, this had been done 20 years ago on a different set of scrolls with easier-to-read, higher-contrast materials. Deep learning is cracking the code for datasets we had previously thought were impossible to solve algorithmically.
Also, so far the process of virtually unrolling the scrolls is mostly manual and extremely labour intensive.
[1] https://scrollprize.org/img/firstletters/youssef-new.png
[2] https://caseyhandmer.wordpress.com/2023/08/05/reading-ancien...
[3] https://lh6.googleusercontent.com/C-vKV4SdsyH961w6KPwD6rypt0...
As an aside, the "Professor Seales and team scanning at the particle accelerator" photo looks like it came from a TV show. "If we keep telling the computer 'enhance', we'll be able to read it".
interesting terminology, I've never been given accolades for being a multifaceted human being.
I've gotten "generalist" and "after much consideration, we have decided not to proceed with your candidacy"
From what I've seen, genuine polymaths shy away from being identified as one, and if they do decide to get recognition, they're shooed away for the unforgivable sin of Not Being Famous Enough.
If you can just stack 20 random books and within seconds have them be indexed and searchable digital ones, libraries as we know them will suffer perhaps the final blow in obsolescence.
You can avoid the longform essay below if you want. The short of it is there are several potentially common works possibly in the library that could directly prove or disprove what is found in the New Testament and the predicates of Rabbinic Judaism as established at the Council of Jamnia.
We could be seeing the beginning of conclusive proof that invalidates the narratives of Christianity, Judaism, and Islam by the end of the year.
The Vesuvius Challenge isn't just an interesting contest in the machine learning realm; it's a groundbreaking endeavor that could redefine our understanding of the humanities if successful. The opportunity to digitally unroll and read the Herculaneum Papyri could offer unprecedented insights into ancient civilizations and the total feedstock of civilization today. This is not merely about filling in some historical gaps; it’s about fundamentally altering how we understand antiquity and, by extension, our own intellectual heritage.
The loss of the Library of Alexandria has long been considered a "dark age" event for intellectual progress. Now, consider the Herculaneum library, a collection of papyri from a villa once owned by Julius Caesar's father-in-law, carbonized but preserved by the Vesuvius eruption in 79 AD. Hundreds of these scrolls are unreadable because their carbon-based ink blends in with the carbonized papyrus and is thus invisible to conventional imaging techniques. Yet, these scrolls are quite possibly on the cusp of revelation.
Recent developments have introduced machine learning and high-resolution X-ray scans as methods for reading these "unreadable" scrolls. What texts do they contain? Treatises on science and philosophy? The lost books of Livy? The epic cycle? Governmental policies like the Twelve Tables? It’s a tantalizing question because whatever is locked in those scrolls could be an unfiltered look at the Roman Empire—an empire that fundamentally influenced the trajectory of Western culture, religion, governance, and philosophy.
Ponder a history of Rome that has not been retouched by myriad emperors, by Constantine's Christianity, or the interpretive lens of the Roman Catholic Church. Unmediated accounts of Roman society, unaltered by the layers of religious and political power that came later, could rewrite our textbooks and shift the justification of history. It’s not just about enriching our understanding of ancient civilizations; this could be a cornerstone on which to build a fresh philosophical understanding of human society.
If the project succeeds, there will be repercussions in the academic realm. The humanities have long struggled to justify their existence in a world that increasingly prizes STEM and lacks any novel sources for the classical world. Suddenly, there could be a concrete, urgent task at hand: to decode, interpret, and integrate an influx of new knowledge. The Vesuvius Challenge could revitalize the field, offering an unforeseen but compelling reason for its study. In essence, it provides a utilitarian justification for the humanities, one that transcends 'cultural enrichment' and enters the realm of 'historical redefinition.'
The Vesuvius Challenge could be the hinge upon which history swings, yielding intellectual treasure that could be as groundbreaking as the writings that were lost in Alexandria. For millennia, those scrolls have remained unread. Now, it's a software problem. That's not just a challenge; it’s an imperative.
The presence of specific works in the Herculaneum Papyri could dramatically impact our understanding of major historical events.
In particular for me, I pray that the biography of Herod the Great by Nicholas of Damascus is discovered intact. While mainstream accounts generally portray the life of Herod within the context of Roman patronage and Judaean politics, uncovering a contemporary account by a close intimate (and used as a primary source by Josephus) would offer fresh, unmediated insights into his rule and its socio-political intricacies. Chronologies of the life of Jesus could be explicitly validated or disproved.
The relevance here is far from academic. Consider the following naturalistic hypothesis: that the inception and rise of Christianity was entirely a dynastic struggle within the Hasmonean-Herodian line. What if the tale of Jesus is, in essence, a dramatized, mystified rendition of a 1st-century dynastic conflict, one that was subsequently co-opted and transformed into a religious narrative by an early form of conspiratorial thinking? Something like a 1st-century version of Q-anon, distorting real events to serve an alternative, concealed agenda in the aftermath of the First Jewish-Roman War.
Unveiling a document like Nicholas of Damascus' biography could be groundbreaking in testing such a hypothesis. If Herod's life and rule were detailed without the religious overlays that later Christian interpretations bring into the picture, one could make more definitive assertions about the socio-political environment of the time. Furthermore, it could provide concrete evidence to either substantiate or refute theories about Christianity's emergence as a byproduct of a Herodian-Hasmonean power struggle.
The fact that such a theory could be tested is significant in its own right. Traditionally, discussions about early Christianity rely heavily on religious texts and subsequent historical accounts, many of which are fraught with dogma and ideological interpretations. A primary source devoid of such influences would be a game-changer, offering a baseline of raw data from which more accurate and reliable hypotheses could be drawn.
And it's not limited solely to Christianity. The implications for Rabbinic Judaism could be equally monumental. The owner of the villa, likely a wealthy Roman, would be unlikely to have had any primary Hebrew texts like the Pentateuch. However, that doesn't rule out the possibility of possessing Greek or Latin works discussing Jewish culture, beliefs, and politics. Given the villa's historical context, it's conceivable that there might be indirect ethnographic accounts from the period surrounding the destruction of Jerusalem in 70 AD but before the Council of Jamnia, traditionally dated around 90 AD, which helped canonize Hebrew scriptures.
Why is this important? The Council of Jamnia is often cited as a crucial moment for the development of Rabbinic Judaism. It allegedly led to the fixing of the Hebrew Bible canon and crystallized what would become Talmudic tradition. If documents were to surface that provide a snapshot of Judaic thought and practice just before this council, it could upend millennia of precedent and identity.
In a broader context, discovering pre-Jamnia ethnographic sources could significantly change our understanding of how Judaism adapted and evolved in the aftermath of the Second Temple's destruction. This could lead to far-reaching questions. How much of the Talmudic tradition was actually a post-hoc rationalization or systematization of beliefs and practices that were far more fluid before the Council of Jamnia? How much anti-Romanism was pared away to prevent suppression? Moreover, how would such a revelation interact with or even challenge the validity of current Rabbinic and Orthodox Jewish practices?
The implications for the Judeo-Christian heritage as a whole are staggering. If both Christianity and Judaism could be traced back explicitly to politically or socially motivated machinations, rather than divinely inspired or time-honored traditions, the entire foundation of Judeo-Christian culture would come into question. In essence, the Vesuvius Challenge has the potential to destabilize two of the world’s major religious traditions at their historical roots. It is difficult to overstate the potential impacts.
The Vesuvius Challenge is not just an academic or technological endeavor. Its success could instigate an unparalleled epistemological crisis in religious studies and the humanities. It provides the opportunity to re-examine, with primary sources, the historical foundations of Western religious, cultural, and ultimately political traditions. We're not just potentially rewriting history here; we're reevaluating the very frameworks through which that history has been understood.
On the bright side, it will be really fascinating to those of us who like history. We might learn a thing or two from these ancient texts, so there's certainly a silver lining.
Anyway, Christian followers already expect science to bring faith into question.
I'm one such follower who believes science and faith can describe the same truth, as long as the science and faith are both accurate. Both are systems of experimentation, trial and error. There is much we can learn from historical records!
"Early attempts to open the scrolls unfortunately destroy many of them. A few are painstakingly unrolled by an Italian monk over several decades, and they are found to contain philosophical texts written in Greek. More than six hundred remain unopened and unreadable.
What's more, excavations were never completed, and many historians believe that thousands more scrolls remain underground.
Imagine the secrets of Roman and Greek philosophy, science, literature, mathematics, poetry, and politics, which are locked away in these lumps of ash, waiting to be read!"
Anyway, if there's religion involved, I doubt any revelation will shake anything.
At any time in the last sixteen hundred years, if such evidence were uncovered, it would've been burned immediately and the monastic reading it likely consigned to perpetual silence, lest the word get out.
But the hegemony of Christianity in the West is over.
For example, the Iliad and Odyssey alone contain around 500 hapaxes. Even if the ground is not shaken, there will at least be some tremors in the field, regardless of whether whole works can be successfully recovered.
Surely writing an essay isn't a good way to convince them if they're illiterate? Maybe they can use TTS.
[1] https://en.wikipedia.org/wiki/Antikythera_mechanism
[2] https://www.youtube.com/watch?v=6Wp3wL8g2Eg
Paging Germans
The idea of reading text backwards and
bɘɿoɿɿim ɿɘɈɈɘl ʜɔɒɘ ʜɈiw ,ƨbɿɒwɿoʇ
is absolutely nuts though. I'm definitely
.ɈɒʜɈ Ɉqobɒ Ɉ'nbib ɘw bɒlϱ
better page Japanese
That's what u get for optimising your code
> Youssef used a model from the Kaggle competition and was inspired by Luke’s results to look in the same area.
> Before long, the model was unveiling traces of crackle invisible to his own eye. Soon, these traces began to form letters and hints of actual words.
This does not sound like a "Large Language Model (LLM)" or other large set of training data, like the sort hyped by so-called "tech" companies; this sounds relatively small. What am I missing? (Besides brain cells.)
Amusing that this implies the Vesuviuans had the ability to read unopened scrolls.