I've been lurking on the bitcoin-dev list for a while to observe how they handle issues just like this. I'm confident that these problems will be transient.
For instance, if I want to publish an 'n' byte message, I could generate 'n' wallets, where the final byte of the i'th wallet's fingerprint is the i'th byte of my message. Each byte takes about 256 attempts on average to match, so constructing 'n' such wallets will require roughly "256*n" units of work, which is quite small, all things considered. I can then transfer a single bitcoin to each wallet in turn, forming a linked list of the bytes in my message. Even better, I also get my coin back at the end.
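A rough sketch of that search, in Python. This is not real wallet generation: `fingerprint` is a hypothetical stand-in that hashes a random secret with SHA-256 and keeps 20 bytes, where a real client would derive a key pair and hash the public key. The point is only the cost: matching one byte takes about 256 attempts on average, so the total work grows linearly with the message length.

```python
import hashlib
import os

def fingerprint(secret: bytes) -> bytes:
    # Stand-in for a real address hash (assumption): 20 bytes derived from the secret.
    return hashlib.sha256(secret).digest()[:20]

def find_wallet(target_byte: int):
    """Draw random secrets until the fingerprint's last byte equals target_byte."""
    attempts = 0
    while True:
        attempts += 1
        secret = os.urandom(32)
        fp = fingerprint(secret)
        if fp[-1] == target_byte:
            return secret, fp, attempts

def wallets_for_message(message: bytes):
    # One "wallet" per message byte, in message order.
    return [find_wallet(b) for b in message]

def read_message(wallets) -> bytes:
    # A reader only needs the last byte of each fingerprint, in order.
    return bytes(fp[-1] for _, fp, _ in wallets)

wallets = wallets_for_message(b"hi there")
total_attempts = sum(attempts for _, _, attempts in wallets)
print(read_message(wallets), total_attempts)
```

The attempt count varies from run to run because the secrets are random.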
This is to say nothing about the fact that everyone else running bitcoin will also possess these bytes in their blockchains, making the possession of them rather unextraordinary.
This whole article is just the latest in "Bitcoin doomed to fail, and here's why!" bullshit that's been going on for what feels like a decade but is really only 3 years or so.
We're also not really talking about small amounts of data (at least at the moment); a few megabytes is relatively significant...
I think the fact that it's "unextraordinary" to possess this data is the interesting thing. That may force a legal distinction which in itself pushes us toward a different understanding of "illegal data": perhaps the legal system has to give up on criminalizing possession and instead make accessing or "distributing with intent" the illegal act.
The real problem here is not that child pornographers would actually use bitcoin to distribute links, it's that assholes who want to damage bitcoin would put contraband in the blockchain in order to cause legal trouble for innocent users.
But I think that's a broader problem than just bitcoin. You can encode anything into anything. Take anything anyone else has posted and xor it with something you want to encode. The output will resemble garbage rather than either input. But now you can post the "garbage" and instructions on what to xor it with to allow anyone to recover your encoded message, and the poster of the other message becomes an unwilling participant in your encoding scheme. It clearly makes no sense to punish distributors of the original message just because the encoded message is contraband. Which doesn't mean there won't be laws that will punish it anyway, but that is the fight that needs to be won -- to not allow stupid laws that would punish innocent people.
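For what it's worth, the XOR trick takes only a few lines. A minimal sketch in Python; the two byte strings are made-up placeholders, and the cover text is repeated if it is shorter than the message (an assumption of this sketch, not something the scheme requires).

```python
from itertools import cycle

def xor_with_cover(data: bytes, cover: bytes) -> bytes:
    """XOR data against cover, repeating cover as needed."""
    if not cover:
        raise ValueError("cover must not be empty")
    return bytes(d ^ c for d, c in zip(data, cycle(cover)))

# Placeholder inputs for illustration only.
cover = b"Text that somebody else already posted in public."
secret = b"the message I actually want to pass along"

garbage = xor_with_cover(secret, cover)     # what gets posted
recovered = xor_with_cover(garbage, cover)  # what a reader with the instructions gets

print(garbage.hex())
print(recovered)
```

XOR is its own inverse, so the same function encodes and decodes. The author of the cover text never has to know any of this happened.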
This is not like XORing data, or, as some people have said, "everything occurs somewhere in the digits of pi". The blockchain in no sense encodes all possible values, or a fraction thereof. The embedded data is trivial to extract.
This is much more like content sitting on a web server but not indexed by Google. It's actually even worse than that, because you can just grep through the blockchain and find what you're interested in.
This may or may not be a problem for bitcoin, but I think it is legally problematic at the moment. This may move us toward a world where it's not illegal to store any particular data or even distribute it. The illegal act might be the viewing or "distribution with intent" of the data. I think that would be an interesting development.
As a Bitcoin user, I've deleted the standard Qt client; I don't want that data on my computer. I now use a blockchainless client (Electrum).
An encoding method is ASCII text. You could use ASCII compressed with gzip, or bzip2, or lzma. You could use Unicode. You could use a previous block as the key and encrypt with AES, or Blowfish, or 3DES. You could store an IP address and port rather than a URL as the first six binary octets. Or encode the IP using base64, or hex.
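To make the "IP address and port as six octets" option concrete, here is a small Python sketch. The address and port are documentation placeholders, and the function names are made up for illustration. The same six bytes are then shown as hex and as base64.

```python
import base64
import ipaddress
import struct

def pack_endpoint(ip: str, port: int) -> bytes:
    """4 octets of IPv4 address followed by a 2-octet big-endian port."""
    return ipaddress.IPv4Address(ip).packed + struct.pack("!H", port)

def unpack_endpoint(blob: bytes):
    """Inverse of pack_endpoint: read the first six octets."""
    ip = str(ipaddress.IPv4Address(blob[:4]))
    (port,) = struct.unpack("!H", blob[4:6])
    return ip, port

raw = pack_endpoint("192.0.2.1", 8080)          # placeholder endpoint
as_hex = raw.hex()                              # same bytes, hex text
as_b64 = base64.b64encode(raw).decode("ascii")  # same bytes, base64 text

print(len(raw), as_hex, as_b64)
print(unpack_endpoint(raw))
```

Each representation looks different on disk, so a reader has to be told which one was used before a scan of the data turns anything up.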
No matter what you use, you have to convey that to the party you're communicating with -- you at least have to convey the fact that you've encoded something in the blockchain so that the receiver knows to look for it there. How is it easier to convey "you should download the bitcoin blockchain and run strings against it and the URL is the 352nd one you find [out of the six thousand URLs various unrelated people will have encoded]" than to just send the damn URL directly to the person you're telling where to look for it?
>This may or may not be a problem for bitcoin, but I think it is legally problematic at the moment. This may move us toward a world where it's not illegal to store any particular data or even distribute it. The illegal act might be the viewing or "distribution with intent" of the data. I think that would be an interesting development.
I think it would be a welcome development. Right now people are too afraid to be distributors, which makes things difficult for whistleblowers, democracy advocates in oppressive regimes, and others who have legitimate reasons to want anonymous, censorship-resistant publication methods.
Any information contained in those bytes is just that: information. What can you say in 20 bytes that can have permanent, material damage to human beings?
A wonderfully sensationalist title, but really nothing to back it up.
The AACS encryption key was 16 bytes: http://en.wikipedia.org/wiki/AACS_encryption_key_controversy
In general, private keys are in that range of byte size.
Saying something meaningful enough to get someone jailed for possessing a drive with the string on it (the criterion for this to be actually harmful to bitcoin) is nearly impossible in 20 bytes in most parts of the developed world.
The article is sensationalist bullshit.
You've considered only the single-use nature of the 20 byte attack, without realizing that that can be done over, and over, and over again.
As moxie pointed out elsewhere in these comments, Travis Goodspeed and Dan Kaminsky embedded a eulogy to Len Sassaman in the blockchain. See http://pastebin.com/raw.php?i=BUB3dygQ .
But no need to read the other comments, since the author wrote "Some folks have exploited that feature/flaw to publish Wikileaks cables." Information about that publication is in the immediately previous article: "That publishing capability was put into use a couple of days ago when someone publish 2.5 MB of Wikileaks cables in the bitcoin blockchain. It cost a bit of money (about $500) to accomplish that, but the information that was published is now going to be public forever."
A search finds someone who wrote "The wikileaks data starts at transaction 5c593b7b71063a01f4128c98e36fb407b00a87454e67b39ad5f8820ebc1b2ad5".
Therefore, I find your claim that there is "nothing to back it up" untenable.
I think part of the problem is that people don't want to point directly to the data due to its nature. But you can easily run strings over the blockchain and see what's there. I did so myself, then deleted it and zeroed my free space; unfortunately it's not something I would want on my HD. I moved to a blockchainless client.
Specifically, for the blockchain, a transaction sending bitcoins to multiple addresses would do it.
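As a sketch of what that would look like, assuming each output address carries 20 bytes of payload (the figure used elsewhere in this thread), a longer message is just cut into 20-byte pieces, one per output. The helper names are hypothetical, and the zero-byte padding is a choice made for this sketch, so a message that itself ends in zero bytes would not round-trip.

```python
CHUNK = 20  # assumed payload bytes per output address

def split_into_outputs(message: bytes, size: int = CHUNK):
    """Pad with zero bytes to a multiple of size, then cut into pieces."""
    padded = message + b"\x00" * (-len(message) % size)
    return [padded[i:i + size] for i in range(0, len(padded), size)]

def join_outputs(chunks) -> bytes:
    """Concatenate the pieces in output order and drop the zero padding."""
    return b"".join(chunks).rstrip(b"\x00")

message = b"more than twenty bytes in a row, spread over several outputs"
outputs = split_into_outputs(message)
print(len(outputs), [len(o) for o in outputs])
print(join_outputs(outputs))
```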
What if someone manages to embed something very much like the EICAR string[0] in it? How many people do you think would use the bitcoin client on Windows if their AV automatically deleted the blockchain as it downloaded in a misguided attempt to protect them?
Of course, first we have to know if this is possible at all. Does anyone know if there's either a) 20 bytes with a very high AV detection rate or b) some way to embed more than 20 bytes in a row in the block chain?
It is a problem that exists in a different layer than the currency, even if it is to some degree 'passed on' through the currency. Likewise, the solution (imho) lies in a different layer: detect a cp link in the blockchain? Great, take down the link, problem solved.
Just as it is not the fault of TCP/IP, or its 'downfall', that it is able to transmit 'evil data', it is not Bitcoin's fault what vandals sometimes write on it.
So while the issue may or may not be sensationalized, writing ignorant and wholly incorrect commentary is not the antidote.
The problem is that such "evil" messages can be broadcast globally without the ability to remove them. A dollar bill with a message on it can be taken out of circulation; the bitcoin blockchain can't be reset without major chaos ensuing.
It's not practical to shut down everyone with a bitcoin database any more than it's practical to raid every server with wikileaks data. If they're going to declare this nuclear war on bitcoin it's not going to be on the basis of some piece of data which by the point it's in the blockchain is out of the bag anyway.
Obviously that's absurd, but where do you draw the line? You need specialized software and the 32 byte transaction ID in order to extract the data.
What other permanent public records could be manipulated like this?
The US legal system is based on intent, so it's kind of pointless.
~/.bitcoin/blocks $ ls | xargs strings -n 20 | tee ~/Downloads/hiddenblockchain.txt
https://en.bitcoin.it/wiki/Original_Bitcoin_client/API_Calls...
Are they just sending coins to an invalid address (their string)?
I mean, you can verify that you are who you say you are simply by using your private key to sign a message; I would think a comparable process would work for this.
EDIT: Facepalm; you're hashing the public key. You don't need to hide that. See my comment below.
The Bitcoin address is just some chain of hashes (and a checksum) applied to the public key. To prove that the address IS actually output from the hash functions [and not spam], simply provide the public key along with it. Of course, you might say that is way too much data for the blockchain to handle. So you limit the requirement of providing the public key to "suspicious" transactions. What constitutes a suspicious transaction could be a matter of debate, but I imagine it could be done, and it would avoid the problem of a Bitcoin's value depending on its ancestry.
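To see why the checksum alone doesn't help, here is a rough Python sketch of the Base58Check step, assuming the standard layout for pay-to-pubkey-hash addresses (a version byte of 0x00, 20 payload bytes, and the first 4 bytes of a double SHA-256 as checksum). The hashing of the public key is deliberately left out: the 20 bytes below are an arbitrary message, and they still come out as a well-formed address.

```python
import hashlib

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def checksum(data: bytes) -> bytes:
    # First 4 bytes of SHA-256 applied twice.
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]

def base58check_encode(payload: bytes, version: bytes = b"\x00") -> str:
    raw = version + payload + checksum(version + payload)
    num = int.from_bytes(raw, "big")
    out = ""
    while num > 0:
        num, rem = divmod(num, 58)
        out = ALPHABET[rem] + out
    # Each leading zero byte is written as the character '1'.
    pad = len(raw) - len(raw.lstrip(b"\x00"))
    return "1" * pad + out

def base58check_decode(addr: str) -> bytes:
    num = 0
    for ch in addr:
        num = num * 58 + ALPHABET.index(ch)
    pad = len(addr) - len(addr.lstrip("1"))
    raw = b"\x00" * pad + num.to_bytes((num.bit_length() + 7) // 8, "big")
    data, check = raw[:-4], raw[-4:]
    if checksum(data) != check:
        raise ValueError("bad checksum")
    return data[1:]  # drop the version byte, keep the 20-byte payload

message = b"hello, blockchain!!!"  # exactly 20 arbitrary bytes, not a key hash
address = base58check_encode(message)
print(address)
print(base58check_decode(address))
```

Nothing in the address distinguishes these 20 bytes from a genuine public-key hash, which is why the proposal above asks for the public key itself as evidence.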
Sounds like an airtight case.
Has spam appeared yet? Because you know that is next.