Beyond the hashing algorithm, some important additions that were previously proposals without widespread use (e.g. Merkle trees for hashing pieces) are becoming required. The focus has mostly been on optimizing latency for the P2P protocol and making sane improvements to the file spec. I feel like trackers were largely overlooked in this update, but I'm biased because I work on a popular tracker.
Ideally, BitTorrent would be broken down into separate specifications that could be used together or in separate systems: one for the file format and piece representation for sharing files, one for the P2P protocol, and one for discovery (trackers, DHTs). I want to believe that there would be far more interesting P2P projects if you could just lift robust primitives from BitTorrent.
> I feel like trackers were largely overlooked in this update, but I'm biased because I work on a popular tracker.
Yes, we did not pay much attention to trackers. BEP52 basically seized the opportunity to make some incompatible changes we always wanted to make anyway (quite a few had accumulated over the years), and there were no such open issues with the HTTP tracker protocol.
This is because HTTP carries so much overhead that most trackers don't really run it anymore. I think promoting UDP to the spec would've been a step in the right direction. Modern trackers have a bunch of tricks like BEP34[0] to avoid getting pounded, and it would be great if every client conformed to them.
I hope I'm not coming off as aggressive. I really appreciate this work and I'm really glad to see a spec revision. It's just as you said: there have been many years and many good improvements that I'd like to see made while there's still a chance to break things.
Yeah, if I remember correctly, the BitTorrent DHT ultimately just maps 20-byte hashes to peer lists (IP + port pairs). It's obviously designed to be convenient for BitTorrent swarm discovery, but nothing about it limits it to BitTorrent usage. Indeed, I'm surprised it's not more widely exploited for P2P bootstrapping.
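A minimal sketch of that abstraction, the way I understand it: a 20-byte key mapped to a set of (IP, port) pairs, with nothing tying the key to an actual torrent. The names here (`ToyDHT`, `announce`, `get_peers`) are illustrative only, not from any BEP.

```python
import hashlib

class ToyDHT:
    """Toy model of the DHT's core abstraction (not the real protocol)."""

    def __init__(self):
        self.store = {}  # 20-byte key -> set of (ip, port) pairs

    def announce(self, info_hash: bytes, ip: str, port: int):
        assert len(info_hash) == 20, "keys are always 20 bytes"
        self.store.setdefault(info_hash, set()).add((ip, port))

    def get_peers(self, info_hash: bytes):
        return sorted(self.store.get(info_hash, set()))

# Nothing requires the key to be a torrent infohash -- any 20-byte
# value works, e.g. the SHA-1 of an arbitrary service name, which is
# what makes it usable for generic p2p bootstrapping.
dht = ToyDHT()
key = hashlib.sha1(b"my-p2p-service").digest()
dht.announce(key, "203.0.113.5", 6881)
peers = dht.get_peers(key)
```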
Did you guys talk with the IPFS team? Do both of you have a desire to start bringing both families of protocols and technologies closer together?
I feel in this age we must make de-fragmentation of efforts our topmost priority.
In particular, I note that there's nothing in there regarding which infohash should be used in the tracker updates. Should traffic with v1/v2 clients be reported separately, or should it be consolidated under the v2 infohash?
What are the security implications of doing this? It seems it wouldn't increase the strength beyond the original 160 bits, no? Was there anything preventing redesigning the protocol to use full 32-byte SHA256 hashes throughout?
80 bits of collision resistance is usually the number accepted for legacy cryptosystems or for lightweight crypto. It's not great, but it's not "too bad".
By truncating 96 bits from the output you also prevent length extension attacks (which SHA-256 is vulnerable to, see [1]). Or rather, it provides 96 bits of security against them, which should be enough.
This is better than using SHA-1, because SHA-1 has "efficient" chosen-prefix collision attacks while SHA-2 currently does not.
Now if it were me I would have chosen a hash function like KangarooTwelve which is faster, provides parallelization for large inputs, allows you to customize the output length and has received a substantial amount of cryptanalysis.
[1]: https://cryptologie.net/article/417/how-did-length-extension...
The hash only gets truncated where it's used as a unique identifier. When you start with a v2 magnet link or torrent file, you get the full 32-byte hash, which means your integrity checking is unaffected.
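A sketch of that split: the full 32-byte SHA-256 digest is what gets verified, and the 20-byte truncation is only used where a SHA-1-sized identifier is expected (e.g. DHT keys, tracker announces). The `info_dict` bytes here are a placeholder, not real bencoding.

```python
import hashlib

# Placeholder standing in for a bencoded v2 info dictionary.
info_dict = b"...bencoded info dictionary..."

full_hash = hashlib.sha256(info_dict).digest()  # 32 bytes: integrity checks
truncated = full_hash[:20]                      # 20 bytes: SHA-1-sized identifier

# Truncation loses nothing for verification, because a client that
# starts from the full hash can always re-derive the truncated form.
```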
Moving BitTorrent away from its present image could be achieved by making P2P useful beyond Blu-ray rips.
But I used it a lot when running bigger downloads like install discs for Linux distros, OpenOffice, etc., and it made a difference when there was some major release and half of the plain old HTTP mirrors were painfully slow or down entirely. Admittedly, that situation got a lot better compared to 10 years ago, but I'm still delighted by how natural it felt to use, since it seamlessly integrated with the browser's download manager. And you didn't have this "uh, I need to start an external program for this" kind of reluctant thought when you saw a website offered a download via torrent. Today I just wonder if BT would have evolved differently if all browsers had included a client.
Developers will code it into their download pages, and decentralized systems like a P2P Wikipedia will become possible, always accessible by anyone with a browser.
here you go https://en.wikipedia.org/wiki/Twister_%28software%29
I wonder why not SHA512? It's actually faster to compute than SHA256 on 64-bit architectures.
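A rough way to check that claim on your own machine (SHA-512 uses 64-bit words internally, SHA-256 uses 32-bit words, so SHA-512 often wins on bulk data -- though CPUs with hardware SHA-256 instructions can reverse the result, so treat this as a sketch, not a definitive benchmark):

```python
import hashlib
import timeit

data = b"\x00" * (1 << 20)  # 1 MiB of input

# Time 50 full hashes of the buffer with each algorithm.
t256 = timeit.timeit(lambda: hashlib.sha256(data).digest(), number=50)
t512 = timeit.timeit(lambda: hashlib.sha512(data).digest(), number=50)

print(f"SHA-256: {t256:.3f}s  SHA-512: {t512:.3f}s")
```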
It is an arms race that is not won by updating a slowly evolving core protocol.
2) SHA1 is replaced with SHA2-256 (2x longer hashes and not broken).
3) Files are represented by a tree structure instead of a list of dictionaries with paths-- this reduces duplication in deeply-nested hierarchies.
4) Backwards compatible-- you can make a .torrent file with both old and new pieces, and a swarm can speak either. This requires padding files from BEP47, which most clients probably don't support.
Per-file metadata increases pretty significantly, from ~19B (just length) to ~68B (length + hash).
The .torrent file only stores the Merkle tree's root hash for each file, and the torrent client will query its peers to get the rest of the Merkle tree (verifiable against the root hash). The leaves of the Merkle tree are the hashes of each 16 KiB block.
Interesting consequences of this:
Piece size isn't baked into the file anymore (and I've seen torrents with 16 MB pieces); the client can dynamically choose its verification piece size by requesting only so many layers of the Merkle tree, or it could skip requesting the tree and verify the whole file at once.
Merkle tree roots will be globally unique. You can scan torrent files for duplicated files and download common files from multiple swarms.
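A minimal sketch of the per-file root computation described above: hash each 16 KiB block, then combine hashes pairwise until one root remains. The real BEP52 padding rules differ in detail (the spec pads odd layers with zero-hashes; duplicating the last node here is just a simplification).

```python
import hashlib

BLOCK = 16 * 1024  # 16 KiB leaf blocks, as in the spec

def merkle_root(data: bytes) -> bytes:
    """Compute a toy Merkle root over 16 KiB blocks (simplified padding)."""
    # Leaf layer: SHA-256 of each block.
    layer = [hashlib.sha256(data[i:i + BLOCK]).digest()
             for i in range(0, max(len(data), 1), BLOCK)]
    # Reduce pairwise; duplicate the last hash when the layer is odd
    # (BEP52 instead pads with zero-hashes -- illustrative shortcut).
    while len(layer) > 1:
        if len(layer) % 2:
            layer.append(layer[-1])
        layer = [hashlib.sha256(layer[i] + layer[i + 1]).digest()
                 for i in range(0, len(layer), 2)]
    return layer[0]

root = merkle_root(b"x" * (3 * BLOCK + 100))  # 4 leaves after padding
```

Because the root depends only on the file's bytes (and the fixed 16 KiB block size), two torrents containing the same file arrive at the same root, which is what enables the cross-swarm deduplication mentioned above.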
Piece size is still baked into the file (as piece length), and is used for presence bitsets, which are a crucial part of the swarm algorithm. Clients download the rarest pieces first to boost efficiency, and this information is handled as bitsets shared between clients indicating "I have chunk {1, 2, 3, ... 50, 52, ... }".
Merkle tree roots will only be unique for each piece length. Piece length should still correlate with total size, to prevent huge bitsets-- a 16 KB piece length on a 64 GB torrent would mean a 4-million-item / 500 KB bitset (!), so it could take 500 KB of RAM per connected peer to maintain state-- or maybe compressed bitsets make this problem irrelevant in practice?
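The back-of-envelope math above can be checked with a tiny helper (`bitfield_bytes` is just an illustrative name, not anything from the spec):

```python
def bitfield_bytes(total_size: int, piece_length: int) -> int:
    """Bytes of presence-bitfield state per peer: one bit per piece."""
    pieces = -(-total_size // piece_length)  # ceiling division
    return -(-pieces // 8)                   # pack 8 pieces per byte

# 64 GiB torrent with 16 KiB pieces: ~4 million pieces, ~512 KiB per peer.
small_pieces = bitfield_bytes(64 * 2**30, 16 * 2**10)
# The same torrent with 16 MiB pieces needs only 512 bytes per peer.
large_pieces = bitfield_bytes(64 * 2**30, 16 * 2**20)
print(small_pieces, large_pieces)
```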
This is one of the biggest things I feel is missing from the current protocol and I'm very glad it's in v2 draft. Now when a group of related torrents are repacked into a single torrent all the swarms are complementary instead of competitive. You don't have to choose between seeding the big pack instead of the individual files, just do what you want and the whole swarm still benefits.
To clarify, this works by the client deterministically reconstructing the tree once they have the whole file, then checking the root's hash, correct?
If torrents A and B both contain the exact same file, but torrent A only has the first half available and torrent B has the second half available, could I combine both torrents to download that file? This could help fix old dead torrents, or at least make the file searchable elsewhere by its SHA-256, for example.
But can't you already download one file? I suppose if a chunk spans two files, you may get a few extra KB of another file you don't want, but it's not noticeable from a user perspective.
I never really thought about the details of how it works, or the really really impressive feats that were accomplished to get it to work. I knew it was a really good technology, but reading this and the comments here puts it on a whole other level.
Why isn't this technology talked about more? Why are blockchains the big "thing" right now with people trying to use them everywhere to see where they fit best, but torrent networks are kind of just... ignored?
The decentralized nature of it seems to open so many possibilities at first glance, is there a reason they aren't being taken advantage of? Is there some kind of "great filter" kind of thing that is preventing widespread usage of something like a torrent network?
Similarly, I heard that Skype used to do something similar. I'm not sure exactly how it worked, and apparently it was a pain to maintain, so I think it has been scrapped by now. I think some software updaters do still use BitTorrent, though.
If I were to guess, the really big reason for the lack of interest from big corporations is that collecting as much data as possible for use in machine learning is very much in vogue, while at the same time bandwidth seems to be very much a non-issue. Thus there is not much to gain and possibly something to lose from employing BitTorrent.
Streaming wants us to download A, B, C, D just in time.
BitTorrent (simplified) wants me to download piece P, you to download piece G, then I get G from you and you get P from me.
There are BitTorrent streaming apps, but they kind of mess with the nature of BT.
OTOH, for things like RPM/Deb/Windows Update etc., it would make great sense.
The BitTorrent DHT is great for storing and exchanging metadata, but a DHT is not something most people associate with BitTorrent (Bitcoin also uses a DHT (for client discovery), as do countless other services).
Blockchain technology on the other hand offers verifiable distributed timestamping (with ok-ish resolution). That has much wider applicability than just payment tracking (which is essentially all bitcoin does), which is why there's plenty of people exploring what's possible.
In this case, I was trying to use it to ask if there is some kind of "unsolved problem", inherent limitation or issue/problem with torrent networks that prevents their widespread usage.
Combining BEPs 46 and 50 enables rapid updates of torrents, but they are fairly new and there are no implementations designed with low latency in mind. Most BitTorrent implementations focus on large amounts of data and throughput, so this use case is not well served in practice, even though the protocol could support it now.
On the other hand, an uncensorable imageboard would profit from the verifiable timestamping of a blockchain, with just the images distributed via a BitTorrent-like mechanism. That also gives you a decent anti-spam mechanism (you can post in exchange for mining blocks, similar to the original idea of hashcash).
Discussion of other changes: https://github.com/bittorrent/bittorrent.org/pull/59
Users hated it for general use, even when downloading big files. 1) They didn't like having to install/run some special software to download a file. 2) They didn't like uploading to others and it slowing down their connections.
Consumer networks are asymmetric, having far more download capacity than upload capacity. This makes sense since 1) most users download and want to use the available bandwidth for faster downloads, and 2) it prevents commercial applications on consumer circuits. This is far from ideal for applications like BitTorrent.
I'm not saying there isn't an application for this technology, I'm saying all the good applications don't want to ask the users to pay for distribution to other users. Thus it's relegated to mostly piracy, open source, etc.
BitTorrent Inc. has been trying to commercialize this for a decade now; I just don't see it happening. If there was anyone who could commercialize it, it was Travis Kalanick, and while he exited for $20m, he was very lucky (and happy) to get out of that market.
It already is though.
Merkle trees allow torrents to start faster from magnet links, since only the tree roots need to be front-loaded while the rest of the tree can be fetched incrementally.
Is it considered the spiritual successor to the original uTorrent?
Now it's full of ads and performs poorly.
[1] Like all the different Linux distro install images over and over again.