gamayun's People

Contributors

dominikh

gamayun's Issues

Be quicker at unchoking new peers

We choke/unchoke peers every 10 seconds, based on the choking algorithm. However, I don't think it makes sense to wait up to 10 seconds to unchoke new peers if we still have free slots. We should unchoke them as soon as they express their interest.
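
A sketch of what that could look like; the types, fields, and slot accounting here are hypothetical, not gamayun's actual API:

    // Sketch: unchoke a peer as soon as it expresses interest, instead of
    // waiting for the next 10-second choking round, as long as a slot is free.
    // All identifiers are hypothetical.
    package peerconn

    import "sync"

    type Peer struct {
        Interested bool
        Choked     bool
    }

    type Torrent struct {
        mu          sync.Mutex
        unchoked    int // peers we currently have unchoked
        maxUnchoked int // upload slots
    }

    // OnInterested is called when a peer sends the "interested" message.
    func (t *Torrent) OnInterested(p *Peer) {
        t.mu.Lock()
        defer t.mu.Unlock()
        p.Interested = true
        // Don't wait for the periodic choking round: if a slot is free,
        // unchoke the peer immediately.
        if p.Choked && t.unchoked < t.maxUnchoked {
            p.Choked = false
            t.unchoked++
            // ... send the "unchoke" message to the peer here
        }
    }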

Provide CLI tool for showing information from torrent file

Basically, provide a tool similar to transmission-show, but add support for structured output and access to all information, including the source tag. This tool will be part of the recommended gamayun workflow. For example, to seed torrents from directories chosen based on the tracker's source tag, one could do something like gamayun --add foo.torrent --dir /seed/$(gamayun-info --source foo.torrent)
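
A sketch of what gamayun-info --source might boil down to; flag handling is omitted, and github.com/anacrolix/torrent/bencode is assumed purely for illustration — the struct below is not gamayun's actual code:

    // Sketch: print the "source" tag from a torrent file's info dictionary.
    package main

    import (
        "fmt"
        "os"

        "github.com/anacrolix/torrent/bencode" // assumed decoder, for illustration
    )

    type metaInfo struct {
        Info struct {
            Source string `bencode:"source"`
        } `bencode:"info"`
    }

    func main() {
        if len(os.Args) != 2 {
            fmt.Fprintln(os.Stderr, "usage: gamayun-info <file.torrent>")
            os.Exit(2)
        }
        data, err := os.ReadFile(os.Args[1])
        if err != nil {
            fmt.Fprintln(os.Stderr, err)
            os.Exit(1)
        }
        var mi metaInfo
        if err := bencode.Unmarshal(data, &mi); err != nil {
            fmt.Fprintln(os.Stderr, err)
            os.Exit(1)
        }
        fmt.Println(mi.Info.Source)
    }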

Support protocol encryption

On the surface, encryption > no encryption. However, BitTorrent's encryption is rather useless and more akin to obfuscation; it's not very secure. It also seems to fail at obfuscation:

Analysis of the BitTorrent protocol encryption (a.k.a. MSE) has shown that statistical measurements of packet sizes and packet directions of the first 100 packets in a TCP session can be used to identify the obfuscated protocol with over 96% accuracy.

Encryption would thus seem to be a "feel good" feature without practical use.

Work around bad traffic accounting in trackers

Traffic accounting in BitTorrent trackers is a hacky mess and has a tendency to break.

One series of events that has been observed to cause issues is this:

  1. peer_id=1, event=started, downloaded=0
  2. peer_id=1, event="", downloaded=5
  3. peer_id=1, event=completed, downloaded=10
  4. a day of no internet connectivity while the client continues to run
  5. peer_id=1, event="", downloaded=10

Some trackers seem to interpret event 5 as a new download with 10 bytes downloaded, resulting in a total of 20 bytes downloaded across two downloads. I don't know why they would do that instead of rejecting the announce (if the previous session has timed out, there is no matching event=started announce), but it's the only explanation I could come up with for the behavior I have observed, where a multi-hour internet outage resulted in doubled download stats.

We should probably stop retrying announces after a while and consider the session closed. The next announce would then be a new event=started announce with downloaded=0. This will cause us to under-report some downloaded traffic, but only the amount transferred between the last successful announce and the last failed one. That will generally be much less than what we can currently over-report, namely the entire torrent.
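
A sketch of that logic, with a hypothetical session type and an assumed timeout value:

    // Sketch: once announces have failed for longer than the tracker will keep
    // a session alive, consider the session closed and start a fresh one
    // (event=started, downloaded=0) on the next announce. Names and the
    // timeout value are assumptions.
    package tracker

    import "time"

    const sessionTimeout = 2 * time.Hour // assumed; should exceed common tracker session timeouts

    type session struct {
        started     bool      // have we sent event=started for this session?
        downloaded  int64     // bytes to report, counted per session
        lastSuccess time.Time // last announce the tracker acknowledged
    }

    func (s *session) nextAnnounceEvent(now time.Time) string {
        if !s.started || now.Sub(s.lastSuccess) > sessionTimeout {
            // New or presumed-expired session: start over. This under-reports
            // whatever we downloaded since lastSuccess, but avoids the tracker
            // double-counting the entire torrent.
            s.started = true
            s.downloaded = 0
            return "started"
        }
        return ""
    }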

We might also want to immediately stop and then start a torrent that has finished downloading, to ensure a fresh session with downloaded=0.

Investigate using io_uring for disk I/O

We don't want to maintain an in-process read cache, as that just duplicates what's already in the kernel's page cache. We also want to let the OS handle readahead for us. However, in the worst case, we'll end up doing 16 KiB random reads, which will spend a significant amount of time in syscall overhead. See if io_uring might help with this. It would also reduce the number of threads we'd need.
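
For illustration, this is the worst-case pattern today: every 16 KiB block is a separate pread(2) syscall via os.File.ReadAt. io_uring would let a batch of such reads be submitted and completed with far fewer syscalls. The function below is only a sketch of the status quo, not proposed code:

    // Sketch: one syscall per 16 KiB block. This per-block syscall overhead is
    // what batched submission via io_uring could amortize.
    package storage

    import "os"

    const blockSize = 16 << 10 // 16 KiB, the usual BitTorrent request size

    func readBlocks(f *os.File, offsets []int64) ([][]byte, error) {
        blocks := make([][]byte, 0, len(offsets))
        for _, off := range offsets {
            buf := make([]byte, blockSize)
            // os.File.ReadAt issues a pread(2) for each block.
            if _, err := f.ReadAt(buf, off); err != nil {
                return nil, err
            }
            blocks = append(blocks, buf)
        }
        return blocks, nil
    }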

Support fast resume

At a minimum support our own form of fast resume. We might also want to support libtorrent's format to make migration to our client easier.
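
As a rough sketch, our own fast-resume record might contain something like the following; the fields and their encoding are illustrative, not a decided format:

    // Sketch of a possible fast-resume record.
    package resume

    import "time"

    type FastResume struct {
        InfoHash [20]byte // which torrent this record belongs to
        Bitfield []byte   // which pieces have been downloaded and verified
        Files    []FileState
        SavedAt  time.Time
    }

    type FileState struct {
        Path    string    // path on disk (see also the per-file mapping issue)
        Size    int64     // file size when the record was written
        ModTime time.Time // used to detect modification since the last run
    }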

Allow flexible mapping between files in a torrent and files on the file system

Allow specifying per-file where to download to/seed from. For a single torrent file with the files a/a.ext and a/b.ext, it should be possible to back them with /whatever/cool.ext and /somewhere_else/not_so_cool.ext. This mapping should be changeable at runtime, especially after a torrent has finished downloading and has been moved.
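
A sketch of such a mapping; the type and method names are hypothetical. With it, a/a.ext could be backed by /whatever/cool.ext after calling Set("a/a.ext", "/whatever/cool.ext"), and the mapping could be changed again later without touching the torrent itself:

    // Sketch: map torrent-internal file paths to arbitrary paths on disk,
    // changeable at runtime.
    package storage

    import "sync"

    type FileMapping struct {
        mu    sync.RWMutex
        paths map[string]string // torrent-internal path -> filesystem path
    }

    // Set changes where a torrent-internal file is backed on disk, e.g. after
    // a finished download has been moved elsewhere.
    func (m *FileMapping) Set(torrentPath, fsPath string) {
        m.mu.Lock()
        defer m.mu.Unlock()
        if m.paths == nil {
            m.paths = make(map[string]string)
        }
        m.paths[torrentPath] = fsPath
    }

    // Resolve returns the filesystem path backing a torrent-internal path,
    // falling back to the torrent's own layout when no override exists.
    func (m *FileMapping) Resolve(torrentPath string) string {
        m.mu.RLock()
        defer m.mu.RUnlock()
        if p, ok := m.paths[torrentPath]; ok {
            return p
        }
        return torrentPath
    }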

Load metainfo lazily

In a seed-centric workload with lots of torrents, keeping all metainfo in memory at all times is a waste of memory. The majority of torrents will be idle and not need to know more than their infohash. Even active torrents won't need piece hashes if they're just seeding. We should load metainfo lazily, and only load the subset of metainfo that we need.
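
A sketch of the lazy loading, with a hypothetical Metainfo type and loader function:

    // Sketch: idle torrents keep only their infohash in memory; the rest of
    // the metainfo is loaded from disk the first time it is actually needed.
    package torrent

    import "sync"

    type Metainfo struct {
        PieceHashes [][20]byte // only needed while downloading, not while seeding
        // ... remaining metainfo fields
    }

    type Torrent struct {
        InfoHash [20]byte

        metaOnce sync.Once
        meta     *Metainfo
        metaErr  error
    }

    // Metainfo loads the full metainfo on first use via the provided loader.
    func (t *Torrent) Metainfo(load func([20]byte) (*Metainfo, error)) (*Metainfo, error) {
        t.metaOnce.Do(func() {
            t.meta, t.metaErr = load(t.InfoHash)
        })
        return t.meta, t.metaErr
    }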

Add torrent priorities

We have P available peer connections and t active torrents. By default, each torrent will be able to use P/t connections (but may use more if other torrents use fewer). However, we may want to prioritize some torrents over others. There are two ways to do this, and we probably want both of them:

  1. Give individual torrents a bigger share of connections. Each torrent will have a factor f in [0, ∞], defaulting to 1. The minimum number of peers for an individual torrent will then be f / ∑(all factors) · P. The user could define a set of such factors, like 1, 5, 10, ..., and give them names such as "normal", "medium", "high", and so on. Ultimately, the choice of factors determines just how starved a torrent can be in relation to other torrents (a sketch of this computation follows below).
  2. Assign priority groups to torrents, and allocate connections to groups in order of priority, allowing more important groups to use up all connections, leaving none for lower priority groups. Within a group, factors would apply, but only consider torrents within the group.

The same approaches can be extended to upload/download slots.
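
A sketch of the share computation from approach 1; the names are illustrative. Approach 2 would run the same computation per priority group, handing any connections a group doesn't use down to lower-priority groups:

    // Sketch of approach 1: each torrent's guaranteed share of the P available
    // connections is f / sum(all factors) * P.
    package sched

    type Torrent struct {
        Factor float64 // priority factor, defaulting to 1
    }

    // minConnections returns the guaranteed connection count per torrent,
    // given a global limit of p connections.
    func minConnections(torrents []*Torrent, p int) map[*Torrent]int {
        var sum float64
        for _, t := range torrents {
            sum += t.Factor
        }
        out := make(map[*Torrent]int, len(torrents))
        if sum == 0 {
            return out
        }
        for _, t := range torrents {
            out[t] = int(t.Factor / sum * float64(p))
        }
        return out
    }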

Implement flexible connection limit

A static per-torrent peer limit is quite wasteful.

If we're happy to have 100 peers globally, and we have only one active torrent, then that torrent should be able to have 100 peers. Instead, have a global peer limit, a limit of active torrents, and dynamically adjust per-torrent limits, possibly disconnecting peers from busy torrents when new torrents become active.

The same algorithm probably works for incoming and outgoing connections.
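
A minimal sketch of the dynamic adjustment, assuming an even split across active torrents (the per-torrent factors from the priorities issue could replace the even split); the names are illustrative:

    // Sketch: one global peer limit, split dynamically across active torrents.
    // When a new torrent becomes active, limits are recomputed and torrents
    // over their new share shed peers.
    package sched

    type torrentConns struct {
        peers int // current connection count
        limit int // dynamically assigned limit
    }

    func rebalance(torrents []*torrentConns, globalLimit int) (toDisconnect int) {
        if len(torrents) == 0 {
            return 0
        }
        share := globalLimit / len(torrents)
        for _, t := range torrents {
            t.limit = share
            if t.peers > t.limit {
                // Busy torrents shed peers so newly active torrents can reach
                // their share.
                toDisconnect += t.peers - t.limit
            }
        }
        return toDisconnect
    }

With a global limit of 100 peers and a single active torrent, that torrent may use all 100 connections; when a second torrent becomes active, each drops to a limit of 50.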

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.