BEP: 47
Title: Padding files and extended file attributes
Version: 62b836671fe222554523d37211f4944565f455cf
Last-Modified: Sun Aug 7 18:27:20 2016 +0200
Author: The 8472 <the8472.bep@infinite-source.de>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 05-Aug-2016
Post-History:

Padding files and extended file attributes

This BEP specifies some additional file properties beyond those described in BEP 3 [1].

Multi-file format

{
  "info":
  {
    "files":
    [
      {
        "attr": "phxl",
        "sha1": <20 bytes>,
        "symlink path": ["dir1", "dir2", "target.ext"],
        ...
      },
      {
        ...
      }
    ],
    ...
  },
  ...
}

Single-file format

{
  "info":
  {
    "attr": "hx",
    "sha1": <20 bytes>,
    ...
  },
  ...
}

attr
A variable-length string. When present, each character represents a file attribute: l = symlink, x = executable, h = hidden, p = padding file. Characters may appear in any order, and unknown characters should be ignored.
sha1
20 bytes. The SHA1 digest calculated over the contents of the file itself, without any additional padding. It can be used to aid file deduplication [2]. The hash should be treated only as a hint; piece hashes remain the canonical reference for integrity checking.
symlink path
An array of strings. The path of the symlink target, relative to the torrent root directory.
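
As a rough illustration of how a client might interpret the attr string, the Python sketch below maps each known character to a flag name and skips anything it does not recognize. The function name decode_attr and the flag names are assumptions made for this example; they are not defined by this BEP.

```python
# Attribute characters defined in this BEP. The names on the right are
# illustrative labels chosen for this sketch.
KNOWN_ATTRS = {
    "l": "symlink",
    "x": "executable",
    "h": "hidden",
    "p": "padding",
}

def decode_attr(attr):
    """Return the set of recognized attribute names in an attr value.

    Accepts str or bytes (bencoded strings usually decode to bytes).
    A missing attr (None) yields an empty set. Order does not matter
    and unknown characters are ignored.
    """
    if attr is None:
        return set()
    if isinstance(attr, bytes):
        attr = attr.decode("ascii", errors="ignore")
    return {KNOWN_ATTRS[c] for c in attr if c in KNOWN_ATTRS}
```

Returning a set reflects that the characters carry no ordering and that repeated characters add no information.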

Padding files

Padding files are synthetic files inserted into the file list so that the following file starts at a piece boundary. Their length should therefore fill the remainder of the last piece occupied by the file being padded. For the calculation of piece hashes, the content of a padding file is all zeros.
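
The length calculation can be sketched in Python as follows. The helper names padding_length and hash_padded_piece are hypothetical, and the sketch assumes offsets are counted from the start of the torrent's concatenated file data, as in BEP 3.

```python
import hashlib

def padding_length(end_offset, piece_length):
    """Bytes of padding needed so the next file starts on a piece boundary.

    end_offset is the torrent-wide offset at which the preceding file ends.
    Returns 0 when that offset is already piece-aligned.
    """
    if piece_length <= 0:
        raise ValueError("piece_length must be positive")
    return (-end_offset) % piece_length

def hash_padded_piece(tail, piece_length):
    """SHA1 of a piece whose real data is `tail`, zero-filled by padding."""
    padding = bytes(piece_length - len(tail))  # bytes(n) is n zero bytes
    return hashlib.sha1(tail + padding).digest()
```

For example, with a piece length of 32768 a file ending at offset 100000 needs 31072 bytes of padding, while a file ending at offset 65536 needs none.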

Clients aware of this extension do not need to write padding files to disk and should avoid requesting byte ranges that cover their contents, e.g. via request messages. For backwards compatibility, however, they must still service such requests.
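
One way to service such requests without storing padding files is to synthesize zeros for any part of the requested range that falls inside a padding file. The sketch below is a minimal illustration under assumed data structures: the (length, is_padding, handle) tuples and the read_file callback are hypothetical and not part of this BEP.

```python
def read_torrent_range(files, offset, length, read_file):
    """Return `length` bytes starting at torrent-wide `offset`.

    files: list of (file_length, is_padding, handle) in torrent order.
    read_file(handle, file_offset, n): hypothetical callback returning
    n bytes of a real file, starting at file_offset.
    """
    out = bytearray()
    pos = 0  # torrent-wide offset of the current file's first byte
    end = offset + length
    for file_length, is_padding, handle in files:
        file_end = pos + file_length
        lo = max(offset, pos)
        hi = min(end, file_end)
        if lo < hi:  # the request overlaps this file
            n = hi - lo
            if is_padding:
                out += bytes(n)  # padding content is all zeros
            else:
                out += read_file(handle, lo - pos, n)
        pos = file_end
    return bytes(out)
```

The same overlap test could be used in the other direction, to avoid issuing requests for ranges that lie entirely within padding.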

While clients implementing this extension have no use for the path of a padding file, it should be included for backwards compatibility, since it is a mandatory field in BEP 3 [1]. The recommended path is [".pad", "N"], where N is the length of the padding file in base 10. This way, clients not aware of this extension will write the padding files into a single directory, potentially reusing padding files from other torrents also stored in that directory.
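
A torrent creator might build such an entry roughly as shown below. The function name make_padding_entry is an assumption for this sketch; the "length" and "path" keys come from BEP 3 and "attr" from this BEP.

```python
def make_padding_entry(length):
    """Build a file-list entry for a padding file of `length` bytes.

    Uses the recommended path [".pad", "N"], with N the length in base 10.
    """
    if length <= 0:
        raise ValueError("a padding file should have a positive length")
    return {
        "attr": "p",
        "length": length,
        "path": [".pad", str(length)],
    }
```

Rejecting a non-positive length is a choice made for this sketch: a file that already ends on a piece boundary needs no padding entry at all.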

To eventually allow the path field to be omitted, clients implementing this BEP should not require it to be present on padding files.

Piece-aligned files simplify deduplication [2] and the operations on mutable torrents [3].

The presence of padding files does not imply that all files are piece-aligned.
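
Because alignment cannot be inferred from the mere presence of padding files, a client that relies on it may want to check each file's starting offset. The following sketch assumes decoded file entries are dicts with a "length" key and an optional "attr" string; the function name unaligned_files is hypothetical.

```python
def unaligned_files(files, piece_length):
    """Return indices of non-padding files that do not start on a piece boundary.

    files: list of dicts in torrent order, each with "length" and
    optionally "attr" (as a str).
    """
    result = []
    offset = 0  # torrent-wide offset of the current file's first byte
    for index, entry in enumerate(files):
        is_padding = "p" in entry.get("attr", "")
        if not is_padding and offset % piece_length != 0:
            result.append(index)
        offset += entry["length"]
    return result
```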

Internally inconsistent torrents

If used incorrectly or maliciously, symlinks and padding files can result in internally inconsistent torrents that cannot finish downloading because they contain conflicting hash information.

Similarly, the sha1 fields may be inconsistent with the piece data and lead to failures after deduplication.

Clients should ensure that adding and deduplicating such a torrent does not lead to loss of already existing data.
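
One cautious approach, sketched below, is to compare existing local data against the sha1 hint without modifying it, and to treat a match only as a reason to attempt piece verification. The function names are assumptions for this example, and the sketch deliberately stops short of any deduplication step.

```python
import hashlib

def sha1_of_chunks(chunks):
    """SHA1 digest over an iterable of byte chunks (the unpadded file content)."""
    digest = hashlib.sha1()
    for chunk in chunks:
        digest.update(chunk)
    return digest.digest()

def matches_sha1_hint(chunks, sha1_hint):
    """True if the content hashes to the torrent's sha1 field.

    A match is only a hint. The caller should still verify the piece
    hashes and should leave the existing data untouched until then.
    """
    return sha1_of_chunks(chunks) == sha1_hint
```

The chunks could come from reading an existing file opened in binary read-only mode, so the check itself never writes to the data it inspects.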

References

[1] BEP_0003. The BitTorrent Protocol Specification. (http://www.bittorrent.org/beps/bep_0003.html)
[2] BEP_0038. Finding Local Data Via Torrent File Hints. (http://www.bittorrent.org/beps/bep_0038.html)
[3] BEP_0039. Updating Torrents Via Feed URL. (http://www.bittorrent.org/beps/bep_0039.html)