May 03, 2007

Hal Finney on 'AACS and Processing Key'

Hal Finney posts an explanation of the AACS movie encryption scheme. This FC scheme has just been cracked, and the primary keys published, to much media and legal attention. As digital rights management is a core financial cryptography application, it's worth recording the technology story as a case study, even if the detail is overwhelming!

Since this is the cryptography mailing list, there might be interest in the cryptography behind this infamous key. This is from AACSLA Specifications page, particularly the first spec, Common Cryptographic Elements. The basic cryptography is from Naor, Naor and Lotspiech.

The AACS system implements broadcast encryption. This is a scheme which has also been used for satellite TV. The idea is that you want to encrypt data such that any of a large number of devices can decrypt it, with also the possibility of efficiently revoking the keys in a relatively small subset of devices. The revocation is in case attackers manage to extract keys from devices and use them to decrypt data without authorization.

Broadcast encryption schemes such as that used by AACS equip each device with a set of keys, and encrypt a content key to various subsets of these keys such that each authorized device can decrypt the content key but the revoked devices cannot. Various methods have been proposed for achieving this, with tradeoffs between the number of keys held by each device and the amount of data which must be broadcast to hold all of the necessary encryptions.

AACS uses a binary tree based method, where each device corresponds to the leaf node of a tree. It uses a tree with depth of 31, so there are 2^31 leaf nodes and 2^32 - 1 nodes in all. At this point it is believed that software players of a particular type are all considered a single device, while hardware players are each a unique device. This will allow individual hardware players to be revoked, while requiring all software players of a given brand or type to be revoked at once. This tradeoff is assumed to be acceptable because it is easy to get a new version of a software player.

The method of assigning and handling the keys is called subset-difference. It allows a single encryption to be decrypted by any of the devices in a given subtree of the main tree, minus any sub-subtree of that subtree. In this way, any set of revoked nodes can be handled by the union of an appropriate chosen set of subset-difference encryptions. For example, suppose two nodes A and B are to be revoked. Let A be to the left of B, and call their lowest common ancestor node C. Encrypt to the whole tree minus the subtree rooted at C; also to C's left child's subtree minus A; also to C's right child's subtree minus B. This will cover all nodes except A and B.

To implement subset-difference, the root node of each subtree is assigned a unique key called a device key. Then going down the subtree from that node, each node gets its own device key as a distinct one-way hash of its parent's device key. The result is that if you know a node's device key, you can deduce the device keys of all descendants of that node.

This assignment of keys is carried out independently for each subtree, so a node at level n has n+1 device keys associated with it, one for each of the n+1 subtrees that it is a part of.

Leaf nodes correspond to devices, but devices do not get the device keys for "their" leaf node. Instead, they are given the device keys of the sibling node of their leaf, as well as the device keys of all of the siblings of their ancestor nodes. Because knowing a device key allows deducing the device keys of all its descendents, this assignment allows each physical device to deduce all device keys in the tree except for their "ancestor" nodes: those on the one branch of the tree leading to the leaf node.

To implement subset-difference encryption, suppose we want to encrypt to all nodes in the subtree rooted at node A except those nodes in the sub-subtree rooted at node B. Then we encrypt to the device key of node B that was assigned as part of the device key system rooted at node A. All nodes in the node-A subtree except those below node B can deduce this device key, because B is not one of their ancestors. Nodes below B cannot deduce the device key because B is an ancestor, and nodes not below A cannot deduce it because this set of device keys was unique to the node-A subtree.

In order to get the system started, one node is considered pre-revoked and not assigned to any physical device. Initially, the data is encrypted to the device key assigned to that node as part of the system for the whole tree. Every device will be able to deduce that device key and decrypt the data.

That one key is the "processing key" about which so much fuss is being made. All HD-DVD disks that were initially produced have their content keys encrypted to that single key. Knowing this processing key, along with other information available from the disk, allows determining all necessary decryption keys and provides access to the plaintext of the content. With this value having been published, all of the first generation of HD-DVD disks can be played.

The interesting thing is that publishing a processing key like this does not provide much information about which device was cracked in order to extract the key. This might leave AACSLA in a quandary about what to revoke in order to fix the problem. However in this particular case the attackers made little attempt to conceal their efforts and it was clear which software player(s) were being used. This may not be the case in the future.

AACSLA has announced that they will be changing the processing keys used in disks which will begin to be released shortly. Software players have been updated with new device keys, indicating that the old ones will be revoked. In the context of the subset-difference algorithm, there will now probably be a few encryptions necessary to cover the whole tree while revoking the old software player nodes as well as the pre-revoked node. This will make the processing key which has been published useless for decrypting new disks.

Because processing keys do not unambiguously point to their source, AACSLA may choose to set up subset-difference encryptions in which each software player is part of a different subtree and therefore uses a different processing key. This might require a few more encryptions than the minimal number that subset-difference allows, but it would reduce the chance that AACSLA would find themselves unable to determine the source of a published processing key. This will only work as long as attackers restrict themselves to the relatively few software players. If some group were to succeed in extracting keys from a hardware player and publish a processing key that might apply to the majority of hardware players in use, AACSLA would seemingly have no way to determine how to address the problem.

Now I must confess that this already long message has oversimplified the AACS system in certain respects. First, the subset-difference system is only carried on for the lowest 22 levels of the 31 level tree. There are effectively 512 independent trees where the algorithm is applied, each with a single pre-revoked leaf node. However at this time it appears that only one is in use.

Second, the processing key is not actually the same as the node's device key, but rather is a hash of the device key. Further, the exact details of how you go from the processing key to the various disk content keys involve several levels of indirection and encryption.

Third, even given the processing key, some of the information needed to derive all of the disk's content is not easily available. One piece needed is a per-title Volume ID which is not readable from the disk in a straightforward way. Volume IDs have been discovered by eavesdropping on the USB bus connected to a disk player, or by hacking disk player firmware. At this point it is hard for typical end users to read Volume IDs, so knowing the processing key is not generally sufficient to read disks. Databases of Volume IDs have been published online, but disk media keys could just as easily have been published.

Speculating now, the AACS system is flexible but it is possible that publication of processing keys may not have been fully anticipated by the designers. The difficulty of tracing processing keys to their source in an environment in which new disks may require many weeks or even months of lead time may interfere with the planned revocation system. The current processing key will soon be rendered invalid for new releases, so AACSLA's aggressive legal tactics seem disproportionate compared to the relative unimportance of this particular key. Perhaps these legal actions are primarily intended to limit distribution of future processing keys that are found on the next set of disk releases. That would further point to technical difficulties in revocation strategy when a new processing
key is published.

Hal Finney

Posted by iang at May 3, 2007 07:53 AM | TrackBack
Comments talks about Blacklisting and so forth

Posted by: Iang ==> Freedom to Tinker at May 5, 2007 05:06 AM
Post a comment

Remember personal info?

Hit preview to see your comment as it would be displayed.