
The Impact of the Copyright Amendment Act: Update for September 2012

25 Oct 2012

Updated on October 26, 2012 to reflect that the P2P_Structure category was not entirely reliable.

Introduction

Earlier this year, we managed to generate a bit of interest by studying changes in application protocol usage at one New Zealand ISP after the Copyright Amendment Act came into effect. This eventually led to a publication at IMC 2012, which can be accessed here.

One outstanding question from this work was whether the changes that we observed would persist, particularly given that there have been no notable instances of people being brought before the Copyright Tribunal and punished. Would people eventually revert to their old methods of file sharing or would they continue to use more obfuscated methods? Would those people who stopped file sharing return once they felt more confident that they would not be caught?

With this in mind, we have updated our results with data captured from the same New Zealand ISP during September 2012, one year on from the CAA coming into force. Again, we have looked at the traffic for a subset of the ISP's DSL subscribers only. Unfortunately, we do not have detailed information about the number of subscribers using each protocol, but we do have statistics about the number of flows and bytes for each protocol (both incoming and outgoing) which we can make use of. In this blog post, I'll be comparing the most recent measurements with our earlier results to determine if anything has changed in the past few months.

For those of you who want more detail about the numbers we used to create the graphs below, I've attached a PDF that contains tables describing the raw numbers for the major protocols observed in our dataset. If you want the data for all protocols and the combined category stats, get in touch with me and I'll try to rustle something up for you.

Bytes Downloaded

The first graph presented below shows the change in the number of bytes downloaded for each of the major protocol categories compared with the baseline value observed in January 2011 (i.e. well before the CAA). The change is expressed as a percentage: a value of zero means there has been no change, a value of 100 means that the amount of traffic has doubled and a value of -50 means that traffic has halved.
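To make the metric concrete, here is a minimal sketch of the calculation in Python. The function name and the byte counts are made up for illustration; they are not values from our dataset.

```python
def percent_change(baseline_bytes, current_bytes):
    """Return the change relative to the baseline, as a percentage."""
    if baseline_bytes <= 0:
        raise ValueError("baseline must be positive")
    return (current_bytes - baseline_bytes) / baseline_bytes * 100.0


# Illustrative byte counts only (not values from our measurements).
baseline = 200
print(percent_change(baseline, 200))  # 0.0: no change
print(percent_change(baseline, 400))  # 100.0: traffic has doubled
print(percent_change(baseline, 100))  # -50.0: traffic has halved
```

By the same arithmetic, a five-fold increase over the baseline shows up as a value of 400.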

Bytes Downloaded by Residential DSL Users -- by Application Category

P2P downloads are still significantly down compared with their January 2011 level, although there appear to be signs of a slow recovery. Even so, the amount of observed P2P traffic remains well below what was present before the CAA came about. Encryption (i.e. SSL over non-standard ports) continues to be non-existent in data sets captured after the CAA came into effect, which suggests to me that this traffic was SSL-encrypted P2P traffic.

Tunneling and Remote Access protocols are still significantly up on what they were back in January 2011. While these may be used to transfer downloaded files back to New Zealand from seedboxes located overseas, there are other reasons that could explain this growth too. Much of the growth in Tunneling, for instance, is actually the result of increased use of Teredo - a protocol that tunnels IPv6 traffic over IPv4. It is relatively unlikely that Teredo is being used for copyright infringement; rather, the increase in Teredo can probably be attributed to the growing number of IPv6 capable end hosts.

Bytes Downloaded by Residential DSL Users -- by Application Protocol

The above graph shows the same metric, except this time we are looking at individual application protocols rather than grouping the data based on the application category. It becomes apparent that the recovery in P2P traffic is almost entirely down to growth in BitTorrent UDP traffic - this is because of the increased prevalence of uTP. It is no longer feasible to think of BitTorrent as solely a TCP protocol - most file sharing using BitTorrent is now done over UDP, based on these results.
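The relationship between the two graphs can be sketched as a simple roll-up, where per-protocol byte counts are summed into their categories. This is only an illustration: the mapping covers a handful of protocols named in this post, the placement of OpenVPN and SSH is my assumption rather than the exact classification used by our tools, and the byte counts are invented.

```python
from collections import defaultdict

# Assumed mapping, limited to protocols mentioned in this post.
CATEGORY_OF = {
    "BitTorrent": "P2P",
    "BitTorrent_UDP": "P2P",
    "Teredo": "Tunneling",
    "OpenVPN": "Tunneling",
    "RDP": "Remote Access",
    "SSH": "Remote Access",
    "HTTPS": "Web",
}


def bytes_by_category(bytes_by_protocol):
    """Sum per-protocol byte counts into per-category totals."""
    totals = defaultdict(int)
    for protocol, nbytes in bytes_by_protocol.items():
        # Protocols missing from the mapping are grouped as "Unknown".
        totals[CATEGORY_OF.get(protocol, "Unknown")] += nbytes
    return dict(totals)


# Invented byte counts, not values from our measurements.
sample = {"BitTorrent": 300, "BitTorrent_UDP": 500, "Teredo": 40, "HTTPS": 900}
print(bytes_by_category(sample))
```

A category total can hide which member protocol is driving a change, which is why the per-protocol view is needed to see that the P2P recovery comes from BitTorrent UDP.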

The contribution of Teredo to the growth in the Tunneling protocol is also shown in this graph. Other tunneling protocols, particularly VPN protocols, also experienced very high relative growth but Teredo was the most significant in terms of traffic volume, although it still accounted for much less than 1% of all downloaded traffic in each of our data sets (the peak was 0.6% in Sept. 2012).
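The distinction between relative growth and share of total traffic is worth spelling out, because a protocol can score highly on one and still be negligible on the other. The sketch below uses hypothetical helper names and invented byte counts purely to illustrate the point.

```python
def relative_growth(baseline_bytes, current_bytes):
    """Percentage growth of a protocol relative to its own baseline."""
    return (current_bytes - baseline_bytes) / baseline_bytes * 100.0


def share_of_total(protocol_bytes, total_bytes):
    """Percentage of all downloaded bytes that belong to one protocol."""
    return protocol_bytes / total_bytes * 100.0


# Invented numbers: a protocol that starts from a very small baseline.
baseline, current, total = 10, 50, 10000

print(round(relative_growth(baseline, current), 1))  # very large relative growth
print(round(share_of_total(current, total), 1))      # still a tiny share of all traffic
```

This is why the percentage-change graphs need to be read alongside the raw volumes: a steep line does not necessarily mean a lot of bytes.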

Seedboxes - A Possible Theory

However, the most noteworthy result in this graph is the rapid growth in HTTPS - downloaded HTTPS bytes have increased five-fold in the last 20 months. My suspicion is that this is where the majority of the missing P2P traffic has gone; instead of participating in P2P exchanges directly from home and running the risk of getting an infringement notice from their ISP, the canny subscriber will acquire their content using a foreign seedbox and use HTTPS to safely transfer the content back home.

HTTPS has several features that make it perfect for this use case:
* The transfer of the copyrighted file is encrypted, so it is not possible to inspect the packets and determine that the file being transferred contains copyrighted content.
* The content on the seedbox can be explored and downloaded using a web browser rather than specialised software.
* Secure login via HTTPS means that it is difficult for anyone else to gain access to your content on the seedbox.
* HTTPS has many legitimate uses, so it won't be rate-limited or blocked. Like passive FTP, it also doesn't require any firewall or NAT holes to function properly.

The legitimate uses of HTTPS mean that it is not possible to definitively claim that the growth in HTTPS is the result of seedbox usage. Some other uses of HTTPS that are also likely to have contributed to an increase in HTTPS traffic include online shopping, Internet banking and social networking.

The increased use of seedboxes may also explain the observed growth in a number of remote access and VPN protocols, which would be used to communicate with the seedbox and operate the BitTorrent client. OpenVPN, RDP, ESP, and SSH are examples of protocols that have become much more popular since January 2011. Perhaps some of the Teredo growth could be attributed to this too -- seedboxes are often IPv6-capable.

Uploaded Traffic

Bytes Uploaded by Residential DSL Users -- by Application Category

Bytes Uploaded by Residential DSL Users -- by Application Protocol

The above graphs show the changes in traffic in the reverse direction, i.e. bytes transmitted by DSL users. Again, there have been major drops in P2P and Encrypted traffic and they have persisted through to September 2012. Remote Access traffic has grown significantly in the past few months, primarily on the back of an increase in RDP (Windows Remote Desktop Protocol). HTTPS uploads have also grown rapidly over the time period that we have examined, which constitutes much of the increase we observed in the Web category.

Conclusions

Overall, our most recent results appear to confirm the trends that we had observed from our earlier measurements. It seems that, for this ISP at least, there has been a significant shift away from running BitTorrent at home that does not appear likely to be reversed any time soon. Having said that, BitTorrent is still in the top 5 application protocols in terms of bytes downloaded and BitTorrent DHT traffic is actually growing rather than shrinking. BitTorrent is still significant, but not as much as it was.

The surge in HTTPS is the other major result from our measurements. Downloaded HTTPS traffic is five times what it was in January 2011 and now constitutes 9% of all bytes downloaded by the measured DSL subscribers (compared with 1.6% in January 2011). My theory is that this is indicative of illegal file sharing moving to foreign seedboxes where the user can transfer the files back to their home computer using HTTPS. The corresponding increase in VPN and remote access protocols appears to corroborate this, as these protocols would be used to access and configure seedboxes. However, this theory is difficult to confirm, especially with the data we have at the moment.

I should also note that these results are from one New Zealand ISP only and merely indicate that there is a strong correlation between the CAA and the behaviour that has been noted in this blog post (not a causation!). To be able to form firmer conclusions, we would need to examine the traffic mixes for other ISPs both inside and outside New Zealand to determine whether the changes we observed are definitely related to the change in New Zealand law or simply reflect global Internet usage patterns.

Attachment: proto_table.pdf (27.05 KB)