End-to-end encryption (E2EE) in messaging platforms enables people to communicate with one another securely and privately. Its widespread adoption, however, has raised concerns that illegal content might now be shared undetected. Following the global pushback against key escrow systems, client-side scanning based on perceptual hashing has recently been proposed by governments and researchers to detect illegal content in E2EE communications.
Last week, Apple announced that it will use client-side scanning to detect child sexual abuse material in iCloud Photos, users’ personal photo libraries. The announcement has triggered concerns among experts about the trade-offs struck by client-side scanning mechanisms and the risk of them being misused.
In this talk, we will present what is, to the best of our knowledge, the first framework to evaluate the robustness of perceptual hashing-based client-side scanning, which we proposed two months ago. We will present a general black-box attack against any perceptual hashing algorithm and two white-box attacks for discrete cosine transform-based algorithms. Using these, we will show in a large-scale evaluation that more than 99.9% of images can be successfully attacked in a black-box setting while preserving the content of the image. We will then show that our attack generates diverse perturbations, suggesting that straightforward mitigation strategies would be ineffective. Taken together, our results raise concerns about the robustness of perceptual hashing-based client-side scanning mechanisms to black-box adversarial machine learning attacks.
This talk is based on “Adversarial Detection Avoidance Attacks: Evaluating the robustness of perceptual hashing-based client-side scanning” by Shubham Jain*, Ana-Maria Cretu* and Yves-Alexandre de Montjoye, available as a preprint here: https://arxiv.org/abs/2106.09820.
Incessant cyber attacks have caused the disclosure of billions of users’ private data records, shaking the Internet to its core. In response, various data privacy laws and regulations have emerged, forcing the industry to change its practices and bringing the demand for large-scale secure computing into the spotlight. Such demand, however, cannot be met by state-of-the-art cryptographic techniques, even with decades of effort, due to the overheads (speed, bandwidth consumption) they incur. To narrow the gap, recent years have seen rapid progress in hardware-based trusted execution environments (TEEs), such as Intel SGX, AMD SEV and ARM TrustZone, which enable efficient computation on encrypted data within a secure enclave established by a trusted processor. In this talk, I will present our research on understanding and addressing the security challenges in this new secure computing paradigm and on enhancing its design to achieve scalability, for the purpose of supporting accelerated machine learning. Further, I will present the big questions that need to be answered in this area and introduce our genome privacy competition as a synergistic activity that helps move the science in this area forward.
This talk will explore the usage of information-flow analysis to study security and privacy issues in mobile-to-wearable interactions. The talk will cover both low-level interactions enabled directly by Bluetooth Low Energy APIs and higher-layer interactions such as those enabled by Wear OS.
Steady reports of privacy invasions online paint a picture of the Internet growing into a more dangerous place. This is supported by reports of the potential scale of online harms facilitated by the mass deployment of online technology and by the data-intensive web. While Internet users often express concern about privacy, some report taking action to protect their privacy online.
We investigate the methods and technologies that individuals employ to protect their privacy online. We conduct two studies, of N=180 and N=907 participants, to elicit individuals’ use of privacy methods in the US, the UK and Germany. We find that non-technology methods are among the most used methods in all three countries. We identify distinct groupings of privacy method usage in a cluster map. The map shows that, together with non-technology methods of privacy protection, simple privacy-enhancing technologies (PETs) that are integrated into services form the most used cluster, whereas more advanced PETs form a different, least used cluster. We further investigate user perception and reasoning for mostly using one set of PETs in a third study with N=183 participants. We do not find a difference in perceived competency in protecting privacy online between users of advanced and simpler PETs. We compare use perceptions between advanced and simpler PETs and report on user reasoning for not using advanced PETs, as well as the support needed for potential use. This paper contributes to privacy research by eliciting use and perception of use across 43 privacy methods, including 26 PETs, across three countries, and provides a map of PETs usage. The cluster map provides a systematic and reliable point of reference for future user-centric investigations across PETs. Overall, this research provides a broad understanding of use and perceptions across a collection of PETs, and can lead to future research for scaling the use of PETs.
How to Make Private Distributed Cardinality Estimation Practical, and Get Differential Privacy for Free [USENIX Security ’21]
Secure computation is a promising privacy-enhancing technology, but it is often not scalable enough for data-intensive applications. On the other hand, the use of sketches has gained popularity in data mining, because sketches often give rise to highly efficient and scalable sub-linear algorithms. It is natural to ask: what if we put secure computation and sketches together? We investigated this question and the findings are interesting: we can get security, we can get scalability, and somewhat unexpectedly, we can also get differential privacy for free. Our study started from building a secure computation protocol based on Flajolet-Martin (FM) sketches for solving the Private Distributed Cardinality Estimation (PDCE) problem, a fundamental problem with applications ranging from crowd tracking to network monitoring. The state-of-the-art protocol for PDCE is computationally expensive and not scalable enough to cope with big-data applications, which prompted us to design a better protocol. Our further analysis revealed that if the cardinality to be estimated is large enough, our protocol achieves [latex](\epsilon,\delta)[/latex]-differential privacy automatically, without requiring any additional manipulation of the output. This result signifies a new approach to achieving differential privacy that departs from the mainstream approach (i.e., adding noise to the result). Free differential privacy can be achieved for two reasons: secure computation minimizes information leakage, and the intrinsic estimation variance of the FM sketch makes the output of our protocol uncertain. We further show that the result is not just theoretical: the minimal cardinality for differential privacy to hold is only [latex]10^2-10^4[/latex] for typical parameters.
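To give intuition for why the FM sketch's output is intrinsically noisy, here is a minimal plain-Python toy (an illustrative assumption, not the paper's secure protocol): each of m independent sketches records the maximum number of trailing zero bits seen among the hashes of the items, and the mean of these maxima, corrected by the Flajolet-Martin constant, yields a cardinality estimate. The per-sketch salting scheme and parameter choices are assumptions for illustration.

```python
import hashlib

PHI = 0.77351  # Flajolet-Martin bias-correction constant

def trailing_zeros(x: int) -> int:
    # Number of trailing zero bits in x (treat 0 as the 64-bit maximum).
    return (x & -x).bit_length() - 1 if x else 64

def fm_estimate(items, m=64):
    # Toy Flajolet-Martin cardinality estimator with m independent
    # sketches (stochastic averaging). Each sketch j tracks the maximum
    # number of trailing zeros over salted SHA-256 hashes of the items;
    # duplicates cannot raise the maximum, so only distinct items count.
    max_r = [0] * m
    for item in items:
        for j in range(m):
            h = hashlib.sha256(f"{j}:{item}".encode()).digest()
            x = int.from_bytes(h[:8], "big")  # 64-bit hash value
            r = trailing_zeros(x)
            if r > max_r[j]:
                max_r[j] = r
    mean_r = sum(max_r) / m
    return (2 ** mean_r) / PHI  # estimated number of distinct items
```

In the paper's setting, sketches like these are evaluated jointly by the parties under secure computation, and the estimator's inherent variance, visible even in this toy, is the source of the uncertainty that yields differential privacy without added noise.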