Google on Tuesday unveiled a new privacy-enhancing technology called Private AI Compute to process artificial intelligence (AI) queries on a secure cloud platform.
The company said it has built Private AI Compute to "unlock the full speed and power of Gemini cloud models for AI experiences, while ensuring your personal data stays private to you and is not accessible to anyone else, not even Google."
Private AI Compute has been described as a "secure, fortified space" for processing sensitive user data in a manner that's analogous to on-device processing but with extended AI capabilities. It's powered by Trillium Tensor Processing Units (TPUs) and Titanium Intelligence Enclaves (TIE), allowing the company to use its frontier models without sacrificing security or privacy.
In other words, the privacy infrastructure is designed to take advantage of the computational speed and power of the cloud while retaining the security and privacy assurances that come with on-device processing.
Google's CPU and TPU workloads (aka trusted nodes) rely on an AMD-based hardware Trusted Execution Environment (TEE) that encrypts and isolates memory from the host. The tech giant noted that only attested workloads can run on the trusted nodes, and that administrative access to the workloads is cut off. Furthermore, the nodes are secured against potential physical data exfiltration attacks.
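To make that admission rule concrete, here is a minimal Python sketch of an attestation gate; the Quote and ReferenceValues types and the admit_workload() helper are illustrative assumptions, not Google's actual interfaces, and the signature check is stubbed out.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Quote:
    """Hardware-signed evidence emitted by the TEE for a candidate workload."""
    workload_digest: str   # measurement of the workload binary
    tee_signature: bytes   # signature chained to the TEE vendor's root of trust

@dataclass(frozen=True)
class ReferenceValues:
    """Internally published set of approved workload measurements."""
    approved_digests: frozenset

def verify_tee_signature(quote: Quote) -> bool:
    # Placeholder: a real verifier would check the signature against the
    # hardware vendor's attestation certificate chain.
    return len(quote.tee_signature) > 0

def admit_workload(quote: Quote, refs: ReferenceValues) -> bool:
    """Admit a workload onto a trusted node only if its attestation verifies
    and its measurement matches an approved reference value."""
    return verify_tee_signature(quote) and quote.workload_digest in refs.approved_digests

refs = ReferenceValues(approved_digests=frozenset({"sha256:abc123"}))
print(admit_workload(Quote("sha256:abc123", b"sig"), refs))   # True: attested workload runs
print(admit_workload(Quote("sha256:evil", b"sig"), refs))     # False: rejected
```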
The infrastructure also supports peer-to-peer attestation and encryption between the trusted nodes to ensure that user data is decrypted and processed only within the confines of a secure environment and is shielded from broader Google infrastructure.
"Each workload requests and cryptographically validates the workload credentials of the other, ensuring mutual trust within the protected execution environment," Google explained. "Workload credentials are provisioned only upon successful validation of the node's attestation against internal reference values. Failure of validation prevents connection establishment, thus safeguarding user data from untrusted components."
The overall process flow works like this: A user client sets up a Noise protocol encrypted connection with a frontend server and performs bi-directional attestation. The client also validates the server's identity using an Oak end-to-end encrypted attested session to confirm that it's genuine and unmodified.
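The client-side ordering can be sketched as below; the Channel and Evidence classes are mocks standing in for the Noise transport and the Oak attested session, not their actual APIs, and the approved measurement value is invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class Evidence:
    measurement: str              # digest of the attested frontend binary

class Channel:
    """Mock of an encrypted client-to-frontend channel (Noise transport plus
    an Oak-style attested session); the real handshakes are omitted."""
    def __init__(self, server_addr: str) -> None:
        self.server_addr = server_addr
        self.open = True

    def request_attestation(self) -> Evidence:
        # Mocked: a genuine server returns hardware-signed attestation evidence.
        return Evidence(measurement="frontend-digest-v1")

    def send(self, query: bytes) -> bytes:
        assert self.open
        return b"response-to:" + query    # mocked round trip

    def close(self) -> None:
        self.open = False

def private_ai_query(server_addr: str, approved: set, query: bytes) -> bytes:
    """Handshake, verify the server's attestation, and only then send data."""
    channel = Channel(server_addr)
    evidence = channel.request_attestation()
    if evidence.measurement not in approved:
        channel.close()
        raise RuntimeError("frontend failed attestation; no data sent")
    return channel.send(query)

print(private_ai_query("frontend.example", {"frontend-digest-v1"}, b"summarize my notes"))
```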
Once the attested connection is in place, the server establishes an Application Layer Transport Security (ALTS) encrypted channel with other services in the scalable inference pipeline, which then communicates with model servers running on the hardened TPU platform. The entire system is "ephemeral by design," meaning an attacker who manages to gain privileged access cannot obtain past data, as the inputs, model inferences, and computations are discarded as soon as the user session is completed.
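A toy Python sketch of that ephemeral property, with an illustrative EphemeralSession class (not Google's implementation): per-session inputs and inferences exist only inside the session and are cleared the moment it closes.

```python
class EphemeralSession:
    """Illustrative session whose inputs and inferences are discarded on exit."""
    def __init__(self) -> None:
        self._inputs = []
        self._inferences = []

    def __enter__(self) -> "EphemeralSession":
        return self

    def infer(self, prompt: bytes) -> bytes:
        self._inputs.append(prompt)
        result = b"model-output-for:" + prompt   # stand-in for the TPU model call
        self._inferences.append(result)
        return result

    def __exit__(self, exc_type, exc, tb) -> None:
        # Discard inputs and inferences as soon as the user session completes,
        # so a later privileged compromise finds no past data to recover.
        self._inputs.clear()
        self._inferences.clear()

with EphemeralSession() as session:
    reply = session.infer(b"summarize my notes")
```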
*Google Private AI Compute Architecture*
Google has also touted the various protections baked into the system to maintain its security and integrity and prevent unauthorized modifications. These include:
- Minimizing the number of components and entities that must be trusted for data confidentiality
- Using Confidential Federated Compute for collecting analytics and aggregate insights
- Encryption for client-server communications
- Binary authorization to ensure only signed, authorized code and validated configurations are running across its software supply chain
- Isolating user data in Virtual Machines (VMs) to contain compromise
- Securing systems against physical exfiltration with memory encryption and input/output memory management unit (IOMMU) protections
- Zero shell access on the TPU platform
- Using IP blinding relays operated by third parties to tunnel all inbound traffic to the system and obscure the true origin of requests
- Isolating the system's authentication and authorization from inference using Anonymous Tokens (see the sketch following this list)
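The last item can be sketched as follows, with the blind-signature machinery of Anonymous Tokens mocked out: the authentication service sees the user's identity but never the query, while the inference frontend verifies a single-use token without learning who sent it. All class names here are illustrative assumptions, and the shared token set stands in for cryptographic verification.

```python
import secrets

class AuthService:
    """Checks the user's identity and issues an unlinkable, single-use token."""
    def __init__(self) -> None:
        self.valid_tokens = set()

    def issue_token(self, user_credential: str) -> str:
        assert user_credential                  # identity verified here; the query is never seen
        token = secrets.token_hex(16)           # real systems use blind signatures instead
        self.valid_tokens.add(token)
        return token

class InferenceFrontend:
    """Accepts queries carrying a valid token, without ever learning the identity."""
    def __init__(self, auth: AuthService) -> None:
        self.auth = auth                        # shared set mocks cryptographic token verification

    def handle(self, token: str, query: bytes) -> bytes:
        if token not in self.auth.valid_tokens:
            raise PermissionError("invalid token")
        self.auth.valid_tokens.discard(token)   # single use, so requests stay unlinkable
        return b"response-to:" + query

auth = AuthService()
frontend = InferenceFrontend(auth)
token = auth.issue_token("alice@example.com")
print(frontend.handle(token, b"draft a reply"))
```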
NCC Group, which conducted an external assessment of Private AI Compute between April and September 2025, said it discovered a timing-based side channel in the IP blinding relay component that could be used to "unmask" users under certain conditions. However, Google has deemed the issue low risk because the multi-user nature of the system introduces a "significant amount of noise," making it challenging for an attacker to correlate a query to a specific user.
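A toy calculation of that noise argument, under an assumed per-user query rate and timing window (the numbers are invented, and this does not model NCC Group's actual finding): the more concurrent users share the relay, the more unrelated queries land inside any timing window an attacker could use for correlation.

```python
def confounding_queries(window_ms: float, concurrent_users: int, queries_per_ms: float = 0.05) -> int:
    """Expected number of other users' queries falling inside the attacker's timing window."""
    return round(window_ms * queries_per_ms * concurrent_users)

for users in (1, 10, 100, 1000):
    print(f"{users:>5} concurrent users -> {confounding_queries(20, users)} confounding queries")
```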
The cybersecurity company also said it identified three issues in the implementation of the attestation mechanism that could result in a denial-of-service (DoS) condition, as well as various protocol attacks. Google is currently working on mitigations for all of them.
"Although the overall system relies upon proprietary hardware and is centralized on Borg Prime, [...] Google has robustly limited the risk of user data being exposed to unexpected processing or outsiders, unless Google, as a whole organization, decides to do so," it said. "Users will benefit from a high level of protection from malicious insiders."
The development mirrors similar moves from Apple and Meta, which have released Private Cloud Compute (PCC) and Private Processing to offload AI queries from mobile devices in a privacy-preserving way.
"Remote attestation and encryption are used to connect your device to the hardware-secured sealed cloud environment, allowing Gemini models to securely process your data within a specialized, protected space," Jay Yagnik, Google's vice president for AI Innovation and Research, said. "This ensures sensitive data processed by Private AI Compute remains accessible only to you and no one else, not even Google."