Open Source Implementation of Apple's Private Compute Cloud

(github.com)

433 points | adam_gyroscope | 6mo ago | 102 comments

ryanMVP 6mo ago

Reading the whitepaper, the inference provider still has the ability to access the prompt and response plaintext. This scheme does seem to guarantee that plaintext cannot be read for all other parties (e.g. the API router), and that the client's identity is hidden and cannot be associated with their request. Perhaps the precise privacy guarantees and allowances should be summarized in the readme.

With that in mind, does this scheme offer any advantage over the much simpler setup of a user sending an inference request:

- directly to an inference provider (no API router middleman)

- that accepts anonymous crypto payments (I believe such things exist)

- using a VPN to mask their IP?

macrael 6mo ago

Howdy, head of Eng at confident.security here, so excited to see this out there.

I'm not sure I understand what you mean by "inference provider" here. Once the request has been decrypted, the inference workload is not shipped off the compute node to, e.g., OpenAI; it runs directly on the compute machine, on open source models loaded there. Those machines cryptographically attest to the software they are running, which ultimately proves that no software is logging sensitive info off the machine and that the machine is locked down, with no SSH access.

This is how Apple's PCC does it as well, clients of the system will not even send requests to compute nodes that aren't making these promises, and you can audit the code running on those compute machines to check that they aren't doing anything nefarious.

The privacy guarantee we are making here is that no one, not even people operating the inference hardware, can see your prompts.
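
As a rough illustration of the client-side rule described above (refuse to talk to any node that isn't running audited software), here is a minimal Python sketch. It is not OpenPCC's or Apple's actual code: the allowlist, the node names, and the idea of representing a measurement as a plain SHA-256 hex digest are all assumptions for illustration, and a real client would first verify the hardware-signed attestation that carries the measurement.

```python
import hashlib
import hmac

# Hypothetical allowlist: digests of software images that someone has audited.
TRUSTED_MEASUREMENTS = frozenset({
    hashlib.sha256(b"audited-image-v1").hexdigest(),
    hashlib.sha256(b"audited-image-v2").hexdigest(),
})


def is_trusted(measurement: str) -> bool:
    """True if the reported measurement matches an audited image digest."""
    return any(
        hmac.compare_digest(measurement, trusted)
        for trusted in TRUSTED_MEASUREMENTS
    )


def eligible_nodes(reported: dict[str, str]) -> list[str]:
    """Names of nodes the client is willing to send a request to.

    `reported` maps a node name to the measurement taken from its attestation;
    signature verification of that attestation is out of scope for this sketch.
    """
    return sorted(name for name, m in reported.items() if is_trusted(m))
```

A node whose measurement is not on the list simply never receives a request, which is the behaviour the comment describes; whether the measurement itself can be trusted is the part the rest of this thread argues about.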

bjackman 6mo ago

> no one, not even people operating the inference hardware

You need to be careful with these claims IMO. I am not involved directly in CoCo so my understanding lacks nuance but after https://tee.fail I came to understand that basically there's no HW that actually considers physical attacks in scope for their threat model?

The Ars Technica coverage of that publication has some pretty yikes contrasts between quotes from people making claims like yours, and the actual reality of the hardware features.

https://arstechnica.com/security/2025/10/new-physical-attack...

My current understanding of the guarantees here is:

- even if you completely pwn the inference operator, steal all root keys etc, you can't steal their customers' data as a remote attacker

- as a small cabal of arbitrarily privileged employees of the operator, you can't steal the customers' data without a very high risk of getting caught

- BUT, if the operator systematically conspires to steal the customers' data, they can. If the state wants the data and is willing to spend money on getting it, it's theirs.

jiveturkey 6mo ago

> The privacy guarantee we are making here is that no one, not even people operating the inference hardware, can see your prompts.

that cannot be met, period. your assumptions around physical protections are invalid or at least incorrect. It works for Apple (well enough) because of the high trust we place in their own physical controls, and market incentive to protect that at all costs.

> This is how Apple's PCC does it as well [...] and you can audit the code running on those compute machines to check that they aren't doing anything nefarious.

just based on my recollection, and I'm not going to have a new look at it to validate what I'm saying here, but with PCC, no you can't actually do that. With PCC you do get an attestation, but there isn't actually a "confidential compute" aspect where that attestation (that you can trust) proves that is what is running. You have to trust Apple at that lowest layer of the "attestation trust chain".

I feel like with your bold misunderstandings you are really believing your own hype. Apple can do that, sure, but a new challenger cannot. And I mean your web page doesn't even have an "about us" section.

ryanMVP 6mo ago

Thanks for the reply! By "inference provider" I meant someone operating a ComputeNode. I initially skimmed the paper, but I've now read more closely and see that we're trying to get guarantees that even a malicious operator is unable to e.g. exfiltrate prompt plaintext.

Despite recent news of vulnerabilities, I do think that hardware-root-of-trust will eventually be a great tool for verifiable security.

A couple follow-up questions:

1. For the ComputeNode to be verifiable by the client, does this require that the operator makes all source code running on the machine publicly available?

2. After a client validates a ComputeNode's attestation bundle and sends an encrypted prompt, is the client guaranteed that only the ComputeNode running in its attested state can decrypt the prompt? Section 2.5.5 of the whitepaper mentions expiring old attestation bundles, so I wonder if this is to protect against a malicious operator presenting an attestation bundle that doesn't match what's actually running on the ComputeNode.
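
On the expiry part of question 2, one way to picture it (a sketch under assumptions, not what section 2.5.5 specifies): the client refuses any bundle older than some maximum age, which shrinks the window in which a bundle captured from an earlier, honest boot could be replayed. `AttestationBundle`, its fields, and the one-hour `MAX_AGE` are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

MAX_AGE = timedelta(hours=1)  # hypothetical policy, not taken from the whitepaper


@dataclass(frozen=True)
class AttestationBundle:
    measurement: str      # digest of the software the node claims to run
    public_key: bytes     # key the client would encrypt the prompt to
    issued_at: datetime   # when the node produced the bundle


def is_fresh(bundle: AttestationBundle, now: datetime,
             max_age: timedelta = MAX_AGE) -> bool:
    """Accept only bundles issued in the past and no older than max_age."""
    age = now - bundle.issued_at
    return timedelta(0) <= age <= max_age
```

Freshness alone does not answer the first half of the question; that depends on the decryption key being usable only in the attested state, which this sketch does not model.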

Terretta 6mo ago

> the inference provider still has the ability to access the prompt and response plaintext

Folks may underestimate the difficulty of providing compute that the provider “cannot”* access to reveal even at gunpoint.

BYOK does cover most of it, but oh look, you brought me and my code your key, thanks… Apple's approach, and certain other systems such as AWS's Nitro Enclaves, aim at this last step of the problem:

- https://security.apple.com/documentation/private-cloud-compu...

- https://aws.amazon.com/confidential-computing/

NCC Group verified AWS's approach and found:

1. There is no mechanism for a cloud service provider employee to log in to the underlying host.

2. No administrative API can access customer content on the underlying host.

3. There is no mechanism for a cloud service provider employee to access customer content stored on instance storage and encrypted EBS volumes.

4. There is no mechanism for a cloud service provider employee to access encrypted data transmitted over the network.

5. Access to administrative APIs always requires authentication and authorization.

6. Access to administrative APIs is always logged.

7. Hosts can only run tested and signed software that is deployed by an authenticated and authorized deployment service. No cloud service provider employee can deploy code directly onto hosts.

- https://aws.amazon.com/blogs/compute/aws-nitro-system-gets-i...

Points 1 and 2 are more unusual than 3 - 7.

Folks who enjoy taking things apart to understand them can hack at Apple's here:

https://security.apple.com/blog/pcc-security-research/

* Except by, say, withdrawing the system (see Apple in UK) so users have to use something less secure, observably changing the system, or other transparency trippers.

michaelt 6mo ago

> 3. There is no mechanism for a cloud service provider employee to access customer content stored on instance storage and encrypted EBS volumes.

Are you telling me customer services can't reset a customer's forgotten console login password?

jiggawatts 6mo ago

My logic is that these "confidential compute" problems suffer from some of the same issues as "immutable storage in blockchain".

I.e.: If the security/privacy guarantees really are as advertised, then ipso facto someone could store child porn in the system and the provider couldn't detect this.

Then by extension, any truly private system is exposing themselves to significant business, legal, and moral risk of being tarred and feathered along with the pedos that used their system.

It's a real issue, and has come up regularly with blockchain based data storage. If you make it "censorship proof", then by definition you can't scrub it of illegal data!

Similarly, if cloud providers allow truly private data hosting, then they're exposing themselves to the risk of hosting data that is being stored with that level of privacy guarantees precisely because it is so very, very illegal.

(Or substitute: Stolen state secrets that will have the government come down on you like a ton of bricks. Stolen intellectual property. Blackmail information on humourless billionaires. Illegal gambling sites. Nuclear weapons designs. So on, and so forth.)

sublimefire 6mo ago

Yes, but at the end of the day you need to trust the cloud provider's tools, which expands the trust boundary beyond just the hardware root of trust. Who is to guarantee they will not create a malicious tool update, push it, and then retract it? It is captured nowhere and you cannot prove it.

7e 6mo ago

At the end of the day, Nitro Enclaves are still “trust Amazon”, which is a poor guarantee. NVIDIA+AMD offer hardware-backed enclave features for their GPUs, which is the superior solution here.

amelius 6mo ago

> Folks may underestimate the difficulty of providing compute that the provider “cannot”* access to reveal even at gunpoint.

It's even harder to do this plus the hard requirement of giving the NSA access.

Or alternatively, give the user a verifiable guarantee that nobody has access.

immibis 6mo ago

It's probably illegal for a business to take anonymous cryptocurrency payments in the EU. Businesses are allowed to take traceable payments only, or else it's money laundering.

With the caveat that it's not clear what precisely is illegal about these payments and to what level it's illegal. It might be that a business isn't allowed to have any at all, or isn't allowed to use them for business, or can use them for business but can't exchange them for normal currency, or can do all that but has to check their customer's passport and fill out reams of paperwork.

https://bitcoinblog.de/2025/05/05/eu-to-ban-trading-of-priva...

anon721656321 6mo ago

at that point, it seems easier to run a slightly worse model locally. (or on a rented server)

rimeice 6mo ago

Which is Apple's own approach, until the compute requirements need them to run some compute in the cloud.

rasengan 6mo ago

We are introducing Verifiably Private AI [1] which actually solves all of the issues you mention. Everything across the entire chain is verifiably private (or in other words, transparent to the user in such a way they can verify what is running across the entire architecture).

[1] https://ai.vp.net/

jmort 6mo ago

It should be able to support connecting via an OpenPCC client, then!

poly2it 6mo ago

Whitepaper?

derpsteb 6mo ago

I was part of a team that does the same thing. Admittedly as a paid service, but with source availability and meaningful attestation.

Service: https://www.privatemode.ai/ Code: https://github.com/edgelesssys/privatemode-public

jmort 6mo ago

OpenPCC is Apache 2.0 without a CLA to prevent rugpulls whereas edgeless is BSL

jiveturkey 6mo ago

m1ghtym0 6mo ago

Exactly, attestation is what matters. Excluding the inference provider from the prompt is the USP here. Privatemode can do that via an attestation chain (source code -> reproducible build -> TEE attestation report) + code/stack that ensures isolation (Kata/CoCo, runtime policy).
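
The chain above (source code -> reproducible build -> TEE attestation report) can be pictured with a toy Python sketch. Everything here is a stand-in: `toy_build` is a hypothetical deterministic "build" that just concatenates files in sorted order, and the "report" is reduced to a single SHA-256 measurement. Real stacks use actual reproducible builds and hardware-signed reports; this is not Privatemode's or OpenPCC's code.

```python
import hashlib


def toy_build(source_tree: dict[str, bytes]) -> bytes:
    """Hypothetical deterministic build: same sources always give same bytes."""
    parts = []
    for path in sorted(source_tree):
        parts.append(path.encode() + b"\0" + source_tree[path] + b"\0")
    return b"".join(parts)


def measure(artifact: bytes) -> str:
    """Stand-in for the measurement a TEE would put in its report."""
    return hashlib.sha256(artifact).hexdigest()


def report_matches_source(source_tree: dict[str, bytes],
                          reported_measurement: str) -> bool:
    """Rebuild from public source and compare against the reported value."""
    return measure(toy_build(source_tree)) == reported_measurement
```

The point of the reproducible-build link is that anyone can redo the left-hand side themselves: if a single source file differs from what was published, the rebuilt measurement no longer matches the reported one.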

saurik 6mo ago

Yes: "provably" private... unless you have $1000 for a logic analyzer and a steady hand to solder together a fake DDR module.

https://news.ycombinator.com/item?id=45746753

Lord-Jobo 6mo ago

well, also indefinite time and physical access.

saurik 6mo ago

Which is what the provider themselves have, by definition. The people who run these services are literally sitting next to the box day in and day out... this isn't "provably" anything. You can trust them not to take advantage of the fact that they own the hardware, and you can even claim it makes it ever so slightly harder for them to do so, but this isn't something where the word "provably" is anything other than a lie.

sublimefire 6mo ago

If you trust the provider then using such an architecture does not make things much better. If you do not, then at least the execution should be inside a confidential system so that even soldering would not get you to the data.

rossjudson 6mo ago

GCP can and does live migrate confidential VMs between machines. Which of the 50k machines in a cluster were you going to attach your analyzer to?

saurik 6mo ago

1) If you were GCP (as they are the attacker in this scenario), you'd attach the analyzer to ANY (!) ONE (!) server and then you migrate the user's workload that you wanted to snoop on (or were required to snoop on by the FBI) to your evil server. Like, you are clearly trying to say this makes it harder (though even if this were true that doesn't make it at all "provable")... but, if you support migration, you actually made it EASIER for you (aka, GCP) to abuse your privileged position.

2) These attacks are actually worse than what I am pretty sure you are assuming (and so where I started my response), as you actually just need one hacked server and then you can simulate working servers on other hardware that isn't hacked by either stealing an attested key or stealing the attestation key itself. You often wouldn't even then need to have the hacked server anymore.

kiwicopple 6mo ago

impressive work jmo - thanks for open sourcing this (and OSI-compliant)

we are working on a challenge which is somewhat like a homomorphic encryption problem - I'm wondering if OpenPCC could help in some way?

When developing websites/apps, developers generally use logs to debug production issues. However with wearables, logs can be a privacy issue: imagine some AR glasses logging visual data (like someone's face). Would OpenPCC help to extract/clean/anonymize this sort of data for developers to help with their debugging?

jmort 6mo ago

Yep, you could run an anonymization workload inside the OpenPCC compute node. We target inference as the "workload" but it's really just an attested HTTP server where you can't see inside. So, in this case your client (the wearable) would send its data first through OpenPCC to a server that runs some anonymization process.

If it's possible to anonymize on the wearable, that would be simpler.

The challenge is what does the anonymizer "do" to be perfect?
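
To make that question concrete, here is a deliberately naive Python sketch of an anonymization step that could sit behind such an attested server. The field names in `SENSITIVE_KEYS` and the log record shape are hypothetical, and the email regex is only a rough approximation. It replaces known-sensitive fields and masks email-like strings, and that is all.

```python
import re

# Hypothetical field names a wearable might attach to a log record.
SENSITIVE_KEYS = {"image", "face_embedding", "gps"}

# Rough pattern for email-like strings; not a full address grammar.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def anonymize(record: dict) -> dict:
    """Return a copy of the record with known-sensitive data masked."""
    clean = {}
    for key, value in record.items():
        if key in SENSITIVE_KEYS:
            clean[key] = "[REDACTED]"                     # drop the payload entirely
        elif isinstance(value, str):
            clean[key] = EMAIL_RE.sub("[EMAIL]", value)   # mask emails in free text
        else:
            clean[key] = value                            # numbers, bools, etc. pass through
    return clean
```

Anything this filter doesn't know about (a name inside free text, a face stored under a differently named field) passes straight through, which is exactly why "perfect" is the hard part.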

As an aside, IMO homomorphic encryption (still) isn't ready...

wferrell 6mo ago

Really nice release. Excited to see this out in the wild and hopeful more companies leverage this for better end user privacy.

sublimefire 6mo ago

Quite similar to what Azure did with confidential AI inference [1].

[1] https://techcommunity.microsoft.com/blog/azureconfidentialco...

jmort 6mo ago

I haven’t been able to find their source code. Pretty important for the transparency side of it. Have you seen it?

DeveloperOne 6mo ago

Glad to see Golang here. Go will surpass Python in the AI field, mark my words.

jabedude 6mo ago

Where is the compute node source code?

utopiah 6mo ago

That's nice... in theory. Like it could be cool, and useful... but like what would I actually run on it if I'm not a spammer?

Edit : reminds me of federated learning and FlowerLLM (training only AFAIR, not inference), like... yes, nice, I ALWAYS applaud any way to disentangle from proprietary software and walled gardens... but like what for? What actual usage?

utopiah 6mo ago

Gimme an actual example instead of downvoting, help me learn.

Edit on that too : makes me think of OpenAI Whisper as a service via /e/OS and supposedly anonymous proxying (by mixing), namely running STT remotely. That would be an actual potential usage... but IMHO that's low end enough to be run locally. So I'm still looking for an application here.

wat10000 6mo ago

Are you looking for a general application of LLMs too large to run locally? Because anything you might use remote inference for, you might want to use privately.

fragmede 6mo ago

> would I actually run on it if I'm not a spammer?

> Gimme an actual example instead of downvoting, help me learn.

Basically you asked a bunch of people on a privacy minded forum, why should they be allowed to encrypt their data? What are you (they) hiding!? Are you a spammer???

Apple is beloved for their stance on privacy, and you basically called everyone who thinks that's more than marketing, a spammer. And before you start arguing no you didn't, it doesn't matter that you didn't, what matters is that that's how your comment made people feel. You can say they're the stupid ones because that's not what you wrote, but if you're genuinely asking for feedback about the downvotes, there you are.

You seriously can't imagine any reason to want to use an LLM privately other than to use it to write spam bots and to spam people? At the very least expand your scope past spamming to, like, also using it to write ransomware.

The proprietary models that can't be run locally are SOTA and local models, even if they can come close, simply aren't what people want.

nixpulvis 6mo ago

Thought this was going to be about Orchard from the title.

MangoToupe 6mo ago

@dang can we modify the title to acknowledge that it's specific to chatbots? The title reads like this is about generic compute, and the content is emphatically not about generic compute.

I realize this is just bad branding by Apple but it's still hella confusing.

jmort 6mo ago

It does work generically. Like Apple, we initially targeted inference, but under the hood it's just an anonymous, attested HTTP server wrapper. The ComputeNode can run an arbitrary workload.

MangoToupe 6mo ago

Interesting!
