Apple said that the probability of a collision is quite a bit higher than that:
> As the system is initially deployed, we do not assume the 3 in 100M image-level false positive rate we measured in our empirical assessment
The "1 in 1 trillion" part is the probability that the number of false positives could exceed the threshold needed to trigger a human review:
> Apple always chooses the match threshold such that the possibility of any given account being flagged incorrectly is lower than one in one trillion, under a very conservative assumption of the NeuralHash false positive rate in the field.
source: https://www.apple.com/child-safety/pdf/Security_Threat_Model..., page 10
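
To make the difference between the per-image rate and the per-account figure concrete, here is a rough sketch of the arithmetic. It is not Apple's actual calculation: it assumes false positives are independent across images (so the count is binomial), and the library size, per-image rate, and thresholds below are made-up illustrative values, not numbers from the document.

```python
import math

def log_binom_pmf(n, k, p):
    """Log of the binomial probability of exactly k false positives in n images."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log1p(-p))

def prob_at_least(n, threshold, p):
    """P(at least `threshold` false positives among n images),
    assuming each image is an independent false positive with probability p."""
    if threshold <= 0:
        return 1.0
    if threshold > n:
        return 0.0
    total = 0.0
    for k in range(threshold, n + 1):
        term = math.exp(log_binom_pmf(n, k, p))
        total += term
        # Past the mean the terms only shrink; stop once they no longer matter.
        if k > n * p and term <= total * 1e-18:
            break
    return total

# Hypothetical inputs, chosen only for illustration.
library_size = 10_000        # photos in one account (assumed)
per_image_rate = 1e-6        # assumed "conservative" per-image rate
for threshold in (1, 5, 30):
    print(threshold, prob_at_least(library_size, threshold, per_image_rate))
```

Under these assumptions a single false match in a library is not that unlikely, but the chance of reaching a threshold of several matches falls off very quickly, which is why the account-level number can be far smaller than the image-level one.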