> The new Signal voice and video beta functionality eliminates the need for ZRTP. The "signaling" messages used to set up the voice/video beta calls (offer/answer SDPs, ICE candidates, etc) are transmitted over the normal Signal Protocol messaging channel, which binds the security of the call to that existing secure channel. It is no longer necessary to verify an additional SAS, which simplifies the calling experience.
https://whispersystems.org/blog/signal-video-calls-beta/
And it's not in beta anymore:
Yup, as far as Signal is concerned our findings are already obsolete :D I think the new Signal developments are great. It is better to have a single key verification mechanism, for consistent usability, and to use key continuity as well. Previously, the SAS had to be verified anew for each call.
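To make "key continuity" concrete, here is a minimal sketch of the general idea: remember a peer's key fingerprint the first time you see it and only raise a flag when it later changes. This is not Signal's actual implementation; the class name, the peer IDs, and the use of a bare SHA-256 fingerprint are all assumptions made for illustration.

```python
import hashlib


class KeyContinuityStore:
    """Hypothetical trust-on-first-use store: pin a fingerprint per peer."""

    def __init__(self):
        self._pinned = {}  # peer_id -> hex fingerprint of the pinned key

    def check(self, peer_id: str, public_key: bytes) -> str:
        # Assumption: the fingerprint is simply SHA-256 of the raw key bytes.
        fingerprint = hashlib.sha256(public_key).hexdigest()
        known = self._pinned.get(peer_id)
        if known is None:
            # First contact: pin the key; the user may verify it once out of band.
            self._pinned[peer_id] = fingerprint
            return "new"
        # Later contacts: no re-verification unless the key has changed.
        return "match" if known == fingerprint else "changed"


store = KeyContinuityStore()
print(store.check("alice", b"alice-key-1"))  # first call
print(store.check("alice", b"alice-key-1"))  # same key on a later call
print(store.check("alice", b"other-key"))    # key changed: warn the user
```

The point is that verification effort is spent once per key rather than once per call, which is the contrast with per-call SAS comparison.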
I hope these authors will eventually look at the new thing too.
Deep learning and the ability to train on a specific caller's voice [1] and then mimic it might be an interesting attack vector. In practice, Silent Circle's implementation does something interesting: instead of SAS numbers it uses dictionary words, so you end up with something like "Pink Elephant Salad". An attacker could probably MitM that. However, callers are then supposed to make some extra puns or discuss it a bit and say something like "Ha-ha! Wonder how tasty an elephant salad would be". If, after MitM-ing, the string on the other side was "Plastic Blue Llamas", the MitM attack becomes more obvious.
[1] http://research.baidu.com/deep-voice-production-quality-text...
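The word-based SAS idea above can be sketched roughly as follows. This is only an illustration, not ZRTP's or Silent Circle's actual derivation: the 16-word list, the `sas_words` helper, and the `b"sas-demo"` label are all made up, and real word lists are far larger.

```python
import hashlib

# Hypothetical 16-word list (4 bits per word); real lists are much larger.
WORDS = [
    "pink", "elephant", "salad", "plastic", "blue", "llama", "amber", "falcon",
    "cedar", "marble", "violet", "harbor", "copper", "meadow", "silver", "walnut",
]


def sas_words(shared_secret: bytes, count: int = 3) -> list:
    """Map a session secret to `count` words that both callers read aloud."""
    digest = hashlib.sha256(b"sas-demo" + shared_secret).digest()
    # Use the low 4 bits of each of the first `count` bytes as a word index.
    return [WORDS[digest[i] % len(WORDS)] for i in range(count)]


# Without a MitM, both ends hold the same secret and see the same words.
print(sas_words(b"alice-bob-secret"))

# With a MitM, each leg has its own secret, so the two callers
# almost certainly see different word strings.
print(sas_words(b"alice-mallory-secret"))
print(sas_words(b"mallory-bob-secret"))
```

With only 12 bits in this toy version the two legs could collide by chance; a real scheme relies on a longer SAS and on the callers actually comparing, and ideally chatting about, what they see.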
There is existing work on testing the feasibility of impersonating another person's voice. We discuss it in the related work section at the end of the paper.
I think that in the long run, SAS will no longer be a sufficient authentication technique due to advances in speech synthesis. To prolong ZRTP's life we propose using sentences instead of words/chars. This is discussed in detail in our best practices section.
I noticed that UX/UI is important and that the SAS should probably increase in length. What recommendations would you give for a good ZRTP implementation?
Or should we start discussing phasing out ZRTP and moving to something like the Matrix protocol, or even Signal's?