In that case, let's save some bandwidth and just synthesize the facial expressions. Mostly, it would be work/concentration, with occasional glances of affection towards the "viewer." Of course, we'd probably get this slightly wrong, fall into the uncanny valley, and everyone would be left with the subconscious impression that everyone else hates them, and furthermore is actually an alien doppelganger posing as an actual human. (Or that everyone else is a grievously defective human who needs to be shunned/euthanized for the good of the species.)
I've been curious what the bandwidth would look like if you ran image recognition on my facial expression, encoded it, and then just sent the event stream to a virtual avatar.
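For a rough ballpark: if you encoded expressions as something like ARKit-style blendshape weights (the specific numbers below are illustrative assumptions, not measurements), the event stream would be tiny compared to video.

```python
# Back-of-envelope bandwidth estimate for streaming facial-expression
# events instead of video. Assumes an ARKit-style rig of 52 blendshape
# coefficients (a common convention, not a universal standard), each
# quantized to one byte, sampled at 30 Hz.

BLENDSHAPES = 52      # number of expression coefficients per frame
BYTES_PER_COEFF = 1   # quantize each 0.0-1.0 weight to 8 bits
FPS = 30              # sample rate

bits_per_second = BLENDSHAPES * BYTES_PER_COEFF * 8 * FPS
print(f"{bits_per_second / 1000:.2f} kbit/s")  # ~12.48 kbit/s
```

That's on the order of 12 kbit/s before any compression, versus hundreds of kbit/s for even a low-quality webcam stream, so the idea seems plausible purely on bandwidth grounds.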