“Solving things like "eye contact" or the equivalent could be done.”
I bet if you had several cameras, you could compute a video feed where people appear to make direct eye contact, instead of seeing them stare at a screen during a conversation.
You can track gaze with just a single webcam using something like WebGazer.js, although it's not super precise. There are companies like Tobii that make dedicated gaze-tracking sensors in multiple form factors. The trick is figuring out what to do with that information.
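
As one rough idea of "what to do with it": a minimal sketch that smooths noisy gaze samples with a moving average and checks whether the averaged point lands on the other person's video tile. Everything here is an assumption for illustration: `Rect`, `GazeOnTarget`, the window size, and the pixel coordinates are made-up names and values, not part of WebGazer.js or any Tobii SDK. It only assumes the tracker hands you (x, y) screen coordinates.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Rect:
    """Screen-space rectangle in pixels (hypothetical helper type)."""
    left: float
    top: float
    width: float
    height: float

    def contains(self, x: float, y: float) -> bool:
        return (self.left <= x <= self.left + self.width
                and self.top <= y <= self.top + self.height)


class GazeOnTarget:
    """Averages the most recent gaze samples and tests them against a target."""

    def __init__(self, target: Rect, window: int = 10):
        self.target = target
        # Keep only the last `window` samples; older ones drop off automatically.
        self.samples = deque(maxlen=window)

    def update(self, x: float, y: float) -> bool:
        """Add one gaze sample; return True if the smoothed gaze is on the target."""
        self.samples.append((x, y))
        n = len(self.samples)
        mean_x = sum(p[0] for p in self.samples) / n
        mean_y = sum(p[1] for p in self.samples) / n
        return self.target.contains(mean_x, mean_y)


if __name__ == "__main__":
    # Assumed layout: the other person's video tile occupies this rectangle.
    tile = Rect(left=100, top=100, width=200, height=150)
    tracker = GazeOnTarget(tile, window=3)
    for sample in [(190, 170), (210, 180), (350, 175)]:
        print(sample, tracker.update(*sample))
```

The averaging is there because single-webcam estimates jitter a lot; a bigger window is steadier but reacts more slowly. An app could use the result to, say, flag when you're looking at the person rather than elsewhere on the screen.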