When is this ever a problem that can't be solved by positioning yourself with a wall behind you, or by going somewhere private? This feels like overkill for the stated use case. I can imagine someone thinking they need this to do private stuff in a public space (a coffee shop?), but they'd just end up paranoid from everyone who passes by and glances around.
Also, is this a realistic threat model anywhere? People snooping by standing behind you tend to be colleagues or totally random passers-by, not people actually interested in gleaning private information. Anything more serious than logging into your Facebook account calls for proper OpSec procedures anyway (like: "only do this in private").
All I can think of is employee monitoring, where such tools will just end up making people insecure in their workplace. Also less productive, because gazing out of a window or into nothingness actually helps when you're doing work that requires pondering; and less healthy, because looking away from your screen into the distance is recommended for anyone with working eyes.
They created this CNN for exactly this task, autism diagnosis in children. I suppose this model would work for babies too.
Edit: ah, I see your point. In the paper they diagnose autism via eye contact, but what you describe is a task closer to what my model does. It could definitely be adapted for such a task; we'd just need to improve the accuracy. The only issue I see is that sourcing training data might be tricky, unless I partner with some institution researching this. If you know of anyone in this field I'd be happy to speak with them.
Put a tablet in front of a baby. Left half has images of gears and stuff, right half has images of people and faces. Does the baby look at the left or right half of the screen? This is actually pretty indicative of autism and easy to put into a foolproof app.
The linked GitHub project records a video of an older child's face while they look at a person wearing a camera or something, and judges whether or not they make proper eye contact. This is thematically similar but actually really different: it requires an older kid, both for the model and the method, and is hard to actually use. Not that useful.
Intervening when still a baby is absolutely critical.
P.S., deciding which half of a tablet a baby is looking at is MUCH, MUCH easier than gaze tracking. Make the tablet screen bright white around the edges and turn the brightness up. Use off-the-shelf iris-tracking software to locate the reflection of the iPad in the baby's iris: is it on the right half or the left half of the iris? Adjust a bit for their position in the FOV and their face pose, and bam, that's very accurate. Full, robust gaze tracking is a million times harder, believe me.
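A rough sketch of that reflection trick (everything here is assumed, not a tested implementation: in practice the iris crop and centre would come from off-the-shelf iris tracking, and the threshold is a guess):

```python
def reflection_half(eye_crop, iris_center_x):
    """Which half of the iris contains the bright screen reflection?

    eye_crop: 2-D list of grayscale values cropped to one iris (as an
    iris tracker would hand you); iris_center_x: x of the iris centre
    in that crop. Mapping 'half of iris' to 'half of screen' still
    needs a one-time sign check, since the camera image is mirrored.
    """
    peak = max(max(row) for row in eye_crop)
    # Treat near-peak pixels as the reflection of the brightened screen.
    xs = [x for row in eye_crop
          for x, v in enumerate(row) if v >= peak - 10]
    reflection_x = sum(xs) / len(xs)
    return "right" if reflection_x > iris_center_x else "left"
```

The face-pose and FOV adjustments mentioned above would go on top of this; the point is just that a single bright-blob centroid is a far easier signal than a full gaze vector.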
I'm honestly skeptical this will work at all; the FOV of most webcams is so small that it can barely capture the shoulder of someone sitting beside me, let alone their eyes.
Then what you're basically looking for is calibration from the eye position/angle to the screen rectangle. You want to shoot a ray from each eye and see if it intersects the laptop's screen.
This is challenging because most webcams are pretty low resolution, so each eyeball will probably be like ~20px. From these 20px, you need to estimate the eyeball->screen ray. And of course this varies with the screen size.
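The ray test itself is simple geometry once you have an eye position and gaze direction; the hard part is estimating that direction from ~20 px. A minimal sketch, in a camera-centred frame I'm assuming here (webcam at the origin, screen in the z = 0 plane hanging below it, units in metres; a real setup would need the webcam-to-screen offset measured):

```python
def ray_hits_screen(eye_pos, gaze_dir, screen_w, screen_h):
    """Does a ray from the eye along the gaze direction cross the screen?

    Assumed frame: webcam at the origin on the top bezel, screen in the
    z = 0 plane spanning x in [-screen_w/2, screen_w/2] and y in
    [0, screen_h] below the camera.
    """
    ex, ey, ez = eye_pos
    dx, dy, dz = gaze_dir
    if dz == 0:          # gaze parallel to the screen plane: no hit
        return False
    t = -ez / dz         # ray parameter where the ray reaches z = 0
    if t <= 0:           # screen plane is behind the viewer
        return False
    x, y = ex + t * dx, ey + t * dy
    return -screen_w / 2 <= x <= screen_w / 2 and 0 <= y <= screen_h
```

With noisy eyeball estimates you'd fire this per frame and smooth the hit/no-hit signal over time rather than trust any single frame.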
TLDR: Decent idea, but should've done some napkin math and/or a quick bounds check first. Maybe a $5 privacy protector is better.
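For the napkin math, with assumed but typical numbers (78° horizontal FOV, 1080p sensor, snooper about 1.5 m behind you), a bystander's iris covers only a handful of pixels:

```python
import math

fov_deg = 78        # horizontal FOV of a common laptop webcam (assumed)
width_px = 1920     # 1080p sensor width
dist_m = 1.5        # shoulder surfer ~1.5 m behind the screen (assumed)
iris_m = 0.012      # human iris diameter, roughly 12 mm

# Width of the scene the webcam sees at that distance.
scene_width_m = 2 * dist_m * math.tan(math.radians(fov_deg / 2))
# Pixels the iris projects onto.
px = iris_m / scene_width_m * width_px
print(round(px, 1))  # single-digit pixels per iris at this range
```

So even the ~20 px figure is optimistic once the person is standing at shoulder-surfing distance rather than sitting at the keyboard.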
Here's an idea:
Maybe start by seeing if you can train a primary-user gaze tracker first, and how accurate you can get it with modeling plus calibration. Once you've solved that problem, you can use it as your upper bound on expected performance, and then transform the problem to detecting the gaze of people nearby instead of the primary user.
Perhaps I have been jaded by the Mac webcam. I agree it won't be great on most old webcams, but I have had success on newer ones.
I did try a calibration approach, but it's simply too fragile for in-the-wild deployment. Calibration works great if you only care about one user, but once you start looking at other people it doesn't work so well.
Good idea, it may be more fruitful to do that. At least then we can be much more certain about the primary user.
A privacy protector solves a different problem: it prevents people from extracting information from the screen, rather than merely informing you of a possible infraction.
That being said, it's useful in the sense that if I saw anything like that in a contract, it wouldn't just be a red flag. It'd be a red flashing GT*O alarm ;)
Privacy screens are still useful, and I recommend people use both EyesOff and a screen protector: a privacy screen won't stop someone shoulder surfing from directly behind you, etc.
There are also better ways to do this sort of task when all you care about is tracking the main user: https://arxiv.org/abs/2504.06237, https://pmc.ncbi.nlm.nih.gov/articles/PMC11019238/
Interesting problem anyway. I'm surprised the accuracy is so low.
Any tips on improving accuracy? A lot of it might be due to a lack of diverse images plus labelling errors, since I did all the labelling manually.
I remember a guy watching a video, then looking up, and it paused, etc.