The risk, and part of the returns, would belong to the investors, while it would generate additional revenue (and diversification) for your bootstrapped company, allowing you to keep building and mitigating some of the risk of having a narrow (military) client base.
And if it becomes a major success (sounds like pg thinks that's possible) you'll co-own it.
I do think your suggestion is a good approach, but the key is finding smart investors who have experience developing medical devices.
Do you have software that converts internal tracking info into pixel coordinates? Multiple screens?
Of course, cost could be substantially decreased if everything were standardized and the mechanical engineering were done once. The BOM (excluding the high-end desktop computer) is about $2000-$3000. My dream would be to reduce that cost by moving the computer vision onto compute modules, one per pair of cameras, cutting the BOM to < $1000 and avoiding the desktop entirely.
They have generally struggled to find funding for their eye-tracking-focused work, and have recently had to pivot away from the really exciting but hard-to-fund stuff into PTSD screening (which is important too).
I can connect you with the founder, if desired, via the email in my bio.
The problem is not the eye tracking; it is reasonably easy to build robust systems that do that, even with custom hardware, under all sorts of lighting conditions. The hard part is the UX, if you are trying to build something that isn't hampered by current UI paradigms.
Rapid typing and menus of custom actions with just eye movement, though fatiguing, shouldn't be hard to solve, and then you can render the output however you want: text, text-to-speech, commands issued to a machine, etc. Making a usable user interface to do anything else, that's where the rubber meets the road.
@pg, which software is your friend using? If it is anything like what I've looked into in the past, it's over-priced accessibility crap with a UI straight out of the 1990s.
Input modalities define platforms. Eye tracking is a new input modality and will define a new platform. It needs a whole new UX designed around its limitations and strengths. It needs a keyboard, it needs a browser, it needs copy and paste, it needs an app switcher, it needs a whole vocabulary of standard interactions and best practices. Apple has a good start in Vision Pro but they're not going to be the only ones doing UX for eye tracking. There's definitely room for other players with fresh ideas.
- gesture-based eye movements, maybe two sweeps on a nine-by-nine grid, which map directly to phonemes (sketched below)
- an enormous 4K 75-inch TV with thousands of words or ideograms or phrases
- "writing" with your eyes then doing line to text AI to clean up
- a standard-ish keyboard with massive LLM prediction and clean UX for autocomplete/throwaway, with branching options
Ideas are cheap, so no clue if these work. Also, Tobii's split between cheap, good, but non-hackable gaming eye tracking and medical products doesn't help. Finally, with ALS you want to communicate about different things, and you tire more easily.
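A minimal sketch of the first idea, under one possible reading: each sweep lands in one of nine zones, so two sweeps give 81 codes that index a phoneme table. All names and the table here are illustrative assumptions, not a real layout.

```python
# Toy gaze-gesture decoder: two sweeps on a 3x3 zone grid -> one phoneme.

# ARPAbet-style phonemes, used as a stand-in table for the demo.
PHONEMES = (
    "AA AE AH AO AW AY B CH D DH EH ER EY F G HH IH IY JH K L M N NG "
    "OW OY P R S SH T TH UH UW V W Y Z ZH"
).split()

def zone(x: float, y: float) -> int:
    """Map a normalized gaze point (0..1, 0..1) to one of nine zones."""
    col = min(int(x * 3), 2)
    row = min(int(y * 3), 2)
    return row * 3 + col

def decode(sweep1, sweep2) -> str:
    """Combine two sweep endpoints into a phoneme index (0..80)."""
    code = zone(*sweep1) * 9 + zone(*sweep2)
    return PHONEMES[code % len(PHONEMES)]  # wrap for the demo table

if __name__ == "__main__":
    # Gaze lands top-left, then centre: one phoneme out.
    print(decode((0.1, 0.1), (0.5, 0.5)))
```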
The Windows keyboard does actually implement something similar to your first and third suggestions. You can spell words by fixating on the first letter of a word, glancing between the following letters, and finally fixating on the last letter. Some people can do this successfully and achieve good input speeds; however, it is a skill that takes some mastery.
For me the real problem comes from three places.
Firstly, having to spell out words, either with some kind of keyboard or even with a Dasher-like system, means word length matters: long words are harder to enter than short words. The amount of effort needed to express an idea should be proportional to how unusual that idea is, not to how many letters are needed to express it; "Hello Dave, nice to see you, how are you today?" should be easier to write than "Eel shoes" (a toy sketch of this principle follows the third point).
Secondly, in order to achieve some level of throughput, you need to accept that you're going to be living near an edge where typos are inevitable. On current systems, the mechanisms for making corrections are extremely disruptive to throughput, mostly involving repeatedly pressing a key to delete the last character, word or sentence.
Thirdly, and similar to the second issue, revising finished text is also a fraught problem that is inadequately addressed, often requiring repeated pressing of arrow keys and the like.
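To make the first point concrete, here's a toy sketch of pricing messages by information content rather than letter count; the probabilities are invented for the demo, standing in for a real language model.

```python
import math

# Effort proportional to surprisal: -log2 P(message), not len(message).
# Probabilities below are made up purely to illustrate the point.
TOY_LM = {
    "hello dave, nice to see you, how are you today?": 1e-4,
    "eel shoes": 1e-9,
}

def effort_bits(message: str) -> float:
    """Selection effort in bits under the toy model."""
    return -math.log2(TOY_LM[message])

for m in TOY_LM:
    print(f"{effort_bits(m):6.1f} bits  {m!r}")
# The long greeting is far more probable, so an entropy-coded interface
# (Dasher-style) can make it cheaper to enter than the short "Eel shoes".
```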
I am working on a solution that I believe addresses all of these issues: one that allows text to be created quickly using input methods slower than typing, be it eye tracking or switch-access scanning, and that allows those same input methods to be used to review and revise text efficiently.
I will be looking to see if I can take advantage of Paul's interest in this area to help his friend and others.
Is this used in the field, and what level of testing and validation was required?
This is coming from a skeptical POV, but I'm genuinely curious to hear about the process. Law enforcement technology has a history of being anything but evidence-based, of course. It's good that there's finally progress away from pseudoscience like fiber analysis and smoke patterns.
But from your experience is there anything to stop LEOs from adopting new tools and technologies without adequate evaluation and standards, aside from the courts, which have such a poor track record in this space?
Unfortunately, EEG (including P300) doesn't provide a sufficient signal-to-noise ratio to support good communication speeds outside of the lab, with its Faraday cages and days or weeks of de-noising, including removing eye-movement artifacts from the recordings. This is a physical limit due to attenuation of the brain's electrical fields outside the skull, which is hard to overcome. For example, all commercial "mind-reading" toys actually work off head and eye muscle signals.
Implanted electrodes provide better signal but are many iterations away from becoming commercially viable. Signal degrades over months as the brain builds scar tissue around the electrodes, and the brain surgery is obviously pretty dangerous. Iteration cycles are very slow because of the need for government approval for testing in humans (for good reason).
If I wanted to help a paralyzed friend who could only move his or her eyes, I would definitely focus on the eye-tracking tech. It hands-down beats all the BCIs I've heard of.
My family is one of the unlucky ones that has genes for ALS so I’ve watched enough family members struggle. (I’m lucky, selfishly, because I dodged the gene but I still care deeply about this).
Either:
- Part of the whole Worldcoin thing was privately trying to get the data to help his friend
- He doesn't want to say "looking to develop eye-tracking tech for my Worldcoin scam", since most devs won't touch that thing, and so conveniently found a "friend" with ALS.
Saying, on behalf of a friend, that he doesn't believe PG. Although I still wonder, since the two ran YC together.
Thank you for correcting me.
It's not 'easy' to mix up people whom you are accusing of lying about having a friend with ALS for profit.
Worldcoin also has nothing to do with eye tracking, unless you purposefully play with English words. The request is about optimizing software and eye tracking. Eye tracking itself is a solved problem; maybe more is needed for disabilities, but that is not Worldcoin.
And since the request is for people in the Bay Area to meet the friend with ALS, what was the game plan here: invite them in and trap them in his lair instead?
Once he can require you to focus on a series of letters with your eyes, it will be harder to fake and easier to catch dupes. Aside from the fact that getting a "typing cadence" for each individual's eyes is probably valuable in and of itself.
Hopefully, whoever takes this on doesn't take the standard Accessibility approach, which is adding an extra layer of complexity on top of an existing UI.
A good friend, Gordon Fuller, found out he was going blind. So, he co-founded one of the first VR startups in the '90s. Why? For wayfinding.
What we came up with is the concept of Universal design: start over from first principles. Watching Gordon use an Accessible UI is painful; it takes three times as many steps to navigate and confirm. So what is the factor? 0.3X?
Imagine if we could refactor all apps with an LLM, and then couple that with an autocomplete menu. Within that menu is a personal history of all your past traversals.
What would be the result? A 10X? Would my sister in a wheelchair be able to use it? I would love to find out!
He is using a Tobii eye tracker. There is a video he made about it. It's in Turkish, but you can see how he uses it.
https://www.youtube.com/watch?v=pzSXyiWN_uw
Here is an article about him in English: https://www.dexerto.com/entertainment/twitch-streamer-with-a...
1. https://www.nasa.gov/centers/ames/news/releases/2004/subvoca...
[1] https://neuroscience.stanford.edu/research/funded-research/s...
Because of that, I'm also sure that eye tracking will go mainstream in other areas once the Vision Pro is released and everyone else catches on to it as a great input method.
I think that would be my major hesitation but I don't have a lot of experience evaluating patents.
Apple's Eye Tracking Patent: https://patents.google.com/patent/US20180113508A1/en
1) Its eye tracking isn't good enough for this kind of application.
2) Direct access to the gaze vector is disabled.
3) It's really intrusive.
4) It's heavy.
5) It doesn't exist (in the consumer world) yet.
The goal is to enable someone who has motor control issues to communicate directly with the outside world. Shoving a heavy ski mask that totally obscures the outside world onto their face directly prevents that.
Not only that, but you'll need to create, and keep up to date, the software needed to make a communicator. Apple is many things, but its new platforms are not stable; rapid OS updates will break things.
When I worked for one of the big game engines, I was contacted by the makers of the tech that Stephen Hawking used to communicate, which includes an eye tracker:
https://www.businessinsider.com/an-eye-tracking-interface-he...
By my math, 5k people in the US are diagnosed per year, and if your keyboard costs $1k, then your ARR is $5m, and maybe the company valuation is $50m. Numerically, this is pretty far from the goal of a typical YC company.
I hate to be so cold-hearted about the calculations, but I've had a few friends get really passionate about assistive tech, and then get crushed by the financial realities. Just from the comments, you can see how many startups went either the military route or got acquired into VR programs.
The worst I've seen, btw, is trying to build a better powered wheelchair. All the tech is out there to make powered wheelchairs less bulky and more functional, but the cost of getting a new design approved for health insurance reimbursement, combined with any possible risk of it tipping over, combined with the tiny market you are addressing, makes it nearly impossible to develop and ship an improvement. I do hope we reach a tipping point in the near future where a new wheelchair makes sense to build, because something more nimble would be a big improvement to people's lives.
For example, I wrote an NLP parser for a calendar app at Tempo.AI. It was much more efficient than the visual interface, and thus it was accessible. But it didn't use the accessible idiom; instead, it was universally more efficient, whether you are blind or not.
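As a hedged illustration of that idiom (not Tempo.AI's actual code), here's a toy parser that turns one natural phrase into a structured event; the pattern and field names are assumptions for the demo.

```python
import re
from datetime import datetime, timedelta

# Toy calendar-phrase parser: one utterance in, one structured event out.
PATTERN = re.compile(
    r"(?P<title>.+?)\s+with\s+(?P<who>\w+)\s+(?P<day>today|tomorrow)"
    r"\s+at\s+(?P<hour>\d{1,2})(?P<ampm>am|pm)",
    re.IGNORECASE,
)

def parse(phrase: str) -> dict:
    m = PATTERN.match(phrase.strip())
    if not m:
        raise ValueError("unrecognized phrase")
    day = datetime.now().date()
    if m["day"].lower() == "tomorrow":
        day += timedelta(days=1)
    hour = int(m["hour"]) % 12 + (12 if m["ampm"].lower() == "pm" else 0)
    return {"title": m["title"], "who": m["who"],
            "start": datetime(day.year, day.month, day.day, hour)}

print(parse("Lunch with Bob tomorrow at 1pm"))
```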
A good example is a wheelchair accessible doorway. One method is to have a button at wheelchair height. The other method is to have the door open with an electronic eye. The first is Accessible. The second is Universal. Doesn't matter if you are in a wheelchair or not. It's a throughput multiplier.
IMO Talon wins* for that by supporting voice recognition and mouth noises (think lip popping), which are less fatiguing than one-eye blinks for common actions like clicking. The creator is active here sometimes.
(* An alternative is to roll your own sort of thing with https://github.com/dictation-toolbox/dragonfly and other tools as I did, but it's a lot more effort)
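For a flavour of the roll-your-own route, here's roughly what a minimal dragonfly grammar looks like. This is a sketch from memory; the engine name and setup details vary by backend, so treat it as a starting point rather than a recipe.

```python
from dragonfly import Grammar, MappingRule, Key, Text, get_engine

class CommandRule(MappingRule):
    mapping = {
        "save file": Key("c-s"),             # press Ctrl+S
        "new tab":   Key("c-t"),             # press Ctrl+T
        "say hello": Text("Hello, world!"),  # type literal text
    }

engine = get_engine("text")  # "text" engine for testing; swap for kaldi/natlink/etc.
engine.connect()

grammar = Grammar("example commands")
grammar.add_rule(CommandRule())
grammar.load()

# engine.do_recognition()  # uncomment to block and listen for the phrases above
```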
https://en.wikipedia.org/wiki/EyeWriter
https://github.com/eyewriter/eyewriter
https://dasher.acecentre.net/ , source at https://github.com/dasher-project/dasher
---
I remember seeing a program years ago that used the mouse cursor in a really neat way to enter text. It seems like it would be far better than clicking on keys of a virtual keyboard, but I can't remember the name of the program, nor can I seem to find it...
Will probably get some of this wrong, but just in case it rings a bell (or someone wants to reinvent it - wouldn't be hard):
The interface felt like side-scrolling through a map of characters. Moving left and right controlled the speed through the characters; for instance, moving to the left extent would backspace, and moving further to the right would enter more characters per unit time.
Up and down would select the next character: in my memory these are presented as a stack of map-coloured boxes, where each box held a letter (or group of letters?), say 'a' to 'z' top to bottom, plus a few punctuation marks. The height of each box was proportional to the likelihood that letter would be the next you'd want, so the most likely targets were easier and quicker to navigate to. Navigating into a box for a character would "type" it. IIRC, at any instant you could see a couple of levels of letters, so if you had entered c-o, maybe 'o' and 'u' would be particularly large, and inside the 'o' box you might see that 'l' and 'k' are bigger, so it's easy to write "cool" or "cook".
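What's described is essentially Dasher (linked above). Here is a tiny sketch of the core mechanic, with bigram counts standing in for a real language model; all numbers are toy values.

```python
from collections import Counter

# Dasher-style sizing: each candidate next letter gets a box whose height
# is proportional to its probability given the previous character.
CORPUS = "the cool cook took the book to the brook"
bigrams = Counter(zip(CORPUS, CORPUS[1:]))

def box_heights(prev_char: str, screen_h: float = 1.0) -> dict:
    counts = {b: n for (a, b), n in bigrams.items() if a == prev_char}
    total = sum(counts.values()) or 1
    return {ch: screen_h * n / total for ch, n in sorted(counts.items())}

# After typing "co", the boxes reachable from 'o' are sized like this:
for ch, h in box_heights("o").items():
    print(f"{ch!r}: height {h:.2f}")
```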
(I do hardware+firmware in Rust and regularly reference Richard Hamming, Fred Brooks, Donald Norman, Tufte. Could be up for a change)
Geometrically, 50% of the movement does nothing (going left is a null action). In terms of distance moved by the pointer, it would be more efficient to arrange the options in a circle around the pointer, with maybe the bottom 10% of the circle used for the null action.
A possible benefit is that 90% of the screen could then be used to display a larger set of options, which could let the user bisect their choices more quickly.
I've imagined a music theory/improvisation practice program that looks something like Dasher + Guitar Hero.
https://thinksmartbox.com/products/eye-gaze/
I once interviewed at this company. Unfortunately I didn't get the job, but I was very impressed nonetheless.
The solution actually works pretty well, especially when calibrated to a single individual.
[1] https://techcrunch.com/2016/10/24/google-buys-eyefluence-eye...
> But there's way more to do. We've barely scratched the surface of what's possible with eye tracking and I'd love to take a second crack at it.
What do you have in mind? What would you like to see?
https://www.optikey.org/
which ran on a < $1k computer. At the time, the other options were much more expensive (> $10-15k), which was sadly out of our budget.
Imagine being a parent and being ok with this?
The real "moneymakers" in eye-tracking have always been and will continue to be Defense applications for better or worse.
>Imagine being a parent and being ok with this?
I'm sure there are many tiger-style parents who would be perfectly ok, nay, thrilled, with this.
---
I didn't describe a utopian scenario. More like a dystopian one. The Defense applications do exist, but I expect the advertisers to dominate as they usually do.
I would also recommend Jean-Dominique Bauby's Le Scaphandre et le Papillon (The Diving Bell and the Butterfly) to anyone interested in this topic. Typing via eye movements was used in that book in a slow, inefficient manner. In the book's case, the question one should ask is: was his UI paced at exactly the right speed? I was, and still am, deeply moved by what the author was able to accomplish and convey. I am unsure whether a faster keyboard would have made a meaningful, positive difference to the author's quality of life in that particular case. I'll need to give the book another read with that question in mind.
Happily, I expect eye tracking to find fascinating, novel, and unexpected applications. As others have stated, UI/UX design is an interesting part of this puzzle. For example, imagine asking an LLM to output short branches of text and having the writer look at the words he wants to convey. It definitely blurs the line between reading and writing. Since I find writing to be a tactile exercise, I think emotional state comes into play; that's what I'm interested in. Can you literally read someone's eyes and tell what they are thinking?
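A sketch of that branch-selection loop: `generate_branches` and `gaze_choice` are hypothetical placeholders standing in for an LLM call and the eye tracker's fixation report, not real APIs.

```python
# Gaze-driven composition: the LLM proposes branches, the eyes pick one.

def generate_branches(context: str, n: int = 4) -> list[str]:
    # Placeholder: a real system would call an LLM here.
    return [f"{context} ...option {i}" for i in range(n)]

def gaze_choice(branches: list[str]) -> int:
    # Placeholder: a real system would return the fixated branch index.
    return 0

def compose(context: str, steps: int = 3) -> str:
    for _ in range(steps):
        branches = generate_branches(context)
        context = branches[gaze_choice(branches)]
    return context

print(compose("Dear Dr. Smith,"))
```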
For inspiration, check out the Vocal Eyes Becker Communication System: https://jasonbecker.com/archive/eye_communication.html
A system invented for ALS patient Jason Becker by his dad: https://www.youtube.com/watch?v=wGFDWTC8B8g
Also already mentioned in here, EyeWriter ( https://en.wikipedia.org/wiki/EyeWriter ) and Dasher ( https://en.wikipedia.org/wiki/Dasher_(software) ) are two interesting projects to look into.
@pg - If your friend has not tried adding a mouse-click via something they can activate other than eye gaze, this would be worth a shot. We have a lot of MND patients who use our combination with great success. If they can twitch an eyebrow, wiggle a toe or a finger, or even flex their abdomen, we can put electrodes there and give them a way forward.
Also, my contact details are in my profile. I'd be happy to put you in touch with our CEO and I'm confident that offers of funding would be of interest. The company is listed on the Australian stock exchange, but could likely go much further with a direct injection of capital to bolster the engineering team.
Cheers, Tom
We built a prototype for roadside sobriety checks. The idea was to take race/subjectivity out of the equation in these traffic stops.
We modified an Oculus Quest and added IR LEDs and cameras driven by small Pi Zeros. I wrote software for the Quest that gave instructions and ran a series of examinations (you'd follow a 3D ball, the screen would brighten and darken, and several others) while I looked for eye jerks (saccades) and pupil dilation. The officer was able to see your (enlarged) pupil on a laptop in real time, and we'd mark suspicious times on the video timeline for review.
It was an interesting combination of video decoding, OpenCV, and real-time streams with a pretty slick UI. The Pi Zero was easily capable of handling real-time video stream decoding, OpenCV, and Node. Where I ran into performance problems, I wrote Node -> C++ bindings.
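Not the project's actual code, but for a flavour of the OpenCV side: under IR illumination the pupil is usually the darkest blob, so a minimal detector can threshold, clean up, and take the largest contour. The threshold values here are guesses.

```python
import cv2
import numpy as np

def pupil_center_and_size(gray: np.ndarray):
    """Find the darkest blob (pupil under IR): centre and radius in pixels."""
    _, mask = cv2.threshold(gray, 40, 255, cv2.THRESH_BINARY_INV)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    c = max(contours, key=cv2.contourArea)
    (x, y), r = cv2.minEnclosingCircle(c)
    return (int(x), int(y)), r  # radius is a crude dilation proxy

cap = cv2.VideoCapture(0)  # the IR eye camera
ok, frame = cap.read()
if ok:
    print(pupil_center_and_size(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)))
cap.release()
```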
We did it all on something silly like a 50k budget. Neat project.
With my group, we are developing an eye tracker for studying developmental and clinical populations, which typically present challenges to conventional eye trackers. It is a spin-off from our academic work with infants, and we already have a study almost done that uses it. We are still in the very beginning phase in terms of where this may lead us, but we are interested in contexts where eye tracking may, for various reasons, be more challenging.
I'm guessing a combination of projection mapping, built-in lighting, and some crowdsourced data will get accuracy to very usable levels.
Or how about a UI that automatically adapts to your eye-movement and access patterns, rearranging UI elements to minimize the eye movement required to complete your most common tasks?
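A toy sketch of that adaptive idea, assuming slot 0 sits nearest the user's resting gaze; all element names are made up.

```python
from collections import Counter

# Count activations, then hand the hottest elements the nearest slots.
usage = Counter()

def record_activation(element: str) -> None:
    usage[element] += 1

def layout(elements: list[str]) -> list[str]:
    """Most-used first: slot 0 (nearest resting gaze) gets the hottest element."""
    return sorted(elements, key=lambda e: -usage[e])

for _ in range(5):
    record_activation("reply")
record_activation("archive")
print(layout(["archive", "reply", "forward"]))  # ['reply', 'archive', 'forward']
```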
After you've parsed what an object is, tracking it doesn't take anywhere near the effort of the original segmentation. No need to re-evaluate until something changes.
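A sketch of that detect-once-then-track-cheaply pattern using OpenCV's CSRT tracker (requires opencv-contrib-python); a manually drawn ROI stands in for the heavy detector here.

```python
import cv2

cap = cv2.VideoCapture("input.mp4")
ok, frame = cap.read()
if not ok:
    raise SystemExit("no video")

# Stand-in for an expensive detector/segmenter: the user draws the box once.
box = cv2.selectROI("pick object", frame)

tracker = cv2.TrackerCSRT_create()
tracker.init(frame, box)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    found, box = tracker.update(frame)  # cheap per-frame update
    if not found:
        break                           # re-run the heavy detector here
cap.release()
```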
Maybe even use activations to turn networks on and off: "Oh, text: better load OCR into memory."
And it does inform a lot of our built world. It's strange to think that when watching a movie, only 10% is in focus.
Eye movement does provide a lot of information to other people, and I think the physical movement produces feedback for velocities and the like too. Mimicking biology is often a good bet.
Seems like all the solutions out there are some flavour or variation of this.
It would be great to hear from Paul about how his friend uses the keyboard and what kinds of tasks he'd love to do but can't with current solutions.
It seems like a throughput problem to me. How can you type quickly using only your eyes?
Have people explored using small phonetic alphabets or Morse code style encoding?
Once I got TensorFlow working, I'd start mapping different kinds of UX. Throughput is king.
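A toy sketch of the Morse-style idea: classify blink durations as dots or dashes and look the symbol up on a long pause. The threshold and the abbreviated table are made-up demo values.

```python
# Blink-Morse decoder: short blink -> dot, long blink -> dash.

MORSE = {".-": "A", "-...": "B", "-.-.": "C", "-..": "D", ".": "E",
         "..-.": "F", "--.": "G", "....": "H", "..": "I", ".---": "J"}
# (abbreviated table for the demo)

DOT_MAX_S = 0.25  # blinks shorter than this count as dots

def decode(blink_durations: list[float]) -> str:
    symbol = "".join("." if d < DOT_MAX_S else "-" for d in blink_durations)
    return MORSE.get(symbol, "?")

print(decode([0.1, 0.4]))             # ".-"   -> 'A'
print(decode([0.4, 0.1, 0.1, 0.1]))   # "-..." -> 'B'
```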
Both Apple and Facebook acquired eye-tracking companies to kickstart their own development.
Here are some top lists:
https://imotions.com/blog/insights/trend/top-eye-tracking-ha... https://valentinazezelj.medium.com/top-10-eye-tracking-compa...
It's also an active research field; this is one of the bigger conferences: https://etra.acm.org/2023/
I believe Paul Graham can Google, or use AI, and already knows about the companies and links you posted. His post was a call to action to connect with people working on yet-to-be-discovered innovations, and to inspire those quietly working on this to come forward and connect with him.
You are talking about cobalt, and that is only used in lithium-ion batteries. You can avoid cobalt by using lithium iron phosphate batteries.
There are active efforts to develop new lithium-ion chemistries that avoid cobalt, and there are even commercial sodium-ion batteries now.
You bring this into a completely irrelevant conversation; what have you done to help solve it?
Go to hell.
Unless, of course, you'd like to commit the funded work to the free commons, unencumbered by patents and copyrights, and free to use by any entity for any purpose.
That's what we'd do for ALS, right ?
PG is a good guy: he sides with entrepreneurs, and created an industry-wide standard for a founder-friendly seed-stage investment deal (called a SAFE) in a world of hugely predatory deal makers. And from his Twitter it's pretty clear that he's biased towards fairness and humanity in general.
Read some of PG's essays. People don't share deep insights and knowledge like that for free if they are made of the wrong stuff.
Agreed.
My comment violates the HN guideline to "respond to the strongest plausible interpretation of what someone says" and I beg the op's pardon.
A low (or no) return investment in basic technology to help the disabled better express their intentions - for their own private uses - is an unalloyed good.
Of my two interpretations, I will set aside my initial, very cynical reaction and wait to see what actually comes from this.
It'd be good to know what rate we need to beat and some other metrics.
I'd consider an approach like the human-powered helicopter parable.
I'd create a model of the eyes and face: a screen, like a phone, is put in front of a camera, showing a model face with controllable eyes that you drive in software. Maybe skip the screen-in-front-of-a-camera setup and link straight into the video feed.
It knows the limits of the eyes (different models for different people and situations), can measure fatigue, etc.
You could run billions of simulations...
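A toy version of that simulation loop, with invented parameters for dwell time, saccade speed, and error rate; the point is only that layouts and UIs can be compared offline, in bulk.

```python
import random

# Crude gaze model "types" a phrase on a keyboard layout; we measure
# throughput. All constants below are placeholder assumptions.
LAYOUT = {c: (i % 10, i // 10)
          for i, c in enumerate("abcdefghijklmnopqrstuvwxyz ")}
FIXATION_S = 0.6            # dwell time to select a key
SACCADE_S_PER_KEY = 0.03    # travel time per unit of key distance
ERROR_RATE = 0.05           # chance a selection lands on the wrong key

def simulate(phrase: str, rng: random.Random) -> float:
    t, pos = 0.0, LAYOUT[" "]
    for ch in phrase:
        tx, ty = LAYOUT[ch]
        t += SACCADE_S_PER_KEY * (abs(tx - pos[0]) + abs(ty - pos[1]))
        t += FIXATION_S
        if rng.random() < ERROR_RATE:  # mistype: one backspace + retype
            t += 2 * FIXATION_S
        pos = (tx, ty)
    return len(phrase) / (t / 60)      # characters per minute

rng = random.Random(0)
trials = [simulate("the quick brown fox", rng) for _ in range(10_000)]
print(f"mean throughput: {sum(trials) / len(trials):.0f} chars/min")
```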
As far as I know, mainstream medicine isn't close to solving _any_ chronic condition, except managing it.
Second Sight was giving patients artificial eyes. When they ran out of funding, they closed shop. The patients lost the support system for their eyes: if anything goes wrong with an artificial eye, there is no one to repair or fix it. They just have to carry a piece of useless metal junk in their heads.
They won't share information about where you look, but they will share info about where you 'click', which is used to navigate apps. IIRC you are supposed to use your fingers for this and other actions, but I imagine Apple's accessibility team will have an alternate mode for people with motor limitations. It could be a long blink, or rapid blinking, for example.
VR Eye-Tracking Add-on Droolon F1 for Cosmos(Basic Version) https://a.co/d/asAAZwT
You can use it with any headset because the data it provides is independent of the headset. Not sure how accurate that one is in particular.
Eye-tracking data is not particularly complex, nor super sensitive in nature. Once the sensor is calibrated, it just sends over two vectors.
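A sketch of what those two vectors buy you once the screen plane is calibrated: intersect the gaze ray with the plane and convert to pixel coordinates (which also answers the pixel-coordinates question upthread). The screen geometry numbers are placeholders from a hypothetical calibration step.

```python
import numpy as np

# Calibrated screen plane in the tracker's coordinate frame (placeholders).
SCREEN_ORIGIN = np.array([-0.25, 0.30, 0.60])  # top-left corner, metres
SCREEN_X = np.array([0.50, 0.0, 0.0])          # left-to-right edge vector
SCREEN_Y = np.array([0.0, -0.28, 0.0])         # top-to-bottom edge vector
RES = (1920, 1080)

def gaze_to_pixel(origin: np.ndarray, direction: np.ndarray):
    """Intersect a gaze ray with the screen plane; return pixel coords or None."""
    normal = np.cross(SCREEN_X, SCREEN_Y)
    t = np.dot(SCREEN_ORIGIN - origin, normal) / np.dot(direction, normal)
    hit = origin + t * direction                # point on the screen plane
    rel = hit - SCREEN_ORIGIN
    u = np.dot(rel, SCREEN_X) / np.dot(SCREEN_X, SCREEN_X)
    v = np.dot(rel, SCREEN_Y) / np.dot(SCREEN_Y, SCREEN_Y)
    if not (0 <= u <= 1 and 0 <= v <= 1):
        return None                             # gaze is off-screen
    return int(u * RES[0]), int(v * RES[1])

print(gaze_to_pixel(np.array([0.0, 0.0, 0.0]), np.array([0.0, 0.2, 0.6])))
```

Multiple screens are then just multiple calibrated planes: test the ray against each and take the on-screen hit.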
But maybe I'm wrong; it might be a purely software-based problem that can be solved more quickly.
Whilst it plays an unskippable and unblockable ad (thanks weiapi!)
YC was investing in ways the traditional VCs weren't when it started, and coding HN was a part of it. I doubt I'm the only one who has had a few HN tech support emails from PG.
can you elaborate?
Interested in hearing more?
If you can show your new approach works, sure. Usually this is done via papers at ML conferences, but if you have reproducible results on GitHub, I'll take a look.