It's only 720p and around 15fps but real shallow dof, very little sensor noise, autofocus works. Well worth trying if you have a Sony camera from the last few years.
Sensor size and good optics still win. Having said that, the effort and detail that have gone into this feature are very impressive; I enjoyed the blog post. Also, WebAssembly SIMD looks super cool; I'm looking forward to a new class of web apps using wasm.
I ended up getting a $10 HDMI USB capture stick from Aliexpress. I get a perfect 1080p/60fps signal, and at least on Linux it worked out of the box with Zoom.
The only problem now is that most of my meetings start with "wow, why do you look like you're on TV?"
I'm using my old T1i, which can be had for less than $50 these days, plus you can pick up an 18-55mm kit lens for like $20, and the video quality blows away any webcam, especially at the same price. I'd also recommend a battery-to-AC power adapter.
The codec situation with h264/HEVC/vp9/AV1 software/hardware encoding is a mess. Hopefully we'll get wide hardware support for AV1, although it might take a while.
(I ended up having to buy a little logitech webcam, which has been fine, but being able to pick my lens etc is awesome!)
I also tried gPhoto2/ffmpeg and virtual cam driver with Nikon D5200 (USB) on Linux but I prefer the Redmi since I do not have a decent low light lens for my DSLR.
1/ Your internet connection, especially upload bandwidth and latency matter a lot.
2/ Zoom's desktop app performs very well, but its web version is atrocious: not just because of the dark patterns they use to force you to install the desktop app, but also because its performance is terrible compared to the desktop version, and worse than almost everything else. Unfortunately, I don't trust them and refuse to use their desktop app on anything but my iPad.
3/ Six months ago Meet used to be as bad as Zoom on the web, but it has improved a lot and is slowly approaching Zoom's desktop performance. I have noticed that Meet calls on my work G Suite account perform much better than on my personal account. This might be explained by #1 above, i.e. my family has worse internet connections than my coworkers, but I am not sure whether all improvements have been rolled out to personal accounts.
I moved to a new house, and the quality of my video calls dropped dramatically. Constant freezing and dropouts. It was extremely frustrating to try to participate in a meeting. I could receive fine, but any time I spoke, I would drop out within minutes.
Speed tests showed plenty of bandwidth, but my modem statistics showed high upstream power levels, occasionally out of the allowed range, and lots of "uncorrectable" packets.
I finally got a Comcast technician in to look at it (yay for business-class support), and they replaced the cable from the pole all the way to the first splitter in the basement, and since then it's been flawless. 100/15 Megabit service has been totally adequate for our needs, so long as it's reliable and the latency is low enough.
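A rough back-of-the-envelope check shows why 15 Mbit upstream can be adequate for a household of callers. The per-stream bitrate and headroom figures below are my own assumptions, not numbers from any provider:

```python
# Back-of-the-envelope: how many simultaneous 720p uploads fit in a
# 15 Mbit/s upstream link? Bitrate and headroom values are assumptions.
UPLOAD_MBPS = 15.0
STREAM_720P_MBPS = 2.5   # assumed upstream bitrate per outgoing video stream
HEADROOM = 0.8           # leave 20% for protocol overhead and other traffic

usable = UPLOAD_MBPS * HEADROOM
max_streams = int(usable // STREAM_720P_MBPS)
print(max_streams)  # concurrent senders before the uplink congests
```

With those assumptions, four people can send video at once before the uplink saturates; the reliability and latency matter more than raw headline bandwidth.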
It kills me that our city isn't putting in conduits or fiber while doing utility work, though. The whole time that was happening, there were gas contractors opening the street and running new supply lines to every house, but not putting in any extra conduits or dark fiber. The construction sounds were almost like being back in the office...
My mobile internet is really fucking good, and often outperforms my sodding wired connection
It grates on me when people claim DSL/cable qualifies as sufficiently good broadband in the US, given the lack of upload bandwidth and the high latency (add packet loss in here too). The situation is so bad that you often can't even find out how much upload bandwidth so-called "broadband" cable ISPs offer.
The experience on symmetric fiber connections is noticeably better: we can have a whole house of people streaming video up and down simultaneously without a hiccup, which matters in these times of work from home and school from home.
For the last item: personal accounts (only?) default to sending and receiving video at a lower resolution (360p). So if you meant that the quality is lower, you can set it to 720p on both sides.
Edit: I don’t think Meet remembers those settings though, so you have to do it every time (and show your family members how to do so).
As a legacy free GApps user it is even more confusing because the admin page gives me an option to default to higher quality video but that doesn't do anything.
Why does Google, with all the resources at its disposal, choose to cheap out like this when competitors in the video chat space (from tiny startups to gigantic corporations of similar size) have offered near-native-resolution video chat for ages?
Are they even _trying_ to compete?
Meet was much worse than Zoom, even when I take the bad web interface of Zoom into account.
I ain't a fan of either, though.
I refuse to install Zoom. They have removed the dark pattern, and the "join via browser" option is almost immediately available. If you have it installed, now is a good time to uninstall it.
In my experience, it doesn’t completely cover the background most of the time, and if you move at all, as you point out, it can’t keep up.
Kind of funny to see Google engineering blogging about it when it feels extremely half baked.
This makes me sad, because in all other areas, I think Meet excels well beyond the competition.
EDIT: removed my general sentiment on Google
I've always wondered what proportion of modern real-time video effects rely on ML vs. classical image processing; this not only answers that question, but provides details down to the level of model architecture and the final latency and IOU benchmarks.
Of course I'd be more interested to read how Zoom manages to do even better, but I'm not holding my breath for them to publish those details.
Is it _better_ than Zoom, though? In my experience, I don't see an improvement big enough to be worth switching.
The other thing I've noticed is the background blur absolutely annihilates my CPU. To the point where I would rather just turn off my camera if I don't want my background visible.
Google UX/UI team: Please fucking make the mute/unmute button visible at all times.
People in highly paid positions certainly want "has taste" and "knows what looks good" to be part of their self-image. Many failures in design and architecture happen for that reason alone.
I then ended up programming and working in film sound, because very few people in both fields tell you what to do when they have no idea what's going on.
Ironically forgetting that visual minimalism produced by hiding things isn’t really minimalism.
It would be like me throwing all my things in the garage and advertising my house as Spartan. No, it’s not, it’s a mess. The mess is just hidden until I need to do something.
If we want to give awards for this my vote would go to Apple. I find their products to be horrific when it comes to completely undiscoverable features. iOS is bad on its own but the Apple TV is a total train wreck. I couldn't get rid of that thing with its awful interface and remote fast enough.
And sometimes it's great, because you get to focus on the content, and sometimes it's not, because you lose control. It's something that should be optional or configurable. It's great to have shortcuts for the most common commands (like space for pause in youtube), and I guess it would make a lot of sense if video conferencing tools also had such a shortcut for mute/unmute.
But again, give people more control over their UI. There are too many applications that mess this up one way or another.
This is true. I find Android UI so offensive that if I did not have iOS as an alternate I probably would carry a dumb phone and live like a monk. I can’t stand the miles of white space and brightly coloured tiny UI controls.
Evokes such a visceral reaction in me that even I am startled at times haha
Physical button to block the microphone, LED on the button itself and a tray icon with the microphone status displayed.
I have used MS Teams and Zoom, and both are decent (MS Teams works fine for school),
but it's insanely unbelievable that this kind of software lacks features that gaming communities had probably 20 years ago.
PUSH TO TALK is probably one of the most important features of any voice software. The lack of it is a big WTF.
It gives you 100% control over when you're talking and you don't have to alt-tab between programs in order to "mute" yourself.
You can bind it to, e.g., MOUSE3 (scroll-wheel click) and it works fine alongside other programs, games, and so on. Toggling between muted/unmuted is a different thing.
This is coming from somebody who has used Ventrilo, Mumble, TeamSpeak, and nowadays Discord for something like the last 12 years, for hours per day, almost every day.
That's not something doable today on the web, for obvious security reasons, but it's possible for Discord, which has a separate app, and would be doable for Zoom too, I guess.
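The core of push-to-talk is a tiny state machine: transmit only while the bound key is held, with an optional short release delay so word endings aren't clipped. A minimal sketch; the class and the timing value are my own invention, not any real client's API:

```python
import time

class PushToTalk:
    """Transmit only while the bound key is held down.

    release_delay keeps the mic open briefly after key-up so the tail
    of the last word isn't clipped (the 0.2 s default is an assumption).
    """
    def __init__(self, release_delay=0.2):
        self.release_delay = release_delay
        self._held = False
        self._released_at = None

    def key_down(self):
        # Key pressed: open the mic immediately.
        self._held = True
        self._released_at = None

    def key_up(self):
        # Key released: remember when, so we can close the mic shortly after.
        self._held = False
        self._released_at = time.monotonic()

    def transmitting(self):
        if self._held:
            return True
        if self._released_at is None:
            return False
        return time.monotonic() - self._released_at < self.release_delay

ptt = PushToTalk(release_delay=0.2)
ptt.key_down()
print(ptt.transmitting())  # True while the key is held
ptt.key_up()
time.sleep(0.3)
print(ptt.transmitting())  # False once the release delay has elapsed
```

The release delay is exactly the "reaction time delay" mentioned below: power users could shorten it or set it to zero.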
In fact, now that I think about it, this has happened many times over the years with traditional mouse-driven interfaces too.
I'm sure some power users would like to shorten the 'reaction time delay' or even remove it entirely so I guess that should be an option as well.
The mute/unmute button changes position and can be hidden in a top bar that slides out. In some fullscreen situations there is no button to get out of fullscreen: sometimes double-click works, sometimes it doesn't. Recently I could not even alt-tab away; my computer was basically 'locked' by Zoom.
https://tacosteemers.com/articles/2020-10-16-ux-anti-pattern...
I really think there is a market for a physical video conference controller. If I could get a hefty slab of something with quality buttons to enable/disable video, push to talk/mute/unmute, bring to foreground, ‘on air’ light and end call, I’d easily pay $100 for it.
[1] https://support.zoom.us/hc/en-us/articles/201362973-What-Is-...
The best conferencing solutions I've used shame those not using video.
Not that you should have to install an extension to get basic UX
But as the recent Google icon kerfuffle showed, UI/UX is not their strength (probably because of opinionated technical people who think you need to A/B test shades of blue).
Perhaps I'm missing something obvious (or a Chrome plugin that will allow me to mute based on the page URL rather than site). In the unlikely event that a Googler is reading this I'm not asking for yet another product or complicated new piece of functionality aimed at this specific use case. Just a mute button for audio. Thanks!
No, vastly different products. Hangouts is the legacy thing and never worked quite right for me. Meet is much better.
It works for me for Chromium on Ubuntu.
It renders a big cross through the microphone when muted.
Simple, yet insanely effective UI (#).
Best thing ever.
#) Especially when compared to the mess that is Google Meet. My favourite "feature" of theirs is how when someone is presenting, it's impossible to view the presentation as just another stream - no they have to make it dominate everything, meaning it's so hard to see the other team members.
And it can be extremely hard to see who's talking when viewing a lot of cameras at the same time. And for whatever reason, the quality turns into a blurry mess, a far cry from 720p, way too often (and I have fibre internet).
And if you need minimalism, offer a toggle for that. But I think most people should have it forced on them; it would save everyone a lot of trouble -- just think about all the aggregate time lost by users talking into a muted mic.
I did donate a contribution to say thank you.
I'm a 28 yr old software developer.
But you will probably hit a dog, because the steering wheel suddenly blocks your view too.
When I leave a meeting, can you please stop asking me for feedback every time and just take me back to the main meet screen?
It would be so easy just to put that small dialogue box on the main meet screen rather than prompt me to click the button to return.
Doesn't excuse the UI, but at least this lets you avoid using it!
I bought an external microphone for my laptop with a hardware mute button.
I still can't stand the bottom popping up and down and not being able to tell if I'm muted.
There's a tendency to think of ML as "not programming," or something other than just plain programming. But as the tooling matures, that'll go away.
(Lisp used to be considered "AI programming," till it became useful in many other contexts.)
In maybe a decade, it might be found in the standard libraries of programming languages, and on top of things like `Math.abs` we will have `ML.textToSpeech("Hello world")`, `ML.isCat(image)`, etc. However, the problem I see with that is that no matter how far we wind the clock forward, we will only be able to put the most simplistic use cases into a library. `ML.isCat()` could be one of those: since most humans can do image categorization, it stands to reason that you could put it into a library. However, most industry applications involve highly customized ML algorithms that are optimized for a very specific use case. So there will always be a need for a research team, in big companies at least. Maybe smaller companies will try to build their stuff by chaining libraries together.
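To make the idea concrete, a hypothetical `ML.isCat`-style standard-library call would just hide a bundled pretrained model behind a plain function. Everything in this sketch is invented for illustration; the "model" is a trivial stub standing in for a real classifier:

```python
# Hypothetical sketch of an "ML in the standard library" API.
# There is no real ML.isCat anywhere; the model below is a stub.

class _StubCatModel:
    def predict(self, image):
        # A real implementation would run a bundled image classifier;
        # the stub just checks a made-up metadata tag on a dict.
        return 1.0 if image.get("label") == "cat" else 0.0

class ML:
    _cat_model = _StubCatModel()

    @staticmethod
    def is_cat(image, threshold=0.5):
        # The library would hide model loading, preprocessing, and
        # thresholding behind one boolean-returning call.
        return ML._cat_model.predict(image) >= threshold

print(ML.is_cat({"label": "cat"}))   # True
print(ML.is_cat({"label": "dog"}))   # False
```

The point of the sketch is the shape of the API: a fixed, simplistic task behind one call, which is exactly why only the most generic use cases could ever live in a standard library.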
What you're talking about is using AI as programming tools. It's still programming, but using pre-trained models as part of the plumbing.
Anyone who uses the blur realizes that its quality is far behind other offerings', and the Google Meet UI is very bad as well.
Zoom, Teams, and even WebEx are superior quality- and usability-wise.
Zoom's web client is particularly terrible, and we can't install the desktop client for security reasons.
And the new background noise cancellation feature is magic.
Out of these, I'm really surprised at how "not as horrible" MS Teams is. Loads of functionality, and the UX is bearable.
I already have RTX Voice now and it's the best thing ever.
https://www.nvidia.com/en-au/geforce/news/nvidia-broadcast-a...
Are they able to change the bg in the browser?
Jitsi also has background blur but it's only ok-ish on Chrome and unusably slow on Firefox.
I thought the whole point of having a video call is to see who you are talking to, and their environment to further enhance the effectiveness of the conversation.
If you are in your kitchen, or under a tree, I definitely would like to see that because that environment will have an effect on how we communicate.
I have coworkers who are in house shares with 5 other adults all trying to work from home around tiny desks. Background blur for them is a nice way to hide some of the chaos of their living arrangements.
In the above scenarios, if I'm not certain there aren't going to be awkward things behind me, I'd want to blur or set a custom background. Sitting with your back against a wall also works, which is what a lot of people seem to be doing.
> In the current version, model inference is executed on the client’s CPU for low power consumption and widest device coverage.
Naively, I would think model inference done server-side would have the lower CPU cost (from the client's point of view) and the widest device coverage (the client does nothing extra). What am I missing?
If the segmentation is done server-side, then you need to sync it to the sender and reflect that quickly in the preview. It's probably not a great experience, at least for a launch.
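A quick latency estimate shows why the server round trip hurts a live self-view preview. All of the numbers below are illustrative assumptions, not figures from the blog post:

```python
# Per-frame latency budget at 30 fps: about 33 ms between frames.
FRAME_BUDGET_MS = 1000 / 30

# Client-side: run the segmentation model locally (assumed cost).
local_inference_ms = 10.0

# Server-side: upload the frame, infer, download the mask (assumed costs).
rtt_ms = 60.0          # round trip to the server
server_infer_ms = 5.0
remote_total_ms = rtt_ms + server_infer_ms

print(local_inference_ms <= FRAME_BUDGET_MS)  # True: fits the frame budget
print(remote_total_ms <= FRAME_BUDGET_MS)     # False: the preview lags
```

Under these assumptions the local model keeps the preview in sync with the camera, while the server path adds roughly two frames of lag before the mask even arrives, on top of the extra upstream bandwidth for raw frames.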
It sucks and it’s distracting.
Your hair and hands pop in and out of blur. Sometimes part of your face will blur.
I don’t care if your workspace is messy or your kid walks in the room. I do care that we’re all being distracted by your weirdly blurred hair and hands.
Given that many had to start WFH on short notice, meaning they couldn't relocate to circumstances enabling a dedicated home-office space, blurry hair and hands are a very reasonable compromise.
I think you are overthinking it. I've seen people use it when it provides no real material benefit other than the placebo effect of making the user believe the blur makes other people focus on their face.
But that's not always true, though: I have seen background replacement bleed all over people's faces (and yes, I seem to be the only one who thinks that's wrong).
Can we get a mute button visible at all times before 2024?
Is it just me or is the button visible at all times? I could see the button visible on the bottom of the screen at all times I used meet during a session with friends. I even tried it right now to make sure.
Once you have depth information integrated with a camera, then it should be pretty trivial to do background removal.
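With a per-pixel depth map, background removal really can be close to a one-line threshold. A sketch with a synthetic depth image; the depth values and cutoff are made up for illustration:

```python
import numpy as np

# Synthetic depth map in metres: a "person" blob at ~0.8 m in front
# of a wall at ~3 m. A real depth camera gives you exactly this array.
depth = np.full((6, 8), 3.0)
depth[1:5, 2:6] = 0.8  # the subject, a 4x4 block of pixels

# Everything nearer than the cutoff counts as foreground.
CUTOFF_M = 1.5
foreground = depth < CUTOFF_M

print(foreground.sum())   # 16 foreground pixels (the 4x4 blob)
print(foreground[0, 0])   # False: a wall pixel is background
```

The hard part in practice is edge quality (hair, depth noise at silhouettes), which is why shipped products still combine depth with segmentation or matting rather than thresholding alone.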
Whereas a 35mm f1.8 from Nikon is like $200 and whatever you mount it to is still going to need to do auto focusing and a bunch of other camera-y stuff to make it accessible to non photo geeks and then you’re going to need an off camera microphone so the entire call isn’t listening to your autofocus motor and...
We're being herded into the new more useless products.
I also think it makes the subject look better for some reason.
Advantages: it looks natural, it covers whatever is going on behind you (in case you are not alone and people walk by, or if your living room is messy), and it blends better than fake backgrounds (because it's the same image behind it). I have a picture of my office that I use both at home and at my real office, and most people can't tell. And since I took the picture with my phone, which has better resolution, my video feed looks better for cheap.
https://1.bp.blogspot.com/-viEA4OY0sxA/X5s7IBwoXOI/AAAAAAAAG...
As in, the blurred background looks totally different (light:dark, shapes, etc.) to the unblurred background.
(I get that they’d need to do something funky to show blurred and unblurred backgrounds with the same foreground video, and faking it is likely easier than doing it programmatically, but this is just odd/sloppy.)
The right clip is an example of background replacement.
This is why the blurred background on the left does not look anything like the unblurred background on the right.
Although there's a lot of blurring on the shoulder of the guy at the beach: https://i.imgur.com/D5ueGUh.png
There is some work in OBS to get AI green screen working, so I hope we will get that on GNU/Linux one day.
When the video is encoded, the codec does motion estimation (among other things) to reduce the bandwidth required. So why don't we use the motion vectors from the video codec to modify the foreground/background mask in real time? Obviously this is going to create weird artifacts pretty soon, but it might just be good enough for a few frames before the ML model produces another accurate mask.
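The idea above can be sketched as warping the last segmentation mask by the codec's motion vectors until the next ML mask arrives. This toy version uses a single global integer motion vector instead of per-macroblock vectors; all names and numbers are invented:

```python
import numpy as np

def propagate_mask(mask, motion_vector):
    """Shift a binary foreground mask by an integer (dx, dy) vector,
    filling uncovered pixels with background.

    A real codec supplies one vector per macroblock; a single global
    vector keeps this toy example simple.
    """
    dx, dy = motion_vector
    shifted = np.zeros_like(mask)
    h, w = mask.shape
    ys, xs = np.nonzero(mask)          # foreground pixel coordinates
    ys2, xs2 = ys + dy, xs + dx        # move them along the vector
    keep = (ys2 >= 0) & (ys2 < h) & (xs2 >= 0) & (xs2 < w)
    shifted[ys2[keep], xs2[keep]] = 1  # drop pixels shifted off-frame
    return shifted

mask = np.zeros((5, 5), dtype=np.uint8)
mask[1:3, 1:3] = 1                     # 2x2 foreground blob
moved = propagate_mask(mask, (2, 1))   # subject moved right 2, down 1
print(moved[2:4, 3:5].sum())           # 4: the blob is at the new position
```

As the comment notes, errors accumulate (occlusions, non-rigid motion), so this would only bridge a few frames between full model inferences.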
I have observed in the last couple months that whenever I create a Google Calendar invite with others, Google has started inserting a Google Meet conference as the location to meet.
It was one thing to ask/offer this as an option if you'd like to use it, but now Google is positioning it as if you had chosen that. So if you left it empty, because you usually use some other understood method with your friends/colleagues, now your participants are confused and think you wanted to use Google Meet.
I think that's going too far to get people to adopt your product.
Disclaimer: I work at Google but not on these products.
Edit: it seems the tooltip only appears the first time you try to add Meet. After that it doesn't appear and you have to go into settings.
It's still shady that they turned that on by default to get people to use it...
Given the number of IT people I’ve heard express concerns about UI quality and eventual cancellation even for enterprise purchases, it’s also far from a given that the IT department is just blindly pushing a product.