Alexa first came out in 2014. If it were the world-changing product that people said it was going to be, it would have spawned multiple billion-dollar companies by now, much like smartphones led to Uber.
Amazon can keep pooping out Alexa into 15 new products every year. Google will copy them. They don't actually know what a good product is, but they're the internet-company version of Samsung, a fast follower, so they'll do it because they're competitor-focused and just in case it takes off.
None of that will change the fact that this product category isn't going anywhere.
For speakers like Alexa and Google Home, voice being the only input allows the user to say whatever they want, which makes the task space infinite. But voice recognition and NLP are not yet at a point where they can recognize everything the user says. This creates a less-than-stellar experience, with the user having to repeat, rephrase or, even worse, abandon the task. I think this platform will blow up when NLP/AI is able to detect user intent with near-perfect accuracy and can make the interaction as fluid as with a well-designed app. It doesn't hurt for Amazon to have a large installed base ready to use the platform if/when intent recognition gets up to par.
Of course it will never replace the phone or desktop, as there will always be things we can't say out loud (secrets), places where voice isn't possible (loud places), and situations where it's just not courteous.
Not to mention the constant wondering whether the task can even be accomplished. When a voice assistant rejects your query, in many cases you can't be sure whether it's because it couldn't understand you, or because it can't possibly accept what you said as a valid input in the context it's in. In regular interfaces, visible constraints matter as much as affordances.
Even so, Alexa and its competitors don't need to be computing platforms; they just need to be the interaction layer for day-to-day human-computer interaction. Even if Alexa were as technically simple as an IFTTT recipe with the "THIS" being "what the person says out loud", that would be hugely valuable. It's very clear to me that if it's easier to do something with a device than without it, and easier still to make the device do the thing without physically touching it, that will ultimately be the way most people do the thing, provided they have access to the technology.
That is why Amazon is pumping out so many different Alexa form factors. I couldn't figure out who would want the Alexa ring. Then I met someone who hates earbuds and swears by his Swiss watch, and who was excited about a wearable he'd actually use. Amazon wants to be the dominant player, and is simply crossing form factors off its list, which is (presumably) stack-ranked by opportunity size.
[0] https://en.wikipedia.org/wiki/Uber#History
[1] https://pitchbook.com/news/articles/uber-by-the-numbers-a-ti...
From a technical perspective, there's no reason you couldn't have had a ride-hailing app on a Nokia Series 60 phone in 2006, or a Palm Treo 270 in 2002. All the necessary components were present in both of them. What was missing was a critical mass of users of those devices -- a user base large enough for third parties to justify building services on that scale for them.
That could help explain why Amazon is pumping out Alexa devices in so many different form factors. They haven't yet cracked the nut of a single form factor that will attract that critical mass of users, but maybe if you aggregate enough small user bases together, they'll add up to one large enough to attract third parties.
Is it Voice or Spatial Computing (AR/VR/xR)?
It's clearly voice, for many reasons. To cut the argument short, realize that language is what separates humans from other animals, and recognize that voice is the natural form of language, not writing. The advancement of AI requires major advancement in understanding human emotions, which are conveyed through subtleties in voice, and not picked up in text.
But it doesn't need to be either/or. Why not both Voice and xR?
Over time, voice and xR will converge, as voice interfaces get integrated into more consumer services, and xR gets more productive applications (right now it's all games and porn). By then, Amazon will be well positioned to get in on the fun, while Facebook will not be taken seriously.
Unless Facebook can disrupt the global monetary system, pushing Libra by providing a discount for all purchases made within Oculus.
But can Facebook create innovative products that people want? This is still an open question. After all, Facebook's success comes from out-executing on the ideas of others.
I haven't used a Portal, but isn't it video-focused? And doesn't it integrate Alexa?
I'm sure Facebook is aggressively trying to build a competing voice interface, but it's far behind Google, Amazon, and Apple. It will be harder to out-execute Amazon than Snapchat. And Facebook will need to improve its branding to even have a chance.