Tom Blintz, Absolute expert on... EVERYTHING.
Answered 136w ago
Voice assistants are useful for what they can currently do, yet limited by their design scope. Siri cannot open your garage door (yet; wait for Apple HomeKit). Cortana cannot do anything on your phone or computer that your fingers (or mouse) cannot already do. Google Now is pretty much the same.
Apple, Microsoft and Google want to lock you into their platform to keep you as a source of income... for them. Not useful to me, useful to them.
What is actually useful about the Amazon Echo is that Amazon has built in compatibility with several manufacturers' products and made it a fully operational Bluetooth speaker. But the main thing that will make the Echo not just a useful gadget but a household necessity is that Amazon has opened its platform up to third-party hardware vendors and software developers. This is where a limited device spreads its wings and takes flight, so to speak.
Today you can tell the Echo to turn on your lights, or dim the lights in one room or a group of rooms to whatever level you want. It can control electrical outlets. It can open your garage door and play music on Pandora or Spotify, as well as Amazon of course. Echo will play your TuneIn radio stations or access Audible books for your listening pleasure. As an entertainment hub, the Echo hosts your own game of "Jeopardy," which might make for a fun evening; just remember to answer in the form of a question.
Order pizza from Domino's, call an Uber, keep tabs on your car with Automatic (a device that plugs into your car's diagnostic port and makes a dumb car smart), or check out reviews on Yelp. These are all what Amazon calls "skills" for the Echo, meaning they are abilities written by third-party companies.
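To make "skills" concrete: in the Alexa Skills Kit model, a skill is essentially a web service that receives a JSON request naming the user's intent and returns a JSON response containing the speech to say back. Here is a minimal sketch in Python; the intent names are hypothetical stand-ins, not the real Domino's or garage-door skills.

```python
def handle_request(event):
    """Minimal sketch of an Alexa-style skill handler.

    Dispatches on the intent name in the incoming request and returns
    a response in the Alexa Skills Kit JSON shape."""
    intent = event.get("request", {}).get("intent", {}).get("name", "")
    if intent == "OrderPizzaIntent":       # hypothetical intent name
        speech = "Okay, ordering your usual pizza."
    elif intent == "GarageDoorIntent":     # hypothetical intent name
        speech = "Opening the garage door."
    else:
        speech = "Sorry, I don't know that one."
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech},
            "shouldEndSession": True,
        },
    }

request = {"request": {"intent": {"name": "GarageDoorIntent"}}}
print(handle_request(request)["response"]["outputSpeech"]["text"])
```

The point is that the voice recognition and intent parsing happen on Amazon's side; the third party only writes the dispatch-and-respond part, which is why the skill catalog could grow so quickly.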
Lifx and Philips Hue's connected lights, Belkin's WeMo line of smart switches, Ecobee's connected thermostats, and smart home platforms such as Wink, SmartThings, and Insteon all offer native support for the Echo.
All these things have made the Echo actually useful, especially for the elderly or the disabled.
And, beyond TODAY, the Echo's voice recognition will enable devices made by other companies to be just as smart, because Amazon is making its Alexa Voice Service available to third-party manufacturers. Is Apple or Microsoft likely to do that? I think not.
Some info in this answer from:
Sizheng Chen, worked on Microsoft Speech Server and Response Point (2006-2009)
Answered 136w ago · Author has 198 answers and 979k answer views
It is useful.
It helps reduce task completion time and people see the value.
For the first part, let's take a look at how speech/voice recognition applications have evolved.
- IVR (Interactive voice response)
Speech recognition and its usefulness are actually not new. IVR over the telephone became popular among call centers in the 2000s.
It works perfectly for voice dialing, call routing, and simple data entry. It dramatically reduces costs for call centers, and for savvy users who are familiar with the workflow, it also helps them complete their tasks faster.
- Windows Speech Recognition
Microsoft first introduced speech recognition in 2002. It was not treated as a separate product but rather as a new user interface within Windows.
At the time, it was tremendously useful for disabled people, letting many of them interact with their computers for the first time. I read a lot of stories from people describing how it helped change their lives.
To use Windows Speech Recognition a microphone was required, and since people were close to their keyboards anyway, it didn't reduce task completion time for those who didn't need it as an accessibility feature.
It was more like a novelty feature for most people at that time.
- Siri, Google Now and Cortana
The next wave came with the mobile internet. In the early 2010s, suddenly everyone carried a microphone, and almost all voice recognition began to happen in the cloud.
Cloud computing enabled the collection of much more raw voice data and faster model training iteration. Now almost all the major players (Cortana, Siri, Google Now, and Nuance Communications) build their speech recognition engines on deep learning methods; this was unimaginable before the cloud computing era.
- Amazon Echo and Kinect
The Amazon Echo embraces far-field voice recognition. Instead of pressing a button on your smartphone, you say a wake word and the conversation with your voice assistant begins, even when you are several feet away from the Echo.
This is a BIG change.
You are no longer required to carry your smartphone to start a conversation with your voice assistant; that dependency is gone. The overall "bootstrap" time is also reduced, from pulling your phone out of your pocket and pressing a button to simply saying a wake word. (By the way, the always-listening wake-word approach is too expensive for smartphones, because even standing by drains the battery quickly. So although Siri supports "Hey Siri," it requires the phone to be charging.)
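The wake-word flow described above can be sketched as a simple state machine: stay in a cheap listening state until the wake word fires, then capture the utterance until silence. A toy illustration follows; audio is simulated here as a stream of already-recognized words, whereas a real device scores raw audio frames with a small on-device model before streaming anything to the cloud.

```python
WAKE_WORD = "alexa"

def run_assistant(word_stream):
    """Toy wake-word loop: ignore everything until the wake word,
    then buffer words into an utterance until a silence marker."""
    transcripts = []
    awake = False
    buffer = []
    for word in word_stream:
        if not awake:
            if word.lower() == WAKE_WORD:   # cheap on-device check
                awake = True                # begin the "conversation"
        elif word == "<silence>":           # end of utterance
            transcripts.append(" ".join(buffer))
            buffer, awake = [], False
        else:
            buffer.append(word)
    return transcripts

stream = ["hum", "Alexa", "turn", "off", "the", "lights", "<silence>", "chatter"]
print(run_assistant(stream))  # → ['turn off the lights']
```

Note how everything before the wake word and after the silence marker is simply dropped; that asymmetry (cheap always-on detector, expensive recognition only on demand) is what makes far-field listening affordable on a plugged-in device.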
Actually, to give Microsoft credit, Kinect was the first widely recognized commercial product to use far-field voice recognition (but, again, as expected, Microsoft missed the opportunity to apply it to the smart home market).
From IVR over the telephone to today's Amazon Echo, which does not require you to be close to the microphone, voice interaction is becoming more and more accessible to us.
For the second part, let's look at several use cases, and let's judge them by task completion time.
- Voice control of other devices
Everyone now has the "basket full of remote controls" problem.
A smartphone app is one option, but building a generic, easy-to-use UI is a big challenge.
Voice control, on the other hand, could be one way to solve that problem. "Turn off the living room light," "set an alarm for 8 AM," and "watch Modern Family" are natural ways for people to think about these tasks, and voice is probably faster than juggling different apps on your smartphone.
With protocols like IFTTT, more and more devices are connecting themselves to the internet. This presents a huge opportunity for devices like the Amazon Echo to tie them all together. In fact, the Amazon Echo already integrates with many smart devices via its Amazon Alexa channel.
Protocols like IFTTT will also let future players (I'd bet my money Google will be there) connect to all the same devices the Amazon Echo can.
Voice control in these scenarios obviously saves time.
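As a concrete illustration of this kind of glue, IFTTT's Webhooks ("Maker") channel triggers an applet by hitting a URL of the form `https://maker.ifttt.com/trigger/{event}/with/key/{key}`. A minimal Python sketch follows; the `lights_off` event name and the key are hypothetical placeholders, and the actual network call is left commented out.

```python
import urllib.request

IFTTT_KEY = "YOUR_WEBHOOK_KEY"  # placeholder: a real key comes from the Webhooks service settings

def trigger_url(event, key=IFTTT_KEY):
    """Build the IFTTT Webhooks trigger URL for a named event."""
    return f"https://maker.ifttt.com/trigger/{event}/with/key/{key}"

def turn_off_lights():
    # Fires the (hypothetical) 'lights_off' event; an IFTTT applet
    # would map that event to the smart-bulb action.
    url = trigger_url("lights_off")
    # urllib.request.urlopen(url, data=b"")  # uncomment to actually send
    return url

print(turn_off_lights())
```

A voice assistant sitting in front of a hub like this only has to map an utterance to an event name; the applet on the other end decides which bulbs, switches, or thermostats respond.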
- Routine queries
"What's the weather today?", "What's the best route to my work?", "What's on my grocery shopping list for today?"
These are common / routine questions we ask our voice assistants. It is feasible and easy to train voice assistants to understand these queries and all their variants well.
A lot of these tasks can be done by voice in parallel with other activities, which, if it doesn't save time outright, at least compares well with doing them on a smartphone.
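A toy sketch of why routine queries are tractable: each query type only needs a handful of patterns covering its common phrasings. Real assistants train statistical models over many thousands of variants, but the shape of the problem looks like this (the intent names and patterns here are purely illustrative):

```python
import re

# Each routine intent gets a few regex patterns for common phrasings.
INTENTS = {
    "weather": [r"\bweather\b", r"\brain(ing)?\b", r"\bforecast\b"],
    "commute": [r"\broute\b.*\bwork\b", r"\btraffic\b"],
    "shopping_list": [r"\bshopping list\b", r"\bgrocery\b"],
}

def classify(utterance):
    """Return the first intent whose patterns match the utterance."""
    text = utterance.lower()
    for intent, patterns in INTENTS.items():
        if any(re.search(p, text) for p in patterns):
            return intent
    return "unknown"

print(classify("What's the weather today?"))          # → weather
print(classify("What's the best route to my work?"))  # → commute
```

Because the set of routine intents is small and closed, variants can be enumerated or learned cheaply; this is exactly why these queries work well long before open-ended understanding does.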
- General voice search
I actually don't think voice search in general is a good use case. Sure, you see a lot of cool demos from Siri to Amazon Echo. But if you have tried it yourself, you will find more frustration than satisfaction.
Putting voice recognition accuracy aside, to get the aha moment the result of the search needs to be short. Text-to-speech is a low-throughput interface, for obvious reasons.
Listening to more than one minute of text to speech voice is really a user experience disaster.
The result also needs to be deterministic. Unlike text search, where people are comfortable with multiple result options, getting multiple results back from a voice interface is a disaster.
It is pretty obvious that we don't save time asking general questions over a voice interface, so I don't expect people to switch the majority of their search activity to voice in the foreseeable future.
There are two big areas we should expect the industry to improve in the coming years.
- Voice recognition accuracy
With more and more user data collected and ever more advanced machine learning techniques, we should expect voice recognition engines to understand us better and better.
We should also expect engines to interpret our voices in a more personalized and contextual way, which is essentially what humans do to understand each other.
- Far field voice recognition
The farther you are from the voice source, the lower the signal-to-noise ratio. Capturing voice well at a greater distance, often with background noise, is a big challenge for hardware design and signal processing.
Ideally, we should be understood anywhere in the house, not only in the living room, so we can say "turn off the living room lights" from the bedroom or "play Lady Gaga's new album" while we are in the shower.
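A rough way to quantify the far-field challenge, under idealized free-field point-source assumptions: received speech power falls off as 1/r², so SNR drops about 6 dB for every doubling of distance while the room's noise floor stays put. A small sketch (the reference SNR and distance are made-up illustrative values, not measurements):

```python
import math

def snr_db(distance_m, ref_snr_db=30.0, ref_distance_m=0.5):
    """Idealized free-field model: speech power falls off as 1/r^2,
    so SNR in dB drops by 20*log10(r/r_ref) as distance grows,
    assuming a constant noise floor."""
    return ref_snr_db - 20 * math.log10(distance_m / ref_distance_m)

for d in (0.5, 1, 2, 4):
    print(f"{d} m: {snr_db(d):.1f} dB")
```

Going from half a meter to four meters already costs about 18 dB in this model, which is why far-field devices lean on microphone arrays and beamforming rather than a single microphone.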
I intentionally won't quote any internet stats on whether "usage and retention of Amazon Echo or Siri are growing," because I believe in the future of voice assistants in the scenarios described above, and I can see them reducing task completion time. Voice assistants are also becoming a standard feature of smartphones, automobiles, and future smart devices.
The best is yet to come.
Scott Danzig, Senior Software Engineer at Lockheed Martin
Updated 136w ago · Author has 2.1k answers and 8.5m answer views
The future of these new personal assistants, including Amazon Echo, Apple Siri, Microsoft Cortana, and Google Now, is bright. As if the concept were an infinitely absorbent cloth that has only been nominally moistened, with the world trying to wring out the meager drops of value that have soaked in, we still have not seen this cloth generously soaked beneath an open faucet.
Ask Amazon Echo "what it wants to be when it grows up", and you open a window to the ambitions of the players within this market: "I want to be the computer from Star Trek." Sci-fi is the stuff of our dreams. When it becomes a reality, it is a sign of the very ascension of our race. Just like the day when Tony Hawk rode a hoverboard, such will be the case when our machines truly become our companions.
Internet-of-Things has been branching out from the realm of the home owner. While a personal assistant is a natural fit for controlling things like:
- security systems
- garage doors
- coffee makers
technologies such as Amazon Echo-controllable wireless lightbulbs (Philips Hue, Belkin WeMo) are exceedingly simple to purchase and use, with an immediate value proposition even for a renter, as are assistant-friendly TV boxes such as Amazon Fire TV and Apple TV. Just buy the components, which are becoming increasingly affordable, and the personal assistant will help you operate them. More and more devices will leverage the APIs of such technologies and be operable out of the box by at least one.
Cloud-based technologies are also a natural fit for personal assistant technologies. Massive computing done in the server farms of corporations is a double-edged sword that consumers are willing to accept. You make my life easier, and in exchange, you can learn what I actually want to buy from you to fill in the gaps. Apple, Microsoft, and Google all leveraged existing hardware platforms to deliver personal assistant technologies with reasonable accessibility. However, Amazon, with the failure of their Fire phone, instead chose to debut with an impulse purchase-friendly flagship device spotlighted on their powerful online shopping platform. Even Amazon, however, has quickly learned the benefit of leveraging the mobility of the cloud, recently unveiling the mobile Amazon Tap.
Increasingly and necessarily, cars will leverage CarPlay-like technology not merely for convenience but also for safety, such as the obvious benefit of keeping one's eyes on the road. Personal assistants are a natural extension, allowing personalized control and interaction.
HealthKit will likely be more pervasive, however, as it is much more a financial benefit for powerful health insurance companies than a mere luxury item. Currently, I wear a pedometer that I sync daily with my smartphone, and the number of steps I've taken gets uploaded to Aetna, earning me increasing insurance discounts: as much as $600 in a year. You'd better believe that's enough to get the attention of an indie filmmaker such as myself. Technologies such as HealthKit can take that to the next level, monitoring my vitals, ensuring I take prescriptions as prescribed, and reminding me to get adequate sleep; I'm sure one day we'll all be earning discounts on our dental insurance by brushing and flossing sufficiently. Your personal assistant will be right there with you, reminding you vigilantly and cheering you on, of course within the bounds of your appreciation.
The best sign for the market is the number of major technology corporations competing individually. Siri was certainly off to the races first, but its limited ecosystem is giving alternatives solid purchase.
(Google search trends over time: Siri = blue, Echo = red)
You can bet that Google's "Now" is being used quite heavily as well.
Competition inspires innovation. While often not direct innovation, to be sure, certainly the prospect of incredibly lucrative acquisition has been a boon to academic and entrepreneurial research. Apple bagged Siri itself via the acquisition of Siri Inc, a spinoff of SRI International, while Amazon has quietly purchased companies such as Evi, Yap, and Ivona to add to its own technology portfolio. Open source will concurrently and collectively advance the field thanks to efforts such as the Kickstarter-born Mycroft project.
My personal take:
I own the Amazon Echo and have used Apple Siri quite a bit. Amazon has put its marketing machine to work for the Echo, and certainly more and more units are being sold. Barring frequent cases of cognitive dissonance theory in action, most buyers are eventually disappointed by the true limitations of the Echo, not even counting its being conspicuously restricted to one room. Personal assistants attached to smartphones are growing more in use than in number, as people explore the capabilities their phones already have. Currently, these products are occasionally useful if you learn what they handle reliably and work to maximize their potential.
However, that's not what these assistants are being billed as. There are serious impediments that remain -- the largest of which is lack of context-based understanding. You don't have the freedom to diverge from tasks that the manufacturers have put in an effort to support. They all will tell you the local weather. They can help with the scores of major sports. The most useful I personally have found Siri for is as a hands-free device, sending small text messages while driving.
However, if you ask the Amazon Echo "What sound does a cat make?", it cannot respond with a simple meow, while "What sound does a fox make?" gets a comical canned response of gibberish based on the pop song. On occasion, I test the current state of Siri while with friends, when a question comes up that I'd think Siri should reasonably be able to handle. The room goes silent and I ask the question, but we're in a car, or the question uses a word Siri isn't used to. It's hard enough to get the question across. The best I can hope for in 98% of those cases is a relevant web search. And while the search was performed hands-free, you certainly need to redirect your attention to the screen to analyze the results.
There has got to be some semantic understanding for these products to reverse the translation of intent. Right now, I have to figure out how to say things rather than the machine figuring out what I'm trying to say.
Let's talk about how you figure out what Echo and Siri CAN handle. Generally, you can use common sense, based on what would be the obvious most common requests that are simple enough that they would reasonably be able to handle. What is the weather? What is on my calendar today? But otherwise, I find myself having to search for "lists of examples".
However, while I'm not expecting a "technological singularity", where artificial intelligence is born and these personal assistants are spontaneously upgraded with sentience, I do expect major advances in knowledge engineering/semantic awareness to happen in the next decade, not in company research labs, but rather in universities. A PhD dissertation, likely at MIT or Carnegie Mellon, will present work that demonstrates a significant advance in the field, and this doctorate degree will be paired with a patent-based startup and subsequent acquisition. Soon we will notice one of the phalanx of personal assistants we may or may not have purposefully acquired is growing in capability, with others to follow as interest in the field is renewed and patents expire. Personal assistants will transcend from being glorified alternative input-devices to being companions that can offer meaningful understanding to your true needs.
A note about the future:
While Star Trek's computer may very well represent the ascension of our race, the true accomplishment is not speech recognition-based instruction. It was made clear to me when I grabbed a bottle of mouthwash placed near an Amazon Echo and took a swig, intending to rinse for a minute before spitting it out.
Echo seemed the ideal assistant to tell me when the minute was up, but there was an obvious problem. I shrugged, forced to guesstimate. This is also an obvious limitation for the Deaf and hard of hearing, along with anyone in an unsuitably noisy environment.
Eventually, will devices be able to read our minds? I don't think that question is particularly relevant to our ascension, as reasonable albeit limited means of communication can be provided for most situations before the likes of which is possible. The true ascension will be through an "assistant's" comprehension of intent. The context and semantics are what's important, and that is where this field will inevitably take us, perhaps in short order, if the market for such is truly leveraged to its potential.
Chuck Rogers, Former Chief Evangelist for MacSpeech, Inc. Managed marketing and tech support.
Answered 136w ago · Author has 12.5k answers and 23.8m answer views
First of all, usage and retention of speech recognition products are definitely growing. This is largely because accuracy has finally, over the past few years, gotten good enough to actually be useful. So accurate, in fact, that little or no training is needed to use a speech recognition product. (It used to be that you had to read several training stories aloud to get even adequate accuracy.) Another sign of growth is that Amazon just added two new devices to its line of Alexa-enabled devices: the Echo Dot and the Amazon Tap.
As to whether or not it is a novelty, this depends entirely on how each individual system is used. At first, I think it is a novelty for pretty much everyone. It is fun to ask about the weather or for the answer to a trivia question and have the speech recognition agent (whether it be Siri, Amazon Echo, Google Now, or Cortana) answer with accurate information. And there is no doubt that for some people the usefulness of these devices ends there.
To be sure, there are plenty of places a voice interface would not be desirable: a movie theatre, church, or perhaps an open workplace where people sit at desks in a shared room. There are also environments where a speech interface would not be very useful due to potential recognition issues, such as a crowded nightclub or a loud construction site.
But being able to interact with both the environment around you and retrieve information from the web with your voice can be powerful and enabling. In a sense, your voice becomes a "third hand" which can assist you at times your physical hands are busy doing something else. Two places a voice interface really comes in handy are the kitchen and in the car.
I use Alexa pretty extensively when I am in the kitchen. While preparing a meal, your hands are often involved in a variety of activities that make it difficult or undesirable to use a tablet or computer. With Alexa I can listen to just about any kind of music to suit my mood, and I can skip songs I don't like. (Obviously, I can also ask for songs by name.) But it goes much further than that. I can also listen to the latest news and ask for conversions (which really comes in handy: "Alexa, how many ounces is 350 milliliters?"). When I am done and put the meal in the oven, I can have Alexa set a timer for me. When the timer goes off, the Echo blinks my Philips Hue lights in the living room, so I don't have to worry about missing the timer if I'm not in the kitchen.
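That conversion question is simple arithmetic under the hood; a tiny sketch of what the assistant computes, using the US fluid ounce (29.5735 ml):

```python
ML_PER_US_FL_OZ = 29.5735  # 1 US fluid ounce in milliliters

def ml_to_fl_oz(ml):
    """Convert milliliters to US fluid ounces."""
    return ml / ML_PER_US_FL_OZ

print(f"{ml_to_fl_oz(350):.1f} fl oz")  # → 11.8 fl oz
```

The hard part for the assistant isn't the math, of course; it's recognizing "ounces" and "350 milliliters" from across a noisy kitchen and mapping them onto the right conversion.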
In the car, a voice interface is definitely not a novelty. Cars have had voice interfaces for years, although not a lot of people I know use theirs. Car manufacturers include them because they allow drivers to keep their eyes on the road instead of on the car's controls. But the number of people using voice recognition in cars (whether the built-in system or something else like Siri or Google Now) is growing. Since people won't give up using their cell phones, a voice interface becomes critical in keeping people's hands on the steering wheel instead of holding their phones.
Outside of the limitations imposed by where you are using these voice assistants, there is the issue of context. I am convinced that this is the biggest barrier to widespread adoption of voice interface systems. While it can seem pretty impressive to ask Siri "what are the best BBQ places around me?" and get good results, the reality is that all of these systems have a ways to go to fully understand the context of what is being asked most of the time. The way one phrases a question also matters, forcing users to try different phrasings in order to get a valid answer.
Finally, the range of questions that can be asked is still pretty limited. All of these systems seem to handle playing music, doing conversions, or finding information that is already out there on the internet rather well. But ask anything complicated and you will get an answer like "sorry, but I do not understand the question." This is the area where we will see the most improvement in the coming years: as these systems learn more about the people using them, they will become better at making predictions and recommendations.
Jared Zimmerman, Leads Google Voice Platform & Product Design; Former Director of UX at Wikipedia
Updated 135w ago · This answer won a Knowledge Prize · Author has 676 answers and 1.4m answer views
tl;dr: yes, but we still have work to do getting past the "cool" stage and into the utility stage.
As humans we've been talking for a while, likely talking to each other for 100,000 years or more; needless to say, we're pretty good at speaking and listening. Eventually we started writing things down. Then we made computers, and given the state of things at the time, it was easier to teach humans to use computers than the other way around. I like to think of that as the dark age of human-computer interaction. I think we're about to reach the golden age soon…
I know a bit about voice user interfaces and voice assistants from my role at Google…
Computer Voice Interfaces have been around for 10+ years in the form of