It's a company with the time, resources, and customer awareness to do this. They can. They have chosen not to.
Why are people acting surprised about this? To me it has always been more than obvious that whatever you say to any of the internet-connected “assistants” will be stored and kept as long as the company that owns the assistant likes.
There's a difference between "it's obvious that they're doing it" (which is only true for people above a certain paranoia line) and "here, look at this proof that they're doing it." There are plenty of things that are obvious to me based on my idea of people's motivations and past behavior, but if I talk to someone lower down on the paranoia spectrum they don't believe me - unless I can point to evidence that they're actually doing it. (A fair thing to ask for, after all.)
This reminds me of the announcement that Google will let you delete your history.
Yeah, right, when pigs fly. Your history powers their "AI". Ain't no unlearning that.
It seems like you're talking about the difference between suspicion and proof? Saying "it's obvious they're doing it" when you mean you suspect they're doing it is just going to start an argument.
I was using it in the sense of the parent comment, i.e. if you put $100 in a room unsupervised with a career thief it's "obvious" that he'll walk out with it. Another way of phrasing it would be, "anyone who isn't suspicious appears negligent."
It's not obvious from interacting with the device that your voice data leaves, or that there's any permanent record made of your commands. That's something you have to know ahead of time.
You don't think it's obvious given that the device is completely non-functional without internet access?
I think this is obvious; however, I also understand that most people aren't that clever/interested in these things, and would not realize it. This creates a dilemma of how to message this information in a way that is at once concise (so as not to waste people's time), informative, and not alarmist, since freaking people out is counterproductive not only to having a viable product but also to the general adoption of technology.
What's obvious to tech workers is not obvious to the general population. Most people don't give a second thought to putting all of their communications with friends and family on facebook's unencrypted messenger app (apparently the new version uses E2E encryption but those millions of person-years of chat logs aren't going anywhere).
It's not surprising at all. It is an absolute disgrace though.
Just yet another reminder that GDPR is the bare minimum for something like internet to be tolerable.
So like probably the majority of useful text boxes everyone types into on the internet. Great news, thanks for the info
This is why we are building 100% offline and private-by-design Voice AI at https://snips.ai, it is free for makers and works in english, french, german, japanese, spanish, italian, and more coming!
I don't see any guarantee that if the platform gets popular and the company acquired, privacy would stay the same. I have seen the same pattern too many times...
Thank you for that. Voice recognition makes for awesome devices, but we don't need every gadget connected to the net sending our words elsewhere. Keep up the good work!
Somewhat related: is there a lib or common api for these IFTTT type actions at this point or is everyone building them from scratch?
This is exactly what I was looking for and was my only qualm about writing a voice assistant! Will check this out
Which voice engine do you use?
Thanks to the sacrifice of these clueless users (or at least a good part of them are) the era of offline assistants is near looking at what Google has shown recently.
> Thanks to the sacrifice of these clueless users (or at least a good part of them are) the era of offline assistants is near looking at what Google has shown recently.
I think I'm being clueless, but I can't figure out what this sentence means. Is there a typo in it?
I think gvand means that these users are training the AI models to eventually provide a similar level of service offline. The "cluelessness" is just a value judgment of the users.
I was referring, in a convoluted way, to the fact that all the data collected has been/will be used to train the models that will allow offline voice recognition (like the one Google showed at I/O last week).
I might be mistaken, but the reason we don't see offline recognition (amongst other things) is hardware limitations, not the lack of training data. The small onboard chip doesn't have that much compute power, so they offload to more powerful Amazon/Google servers that can run the inference.
> amongst other things
I think that this is an important point. Obviously there's more computing power available in Apple/Google/whoever's data centres than on my device, and I'm sure that is, or at least was, a concern; but I also don't believe that they are indifferent to the utility of sitting on such a huge volume of user-submitted, real-world data.
Which internet company isn't perma-storing any and all 'valuable' bits that happen to be routed through their network?
In theory DuckDuckGo. Just because someone can, does not make it good/right.
Is there a law regarding whether the presence of these bugs should be disclosed when you are a guest on business premises or in someone's home?
I can’t believe that people actually pay for these things... Alexa has always seemed like a gimmick to me. I have an Ecobee smart thermostat that has integrated Alexa, which I promptly disabled after installation.
What's wrong with gimmicks? Voice control itself is not a bad idea, the interface to control your technology is something you are born with :)
Gimmicks are short-lived for fundamental reasons. I'm going to sound like a broken record but, as someone who has owned multiple Alexa devices and quit their job to start a voice-UI business (didn't pan out), the current generation of "smart speakers" are a joke when it comes to doing anything serious besides asking for the weather or toggling light switches. I have no doubt that they are useful to people in very limited ways, but they are still sold as being a lot smarter than they really are. For what they provide, the privacy trade-off is too great for me.
Alexa is a gimmick because it's a speech-to-text command line, and it's sold as being smart even though it's not. Since before I was a kid in the 90s, there have been many attempts to revolutionize computing with speech-to-text technology. Because speaking comes so naturally to us, it's easy to assume that voice-activated anything is better than pushing buttons. In reality, without intelligence and autonomy, lots of interfaces are made slightly worse with voice activation. For those who aren't visually impaired, the ability to use voice to turn off lights barely even makes sense. Alexa frequently gets things wrong and activates from sounds that aren't even close to the wake-word. The ability to create lists is barely practical because it so often can't understand a word, in which case the user has to go to the Alexa app and manually punch in the item.
Voice control would be great if it were revolutionized, but it's hardly in a different state than it was decades ago. The only two things that have changed are improved speech synthesis and ample cloud computing. Because of this, most people I know who own one barely use them beyond a select number of features that are hard to get wrong like "Alexa, what's the weather?" or "Alexa, what time is it?". My parents still sometimes use it to play music (which I gave up on as a music fan), but it gets requests hilariously wrong 1 time in 5.
"Besides [a bunch of useful stuff], Alexa is useless! Therefore it's a gimmick."
I have an Echo Dot which I use almost exclusively for a few static purposes: the weather, setting alarms, turning my lights on and off, and asking what time it is.
I also ask it basic questions like whether various sports teams won or what time they're playing, which it also answers well.
Not sure why you think it is useless just because it's not a magical general AI that can do everything.
I find it extremely useful.
> For those who aren't visually impaired, the ability to use voice to turn off lights barely even makes sense
I'm not visually impaired and I use this feature all the time. It allows me to turn off the light when reading in bed without having to get up and walk to the light switch.
> most people I know who own one barely use them beyond a select number of features that are hard to get wrong
Yeah, exactly. How does this make it useless?
The point is that these few useful things can be all achieved much cheaper with much greater ergonomics.
> the weather, setting alarms, (...) and asking what time it is.
You can do that on your phone. Even assuming you wanted it hands-free, it doesn't justify an always-on microphone sending data to the cloud. We had the tech to do this level of voice recognition reliably in the 2000s.
> It allows me to turn off the light when reading in bed without having to get up and walk to the light switch.
Kids from my generation used to solder clap detectors for like $5-$10, and they're already more reliable and faster to use.
Voice is cool, it's like being in Star Trek. I get it, I built my own system to control music in the 2000s, complete with audio responses snipped from Star Trek shows. But the feeling of "living in the future" wears off pretty quick, and you're left with a ridiculously expensive and user-hostile gimmick.
> an always-on microphone sending data to the cloud.
This is a dishonest characterization of how every smart speaker in existence works. There is no continuous stream as this statement implies.
There is a continuous buffer of a couple of seconds for the device to locally catch a wake word. (You can verify this by disabling the device's internet connectivity - it will still catch the wake word and speak an apologetic message about not being able to connect). Also, changing the wake word requires a full restart, which says "firmware" to me.
After the wake word is spoken, that buffer and anything immediately after it is what gets sent up to the cloud for voice recognition.
The buffer serves a purpose in that it prevents an awkward pause between the wake word and the action requested. (So you get to do "Alexa, turn on the lights" rather than "Alexa? bong Turn on the lights.")
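To make the mechanism described above concrete, here is a minimal Python sketch of a rolling pre-roll buffer with a local wake-word check. It is only an illustration of the idea as described in this comment, not how any actual smart speaker is implemented: the `WakeWordBuffer` class, the `detector` callback, the frame counts, and the `uploads` list (standing in for "sent to the cloud") are all hypothetical, and strings stand in for audio frames.

```python
from collections import deque


class WakeWordBuffer:
    """Keeps only the last few audio frames until a wake word is detected."""

    def __init__(self, detector, preroll_frames=3, capture_frames=5):
        if capture_frames < 1:
            raise ValueError("capture_frames must be at least 1")
        self.detector = detector  # hypothetical local wake-word check
        # A bounded deque drops the oldest frame automatically when full.
        self.preroll = deque(maxlen=preroll_frames)
        self.capture_frames = capture_frames
        self.remaining = 0   # frames still to capture after a trigger
        self.pending = []    # frames collected for the current upload
        self.uploads = []    # stands in for "sent to the cloud"

    def feed(self, frame):
        if self.remaining > 0:
            # Already triggered: collect the command that follows the wake word.
            self.pending.append(frame)
            self.remaining -= 1
            if self.remaining == 0:
                self.uploads.append(self.pending)
                self.pending = []
            return
        self.preroll.append(frame)
        if self.detector(frame):
            # Trigger: the buffered pre-roll (wake word included) goes along.
            self.pending = list(self.preroll)
            self.preroll.clear()
            self.remaining = self.capture_frames


# Usage: strings stand in for audio frames.
frames = ["tv", "chatter", "more chatter", "dinner talk",
          "alexa", "turn", "on", "the", "lights", "private talk"]
buf = WakeWordBuffer(detector=lambda f: f == "alexa",
                     preroll_frames=2, capture_frames=4)
for f in frames:
    buf.feed(f)
print(buf.uploads)
```

In this toy run only the pre-roll, the wake word, and the four frames after it end up in `uploads`; everything else is overwritten or never leaves the buffer. Note that the pre-roll also carries along a little of whatever was said just before the trigger, and that the whole scheme is only as good as `detector`.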
The "user hostility" and "gimmick"-ness of this design is entirely subjective and quite overblown, in that "nobody will ever use Dropbox when they could just use rsync and a cronjob"-type bias that HN tends to have.
I'd say it beats the alternative from a pure functionality standpoint.
In an ideal world that's true. But Alexa doesn't only upload your audio when you say the wake word, it uploads it when it thinks you said the wake word. Unless the recognition is 100% accurate, that's an important difference.
There was the one very publicized instance where it recorded private conversations and sent them to a random contact, but you have to wonder how many other non-Alexa commands have been picked up and sent off to Amazon even though they shouldn't have been.
So even if you trust Amazon with recordings of your weather queries, you also need to trust Amazon's wake word recognition to not pick up anything other than your weather queries.
And there was that other time when they took 1700 of their recordings, accidentally sent them to the wrong person, and a journalist was able to track down the recorded person based on their personal information from the recordings:
So, I used to think the same thing, and have several Alexa devices in my home. However, last week I got an ad from Amazon about shower head installation. I have never searched for that on any device, nor asked my Alexa about it. But I did have a meatspace conversation about it the day before I got the ad.
Maybe I said something which triggered it? No idea, but I'm sure that Alexa listening in caused the advertisement to show up.
I promptly set them all so they no longer listen and I'm trying to figure out a better solution (HomePod? Mycroft?). I absolutely love the (limited) capabilities smart speakers provide. But I'm done with Alexa.
> [description how Alexa works]
Citation needed. I don't think the actual process is publicly available (though I may be wrong). Call me too cynical, but given the bugs that happened to both Google and Amazon, and that the law can compel companies to do things with leased hardware they ordinarily wouldn't, I don't really trust it.
> The "user hostility" and "gimmick"-ness of this design is entirely subjective and quite overblown, in that "nobody will ever use Dropbox when they could just use rsync and a cronjob"-type bias that HN tends to have.
It's not like that. Dropbox is a simple service - syncs files, optionally allows sharing them. Doesn't do any weird stuff around it. Doesn't force an Internet round-trip for things that can be done locally. The user hostility of voice assistants comes from the way it's designed - cloud-based voice recognition instead of local one, forcing a cloud round-trip for doing things entirely over LAN, being tied to an "ecosystem" of third-party integrations you can't easily manage or complement with your own as end-user, etc. The "gimmick-ness" part comes from it doing very little, and doing it worse than alternatives.
> Citation needed. I don't think the actual process is publicly available (though I may be wrong). Call me too cynical, but given the bugs that happened to both Google and Amazon, and that the law can compel companies to do things with leased hardware they ordinarily wouldn't, I don't really trust it.
I don't think you need a citation here. The network traffic is not sufficient to send always-on audio and the power consumption is not sufficient to do always-on speech-to-text or other similar analysis.
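For a rough sense of the traffic argument, here is a back-of-the-envelope Python calculation of what a continuous audio upload would add up to per day. The sample rate, sample size, and compression ratio are assumptions chosen for illustration, not measurements of any device, and `continuous_upload_bytes_per_day` is a made-up helper name.

```python
SAMPLE_RATE_HZ = 16_000      # assumption: a common rate for speech audio
BYTES_PER_SAMPLE = 2         # assumption: 16-bit mono PCM
SECONDS_PER_DAY = 24 * 60 * 60


def continuous_upload_bytes_per_day(sample_rate_hz=SAMPLE_RATE_HZ,
                                    bytes_per_sample=BYTES_PER_SAMPLE,
                                    compression_ratio=1.0):
    """Bytes per day if the microphone stream were uploaded non-stop."""
    return sample_rate_hz * bytes_per_sample * SECONDS_PER_DAY / compression_ratio


raw = continuous_upload_bytes_per_day()
compressed = continuous_upload_bytes_per_day(compression_ratio=10.0)  # assumed 10:1 codec
print(f"uncompressed: {raw / 1e9:.2f} GB/day")
print(f"10:1 compressed: {compressed / 1e9:.2f} GB/day")
```

Under these assumptions a non-stop stream would be hundreds of megabytes to a few gigabytes a day, which is the kind of volume you could compare against your router's per-device traffic counters.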
>Also, changing the wake word requires a full restart, which says "firmware" to me.
Well to me it says "static variable".
Your use case almost exactly mirrors my own. Even if weather, alarms, time & lights was all it ever did, it has 100% been worth the purchase price.
It is genuinely useful to have a no-hands-required timer in the kitchen, and being able to turn off the bedroom lights when I'm done reading for the night without having to reach for a switch is great.
I was even pleasantly surprised by how much I enjoyed Alexa's Skyrim. Sure, it's really more of a joke as it is, but it made me think that some choose-your-own-adventure skills would be a lot of fun.
(Asking Alexa to play white noise to help me sleep has been nice, too.)
The issue here is that a not too powerful local device could do that without recording everything you say in the cloud and keeping it forever with a label with your name on it.
I don't disagree on that point at all. Just commenting on how even if these devices aren't particularly smart, they still provide useful functionality.
Call me crazy, but paying 30 dollars to lose all of my privacy just so I don't have to get up to turn off the light at night seems like a poor trade. Maybe not useless, but I honestly think you can do better.
And a basic raspberry pi could be programmed to cycle through the weather, alarms, lights, time. I guess the crux of doing it that way however, is taking personal responsibility for security, which still seems better if you're slightly lax at it, than sharing your "house microphone" with a multi-billion dollar company with motivation to exploit it.
I already own other microphones, for example my smartphone, laptop, various earbud/microphone combos, etc.
I already trust all those manufacturers not to secretly upload everything I say to the cloud. Why is amazon any different?
Not that I agree 100% with this argument, but it's the best I've heard.
With laptops, smartphones, earbuds, etc., they're not designed to hear you from across the room, or even from your pocket. Even when my phone's on speaker across the room, people will have problems hearing me properly. With Alexa, Google Home, and others, the microphones are specifically designed to hear you from a distance away, over music and other ambient noises.
Simply put, an Alexa or Google Home is a much more effective listening device.
Because it does upload everything you say to the cloud, and saves it for an undisclosed amount of time, and has the ability to link those recordings to your account and the motivation to exploit it for business purposes? Not to mention possible data breaches in case that data is not perfectly secured.
> Because it does upload everything you say to the cloud
What evidence do you have of this?
That's a necessary component of how it works: these devices don't do the speech recognition locally. They upload it to their cloud and run it through their machine learning network.
They're not supposed to upload "everything" you say, only the stuff you say after the wake phrase. The issue is that they're known to wake up and start recording when they're not supposed to, because they're not perfect about recognizing the wake phrase. There's also the fact that they have to be _listening_ all of the time in order to hear the wake phrase, and the security fear is that, due to a bug or malicious software, they could end up recording and uploading everything they hear as well. That's not a problem of the stated intent of the device, it's a problem of the latent capabilities being used improperly. The best way to secure the device is to remove those capabilities.
Yes, everything you said is true, but none of it contradicts my point.
"You could hypothetically be giving up all your privacy if there are bugs, and in some cases there are known bugs causing you to give up a limited amount of privacy" is a very different level of threat from "you are giving up all your privacy to Amazon who will use it for ad targeting".
I'm willing to accept the former, in return for the amount of convenience I get from the device.
However, others in this thread are arguing that nobody should accept the latter, which is irrelevant because it's not the real situation.
In fairness to your bedtime reading example, it’s very common for people to have bedside lights - even if they’re just little lamps that rest on a table. At least this is the norm in the UK. So for a great many of us, turning our reading light out would just be an extension of our arm.
My wife tends to prefer using the TV as her reading light. I find that rather bizarre but she likes the background noise. In any case, TVs have remote controls and sleep timers so that mitigates the need for voice control.
To be honest, I couldn’t think of a place I’d less want an Echo than the bedroom. Even the bathroom seems less inappropriate (eg you might want music when in the shower / bath).
True, but not everyone does, and for them, the Echo provides extra functionality. I've also added a lot of dimmable lights in my home, and voice control to set a light or set of lights at 50% is much much less impactful to whatever I'm doing than pulling out a phone.
Why am I getting downvoted? Do people disagree that it’s common for bedrooms to have lamps? A little context would be appreciated because my comment is 100% accurate in terms of my own experiences yet several people seem to disagree. Genuinely interested to know why.
I didn't downvote, but just to answer - yes, I have a nightstand light, like most bedrooms here in the US. However, I also use a CPAP machine, and it is a genuine convenience to be able to put it on with the lights on, then use words to shut the lights off instead of reaching over to the side and having to deal with a slightly-too-short hose length. And frankly it's nice to not have to leave one's blanket cocoon for light/dark, not even a little.
A small thing? Sure. But it is a genuine comfort that I enjoy.
To be honest I’d lump you into the same as the caveat made earlier about people with physical disabilities because even though you’re (presumably) able bodied, at night time your movement is restricted.
I can totally see the benefit for people in circumstances like your own (and those with greater concerns too). But for the average Joe? That’s another matter. Like any accessibility tool, I agree they are fantastic for those who depend upon it. However I wouldn’t advocate everyone gets a guide dog, hearing aid or wheelchair unless they actually needed it. Similarly would you actually want an Echo device in your bedroom if you didn’t have the accessibility concern that your CPAP machine creates?
Ultimately it’s harder to justify that privacy vs convenience trade off when accessibility isn’t an issue. But I accept that’s a decision which will be as divided as it will be personal. Myself; I have an Echo in the kitchen but I have no interest in allowing them to migrate upstairs.
Also thanks for the response. I do appreciate reading someone else’s perspective:)
>the current generation of "smart speakers" are a joke when it comes to doing anything serious besides asking for the weather or toggling light switches.
The primary use of my Alexa devices is being a voice-controlled IP radio. I paid about $300 many years ago for Logitech/Slim Devices' crack at this, the Squeezebox, and I loved it.
Now, I get that same functionality for $40 shipped with more on top. Everything else is a bonus, including the smart home stuff - being able to turn the lights on with grocery bags in hand is damn futuristic.
Does it screw up sometimes? Sure. But it beats the hell out of a keyboard or dials for the use case.
I've got a setup of google home devices, and that's pretty much all I use em for - the occasional stupid question and the rest of the time it's playing music that I can control with my voice, which I find incredibly convenient.
I completely agree with every single word you’ve posted there. From the history recap to the modern era problems.
These days the only thing my Echo does which is remotely useful is setting named timers while cooking. So I can have my hands dirty with raw meat and ask Alexa to set various timers for each step of the meal. I found that particularly good when cooking meals that have large gaps in time between stages (like Sunday roasts when there can be 5 minutes or more between the cooking times of different vegetables). However even there it sometimes becomes more trouble than it’s worth when it starts mishearing names of vegetables or duration numbers when spoken.
The most disappointing thing is that I spent a few days working with Alexa’s - frankly terrible - SDK to integrate it into my existing home automation (all stuff I’ve built myself and powered by a FreeBSD server). Not only was the development process of Alexa skills amongst the most frustrating I’ve had in my ~30 years of experience writing software; but it turned out to be a complete waste of my time because Alexa is so piss poor at any interactions more complex than the very basic (as you described). It’s also very laggy at such interactions so even when it does work it feels slow. So slow, in fact, that it ends up taking longer and being more painful using the voice control than it would have been to wake my laptop from sleep and trigger the same HTTP API endpoints Alexa would have used, but doing so manually from the command line using curl. So needless to say I very quickly gave up using Alexa for home automation.
I wasn't born knowing how to speak. I had to learn. Babies can use their fingers to point long before they can use their mouths to speak.
Also, based on every other case where I've tried voice recognition, I have to learn to speak in an extra-distinct and artificial way for a computer. (I don't think I have an unusual accent, but I can say "TRACK A PACKAGE" into the receiver all day long and not get where I want to go, even though no human would have trouble with it.) What's the point in that?
Did you write that comment with speech-to-text? Why not? If it's such a good interface, why use a keyboard to write text?
No, it's something you acquire--no one is born speaking.
The interface is sound and you can generate sound with your organs if you want to go full pedantic.
In that case every computer interface is controlled by things you’re born with.
People are also born with fingers.
It's similar to mobility scooters. Made for handicapped, but also popular among the lazy.
ha, great analogy but without the snooping long tail result of usage...
Me too, but now they are even giving them away. Google has offered me a free Google home several times now. Thank you, I don't need your spy device in my home.
So, for me, it is a glorified cooking timer and alarm clock. If I could buy a closed unit that just did that, then I would be happy.
You mean.... your phone, right?
The timer that is part of iOS is not as good as Alexa's. With Alexa you can easily set concurrent timers and label them. Siri doesn't know how to do that.
A cellphone is pretty much the absolute nadir of "closed unit" devices (in the sense the prior poster was describing) that could be possibly imagined.
How else would you use the word "nadir"?
"Lowest point" basically.
I.e. "Political threads are the nadir of discussion on HN"
I don't want to touch the phone with raw meat juice; voice commands plus learning to cook has the potential for amazingness.
I'm going to presume that you have Siri enabled... no touching required, right? Just knuckle the thing. Curl the pinky finger and tap w/ the knuckle. 0_o
I don't have Siri, and I generally don't have my phone near me all the time. Same for the wife, it's super awesome to just be able to talk to Alexa. I share the privacy concerns, so I am hopeful about https://mycroft.ai being a suitable replacement, as I have ordered and plan to build a better rig to handle the deepspeech.
I use my voice controls for 3 big reasons daily: 1) Timers (laundry, cooking, etc) 2) Music 3) Smart home stuff
On the 3rd one, voice actually made the smart home easier. To use a smart light bulb you had to unlock your phone, open an app, login, and do your thing. Now I tell my Alexa/Google to do it and it's super easy.
Also the chromecast integration on Google Home is killer "OK google play pandora on TV" or "Play xyz on youtube on tv"
I only use my voice controls once a day. When I come home, Siri automatically turns on my lights. When I go to bed, I shout "Hey, Siri, turn off the lights."
(10% of the time she responds with, "OK, your six a-m alarm is off." I don't know why.)
Honestly it took me awhile to get to this state (aside from the fact I work with this stuff as my job). It was almost like the transition of getting a smart watch. Once I forced myself to do certain things with it, you get into a state where you realize how much easier it is to use it.
>Alexa has always seemed like a gimmick to me
Maybe you don't ever listen to music, but the first Echo was a GAME CHANGER. 99% hassle free being able to just say "play x on spotify" "play x on pandora" "play x on amazon music" "play radio station 103.5whatever". How else could you do that before the echo? You either had to plug in your phone/ipod and control it from there limited to what was on your device.. stream bluetooth from phone to speakers which was a nightmare and dependent on your phone there etc.
Ever cook and want to set a timer without having to touch anything and stop what you are doing?
Those two things ALONE are absolutely massive.. music + alarms.
Not to mention the pure simple info you can get- especially for kids. They can ask so many questions and get answers. They can set their own timers.
I can go on and on.. but it's pointless because you have decided a device that is vital to many people's day-to-day life is still a gimmick in 2019....
I can't believe people actually pay for mobile phones. They seem like a gimmick to me. My rotary phone works just as well.
I feel like your analogy is predicated on missing some pretty important facts. Sales and usage numbers, maybe. Obviously the fad for communicating between humans will gradually fade, as the time-proven wonders of shouting at a rock on your countertop demonstrate infinite utility.
This is, while sad and even maddening, obvious.
It's obvious to anyone who spends a moment thinking about it that some portion of what you say remains.
What's less obvious is that they store everything and most definitely index it so it can be used later against you (all it takes is one legal action - separation, police, you name it).
What's further disappointing is that Amazon stores the transcribed text. Which may be incorrect but deemed "truth".
I told alexa to go away. It did not. It just persists on nagging me to say things other than alexa go away
Well color me surprised.
In the EU this is a violation of GDPR if true.
Not sure why you're being downvoted. Is it not a GDPR violation?
Agreed, I was about to say the same thing. If it's their data (i.e. data the customer generated and stored on the service), and there's a way to validate that it's theirs (as it exists in the data store) based on information the customer provided, I'm pretty sure the GDPR requires deletion on request.
ZDNet's article https://www.zdnet.com/article/amazon-cant-yet-completely-del...
Thanks! We'll change to that from https://www.zdnet.com/video/amazon-cant-yet-completely-delet..., which is a video.
Cache issue? I still don't see [video] in the title.
The url was changed to a text article.
interested to know how much amazon is spending to store hundreds of recordings of me saying, "HDMI1"
According to this video they're only keeping text transcripts, so... maybe a penny over your entire lifetime.
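A quick way to sanity-check that ballpark is to add up the storage bill month by month. Every number below is an assumption picked for illustration (commands per day, bytes per transcript, time span, and a hypothetical flat price per GB-month that is not any provider's actual rate), and `lifetime_transcript_cost` is a made-up helper.

```python
def lifetime_transcript_cost(commands_per_day, bytes_per_transcript,
                             years, price_per_gb_month):
    """Return (bytes stored at the end, total cost) for ever-growing transcripts."""
    months = years * 12
    # Approximate every month as 30 days.
    bytes_per_month = commands_per_day * 30 * bytes_per_transcript
    stored_bytes = 0
    total_cost = 0.0
    for _ in range(months):
        stored_bytes += bytes_per_month                       # nothing is ever deleted
        total_cost += stored_bytes / 1e9 * price_per_gb_month  # pay for all of it each month
    return stored_bytes, total_cost


# Assumed inputs: 10 commands a day, 50 bytes of text each, kept for 50 years,
# at a hypothetical $0.02 per GB-month.
stored, cost = lifetime_transcript_cost(10, 50, 50, 0.02)
print(f"{stored / 1e6:.1f} MB stored, about ${cost:.2f} in total")
```

With these made-up inputs the total comes out at a few cents, so text transcripts are cheap to keep around even with fairly heavy use; audio recordings would be a very different calculation.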
I hope a bug doesn’t cause your assistant to accidentally record everything while it’s listening for your prompt.