"We just learned that one of these language reviewers has violated our data security policies by leaking
confidential Dutch audio data what was going on."
Germany's data protection commissioner in Hamburg has launched an investigation into Google over revelations that contracted workers were listening to recordings made via smart speakers. Google has been ordered to cease manual reviews of audio snippets generated by its voice AI for three months while the investigation is under …
You can't be doing this shit, put a bullet in someone for all our sakes. What happened to actual punishment of those with money? It was never a common thing, but is it completely lost? Perversion of the common people's rights should be held accountable by no less than the death of the infidels involved, possibly even the shaming of the entire families.
> What happened to actual punishment of those with money?
See: Broadcom billionaire Henry Nicholas and pal on drugs rap cough up $1m to avoid the clink
That should answer that question. :-(
@Gene Cash: but at least they were tried and paid a million dollars (they also just seem like a mostly innocent partying couple, rich, but still).
Now the people deciding it's O.K. to listen to anyone's voice at any time are amazingly immoral, and these people are not even tried for anything. Someone needs to at least throw eggs or shoes or something. Violence has its place against the evil, and these tech companies have welcomed evil for the sake of money and control.
A female called Alexa is in even more trouble; bloody everything will be recorded. Since these gadgets came onto the market it was obvious what their REAL intention is in the post-Snowden world. Don't buy them. Disable Siri etc. Don't buy a TV with a built-in mic and camera, same with a PC; just stick your own in when you need to use it, etc., etc.
There is no way to get this product to work the way that customers desire without manual review of audio segments in which the system is uncertain. How to achieve this legally and ethically is an exercise for the lawyers as much as the engineers, but let's not be naive. Let's also not be naive about people violating company policy and the law.
The question for me is if G has managed to get meaningful consent for these manual reviews from their customers. (BTW, I am dubious about this, but I'm not going to boil them in oil without a thorough review.)
Of course, I'm one of those Luddites who will walk out of a space where one of these creeps is active in the first place, but apparently there is a market for these things...
Google (and Apple, and Amazon) need to do something ethical for a change: they need to pay users who are willing to participate in product development.
It's really that simple.
Spy corporations, er, home-automation makers can only use your recorded interactions with their product if they pay you for value received.
Oughta be a law.
Yeah, spold, I agree. The only way paying users fair value for their personal information would happen is through legislation, and no country is actually concerned enough about citizens (versus corporations) to even attempt it. Well, maybe Iceland or Norway. Damned ethical Nordic democracies...
Odd, my tiny brain jumped a few rails -- "Just dump the sewage in the river, no need to pay to clean it up" was the 1950s mantra. Now -- "Just grab the user information, no need to pay for anything". In the one case, the environment was degraded; in the other, privacy is eroded.
Make of that what you will. I've had a glass too many to sort it.
Just politely ask. Most people are happy to help, especially when it is something "personalised" like that. It would have to be immediate, and only for people with the app installed and located near the device. You want to avoid asking husband John at work about a recording made while wife Julie was shagging the electrician.
Same issue (which I fucking hate) with phone calls where you are told your voice is recorded for quality control etc. Wankers don't provide an opt out.
I'm sure it's handled as "anonymized quality control" in para 437 of the privacy statement that "you" had to agree on after you've already bought and paid for the device. (Whatever happened to the shrinkwrap EULA cases of the 90s?)
The other pertinent question, of course, is: does that imply consent by anybody in earshot? I see a future of "to opt out, please have a Genuine Google Android device with Bluetooth enabled on you at all times. This is necessary to transmit your opt-out decision to the device. Your location data may be stored and processed for privacy and quality control purposes."
I am fairly certain we can get "using voice-activated assistants in public spaces violates GDPR" quite easily, but that hammer might fall on the owner of the device, not the company that makes it, unless somebody pulls quite stunning tricks around how you've only licensed rights to use the software and don't own it.
They still use similar tactics. I have seen clauses along the lines of continued use means you agree to their terms, and agreement to their terms means you've given explicit consent, conveniently making the most tenuous assertion of implicit consent the same as explicit consent. 10/10 for sheer front, I suppose.
Google admitted it works with experts to review and transcribe a small set of queries to help better understand certain languages.
That "small set" probably includes all the ones that begin with "Oh! Oh! Oh!"
Just to confirm whether the user was trying to say "OK, Google," of course.
To me, the easiest way to signify that your device should start recording is to use two sequential words, e.g. "Alexa Smith". There is a 2-second ring buffer which is listening out for the first word. The microphone continually fills and resets that buffer until the word "Alexa" is heard in its entirety. This triggers a flag to say that "Smith" needs to be captured using the same buffer within the next 2 seconds, otherwise it reverts to the quiescent state.
Full-blown recording ensues until a one-word stop fragment is picked up, which saves the recording and puts the device back into quiescent mode. With this scheme the maximum unsolicited recording will be 2 seconds long.
If vulnerabilities arise in coding this simple utility, it will bring a whole new meaning to the term "buffer overflow".
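The scheme above can be sketched as a small state machine. This is a toy illustration only, not any vendor's actual implementation: word events stand in for the output of a hypothetical on-device keyword spotter, and the 2-second window models the ring-buffer timeout.

```python
from enum import Enum, auto

class State(Enum):
    QUIESCENT = auto()
    ARMED = auto()      # first wake word heard; waiting for the second
    RECORDING = auto()

class WakeMachine:
    """Toy model of the two-word wake scheme: 'Alexa' arms a 2-second
    window; 'Smith' inside that window starts recording; a stop word
    ends it. Any other word, or a timeout, resets to quiescent."""

    def __init__(self, first="alexa", second="smith", stop="stop", window=2.0):
        self.first, self.second, self.stop = first, second, stop
        self.window = window
        self.state = State.QUIESCENT
        self.armed_at = 0.0

    def on_word(self, word, t):
        word = word.lower()
        if self.state == State.ARMED and t - self.armed_at > self.window:
            self.state = State.QUIESCENT   # window expired; treat word as fresh input
        if self.state == State.QUIESCENT:
            if word == self.first:
                self.state, self.armed_at = State.ARMED, t
        elif self.state == State.ARMED:
            if word == self.second:
                self.state = State.RECORDING
            elif word == self.first:
                self.armed_at = t          # repeated first word re-arms the window
            else:
                self.state = State.QUIESCENT  # wrong word: back to the start
        elif self.state == State.RECORDING:
            if word == self.stop:
                self.state = State.QUIESCENT
        return self.state
```

With this structure the "maximum 2 seconds of unsolicited recording" claim corresponds to the lifetime of the ARMED window.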
By syllable, by word: these are "human" definitions. To Alexa it matters not.
By building a "state machine" in the way I have described into the product, Alexa's security could be customised to a very high standard by having an "enrolment" where the user specifies the number of words/syllables to define ("in series" as it were), so that P(activating the device) is P(1st word) x P(2nd word)...x P(nth word), followed by uttering the actual words to use in order to cause activation. An LED will need to be lit on the device to indicate when it has reached activation state.
De-activation philosophy could use a similar technique, but it is arguable that one word should suffice.
If this sounds convoluted: it won't be once you get into the habit. How many people do you know who preface a question with "Let me ask you a question..."? There you go, a six-state activation sequence without even having to change your mode of delivery.
Arguably false positives will be less of an issue with this concept because you are partitioning your words/syllables into distinct states which, if you get wrong, resets you back to the beginning again.
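The enrolment arithmetic above can be checked in a few lines, assuming (a simplification) that each enrolled word false-triggers independently:

```python
from functools import reduce

def false_activation_prob(per_word_probs):
    """P(spurious activation) = P(1st word) x P(2nd word) x ... x P(nth word),
    assuming independent false triggers per word (a simplifying assumption)."""
    return reduce(lambda acc, p: acc * p, per_word_probs, 1.0)

# Illustrative numbers only: three enrolled words, each mis-heard 1% of the time,
# give a combined false-activation probability of about one in a million.
p = false_activation_prob([0.01, 0.01, 0.01])
```

Real keyword spotters won't have independent per-word error rates, but the multiplicative shrinkage is the point being made above.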
It doesn't matter whether words or syllables are used. That's exactly the point. It's pretty simple--the longer the wake word or phrase, the less likely the device is to think it's heard it. You could make it thirty seconds long, and you'll probably eliminate all false positives. Your method of two words that must be separated would help as well. But it doesn't really do much about the problem of data being recorded. It matters little whether they're recording mistakes or real audio, either way they've received something potentially private.
In addition, they're very likely to want something with a few false positives because these devices have a relatively difficult task identifying their wake word. There's a lot of background noise, and the manufacturers don't want the reputation as the one you have to wake up six times before it starts working. And the false positive rate doesn't really have to be an issue; I think most people who have one don't really care that it goes off incorrectly every once in a while. They would care very much that recordings of their home were being sent somewhere random, to be listened to by someone unknown, entirely without their permission. Making the wake word longer doesn't really solve the privacy issue.
While I hope that Google get a good going-over for this, I think something is missing from the commissioner's statement:
Finally, due account must be taken of the need to protect third parties affected by voice recordings.
I don't see any way that Google can do that - their agreement is with the owner of the device. Perhaps responsibility should rest with the owner - maybe owners/operators of such things should be obliged by law to inform third parties and get their consent - or turn it off if anyone declines.
That's actually interesting. Certain jurisdictions prohibit voice recording without consent of ALL parties -- can I then stick a felony charge against anyone that has Alexa, Siri, etc. on when I'm talking to them in a place I have expectation of privacy (private flat, medical office, etc.)?
I enjoyed your joke, but let's analyze their statement for its real meaning.
"We don't associate audio clips with user accounts during the review process"
That doesn't stop them associating the clips with the accounts after the review process, or for that matter before. They could also associate the text with the accounts and not technically be lying there.
"and only perform reviews for around 0.2% of all clips."
Given how often there are false positives, a significant number of those must be completely silent, or at least blank background noise with no speech. While I doubt it rises to 99.8%, it could be a healthy chunk. A program that eliminates these from the dataset would help get to the 0.2% level without improving the privacy aspect at all. Similarly, they probably don't need to conduct reviews on things that get recognized properly. If we presume that clips in recognized languages saying standard things like "What time is it?" are left out, this gets us much closer to 99.8%. Once again, the 0.2% is probably the private stuff.
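The back-of-the-envelope reasoning above can be made concrete. Every percentage below is invented for illustration; the point is only that pre-filtering silent and well-recognized clips lets a small headline review rate still cover most of the "interesting" residue:

```python
def headline_review_fraction(silent_frac, recognized_frac, residue_review_rate):
    """Fraction of ALL clips reviewed, if silent clips and confidently
    recognized clips are excluded before sampling for human review.
    All inputs here are made-up illustrative numbers, not Google's."""
    residue = 1.0 - silent_frac - recognized_frac
    return residue * residue_review_rate

# Invented split: 40% silence/noise, 55% routine recognized queries,
# and 4% of the remaining 5% sent for review -> about "0.2% of all clips".
frac = headline_review_fraction(0.40, 0.55, 0.04)
```

A headline figure like 0.2% therefore says little about what share of the *private* material gets heard.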
I bought two Google Minis for me and my parents for £48 mostly because you can make free phone calls on them. I've taught my parents to leave it unplugged unless they are using it. Speech recognition is scarily accurate, when I bought the first Dragon Dictate it could understand my dad but not my accent. Free phone calls though, why can't BT do that from phone boxes?
"When something is free - like Alexa or hey Google (you only pay for the HW) YOU are the product."
I don't know how anyone struggles with this concept. People seem to want things for free but then complain about the price. It isn't a hidden cost, it's well known, and people are free to choose.
When did people lose their responsibility?
Biting the hand that feeds IT © 1998–2019