Ethical Ethics Policy
How ethical is it to insist on biasing ethical policy in favour of one half of the political spectrum? Your half!
The Register article states, WRT Ada Lovelace: "We also have her notes for Babbage's machine that represent the first algorithm ever produced."
That claim invites a refutation or two - it is not a sustainable claim to the first algorithm.
For the avoidance of doubt, IMHO Ada Lovelace is certainly worthy of such recognition.
In learning to drive (in the UK), I probably drove not more than 3,000 miles (50 lessons of 30 miles each and as much again with parental supervision). On top of that, I had probably been, by then, an observing passenger for 3 to 10 times as many miles. Waymo, in its 10 million miles, has done not less than 300 times as much.
After decades, my miles-driven count is probably up around 400,000 - plus a lesser but comparable amount as observing passenger. Taken together, roughly a tenth as much as Waymo.
Does that give us confidence, or the opposite? In human drivers like me? In Waymo?
So far (and assuming the USA fatality rate of 7 persons per billion km driven), I have no evidence to claim better or worse than the average kill rate which, over 400,000 miles, works out at 0.0045 others.
In the linked article under the byline of John Krafcik, Waymo CEO, some claims are made.
"Our self-driving vehicles just crossed 10 million miles driven on public roads." The thing that gets me about this sort of claim is that it makes no allowance for the number of different roads driven. It's clearly not 10 million miles of different road. Nor is it the same 1 mile driven 10 million times - which we all surely think is less useful 'experience'. Am I alone in thinking the difference matters - and that the raw claim is thus overstatement likely to mislead Joe/Jo Public (and his/her political representatives).
"By the end of the month, we’ll cross 7 billion miles driven in our virtual world (that’s 10 million miles every single day)." Well, that's enough miles for an average real-world Joe/Jo Public to kill 78 people, and seriously injure many more. How many virtual deaths, Surely not zero? How many virtual serious injuries; how many virtual crashes (USA: wrecks) occurred with Waymo driving? There must have been some where the other virtual human driver was at fault. Or did virtual evaluation fall short of measuring those numbers? And/or reporting them to us? At least after normalisation for the real-world occurrence of the driving conditions.
"Today, our vehicles are fully self-driving, around the clock, in a territory within the Metro Phoenix area. Now we’re working to master even more driving capabilities so our vehicles can drive even more places." Am I 'unkind' in thinking that every road driven by Waymo unsupervised within that area has been driven (recorded, computer analysed, and the rest) by human drivers - many times over. In many ways that is good; it's a good way to get started. But it's not the same word "driving" as is commonly understood when applied to humans -- that would be like first sitting and observing while one's driving instructor shows one how to do it on that very same road, several times over.
"Today, our cars are designed to take the safest route, even if that means adding a few minutes to your trip." Again, 'unkind' thoughts creep into my mind. I know of real-world people who actually avoided all motorways, all UK right turns (USA left turns), all roads not previously experienced. No such policies inspire confidence in the drivers! What does driver-Waymo do with unexpectedly less-safe routes: especially with all that 'experience' of not driving similar roads?
"Building the world’s most experienced driver is a mission we’ll pursue for millions of miles to come, from 10 to 100 million and beyond." Which, I think introduces a gulf between "most experienced" and "safest".
"We hope you’ll come along for the ride!" I spot ambiguity!!!
IMHO, it really is not good practice (nor good ethics) to mix/confuse, with product advertising, the serious considerations that should be underpinning health and safety policy.
Nick Kew writes (WRT facial recognition not being safe for making serious decisions): "Is anyone seriously trying to claim otherwise?"
I was rather under the impression that the UK Government was making some such claim with its incoming checks on UK and other EU passports, and its facial scanning booths and 'biometric' passport photographs.
I am struggling here with fitting parts of the article with my understanding.
The article contains a plot labelled "ROC". This means Receiver Operating Characteristic - which is a curve (as stated in the starred footnote), and a monotonic one at that. Additionally, in real-world applications with reasonably good discriminative performance, it is usually better, for comparison of 'algorithms', for the two axes to be on logarithmic scales of error - so, for example, Log Miss Rate versus Log False Alarm Rate. The plot in the Register article has the performance of two 'algorithms' expressed as a single point each, rather than as a curve each; these are thus examples of a very restricted sort of pattern discrimination 'algorithm' (one without the ability to run at many different Operating Points or acceptance thresholds).
Next, the usefulness of each 'algorithm' surely needs to be defined as having a chosen Operating Point that gives (from those available for the 'algorithm') the most desirable trade-off between the two types of error. This requires (in addition to the ROC curve defining 'algorithmic' performance): (i) the prior probabilities of use (eg the ratio of attacks to legitimate use attempts); (ii) the costs of each type of error (eg for each burglary and for each inconvenience to legitimate access).
In determining such usefulness, the prior probabilities and the unit costs of each type of error are often unknown, or known only approximately (say within likely ranges).
If one plots average cost (ie as weighted by likelihood of occurrence) against the Operating Point of the 'algorithm', that curve will normally have a minimum (and often a broad range close to that minimum). By plotting multiple curves of cost for various prior probabilities and various unit costs, one can usually find a sensible range of likely useful Operating Points - and so choose the actual Operating Point for use from around the middle of that range.
It is undoubtedly true that ROC curves do not take account of costs of the two error types, nor of the operational scenario (as specified by approximate prior probabilities and unit error costs). ROC curves do however embody many/most of the technical performance characteristics of the 'algorithms'. As such, ROC curves can be used to usefully compare technical performance of 'algorithms' over (most/many) scenarios - without having to also consider the prior probabilities and unit costs. However, sometimes ROC curves for different 'algorithms' do cross - which means that the ROC curves alone are insufficient for ranking the technical performance of the 'algorithms', and the likely ranges of prior probabilities and unit costs are also needed.
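A minimal sketch of that cost-minimisation procedure (the score model, priors and unit costs are all my assumptions for illustration, not figures from the article):

```python
import numpy as np
from math import erf, sqrt

def norm_cdf(x, mu):
    return 0.5 * (1 + erf((x - mu) / sqrt(2)))

# Assumed score model: attack attempts score ~ N(3,1), legitimate
# attempts ~ N(0,1); flag as an attack when score > threshold.
# Sweeping the threshold traces the ROC curve, and each threshold
# is one candidate Operating Point.
thresholds = np.linspace(-3, 6, 181)
miss        = np.array([norm_cdf(t, 3) for t in thresholds])      # attack not flagged
false_alarm = np.array([1 - norm_cdf(t, 0) for t in thresholds])  # legit flagged

def expected_cost(p_attack, cost_burglary, cost_inconvenience):
    """Average cost per attempt at each Operating Point, weighting each
    error type by its prior probability and unit cost."""
    return (p_attack * cost_burglary * miss
            + (1 - p_attack) * cost_inconvenience * false_alarm)

# Sweep a plausible range of priors: the minimum-cost threshold moves.
for p_attack in (0.0001, 0.001, 0.01):
    cost = expected_cost(p_attack, cost_burglary=10_000.0,
                         cost_inconvenience=1.0)
    print(f"prior {p_attack:.4f}: best Operating Point at "
          f"threshold {thresholds[np.argmin(cost)]:.2f}")
```

The printed thresholds shift as the prior changes, which is exactly why the same ROC curve yields different sensible Operating Points for different scenarios.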
So, as I understand it, after worries about (circa) 1,000 GWR online customers having their passwords compromised (seemingly on another website or websites where the same passwords were used as on GWR's website), GWR have now cancelled the passwords of all (circa one million) of their website-registered users. That includes deregistering those fairly wise persons who use a strongish and different password for every website (or other thingy) that they are registered with - and who hence were not compromised in the way feared by GWR.
If every website is going to have all their registrants deregistered every time a few of their unwise users get their (easily guessable or multiply used) passwords hacked, the modern world is going to grind to a halt.
From Wikipedia (2013 figures) the USA has 7.1 deaths per billion vehicle-km driven (11.4 deaths per billion miles driven). This is very much less than 1 death per million miles. Also very different from "one deadly accident every million miles" (as mentioned in a comment above) - that is unless there is an average of around one eighty-eighth of a death for each such deadly accident.
My thoughts from the dash-cam video.
(i) As for Roland6 above, it looks to me as if the car was on dipped headlights only, though the external circumstances (low street illumination and no oncoming vehicles) should mandate full-beam headlights. As a potentially relevant supplementary on this, does the 'autonomous' vehicle system control the headlight dipping, or not? If not, the 'supervisor' should be monitoring this continually and undipping when appropriate - to retain adequate visibility for his/her 'supervision' function.
(ii) I am concerned that the dash-cam is not adequately showing the lighting contrast that would be available to a human 'supervisor' who had adjusted to night vision requirements. Thus the seeming lack of visibility for a human driver in those circumstances is not actually in any way certain: otherwise, surely the 'supervisor' would have taken back manual control much earlier.
(iii) If the 'supervisor' had had his hands on the steering wheel, given the road position of the pedestrian at the time of impact, swerving left would have been adequate to miss the pedestrian - braking would probably not have been adequate. Accordingly, always or at least at night, it seems to me that 'supervisors' should have their hands on the steering wheel. Also, why did the 'autonomous' vehicle system not steer/swerve the car to the left? It looks to me as if it should have had time to do so.
(iv) Given that LIDAR, as an active illumination system, clearly has no capability to judge distance unless sufficient light is reflected from every object on the road (including black clothing), there must surely be an overriding requirement for additional sensors and processing to be active in parallel with the LIDAR. On this, there needs to be a requirement on the 'autonomous' vehicle system control (as for a prudent driver) to drive at a speed consistent with being able to make an emergency stop within the distance that it can 'see'.
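A back-of-envelope sketch of that 'stop within what you can see' requirement (all parameter values here are my assumptions, not measurements from the incident):

```python
from math import sqrt

def max_safe_speed(sight_m, t_react=1.0, decel=6.5):
    """Largest speed v (m/s) such that reaction distance plus braking
    distance fits within the sight distance:
        v * t_react + v**2 / (2 * decel) <= sight_m
    decel ~6.5 m/s^2 is a firm stop on dry tarmac; t_react covers
    perception plus actuation (human or machine)."""
    # Solve v*t + v^2/(2a) = d for v (positive root of the quadratic).
    return decel * (-t_react + sqrt(t_react**2 + 2 * sight_m / decel))

for sight in (20, 40, 80):          # metres of usable illumination
    v = max_safe_speed(sight)
    print(f"sight {sight:3d} m -> max {v:4.1f} m/s ({v * 2.237:4.1f} mph)")
```

On these assumed numbers, 20 m of usable headlight illumination supports only about 24 mph - well below typical speeds on such roads.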
(v) The article states "Internal documents also revealed that the company’s self-driving cars were struggling to meet its target of driving 13 miles without any human intervention during testing on roads in Arizona, ..." Surely the whole concept of requiring, instructing, seeking or hoping that 'supervisors' would avoid 'overrides' is entirely against safety-critical requirements - on any grounds other than those of a reduction in safety below what the 'supervisor's' own personal driving style/requirements would deliver.
(vi) Has it been established what the 'supervisor' was doing looking down? Was this at the speedometer or other dashboard display? Was it at a mobile phone? If the latter, this immediately implies full culpability: on grounds of lack of attention and, very probably, of too-bright illumination degrading the 'supervisor's' night vision capability.
Reinforcement learning is IIRC supposedly an approach inspired by (human) behavioural psychology. However, the example given (robot driving in a nail) strikes me as ignoring pretty much all we could 'learn' from human learning practice.
Years if not decades before humans drive nails, they play with such things as a toy hammer bench.
On top of that, any child will be shown what to do, in steps, and sequences of steps of increasing complexity. For example, to drive one peg down to be level with all the others, before learning to mount a peg first, and then to mount each peg in its correctly shaped hole. In AI circles, this approach is given the grand name "Apprenticeship Learning". The requirement is to copy the 'master' (usually a parent). There is, I suppose, reward - parental smiles, clapping, etc. However there is an explicit act of (direct) supervised learning - which is, by definition, different from the indirect use of reward in reinforcement learning.
I would have hoped that AI researchers would have learned (most likely by apprenticeship themselves) that machine learning, to be both effective and efficient, is best done through a combination of Apprenticeship Learning (ie steps to copy) followed by tuning (eg how hard to hit the peg given how far down it must be driven) done through (mainly a mix of) supervised learning (early emphasis) and reinforcement learning (later emphasis). It is inefficient, and hence inappropriate, to (attempt to) have the machine learn initially and overall from only the mechanisms suitable for later refinement.
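As a toy sketch of that ordering (everything here is my construction, not any published method): supervised fitting of the 'master's' demonstrations first, then reinforcement-style tuning of the result.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy task: choose hammer force f so a peg is driven to depth d.
# Unknown true physics: depth = 0.8 * f + noise, so ideal gain is 1.25.
def depth_driven(force):
    return 0.8 * force + rng.normal(0.0, 0.02, size=np.shape(force))

# --- Stage 1: Apprenticeship / supervised learning from demonstrations ---
demo_depths = rng.uniform(0.2, 1.0, 50)
demo_forces = 1.1 * demo_depths / 0.8    # the 'master' habitually over-hits a little
# Fit a linear policy force = w * depth by least squares.
w = np.linalg.lstsq(demo_depths[:, None], demo_forces, rcond=None)[0][0]

# --- Stage 2: Reinforcement-style fine-tuning ---
# Reward = -(depth error)^2; nudge w by simple hill climbing on reward.
def avg_reward(w, n=200):
    d = rng.uniform(0.2, 1.0, n)
    return -np.mean((depth_driven(w * d) - d) ** 2)

step = 0.05
for _ in range(100):
    if avg_reward(w + step) > avg_reward(w):
        w += step
    else:
        step *= -0.5                     # reverse direction and shrink the probe

print(f"learned gain w = {w:.3f} (ideal = 1/0.8 = 1.25)")
```

The copying stage gets close quickly; the reward-driven stage then corrects the inherited bias - which is the division of labour argued for above.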
And, of course, General AI is very largely the stringing together, in a useful order, of (quite sophisticated) steps that have been previously mastered (for other purposes). And the reward function (such as it is) is the general one (in engineering) of minimisation of resource usage (including time) with achievement of adequate performance/quality.
A major feature of human intelligence is the memory from generation to generation, of everything that was previously learned. And don't forget the importance of language (and speech) in that societal functioning.
The analogy between biometrics and ANPR is a poor one, both in terms of the technology and in terms of the law.
Any two (different) car number plates are different. For the UK's current system (for new vehicles), there are a little under 1.19 billion possible number plates. The total number of registered vehicles, across the current and all previous numbering systems, is less than about 5% of this figure. Any two (different) people, by contrast, are not necessarily distinguishable (in practice) by any single chosen biometric system.
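A quick check of the plate-count arithmetic (ignoring any letter exclusions the DVLA actually applies, which would make the true figure a little lower):

```python
# Current UK format: two letters, two digits, three letters (eg AB12 CDE).
plates = 26**2 * 100 * 26**3
print(f"{plates:,}")        # 1,188,137,600 - ie around 1.19 billion
```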
Number plates have zero domain overlap, because any two plate 'numbers' are either exactly the same or wholly different. Errors in recognition arise only from transmission noise (eg dirt, damage, poor illumination, speed) added in the transmission between the conceptually pure 'number' and the ANPR system.
Biometrics do not differ in the same way as number plates. No two measurements of a single person's particular type of biometric are (at all likely to be) the same. There is both domain overlap and transmission noise. So even if all transmission noise were suppressed (say by clever image processing techniques), the domain overlap would remain. We know, for example, that identical twins are identical for DNA, and can and often do look extremely similar to facial recognition, even though they are two different people. [Aside: iris scans and fingerprints, by contrast, vary largely randomly (in most ways but not all), between identical twins as much as between members of the general population.] There are many pairs of unrelated people who have (in measurements of practical accuracy) effectively the same fingerprints, and (mostly different) pairs of people who have (again in measurements of practical accuracy) the same iris scans.
It is this domain overlap that means that no single biometric will ever give certain identification for every possible person. Even multi-biometric fusion (ie use of multiple biometrics like finger prints together with iris scans), though performing much better than each of the single contributing biometrics, will fail from time to time within very large populations of people.
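A minimal simulation of that fusion point (all score distributions here are assumed purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(42)

# Genuine (same-person) comparisons score ~ N(4,1); impostor
# (different-person) comparisons ~ N(0,1). The overlap between the two
# is the 'domain overlap' - it persists however clean the measurement.
n = 1_000_000
gen1, gen2 = rng.normal(4, 1, n), rng.normal(4, 1, n)
imp1, imp2 = rng.normal(0, 1, n), rng.normal(0, 1, n)

def fnmr_at_fmr(gen, imp, fmr=1e-3):
    """False non-match rate at the threshold giving the target FMR."""
    threshold = np.quantile(imp, 1 - fmr)
    return (gen < threshold).mean()

print("single biometric FNMR:", fnmr_at_fmr(gen1, imp1))               # ~0.18
print("fused (sum-rule) FNMR:", fnmr_at_fmr(gen1 + gen2, imp1 + imp2)) # ~0.005
# Fusion is far better, but still non-zero: in a very large population
# some genuine users will still fail, and some impostor pairs will match.
```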
The differences in law are really that ANPR only automates vehicle recognition that could be done manually, though in a way that makes results available much faster. Use of biometrics actually gives (in some very useful ways) identification of people that cannot be replicated by manual methods - both iris recognition and, separately, fingerprints give more reliable recognition in practice than manual methods of identifying people. Even so, biometrics are not foolproof.
The problem of domain overlap of biometrics is one of those presenting some difficulties in law. There are also difficulties with forgeries (such as gummy fingerprint overlays, and contact lenses with false iris patterns). There are also difficulties with organised criminal gangs selecting impersonators from available gang members (infiltrator selection) whose biometrics are an adequate pairwise match for identified desirable targets.
Back to the politics: government delay in issuing a biometric strategy is really pointless. Many of the issues have been known for years (see, for example, my 2005 presentation). Furthermore, technology change will continue at something like the current rate. Biometric protection and biometric circumvention are an ongoing battle - no more and no less than forgery of bank notes and new protective measures.
The six questions in the Register article strike me as very reasonable ones to ask.
I have a few more, particularly extending The Register's second question about training data. For supervised machine learning, the training data obviously needs to be tagged with the actual outcome: suicide or not.
So does Facebook have reliable access to the actual outcome (up to some point in time) for all those (included in their training data) who did or did not commit suicide?
Next, what do they do concerning 'up to some point in time'? One might perhaps expect there to be a period beyond that (during which the subject did not commit suicide) for which (training data) evidence from Facebook postings is considered to be too recent to establish a contribution to state of mind - say weeks or a few months. But how long? And is that length of time the same for all types of mind-state evidence seen in Facebook postings? The issues here are at least two-fold: making the time period too short would increase the false alarm rate; making it too long would increase the risk of missing seriously suicidal intent.
What about cases of attempted suicide that were unsuccessful (say as a cry for help)? Do those count as suicides, non-suicides, or as one or more types of special case? And how reliable is Facebook's classification, particularly between non-suicide and failed attempt - about which Facebook might well have no knowledge?
Finally for now, what about 'friends' who have concerns but do nothing - because they have chosen to rely on Facebook's known AI algorithms to detect any 'real' concerns?
Oh Sigh, Sigh!
You have a point, but I think you take it far too far.
"... teach the idiots how to use them properly (even the ones that have PhDs). Obvious problems:"
We see things here, between us, of markedly differing severity.
"Their dataset has a prior probability of criminality around 50%. That's way higher than normal and leads the system to think that criminality is common." And "Same problem with a lot of diagnostic medicine ANNs. They try to detect rare diseases with an equal handful of normal and diseased cases. They look great in the literature, but never get adopted, because they keep flagging up healthy people--they've been heavily biased to think that the problem exists."
I'm not sure at all that this is relevant, especially the first bit. Training on the examples is best done with near equal numbers of samples for each class: otherwise there is likely to be criticism on that very issue. Evaluation is, likewise, best done on datasets of near equal class size; and it's easier with equal-size evaluation sets.
For operational use, Bayesian statistics does indeed require weighting with the real-life class occurrence rates - but this can be dealt with entirely outside the class-specific modelling, by applying a priori knowledge of the class occurrence statistics.
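A minimal sketch of that adjustment (function name and all numbers are mine, for illustration):

```python
def reweight_posterior(p_balanced, train_prior=0.5, deploy_prior=0.01):
    """Convert a posterior from a balanced-prior model to deployment priors.

    Works on the odds scale: divide out the training prior odds to recover
    the class-conditional likelihood ratio, then apply the real-world prior
    odds (the standard Bayes adjustment)."""
    lr = (p_balanced / (1 - p_balanced)) / (train_prior / (1 - train_prior))
    deploy_odds = lr * deploy_prior / (1 - deploy_prior)
    return deploy_odds / (1 + deploy_odds)

# A score of 0.9 from a 50/50-trained model, deployed where the
# positive class actually occurs only 1% of the time:
print(reweight_posterior(0.9))   # ~0.083, not 0.9
```

Which is exactly why balanced training data need not 'bias the system to think the problem is common' - provided the prior correction is applied at deployment.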
"Second problem is the data. Are the pictures random? I doubt it."
Read the paper, as linked. It is much better than you (think and) write, though it does have its deficiencies.
They've started by just looking at Han Chinese.
Looking within one racial characteristic (especially on such a small dataset) is actually sound science.
"Then they picked pictures of non-criminals by browsing the web and picked pictures of criminals by scouring for wanted posters."
No! Read the paper. All the photos are from non-criminal identification sources. I suspect this is from existing photos on ID cards or driving licences, or similar. Whilst this is not ideal, there is no bias in the data-capture mechanism or in the likely 'happiness' of the subjects.
"Looking at their conclusion faces, I can easily classify criminals vs. non-criminals simply by noticing whether the person is smiling."
No you cannot: see above!
However, there is a problem with the demographics, particularly of the non-criminal dataset. There is a high preponderance of university-educated people. I suspect (only suspect) that this is derived from using current students/staff and their spouses or near-spouses. Note in the paper, the collared shirts of the non-criminals and the non-collared shirts of the criminals. Some clear demographic selection would have been useful here: most likely on employment status and earnings for the non-criminals; also on the type of crime for the criminals: violence against the person, violence against property, white-collar crimes - and so on.
"Third problem is feature selection. I'm sure the algorithm didn't automatically choose to look at facial features."
True, but so what?
"So, the authors picked out a bunch of features they thought might be relevant (neo-phrenology as previously noted) and discovered that some of them were more relevant than others."
Again, so what? Whether the individual or composite features are designed manually or automatically matters nothing, providing their training and the evaluation is unbiased (including lack of bias by repeated manual feedback).
"From this paper, I would conclude that Chinese people tend to post pictures of smiling people online and criminals tend to look unhappy in mugshots. Thus, it's easy to distinguish between a selfie and a mugshot."
See the paper and above: neither 'mugshots' (definitely) nor 'selfies' (it seems) are used. Thus neither data capture quality nor associated (mood/stylistic) effects are relevant deficiencies.
Time goes forward. Adding a leap second makes it go forward a bit faster, for a short while.
Now, subtracting a leap second must be more dangerous - as it makes time go backwards.
Having looked, I can find no record of humans ever having subtracted a leap second. So it is something we (BIH) have no experience of how to handle.
Worriers should worry an awful lot about this. Time might stop, and not restart. With time stopped, we might all live forever. The world might end, or we all might spend forever waiting for it to end - soon!
Well, the seasons they do go round: 'AI Summer' is here again.
It used to be that we were happy just that the Sun did shine, but then we had to consult the Oracle.
Beware Watson, it/he is only one step ahead of HAL, and always behind Holmes: here's to watching you - all.
Your grid has gone cloudy, your coffee machine is being unplugged, imperturbable and repetitive female avatars direct your life ever more closely - do a U-turn.
You'll really know how well all these autonomous cars are going to work when the roads all start to need continuous sets of induction loops: longitudinally, one set per lane.
Hold onto your wallets as best you can. Volunteer no information to anyone/anything. Look forward to autumn!
Why cannot the AI car do what the manually driven car does: have a look ahead at the road surface to determine whether it is wet, icy, oily, contains debris, has potholes, is flooded (especially useful this week in the UK), etc?
Listening is so present tense: so past best usefulness!
I am wondering if the rate of female employment has any effect on inequality.
Data going very far back is difficult to find, but this link gives the proportion of employed persons who are female, for the USA from 1900 onwards.
Over the period chosen as particularly relevant by Tim (1920 to 1980), the female proportion of the labour force goes from around 20% to just over 42%. This is a very substantial change. It also occurs to me that female wages for each year, particularly over that period, may well have varied significantly less than male wages for each year.
Subsequently, from 1980 to 2008, the female proportion of the labour force fluctuated mildly between 42.5% and 47.0%. There was then a marked increase, to 53.6% in 2010 and to 57.0% in 2014.
I am really struggling with the ECJ ruling (or at least the reports on it), along similar lines to commenter Brent Longborough above.
The returning of a search result is banned, but the original 'document' still remains.
This is like a library holding a copy of, for example, "Mein Kampf" on its shelves (in plain view for anyone who cares to look, and subsequently read) but not having the book in its card index (or modern database equivalent).
In the particular case of the article by Robert Peston, it is not his article that someone has requested be removed from such public view, but a comment they themselves posted under his original article. If we allow this sort of thing, anyone could post a comment that is reasonably obviously undesirable and then ask Google, and/or other search engine providers, to remove the reference to the article (and all its comments) from search engine results. This permits 'privacy' requests, potentially of unspecified things for hidden reasons, which would obviously be rejected if requested as an edit to, or removal of, the original 'document'.
From the Register article: "He wrote algorithms without having a computer – many young scientists would never believe that was possible. It was an outstanding accomplishment."
Was that one a 'canned statement' too? I'd love to know who drafted it.
Algorithms have been around for a long time. Euclid, around 300 BC, wrote a rather good one: http://en.wikipedia.org/wiki/Euclid%27s_algorithm The very term came from the name of al-Khwārizmī, the Persian mathematician born circa 800 AD.
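For concreteness, Euclid's algorithm in a few lines of modern Python - the algorithm itself predates the notation by some 2,300 years:

```python
def gcd(a: int, b: int) -> int:
    """Euclid's algorithm (c. 300 BC): repeatedly replace the pair by
    the smaller number and the remainder, until the remainder is zero."""
    while b:
        a, b = b, a % b
    return a

print(gcd(1071, 462))  # 21
```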
The Royal Navy had trigonometric tables computed by hand (using similar algorithms to those now used in computers) for navigational purposes; they were very interested in automating that work through the Difference Engine of Charles Babbage (1791-1871), who started work on it in 1822. The Fast Fourier Transform (FFT) algorithm was actually first used by Gauss, in 1805, to reduce his manual effort in his calculations concerning astronomy.
The minimax algorithm from game theory was (so I have quickly checked) first proved by John von Neumann in 1928: http://en.wikipedia.org/wiki/Minimax#cite_note-1 Doubtless Turing would have known of this algorithm.
Of course, Turing would have been better using the minimax algorithm (with its arbitrary-depth look-ahead). There is nothing wonderful or disappointing in Turing drafting a program with a 2-step look-ahead. What is disappointing is that someone who should know better thought to claim it was wonderfully original.
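For illustration, a minimal depth-limited minimax on a toy game (single-pile Nim - my choice of example, not Turing's):

```python
def minimax(pile, depth, maximising):
    """Depth-limited minimax on single-pile Nim (take 1-3 counters;
    whoever takes the last counter wins). depth=2 is the sort of
    two-step look-ahead described above; a larger depth is the same
    algorithm seeing further."""
    if pile == 0:
        # The previous player took the last counter and won.
        return -1 if maximising else +1
    if depth == 0:
        return 0   # search horizon reached: value unknown, call it neutral
    values = [minimax(pile - take, depth - 1, not maximising)
              for take in (1, 2, 3) if take <= pile]
    return max(values) if maximising else min(values)

print(minimax(5, depth=10, maximising=True))  # +1: a forced win exists
print(minimax(5, depth=2, maximising=True))   #  0: too shallow to see it
```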
Turing was a great scientist/mathematician. If this sort of stuff continues, the 100th anniversary of his birth will not do him justice.
[Aside: I have interviewed for jobs (soon-to-be) computer science graduates who could not even explain a single viable argument passing mechanism of a non-recursive programming language. And that was in the late 1970s and early 1980s; I don't expect things to be better now. And certainly not if their teachers allow them to believe computers predated algorithms, in practical use.]
The recent announcement by Ms Jacqui Smith makes me wonder if my earlier contributions should again be brought to mind.
Point 13 of my January 2004 submission ( http://www.camalg.co.uk/nids_040116a/NIdS_A031219a_v2.pdf ) to the House of Commons Home Affairs Committee states: "Registration stations on non-government sites may be too vulnerable. These sites and their NIdS staff are likely targets for identity fraud attacks. It is questionable whether sufficient security can be provided at registration stations located on non-government sites." Point 14 might also be of some relevance.
Slide 35 of my presentation on Technical Aspects of the National Identity Card in November 2005 ( http://www.camalg.co.uk/tk051116a/TK051116A_bcs_02.pdf ) showed citizen registration was the largest cost for the whole basic scheme. At £32.10 per person (somewhat under £2 billion over 10 years), my costs were about 40% of those originally quoted by the Home Office; however, they excluded access and usage costs by government and commerce. This is because I had assumed those components would be run at break-even or a profit for commercial use, and would represent an overall cost saving for government. And surely part of the whole scheme was to save government effort/costs: directly by efficiency savings in identity checks, and indirectly by reduction in mistakes in identity checks and by reduction in identity fraud concerning tax, benefits, etc.
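A quick check of that registration-cost arithmetic (the ~60 million UK population figure is my assumption here, not taken from the slides):

```python
population = 60_000_000
cost_per_person = 32.10                  # pounds
total = population * cost_per_person
print(f"£{total / 1e9:.2f} billion")     # ~£1.93 billion - somewhat under £2bn
```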
On registration costs, it is interesting to note that the Home Secretary now seems to be claiming that these costs were never included in their original pricing. What on earth were they spending the money on? Maybe someone should check their original figures, just to be sure that registration was really left out, and no one noticed!
@Tom, who wrote: "My in-laws don't have a house number."
Then they either have a unique post code (so enter house number 0), or are (with my suggestion) being defrauded with the co-operation of a neighbour who shares their postcode.
@Rhyd, who wrote: "AVS was designed for card terminals, which only have buttons with numbers."
On the number of buttons, likewise my mobile phone. However, one can enter all letters with multiple key presses, in a way understood by most people. Alternatively (though less easy to understand) one could enter enough 3/4-letter groups to reduce the entropy (residual 'unknownness') sufficiently to make the attack somewhere between useless and much less useful.
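A small sketch of that entropy-reduction point, assuming (my assumption) that each letter is entered with a single press of its standard phone-keypad group:

```python
import math

GROUPS = ["ABC", "DEF", "GHI", "JKL", "MNO", "PQRS", "TUV", "WXYZ"]
LETTER_TO_DIGIT = {c: str(d + 2) for d, g in enumerate(GROUPS) for c in g}

def keypad_digits(postcode):
    """Collapse a postcode to the digit string single-press entry gives."""
    return "".join(LETTER_TO_DIGIT.get(c, c) for c in postcode if c != " ")

print(keypad_digits("SW1A 1AA"))   # '7912122'

# Entropy per letter: ~4.70 bits if wholly unknown, ~1.71 bits once the
# digit group is known - so each letter entered leaks ~3 bits of check.
full = math.log2(26)
residual = sum(len(g) * math.log2(len(g)) for g in GROUPS) / 26
print(f"{full:.2f} bits -> {residual:.2f} bits residual per letter")
```

Even this lossy entry would give the card terminal a useful address check, while remaining far from revealing the full postcode.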
In any case, the attack mentioned by El Reg refers to e-commerce. Therefore, for goods physically delivered, the postcode will have been entered using a 100+ character keyboard; likewise the postcode would/could be entered for goods downloaded or otherwise not delivered by post - though no fraud-reducing check based on the address is then possible (that is, against a fraudster who knows the cardholder's address).
@Wize, who wrote: "Even if they did the whole address, it wouldn't cover two people in the same flat."
Why don't you try that one on your flat-mate, and see whether his/her credit card company comes after you, for the crime, or holds your flat-mate liable for the transacted money.
Interesting; are you afloat?
There are, of course, exceptions to every scheme. And sometimes a bit of inconvenience for those exceptions. However, if there is a reasonably sound security system based on delivery address, why not support credit card companies (and their cardholders) in benefitting from it as much as is practical?
I am concerned that the raw data reported to be 'anonymised' cannot be 'anonymised' in a very large proportion of cases.
This is because location itself is largely an identifier. Even if the locations given are not particularly precise, a high prevalence of the same locations (eg home and work) could quite easily lead to identification of the individual person, with a very high probability of being correct.
Then, obviously, further locations, even if also approximate, could disclose private information about the individual concerned.
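A minimal simulation of the point (synthetic data, my assumptions throughout): give each of a million people a coarse 'home' cell and 'work' cell and ask how many are unique on that pair alone.

```python
import numpy as np

rng = np.random.default_rng(1)

n_people, n_cells = 1_000_000, 10_000
home = rng.integers(n_cells, size=n_people)
work = rng.integers(n_cells, size=n_people)

# Encode each (home, work) pair as a single integer and count duplicates.
pairs = home.astype(np.int64) * n_cells + work
_, counts = np.unique(pairs, return_counts=True)
unique_fraction = (counts == 1).sum() / n_people
print(f"{unique_fraction:.1%} of people are unique on (home, work) alone")
```

On these (uniform, hence optimistic-for-privacy) assumptions, around 99% of people are already unique on just two coarse locations; real home/work distributions cluster, but the pair remains a near-identifier for a very large proportion of people.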