Ethical AI Quandaries

Whilst I pondered the ethical issues around autonomous driving and why Toyota had only opted to go for autonomy level 2 (no feet or hands) whereas Tesla was opting for autonomy level 4 (no feet, hands, eyes, brain), it struck me that this was not a question about Data Science but about AI’s ethical impact in a social environment.

This is a clear distinction between AI and Data Science: with AI, the strategy of the social environment comes into play. Take the old conundrum of the runaway trolley. Do we do as the Mercedes MD would do and save the occupants of the car, or do as the Mercedes Twitter followers suggest and protect the pedestrians, at least until they see the increased cost of the car? Ethics, therefore, is a balance between cost and convenience on the one hand and moral values on the other.

I believe we should adapt environments, not just introduce new tools. We should do what the strategists suggest and remove the cyclists and pedestrians, as we currently do on the motorways, where both cyclists and pedestrians are restricted. We cannot drive at speed AND respond safely to protect pedestrians and cyclists.

As we adapt the environment, we will see this in cities. Just as AI may impact humans, so might humans impact AI. Imagine London full of autonomous cars, driving us conveniently around and going off to park themselves. Instead of having cars pull up in front of buildings to drop us off, architects will have removed the lifts from their buildings and asked the car to drive up to your office floor on a helical concrete driveway, to prevent congestion in front of the building. The architecture will adapt, and the environment will respond for efficiency and safety.

City plans will adapt. City planners will have commandeered the now permanently empty car parks for new buildings to house an expanding population.

Meanwhile, humans will be gaming the AI, being “unethical in reverse.” Humans will know that driverless cars are designed to stop when a pedestrian steps out in front of them. So, people in shopping districts will walk carefree into the road, instead of waiting at designated crossings and ethically sharing time on the roads.

Cyclists will game driverless cars into acting as an escort, driving at an appropriate distance behind them to protect them from human drivers who tailgate dangerously.

Another issue to consider is the early release of models before they are fully tested and productionised. Take, for example, a simple STOP sign image with four strategically placed stickers that fool a driverless car’s image recognition into believing it has observed a 40-mile-an-hour speed limit sign. The consequences could be devastating for the car, its occupants and anyone in the car’s path as it jumps a STOP sign and accelerates. This particular example did not even require code to hack the neural network, just a few simple stickers to upset the probabilistic model. Is it ethical to release a model into the real world when it can be hacked so easily?
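To make the sticker idea concrete, here is a minimal sketch of my own. It is a toy and not the attack on the real neural network: I assume a hand-written linear classifier over a 16-pixel "image", with invented weights, and show that covering just four well-chosen pixels is enough to flip its answer. The names WEIGHTS, BIAS, classify and place_stickers are all hypothetical.

```python
# Toy illustration only: a hand-written linear "sign classifier" over 16 pixels.
# The weights, bias and image are invented for this sketch, not taken from any real system.
WEIGHTS = [0.9, -0.2, 0.1, 0.8,
           -0.1, 0.7, 0.6, -0.3,
           0.2, 0.5, 0.4, -0.1,
           0.8, -0.2, 0.1, 0.9]
BIAS = -2.0


def score(pixels):
    """A positive score means the toy model reads the image as a STOP sign."""
    return sum(w * p for w, p in zip(WEIGHTS, pixels)) + BIAS


def classify(pixels):
    return "STOP" if score(pixels) > 0 else "SPEED_LIMIT"


def place_stickers(pixels, count):
    """Cover (set to 0) the `count` pixels that contribute most to the STOP score."""
    contributions = [(w * p, i) for i, (w, p) in enumerate(zip(WEIGHTS, pixels))]
    targets = [i for _, i in sorted(contributions, reverse=True)[:count]]
    patched = list(pixels)
    for i in targets:
        patched[i] = 0.0
    return patched, sorted(targets)


clean_sign = [1.0] * 16                      # an unobstructed sign
stickered_sign, covered = place_stickers(clean_sign, 4)

print(classify(clean_sign))                  # the clean sign is read correctly
print(classify(stickered_sign), covered)     # four covered pixels flip the label
```

In this toy, three stickers are not enough and four are, which is the uncomfortable point: the model's decision can hinge on a handful of inputs that an attacker can find and cover.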

There are rigorous ways to test code quality, and at the National Physical Laboratory we also assess uncertainty in AI to see whether it actually does what it claims, and how well it performs that activity. Our Data Science team has worked on some recent examples in this space:
CNNs to classify image transformations of ECGs (see this paper).
Generating synthetic image data from one medical imaging modality to another (section 5 of this report).

Ethical AI is a fascinating area. There is a balance between computers being helpful and overstepping into what we might consider “creepy” or a “loss of control.” Computer ethics teaches us a lot about ourselves and our own culture. Buried within many data sets is a history of our cultural beliefs. There is no better example than the 600 years of case histories from the Old Bailey: an amazing data set, but one with the evolution of our culture contained within. For instance, if I used the Old Bailey data set to train an AI to judge us, it might suggest that using excessive force is acceptable if someone were to steal something as nondescript as your pen.

Not only do data sets encapsulate our culture, but they can also be skewed by where the information was captured. An issue we once had with medical data sets was that the bulk of the data was captured from Northern European males, meaning that dosage measures and the analysis of the impact on phenotypes would fall short for other groups of patients.
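One simple way to spot this kind of skew is to compare each group's share of the data set with its share of the population the model is meant to serve. The sketch below is illustrative only. The record layout, the group labels, the reference shares and the 0.8 threshold are all assumptions of mine, and the counts are invented, not real medical data.

```python
from collections import Counter


def representation_report(records, group_key, reference_shares):
    """Compare each group's share of the data with its share of the target population."""
    counts = Counter(record[group_key] for record in records)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no records to audit")
    report = {}
    for group, expected in reference_shares.items():
        observed = counts.get(group, 0) / total
        report[group] = {
            "observed": observed,
            "expected": expected,
            "ratio": observed / expected,  # 1.0 means proportionate representation
        }
    return report


def under_represented(report, threshold=0.8):
    """Groups whose share of the data is below `threshold` times their population share."""
    return sorted(group for group, row in report.items() if row["ratio"] < threshold)


# Invented example: 100 patient records dominated by one group.
records = (
    [{"group": "northern_european_male"}] * 80
    + [{"group": "northern_european_female"}] * 15
    + [{"group": "other"}] * 5
)
# Hypothetical shares of the population the model is meant to serve.
reference_shares = {
    "northern_european_male": 0.25,
    "northern_european_female": 0.25,
    "other": 0.50,
}

report = representation_report(records, "group", reference_shares)
flagged = under_represented(report)
print(flagged)
```

A check like this does not fix the bias, but it makes the gap visible before any dosage model is trained on the data.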


Netflix recently released a documentary about biased data sets, following Joy Buolamwini, an AI researcher at MIT, and her petition to the US government on gender and race bias in data. She discusses the controversy around AI and, in one particular example, the everyday practical implications of a soap dispenser’s inability to detect her hands.

My career started by looking at the bias in algorithms as opposed to data sets. My focus was on how I could use psychology in my algorithms to gain more time from the customer to process their data – this was back in the day with dial-up modems. Today we see this in computer games, an industry worth £5.5 billion in the UK alone, which uses techniques to slow the player down while they load the next scene. Psychological manipulation of the customer is employed in most user interfaces and more recently in chatbots, to seek empathy and fool us into a sense of security.

Hannah Fry, in her book Hello World: How to be Human in the Age of the Machine, discusses COMPAS, an algorithm built by an American company to assess ex-offenders for decisions such as probation. This particular debate was about allowing non-publicly viewable algorithms to make decisions that impact human lives at a profound level, and whether we should have the right to see the inner workings. However, even when the algorithm is hidden from us, we can often still find out how it works. That exposure goes further in AI, where computer scientists using a technique known as shadow modelling are able to reverse engineer the proprietary algorithm and extract the original sensitive training sets on which the AI was trained. In Fry’s example, she explains that the algorithm possibly didn’t measure enough of the important societal factors impacting the ex-offenders, e.g. the ability to access child care or a working mode of transport.

Our social data

In China, they have a social scoring tool known as Sesame. As passengers disembark the train into Beijing, they are reminded that if they have a score over 500 they can apply to travel on a holiday overseas or use a 5-star hotel. This might seem abhorrent to us, but it is something they have visibility of. For us in the West, large tech companies rather than our governments make those decisions. For example, food purchase data is shared with insurance companies, and from this they can determine whether we are model citizens and less likely to claim on household insurance, all from observing whether we buy certain products such as cauliflowers or dill.

Writers have an interesting way of looking at the social and ethical impact of new technology, helping us to visualise and ponder the impacts. Here is a fun explanation of the social scoring dilemma from Charlie Brooker, talking about “Nosedive”, an episode of the British science fiction anthology series Black Mirror. The episode is based on a story by Brooker, who is also the series creator and co-showrunner.

Too many insights

During my work with the government in Vietnam to analyse accents, we very quickly started to show insights into where a person was educated and whether they were a native speaker, all from their voice. This threw up an ethical dilemma for me. Had I crossed a line? There are many examples today where gathering data that could prove beneficial could also be seen as intrusive. One is using smart meter data to detect my use of appliances via their energy consumption signatures, linked to the time of day, to indicate whether I have early onset dementia, years before I am even aware!

Future considerations

We need to think about what role the UK might play in a world that uses AI more and more to make decisions about the governance of human lives. We need to consider how rules to socialise innovation might mirror the way the Human Rights Act constrains unwanted human behaviours but enables freedom of action where it does not adversely impact others. This may mean reflecting on the environments into which we release AI systems and how they also need to be adapted or constrained to prevent adverse activities, or to stop humans “gaming” the AI by tricking it into new unwanted activities: protecting the AI from humans.

We need to reflect on existing approaches that have been well researched to underpin seemingly complex areas. For instance, would you want to fly with an airline that had not fully tested the automation of lowering its landing gear? The same mathematical proofing tools applied to this scenario, such as the Z notation, also need to be applied to the code that builds the AI tool that leads to the “impenetrable and inexplicable model.” We also need to apply other classical software engineering principles, like Big-O notation, to understand efficiency, as there is no point in having mathematically correct operation of the landing gear if it takes three days to lower the wheels.
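As a lightweight illustration of the two ideas in the landing gear example, the sketch below exhaustively checks a safety property over a tiny, hypothetical controller and then bounds how many steps it takes to lower the gear. To be clear, this is not Z notation or a formal proof, and the step count is a stand-in for, not a substitute for, a Big-O analysis. The states, commands and transition rules are my own assumptions.

```python
from itertools import product

STATES = ("UP", "LOWERING", "DOWN_LOCKED")
COMMANDS = ("LOWER", "RAISE", "NONE")


def step(state, command, sensor_locked):
    """One transition of a hypothetical landing-gear controller."""
    if state == "UP" and command == "LOWER":
        return "LOWERING"
    if state == "LOWERING" and sensor_locked:
        return "DOWN_LOCKED"
    if state == "DOWN_LOCKED" and command == "RAISE":
        return "UP"
    return state


def check_invariants():
    """Correctness: try every (state, command, sensor) combination."""
    for state, command, locked in product(STATES, COMMANDS, (True, False)):
        nxt = step(state, command, locked)
        # The controller never leaves its declared state space.
        assert nxt in STATES
        # The gear only becomes DOWN_LOCKED from LOWERING with a locked sensor reading.
        if nxt == "DOWN_LOCKED" and state != "DOWN_LOCKED":
            assert state == "LOWERING" and locked
    return True


def steps_to_lower(max_steps=10):
    """Efficiency: with LOWER held and the sensor locking, count steps until DOWN_LOCKED."""
    state, steps = "UP", 0
    while state != "DOWN_LOCKED" and steps < max_steps:
        state = step(state, "LOWER", sensor_locked=True)
        steps += 1
    return steps if state == "DOWN_LOCKED" else None


print(check_invariants())
print(steps_to_lower())
```

The first check speaks to whether the gear behaves correctly. The second speaks to whether it does so in a useful amount of time. Both questions can be asked of the code around an AI model, even when the model itself is opaque.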

These techniques and other skills from the backend software engineer’s toolbox need to be applied to artificial intelligence, and we can apply them to the observable part of AI. But as people entering the world of AI come from multiple disciplines, this area of rigour is sometimes overlooked and we have edge cases coded into our AI models that lead to biased results with decisions made about human lives.

As statisticians, we also need to focus on the data we use to train the AI. As parents, we would expect a child who is schooled in Japan to be less eloquent in English than a child schooled in the USA. AI is likewise dependent on the data it is fed. The quality and breadth of the data need to be considered if we want a system that is not biased.

As humans we are naturally biased, so hand-picking the key parameters on which the AI model will be trained will lead to a skewed model. Instead, we need to extract these key parameters using statistics, or we will bias the system.
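As one possible reading of "extract these key parameters using statistics", the sketch below ranks candidate parameters by the strength of their Pearson correlation with the outcome, rather than by a human's hunch. Correlation is only one of many selection statistics and it does not by itself guarantee an unbiased model. The feature names and values are invented for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_features(rows, target):
    """Rank every non-target column by absolute correlation with the target."""
    features = [name for name in rows[0] if name != target]
    outcomes = [row[target] for row in rows]
    scores = {
        name: abs(pearson([row[name] for row in rows], outcomes))
        for name in features
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Invented data: one strong signal, one weaker signal, one uninformative constant.
rows = [
    {"signal_a": a, "signal_b": b, "constant_c": 2, "outcome": y}
    for a, b, y in zip([1, 2, 3, 4, 5], [5, 3, 4, 1, 2], [2, 4, 6, 8, 10])
]

ranking = rank_features(rows, "outcome")
for name, strength in ranking:
    print(name, round(strength, 2))
```

The point of the design is that the ranking comes from the data rather than from the modeller's expectations, although the data itself still has to be representative for the ranking to be trustworthy.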

The good news is we have the necessary skills to prevent bias and we can perform this on the visible components from which we construct the AI models. We can avoid the “black box scenarios.” We can avoid bias.

All of these activities are familiar safety steps that classically trained backend software developers know how to apply to AI and non-AI systems alike, so it is important to bring those design principles into the governance conversation.