- 25/01/2017
Responsible data handling
The rapid developments in the gathering and analysis of digital data bring opportunities as well as threats. Dutch data scientists, too, are not blind to the potential drawbacks of their specialization. For this reason the Responsible Data Science consortium has been set up. Led by TU/e professor Wil van der Aalst, it aims to contribute to accurate, transparent, and fair data science with respect for privacy.
The quantity of digital data becoming available for analyses is growing exponentially. At the same time ever better techniques are available to draw sensible conclusions from all those data. In this way data science can contribute to better healthcare, more efficient companies and governments, and new scientific insights. However, there are great concerns about privacy breaches, and fear of algorithms which determine on the basis of obscure criteria whether you can get a mortgage loan or can come for a job interview.
Those objections are understandable and not unjustified, says Wil van der Aalst, scientific director of the Data Science Center Eindhoven. Moreover, the negligent or unethical application of data science constitutes a threat to his own area of expertise as well, he explains. “There is a risk that people who suffer negative experiences will turn against the use of data, which may lead to a kind of general prohibition of acting on the basis of data. On the basis of facts, then. That would be disastrous.” With some apocalyptic exaggeration, the term ‘Data Science Winter’ is occasionally used in that context.
Within Europe a trend towards restrictive legislation is already perceptible, says the computer scientist. That makes it all the more important to tackle the potential drawbacks of data science constructively, Van der Aalst explains, so as to avoid discarding the good along with the bad.
To this end, the cream of data-oriented Dutch science has been brought together under the name Responsible Data Science (RDS). The consortium includes Internet lawyers from Tilburg, statisticians from Leiden, ethicists from Delft, privacy specialists from Nijmegen, and media experts and linguists from Amsterdam (from UvA and VU, respectively). In addition, various academic hospitals are represented, which are pre-eminently organizations that can profit from ‘sound’ data science. The Eindhoven contribution comes from the data visualization experts in the group of Jack van Wijk and from Van der Aalst’s own Architecture of Information Systems group.
The area of activity of Responsible Data Science is summarized pithily in the acronym FACT (Fairness, Accuracy, Confidentiality, Transparency). Fairness stands for honest conclusions based on data, for instance in connection with a mortgage loan application or a job interview. Accuracy is largely concerned with statistics. “It must be clear how reliable the conclusions are that you draw on the basis of data.” Under the heading Confidentiality comes the well-known discussion about privacy and how to anonymize data, and Transparency, finally, concerns insight into the way in which processes and algorithms lead to a certain conclusion.