Slowing down Covid-19 with Data Science and Machine Learning

24 March 2020 | [at] News

The threat from SARS-CoV-2, better known as the "coronavirus", and the associated restrictions, including those on leaving the house, are keeping us all on tenterhooks. And I ask myself: are we really doing enough to use and organise our healthcare system as efficiently as possible at a time of impending overload, especially of intensive care capacity? In view of the remarkable successes of artificial intelligence and machine learning in medicine, this question seems entirely justified to me.

Algorithms, mostly based on artificial neural networks, have long been able to detect pneumonia (1,2,3,4), skin cancer (5), malaria and many other diseases with an accuracy that is higher than, or at least equal to, that of the best specialists in the respective field. Using these algorithms would by no means make the specialists superfluous. On the contrary: doctors would have more time for other things, such as educating and informing patients better. Nor would the algorithms take the diagnosis away from the doctors; they would support them in a meaningful way, also and especially in the case of corona infections.

You may wonder why the last paragraph contains so many conditionals. In fact, the situation is paradoxical, at least in Germany and many parts of Europe (6): while research institutions keep reporting new records in the computerised detection of diseases, these models and systems are still used far too little in practice. One of the reasons is that the data protection requirements for research purposes differ from those for actual use in a clinical setting.

In concrete terms, this means that anyone who wants to train an algorithm to recognise disease patterns and use it in hospital operations needs data from many hundreds or, better still, thousands of patients. Above all, however, the General Data Protection Regulation (GDPR) requires the unambiguous consent of each of these patients to the use of their data. This consent must be given for a specific purpose, in this case training an AI or ML algorithm and using it in the clinical environment, and it is required even if the data is anonymised. However, the GDPR provides for an exception, namely if there is a "public interest" in the data, which is probably beyond doubt in the corona crisis (7, 8).

Perhaps you are now wondering why your telecommunications provider is allowed to sell anonymised connection data (9, 10). After all, this is also personal data, i.e. your very personal data. The key lies in the definition of the word "anonymised".  

According to current case law and interpretation, data only counts as anonymised if the anonymisation or encryption cannot be reversed, at least not by the party using the data. With telephone providers, the matter is quite simple: the companies must destroy the data after half a year. And once the data no longer exists, the encryption can no longer be undone.

With medical data, such a procedure is of course out of the question. And even for the training of neural networks, it is necessary to be able to trace back the data in rare cases. For example, if doubts arise in the error analysis as to whether a certain disease is actually correctly labelled (i.e. diagnosed) on a blood smear or an X-ray.  
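
The difference between data that can no longer be traced back and data that can still be traced back in rare cases can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not a description of any clinic's or provider's actual procedure: the names `pseudonymise` and `PseudonymRegistry`, the choice of a keyed hash (HMAC-SHA256) and the example IDs are all hypothetical. The idea is that only the data holder keeps the secret key and the lookup table; whoever trains the model sees just the pseudonym.

```python
import hashlib
import hmac
from typing import Dict, Optional


def pseudonymise(patient_id: str, secret_key: bytes) -> str:
    """Return a keyed hash (HMAC-SHA256, hex encoded) of a patient ID."""
    return hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256).hexdigest()


class PseudonymRegistry:
    """Hypothetical lookup table that stays with the data holder (e.g. the clinic)."""

    def __init__(self, secret_key: bytes) -> None:
        self._secret_key = secret_key
        self._lookup: Dict[str, str] = {}

    def register(self, patient_id: str) -> str:
        """Create the pseudonym and remember which patient it belongs to."""
        pseudonym = pseudonymise(patient_id, self._secret_key)
        self._lookup[pseudonym] = patient_id
        return pseudonym

    def trace_back(self, pseudonym: str) -> Optional[str]:
        """Resolve a pseudonym, e.g. to re-check a doubtful label; None if unknown."""
        return self._lookup.get(pseudonym)

    def destroy(self) -> None:
        """Discard key and table; the registry can then no longer resolve pseudonyms."""
        self._lookup.clear()
        self._secret_key = b""


if __name__ == "__main__":
    registry = PseudonymRegistry(secret_key=b"example-key-held-by-the-clinic")
    pseudonym = registry.register("patient-001")  # hypothetical patient ID
    print(registry.trace_back(pseudonym))  # resolvable while the table exists
    registry.destroy()
    print(registry.trace_back(pseudonym))  # no longer resolvable
```

In this sketch, calling `destroy()` corresponds to the telephone providers' situation, while keeping the registry under the data holder's control corresponds to the medical case, where a label occasionally has to be checked against the original record.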

Of course, data protection is important, especially in times of digitalisation. However, health data can save lives, so we must both ensure that the data is protected and make use of the possibilities it offers. Various organisations, including the US National Institutes of Health (NIH) (11,12), as well as many international initiatives such as "AI for Good" (13) or the "Roundtable on Global Initiative and Data Commons" (14), take the view that our data, anonymised and handled in accordance with strict standards and security requirements, is a common good.

And why not?  

After all, our health minister is of the opinion that we should have no right to our organs in the case of brain death, unless we have expressly objected to a transplant beforehand. The Bundestag did not go that far; instead, citizens are now to be asked more often whether they want to donate their organs after their death.  

The possibilities of machine learning show how important a similar solution for data would be right now. For example, an algorithm that detects corona infections on lung X-rays or even on computed tomography (CT) scans would have the following effects in a crisis:

  • According to media reports, doctors currently wait up to three days or even longer for the results of a corona test, at least if they do not have their own laboratory. With the algorithm, they could get a much faster diagnosis, especially for patients with a severe course of the disease.
  • When intensive care capacities are overloaded, they could isolate patients infected with Covid-19 in a more targeted manner, which would both relieve the burden on the wards and reduce the risk of infection in the clinics.
  • The algorithm could also help doctors who have had little or no contact with corona-infected patients so far, especially in ruling out corona infections.
  • In a later phase, the algorithm could possibly also be used to identify patients who are at risk of a severe course of the disease, which would enable targeted treatment as early as possible.

Initial research in this area is in full swing. Hospitals in Wuhan, for example, are already using a similar algorithm, but the data is not publicly available (15, 16). Furthermore, a data set containing only Chinese patients would be highly problematic due to possible bias. In the USA, a scientist at Stanford University (17) is currently building a publicly available dataset together with doctors, but the number of images uploaded so far, 105 X-rays of 65 patients (as of Sunday, 22 March 2020), is far too small to train a reliable algorithm.
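
The figures above also show that some patients contribute more than one image (105 X-rays from 65 patients). When such a data set is used to evaluate an algorithm, it is common practice to keep all images of one patient on the same side of the train/test split, so that the model is not tested on patients it has already seen. The following Python sketch shows one way to do this; it is only an illustration, and the function name `split_by_patient`, the file names and the 20 % test share are assumptions, not part of the cited dataset or of our project.

```python
import random
from collections import defaultdict
from typing import Dict, List, Tuple


def split_by_patient(
    image_to_patient: Dict[str, str],
    test_fraction: float = 0.2,
    seed: int = 0,
) -> Tuple[List[str], List[str]]:
    """Split images into train and test lists so that no patient appears in both."""
    # Group the images by the patient they belong to.
    images_by_patient = defaultdict(list)
    for image, patient in image_to_patient.items():
        images_by_patient[patient].append(image)

    # Shuffle the patients (not the images) reproducibly.
    patients = sorted(images_by_patient)
    random.Random(seed).shuffle(patients)

    # Reserve a share of the patients, at least one, for testing.
    n_test = max(1, round(len(patients) * test_fraction))
    test_patients = set(patients[:n_test])

    train = [img for p in patients if p not in test_patients for img in images_by_patient[p]]
    test = [img for p in patients if p in test_patients for img in images_by_patient[p]]
    return train, test


if __name__ == "__main__":
    # Hypothetical file names and patient IDs, for illustration only.
    example = {
        "a_day1.png": "patient-A",
        "a_day3.png": "patient-A",
        "b_day1.png": "patient-B",
        "c_day1.png": "patient-C",
        "c_day2.png": "patient-C",
    }
    train_images, test_images = split_by_patient(example)
    print(len(train_images), len(test_images))
```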

As far as I know, there is currently no comparable initiative in Europe. Alexander Thamm GmbH could programme and train such an algorithm within a very short time and make it available to all interested doctors and clinics free of charge. In addition to hundreds of industrial projects in machine learning, we also have experience in the medical field, for example with:

  • the detection of pneumonia on X-ray images 
  • the identification of malaria infections in images of blood smears 
  • the classification of proteins in microscope images 
  • the identification of the severity of diabetic retinopathy from images of the eyeball

To realise this project and make the algorithm available quickly in the tense corona situation, we are looking for two things: contacts at clinics that can provide us with anonymised data, i.e. lung X-rays or CT scans, and sponsors to support the project. As a socially responsible company, we would bear part of the costs ourselves.

We are happy to provide a project outline to seriously interested sponsors and/or cooperation partners from the medical sector. For this purpose, please contact:

Andreas Gillhuber (Co-CEO)
andreas.gillhuber@alexanderthamm.com

We would be very happy if we could help in this situation. Thank you very much for your interest and attention,
 
Alexander Thamm 
Founder & CEO - Alexander Thamm GmbH

Sources

  1. https://arxiv.org/pdf/1711.05225.pdf 
  2. https://www.nature.com/articles/s41746-019-0189-7 
  3. https://www.researchgate.net/publication/332049903_An_Efficient_Deep_Learning_Approach_to_Pneumonia_Classification_in_Healthcare  
  4. https://ieeexplore.ieee.org/document/8869364 
  5. https://cs.stanford.edu/people/esteva/nature/ 
  6. https://www.digitale-technologien.de/DT/Redaktion/DE/Downloads/Publikation/SSW_Policy_Paper_KI_Medizin.pdf?__blob=publicationFile&v=4 
  7. https://dejure.org/gesetze/DSGVO/6.html 
  8. https://staufer.de/blog/2019/06/dsgvo-schriftliche-einwilligung-patienten/
  9. https://www.mdr.de/datenspuren/datenbroker-daten-handel-100.html 
  10. https://netzpolitik.org/2016/mobilfunkbetreiber-telefonica-macht-jetzt-daten-seiner-kunden-zu-geld/ 
  11. https://www.ncbi.nlm.nih.gov/books/NBK54304/ 
  12. https://commonfund.nih.gov/commons/awardees 
  13. https://medium.com/berkman-klein-center/data-commons-version-1-0-a-framework-to-build-toward-ai-for-good-73414d7e72be 
  14. https://www.itu.int/en/ITU-T/extcoop/ai-data-commons/Pages/default.aspx 
  15. https://jamanetwork.com/journals/jama/fullarticle/2762997 
  16. https://www.alizila.com/how-damo-academys-ai-system-detects-coronavirus-cases/ 
  17. https://github.com/ieee8023/covid-chestxray-dataset 

ALEXANDER THAMM

Alexander Thamm is Founder, CEO and pioneer in the field of data & AI. His mission is to generate real added value from data and restore Germany's and Europe's international competitiveness. He is a founding member and regional manager of the KI-Bundesverband e.V., a sought-after speaker, author of numerous publications and co-founder of the DATA Festival, where AI experts and visionaries shape the data-driven world of tomorrow. In 2012, he founded Alexander Thamm GmbH [at], which is one of the leading providers of data science & artificial intelligence in the German-speaking world.
