
Code availability

The code for preprocessing and for predictions can be found at GitHub ( The Swarm Learning software can be downloaded from


Fast and reliable detection of patients with severe and heterogeneous illnesses is a major goal of precision medicine [1,2]. Patients with leukaemia can be identified using machine learning on the basis of their blood transcriptomes [3]. However, there is an increasing divide between what is technically possible and what is allowed, because of privacy legislation [4,5]. Here, to facilitate the integration of any medical data from any data owner worldwide without violating privacy laws, we introduce Swarm Learning, a decentralized machine-learning approach that unites edge computing, blockchain-based peer-to-peer networking and coordination while maintaining confidentiality without the need for a central coordinator, thereby going beyond federated learning. To illustrate the feasibility of using Swarm Learning to develop disease classifiers using distributed data, we chose four use cases of heterogeneous diseases (COVID-19, tuberculosis, leukaemia and lung pathologies). With more than 16,400 blood transcriptomes derived from 127 clinical studies with non-uniform distributions of cases and controls and substantial study biases, as well as more than 95,000 chest X-ray images, we show that Swarm Learning classifiers outperform those developed at individual sites. In addition, Swarm Learning completely fulfils local confidentiality regulations by design. We believe that this approach will notably accelerate the introduction of precision medicine.

HPE presents its concept of Swarm Learning. AI Innovation of Sweden is going deeper into the subject of federated learning by arranging a series of webinars related to the strategic area of decentralized AI.


Identification of patients with life-threatening diseases, such as leukaemias, tuberculosis or COVID-19 [6,7], is an important goal of precision medicine [2]. The measurement of molecular phenotypes using ‘omics’ technologies [1] and the application of artificial intelligence (AI) approaches [4,8] will lead to the use of large-scale data for diagnostic purposes. Yet, there is an increasing divide between what is technically possible and what is allowed because of privacy legislation [5,9,10]. Particularly in a global crisis [6,7], reliable, fast, secure, confidentiality- and privacy-preserving AI solutions can facilitate answering important questions in the fight against such threats [11,12,13]. AI-based concepts range from drug target prediction [14] to diagnostic software [15,16]. At the same time, we need to consider important standards relating to data privacy and protection, such as Convention 108+ of the Council of Europe [17].

AI-based solutions rely intrinsically on appropriate algorithms [18], but even more so on large training datasets [19]. As medicine is inherently decentral, the volume of local data is often insufficient to train reliable classifiers [20,21]. As a consequence, centralization of data is one model that has been used to address the local limitations [22]. While beneficial from an AI perspective, centralized solutions have inherent disadvantages, including increased data traffic and concerns about data ownership, confidentiality, privacy, security and the creation of data monopolies that favour data aggregators [19]. Consequently, solutions to the challenges of central AI models must be effective, accurate and efficient; must preserve confidentiality, privacy and ethics; and must be secure and fault-tolerant by design [23,24]. Federated AI addresses some of these aspects [19,25]. Data are kept locally and local confidentiality issues are addressed [26], but model parameters are still handled by central custodians, which concentrates power. Furthermore, such star-shaped architectures decrease fault tolerance.
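
The star-shaped layout of federated learning can be made concrete with a toy sketch. The Python below is an illustration written for this post, not code from the article: the names `local_train` and `central_server_round`, the one-parameter model (estimating a mean) and the sample-count weighting are all assumptions. It shows two points from the paragraph above: raw samples stay at each site, while every parameter update passes through one central custodian, so the round cannot run if that custodian is down.

```python
from typing import List


def local_train(weight: float, samples: List[float],
                lr: float = 0.25, steps: int = 20) -> float:
    """Fit a one-parameter toy model (the sample mean) on local data only."""
    for _ in range(steps):
        # Gradient of the mean squared error between the weight and the samples.
        grad = sum(2 * (weight - x) for x in samples) / len(samples)
        weight -= lr * grad
    return weight


def central_server_round(global_weight: float, sites: List[List[float]],
                         server_up: bool = True) -> float:
    """One federated round: sites train locally, the central custodian merges."""
    if not server_up:
        # Single point of failure: without the custodian nobody can merge.
        raise RuntimeError("central custodian unavailable")
    # Only (trained weight, sample count) pairs leave a site, never raw samples.
    updates = [(local_train(global_weight, s), len(s)) for s in sites]
    total = sum(n for _, n in updates)
    # Merge weighted by sample count (an assumption of this sketch).
    return sum(w * n for w, n in updates) / total


# Hypothetical data held at two sites.
sites = [[1.0, 2.0, 3.0], [10.0]]
print(central_server_round(0.0, sites))
```

Calling `central_server_round(0.0, sites, server_up=False)` raises an error for every site at once, which is the fault-tolerance weakness the text attributes to star-shaped architectures.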

We hypothesized that completely decentralized AI solutions would overcome current shortcomings, and accommodate inherently decentral data structures and data privacy and security regulations in medicine. The solution (1) keeps large medical data locally with the data owner; (2) requires no exchange of raw data, thereby also reducing data traffic; (3) provides high-level data security; (4) guarantees secure, transparent and fair onboarding of decentral members of the network without the need for a central custodian; (5) allows parameter merging with equal rights for all members; and (6) protects machine learning models from attacks. Here, we introduce Swarm Learning (SL), which combines decentralized hardware infrastructures and distributed machine learning based on standardized AI engines with a permissioned blockchain that is used to securely onboard members, to dynamically elect the leader among members, and to merge model parameters. Computation is orchestrated by an SL library (SLL) and an iterative AI learning procedure that uses decentral data (Supplementary Information).
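
The properties listed above can be illustrated with a minimal Python sketch of one swarm round. It is not the Swarm Learning library or its API: `Node`, `elect_leader` and `swarm_round` are hypothetical names, the `permitted` set is a plain stand-in for onboarding through the permissioned blockchain, the hash-based election is a stand-in for the real leader election, and the unweighted mean is one assumed reading of "equal rights" merging.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Node:
    node_id: str
    samples: List[float]  # raw data: only read inside this node's own methods
    weight: float = 0.0   # the single parameter of a toy model

    def train_locally(self, lr: float = 0.25, steps: int = 20) -> None:
        """Gradient descent on local data; fits the local sample mean."""
        for _ in range(steps):
            grad = sum(2 * (self.weight - x) for x in self.samples) / len(self.samples)
            self.weight -= lr * grad


def elect_leader(member_ids: List[str], round_no: int) -> str:
    """Pick a leader that every member can compute identically (assumed scheme)."""
    def ticket(node_id: str) -> str:
        return hashlib.sha256(f"{round_no}:{node_id}".encode()).hexdigest()
    return min(member_ids, key=ticket)


def swarm_round(nodes: List[Node], permitted: Set[str], round_no: int) -> str:
    """One round: onboarded members train, a per-round leader merges parameters."""
    members = [n for n in nodes if n.node_id in permitted]
    if not members:
        raise ValueError("no onboarded members")
    for n in members:
        n.train_locally()
    leader = elect_leader([n.node_id for n in members], round_no)
    # The leader merges parameters only; every member counts equally.
    merged = sum(n.weight for n in members) / len(members)
    for n in members:
        n.weight = merged
    return leader


# Hypothetical network: node "C" has not been onboarded and is ignored.
nodes = [Node("A", [1.0, 2.0, 3.0]), Node("B", [10.0]), Node("C", [100.0])]
print(swarm_round(nodes, permitted={"A", "B"}, round_no=1), nodes[0].weight)
```

Because the leader is recomputed every round from the current member list, no fixed custodian exists in this sketch: if a member leaves, the remaining members elect among themselves and the round still completes.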

