Why a Coronavirus canSAR?
Selfish reasons, really. Cancer research labs across the world have shut down. Recruitment of new cancer patients to innovative, potentially life-saving clinical trials has all but stopped, and cancer referrals and diagnoses have plummeted. We have to do something to reinstate our battle against cancer in full.
Also, as we explain below, all drug discovery learning is useful learning. We have learned much and developed useful new tools for Coronavirus canSAR that we will redeploy into our oncology canSAR.
As we follow with interest the developments with Covid-19, we have become increasingly concerned about the growing cacophony of opinions and misinformation, from claims about drug activity in patients to recommendations of 'drug-like' compounds for testing. Meanwhile, there are likely to be genuinely valuable opportunities that are being missed precisely because of the noise and chaos.
There is clearly a need for an objective web resource on coronavirus research/drug discovery that can provide data at scale – and one that is automatically kept up to date for the benefit of the mechanistic research and drug discovery community.
canSAR was therefore a natural starting point from which to build a Covid-19 resource, as it can be rapidly adapted to provide much of the objective data needed. canSAR is not only a place to search for objective, up-to-date information; it also comes with unique capabilities such as our comprehensive druggability assessments, 3D structural analyses and druggable protein interactomes, all with fully interlinked data.
We are cancer drug discoverers and we built the oncology canSAR to help empower smarter drug discovery, so that researchers can:
- Make smart use of the vast amounts of data available, and access key information quickly.
- Make objective decisions about what drug targets to select, what models to use, and what compounds and chemical tools look good, bad or ugly!
- Discover hidden opportunities and risks from data.
Integrate not collate
To achieve this, we developed canSAR to integrate billions of experimental measurements from biology, chemistry, pharmacology, systems biology, structural biology and more. We designed canSAR to link these data comprehensively across all of these domains. So, from a clinical trial, you can rapidly link to the cellular networks that involve the proteins targeted by the drug in the trial. From a biological sample, you can with a few clicks hypothesise key genes and find chemical tools and models to help validate them.
Big, Big data from oncology and far beyond
While some data in canSAR are oncology-specific, such as cancer genome sequencing or cancer cell line vulnerability data, most of the data in canSAR are agnostic with respect to therapeutic area. For example, canSAR contains all of the >500,000 individual protein 3D structures from the PDB, regardless of organism or role in disease. Similarly, the >2 million bioactive small molecules and >8,000,000 pharmacological data points are disease- and organism-agnostic.
We analysed >8100 cavities on >430 protein structures (857 PDB chains). As a result of our analysis, we identified 284 ligandable pockets and 339 potentially ligandable cavities at the interfaces of protein complexes (biological assemblies). We found novel ligandable cavities at the interface between the coronavirus Spike protein and its receptor ACE2, and at the complex’s interface with the Sodium-dependent neutral amino acid transporter B(0)AT1.
AI to discover hidden opportunities
Once we had integrated these data, we developed machine learning tools to learn from them and predict new therapeutic opportunities. For this, using as much data as possible from across biology and chemistry was key. We trained our algorithms on protein 3D-structural information from as many proteins and as many different organisms as possible. Similarly, the protein-protein interaction networks that we curate and our predictive analytics are trained on protein networks well outside oncology. This way, we harness big data to inform oncology (and the wider drug discovery field).
Drug discovery always crosses boundaries
We built canSAR to be this comprehensive to allow cancer scientists worldwide to explore novel areas of biology as they become relevant to cancer, and also to learn and use tools from other fields. It is well known that many human proteins play a role in different diseases, and many chemical tools can be useful to examine them. At the same time, we recognised that canSAR is indeed already being utilised by researchers in other fields – such as neurodegeneration and inflammation. And of course, stimulating or suppressing the immune system is a common strategy across different disease areas.
What's different in Coronavirus canSAR
On top of all the valuable information in canSAR, we are also curating much valuable information from viral biology and drug discovery. For example, we are curating protein networks that are hijacked by coronaviruses old and new, and looking for novel drug target opportunities within them. We update information on coronavirus clinical trials daily. We are curating chemical probes for use in coronavirus research. Finally, we are generating ‘live’ reports that summarise this information.
Coronavirus canSAR is being continually updated for the coronavirus research community. More information will be added. We are also painstakingly curating some data (such as drugs) - it's all a work in progress. We invite users to send feedback on anything that would make the resource more useful (contact us).