CovidGraph was founded in early 2020 as a nonprofit collaboration of researchers, software developers, data scientists, and medical professionals. In April 2021, CovidGraph became part of HealthECCO and will continue under this umbrella.
The research and communication platform we built starting with the CovidGraph project currently includes over 130,000 publications, case statistics, genes and functions, molecular data, and more. All of this forms the basis for current and future projects within the HealthECCO initiative.
Who Is This Project Aimed At?
Our aim is to help researchers quickly and efficiently find their way through COVID-19 datasets and to provide tools that use artificial intelligence, advanced visualization techniques, and intuitive user interfaces. These tools allow users to explore papers, patents, existing treatments, and medications related to the coronavirus family.
In addition to literature data, we connected information on fundamental biological entities (namely genes, proteins, and their functions), spanning a network of unparalleled size and knowledge.
The knowledge is primarily centered on the domain of coronaviruses but is being steadily extended to other connected diseases.
The CovidGraph project provides a growing number of applications for interacting with the data stored in the Knowledge Graph. These apps are free of charge and require no registration or sign-up.
You can scroll through our available applications below:
Neo4j Bloom is an easy-to-use graph exploration application for visually interacting with Neo4j graphs. A public version is available at:
Visual Graph Explorer
This application is provided by yWorks and allows you to explore the knowledge data in a visual way.
The Neo4j browser is a user interface for querying the graph directly at the database level by pattern matching via Cypher. It offers a basic visualization of the result graph as well as data export and API access. A public version is available at:
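To make the idea of Cypher pattern matching concrete, here is a minimal sketch in Python that builds a parameterized query and, optionally, runs it with the official neo4j driver. The node labels (Paper, Gene), the relationship type (MENTIONS), and the property names (symbol, title) are assumptions made for illustration and may not match the actual CovidGraph schema. The connection URI and credentials are likewise placeholders that the caller has to supply.

```python
"""Sketch: a parameterized Cypher pattern match.

All labels, relationship types, and property names below are hypothetical;
check the real CovidGraph schema in the Neo4j browser before using them.
"""


def build_gene_mention_query(gene_symbol, limit=10):
    """Return (cypher, params) for papers linked to a given gene symbol."""
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    # The pattern (p)-[:MENTIONS]->(g) matches every Paper node that has an
    # outgoing MENTIONS relationship to a Gene node with the given symbol.
    cypher = (
        "MATCH (p:Paper)-[:MENTIONS]->(g:Gene {symbol: $symbol}) "
        "RETURN p.title AS title "
        "LIMIT $limit"
    )
    # Values are passed as parameters rather than pasted into the query text.
    params = {"symbol": gene_symbol, "limit": int(limit)}
    return cypher, params


def run_query(uri, user, password, cypher, params):
    """Run the query with the neo4j Python driver and return rows as dicts."""
    from neo4j import GraphDatabase  # third-party package, imported lazily

    driver = GraphDatabase.driver(uri, auth=(user, password))
    try:
        with driver.session() as session:
            return [record.data() for record in session.run(cypher, params)]
    finally:
        driver.close()
```

Calling build_gene_mention_query("ACE2", limit=5) returns the query text together with its parameter dictionary. Both can be passed to run_query, or the query text can be pasted into the Neo4j browser with the parameters set by hand.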
The CovidGraph development team uses the new SemSpect Graph App for Neo4j for data quality checks and as a query tool. A public version is available at:
We integrate data from various sources and link them in our knowledge graph:
COVID-19 Open Research Dataset (CORD-19)
In response to the COVID-19 pandemic, the Allen Institute for AI has partnered with leading research groups to prepare and distribute the COVID-19 Open Research Dataset (CORD-19), a free resource of over 44,000 scholarly articles, including over 29,000 with full text, about COVID-19 and the coronavirus family of viruses for use by the global research community.
The Lens COVID-19 Datasets
The Lens has assembled free and open datasets of patent documents, scholarly research works metadata and biological sequences from patents, and deposited them in a machine-readable and explorable form.
Ensembl Genome Browser
Ensembl is a genome browser for vertebrate genomes that supports research in comparative genomics, evolution, sequence variation and transcriptional regulation. Ensembl annotates genes, computes multiple alignments, predicts regulatory function and collects disease data. Ensembl tools include BLAST, BLAT, BioMart and the Variant Effect Predictor (VEP) for all supported species.
NCBI Gene Database
Gene integrates information from a wide range of species. A record may include nomenclature, Reference Sequences (RefSeqs), maps, pathways, variations, phenotypes, and links to genome-, phenotype-, and locus-specific resources worldwide.
The Gene Ontology Resource
The Gene Ontology (GO) knowledgebase is the world’s largest source of information on the functions of genes. This knowledge is both human-readable and machine-readable, and is a foundation for computational analysis of large-scale molecular biology and genetics experiments in biomedical research.
2019 Novel Coronavirus COVID-19 (2019-nCoV) Data Repository by Johns Hopkins CSSE
This is the data repository for the 2019 Novel Coronavirus Visual Dashboard operated by the Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE). It is also supported by the ESRI Living Atlas Team and the Johns Hopkins University Applied Physics Lab (JHU APL).
United Nations World Population Prospects 2019
The 2019 Revision of World Population Prospects is the twenty-sixth round of official United Nations population estimates and projections that have been prepared by the Population Division of the Department of Economic and Social Affairs of the United Nations Secretariat.
Systems Biology Models
Systems biology data is integrated from MaSyMoS (Management System for Models and Simulations), a Neo4j graph database. MaSyMoS stores computational models, simulation descriptions and associated metadata, including a collection of COVID-19 models from BioModels.
The ClinicalTrials.gov resource provides clinical trial data, including trials concerning COVID-19.
The NCBI Reference Sequence Database is a collection of reference sequences, including transcripts.
The Universal Protein Resource contains protein sequences and associated functional data.
Reactome is a pathway database that stores information about biological pathways.
The GTEx Portal is a tissue database containing gene expression data.
The Infectious Disease Ontology is an ontology for human diseases.
Human Phenotype Ontology
Terms in the Human Phenotype Ontology represent phenotypes of human hereditary diseases.
Phenotype and Trait Ontology
The Phenotype and Trait Ontology contains information about phenotypic qualities.
Mammalian Phenotype Ontology
The Mammalian Phenotype Ontology describes mammalian phenotypes.
ChEBI, short for Chemical Entities of Biological Interest, is an ontology for molecular entities such as metabolites.
The biomedical data network Hetionet includes information about diseases and anatomy.