Integrated Data Sources

| # | Data Source Name | Category | Description | Web source |
|---|---|---|---|---|
| 1 | AACR Project GENIE | Cancer Genomics | global cancer registry sharing real-world data from top cancer centers | GENIE |
| 2 | BioGRID | Interaction | Protein, Genetic and Chemical Interactions | BioGRID |
| 3 | Biomart | Gene | mappings of gene IDs from BioMart using pybiomart | BioMart |
| 4 | Borealis - list2 | Gene | conservation scores from the Canadian research data repository | Borealis: The Canadian Dataverse Repository |
| 5 | cBioPortal | Cancer Genomics | Open-source resource for interactive exploration of multidimensional cancer genomics data sets | cBioPortal |
| 6 | Cell Ontology | Terminology | an ontology of cell types | Cell Ontology |
| 7 | Cellosaurus | Terminology | a knowledge resource on cell lines | Cellosaurus |
| 8 | ChEBI | Drug/Compound | IDs and accession numbers only | Chemical Entities of Biological Interest (ChEBI) |
| 9 | ChEMBL | Drug/Compound | a manually curated database of bioactive molecules with drug-like properties | ChEMBL Database |
| 10 | ClinVar | Genetics | aggregated information about genomic variation and its relationship to human health | ClinVar |
| 11 | Compartments | Gene | protein subcellular localization from manually curated literature, high-throughput screens, automatic text mining, and sequence-based prediction methods | COMPARTMENTS |
| 12 | Complex Portal | Interaction | a manually curated, encyclopaedic resource of macromolecular complexes from a number of key model organisms (only human data is included) | Complex Portal |
| 13 | dbSNP | Genetics | contains human single nucleotide variations, microsatellites, and small-scale insertions and deletions along with publication, population frequency, molecular consequence, and genomic and RefSeq mapping information for both common variations and clinical mutations | dbSNP |
| 14 | depmap | Expression | Cancer Dependency Map data | The Cancer Dependency Map Project at Broad Institute |
| 15 | Drugbank | Drug/Compound | drug information | DrugBank |
| 16 | Drugcentral | Drug/Compound | drug information | Drug Central |
| 17 | EBI – GWAS Catalog | Gene | catalog of human genome-wide association studies | GWAS Catalog |
| 18 | EBI – HGNC | Gene | resource for approved human gene nomenclature | HGNC |
| 19 | EFO | Terminology | systematic description of experimental variables available in EBI databases | EFO - The Experimental Factor Ontology < EMBL-EBI |
| 20 | Ensembl – Gene | Gene | gene information from Ensembl | Ensembl |
| 21 | Expasy - Enzyme | Protein | the nomenclature of enzymes | Expasy - ENZYME |
| 22 | FDA - Adverse Event Reporting System | Drug/Compound | The FDA Adverse Event Reporting System (FAERS) is a database that contains information on adverse event and medication error reports submitted to FDA. | FAERS |
| 23 | FDA - UNII | Clinical | FDA's global Substance Registration System enables an efficient and accurate exchange of information on substances through their Unique Ingredient Identifiers (UNIIs), which can be generated at any time in the regulatory life cycle | FDA's Global Substance Registration System |
| 24 | Gencode | Gene | GENCODE for mouse | GENCODE - Mouse Release M32 |
| 25 | Geneontology | Pathway | information on the functions of genes | Gene Ontology |
| 26 | GeneRIF | Gene | functional annotation of genes | About Gene RIF - Gene - NCBI |
| 27 | GSEA - MSigDB | Pathway | curated gene sets (C2) from MSigDB | Human MSigDB Collections |
| 28 | GTExPortal | Expression | gene expression data from samples collected from 54 non-diseased tissue sites across nearly 1000 individuals | GTEx Portal |
| 29 | HomoloGene | Gene | an automated system for constructing putative homology groups from the complete gene sets of a wide range of eukaryotic species | Home - HomoloGene - NCBI |
| 30 | Human protein atlas | Protein | Transcript expression values across normal and cancer tissue | Protein atlas |
| 31 | IGV Tracks | Gene | Human tracks for the Integrative Genomics Viewer, containing NCBI reference sequences, ClinVar variants, nested repeats and cis-regulatory elements | IGV |
| 32 | IntAct | Interaction | molecular interaction data | IntAct |
| 33 | InterPro | Protein | the classification of protein families, predicting domains and important sites | InterPro |
| 34 | JASPAR | Genetics | database of manually curated, non-redundant transcription factor binding profiles | JASPAR |
| 35 | KEGG | Genetics/Pathway | resource for understanding high-level functions of biological systems. Includes KEGG data for drugs, variants, diseases, pathways, ligands and genes | KEGG |
| 36 | MeSH | Terminology | medical subject headings | Home - MeSH - NCBI |
| 37 | miRWalk | Chemical Biology | miRNA-target interactions | miRWalk |
| 38 | Monarch | Interaction | Gene – disease – phenotype relationships retrieved from the Monarch knowledge graph | Monarch |
| 39 | Mondo | Terminology | disease ontology aimed at harmonizing disease definitions across the world | Mondo |
| 40 | MyGene | Gene | gene annotation data | MyGene.info |
| 41 | NA (Paper) | Chemical Biology | data from paper: 'Reimagining high-throughput profiling of reactive cysteines for cell-based screening of large electrophile libraries' | Reimagining high-throughput profiling of reactive cysteines for cell-based screening of large electrophile libraries \| Nature Biotechnology |
| 42 | NA (Paper) | Chemical Biology | data from paper: 'hUbiquitome: a database of experimentally verified ubiquitination cascades in humans' | hUbiquitome: a database of experimentally verified ubiquitination cascades in humans |
| 43 | NA (Paper) | Chemical Biology | data from paper: 'The PROTACtable genome' | The PROTACtable genome \| Nature Reviews Drug Discovery |
| 44 | NA (Paper) | Protein | data from paper: 'Proteome-wide mapping of short-lived proteins in human cells' | Proteome-wide mapping of short-lived proteins in human cells |
| 45 | NCBI - Gene | Gene | human gene ID, name, symbol and synonyms from NCBI Gene | Home - Gene - NCBI |
| 46 | NCBI - Orthologs | Gene | gene_orthologs table from NCBI Gene | How are orthologs calculated? - NCBI |
| 47 | NCBI - Taxonomy | Gene | a curated classification and nomenclature for all of the organisms in the public sequence databases | Home - Taxonomy - NCBI |
| 48 | OMIM | Genetics | catalog of human genes and genetic disorders | OMIM |
| 49 | OpenTargets – Target/Disease evidence | Genetics | evidence between targets and diseases from OpenTargets | OpenTargets |
| 50 | PDB - Ligand expo | Protein | chemical and structural information about small molecules within the structure entries of the Protein Data Bank | Ligand Expo |
| 51 | PDBbind-CN | Protein | a comprehensive collection of experimentally measured binding affinity data for all biomolecular complexes deposited in the Protein Data Bank | PDBbind |
| 52 | Pig RNA atlas | Gene | Gene expression for protein-coding genes in 98 pig tissues and their human orthologs | Pig-Atlas |
| 53 | Protac-DB | Drug/Compound | Database of proteolysis targeting chimeras | Protac-DB |
| 54 | Reactome | Pathway | ChEBI to Reactome mapping | Reactome |
| 55 | Reactome | Pathway | signaling and metabolic molecules and their relations organized into biological pathways and processes | Reactome |
| 56 | SAbDab | Protein | all the antibody structures available in the PDB, annotated and presented in a consistent fashion | SAbDab: The Structural Antibody Database |
| 57 | Scannet | NA | NA | NA |
| 58 | Selleckchem | Drug/Compound | compound data (SMILES ...) | Selleck Chemicals |
| 59 | SIFTS | Protein | Structure Integration with Function, Taxonomy and Sequence (SIFTS) is a project in the PDBe-KB resource for residue-level mapping between UniProt and PDB entries | SIFTS < PDBe < EMBL-EBI |
| 60 | STRING | Genetics | protein interactions | STRING |
| 61 | Swiss-Model | Protein | annotated 3D protein structure models generated by the SWISS-MODEL homology-modelling pipeline | SWISS-MODEL Repository |
| 62 | TCGA - The cancer genome atlas program | Expression | data from over 20,000 molecularly characterized primary cancer and matched normal samples spanning 33 cancer types | The Cancer Genome Atlas Program (TCGA) - NCI |
| 63 | The Human Protein Atlas | Expression | expression profiles of genes in human tissues at both the mRNA and protein level | The Human Protein Atlas |
| 64 | Uberon | Terminology | cross-species anatomy ontology | Uber-anatomy ontology |
| 65 | UbiNet | Drug/Compound | updated, validated, and abundant E3-substrate interactions, detailed classification of human E3 ligases | UbiNet 2.0 |
| 66 | UMLS | Terminology | medical and biomedical vocabularies and standards | UMLS Knowledge Sources: File Downloads |
| 67 | Uniprot | Protein/Gene | orthologous protein data | UniProt |
| 68 | Uniprot | Protein | protein to protein family mapping | UniProt |
| 69 | Uniprot - OrthoDB | Protein | database of orthologous groups | OrthoDB \| Cross-referenced databases \| UniProt |
| 70 | Uniprot - Subcellular location | Protein | Location of proteins within cells | Subcellular location \| UniProt help |
| 71 | Uniprot - Swiss-Prot | Protein/Gene | curated protein data | UniProt |
| 72 | Uniprot - Trembl | Protein | non-curated, computationally analyzed records | UniProt |
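
Some source names in the table appear more than once (for example, Uniprot in rows 67 and 68), and some categories are compound (for example, Protein/Gene), so a name alone is not a unique key. The following is a minimal sketch of how such a catalog could be held in code, assuming a hypothetical `DataSource` record and a hypothetical `by_category` helper. Neither is part of any integration described here, and the four sample rows are copied from the table above.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSource:
    """One catalog row. Field names are illustrative, not an official schema."""
    number: int       # row number from the table; the only unique key
    name: str
    category: str     # may be compound, e.g. "Protein/Gene"
    description: str


# Sample rows copied from the table (rows 2, 32, 67 and 68).
SOURCES = [
    DataSource(2, "BioGRID", "Interaction", "Protein, Genetic and Chemical Interactions"),
    DataSource(32, "IntAct", "Interaction", "molecular interaction data"),
    DataSource(67, "Uniprot", "Protein/Gene", "orthologous protein data"),
    DataSource(68, "Uniprot", "Protein", "protein to protein family mapping"),
]


def by_category(sources):
    """Map each category to the row numbers filed under it.

    Compound categories such as "Protein/Gene" are split on "/" so the
    row is listed under every part.
    """
    groups = defaultdict(list)
    for src in sources:
        for part in src.category.split("/"):
            groups[part].append(src.number)
    return dict(groups)


# Index by row number rather than by name, because names repeat.
BY_NUMBER = {src.number: src for src in SOURCES}

if __name__ == "__main__":
    for category, numbers in by_category(SOURCES).items():
        print(category, numbers)
```

Keying on the row number keeps both Uniprot entries distinct, while splitting on "/" lets a compound-category row be found under each of its categories.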