Today, data drives drug discovery. You and your competitors now have access to an explosion of biological data. But that vast array of public data is only useful if you have the tools and IT resources to integrate, explore, and compare it against your in-house experimental results and clinical knowledge. Only then can you verify the biological meaning behind interesting patterns and findings in data visualizations and use them to drive strategic decisions.
Jumpstart your transition into the age of drug discovery intelligence with a new visyn.
We regularly curate, integrate, and update the visyn Knowledgebase, so you always have access to the most current publicly available biological evidence. We also provide universal tools that simplify import and integration, so you can further refine the knowledgebase and shape it to support your research focus.
Take a closer look at the vast array of biological intelligence included in the current version of the visyn Knowledgebase, which includes data from the Universal Protein Resource (UniProt), Genotype-Tissue Expression Portal (GTEx), ChEMBL, and Reactome, as well as cancer research-specific resources such as the cBioPortal for Cancer Genomics, Cancer Dependency Map Project (DepMap), and The Cancer Genome Atlas (TCGA).
# | Data Source Name | Category | Data Extracted | Description | Web Source |
---|---|---|---|---|---|
1 | AACR Project GENIE | Cancer Genomics | In-vivo Cancer Clinical and Genomic data | GENIE (Genomics Evidence Neoplasia Information Exchange) is a large-scale, international data-sharing initiative focused on cancer genomics. It was launched by the American Association for Cancer Research (AACR) with the goal of accelerating precision medicine for cancer. | link |
2 | cBioPortal | Cancer Genomics | In-vivo Cancer Clinical and Genomic data | cBioPortal is a web-based tool for exploring cancer genomics data. It provides access to large-scale cancer datasets, allowing researchers and clinicians to explore genetic mutations, alterations, and clinical outcomes across different cancer types. | link |
3 | DepMap | Cancer Genomics | In-vitro Cancer Genomics | DepMap (The Cancer Dependency Map) is a research initiative and database that identifies genes essential for cancer cell survival by targeting these genes in various cancer cell lines. To represent the diversity of human cancer, DepMap builds on the original Cancer Cell Line Encyclopedia (CCLE) project, and more than 2000 cell line models have been collected. | link |
4 | TCGA | Cancer Genomics | Data from 20,000 molecularly characterized primary cancer and matched normal samples spanning 33 cancer types | TCGA (The Cancer Genome Atlas) is a comprehensive project that provides detailed genomic, epigenomic, transcriptomic, and clinical data on various types of cancer. | link |
5 | NA (Paper) | Chemical Biology | NA | Data from the paper 'The PROTACtable genome' | link |
6 | NA (Paper) | Chemical Biology | NA | Data from the paper 'hUbiquitome: a database of experimentally verified ubiquitination cascades in humans' | link |
7 | NA (Paper) | Chemical Biology | NA | Data from the paper 'Reimagining high-throughput profiling of reactive cysteines for cell-based screening of large electrophile libraries' | link |
8 | miRWalk | Chemical Biology | miRNA-target interactions | miRWalk is a comprehensive database that provides information on microRNA (miRNA) target predictions. It offers data on the potential binding sites of miRNAs on genes across the entire genome. miRWalk integrates both predicted and experimentally validated miRNA-target interactions, covering a wide range of species. | link |
9 | FDA - UNII | Clinical | Substances Unique Identifier | FDA - UNII (Unique Ingredient Identifier) is a system developed by the U.S. Food and Drug Administration (FDA) to assign unique, non-proprietary identifiers to substances used in pharmaceuticals, biologics, food, and cosmetics. | link |
10 | FDA - Adverse Event Reporting System | Drug/Compound | Adverse event and medication error reports. | The FDA Adverse Event Reporting System (FAERS) is a database that contains information on adverse event and medication error reports submitted to FDA, designed to support the FDA's post-marketing safety surveillance program for drug and therapeutic biologic products. | link |
11 | ChEBI | Drug/Compound | Compound IDs and Accessions | ChEBI (Chemical Entities of Biological Interest) is a dictionary of small molecular entities used to standardize chemical nomenclature by offering a consistent way to describe and reference chemical entities. | link |
12 | UbiNet | Drug/Compound | E3-substrate interactions, classification of human E3 ligases | UbiNet 2.0 (Database of E3-Substrate Interactions) is a knowledge repository that provides updated, validated, and abundant E3-substrate interactions, a detailed classification of human E3 ligases, and visualization tools for investigating the ubiquitination network. | link |
13 | DrugBank | Drug/Compound | Drug chemical properties, Indications and Targets | DrugBank is a comprehensive database that provides detailed information on drugs and drug targets. | link |
14 | DrugCentral | Drug/Compound | Drug chemical properties, Indications and Targets | DrugCentral is an online drug information resource that provides information on active ingredients, chemical entities, pharmaceutical products, drug modes of action, indications, and pharmacologic action. | link |
15 | PROTAC-DB | Drug/Compound | NA | PROTAC-DB is a specialized database focused on PROTACs (PROteolysis-TArgeting Chimeras). PROTACs are a novel class of therapeutic agents designed to target and degrade specific proteins within cells by harnessing the cell's own ubiquitin-proteasome system. PROTAC-DB provides comprehensive information on PROTACs, including their structures, target proteins, mechanisms of action, and related research. | link |
16 | ChEMBL | Drug/Compound | Compound Bioactivities | ChEMBL is a manually curated database of bioactive molecules with drug-like properties. It brings together chemical, bioactivity, and genomic data to aid the translation of genomic information into effective new drugs. | link |
17 | Selleckchem | Drug/Compound | Compound libraries for HTS | Selleckchem is a company that specializes in providing high-quality chemical compounds and research tools for drug discovery and biomedical research. It offers a wide range of products, including small-molecule inhibitors, compound libraries, and protein assays. | link |
18 | The Human Protein Atlas | Expression | Human tissues expression profiles of genes on the mRNA and protein level | The Human Protein Atlas is a comprehensive resource that provides detailed information on the expression, localization, and function of proteins in human tissues and cells. | link |
19 | GTEx Portal | Expression | Genotype-Tissue Expression | The Genotype-Tissue Expression (GTEx) project is a comprehensive resource of WGS, RNA-Seq, and QTL data from 54 non-diseased tissue sites across nearly 1000 individuals, used to study human gene expression and regulation and its relationship to genetic variation across multiple diverse tissues and individuals. | link |
20 | IGV Tracks | Gene | Human genomics tracks from NCBI reference sequences, ClinVar variants, nested repeats, and cis-regulatory elements | IGV (Integrative Genomics Viewer) tracks are the visual representations of different types of genomic data displayed within the IGV software. Tracks in IGV contain various data types, such as DNA sequences, gene annotations, RNA expression levels, variant calls, and epigenetic marks, all aligned along the genome. | link |
21 | HomoloGene | Gene | Homologous Genes | HomoloGene is an automated NCBI system for constructing putative homology groups from the complete gene sets of a wide range of eukaryotic species. | link |
22 | GeneRIF | Gene | Genes Functional Annotations | GeneRIF (Gene Reference Into Function) is a database that provides concise summaries of the functions of genes, based on information from scientific publications. | link |
23 | GENCODE | Gene | Mouse Gene Nomenclature | GENCODE is a resource for comprehensive and accurate annotations of human and mouse genomes. | link |
24 | NCBI - Gene | Gene | Human gene ID, name, symbol, and synonyms | NCBI Gene integrates information from a wide range of species. A record may include nomenclature, Reference Sequences (RefSeqs), maps, pathways, variations, phenotypes, and links to genome-, phenotype-, and locus-specific resources worldwide. | link |
25 | Pig RNA Atlas | Gene | Pig gene expression | The Pig RNA Atlas presents the genome-wide expression of all 22,342 protein-coding genes in 98 pig tissues. | link |
26 | Ensembl – Gene | Gene | Genes Information | Ensembl is a comprehensive genome database that provides detailed information about genes and their associated genomic features across a wide range of species, including humans. | link |
27 | MyGene | Gene | Gene annotations | MyGene is a web-based resource that provides a simple-to-use interface for accessing gene annotation data. | link |
28 | EBI – HGNC | Gene | Human Gene Nomenclature | EBI – HGNC (HUGO Gene Nomenclature Committee) is an organization that provides a standardized and unique naming system for human genes. | link |
29 | EBI – GWAS Catalog | Gene | Genotype-Phenotype associations | The EBI – GWAS Catalog is a public database that compiles results from genome-wide association studies (GWAS). These studies investigate the associations between genetic variants and traits or diseases. | link |
30 | Borealis | Gene | Conservation Scores | Borealis, the Canadian Dataverse Repository, is a platform hosted by Scholars Portal that provides access to research data across various disciplines. We use Borealis as a source for conservation scores. | link |
31 | BioMart | Gene | Gene, Protein Mapped Identifiers | BioMart is a data management system that provides an easy-to-use, web-based tool for querying, filtering, and retrieving biological data from multiple databases in a unified and efficient manner. We use BioMart as a source for gene and protein identifier mapping. | link |
32 | NCBI - Orthologs | Gene | Gene Orthologs | NCBI's Eukaryotic Genome Annotation pipeline identifies ortholog gene groups for the NCBI Gene dataset using a combination of protein sequence similarity and local synteny information. | link |
33 | OMIM | Genetics | Genotype-Phenotype associations | OMIM (Online Mendelian Inheritance in Man) is a comprehensive, online database that catalogs human genes and genetic disorders. It contains information on all known Mendelian disorders and over 16,000 genes. | link |
34 | OpenTargets | Genetics | Target/Disease evidence | OpenTargets is a platform that uses human genetics and genomics data for systematic drug target identification and prioritisation. It combines information from multiple sources, including genomics, transcriptomics, proteomics, and clinical data, to evaluate the relationship between genes, diseases, and potential therapeutic interventions. | link |
35 | ClinVar | Genetics | Clinical significance of variants | ClinVar is a public database that aggregates information about the clinical significance of genetic variants. | link |
36 | JASPAR | Genetics | Transcription Factors binding profiles | JASPAR is a database of curated, non-redundant transcription factor (TF) binding profiles stored as position frequency matrices (PFMs) and TF flexible models (TFFMs) for TFs across multiple species in six taxonomic groups. | link |
37 | dbSNP | Genetics | Genetic Variants | dbSNP (Database of Single Nucleotide Polymorphisms) contains human single nucleotide variations, microsatellites, and small-scale insertions and deletions along with publication, population frequency, molecular consequence, and genomic and RefSeq mapping information for both common variations and clinical mutations. | link |
38 | IntAct | Interaction | Protein-Protein Interaction | IntAct is a public database managed by the European Bioinformatics Institute (EBI) that provides information on molecular interactions. It focuses primarily on protein-protein interactions and supports a wide range of interaction types, including direct physical interactions and indirect associations. | link |
39 | Monarch | Interaction | Genotype-Phenotype associations | Monarch is a knowledge platform focused on integrating and analyzing data related to genotype-phenotype relationships across various species, including humans. | link |
40 | STRING | Interaction | Protein-Protein Interactions | STRING (Search Tool for the Retrieval of Interacting Genes/Proteins) is a comprehensive database and web resource that provides information on known and predicted protein-protein interactions. It integrates data from various sources, including experimental studies, computational predictions, and literature, to offer a detailed view of protein interactions within biological systems. | link |
41 | Complex Portal | Interaction | Macromolecular Complexes description and evidence | Complex Portal is a manually curated, encyclopaedic resource of macromolecular complexes from a number of key model organisms. The majority of complexes are made up of proteins but may also include nucleic acids or small molecules. | link |
42 | BioGRID | Interaction | Genetic Interactions | BioGRID (Biological General Repository for Interaction Datasets) is a public database that collects and curates data on protein and genetic interactions across various organisms. | link |
43 | PDB - Ligand Expo | Ligands | Small molecules chemical and structural information | Ligand Expo (formerly Ligand Depot) provides chemical and structural information about small molecules within the structure entries of the Protein Data Bank. Tools are provided to search the PDB dictionary for chemical components. | link |
44 | Reactome | Pathway | Biological pathways | The Reactome database provides information about biological pathways and processes. It offers curated data on the interactions and reactions involving proteins, genes, and small molecules within various cellular pathways. | link |
45 | Reactome-ChEBI | Pathway | ChEBI-to-Reactome mapping | Mapping between ChEBI identifiers and Reactome. | link |
46 | KEGG | Pathway | KEGG data for drugs, variants, diseases, pathways, ligands and genes | KEGG is a database resource for understanding high-level functions and utilities of the biological system, such as the cell, the organism and the ecosystem, from molecular-level information, especially large-scale molecular datasets generated by genome sequencing and other high-throughput experimental technologies. | link |
47 | GSEA - MSigDB | Pathway | Curated Gene Sets (C2) for GSEA | MSigDB (Molecular Signatures Database) is a curated collection of gene sets that represent various biological states or processes, such as signaling pathways, gene ontologies, and gene expression signatures. These gene sets are used as input for GSEA to explore and interpret large-scale gene expression data. | link |
48 | Gene Ontology | Pathway | Gene Ontology Terms and Annotations | Gene Ontology (GO) is a framework for classifying and describing the functions of genes across different species. It provides a standardized vocabulary for annotating genes in terms of biological process, molecular function, and cellular component. | link |
49 | Expasy - ENZYME | Protein | Enzymes Nomenclature | ENZYME is a repository of information relative to the nomenclature of enzymes. It is primarily based on the recommendations of the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology (IUBMB) and it describes each type of characterized enzyme for which an EC (Enzyme Commission) number has been provided. | link |
50 | UniProt - Subcellular location | Protein | Protein Topology | The UniProt Subcellular location subsection provides information on the location and the topology of the mature protein in the cell. | link |
51 | UniProt - OrthoDB | Protein | Orthologous groups | UniProt-OrthoDB is a specialized resource that integrates protein sequence data from UniProt with orthology information from OrthoDB. UniProt-OrthoDB enables researchers to explore and analyze orthologous relationships between proteins across different organisms. | link |
52 | Compartments | Protein | Protein Subcellular Localization | Compartments is a database that provides detailed information about the subcellular localization of proteins (e.g., nucleus, cytoplasm, mitochondria). It integrates data from various sources, including experimental data, predictions, and high-throughput studies. | link |
53 | SWISS-MODEL | Protein | Annotated 3D protein structure models | The SWISS-MODEL Repository is a database of annotated 3D protein models generated by automated homology modelling for relevant model organisms and experimental structure information for all sequences in UniProtKB. | link |
54 | SIFTS | Protein | UniProt/PDB mapping | Structure Integration with Function, Taxonomy and Sequence (SIFTS) is a project in the PDBe-KB resource for residue-level mapping between UniProt and PDB entries. SIFTS also provides annotation from the IntEnz, GO, InterPro, Pfam, CATH, SCOP, PubMed, Ensembl and HomoloGene resources. | link |
55 | NA (Paper) | Protein | NA | Data from the paper 'Proteome-wide mapping of short-lived proteins in human cells' | link |
56 | ScanNet | Protein | PPI-probability per residue | ScanNet provides information about the likelihood or confidence that a specific amino acid residue within a protein is involved in a protein-protein interaction (PPI). | link |
57 | UniProt - TrEMBL | Protein | Non-curated, computationally analyzed records | UniProt (Universal Protein Resource) is a comprehensive and highly curated database of protein sequence, functional annotation, and protein variant information. It serves as a central repository for protein data. | link |
58 | The Human Protein Atlas | Protein | Expression Level in normal and cancer tissue | The Human Protein Atlas is a comprehensive database that maps the expression and localization of proteins in human tissues and cells to provide information on where and how proteins are expressed in the body, including normal tissues, cancerous tissues, and cell lines. | link |
59 | InterPro | Protein | Protein families, predicted domains, and important sites | InterPro provides functional analysis of proteins by classifying them into families and predicting domains and important sites. InterPro uses predictive models, known as signatures, provided by several databases. | link |
60 | PDBbind-CN | Protein | PDB binding affinity | PDBbind is a comprehensive collection of experimentally measured binding affinity data for all biomolecular complexes deposited in the Protein Data Bank (PDB). It provides an essential linkage between the energetic and structural information of those complexes. | link |
61 | UniProt - Swiss-Prot | Protein | Curated protein data | UniProt (Universal Protein Resource) is a comprehensive and highly curated database of protein sequence, functional annotation, and protein variant information. It serves as a central repository for protein data. | link |
62 | SAbDab | Protein/Antibody | Antibody annotations | SAbDab is a database containing all the antibody structures available in the PDB, annotated and presented in a consistent fashion. | link |
63 | EFO | Terminology | Description of experimental variables in EBI databases and the GWAS Catalog | The Experimental Factor Ontology (EFO) provides a systematic description of many experimental variables available in EBI databases, and for projects such as the GWAS Catalog. | link |
64 | Cellosaurus | Terminology | Cell Lines Profiles | Cellosaurus is a knowledge resource on cell lines. It attempts to describe all cell lines used in biomedical research. It includes information such as their origin, characteristics, and genetic information. | link |
65 | Uberon | Terminology | Cross-species anatomical terms | Uberon is an ontology that provides a standardized vocabulary for describing anatomical structures across multiple species. | link |
66 | Cell Ontology | Terminology | Cell Types Ontology. | Cell Ontology provides a standardized vocabulary for describing cell types and their relationships across different biological contexts. | link |
67 | UMLS | Terminology | Medical and biomedical vocabularies. | UMLS (Unified Medical Language System) is a comprehensive framework and set of resources developed by the U.S. National Library of Medicine (NLM) to facilitate the integration and interoperability of biomedical information. | link |
68 | MeSH | Terminology | Terms for diseases, chemicals, drugs, procedures, and other health-related concepts | Medical Subject Headings (MeSH) is a comprehensive vocabulary used for indexing, cataloging, and searching biomedical and health-related information, including terms for diseases, chemicals, drugs, procedures, and other health-related concepts. | link |
69 | Mondo | Terminology | Disease ontology | The Mondo Disease Ontology (Mondo) aims to harmonize disease definitions across the world. It standardizes the classification and naming of diseases across different medical and biological databases. | link |
70 | NCBI - Taxonomy | Terminology | Species Taxonomy | NCBI Taxonomy Database is a curated classification and nomenclature for all of the organisms in the public sequence databases. This currently represents about 10% of the described species of life on the planet. | link |