SimChrom - interactive resource to explore function and localization of human chromatin proteins

Supplementary material to "Meta-analysis of human nuclear proteome and chromatome datasets: understanding chromatin functioning through protein abundance, sequence and domain composition" (by A.K. Gribkova, G.A. Armeev, A.K. Shaytan).

The resource provides interactive tools for analyzing human nuclear and chromatin proteins with respect to their subnuclear localization (from UniProt, HPA, OpenCell), their chromatin category according to the simplified chromatin protein classification SimChrom, protein abundance (from PaxDB) and domain architecture (from Pfam). You can also download preprocessed MS-based chromatome datasets and reference sets of proteins with nuclear or non-nuclear localization.

Interactive Figure 1

Interactive Figure 2
Interactive Figure 3
SimChrom-SL vs Protein domains and families by Pfam
Protein domain co-occurrence in SimChrom categories of chromatin regulators
Co-occurrence of protein domains that are often found in chromatin regulators (Histone chaperones, Histone PTM erasers, Histone PTM writers, Histone PTM readers, Histone modification, Chromatin remodelers, Methylated DNA binding, DNA (de)methylation, RNA modification). From the list of Pfam domain pairs found in at least three chromatin regulator proteins, domains involved in histone post-translational modifications, chromatin remodeling, histone binding, DNA binding and protein dimerization/oligomerization were manually selected and classified based on the information currently available in the literature. The conditional probability of finding domain A in a chromatin protein, given that domain B is already present, was estimated and is shown (columns and rows correspond to domains A and B, respectively). The number of proteins with the corresponding domain pair is given for all SimChrom proteins.
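The conditional probability described above can be illustrated with a minimal sketch. It assumes that P(A | B) is estimated as the number of proteins containing both domains divided by the number of proteins containing domain B; the exact estimator used for the figure is not stated here, so this is an assumption. The function name, the protein IDs and the toy domain assignments are hypothetical.

```python
from collections import Counter
from itertools import permutations


def domain_cooccurrence(protein_domains, min_proteins=3):
    """Estimate P(A | B) for ordered domain pairs (A, B).

    protein_domains: dict mapping a protein ID to the set of its domains.
    Returns {(A, B): (n_AB, P(A | B))} for pairs present in at least
    `min_proteins` proteins. The count-ratio estimator is an assumption.
    """
    single = Counter()  # number of proteins containing each domain
    pair = Counter()    # number of proteins containing each ordered pair
    for domains in protein_domains.values():
        domains = set(domains)
        single.update(domains)
        pair.update(permutations(sorted(domains), 2))
    return {
        (a, b): (n_ab, n_ab / single[b])
        for (a, b), n_ab in pair.items()
        if n_ab >= min_proteins
    }


# Hypothetical toy input: protein IDs and domain assignments are made up.
toy = {
    "P1": {"PHD", "Bromodomain"},
    "P2": {"PHD", "Bromodomain"},
    "P3": {"PHD", "Bromodomain", "SET"},
    "P4": {"PHD"},
}
result = domain_cooccurrence(toy)
for (a, b), (n, p) in sorted(result.items()):
    print(f"P({a} | {b}) = {p:.2f} (n = {n})")
```

In this toy input every protein with a Bromodomain also has a PHD domain, so P(PHD | Bromodomain) is 1.0, whereas only three of the four PHD-containing proteins have a Bromodomain, so P(Bromodomain | PHD) is 0.75. The asymmetry is why rows and columns of the matrix are not interchangeable.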
Table 1. Reference datasets of nuclear and non-nuclear proteins constructed at different levels of confidence, distinguishing proteins localized uniquely in the nucleus from proteins localized in the nucleus and other cellular compartments.
Download | Definition | Number of proteins
NULOC_CS | entries annotated as nuclear in both databases: UniProt (provided evidence code is available) AND HPA (with reliability score: Enhanced, Supported, Approved), excludes proteins labeled only as non-nuclear in the OpenCell database (annotation grade 2 or 3) | 3244
NULOC_CS_UL | entries annotated only as nuclear in both databases: UniProt (provided evidence code is available) AND HPA (with reliability score: Enhanced, Supported, Approved), excludes proteins labeled only as non-nuclear in the OpenCell database (annotation grade 2 or 3) | 1263
NULOC_CS_UL_NECF | entries annotated only as nuclear in both databases: UniProt AND HPA, excludes proteins labeled only as non-nuclear in the OpenCell database | 1310
NULOC_JT_UL | entries annotated only as nuclear in at least one database: UniProt (provided evidence code is available) OR HPA (with reliability score: Enhanced, Supported, Approved) OR OpenCell database (annotation grade 2 or 3) | 4256
NULOC_JT_UL_NECF | entries annotated only as nuclear in at least one database: UniProt OR HPA, excludes proteins labeled only as non-nuclear in the OpenCell database | 4280
NULOC_CS_NECF | entries annotated as nuclear in both databases: UniProt AND HPA, excludes proteins labeled only as non-nuclear in the OpenCell database | 3962
NULOC_JT | entries annotated as nuclear in at least one database: UniProt (provided evidence code is available) OR HPA (with reliability score: Enhanced, Supported, Approved) OR OpenCell (annotation grade 2 or 3) | 8038
NULOC_JT_NECF | entries annotated as nuclear in at least one database: UniProt OR HPA OR OpenCell | 8914
NON_NULOC_CS | proteins whose localization annotations exclude nuclear localization in both databases: UniProt (provided evidence code is available) AND HPA (with reliability score: Enhanced, Supported, Approved) | 6056
CYTLOC_CS_UL | entries annotated only as cytoplasmic (see Methods) in both databases: UniProt (provided evidence code is available) AND HPA (with reliability score: Enhanced, Supported, Approved) | 2026
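The AND/OR logic in the definitions above maps onto set intersection and union. The following is a minimal sketch of that logic only: it assumes each database has already been reduced to a mapping from protein ID to a set of compartment labels, and it omits the evidence-code, reliability-score and annotation-grade filtering. The helper names, the compartment labels and the toy annotations are hypothetical and do not reproduce the actual preprocessing.

```python
def nuclear(annotations):
    """Proteins with a nuclear annotation (other compartments allowed)."""
    return {p for p, locs in annotations.items() if "nucleus" in locs}


def nuclear_only(annotations):
    """Proteins annotated exclusively as nuclear."""
    return {p for p, locs in annotations.items() if locs == {"nucleus"}}


def non_nuclear_only(annotations):
    """Proteins with annotations, none of which is nuclear."""
    return {p for p, locs in annotations.items() if locs and "nucleus" not in locs}


# Hypothetical toy annotations: protein ID -> set of compartment labels.
uniprot = {"A": {"nucleus"}, "B": {"nucleus", "cytoplasm"},
           "C": {"cytoplasm"}, "D": {"nucleus"}}
hpa = {"A": {"nucleus"}, "B": {"nucleus"},
       "C": {"cytoplasm"}, "D": {"nucleus"}}
opencell = {"D": {"cytoplasm"}, "E": {"nucleus"}}

# Consensus (AND): nuclear in both UniProt and HPA, minus OpenCell non-nuclear.
nuloc_cs = (nuclear(uniprot) & nuclear(hpa)) - non_nuclear_only(opencell)

# Consensus with unique localization: nuclear-only in both databases.
nuloc_cs_ul = (nuclear_only(uniprot) & nuclear_only(hpa)) - non_nuclear_only(opencell)

# Joint (OR): nuclear in at least one of the three databases.
nuloc_jt = nuclear(uniprot) | nuclear(hpa) | nuclear(opencell)

# Consensus non-nuclear: nuclear localization excluded in both databases.
non_nuloc_cs = non_nuclear_only(uniprot) & non_nuclear_only(hpa)

print(sorted(nuloc_cs), sorted(nuloc_cs_ul), sorted(nuloc_jt), sorted(non_nuloc_cs))
```

In this toy example, protein D is nuclear in UniProt and HPA but is dropped from the consensus sets because OpenCell labels it only as non-nuclear, while protein E enters the joint set on the strength of OpenCell alone.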
Table 2. Representative list of nuclear and chromatome datasets from MS-based experimental studies.
Year | Type of cells | Number of chromatin proteins (processed) | Methods, Notes | Reference | Download
2011 | HeLa S3 | 1501 | Three different chromatin purification methods: total chromatin extraction, salt extraction and total MNase digestion. | Torrente et al., 2011 | torrente_2011
2014 | HeLa S3 | 3509 | Nascent chromatin capture (NCC) to profile chromatin proteome dynamics during replication in human cells. NCC relies on biotin–dUTP labelling of replicating DNA, affinity purification and quantitative proteomics. | Alabert et al., 2014 | alabert_2014
2014 | HepG2, HeLa, MCF-7 | 1941 | ML-based classification based on Chromatin Enrichment for Proteomics (ChEP) experiments (crosslinking + differential extraction under denaturing conditions) in different conditions | Kustatscher et al., 2014 | kustatscher_2014
2016 | HeLa | 1001 | Crude nuclear extraction; the "mostly nuclear" fraction was identified by global intensity distribution | Itzhak et al., 2016 | itzhak_2016
2018 | T98G cell line (from glioblastoma multiforme) | 3056 | DEMAC method of chromatin extraction (with CsCl fractionation) | Ginno et al., 2018 | ginno_2018
2021 | K562 cell line | 2848 | BL-Hi-C method of chromatin extraction | Shi et al., 2021 | shi_2021
2023 | ESCs H9 | 1667 | Chromatin purification strategy: Chromatin Aggregation Capture (ChAC), with Data-Independent Acquisition (DIA) MS-based proteomics | Ugur et al., 2023 | ugur_2023