Clicking the categories selects them for display in the list of proteins in the table below. When several categories are selected, the Boolean logic operation chosen with the toggle above is applied. The number of classified proteins is displayed next to each category.
From the list of Pfam domain type pairs that were found in at least three chromatin regulator proteins, domains involved in various histone post-translational modifications, chromatin remodeling, histone binding, DNA binding and protein dimerization/oligomerization were manually selected and grouped into the respective categories based on the information currently available in the literature. The conditional probability of finding domain A in a chromatin protein, given that another domain B is already present, was estimated and is presented below (columns and rows correspond to domains A and B, respectively). The number of proteins with the corresponding domain pair is given for all SimChrom proteins. Clicking on an individual square selects the respective proteins for display in the table below.
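As a rough illustration of the estimate described above, the conditional probability can be computed as the number of proteins carrying both domains divided by the number of proteins carrying domain B. The sketch below assumes this simple counting estimator; the function name, the `min_proteins` parameter and the toy protein-to-domain mapping are hypothetical and are not taken from SimChrom.

```python
from collections import Counter
from itertools import permutations


def conditional_domain_probabilities(protein_domains, min_proteins=3):
    """Estimate P(A | B) = N(A and B) / N(B) for ordered domain pairs.

    protein_domains maps a protein ID to its list of domain names.
    Pairs seen in fewer than min_proteins proteins are dropped.
    Returns a dict keyed by (A, B), i.e. (column, row) in the matrix.
    """
    single = Counter()  # number of proteins containing each domain
    pair = Counter()    # number of proteins containing each ordered pair
    for domains in protein_domains.values():
        unique = set(domains)  # count each domain once per protein
        single.update(unique)
        pair.update(permutations(sorted(unique), 2))
    return {
        (a, b): n_ab / single[b]
        for (a, b), n_ab in pair.items()
        if n_ab >= min_proteins
    }


# Hypothetical toy data, for illustration only.
toy = {
    "P1": ["PHD", "Bromodomain"],
    "P2": ["PHD", "Bromodomain", "SET"],
    "P3": ["PHD"],
    "P4": ["SET"],
}
probs = conditional_domain_probabilities(toy, min_proteins=2)
for (a, b), p in sorted(probs.items()):
    print(f"P({a} | {b}) = {p:.2f}")
```

With the toy data, every Bromodomain protein also carries a PHD domain, while only two of the three PHD proteins carry a Bromodomain, so the matrix is not symmetric.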
Reference | Chromatin state specificity | Chromatin purification methods (short) and additional computational filtering | Type of cells | Proteins identified (by authors) | Processed number of proteins | Download protein list |
---|---|---|---|---|---|---|
Torrente et al., 2011 | Total chromatin, euchromatin, heterochromatin | (1) Total chromatin extraction with hypotonic lysis, Triton X-100 permeabilization, low-speed centrifugation, and EDTA-mediated nuclear lysis. (2) Salt extraction using high salt buffer (420 mM KCl), sonication, centrifugation, followed by dialysis. (3) Micrococcal nuclease (MNase) digestion, including total digestion and partial digestion to separate euchromatin and heterochromatin fractions. | HeLa S3 | 1038 (total chromatin extraction), 1388 (salt), 949 (MNase); 751 (partial MNase); 1912 (all identified chromatin proteins) | 1501 | torrente_2011 |
Kustatscher et al., 2014 | Total interphase chromatin | Chromatin Enrichment for Proteomics (ChEP): in vivo crosslinking with 1% formaldehyde, followed by differential extraction under denaturing conditions (SDS, urea), RNase A treatment, centrifugation-based chromatin pelleting, and sonication. To assess chromatin association, the study applied Multiclassifier Combinatorial Proteomics (MCCP), which integrates SILAC-based quantitative proteomics from 35 biochemical and biological perturbation experiments. A random forest machine learning algorithm was trained on curated chromatin and non-chromatin reference proteins to assign each detected protein an interphase chromatin probability score (ICP). | HeLa, MCF-7, HepG2, HEK293, U2OS, DT40 | 1980 (chromatin proteins with ICP>0.5); 7635 (total chromatin proteins with ICP values) | 1956 | kustatscher_2014 |
Alabert et al., 2014 | Nascent vs. mature chromatin | Nascent Chromatin Capture (NCC): biochemical isolation of newly replicated chromatin using biotin–dUTP incorporation. Cells were pulse-labelled with biotin–dUTP during DNA replication and fixed after either 20 min (nascent chromatin) or 2 h (mature chromatin). Chromatin was crosslinked with 2% formaldehyde, nuclei were isolated, and chromatin was sheared to 2–3 kb by sonication. Biotin-labelled DNA–protein complexes were isolated using streptavidin beads. For proteomic analysis, nascent and mature chromatin were metabolically labeled by SILAC and processed together. | HeLa S3 | 426 (nascent-enriched); 3995 (all identified chromatin proteins) | 3861 | alabert_2014 |
Itzhak et al., 2016 | Nuclear proteome | Cells were metabolically labeled with SILAC and gently lysed under hypo-osmotic conditions to preserve organelle integrity. Post-nuclear supernatants were fractionated by differential centrifugation into five sub-organellar fractions plus cytosolic and nuclear pellets. Protein abundance profiles across SILAC fractions were processed by PCA and classified using a supervised SVM algorithm trained on curated organelle markers. In parallel, total intensities in the nuclear, cytosolic, and organellar fractions (from label-free MS) were used to assign each protein to global classes such as "mostly nuclear", based on relative signal distribution. | HeLa | 1133 (nuclear); 672 (nucleo-cytosolic); 8710 (total proteome) | 1092 | itzhak_2016 |
Ginno et al., 2018 | Total chromatin: time-resolved (G1, S, M) | Density-based enrichment for mass spectrometry analysis of chromatin (DEMAC): formaldehyde-fixed cells were sonicated and subjected to cesium chloride (CsCl) gradient ultracentrifugation to isolate DNA–protein complexes by buoyant density (1.39 g/cm³). Chromatin fractions were collected, dialyzed, decrosslinked, and digested with DNase I. | Human T98G (glioblastoma) | 3065 (chromatome); 6242 (total proteome) | 3051 | ginno_2018 |
Shi et al., 2021 | Promoter-proximal chromatin | Hi-MS (Hi-C-based proteomics, adapted from BL-Hi-C): cells crosslinked with 1% formaldehyde; genomic DNA digested with HaeIII (GGCC sites); ligated with biotinylated bridge linkers; nuclei lysed in 0.2% SDS; chromatin sonicated; chromatin-DNA complexes captured on streptavidin beads. Sensitivity to 1,6-hexanediol was quantified using the AICAP index (Anti-1,6-Hexanediol Index of Chromatin-Associated Proteins). | K562 | 3228 | 2848 | shi_2021 |
Ugur et al., 2023 | Total chromatin | Chromatin Aggregation Capture (ChAC): nuclei fixed with 1% formaldehyde, lysed with SDS and urea, sonicated, and purified by protein aggregation capture (PAC) on magnetic beads. DIA-MS with DIA-NN used for quantification. | Human ESCs (H9) | 2487 | 1730 | ugur_2023 |
Alvarez et al., 2023 | Time-resolved (nascent, G2/M, early and late G1) | Nascent Chromatin Capture (NCC) method, which relies on pulse-labeling newly replicated DNA with biotin-dUTP, followed by formaldehyde crosslinking and sonication-based chromatin fragmentation. Biotinylated DNA-protein complexes were affinity-purified using streptavidin magnetic beads. HeLa S3 cells were synchronized and harvested at five post-replication time points (Nasc, Late S, G2/M, early G1, late G1) across six biological replicates. | HeLa S3 | 1454 (present at all time points in all 6 replicates; from total of 5770) | 1478 (2894 total) | alvarez_2023 |
Alvarez et al., 2023 | Time-resolved (nascent, G2/M, early and late G1) | isolation of Proteins On Nascent DNA (iPOND): formaldehyde crosslinking (1%), EdU labeling for 15 minutes, click chemistry with biotin-azide, chromatin fragmentation, streptavidin bead enrichment. | TIG-3 fibroblasts | 2351 (detected in 4 to 5 of 5 replicates) | 2397 (2894 total) | alvarez_2023 |
Dataset name (download protein list) | Definition of dataset | Number of proteins |
---|---|---|
NULOC_CS | entries annotated as nuclear in both databases: UniProt (provided evidence code is available) AND HPA (with evidence tags: Enhanced, Supported, Approved), excludes proteins labeled only as non-nuclear in the OpenCell database (annotation grade 2 or 3) | 3296 |
NULOC_CS_NECF | entries annotated as nuclear in both databases: UniProt AND HPA, excludes proteins labeled only as non-nuclear in the OpenCell database | 3988 |
NULOC_CS_UL | entries annotated only as nuclear in both databases: UniProt (provided evidence code is available) AND HPA (with reliability score: Enhanced, Supported, Approved), excludes proteins labeled only as non-nuclear in the OpenCell database (annotation grade 2 or 3) | 1322 |
NULOC_CS_UL_NECF | entries annotated only as nuclear in both databases: UniProt AND HPA, excludes proteins labeled only as non-nuclear in the OpenCell database | 1322 |
NULOC_JT | entries annotated as nuclear in at least one database: UniProt (provided evidence code is available), HPA (with evidence tags: Enhanced, Supported, Approved), OpenCell (annotation grade 2 or 3) | 8048 |
NULOC_JT_NECF | entries annotated as nuclear in at least one database: UniProt, HPA, OpenCell | 8912 |
NULOC_JT_UL | entries annotated only as nuclear in at least one database: UniProt (provided evidence code is available) OR HPA (with reliability score: Enhanced, Supported, Approved) OR OpenCell database (annotation grade 2 or 3) | 4292 |
NULOC_JT_UL_NECF | entries annotated only as nuclear in at least one database: UniProt OR HPA, excludes proteins labeled only as non-nuclear in the OpenCell database | 4292 |
NON_NULOC_CS | proteins whose localization annotations exclude nuclear localization in both databases: UniProt (provided evidence code is available) AND HPA (with evidence tags: Enhanced, Supported, Approved) | 3674 |
NON_NULOC_JT | proteins whose localization annotations exclude nuclear localization in at least one database: UniProt (provided evidence code is available) OR HPA (with evidence tags: Enhanced, Supported, Approved) OR OpenCell (annotation grade 2 or 3) | 11479 |
CYTLOC_CS_UL | proteins with only aggregate cytoplasm annotation in both databases: UniProt (provided evidence code is available) AND HPA (with evidence tags: Enhanced, Supported, Approved) | 2026 |
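The AND/OR definitions in the table above map naturally onto set operations. The following is a minimal sketch of how a `_CS`-style dataset (nuclear in both UniProt AND HPA, minus proteins labeled only as non-nuclear in OpenCell) and a `_JT`-style dataset (nuclear in at least one database) could be derived. The function names, argument names and protein IDs are hypothetical, and the sketch assumes evidence filtering has already been applied to each input set.

```python
def nuclear_in_both(uniprot_nuclear, hpa_nuclear, opencell_non_nuclear_only):
    """Nuclear in UniProt AND HPA, excluding OpenCell non-nuclear-only proteins."""
    return (uniprot_nuclear & hpa_nuclear) - opencell_non_nuclear_only


def nuclear_in_any(uniprot_nuclear, hpa_nuclear, opencell_nuclear):
    """Nuclear in at least one of UniProt, HPA or OpenCell."""
    return uniprot_nuclear | hpa_nuclear | opencell_nuclear


# Hypothetical protein IDs, for illustration only.
uniprot_nuclear = {"P1", "P2", "P3"}
hpa_nuclear = {"P2", "P3", "P4"}
opencell_nuclear = {"P5"}
opencell_non_nuclear_only = {"P3"}

both = nuclear_in_both(uniprot_nuclear, hpa_nuclear, opencell_non_nuclear_only)
any_db = nuclear_in_any(uniprot_nuclear, hpa_nuclear, opencell_nuclear)
print(sorted(both))    # ['P2']
print(sorted(any_db))  # ['P1', 'P2', 'P3', 'P4', 'P5']
```

The intersection-based set is always a subset of the union-based one when the exclusion set is empty, which is consistent with the smaller protein counts listed for the `_CS` datasets.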