How many signalling pathways

To make the results of our biocuration efforts routinely and freely available to the research community, we next developed a web user interface (UI) for the SPP knowledgebase that provides for browsing of datasets, as well as for mining of the underlying data points.

A comprehensive walkthrough file containing instructions on the use of the SPP interface is available in Supplementary Information Section 1.

Scope of the major signaling pathway module and biosample classifications in the SPP knowledgebase. In addition, categorization of tissue and cell line biosamples according to their organ and physiological system of origin (d) facilitates an appreciation of tissue-specific patterns of transcriptional regulation. Note that this represents the theoretical scope of SPP; not all entities depicted are represented in the current version of the SPP knowledgebase. See the Methods section for additional information.

Individual dataset pages enable integration of SPP with the research literature via digital object identifier (DOI)-driven links from external sites, as well as citation of datasets to enhance their FAIR status [3, 4]. Single gene queries are designed for researchers who wish to evaluate transcriptomic or ChIP-Seq evidence for regulation of a single gene of interest across all nodes, or within specific categories, classes or families (see Table 2 for examples and Supplementary Information Section 1B for a user walk-through).

GO term queries (see Table 3 for examples and Supplementary Information Section 1C for a user walk-through) accommodate users interested not in a specific gene, but rather in regulation of multiple genes mapped to broader functional or mechanistic terms by GO annotators.

Gene list queries (see Supplementary Information Section 1D for a user walk-through) return transcriptional regulatory data points for custom user gene lists containing up to approved gene symbols or Entrez GeneIDs.

Key elements of the SPP query and reporting interface. The default display for single gene queries is by Category, which can be adjusted to cluster data points by biosample or species. The default display for multi-gene queries is by Target.

IP antigens are identified using case-sensitive AGSs to denote experiments in different species. Results are returned in an interface referred to as the Regulation Report, a detailed graphical summary of evidence for transcriptional regulatory relationships between signaling pathway nodes and genomic target(s) of interest Fig.

Reflecting the hierarchy in Table 1, each Regulation Report category is subdivided into classes (depicted as Category Class in the UI), which are in turn subdivided into families containing member nodes, which are themselves mapped to bioactive small molecules (BSMs) that regulate their function.

The transcriptomic Regulation Report Fig. Below the node sections, the transcriptomic Regulation Report contains a Models section, in which data points from related animal and cell model experiments are consolidated to convey evidence for previously underappreciated roles of a target transcript in specific physiological contexts, such as adipogenesis.

Each data point in either Regulation Report links to a pop-up window containing the essential experimental information Fig. This in turn links to a window summarizing the pharmacology of any BSMs used in the experiment Fig. Finally, to allow users to share links to SPP Regulation Reports with colleagues, or to embed them in research manuscripts or grant applications, all Reports are accessible by a constructed URL defining all of the individual query parameters.

A particularly desirable goal is unbiased meta-analysis to define community consensus reference signatures that allow users to predict regulatory relationships between signaling pathway nodes and their downstream genomic targets. Accordingly, we next set out to design a meta-analysis pipeline that would leverage our biocurational platform to reliably rank signaling pathway node-target gene regulatory relationships in a given biosample context.

Since this analysis was designed to establish a consensus for a node or node family across distinct datasets from different laboratories, we referred to the resulting node-target rankings as consensomes. Consensome queries (see Supplementary Information Subsection 1E for a walk-through) are designed for users unfamiliar with a particular signaling node family who are seeking evidence for targets that have close regulatory relationships with members of that family.

Table 4 shows examples of the consensomes available in the initial version of the SPP knowledgebase. Section 2 of the Supplementary information shows the full list of consensomes available in the initial release of SPP. Subsequent menus allow for selection of specific signaling pathway node families, physiological systems or organs of interest, or species.

To accommodate researchers interested in a specific physiological system or organ rather than a specific pathway node, consensomes are also calculated across all experiments mapping to a given physiological system (metabolic, skeletal, etc.). To maximize their distribution, exposure and citation in third party resources, consensomes can also be accessed by direct DOI-resolved links.

Consensomes are displayed in an accessible tabular format Fig. To reflect the frequency of differential expression of a given target relative to others in a specific consensome, the percentile ranking of each target within the consensome is displayed. Targets in the 90th percentile of a given consensome, the highest-confidence predicted genomic targets for a given node family, are accessible through the web interface, and the entire list of targets is available for download in spreadsheet format for import into custom analysis programs.
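
The percentile display can be illustrated with a short Python sketch. It is not the SPP implementation: the function names and gene symbols are hypothetical, ranking by consensome P value is taken from the default ordering described for the interface, and ties are not handled.

```python
def percentile_ranks(cpv_by_target):
    """Return {target: percentile}, where the smallest P value ranks highest.

    Assumption: percentile = 100 * (position from the bottom) / (number of targets).
    """
    # Sort from largest P value (weakest evidence) to smallest (strongest).
    ordered = sorted(cpv_by_target, key=cpv_by_target.get, reverse=True)
    total = len(ordered)
    return {
        target: 100.0 * position / total
        for position, target in enumerate(ordered, start=1)
    }


def top_targets(cpv_by_target, cutoff=90.0):
    """Targets at or above the cutoff percentile, strongest evidence first."""
    ranks = percentile_ranks(cpv_by_target)
    kept = [target for target, rank in ranks.items() if rank >= cutoff]
    return sorted(kept, key=cpv_by_target.get)


if __name__ == "__main__":
    # Hypothetical targets G1..G10 with P values 0.01..0.10.
    example = {f"G{i}": i / 100 for i in range(1, 11)}
    print(top_targets(example))
```

With ten hypothetical targets, only the two with the smallest P values reach the 90th percentile; the full ranking would still be available from `percentile_ranks` for download-style use.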

As previously discussed, to suppress the diversity of experimental designs as a confounding variable in consensome analysis, the direction of differential expression is omitted when calculating the ranked signatures. That said, an appreciation of the pharmacology of a specific node-target gene relationship is essential to allow researchers to place the ranking in a specific biological context and to design subsequent experiments in an informed manner.

To accommodate this, consensome targets link to transcriptomic or cistromic Regulation Reports filtered to display those data points i.

Consensome user interface. The example shows genomic targets most frequently significantly differentially expressed in response to genetic or pharmacological manipulation of the human insulin receptor in a transcriptomic experiment. Targets are ranked by default by the consensome P value (CPV), which equates to the probability that the observed frequency of differential expression occurs by chance. Target symbols link to an SPP Regulation Report filtered by the consensome category and biosample parameters to show the underlying data points.

We next wished to verify that consensomes were reliable reference datasets for modeling regulatory relationships between cellular signaling pathway nodes and their downstream genomic targets.

To do this we designed a validation strategy comprising four components: comparison of consensomes with existing canonical i. Two considerations recommended members of the nuclear receptor (NR) superfamily of physiological ligand-regulated transcription factors for selection for initial proof-of-principle validation of the consensomes.

Firstly, as the largest single class of drug targets, they are the subject of a large body of dedicated research literature, affording considerable opportunity for testing the consensomes against existing canonical knowledge of their downstream targets. Secondly, as ligand-regulated transcription factors, members of this superfamily are prominently represented in both publicly archived transcriptomic and ChIP-Seq experiments, enabling meaningful cross-validation of consensomes between these two experimental categories.

Substantial literature evidence from prokaryotic [8] and eukaryotic [9] systems indicates that genes encoding metabolic enzymes are transcriptionally plastic and subject to dynamic regulation of their expression by numerous afferent metabolic and endocrine cues. If consensome analysis were biologically valid, we anticipated that targets with elevated rankings in the murine hepatic transcriptomic consensome (that is, genes that are preferentially responsive to multiple hepatic signaling pathways) would be enriched for genes encoding enzymes with prominent roles in hepatic metabolism.

Scatterplot of the mouse all nodes liver transcriptomic consensome. This plot distills data from nearly distinct experiments to convey a visual appreciation of the relative rates of differential expression of murine genes across a variety of hepatic signaling contexts. Genes in the 99th percentile are highlighted in orange. For cross-reference with Table 5, genes encoding selected metabolic enzymes in the 99th percentile are called out by gene symbol and name.

For details, refer to the Transcriptomic Consensomes subsection in the Generation of Consensomes section of the Methods.

We next speculated that transcripts under fine control by hepatic signaling pathways would be enriched for enzymes whose deficiency would have a critical impact upon hepatic metabolism. Rate-limiting enzymes (RLEs) play critical roles in determining mammalian metabolic flux. Collectively, these analyses demonstrate the ability of pan-node organ consensomes to illuminate factors that are downstream targets of multiple distinct signaling nodes in a specific organ and, by inference, have pivotal, tightly-regulated roles in the function of that organ.

We next wished to establish whether the biological significance implied by elevated rankings in consensomes for cellular signaling pathway node families was reflected in both gain- and loss-of-function validation experiments at the bench. Figure 6a shows a scatterplot depiction of the ERs-Hs-All organs consensome. The two distinct tails in the distribution demarcate between genes whose discovery rates are comparable, but based upon different total numbers of experiments in which the genes were assayable, and therefore giving rise to different CPVs.
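
Why comparable discovery rates can yield different CPVs is easier to see with a worked calculation. The Python sketch below assumes, purely for illustration, that the P value is an upper-tail binomial probability with a fixed background rate; the text here does not specify the test, and the function name, background rate and counts are hypothetical.

```python
from math import comb


def binomial_tail(hits, experiments, background_rate):
    """P(X >= hits) for X ~ Binomial(experiments, background_rate).

    'hits' is the number of experiments in which a gene was significantly
    differentially expressed; 'experiments' is the number in which it was
    assayable. The binomial model itself is an assumption of this sketch.
    """
    return sum(
        comb(experiments, i)
        * background_rate ** i
        * (1 - background_rate) ** (experiments - i)
        for i in range(hits, experiments + 1)
    )


if __name__ == "__main__":
    # Same 50% discovery rate, but a different number of assayable experiments.
    print(binomial_tail(5, 10, 0.1))
    print(binomial_tail(10, 20, 0.1))
```

Both hypothetical genes are discovered in half of the experiments in which they were assayable, yet the gene assayed in twenty experiments receives the smaller tail probability, which is the kind of separation that would produce two tails in a scatterplot of discovery rate against P value.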

We next identified targets that were assigned very high consensome rankings, but whose functional importance in the context of signaling by the corresponding signaling nodes has been largely uncharacterized in the research literature. The tumor protein D52-like 1 (TPD52L1) gene encodes a little-studied protein that bears sequence homology to members of the TPD52 family of coiled-coil motif proteins that are overexpressed in a variety of cancers.

Genes selected for Q-PCR validation are colored orange and called out by approved gene symbol.

Data are representative of three independent experiments. M, membrane; N, nucleus; P, perinuclear junctions; SF, stress fibers. Cells were harvested on Day 5.

MBOAT2 encodes an enzyme catalyzing cycles of glycerophospholipid deacylation and reacylation to modulate plasma membrane phospholipid asymmetry and diversity. In contrast to the large volume of literature devoted to these targets, however, with the exception of a mention in a couple of androgen expression profiling studies [15, 16], the functional role of MBOAT2 in the context of AR signaling has been entirely unstudied.

This result was unexpected to us given the prevailing perception of AR as a driver of prostate tumor growth, but can be rationalized in the context of suppression of growth and support of differentiation by AR in normal prostate luminal epithelium. Such an assertion is supported by the recent characterization of the role of MBOAT2 in chondrogenic differentiation of ATDC5 cells [18], and by the fact that the AR agonist testosterone stimulates the chondrogenic potential of chondrogenic progenitor cells.

The experimental validation studies in the first use case focused on distinct single node-target regulatory relationships.

We next wished to validate the use of consensome intersection analysis to highlight convergence of multiple signaling nodes on targets involved in a common downstream biological process. Although regulation of glycogen metabolism in a variety of organs is known to be under the control of signaling mediated by the glucocorticoid (GR) [22], estrogen receptor-related (ERR) [23] and insulin (IR) [24] receptor families, the respective underlying mechanisms are incompletely understood. We wished to use SPP consensomes to investigate the hypothesis that regulation of glycogen metabolism by members of these distinct receptor families might involve convergent regulation of glycogen synthase activity.

Corroborating these predicted regulatory relationships, we identified conserved GR and ERR response elements in the Ppp1r3c promoter (Supplementary Information Section 6).

Right panel. Activity of the Prkab2. One day post-transfection MB were then cultured in 0.

C2C12 myotubes were treated as described in Methods. Mcrip2 is induced by Ppargc1 co-nodes in C2C12 myotubes in an Esrra-dependent manner. Differentiated adipocytes were infected with adenoviruses expressing Ppargc1a and Gadd45g and mRNA levels measured as described in the Methods section.

The control of cellular mitochondrial content and oxidative capacity is important for cellular and organismal energy homeostasis. For example, brown and beige adipocytes generate new mitochondria and increase their oxidative and thermogenic capacity in response to norepinephrine (NE), which is secreted locally when the organism senses a cold environment. NE adrenergic stimulation elicits an acute transcriptional response, exemplified by the induction of genes such as the uncoupling protein Ucp1, the Pparg co-node Ppargc1a and the signaling regulator Gadd45g [31]. In vivo, chronic or repeated exposure to cold or to adrenergic agonists also leads to higher mitochondrial DNA content, increased cristae density and enhanced expression of oxidative enzymes (OxPhos complexes) and Ucp1. We also confirmed the interdependence of Esrra, Ppargc1a and Ppargc1b in regulation of Mcrip2 in mouse muscle cells Fig.

Here, we set out to complement these resources to allow researchers to routinely answer targeted questions such as: what cell cycle-related factors are regulated by FGF receptors in human liver? What genomic targets are most responsive to insulin receptor signaling in the liver? What targets in my gene set are regulated by E3 ubiquitin ligases? To fill this gap, we designed a knowledgebase, SPP, which allows bench researchers to routinely evaluate evidence in public transcriptomic or ChIP-Seq datasets to infer the roles of cellular signaling pathway nodes in their system of interest.

The SPP resource is characterized by a number of unique features. Previous transcriptomic meta-analysis approaches in the field of cellular signaling have been perturbation-centric, and applied to experiments involving a single unique perturbant [47]. Consensomic analysis differs from these approaches in that it is node-centric: that is, it is predicated upon the functional relatedness of any genetic or small molecule manipulation of a given pathway node, and accordingly incorporates a step that maps experiments to their relevant pathway nodes.

This mapping step affords the consensomic analysis greater statistical power, enabling it to call potential node-target relationships with a higher degree of confidence than would otherwise be possible. Incorporation of this mapping approach into the Regulation Reports also serves to place emphasis on the functional relatedness of distinct experimental perturbations with respect to a given node-target regulatory relationship.

An additional unique aspect is that many other primary analysis and meta-analysis studies describing integration of transcriptomic and ChIP-Seq datasets, although insightful, are limited in scope, and exist only as stand-alone literature studies. Ours is to our knowledge the first meta-analysis to be sustainably integrated into an actively-biocurated FAIR web resource in a manner supporting routine dataset re-use and citation by bench researchers lacking formal informatics training.

Our resource has a number of limitations. Future versions of the knowledgebase will only benefit from the incorporation of the growing number of metabolomic and proteomic profiling datasets, which will illuminate effects of signaling pathways on cellular functions not addressed by transcriptional methodologies. Secondly, bias in publicly archived datasets towards specific nodes and biosamples is to some extent reflected in SPP. Other limitations of the consensomes relate to the design of available archived experiments.

For example, certain targets may be regulated by a given node only under specific circumstances e. Moreover, a low ranking for a target in a consensome does not necessarily imply the complete absence of a regulatory relationship, and may reflect the requirement for a quite specific cellular context e.

Caveats such as these notwithstanding, we believe SPP adds value to the currently available tool set enabling bench investigators to re-use archived discovery scale transcriptional datasets for hypothesis generation and data validation.

Netsearch determines networks by integrating protein-protein interaction data with microarray expression data by extracting subnetworks of the protein interaction dataset whose members have the most correlated expression profiles [ 22 ].

Another method highlights the order of signaling pathway components, assuming all the components on the pathways are known [ 21 ]. It constructs a score function based on the correlations between each gene pair to determine the final signal transduction network.

All of the above methods aim to restructure the topologies of known signaling pathways. However, to our knowledge, no open-source methods have been reported that simultaneously and comprehensively identify the set of active signaling pathways and the likely pathway structures for a gene expression profile i.

Additionally, most of the above methods were evaluated and applied to yeast PPI data, with only a few methods designed specifically to deal with the significantly greater complexity of mammalian data. Here we propose an approach to systematically identify the set of active receptor-mediated signaling pathways within any given cell, by combining PPI and gene expression data.

The list of R proteins was collected from a curated database of the Fantom5 project [ 24 ]. The list of K proteins was collected from the Uniprot curated database [ 23 ].

Please note that we have considered here all the physical and other inferred e. This thresholding yielded 16, and 19, PPIs for mouse and human respectively. After obtaining these highly scored PPIs for both human and mouse, we merged all the PPIs, assuming that the molecules have a one-to-one homology mapping between the organisms. We then took the union of all PPIs and assigned the larger score to any PPI that is present in both organisms.
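
The merge step can be sketched in a few lines of Python. This is an illustration rather than the authors' code: representing a PPI as a (source, target) tuple, treating upper-cased gene symbols as the one-to-one homology mapping, and the example scores are all assumptions.

```python
def normalise(pair):
    """Map a (source, target) pair to a species-neutral key.

    Assumption: upper-casing gene symbols stands in for the one-to-one
    mouse/human homology mapping.
    """
    source, target = pair
    return source.upper(), target.upper()


def merge_ppis(mouse, human):
    """Union of two {pair: score} dicts, keeping the larger score on overlap."""
    merged = {}
    for table in (mouse, human):
        for pair, score in table.items():
            key = normalise(pair)
            merged[key] = max(score, merged.get(key, score))
    return merged


if __name__ == "__main__":
    # Hypothetical interactions and scores.
    mouse = {("Fgfr1", "Frs2"): 800, ("Egfr", "Grb2"): 900}
    human = {("FGFR1", "FRS2"): 950, ("INSR", "IRS1"): 700}
    print(merge_ppis(mouse, human))
```

Keeping the larger of the two scores means an interaction that is well supported in either organism survives in the merged table with its strongest evidence.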

This process included interactions from:. To make the signaling pathway paths, we first made a directed weighted graph from the PPI data using the igraph R package. As igraph considers the weight of the interaction as a cost i. We collected all the complete paths (a path is called complete if it starts from an R protein and ends at a TF protein) that have a length from 3 to 7, allowing for at most 2 layers for RP, 5 layers for KN and 1 layer for TF.
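
To make the layer constraints concrete, the Python sketch below enumerates receptor-to-TF paths by depth-first search. It deliberately swaps in exhaustive enumeration for the igraph-based, weight-aware search used in the source, ignores edge weights, and counts path length in nodes; those choices, the role labels "R", "K" and "TF", and the example graph are all assumptions.

```python
def complete_paths(edges, role, min_nodes=3, max_nodes=7, max_layers=None):
    """Enumerate simple paths that start at a receptor and end at a TF.

    edges: {node: [downstream nodes]}; role: {node: "R" | "K" | "TF"}.
    max_layers caps how many nodes of each role one path may contain.
    """
    max_layers = max_layers or {"R": 2, "K": 5, "TF": 1}
    order = {"R": 0, "K": 1, "TF": 2}  # a path never steps back a layer
    found = []

    def extend(path, counts):
        last = path[-1]
        if role[last] == "TF":  # a TF always terminates the path
            if min_nodes <= len(path) <= max_nodes:
                found.append(list(path))
            return
        if len(path) >= max_nodes:
            return
        for nxt in edges.get(last, ()):
            nxt_role = role.get(nxt)
            if nxt_role is None or nxt in path:
                continue
            if order[nxt_role] < order[role[last]]:
                continue
            if counts[nxt_role] >= max_layers[nxt_role]:
                continue
            counts[nxt_role] += 1
            path.append(nxt)
            extend(path, counts)
            path.pop()
            counts[nxt_role] -= 1

    for start in sorted(edges):
        if role.get(start) == "R":
            extend([start], {"R": 1, "K": 0, "TF": 0})
    return found


if __name__ == "__main__":
    # Hypothetical graph: two receptors, two kinases, one TF.
    edges = {"R1": ["K1", "K2"], "K1": ["K2", "TF1"], "K2": ["TF1"], "R2": ["TF1"]}
    role = {"R1": "R", "R2": "R", "K1": "K", "K2": "K", "TF1": "TF"}
    for p in complete_paths(edges, role):
        print(" -> ".join(p))
```

In the example, the direct R2 to TF1 link is discarded because it is shorter than the minimum length, while three paths from R1 are kept. On a genome-scale network exhaustive enumeration would be far more expensive than a shortest-path search, so this form is only suitable for small illustrations.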

To identify cell type-specific paths, we then filtered out the complete paths where all factors were designated as housekeeping genes (see the next section for how the list of housekeeping genes was generated). As a result of these steps, the final collection of complete paths consists only of those that are not designated as housekeeping paths.

These paths are used as background pathway path data for our method. We collected the published RNA-seq gene expression data sets for different cells and tissues both for mouse and human from the ENCODE project [ 27 , 28 ], and processed them separately. Using this cut-off we identified the expressed genes for all the cells and tissues.

This approach was used to identify both the mouse and human housekeeping genes. These 2 lists of housekeeping genes were then combined to generate a unique list of housekeeping genes, assuming one-to-one homology mapping between human and mouse genes. This combined list of unique housekeeping genes was used as background data.

The background signaling pathway path data was used to identify the potential signaling pathways for a particular gene expression data set. From the gene expression data set, first we calculated the average expression value of the replicates and then identified the expressed genes by using the cut-off threshold described above. From the background path data we obtained only those paths for which all the protein factors are expressed according to the input gene expression data.
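
The two steps just described, averaging replicates and keeping only paths whose members are all expressed, might look as follows in Python. The function names, gene symbols and values are hypothetical, the cut-off is left as a parameter because its value is not given in this passage, and treating "expressed" as mean >= cut-off (rather than strictly greater) is an assumption.

```python
from statistics import mean


def expressed_genes(replicates, cutoff):
    """Genes whose mean expression across replicates reaches the cut-off.

    replicates: {gene: [value in replicate 1, value in replicate 2, ...]}
    """
    return {gene for gene, values in replicates.items() if mean(values) >= cutoff}


def potential_paths(background_paths, expressed):
    """Keep only background paths in which every protein factor is expressed."""
    return [path for path in background_paths if all(g in expressed for g in path)]


if __name__ == "__main__":
    # Hypothetical expression values for two replicates.
    replicates = {
        "FGFR1": [6.0, 8.0],
        "FRS2": [3.0, 5.0],
        "MYC": [0.5, 1.5],
    }
    background = [["FGFR1", "FRS2"], ["FGFR1", "MYC"]]
    expressed = expressed_genes(replicates, cutoff=2.0)
    print(potential_paths(background, expressed))
```

A single unexpressed member is enough to drop a path, which is what restricts the background collection to paths that could plausibly operate in the profiled sample.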

This set of paths is treated as potential signaling pathway paths for the gene expression data set. For each potential signaling pathway, we first calculated the proportion of active molecules (defined as highly expressed genes based on the above high expression threshold) for each path. We then summed all the proportions of all the paths for the pathway and divided the total proportion value by the total number of paths of the pathway.

This final value was termed the activity score (A_s) for a pathway and can be written as:

A_s = (p_1 + p_2 + ... + p_n) / n

where p_i denotes the proportion of active molecules in path i and n denotes the number of downstream TFs for the pathway. Next we plotted the values of n and A_s to display the top-ranked active signaling pathways in the upper positions.
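
As a worked illustration of the activity score, the Python sketch below averages, over the paths of one pathway, the fraction of molecules in each path that are highly expressed. It is not the SPAGI implementation (SPAGI is an R package): the function name, gene labels and the set of highly expressed genes are hypothetical, and the high-expression threshold is assumed to have been applied already.

```python
def activity_score(paths, highly_expressed):
    """Mean, over paths, of the proportion of highly expressed molecules.

    paths: list of paths, each a list of gene symbols for one pathway.
    highly_expressed: set of gene symbols passing the high-expression
    threshold (the threshold itself is outside this sketch).
    """
    if not paths:
        raise ValueError("a pathway needs at least one path")
    proportions = [
        sum(gene in highly_expressed for gene in path) / len(path)
        for path in paths
    ]
    return sum(proportions) / len(proportions)


if __name__ == "__main__":
    # Hypothetical pathway with two paths from one receptor.
    paths = [["R", "K1", "TF1"], ["R", "K2", "K3", "TF2"]]
    high = {"R", "K1", "TF1", "K2"}
    print(activity_score(paths, high))
```

In the example the first path is fully active (3 of 3 molecules) and the second half active (2 of 4), so the score is the mean of 1.0 and 0.5. The sketch divides by the number of paths; this equals the number of downstream TFs only if each path ends at a distinct TF, which is assumed here.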

The number of highly ranked active pathways for each sample was then counted. The false positive rate for highly ranked pathways was obtained by dividing the number of highly ranked pathways obtained from the randomly assigned data by the number of highly ranked pathways obtained from the original sample.
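
The ratio described above can be written out as a small Python sketch. What counts as "highly ranked" is not specified in this passage, so the two thresholds below (a minimum activity score and a minimum number of downstream TFs) are hypothetical parameters, as are the function names and example values.

```python
def count_highly_ranked(pathways, min_score, min_tfs):
    """Count pathways meeting both (hypothetical) ranking thresholds.

    pathways: {pathway name: (activity score, number of downstream TFs)}
    """
    return sum(
        1
        for score, n_tfs in pathways.values()
        if score >= min_score and n_tfs >= min_tfs
    )


def false_positive_rate(random_pathways, original_pathways, min_score, min_tfs):
    """Highly ranked count in randomised data divided by that in the original."""
    original = count_highly_ranked(original_pathways, min_score, min_tfs)
    if original == 0:
        raise ValueError("no highly ranked pathways in the original sample")
    return count_highly_ranked(random_pathways, min_score, min_tfs) / original


if __name__ == "__main__":
    # Hypothetical results for an original sample and a randomised copy.
    original = {"FGFR1": (0.9, 12), "BMPR1A": (0.8, 10), "ROR1": (0.7, 9), "NOTCH1": (0.2, 3)}
    randomised = {"FGFR1": (0.3, 12), "BMPR1A": (0.75, 9), "ROR1": (0.1, 2), "NOTCH1": (0.2, 3)}
    print(false_positive_rate(randomised, original, min_score=0.6, min_tfs=8))
```

Applying the same thresholds to both the randomised and the original results is what keeps the two counts comparable.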

GO analysis was also performed on the randomly assigned gene expression data used to determine the SPAGI false discovery rate. Each GO analysis was performed separately using the online version of Enrichr for biological processes [ 29 ]. Results were filtered, using the raw p-value, to retain only significant terms and signaling GO terms. The false positive rate for the GO analysis was calculated by dividing the number of highly ranked pathways obtained via the randomly assigned data by the number of highly ranked pathways obtained from the original data.

The ability of SPAGI to identify known, critical, tissue-specific signaling pathways was tested using four cell types obtained from three different gene expression data sets (two are RNA-seq and one is microarray). These four cell types were chosen as there is an extensive body of literature for them that has already identified critical pathways, thus enabling biological validation of the SPAGI output.

The first data set used is from mouse dental epithelial cells at the development stage E After all filtering, we have obtained 14, specific paths i. The result of mouse embryonic dental epithelium cell at E The red color indicates the known pathways.

A large number of studies over decades have described the requirement for different signaling pathways during lens development. The BMP pathway is also involved in pre-placodal induction, invagination of the lens placode, LEC proliferation and survival, and LF cell differentiation.

Signaling through different integrins is required early in lens differentiation, and for cell adhesion, lens capsule assembly and normal development of both LECs and LFs. Cadherins are required for appropriate polarity, adhesion and survival of LECs, and for LF cell elongation.

Critically, how molecular integration of all these pathways occurs during lens development or formation of different cataract subtypes is currently unclear [ 34 ]. After all filtering this analysis gave us 25, specific paths i. Moreover, the activity score with the number of downstream TFs was able to preferentially rank these known critical lens pathways over other pathways identified within the mouse LEC data.

This analysis also gave us 23, specific paths i. The result of newborn mouse lens epithelial cell. The details of the FGFR1 pathway are shown in the figure. The details of the ROR1 pathway are shown in the figure. We have obtained 13, specific paths i.

Differences in the ranking of particular pathways provide indications of how these pathways are integrated in the transition from LECs to LF cells e. Overall, these results show that the SPAGI R package can accurately identify and rank known, critically-important signaling pathways from the gene expression profiles of different cell and tissue types.

As shown in Figs. Additionally, new candidate critical tissue regulators can be identified via the activity score ranking. PVRL3 is also known to be associated with congenital ocular disease [ 37 ], so this may also be a potentially active pathway in the lens. The result of mouse lens fiber cell.

Notch signaling involves gene regulation mechanisms that control multiple cell differentiation processes during embryonic and adult life.

STAT signaling may hold the key to some immune deficiency research and various types of cancers. By studying the autophosphorylation behavior of this pathway, and the influence of STAT proteins in other pathways, researchers gain valuable insight into understanding how this essential signaling system interacts with cellular functions.

Cells typically receive signals in chemical form via various signaling molecules.

When a signaling molecule joins with an appropriate receptor on a cell surface, this binding triggers a chain of events that not only carries the signal to the cell interior, but amplifies it as well.

Cells can also send signaling molecules to other cells. Some of these chemical signals, including neurotransmitters, travel only a short distance, but others must go much farther to reach their targets.
