Tag: Proteomics

  • Why tumour geography matters — and how to map it

    To people with cancer, tumours can seem like amorphous clumps of defective cells, relentlessly focused on unconstrained growth and invasion. But this does not mean that they’re homogeneous. Cancerous cells have a broad spectrum of mutations, and growths contain healthy host cells, blood vessels and microscale fronts at which immune cells wage war with malignant tissue.

    Until around a decade ago, researchers were ill-equipped to explore this tumour microenvironment. But the emergence of tools that can spatially map large numbers of biomolecules, such as RNA and protein, has caused something of a revolution. Indeed, researchers are increasingly weaving these layers of information together to create rich ‘multiomic’ spatial maps that can classify diverse cell types and probe their activities throughout a tumour.

    “We’re not just talking about tumour heterogeneity any more — we can see it,” says Arutha Kulasinghe, a cancer biologist at the University of Queensland in Brisbane, Australia. “We can see pockets of drug resistance, sensitivity and different biology directly on the tissue.” The spatial factors that contribute to carcinogenesis and disease progression are also increasingly visible, revealing potential vulnerabilities in the process. Such capabilities could transform cancer research and pathology, making it possible to model, interpret and perhaps predict tumour biology with unprecedented sophistication.

    But the barriers to entry are high. There are many technology platforms for spatial omics analyses, and the experiments can be costly and complicated. Even with data in hand, cancer researchers can face a computational odyssey before they can make sense of their results. “Everybody wants what I like to call the ‘blender theory’ of multiomics, which is where you throw all the data sets together and it will tell you the answer as to what’s in them,” says Elana Fertig, a bioinformatician at Johns Hopkins Medicine in Baltimore, Maryland. “I’ve become less and less convinced that’s possible, because everybody has a different question that they want to ask.”

    Welcome to the neighbourhood

    For more than a decade, biologists have been studying tumour microenvironments by breaking tissue samples into individual cells and characterizing their molecular contents. These single-cell omics technologies are fairly simple to use, at least for RNA analysis. Instruments such as the Chromium from 10x Genomics in Pleasanton, California, can survey gene expression across millions of individual cells.

    Some researchers, such as cancer genomicist Dan Landau at the New York Genome Center in New York City, have even extended these tools to perform multiomic experiments — coupling transcription to other biological features, such as genomic mutations or epigenetic signals that directly govern gene expression at the single-cell level. “The vision is to try to start understanding how those layers are talking to one another,” says Landau.

    Such experiments can categorize cell types and reveal which biological processes those cells are engaged in — but they lack essential context. “It was pretty clear early on that we miss a lot of information by dissociating a tumour into single cells,” says Bernd Bodenmiller, a systems biologist at the University of Zurich in Switzerland and the Swiss Federal Institute of Technology. For example, the efficacy of immunotherapy against a given tumour depends not only on which immune cells are present, but also where they are in the tumour.

    In 2014, Bodenmiller helped to pioneer the spatial omics era when he and his colleagues combined a laser ablation technique with mass spectrometry to detect and localize proteins labelled with various metal-tagged antibodies (see Nature 567, 555–557; 2019). They called the approach imaging mass cytometry (IMC), and used it to quantify 32 proteins at subcellular resolution in a breast-tumour specimen. Bodenmiller says that these early experiments demonstrated the importance of spatially localized communities of inter-communicating cells, now known as ‘cellular neighbourhoods’, which would have been invisible using dissociated single cells. “These were the first striking examples for me of how the spatial arrangement of tumour cells — and how they form communities with other cells — is really strongly prognostic for patient outcome,” he says.

    Most spatial experiments today focus on the transcriptome, and there are numerous commercial platforms available. Some are sequencing-based, such as the Visium platform from 10x Genomics, which builds on a method developed in 2016 (see Nature 606, 1036–1038; 2022). Tissue slices are prepared on a slide coated with an array of location-barcoded DNA strands. The RNA is then released from the tissue, captured by these strands and converted to DNA for sequencing; the barcode associated with each sequence reveals where it was on the slide.
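
The barcode lookup described above can be illustrated with a minimal Python sketch. It is a toy model of the idea, not the Visium software: the barcode strings, the spot coordinates and the function name `count_by_spot` are all hypothetical, and real pipelines also handle sequencing errors, molecular identifiers and read alignment.

```python
from collections import Counter, defaultdict

# Hypothetical whitelist: spatial barcode -> (row, column) of its capture spot.
BARCODE_TO_SPOT = {
    "AAACGT": (0, 0),
    "CCGTTA": (0, 1),
    "GGTACC": (1, 0),
}


def count_by_spot(reads):
    """Tally gene counts per slide position from (barcode, gene) pairs.

    Reads whose barcode is not in the whitelist are skipped.
    """
    counts = defaultdict(Counter)
    for barcode, gene in reads:
        spot = BARCODE_TO_SPOT.get(barcode)
        if spot is None:
            continue  # unknown barcode: position cannot be recovered
        counts[spot][gene] += 1
    return counts


# Toy reads: (spatial barcode, gene the sequence was assigned to).
reads = [
    ("AAACGT", "EPCAM"),
    ("AAACGT", "EPCAM"),
    ("CCGTTA", "PTPRC"),
    ("TTTTTT", "ACTB"),  # barcode not in the whitelist
]
spot_counts = count_by_spot(reads)
```

Each sequenced molecule carries its own address, so the expression map is rebuilt by grouping reads on that address.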

    Other methods are imaging-based. For example, the MERSCOPE platform from Vizgen in Cambridge, Massachusetts, is based on the technique MERFISH. First reported in 2015 (ref. 1), the technique involves the serial labelling of tissue samples with fluorescently tagged probes that enable direct visualization, identification and quantification of transcripts in a specimen.

    The choice of platform involves trade-offs. “Generally, the imaging-based technology can capture a larger piece of tissue area, whereas with the sequencing-based [methods] you capture a lot less,” says Kai Tan, a research oncologist at the Children’s Hospital of Philadelphia in Pennsylvania. Imaging-based methods also tend to offer superior spatial resolution — down to the cellular or even sub-cellular scale — but are more labour-intensive and constrained, typically requiring users to select which genes to probe rather than broadly interrogating the tissue RNA, and profiling a smaller fraction of the transcriptome than dissociated, single-cell methods. Sequencing methods can detect even unexpected transcripts, albeit often at lower spatial resolution. But “those two worlds are converging”, Landau notes.

    For instance, the Slide-tags method offers an inventive alternative, in which the address-defining barcodes for spatial transcriptomics are delivered directly into the confines of the cell nucleus, providing subcellular resolution (ref. 2). These nuclei can then be isolated and analysed more extensively with a range of single-cell methods.

    Regardless of the platform, spatial transcriptomics is unlocking exciting opportunities for cancer researchers. For example, neurosurgeon Dieter Henrik Heiland at the University of Freiburg in Germany has used these techniques to tease apart the conditions that foster the growth and invasive behaviour of brain tumours, such as glioblastoma — specifically, the impact of certain myeloid bone marrow cells on the activity of immune system T cells. “We could identify defined patterns, defined architectures that we could not do before with any other technologies,” he says.

    A multiplicity of maps

    Increasingly, however, transcriptomics represents not the entirety of the spatial analysis but one component of it: a ‘baseline layer’, as Heiland puts it. “Then, we think what we can do on top.”

    Often, that’s spatial proteomics. Although all proteins are translated from messenger RNAs, not all mRNAs give rise to proteins. Kulasinghe says that in his experience, spatial patterns of RNA and protein can differ by up to 50% in a given sample, such that transcription levels might not reliably predict protein output. Proteins can also form complexes and undergo chemical modifications that would be impossible to determine from transcriptomic data alone. Proteomic analysis is therefore a crucial component in understanding tumour spatial biology. “People are stuck with proteins forever,” says Garry Nolan, an immunologist at Stanford University in California.

    Figure: Data image created using the MERSCOPE platform, showing genes in human ovarian cancer tissue represented by different colours. Credit: Vizgen

    Today’s spatial proteomics toolbox includes methods that can profile dozens or even hundreds of proteins at a time. For example, Nolan’s group developed the widely used CODEX method (now commercialized by Akoya Biosciences of Marlborough, Massachusetts, as the PhenoCycler system) in 2018 (ref. 3). This approach uses DNA-tagged antibodies for up to 100 protein targets, which are sequentially detected with an enzymatic process that specifically adds dye-labelled nucleotides to a subset of those DNA tags; these dyes are then cleaved off before the next imaging round. Similarly, the GeoMx platform from NanoString in Seattle, Washington, allows researchers to image RNA at the same time as several hundred proteins in the same sample.

    Fertig and her team reported the combined power of spatial proteomics and transcriptomics in a study that explored the involvement of cells known as fibroblasts in the progression of premalignant pancreatic growths to cancer (ref. 4). “With the transcriptomics data, we were able to find the fibroblasts and determine their impact on epithelial cells,” Fertig says. The method lacked the spatial resolution to discriminate between cell types fully, but layering on IMC data revealed how some cancer-associated fibroblasts help to establish a microenvironment that promotes malignant growth.

    Perhaps the most direct readout of what a cell is doing at any given moment is the metabolome: the sugars, lipids, peptides and other biomolecules that act as inputs and outputs of biological processes. Several groups are mapping the metabolome using imaging mass spectrometry, in which a laser is scanned over a specially prepared sample to generate spatially localized chemical signatures. “The beauty of the technology is you basically get a completely different picture of your tissue than what you have in your transcriptomic data,” says Heiland. In one 2022 study, Heiland and his colleagues combined this approach with spatial transcriptomics and imaging mass cytometry to map out patterns of oxygen deprivation in glioblastoma. They found that hypoxic conditions lead to more-severe genomic disruption and abnormal gene expression (ref. 5).

    Room for error

    Still, spatial omics can be intimidating for newcomers. “Everybody wants to adopt spatial, but it’s overwhelming,” says Jasmine Plummer, a geneticist at St Jude Children’s Research Hospital in Memphis, Tennessee. As head of the hospital’s Center for Spatial Omics core facility, she advises users to “think about a specific question you want to answer, not just a fishing expedition”, and then select the method that provides the necessary resolution, multiplexing or other capabilities.

    Some platforms allow users to directly survey multiple molecular categories at once. For example, the NanoString GeoMx and CosMx instruments can perform both protein and gene-expression analysis, and the Landau group collaborated with 10x Genomics to achieve similar analyses using Visium (ref. 6). But Bodenmiller cautions that in some cases, “you don’t get the optimum of each method” with simultaneous analyses. For example, the enzymatic digestion steps required to liberate RNA from tissue can damage proteins. However, optimized workflows are emerging to serially analyse the same specimen using multiple platforms, with the sample-preparation process modified to minimize loss between steps. “I think researchers have realized that you need optimized technology stacks,” says Kulasinghe.

    Other groups perform parallel analyses on consecutive thin sections derived from the same tumour sample, then align and merge the resulting data sets. This is harder than it sounds, however. “If you go from one tissue section to the next, you only find about 50–60% of the cells in both sections,” says Bodenmiller. Furthermore, different experimental formats can produce radically different data types, confounding integration. For example, an IMC experiment yields an array of pixels denoting different proteins at subcellular resolution, whereas sequencing-based transcriptomic experiments map ‘spots’ that often encompass multiple cells. “Every data set has to be treated on its own, and you have to figure out how we can now integrate those data by some kind of similarity measurement,” says Heiland.

    There is also the fundamental challenge of segmentation: accurately defining and classifying individual cells in the spatial data. “If you cannot accurately segment the boundary of the cell, then everything downstream will be off,” says Tan. Different spatial platforms bring different challenges, and there are no universal solutions — Kulasinghe’s team has tested multiple algorithms for this purpose and observed inconsistent performance. As a solution, his team draws boundaries based on a ‘majority vote’ derived from multiple algorithms. Kulasinghe also emphasizes the importance of using conventional histology stains to fact check algorithmic analyses and establish ‘ground truth’ for a spatial study.
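
The article does not say how the ‘majority vote’ is computed, so the following Python sketch shows only one simple reading of the idea: a pixel-wise vote over binary foreground masks produced by several segmentation algorithms. The function name and the toy masks are hypothetical, and a real workflow would also need to reconcile individual cell labels, not just foreground pixels.

```python
def majority_vote(masks):
    """Pixel-wise majority vote over binary masks (1 = cell, 0 = background).

    masks: list of equally sized 2D lists, one per segmentation algorithm.
    A pixel is called 'cell' when more than half of the algorithms agree.
    """
    n_algorithms = len(masks)
    n_rows, n_cols = len(masks[0]), len(masks[0][0])
    return [
        [sum(mask[r][c] for mask in masks) * 2 > n_algorithms for c in range(n_cols)]
        for r in range(n_rows)
    ]


# Toy output of three hypothetical algorithms on a 2 x 3 pixel image.
mask_a = [[1, 1, 0], [0, 1, 0]]
mask_b = [[1, 0, 0], [0, 1, 1]]
mask_c = [[1, 1, 0], [0, 0, 1]]
consensus = majority_vote([mask_a, mask_b, mask_c])
```

With an odd number of algorithms there are no ties; with an even number, this sketch treats a tie as background.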

    Above all, careful planning is essential. Spatial omics experiments are expensive — Kulasinghe says that a single imaging-based transcriptomics assay can cost nearly US$10,000 — and can generate terabytes of data. “Getting pilot data in this realm is important,” says Plummer. “I don’t think you want to take a whole deep dive in until you’ve understood your data a little bit first.”

    The final frontier?

    Fortunately, the number of core facilities is growing, giving researchers access to expert guidance as well as the technological capabilities needed to perform spatial analyses.

    In parallel, international and cross-institutional research efforts are leveraging single-cell — and, increasingly, spatial — multiomic analysis at unprecedented scale, including the US National Institutes of Health-backed Human Tumour Atlas Network and global consortium the Human Cell Atlas. These efforts are developing and optimizing analytical pipelines and tools, and, more importantly, generating vast collections of reference data for healthy and diseased tissues that the scientific community can use to interpret future experiments. “Without those large initiatives, we would really not be at the state of technology and possibility where we are now,” says Bodenmiller.

    Meanwhile, some spatial-omics pioneers are looking to new horizons. Bodenmiller’s group reported an alternative to IMC in which antibodies are labelled with isotopic tags that can be detected and distinguished by X-ray imaging rather than mass spectrometry, allowing rapid mapping of many proteins at once throughout the specimen (ref. 7). He says that the method could be an excellent fit for 3D imaging, and is fast because it avoids the slow scanning process that is typical of IMC. First, however, the team must work out logistical challenges, such as how to efficiently deliver antibodies into the interior of intact tissues.

    The imminent deluge of spatial data will also provide a treasure trove for researchers looking to apply deep-learning methods to cancer. This includes ‘digital pathology’ strategies, in which artificial-intelligence algorithms are trained to correlate features on conventional pathology slides with molecular indicators that are associated with tumour identity, prognosis and susceptibility to treatment. Companies are already entering this space with assays that guide drug selection based on spatial data, and Kulasinghe sees opportunities to assess immune activity in a tumour or predict the likelihood of metastasis without the need for spatial assays in the clinic. “This can give us deeper insights into the tumour microenvironments that ultimately associate with clinical endpoints,” he says.

    For his part, Nolan predicts a post-data world, in which the research priority shifts from generating molecular maps to using them to train AI models that reveal hidden vulnerabilities. “We’re going to be able to create a virtual tissue that looks just like colon cancer,” he says. “Then you can start to change the parameters, and say: ‘OK, how do I stop the following structure from forming?’”

  • Natural proteome diversity links aneuploidy tolerance to protein turnover

    Reagents

    Unless otherwise noted, reagents were purchased as follows. Bacto yeast extract (212750), Bacto peptone (211677), Bacto dehydrated agar (214010), water (LC–MS grade, Optima, 10509404), acetonitrile (ACN) (LC–MS grade, Optima, 10001334), methanol (LC–MS grade, Optima, A456-212) and formic acid (LC–MS grade, 13454279) were purchased from Fisher Chemicals. Heavy (¹³C₆/¹⁵N₂) lysine was purchased from Roth (2085.1) and Silantes (211604102). Trypsin (sequence grade, V511X) was purchased from Promega. D-glucose (G7021), glycerol (G2025), DL-dithiothreitol (BioUltra, 43815), iodoacetamide (BioUltra, I1149), ammonium bicarbonate (eluent additive for LC–MS, 40867), yeast nitrogen base without amino acids (Y0626) and glass beads (acid washed, 425–600 µm, G8772) were purchased from Sigma-Aldrich. Urea (puriss. p.a., reag. Ph. Eur., 33247H) and acetic acid (eluent additive for LC–MS, 49199) were purchased from Honeywell Research Chemicals. Ninety-six-well solid-phase extraction plates (MACROSpin C18, 50–450 μl, SNS SS18VL) were purchased from the Nest Group.

    Yeast strains

    The natural isolate library comprises 1,023 strains in total, of which 997 strains were previously described (ref. 1) to be representative of the entire S. cerevisiae species. A further 26 strains were described in two studies (refs 7, 56). The isolates were arranged in a 96-well plate format according to estimated growth rates from growth on YPD agar. Aneuploidies in strains from ref. 56 and laboratory isolates were manually detected through the coverage plots of the genomic read mapping. For all other isolates, the previously described aneuploidy annotations (ref. 1) were used. All strain details including aneuploidy, phylogenetic classification, ecological origin of isolation and ploidy are provided in Supplementary Table 1.

    Mat A disomic yeast strains, constructed by A. Amon’s laboratory (ref. 3), were provided by R. Li (disome WT, strain 11311; disome 1, strain 12683; disome 2, strain 12685; disome 4, strain 24367; disome 5, strain 14479; disome 8, strain 13628; disome 9, strain 13975; disome 10, strain 12689; disome 11, strain 13771; disome 12, strain 12693; disome 13, strain 12695; disome 14, strain 13979; disome 15, strain 12697; and disome 16, strain 12700).

    Transcriptomic data

    Raw read counts are described in a parallel study (ref. 6) and were filtered to include only genes with a mean of more than 1 count per million across measured strains (Supplementary Table 18). These filtered read counts were then normalized using the trimmed mean of M-values (TMM) method (ref. 57) as implemented in edgeR (refs 58, 59). Non-zero, non-log2-transformed counts-per-million values were used for further analysis.
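
As a rough illustration of the expression filter described above, the following Python sketch keeps genes whose mean counts per million (CPM) across strains is greater than 1. It is not the authors' code: the gene names and counts are invented, library size is assumed to be the plain column sum, and the TMM normalization step (performed in edgeR in the study) is deliberately left out.

```python
def counts_per_million(counts):
    """Convert raw counts to CPM, sample by sample.

    counts: dict mapping gene -> list of raw read counts, one per strain.
    Library size is taken as the column sum (an assumption of this sketch).
    """
    n_samples = len(next(iter(counts.values())))
    library_sizes = [sum(v[i] for v in counts.values()) for i in range(n_samples)]
    return {
        gene: [1e6 * v[i] / library_sizes[i] for i in range(n_samples)]
        for gene, v in counts.items()
    }


def filter_by_mean_cpm(counts, min_mean_cpm=1.0):
    """Keep genes whose mean CPM across strains is strictly above the cutoff."""
    cpm = counts_per_million(counts)
    return {
        gene: counts[gene]
        for gene, values in cpm.items()
        if sum(values) / len(values) > min_mean_cpm
    }


# Hypothetical raw counts for three genes in two strains.
raw_counts = {
    "GENE_A": [700_000, 1_200_000],
    "GENE_B": [299_999, 799_998],
    "GENE_C": [1, 2],  # exactly 1 CPM in both strains, so not "more than 1"
}
kept = filter_by_mean_cpm(raw_counts)
```

The strict inequality mirrors the wording "a mean of more than 1 count per million"; a gene sitting exactly at the cutoff is dropped.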

    Microarray gene-expression data for lab-engineered disomic strains were downloaded from the supplementary material of a previous study (ref. 3). Only data for strains grown in batch culture were used. Raw expression profiling data for this dataset are available from the Gene Expression Omnibus database (ref. 60) under the accession number GSE7812.

    High-throughput cultivation of yeast isolates

    Natural isolates

    The yeast samples were cultivated and digested as follows: the collection was grown on agar plates containing synthetic minimal (SM) medium (6.8 g l−1 yeast nitrogen base, 2% glucose, without amino acids). Subsequently, colonies were inoculated in SM liquid medium (200 μl) and incubated at 30 °C overnight. Then, 160 µl of the culture was transferred to 96-deep-well plates pre-filled with one borosilicate glass bead in each well and diluted 10× in SM liquid medium to a total volume of 1.6 ml per well. Plates were sealed using an oxygen-permeable membrane and grown at 30 °C to exponential phase (Supplementary Table 2), shaking at 1,000 rpm for 8 h. Then, 1.5 ml of cell suspension was transferred to a new deep-well plate and collected by centrifugation (3,220g, 5 min, 4 °C). The supernatant was discarded and plates were immediately cooled on dry ice, then stored at −80 °C until further processing.

    Lab-engineered synthetic disomic strains

    Samples were grown using SD-His+G418 agar and medium selecting for the duplicated chromosomes (6.7 g l−1 yeast nitrogen base without ammonium sulfate, Difco 233520; 20 g l−1 glucose; 1 g l−1 monosodium glutamate, VWR 27872.298; 0.56 g l−1 CSM-His-Leu-Met-Trp-Ura, MP Biomedicals 4550422; 0.02 mg ml−1 uracil; 0.06 mg ml−1 leucine; 0.02 mg ml−1 methionine; 0.04 mg ml−1 tryptophan; 200 µg ml−1 G418, Gibco 11811023). Each disomic strain and the euploid wild type were set up in triplicate. The procedure for cultivation and lysis of the disomic strains was as described above, except that the collection by centrifugation was performed at 2,700g, 10 min and 4 °C.

    Preparation of proteomics samples

    Natural isolates

    The samples for proteomics were prepared in 96-well plates as previously described (refs 41, 61), with up to four plates processed in parallel. For yeast lysis, 200 µl of lysis buffer (100 mM ammonium bicarbonate and 7 M urea) and around 100 mg glass beads were added to each well, followed by 5 min bead beating at 1,500 rpm (Spex Geno/Grinder). For reduction and alkylation, 20 μl of 55 mM DL-dithiothreitol (1 h incubation at 30 °C) and 20 μl of 120 mM iodoacetamide (incubated for 30 min in the dark at ambient temperature) were used. Subsequently, 1 ml of 100 mM ammonium bicarbonate was added per well, followed by centrifugation (3,220g, 3 min) and 230 μl of this mixture was transferred to plates pre-filled with 0.9 μg trypsin per well. The samples were incubated for 17 h at 37 °C and the digestion was subsequently stopped by adding 24 μl of 10% formic acid (FA). The mixtures were cleaned up using C18 96-well plates, with 1-min centrifugations between the steps at the described speeds. The plates were conditioned with methanol (200 μl, centrifuged at 50g), washed twice with 50% ACN (200 μl, centrifuged at 50g) and equilibrated three times with 3% ACN/0.1% FA (200 μl, centrifuged at 50g, 80g and 100g, respectively). Then, 200 μl of the digested sample was loaded (centrifuged at 100g) and washed three times with 3% ACN/0.1% FA (200 μl, centrifuged at 100g). After the last washing step, the plates were centrifuged at 180g. Subsequently, peptides were eluted in three steps, twice with 120 μl and once with 130 μl of 50% ACN (180g), and collected in a plate (1.1 ml, square well, V-bottom). The collected material was completely dried on a vacuum concentrator and redissolved in 40 μl of 3% ACN/0.1% FA before transfer to a 96-well plate. The final peptide concentration was estimated by absorption measurements at 280 nm with a Lunatic photometer (Unchained Labs, 2 µl of sample). All pipetting steps were performed with a liquid handling robot (Biomek NXP) and samples were shaken on a thermomixer (Eppendorf Thermomixer C) after each step.

    Lab-engineered synthetic disomic strains

    Lysis, reduction, alkylation and digestion of the disomic strains were performed as described above. The digest was quenched using 25 µl of 10% FA per sample. The conditioning of the solid-phase-extraction plates was performed as described above, but using 0.1% FA instead of the 3% ACN/0.1% FA mixture. After loading 200 µl of the digested sample, the columns were washed four times with 200 µl of 0.1% FA followed by centrifugation (150g). Purified peptides were collected by three consecutive elution steps using 110 µl of 50% ACN (centrifugation at 200g). After vacuum drying, peptides were dissolved in 30 µl of 0.1% FA. All steps of the sample preparation were performed by hand. Peptide concentrations were determined using a fluorimetric peptide assay kit following the manufacturer’s instructions (Thermo Fisher Scientific, 23290).

    LC–MS/MS measurements

    Natural isolates

    For the collection of natural isolates, liquid chromatography was performed on a nanoAcquity UPLC system (Waters) coupled to a Sciex TripleTOF 6600. Peptides (2 μg) were separated on a Waters HSS T3 column (150 mm × 300 μm, 1.8-μm particles) ramping in 19 min from 3% B to 40% B (solvent A: 1% ACN/0.1% FA; solvent B: ACN/0.1% FA) with a non-linear gradient (Supplementary Table 19). The flow rate was set to 5 μl min−1. The SWATH acquisition method (ref. 62) consisted of an MS1 scan from m/z 400 to m/z 1,250 (50 ms accumulation time) and 40 MS2 scans (35 ms accumulation time) with a variable precursor isolation width covering the mass range m/z 400 to m/z 1,250 (Supplementary Table 20). Proteomic raw data were recorded using Analyst v.1.8.1.

    Lab-engineered synthetic disomic strains

    Proteomics measurements were performed on an Agilent 1290 Infinity LC system coupled to a SCIEX TripleTOF 6600 equipped with an IonDrive source as previously described (ref. 41). Buffer A consisted of 0.1% FA in water, and buffer B of 0.1% FA in ACN. All solvents were LC–MS grade. Five micrograms of peptides per sample were separated at 30 °C with a 5-min active gradient starting with 1% B and increasing to 36% B on an Agilent Infinitylab Poroshell 120 EC-C18 column (2.1 × 50 mm, 1.9-μm particles). The flow rate was set to 0.8 ml min−1 and the scanning SWATH acquisition method consisted of an m/z 10-wide sliding isolation window.

    Generation of an experimental spectral library for strain S288c

    Five micrograms of yeast digest was injected and run on a nanoAcquity UPLC (Waters) coupled to a SCIEX TripleTOF 6600 with a DuoSpray Turbo V source. Peptides were separated on a Waters HSS T3 column (150 mm × 300 µm, 1.8-µm particles) with a column temperature of 35 °C and a flow rate of 5 µl min−1. A 55-min linear gradient ramping from 3% ACN/0.1% FA to 40% ACN/0.1% FA was applied. The ion source gas 1 (nebulizer gas), ion source gas 2 (heater gas) and curtain gas were set to 15 psi, 20 psi and 25 psi, respectively. The source temperature was set to 75 °C and the ion spray voltage to 5,500 V. In total, 12 injections were run with the following m/z mass ranges: 400–450, 445–500, 495–550, 545–600, 595–650, 645–700, 695–750, 745–800, 795–850, 845–900, 895–1,000 and 995–1,200. The precursor isolation window was set to m/z 1 except for the mass ranges m/z 895–1,000 and m/z 995–1,200, for which the precursor windows were set to m/z 2 and m/z 3, respectively. The cycle time was 3 s, consisting of high- and low-energy scans, and data were acquired in ‘high-resolution’ mode. The spectral libraries were generated using library-free analysis with DIA-NN directly from these scanning SWATH acquisitions. For this DIA-NN analysis, MS2 and MS1 mass accuracies were set to 25 ppm and 20 ppm, respectively, and the scan window size was set to 6.

    Proteomics data processing

    Natural isolates

    Protein-wise fasta files were created by inferring single-nucleotide polymorphisms for each strain on the basis of the reference genome of the S288c strain. In cases of heterozygosity, one of the possible alleles was randomly inferred (refs 1, 7). For non-reference genes, a single representative sequence per protein was available based on the genomes. The proteome for the reference strain S288c was obtained from UniProt (UP000002311, accessed 10 February 2020; ref. 63). Sequences of strains present in the original strain collections (refs 1, 7) and subject to intellectual property restrictions were excluded from our study, leading to the inclusion of 1,023 strains in the processing. To reduce the processing time and limit the search space to relevant peptides, the protein-wise fasta files were processed to select peptides that were well shared across the strain collection. The protein sequences were thus trypsin-digested in silico and missed cleavages were disregarded. Non-proteotypic peptides were excluded and only peptides shared by 80% of the strains were selected for further analysis. This list of peptides was used to filter the experimental library. Raw mass spectrometry files were processed using the filtered spectral library with the DIA-NN software (v.1.7.12; ref. 42). Default parameters of the software were used except for the following: mass accuracy, 20; mass accuracy MS1, 12. Because the peptides selected were not necessarily present ubiquitously in all the strains, an additional step was required to remove false-positive peptide assignments (entries in which a peptide is detected in a strain in which it should be absent). This filter led to the exclusion of around 1% of the entries. Samples with insufficient MS2 signal quality (around 5.7 × 10⁷) and entries with a q value greater than 0.01 or a protein group q value greater than 0.01 were removed. Outlier samples were detected on the basis of both the total ion chromatograms (TIC) and the number of identified precursors per sample (z-score > 2.5 s.d.) and were excluded from further analysis. Precursor normalized values as inferred by DIA-NN that were well detected across the samples (in at least 80% of the strains) and with CV < 0.3 in the quality control samples were retained. Subsequently, batch correction was performed at the precursor level by bringing median precursor quantities of each batch to the same value. Proteins were then quantified using the maxLFQ function (ref. 64) implemented in the DIA-NN R package, resulting in a dataset containing 1,576 proteins for 796 strains. Missing values (less than 4% of all values) were imputed using k-nearest neighbours (KNN) imputation (ref. 65).
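
The peptide-selection step described above (in silico tryptic digestion without missed cleavages, then keeping peptides shared across strains) can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' pipeline: the cleavage rule used here (after K or R, but not before P) is the common textbook rule and is not specified in the text; "shared by 80% of the strains" is read as "present in at least 80%"; the proteotypic filter is omitted; and the strain names and sequences are invented.

```python
from collections import Counter


def tryptic_peptides(sequence):
    """Fully cleaved tryptic peptides (no missed cleavages).

    Assumed rule: cleave after K or R unless the next residue is P.
    """
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal peptide
    return peptides


def shared_peptides(strain_sequences, min_fraction=0.8):
    """Peptides found in at least min_fraction of the strains.

    strain_sequences: dict mapping strain name -> protein sequence.
    """
    n_strains = len(strain_sequences)
    seen = Counter()
    for sequence in strain_sequences.values():
        seen.update(set(tryptic_peptides(sequence)))  # count each strain once
    return {pep for pep, k in seen.items() if k >= min_fraction * n_strains}


# Hypothetical protein variants: strain_5 carries a single substitution.
variants = {
    "strain_1": "MAKTTR",
    "strain_2": "MAKTTR",
    "strain_3": "MAKTTR",
    "strain_4": "MAKTTR",
    "strain_5": "MAKTSR",
}
selected = shared_peptides(variants)
```

In this toy case the variant peptide from `strain_5` is dropped because it occurs in only one of five strains, while the reference peptide survives at four of five.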

    Lab-engineered synthetic disomic strains

    Mass spectrometry files were processed using the experimental spectral library obtained through gas phase fractionation for the S288c strain with the DIA-NN software (v.1.7.12). Default parameters of the software were used except for the following: mass accuracy, 20; mass accuracy MS1, 12. The output from the software was then processed in R. Entries with a q value greater than 0.01 or a protein group q value greater than 0.01 and non-proteotypic peptides were removed. Samples with too low an optical density (OD) (less than 0.075) were excluded from further analysis (disome 4 and one replicate of disome 8). The precursor normalized values inferred by DIA-NN were used and precursors that were well detected across 80% of the samples were retained. Proteins were then quantified using the maxLFQ function implemented in the DIA-NN R package. The resulting dataset consists of 1,377 proteins for 38 samples. Missing values (less than 2.35% of all values) were imputed using the KNN approach. The median value of all available replicate measurements was used for each protein during all further analyses.
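
Two of the filtering steps described above can be sketched in a few lines of Python: retaining precursors detected in enough samples, and summarizing replicates by their median. This is a simplified illustration, not the authors' R code: the helper names are hypothetical, missing values are represented as `None`, "well detected across 80% of the samples" is read as "at least 80%", and the KNN imputation that precedes the replicate median in the study is not reproduced.

```python
from statistics import median


def well_detected(values, min_fraction=0.8):
    """True if a precursor has a value in at least min_fraction of samples.

    values: one quantity per sample, with None marking a missing value.
    """
    n_present = sum(v is not None for v in values)
    return n_present >= min_fraction * len(values)


def replicate_median(values):
    """Median of the available replicate measurements (None if there are none)."""
    observed = [v for v in values if v is not None]
    return median(observed) if observed else None


# Toy precursor quantities across five samples.
mostly_complete = [10.0, 12.0, None, 11.0, 13.0]  # 4 of 5 present
too_sparse = [10.0, None, None, 11.0, 13.0]       # 3 of 5 present

keep_first = well_detected(mostly_complete)
keep_second = well_detected(too_sparse)
summary = replicate_median([3.0, 5.0, 4.0])
```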

    Twenty-four-hour time-course proteomics

    Yeast isolates were cultivated on SM medium (as above) in batch culture. In brief, colonies from across an agar plate were incubated in 5 ml medium for 16 h at 30 °C, 750 rpm. The pre-culture was diluted to an optical density at 600 nm (OD600 nm) of 0.1 in 30 ml medium, and incubated for 24 h at 30 °C, 750 rpm. At regular intervals, the OD600 nm was recorded, and around 4 × 10⁷ cells were collected by centrifugation (5 min, 10,000g, 4 °C) at five time points to cover early exponential, mid-exponential, late exponential and stationary phases of growth. Samples were lysed in screw cap tubes by adding around 100 mg of glass beads and 160 µl of lysis buffer (7 M urea and 100 mM ammonium bicarbonate (ABC)), followed by four cycles of bead beating (5 min, 1,500 rpm followed by 5 min on ice) using a GenoGrinder. Samples were centrifuged (5 min, 10,000g, 4 °C) and the supernatant was transferred to a 500-µl 96-well plate. Twenty microlitres of 55 mM DTT was added to each well and the samples were incubated for 1 h at 30 °C. Subsequently, the plate was cooled on ice for 5 min, and then 20 μl of 120 mM IAA was added to each well. The samples were incubated for 30 min at 25 °C in the dark. The reduced and alkylated samples were diluted by adding 500 μl of 100 mM ABC to each well. Then, 2 μg of trypsin/LysC was added to each sample and the plate was incubated for 17 h at 37 °C. The digest was stopped by the addition of 35 µl 20% FA, and peptides were purified using solid-phase extraction as described above. Purified peptides were dried using a vacuum concentrator and dissolved in 35 µl 0.1% FA, and peptide concentrations were determined using a fluorimetric peptide assay kit following the manufacturer’s instructions (Thermo Fisher Scientific, 23290).

    Peptide separation was accomplished in a 63-min water to ACN active gradient on an Ultimate 3000 RSLnanoHPLC coupled to a Q Exactive Plus mass spectrometer (both Thermo Fisher Scientific) operating in data-independent acquisition (DIA) mode. Tryptic peptides (1 µg) were concentrated on a trap column (PepMap C18, 5 mm × 300 μm × 5 μm, 100 Å, Thermo Fisher Scientific, buffer containing 2:98 (v/v) ACN/water containing 0.1% (v/v) trifluoroacetic acid, flow rate of 20 μl min−1) and separated on a C18 column (Acclaim PepMap C18, 2 μm, 100 Å, 75 μm, 150 mm, Thermo Fisher Scientific) in a linear gradient from 5–28% buffer B in 63 min followed by an increasing step to 98% B in 1 min and washing for 9 min with 98% buffer B before equilibration for 15 min with initial conditions with a flow of 300 nl (buffer A, 0.1% formic acid; buffer B, 80% ACN and 0.1% formic acid). The total acquisition time was 100 min. The Orbitrap worked in centroid mode with a duty cycle consisting of one MS1 scan at 70,000 resolution power with a maximum injection time of 300 ms and an AGC target of 3 × 10⁶ followed by 40 variable MS2 scans using a 0.5-Da overlapping window pattern. The window length started with 25 MS2 scans at 12.5 Da, followed by 7 windows with 25 Da, and the last 8 windows were set to 62.5 Da. Precursor MS spectra (m/z 378–1,370) were analysed with 17,500 resolution after 110 ms accumulation of ions to a target value of 3 × 10⁶ in centroid mode. The following mass spectrometric settings were used: spray voltage, 2.1 kV; no sheath and auxiliary gas flow; heated capillary temperature, 275 °C; normalized HCD collision energy 27%. In addition, the background ion at m/z 445.1200 served as the lock mass.

    Raw data were processed using DIA-NN v.1.8 (ref. 42) with the scan window size set to 7 and the MS2 and MS1 mass accuracies set to 20 ppm and 10 ppm, respectively. A spectral library-free approach and yeast UniProt (UP000002311, reviewed, canonical, downloaded 18 November 2021)63 were used for annotation. The output was filtered at 1% FDR on peptide level. Log2-transformed protein expression levels between the aneuploid and the euploid isolate were calculated per time point for each protein present in at least two of the three biological replicates of the euploid strain in the given time point, and normalized per strain and time point as described below for the natural isolate library.

    Ubiquitinomics

    Selected aneuploid and euploid yeast isolates were cultivated in SM medium (6.7 g l−1 yeast nitrogen base with ammonium sulfate, Difco 291920, 20 g l−1 glucose) at 30 °C. Three individual pre-cultures per strain were cultured for 16 h, and used to inoculate three flasks of 30–50 ml SM medium per strain. Cultures were collected at mid-log phase by centrifugation (2,880g, 8 min, 4 °C) and pellets were frozen at –20 °C. Cells were lysed using glass beads (volume equal to pellet volume) in 200 µl freshly prepared SDC buffer (1% sodium deoxycholate, 10 mM TCEP, 40 mM chloroacetamide and 75 mM Tris-HCl, pH 8.5) by five cycles of 1 min vortexing, 1 min on ice. Samples were centrifuged (13,800g, 15 min, 4 °C) and the supernatant was collected. Protein concentrations were determined using a Pierce BCA Protein Assay Kit (Thermo Fisher Scientific, 23225). Then, 500 µg of proteins was digested with a trypsin/LysC mix (V5071 or V5072, Promega) overnight at 37 °C with a 1:50 enzyme-to-protein ratio. K-GG peptide enrichment was performed as reported previously39. The digestion was stopped by adding two volumes of 99% ethylacetate/1% TFA, followed by sonication for 1 min using an ultrasonic probe device (energy output of around 40%). The peptides were desalted using 30 mg Strata-X-C cartridges (8B-S029-TAK, Phenomenex) as follows: (a) conditioning with 1 ml isopropanol; (b) conditioning with 1 ml of 80% ACN/5% NH4OH; (c) equilibration with 1 ml of 99% ethylacetate/1% TFA; (d) loading of the sample; (e) washing with 2× 1 ml of 99% ethylacetate/1% TFA; (f) washing with 1 ml of 0.2% TFA; and (g) elution with 2× 1 ml of 80% ACN/5% NH4OH. The eluates were snap-frozen in liquid nitrogen and lyophilized overnight. K-GG peptide enrichment was performed by resuspending lyophilized peptides in 1 ml of cold immunoprecipitation (IP) buffer (50 mM MOPS pH 7.2, 10 mM Na2HPO4 and 50 mM NaCl). 
Peptides were then incubated with 4 µl of K-GG antibody bead conjugate (Cell Signaling Technology, PTMScan HS Ubiquitin/SUMO Remnant Motif (K-ε-GG) Kit, 59322) for 2 h at 4 °C with end-over-end rotation. Beads were washed (with the help of a magnetic stand) four times with 1 ml IP buffer and an additional time with cold Milli-Q water. After removing all of the supernatant, the beads were incubated with 200 µl of 0.15% TFA at room temperature while shaking at 1,400 rpm. After briefly spinning, the supernatant was recovered and desalted using in-house-prepared, 200 µl two plug StageTips66 with SDB-RPS (3M Empore, 2241). SDB-RPS StageTips were conditioned with 60 µl isopropanol, 60 µl 80% ACN/5% NH4OH and 100 µl 0.2% TFA. The K-GG enrichment eluate (0.15% TFA) was directly loaded onto the tips followed by two washing steps of 200 µl 0.2% TFA each. Peptides were eluted with 80% ACN/5% NH4OH. Peptides were Speedvac-dried and then resuspended in 10 µl of 0.1% FA, of which 4 µl was injected into the mass spectrometer.

    For LC–MS measurement, peptides were loaded on 40-cm reversed-phase columns (75 µm inner diameter, packed in-house with ReproSil-Pur C18-AQ 1.9 µm resin (ReproSil-Pur, Dr. Maisch)). The column temperature was maintained at 60 °C using a column oven. An EASY-nLC 1200 system (Thermo Fisher Scientific) was directly coupled online with the mass spectrometer (Q Exactive HF-X, Thermo Fisher Scientific) through a nano-electrospray source, and peptides were separated with a binary buffer system of buffer A (0.1% FA plus 5% DMSO) and buffer B (80% ACN plus 0.1% FA plus 5% DMSO), at a flow rate of 300 nl min−1. The mass spectrometer was operated in positive polarity mode with a capillary temperature of 275 °C. The DIA method consisted of an MS1 scan (m/z = 300–1,650) with an AGC target of 3 × 10⁶ and a maximum injection time of 60 ms (R = 120,000). DIA scans were acquired at R = 30,000, with an AGC target of 3 × 10⁶, ‘auto’ for injection time and a default charge state of 4. The spectra were recorded in profile mode and the stepped collision energy was 10% at 25%. The number of DIA segments was set to achieve an average of four to five data points per peak. For details on the DIA method set-up, see a previous report39.

    Raw data-independent acquisition data files were analysed using DIA-NN (v.1.8) in library-free mode searching against the S. cerevisiae reference proteome (strain ATCC 204508/S288c, UniProt ID UP000002311, excluding isoforms, accessed 18 November 2021). Trypsin/P, one missed cleavage, a maximum of two variable modifications (including cysteine carbamidomethylation and diglycine remnant modification, K-GG) and a precursor charge range between 2 and 4 were set for precursor ion generation. The MBR and remove interferences options were enabled and Robust LC (high precision) was chosen as the quantification strategy. Peptides with a diglycine remnant (UniMod: 121) were used to quantify genes using the built-in MaxLFQ algorithm in DIA-NN, with a global and run-specific FDR of 1% being applied at both the precursor and the protein group level. The resulting data were filtered to include only genes that were measured in at least two of the three biological replicates in at least one strain for further analyses.

    Intracellular lysine measurements

    Isolates (ABH, AFR, AHR, AHS, AII, AIP, ALK, ALM, ALT, AMC, ANA, ANR, APD, APM, APT, AQD, ARL, ARV, ASV, ATA, ATC, AVL, BBD, BBV, BDA, BDI, BDK, BDL, BFK, BFV, BHE, BIP, BKP, BLF, BPA, BPG, BPH, BPP, BTS, CAH, CAN, CCH, CHH, CLN, CLT, CME, CMF, CMM, CMN, CNL, CPB, CPE, CPQ, CPR, CPT, CQQ, CQR, CRL, SACE.YAB and SACE.YCO, as for dynamic SILAC experiments, see below) were randomized in triplicate on two 96-well microtitre plates. Six replicates of a lysine-auxotroph lab strain (BY4742-HLU)67 were also added (three positions randomly per plate). Colonies were picked after 48 h growth at 30 °C on SM medium + 2% agar with a Singer Rotor HDA (Singer Instruments) and pre-cultured overnight in 200 µl SM medium supplemented with labelled l-lysine (80 mg l−1) for the continuous labelling experiment, or 200 µl unlabelled l-lysine (80 mg l−1) for the switching experiment. The OD600 nm was measured after 17 h (Tecan Infinite) and cells were diluted to a starting OD of around 0.1 in 1.6 ml Lys-8- or Lys-0-labelled medium, respectively. For the continuous labelling experiment, isolates were cultured for 8 h at 30 °C whilst shaking, collected by centrifugation (2,900g, 10 min, 4 °C), and stored overnight at −80 °C. For the switching experiment, isolates were cultured at 30 °C whilst shaking for 4 h, then centrifuged (2,900g, 10 min, 30 °C), the supernatant discarded and the pellets washed using 1 ml SM medium, cultivated for another 3 h at 30 °C whilst shaking, and collected as above. The final OD600 nm at collection of all samples was measured.

    Amino acids were extracted by adding 200 µl pre-cooled 80% ethanol containing the internal standard D4-l-lysine (Silantes, 211113913) to each of the frozen cell pellets. The samples were incubated for 2 min at 80 °C and subsequently vortexed. This step was repeated two more times. The samples were centrifuged (2,900g, 10 min, 4 °C) and the supernatants were collected. The measurements of 1 µl per sample were performed on a triple quadrupole mass spectrometer system (Agilent 6460) as previously described68. Technical controls from pooled extracted metabolite samples were included and measured by LC–MS/MS after every 15th sample, in total 27 times. The analysis was performed using MassHunter Software B.07.01 (Agilent Technologies). The internal standard response ratios were calculated for each sample and normalized to the OD600 nm measured at collection.

    Dynamic SILAC

    Yeast strains (46 diploid aneuploid isolates with a single chromosome gain (trisomic strains) for which we quantified attenuation, 2 randomly chosen haploid aneuploid isolates with a single chromosome gain, as well as 10 diploid euploid and two haploid euploid isolates with a similar range of growth rates to that of the aneuploid isolates, meaning isolates ABH, AFR, AHR, AHS, AII, AIP, ALK, ALM, ALT, AMC, ANA, ANR, APD, APM, APT, AQD, ARL, ARV, ASV, ATA, ATC, AVL, BBD, BBV, BDA, BDI, BDK, BDL, BFK, BFV, BHE, BIP, BKP, BLF, BPA, BPG, BPH, BPP, BTS, CAH, CAN, CCH, CHH, CLN, CLT, CME, CMF, CMM, CMN, CNL, CPB, CPE, CPQ, CPR, CPT, CQQ, CQR, CRL, SACE.YAB and SACE.YCO; see also Supplementary Table 16) were grown on synthetic medium containing 6.7 g l−1 yeast nitrogen base, 2% glucose and 80 mg l−1 l-lysine (SM + Lys-0). For SILAC labelling, l-lysine was swapped for 80 mg l−1 heavy [13C6/15N2] lysine (SM + Lys-8). Cells were taken from cryo stocks and streaked on freshly prepared SM + Lys-0 agar plates (20 g l−1 agar) and grown for 48–72 h at 30 °C. Colonies across the whole agar plate were gathered and cultivated in 5 ml SM + Lys-0 for approximately 16 h at 30 °C and 300 rpm. The overnight pre-culture was then diluted in 25 ml in SM + Lys-0 (pre-warmed to 30 °C) to a starting OD600 nm of around 0.1. The culture was grown at 30 °C, 300 rpm until it reached an OD600 nm of between 0.25 and 0.3. At this point, the medium was switched from SM + Lys-0 to SM + Lys-8 using the following procedure. First, 20 ml of the culture was transferred into a 50-ml Falcon tube and centrifuged for 5 min at 30 °C, 3,095g. Then, the supernatant was decanted and the pellet was washed twice with 4 ml SM + Lys-8 (pre-warmed to 30 °C). Lastly, the pellet was resuspended in 20 ml warm SM + Lys-8 and transferred into clean flasks. 
The heavy-labelled cultures were grown at 30 °C, 300 rpm and at three time points (90 min, 135 min and 180 min), 2 ml of the culture was collected into ice-cold screw cap tubes. The samples were centrifuged at 10,000g, 4 °C, the supernatant aspirated and the pellets stored at −80 °C. At each collection time point, the OD600 nm was also recorded. Strains BPP, BDK, ATA and BFV did not grow well under the chosen conditions or were not growing exponentially when sampled, and were therefore omitted from further processing.

    Cells were lysed mechanically in screw cap tubes by adding around 100 mg glass beads and 100 µl fresh lysis buffer (7 M urea and 100 mM ABC) to each sample, followed by two cycles of bead beating (5 min, 1,500 rpm, followed by 5 min on ice) using a GenoGrinder. The samples were briefly centrifuged (4,000g, 1 min) and the supernatant was transferred to a 500-μl Eppendorf 96-well plate. From this step onwards, all samples were processed together in high throughput. Reduction, alkylation and digest were performed as described in the ‘Twenty-four-hour time-course proteomics’ section, using 10 μl of 55 mM DTT, 10 μl of 120 mM IAA, 380 μl of 100 mM ABC and 2 μg of trypsin/LysC. Samples were digested for 17 h at 37 °C. The digest was stopped by the addition of 25 μl of 20% FA and the samples were purified using solid-phase extraction as described above. Purified peptides were dried using a vacuum concentrator and subsequently dissolved in 25 µl 0.1% FA. Peptide concentrations were determined using a fluorimetric peptide assay kit following the manufacturer’s instructions (Thermo Fisher Scientific, 23290).

    For each strain, 1 µg of peptide sample was separated on a VanquishNeo System (Thermo Fisher Scientific) by reverse-phase chromatography with a 30-min efficient gradient from 3 to 30% ACN on a self-packed 20-cm column (ID 75 µm, 1.9-µm beads), and directly injected through electrospray ionization (ESI) to an Exploris480 Orbitrap (Thermo Fisher Scientific). In brief, the MS settings for Top20 acquisition scheme were the following: ESI voltage: 2.2 kV; resolution MS1 60k; IT MS1 10 ms; RF-Lens 55; resolution MS2 15k; maxIT MS2 50 ms; isolation width 1.2 Da; HCD collision energy 28; AGC target 100%.

    Raw files were analysed with MaxQuant v.1.6.7.0 using standard settings, with match between runs and requantify enabled, and the Uniprot S. cerevisiae protein database including isoforms (downloaded 9 February 2023) was selected for the database search. The complexity was set to 2, with Lys-8 set as the heavy label. Further processing of the data and calculation of half-lives was done in R. First, the evidence.txt was loaded with the fread function from the data.table package, filtered for lysine-containing peptides and cleared of potential contaminants and remaining reverse hits. Because many proteins in yeast are very stable, a correction for doubling times is not applicable to most identified proteins. As in a previous study46, we therefore calculated turnover rates (kdp) and the corresponding half-lives without doubling time or dilution rate correction (Supplementary Table 16). In more detail, protein turnover rates were calculated for proteins with valid SILAC ratios in at least two time points per strain by fitting a linear model of the log-transformed H/L ratios against the sampling time points. The slope of each fit gives the kdp value for the respective protein and strain. Half-lives were calculated from the resulting kdp as log(2)/kdp (Supplementary Table 17). Isolate CLN was excluded owing to a very low number of valid SILAC ratios obtained at t = 135 min. Furthermore, three proteins (ERP1, NEO1 and MDE1) were excluded from the dataset because they were measured in only a few isolates (6, 4 and 3, respectively) and exhibited very high variability in half-lives across these strains.
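
    The turnover calculation can be illustrated with a short sketch. It is written in Python rather than R and takes the description literally: the natural log of the H/L ratio is regressed on sampling time by ordinary least squares, the slope is taken as kdp and the half-life is log(2)/kdp. The use of the natural logarithm, the inclusion of an intercept and the function names are assumptions made for illustration.

```python
import math


def turnover_rate(times_min, hl_ratios):
    """Least-squares slope of ln(H/L) against sampling time (per minute)."""
    if len(times_min) < 2:
        raise ValueError("at least two time points with valid ratios are needed")
    ys = [math.log(r) for r in hl_ratios]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, ys))
    return sxy / sxx


def half_life(kdp):
    """Half-life in the same time unit as the sampling times."""
    return math.log(2) / kdp


# Synthetic example at the three sampling times used in the experiment.
times = [90, 135, 180]
ratios = [math.exp(0.005 * t) for t in times]  # made-up H/L ratios
kdp = turnover_rate(times, ratios)
print(round(kdp, 4), round(half_life(kdp), 1))
```

    With only two or three time points per protein the fit has few degrees of freedom, so individual half-lives from such a sketch should be treated as rough estimates.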

    Post-processing statistical analyses

    All statistical analyses were conducted in R v.3.6 unless otherwise indicated. KEGG annotations for S. cerevisiae genes were obtained through the KEGG database (accessed January 2021)69. The org.Sc.sgd.db package70 was used to obtain chromosomal location information for genes and to map gene names to systematic open reading frame (ORF) identifiers. If no gene name was annotated in this package, the systematic ORF identifier was used instead. Standardized S. cerevisiae yeast strain names and systematic ORF identifiers for genes were used throughout all analyses. Heat maps were plotted using the ComplexHeatmap package71. In all box plot representations, the centre marks the median, box plot hinges mark the 25th and 75th percentiles and whiskers show all values that, at maximum, fall within 1.5 times the interquartile range.

    Assembly of integrated chromosome copy number, mRNA and protein expression datasets

    Gene copy numbers for natural isolates were downloaded from the 1002 Yeast Genome website (http://1002genomes.u-strasbg.fr/files/)1, and the following loci were excluded: ribosomal DNA, Ty elements, RTM loci, ORFs located on the 2-micron plasmid, mitochondrial ORFs and non-reference material. Furthermore, the table was filtered to retain only genes with non-zero and non-missing values for further analyses. Chromosome copy number status for all engineered disomic strains was confirmed by Torres et al.3, meaning that all disomic strains used in our study were haploid with indicated ‘disomic’ chromosomes duplicated. One exception was disome 13: despite published mRNA expression values being available and proteomics data having been measured in our experiments, disome 13 was excluded from all analyses because it had undergone whole-genome duplication when reaching our laboratory (personal communication, J. Zhu).

    For 761 isolates, both proteomes (this study) and transcriptomes6 were available, and for 759 isolates, gene copy number information1 was available. Data for gene copy number, mRNA expression and protein abundances were matched by strain name and systematic ORF identifier for both the natural isolate collection and the disomic strain collection. Only genes for which values for gene copy number, transcript and protein levels were available were used for analyses. We noticed that a number of strains in the natural isolate collection exhibited a mismatch between the median gene copy number per chromosome and the assigned aneuploidy as described in Supplementary Table 1, which is likely to be attributable to segmental aneuploidies, shorter gene copy number variations or algorithm-specific thresholds used for aneuploidy determination. We excluded all strains (n = 80) containing one or more of those ‘mismatched’ chromosomes from our analysis (Supplementary Table 5). From this point, chromosome copy numbers as given by the aneuploidy annotation were used throughout the analyses.

    Calculation of relative chromosome copy numbers, mRNA and protein expression values

    The assembled integrated dataset was used to compare the relative changes in chromosome copy number, mRNA transcript expression and protein abundances between aneuploid and euploid strains. Relative chromosome copy number changes were calculated as the log2 ratio between the chromosome copy number and the ploidy of the strain. Relative abundances for transcriptomic and proteomic data were calculated gene-wise as the log2 ratio between a gene’s mRNA or protein abundance in a given strain and the median mRNA or protein expression value of the respective gene across all euploid strains (‘all-euploid strain’ method). In addition, ploidy-wise calculation of relative mRNA or protein abundance was tested, comparing the abundances of a haploid strain to the median abundance of that same mRNA or protein across all euploid haploid strains, each diploid strain to all euploid diploids and so forth for all basal ploidies (‘ploidy-wise’ method). There was a high correlation between relative expression values calculated across all euploid strains, and ploidy-wise calculated relative expression values (Extended Data Fig. 9), indicating that non-linear scaling of the proteome with ploidy72 had no significant effect on the outcome of the used data normalization strategy. For the transcriptomic data of the lab-engineered disomic strains, we used the transcript levels as published; that is, as log2 fold changes relative to the wild-type strain (disome WT, 11311)3. For replicate measurements, the median value was used for further analysis.
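
    As a minimal sketch of the ‘all-euploid strain’ calculation described above (in Python, with hypothetical function and variable names), the relative copy number is the log2 ratio of chromosome copies to ploidy, and the relative expression of a gene is the log2 ratio of its abundance in a strain to its median abundance across euploid strains.

```python
import math
from statistics import median


def relative_copy_number(chromosome_copies, ploidy):
    """log2(chromosome copy number / ploidy); 0 for a euploid chromosome."""
    return math.log2(chromosome_copies / ploidy)


def relative_expression(abundance_by_strain, euploid_strains):
    """Gene-wise log2 ratios against the median across euploid strains.

    abundance_by_strain: {strain: abundance of one gene's mRNA or protein}
    euploid_strains: strains that define the reference median
    """
    reference = median(abundance_by_strain[s] for s in euploid_strains)
    return {s: math.log2(v / reference) for s, v in abundance_by_strain.items()}


# A diploid with one extra chromosome copy: log2(3/2), about 0.585.
print(relative_copy_number(3, 2))
```

    The ‘ploidy-wise’ variant would correspond to calling relative_expression once per basal ploidy, passing only strains of that ploidy.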

    For the log2 mRNA and protein ratios of genes encoded on euploid chromosomes, a distribution centred around 0 would be expected, representing no overall shift of relative expression values of these genes across strains. This was true for our proteomics data, and also for most strains in the transcriptomic data. Because some natural isolates showed left tails in this distribution for the transcriptomic data, presumably because of restricting the assembled dataset to genes for which we had data across all three -omics layers, we decided to normalize the relative mRNA and protein expression values. Normalization was performed in a strain-by-strain manner for both the across-euploid strains and the ploidy-wise ratio calculation methods (see above): first, the median log2 mRNA or protein ratio of all genes encoded on euploid chromosomes of a given strain was calculated. This median value was then subtracted from all log2 mRNA or protein ratios of that strain.
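
    The strain-by-strain normalization amounts to a median shift, sketched below in Python. The dictionary layout and names are hypothetical.

```python
from statistics import median


def normalize_strain(log2_ratios, on_euploid_chromosome):
    """Centre one strain's log2 ratios on its euploid-chromosome genes.

    log2_ratios: {gene: log2 ratio in this strain}
    on_euploid_chromosome: {gene: True if the gene's chromosome is euploid
    in this strain}
    """
    # Median over genes on euploid chromosomes only.
    shift = median(v for g, v in log2_ratios.items() if on_euploid_chromosome[g])
    # The shift is subtracted from every gene, including aneuploid-encoded ones.
    return {g: v - shift for g, v in log2_ratios.items()}
```

    After this step the genes on euploid chromosomes of each strain have a median log2 ratio of 0 by construction.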

    The proteome profiles of disome 12 and disome 14 showed no aneuploid signature, indicating that those strains, even though they were held under selective pressure, had lost their duplicated chromosome either before their arrival in our laboratory or during our experiments. Both strains were therefore excluded from our analysis. Similarly, when comparing relative chromosome copy number changes and relative mRNA expression levels in the natural isolate collection, we noticed discrepancies indicative of chromosomal instabilities in natural S. cerevisiae isolates. Some euploid strains had gained or lost chromosomes, evident as much higher or lower fold changes of expression values observed in the transcriptomics data than in chromosome copy numbers. Likewise, some aneuploid strains underwent changes in their karyotype, resulting in either more complex aneuploidies or in aneuploid strains reverting to euploid strains. We decided to include in our analysis only strains that showed consistent relative expression (log2 ratio) values on the chromosome copy number and the transcriptome level. Consequently, we excluded strains that had at least one chromosome for which the difference between relative chromosome copy number and the median of the normalized relative mRNA abundances deviated from the mean by more than 4 standard deviations, on the basis of all relative chromosome–mRNA comparisons (Supplementary Table 6, n = 66). After excluding these strains, the calculations to obtain gene-wise relative (strain/euploid) mRNA and protein expression values were repeated to avoid unintended biases towards these excluded strains.
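
    The consistency filter can be sketched as an outlier test on the chromosome-level differences. The Python sketch below assumes that the mean and standard deviation are computed over all strain–chromosome pairs and that the sample standard deviation is used; neither detail is stated explicitly above, and the names are hypothetical.

```python
from statistics import mean, stdev


def strains_to_exclude(differences, n_sd=4.0):
    """Flag strains with any chromosome deviating by more than n_sd SDs.

    differences: {(strain, chromosome): relative chromosome copy number
    minus median normalized relative mRNA abundance on that chromosome}
    """
    values = list(differences.values())
    centre = mean(values)
    spread = stdev(values)  # sample SD; an assumption
    return {
        strain
        for (strain, _chromosome), d in differences.items()
        if abs(d - centre) > n_sd * spread
    }
```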

    Gene-by-gene quantification of dosage compensation

    Linear regressions between log2 mRNA or protein expression ratios and relative chromosome copy number (CN) changes (log2 chromosome CN/basal ploidy) were performed for all genes that were encoded on an aneuploid chromosome in at least three different natural aneuploid isolates (so genes on chromosomes 1, 3, 4, 5, 6, 8, 9, 11, 12 and 14). Isolates that had reverted to euploidy (Supplementary Table 12) were excluded from this analysis. Relative chromosome copy numbers were restricted to be greater than or equal to 0, thus including all euploid chromosomes, and all chromosome gains of aneuploid isolates, but excluding chromosome losses. Therefore, each regression was performed using data for the expression of the gene on euploid chromosomes (log2 CN change = 0) and at least three independent data points with a relative chromosome CN change greater than 0. The slopes of these gene-by-gene linear regressions were used as a measure of across-isolate dosage compensation. For lab-engineered disomic strains, a similar analysis was performed; however, it was necessarily restricted to one ‘aneuploid’ data point per gene and forced through 0 because each aneuploid chromosome was engineered exactly once in the disomic strain collection. Therefore, in total, the regressions were performed for 827 and 680 genes at the mRNA and protein level in natural isolates and disomic strains, respectively. For the cumulative distribution (‘rolling threshold’) analysis, the number of mRNAs or proteins exhibiting attenuation slopes smaller than a given threshold were counted, and effect sizes for these attenuated mRNAs or proteins were calculated as the median of the attenuation slopes smaller than the respective threshold. For the following analyses, a threshold of 0.85 was selected to define attenuated mRNAs and proteins.
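
    The two regression variants can be sketched as follows (Python, hypothetical names): an ordinary least-squares slope with intercept for the natural isolates, and a slope forced through 0 for the disomic strains. A slope of 1 corresponds to expression scaling fully with copy number, and slopes below the chosen threshold of 0.85 are treated as attenuated.

```python
def slope_with_intercept(x, y):
    """Ordinary least-squares slope of y on x, with an intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx


def slope_through_origin(x, y):
    """Least-squares slope of y on x with the intercept fixed at 0."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


def is_attenuated(slope, threshold=0.85):
    """True if the slope falls below the attenuation threshold."""
    return slope < threshold


# x: log2(chromosome CN / ploidy) per isolate; y: log2 expression ratio.
# Made-up values: three euploid points and three chromosome gains.
x = [0.0, 0.0, 0.0, 0.585, 0.585, 1.0]
y = [0.02, -0.01, 0.0, 0.35, 0.37, 0.61]
print(is_attenuated(slope_with_intercept(x, y)))
```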

    For assessment of the protein properties on attenuation, the following sources were used: macromolecular-complex membership: Complex Portal of the EBI (accessed December 2020)73; protein–protein interactions (PPIs): STRING database (accessed November 2022)74; prediction of protein disorder and linear interacting peptides by AlphaFold, MobiDB and anchor: MobiDB (accessed October 2022)75; GC content and percentile mean gRSCU: calculated on the Saccharomyces cerevisiae S288C sequence (NCBI: GCF_000146045.2_R64) using the gc1, gc2, gc3 and gRSCU functions in BioKIT v.0.1.2 (ref. 76); ribosome occupancy: from a previous study77; amino acid synthesis costs and glucose cost: from a previous study78; absolute protein copy numbers per cell: from a previous study79; protein length, mass and modification sites: UniProt (accessed October 2022, ubiquitinated residue information inferred from experimental and automatic cross-link evidence listed as ‘Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in ubiquitin)’)63; and protein half-life: from a previous study46. The internal variability of transcripts and proteins was calculated as the standard deviation of mRNA and protein abundance across all euploid isolates of the collection, respectively. Receiver operator characteristics were calculated using the pROC package80 as described previously26.

    For assessment of mRNA- and protein-level annotation across pathways, cellular localizations, molecular functions and biological processes, KEGG annotations were obtained using the KEGG API (accessed January 2021)69, and a GO slim mapping file was obtained from SGD (accessed January 2021)81. The degree of relative attenuation was determined as 100 × (1 − slope) per mRNA or protein, and median attenuation levels per KEGG or GO category were calculated only for those KEGG or GO terms with at least six mRNAs or proteins for which an attenuation slope had been determined.
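
    A minimal Python sketch of the category-level summary, assuming a hypothetical mapping from genes to KEGG or GO terms: each slope is converted to a relative attenuation of 100 × (1 − slope), and a median is reported only for terms with at least six quantified members.

```python
from collections import defaultdict
from statistics import median


def relative_attenuation(slope):
    """Attenuation in percent: 0 for slope 1, 100 for slope 0."""
    return 100.0 * (1.0 - slope)


def median_attenuation_by_category(slopes, categories, min_members=6):
    """Median relative attenuation per annotation term.

    slopes: {gene: attenuation slope}
    categories: {gene: iterable of KEGG or GO terms} (hypothetical layout)
    """
    grouped = defaultdict(list)
    for gene, slope in slopes.items():
        for term in categories.get(gene, ()):
            grouped[term].append(relative_attenuation(slope))
    # Terms with too few quantified members are dropped.
    return {t: median(v) for t, v in grouped.items() if len(v) >= min_members}
```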

    For the comparison of proteins non-exponentially degraded in human cells versus yeast, we identified the yeast homologues of the human proteins quantified and assigned as either exponentially degraded (ED), non-exponentially degraded (NED) or undefined in a previous report25. We identified 759 yeast homologues for 3,187 human proteins covered in that report25 (around 23%), agreeing very well with the expected fraction of the human proteome that has yeast homologues. Of those 759 proteins, 146 were classified as NED, 349 classified as ED and 262 as undefined; two proteins could not be unambiguously mapped to one of these categories.

    Chromosome-wide and strain-by-strain quantification of dosage compensation

    To assess attenuation at the chromosome level, the median mRNA or protein log2 ratio of all genes encoded per chromosome or relative chromosome copy number change across isolates was calculated. For both disomic lab-engineered strains and natural isolates of S. cerevisiae, mRNA and protein expression log2 ratios between a gene’s expression in a strain and the gene’s expression over all euploid strains were examined in relation to the log2-transformed fold change of the copy number of the chromosome on which the gene is located. For gains of chromosomes, all log2 chromosome copy number changes for which fewer than 300 affected data points (genes) were quantified were excluded for distribution visualization. For chromosome losses, fewer data points were available overall, so the described cut-off was set at 50 data points (genes). The attenuation observed in the lab-engineered disomic strains measured by DIA-MS was comparable to that previously measured using SILAC10 (Extended Data Fig. 10).

    To quantify the relationship between chromosome gains (log2 chromosome copy number/ploidy > 0, all relative chromosome copy number changes included) and relative mRNA or protein expression from aneuploid chromosomes, linear models were fitted between the log2 chromosome copy number change and the median relative mRNA or protein expression value.

    For the strain-by-strain quantification of dosage compensation, non-parametric two-sided one-sample Wilcoxon tests were performed for each relative chromosome copy number change per isolate to compare the normalized log2 mRNA or protein expression distributions to the expected median (log2 chromosome copy number/basal ploidy). P values were corrected using the Benjamini–Hochberg method. This way, for each isolate, it could be assessed whether the observed attenuation of chromosomes with the same copy number change in that isolate was significant or not. Aneuploid isolates were marked as ‘reverted to euploid’ if the pseudomedian of the protein-level Wilcoxon test was between −0.1 and 0.1. The attenuation at mRNA or protein level was calculated as 100 × (1 − pseudomedian/relative chromosome copy number change) per isolate, with ‘pseudomedian’ referring to the pseudomedian obtained from the Wilcoxon test. The calculation was performed only for isolates that had a single aneuploidy, or complex aneuploidies of the same relative chromosome copy number change of aneuploid chromosomes. For example, attenuation levels could be calculated for a diploid isolate with one extra copy of chromosome 1 (such as isolate BDI) or for a diploid isolate with one gained copy each of chromosome 1 and chromosome 4 (such as isolate CFV), but not for a tetraploid isolate with a complex aneuploidy that gained one copy of chromosome 1 and lost a copy of chromosome 3 (such as isolate BRP).
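
    The quantities derived from the Wilcoxon test can be sketched without the test itself. In the Python sketch below, the pseudomedian is computed as the Hodges–Lehmann estimator (the median of all pairwise Walsh averages), which is the usual definition but is an assumption about the exact estimator reported by the software used. The Benjamini–Hochberg step-up adjustment is included for completeness. Function names and the strict bounds in reverted_to_euploid are illustrative.

```python
from statistics import median


def pseudomedian(values):
    """Hodges-Lehmann estimator: median of all Walsh averages (i <= j)."""
    n = len(values)
    walsh = [(values[i] + values[j]) / 2 for i in range(n) for j in range(i, n)]
    return median(walsh)


def attenuation_percent(pseudo, relative_cn_change):
    """100 x (1 - pseudomedian / relative chromosome copy number change)."""
    return 100.0 * (1.0 - pseudo / relative_cn_change)


def reverted_to_euploid(pseudo, tolerance=0.1):
    """True if the protein-level pseudomedian lies between -0.1 and 0.1."""
    return -tolerance < pseudo < tolerance


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted
```

    For an isolate with a relative copy number change of 0.5 and a pseudomedian of 0.4, attenuation_percent returns roughly 20, meaning that about a fifth of the expected increase is buffered.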

    For investigating the relationship between the degree of aneuploidy and dosage compensation, we defined an additional measure of degree of aneuploidy by calculating the ploidy-adjusted absolute number of protein copies per cell of all proteins encoded on aneuploid chromosomes in a given strain (referred to as ‘aneuploid protein load’). This measure correlated very well with the number of genes located on aneuploid chromosomes, a previously used measure of aneuploidy degree (for example, ref. 26; PCC = 0.96, P << 0.05).

    Assessment of the trans transcriptome and proteome response in natural aneuploid isolates

    Trans expression at the transcriptome and proteome level was defined as the mRNA or protein expression, respectively, of genes encoded on euploid chromosomes in aneuploid isolates. Genes up- or downregulated according to the ESR were mapped as described in ref. 44, genes up- or downregulated according to the CAGE signature were mapped as described in ref. 34 and genes annotated as upregulated in the APS were mapped as described in ref. 10. To find genes differentially expressed in trans in aneuploid strains—that is, genes encoded on euploid chromosomes of aneuploid strains that show up- or downregulation at the mRNA or protein level when compared with euploid strains—we calculated the gene-by-gene median normalized relative mRNA or protein abundances (log2 ratios) of all genes encoded on euploid chromosomes in aneuploid strains (n = 95). KEGG-pathway GSEA of these median relative expression values was performed with WebGestalt 2019 using the default settings (accessed December 2021)82. In addition, one-sample t-tests were used to compare gene-by-gene mean normalized protein log2 ratios across euploid chromosomes of aneuploid strains against the theoretical gene-by-gene mean protein log2 ratio value across euploid strains (µ = 0). P values were corrected for multiple hypothesis testing using the Benjamini–Hochberg method as implemented in the rstatix package83. Annotation of structural components was obtained through KEGG (accessed January 2021)69. Detailed proteasome component annotations were obtained from a previous study84.

    For the analysis of the role of RPN4 in mediating the increase of proteasome abundance, RPN4 transcript levels were obtained from the transcriptome data of the natural isolate collection (ref. 6), and TMM normalized and scaled as described above. RPN4 regulon targets (found in high-throughput screens and manually curated ones) were downloaded from SGD (accessed April 2023; ref. 81), with around 50 of these targets being measured in the proteomic dataset of the natural isolates.

    Determination of ubiquitination levels

    Relative levels of ubiquitinated proteins were determined gene-wise by calculating the log2 ratio between the measured abundance of a ubiquitinated protein in each strain and the median abundance of the ubiquitinated protein across all euploid strains. Assuming a distribution centred around 0 of relative levels of ubiquitinated proteins on euploid chromosomes, these relative abundances were then normalized strain-wise by subtracting the calculated median log2 ratio of all genes expressed on euploid chromosomes of a strain from all log2 ratios of that strain.
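    The two normalization steps described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the nested-dictionary layout, the function names and the toy abundances are all assumptions made for the example.

```python
import math
from statistics import median


def relative_ubiquitination(abundance, euploid_strains):
    """abundance: {strain: {gene: measured abundance}}.
    Returns {strain: {gene: log2 ratio against the euploid median}}."""
    genes = {g for s in euploid_strains for g in abundance[s]}
    reference = {
        g: median(abundance[s][g] for s in euploid_strains if g in abundance[s])
        for g in genes
    }
    return {
        strain: {g: math.log2(v / reference[g]) for g, v in values.items() if g in reference}
        for strain, values in abundance.items()
    }


def centre_on_euploid_genes(log2_ratios, euploid_genes):
    """Subtract, per strain, the median log2 ratio of genes on euploid chromosomes."""
    centred = {}
    for strain, values in log2_ratios.items():
        shift = median(values[g] for g in euploid_genes[strain] if g in values)
        centred[strain] = {g: v - shift for g, v in values.items()}
    return centred


# Toy data: gene A sits on the aneuploid chromosome of strain an1.
abundance = {
    "eu1": {"A": 100, "B": 200, "C": 50},
    "eu2": {"A": 100, "B": 200, "C": 50},
    "an1": {"A": 400, "B": 400, "C": 100},
}
euploid_genes = {"eu1": ["A", "B", "C"], "eu2": ["A", "B", "C"], "an1": ["B", "C"]}

ratios = relative_ubiquitination(abundance, ["eu1", "eu2"])
centred = centre_on_euploid_genes(ratios, euploid_genes)
```

    In the toy data, strain an1 has a global twofold offset that the strain-wise centring removes, leaving only gene A elevated.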

    Attenuation and turnover analyses

    Proteomes of the aneuploid yeast deletion collection were obtained from a previous study (ref. 45). Fold changes were defined as ratios between protein abundances and the median abundances of the respective protein across all strains. Chromosomes were defined as duplicated when the median log2 expression levels were greater than 0.8 across all measured proteins on the respective chromosome. Log2 fold changes were averaged across the strains with duplications of the respective chromosome. Long and short half-lives were defined as being greater than the 75% quantile and less than the 25% quantile (n = 110), respectively. Half-lives were taken from a reference dataset and were obtained by metabolic labelling (ref. 46).
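    The thresholds described above can be sketched in a few lines. The function names and toy values are hypothetical, and the use of the 'inclusive' quantile method is an assumption, because the text does not say how the quantiles were computed.

```python
from statistics import median, quantiles


def duplicated_chromosomes(log2_by_chromosome, threshold=0.8):
    """Return chromosomes whose median protein log2 ratio exceeds the threshold."""
    return [chrom for chrom, values in log2_by_chromosome.items()
            if median(values) > threshold]


def classify_half_lives(half_lives):
    """Label proteins 'long' (above the 75% quantile) or 'short' (below the 25% quantile)."""
    q25, _, q75 = quantiles(list(half_lives.values()), n=4, method="inclusive")
    classes = {}
    for protein, value in half_lives.items():
        if value > q75:
            classes[protein] = "long"
        elif value < q25:
            classes[protein] = "short"
    return classes


# Toy log2 ratios per chromosome for one strain.
log2_by_chromosome = {"chrI": [0.9, 1.0, 0.85], "chrII": [0.1, -0.05, 0.0]}
duplicated = duplicated_chromosomes(log2_by_chromosome)

# Toy half-lives 1..9 (arbitrary units) for proteins p1..p9.
half_lives = {f"p{i}": float(i) for i in range(1, 10)}
classes = classify_half_lives(half_lives)
```

    Proteins between the two quantiles are left unlabelled, which mirrors the comparison of only the long and short groups.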

    To compare the turnover rates of proteins when expressed from aneuploid versus euploid chromosomes, we first filtered the turnover dataset to include only proteins for which we had turnover rates determined in at least 80% (44/55) of isolates and KNN-imputed the remaining missing values. We then quantile-normalized the turnover rates to correct for differences in overall turnover rates between isolates. We calculated the median quantile-normalized turnover rate for each protein when expressed from aneuploid chromosomes, from euploid chromosomes of aneuploid isolates or from euploid chromosomes of euploid isolates. These calculations were performed only for proteins for which turnover rates were determined at least three times on aneuploid and euploid chromosomes, respectively. The median turnover rates were then compared to count the number of times a protein exhibits a difference in turnover rates depending on whether it is expressed from aneuploid or euploid chromosomes.
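    Quantile normalization, as used above to remove isolate-wide differences in turnover, can be sketched as follows. The filtering and KNN-imputation steps are not shown; the sketch assumes a complete matrix, breaks ties by position and uses a hypothetical dictionary layout with made-up rates.

```python
def quantile_normalize(columns):
    """columns: {isolate: [turnover rate per protein]}, equal lengths, no gaps.
    Returns the same layout with every isolate forced onto a shared distribution."""
    names = list(columns)
    n = len(columns[names[0]])
    sorted_cols = [sorted(columns[name]) for name in names]
    # Mean of the r-th smallest value across isolates defines the shared distribution.
    rank_means = [sum(col[r] for col in sorted_cols) / len(names) for r in range(n)]
    normalized = {}
    for name in names:
        order = sorted(range(n), key=lambda i: columns[name][i])
        out = [0.0] * n
        for rank, i in enumerate(order):
            out[i] = rank_means[rank]
        normalized[name] = out
    return normalized


# Toy turnover rates for three proteins in two isolates.
rates = {"iso1": [5.0, 2.0, 3.0], "iso2": [4.0, 1.0, 6.0]}
normalized = quantile_normalize(rates)
```

    After normalization every isolate contains the same set of values, so only the rank of each protein within its isolate carries information.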

    For determining the relationship between protein attenuation and overall turnover rates of isolates, the Pearson correlation between the protein’s log2 expression ratio in a given isolate (see ‘Calculation of relative chromosome copy numbers, mRNA and protein expression values’ section) and isolates’ turnover rates was calculated. This analysis was performed for each protein expressed at least three times from aneuploid chromosomes, euploid chromosomes of euploid isolates or euploid chromosomes of aneuploid isolates. GSEA was conducted on Pearson correlation coefficients with WebGestalt 2019 using the default settings.
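    The per-protein correlation described above can be sketched as below. The function name, the guard for fewer than three observations and the toy numbers are assumptions for illustration only.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences; None if fewer than three points."""
    n = len(x)
    if n < 3:
        return None
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Toy values: one protein's log2 expression ratio in four isolates,
# paired with each isolate's overall turnover rate.
log2_ratio = [0.2, 0.4, 0.6, 0.8]
isolate_turnover = [1.0, 0.8, 0.6, 0.4]
r = pearson(log2_ratio, isolate_turnover)
```

    A coefficient near −1 in this toy case would mean that the protein's relative expression falls as overall turnover rises; the resulting coefficients for all proteins are what would be passed to the enrichment analysis.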

    Reporting summary

    Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

  • Cells cope with altered chromosome numbers by enhancing protein breakdown

    Nature, Published online: 22 May 2024; doi:10.1038/d41586-024-01360-6

    When chromosomes are lost or gained, massive changes in gene expression disrupt the delicate balance of proteins in a cell. Yeasts with incorrect chromosome numbers counteract this by degrading excess proteins.

  • the AI protein predictor gets an upgrade

    Download the Nature Podcast 8 May 2024

    In this episode:

    00:45 A nuclear timekeeper that could transform fundamental-physics research

    Nuclear clocks — based on tiny shifts in energy in an atomic nucleus — could be even more accurate and stable than other advanced timekeeping systems, but have been difficult to make. Now, a team of researchers have made a breakthrough in the development of these clocks, identifying the correct frequency of laser light required to make this energy transition happen. Ultimately it’s hoped that physicists could use nuclear clocks to probe the fundamental forces that hold atoms together.

    News: Laser breakthrough paves the way for ultra precise ‘nuclear clock’

    10:34 Research Highlights

    Why life on other planets may come in purple, brown or orange, and a magnetic fluid that could change shape inside the body.

    Research Highlight: Never mind little green men: life on other planets might be purple

    Research Highlight: A magnetic liquid makes for an injectable sensor in living tissue

    13:48 AlphaFold gets an upgrade

    DeepMind’s AlphaFold has revolutionized research by making it simple to predict the 3D structures of proteins, but it has lacked the ability to predict situations where a protein is bound to another molecule. Now, the AI has been upgraded to AlphaFold 3 and can accurately predict protein-molecule complexes containing DNA, RNA and more. Although the new version is restricted to non-commercial use, researchers are excited by its greater range of predictive abilities and the prospect of speedier drug discovery.

    News: Major AlphaFold upgrade offers huge boost for drug discovery

    Research Article: Abramson et al.

  • Powerful ‘nanopore’ DNA sequencing method tackles proteins too

    A nanopore sequencing device is typically used for sequencing DNA and RNA. Credit: Anthony Kwan/Bloomberg/Getty

    With its fast analyses and ultra-long reads, nanopore sequencing has transformed genomics, transcriptomics and epigenomics. Now, thanks to advances in nanopore design and protein engineering, protein analysis using the technique might be catching up.

    “All the pieces are there to start with to do single-molecule proteomics and identify proteins and their modifications using nanopores,” says chemical biologist Giovanni Maglia at the University of Groningen, the Netherlands. That’s not precisely sequencing, but it could help to work out which proteins are present. “There are many different ways you can identify proteins which doesn’t really require the exact identification of all 20 amino acids,” he says, referring to the usual number found in proteins.

    In nanopore DNA sequencing, single-stranded DNA is driven through a protein pore by an electrical current. As a DNA residue traverses the pore, it disrupts the current to produce a characteristic signal that can be decoded into a sequence of DNA bases.

    Proteins, however, are harder to crack. They cannot be consistently unfolded and moved by a voltage gradient because, unlike DNA, proteins don’t carry a uniform charge. They might also be adorned with post-translational modifications (PTMs) that alter the amino acids’ size and chemistry — and the signals that they produce. Still, researchers are making progress.

    Water power

    One way to push proteins through a pore is to make them hitch a ride on flowing water, like logs in a flume. Maglia and his team engineered a nanopore (ref. 1) with charges positioned so that the pore could create an electro-osmotic flow that was strong enough to unfold a full-length protein and carry it through the pore. The team tested its design with a polypeptide containing negatively charged amino acids, including up to 19 in a row, says Maglia. This concentrated charge created a strong pull against the electric field, but the force of the moving water kept the protein moving in the right direction. “That was really amazing,” he says. “We really did not expect it would work so well.”

    Chemists Hagan Bayley and Yujia Qing at the University of Oxford, UK, and their colleagues have also exploited electro-osmotic force, this time to distinguish between PTMs (ref. 2). The team synthesized a long polypeptide with a central modification site. Addition of any of three distinct PTMs to that site changed how much the current through the pore was altered relative to the unmodified residues. The change was also characteristic of the modifying group. Initially, “we’re going for polypeptide modifications, because we think that’s where the important biology lies”, explains Qing.

    And, because nanopore sequencing leaves the peptide chain intact, researchers can use it to determine which PTMs coexist in the same molecule — a detail that can be difficult to establish using proteomics methods, such as ‘bottom up’ mass spectrometry, because proteins are cut into small fragments. Bayley and Qing have used their method to scan artificial polypeptides longer than 1,000 amino acids, identifying and localizing PTMs deep in the sequence. “I think mass spec is fantastic and provides a lot of amazing information that we didn’t have 10 or 20 years ago, but what we’d like to do is make an inventory of the modifications in individual polypeptide chains,” Bayley says — that is, identifying individual protein isoforms, or ‘proteoforms’.

    Molecular ratchets

    Another approach to nanopore protein analysis uses molecular motors to ratchet a polypeptide through the pore one residue at a time. This can be done by attaching a polypeptide to a leader strand of DNA and using a DNA helicase enzyme to pull the molecule through. But that limits how much of the protein the method can read, says synthetic biologist Jeff Nivala at the University of Washington, Seattle. “As soon as the DNA motor would hit the protein strand, it would fall off.”

    Nivala developed a different technique, using an enzyme called ClpX (see ‘Read and repeat’). In the cell, ClpX unfolds proteins for degradation; in Nivala’s method, it pulls proteins back through the pore. The protein to be sequenced is modified at either end. A negatively charged sequence at one end allows the electric field to drive the protein through the pore until it encounters a stably folded ‘blocking’ domain that is too large to pass through. ClpX then grabs that folded end and pulls the protein in the other direction, at which point the sequence is read. “Much like you would pull a rope hand over hand, the enzyme has these little hooks and it’s just dragging the protein back up through the pore,” Nivala says.

    Read and repeat. Graphic showing a nanopore protein-sequencing strategy using the push and pull of an electric field through a membrane, enzyme and slip sequence.

    Source: Ref. 3

    Nivala’s approach has another advantage: when ClpX reaches the end of the protein, a special ‘slip sequence’ causes it to let go so that the current can pull the protein through the pore for a second time. As ClpX reels it back out again and again, the system gets multiple peeks at the same sequence, improving accuracy.

    Last October, Nivala and his colleagues showed (ref. 3) that their method can read synthetic protein strands of hundreds of amino acids in length, as well as an 89-amino-acid piece of the protein titin. The read data not only allowed them to distinguish between sequences, but also provided unambiguous identification of amino acids in some contexts. Still, it can be difficult to deduce the amino-acid sequence of a completely unknown protein, because an amino acid’s electrical signature varies on the basis of both its surrounding sequence and its modifications. Nivala predicts that the method will have a ‘fingerprinting’ application, in which an unknown protein is matched to a database of reference nanopore signals. “We just need more data to be able to feed these machine-learning algorithms to make them robust to many different sequences,” he says.
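    The ‘fingerprinting’ idea can be pictured with a deliberately simple toy: compare an observed current trace with each reference trace and pick the closest. Everything here is hypothetical, including the protein names, the signal values, the assumption that traces are already aligned and of equal length, and the use of mean squared difference; the article points to machine-learning approaches, which this sketch does not attempt to reproduce.

```python
def mean_squared_difference(trace_a, trace_b):
    """Average squared difference between two aligned, equal-length traces."""
    return sum((a - b) ** 2 for a, b in zip(trace_a, trace_b)) / len(trace_a)


def best_match(trace, references):
    """Return the name of the reference trace closest to the observed trace."""
    return min(references, key=lambda name: mean_squared_difference(trace, references[name]))


# Hypothetical reference signals (arbitrary units) and one noisy observation.
references = {
    "protein_A": [1.0, 0.4, 0.9, 0.2],
    "protein_B": [0.3, 0.8, 0.1, 0.7],
}
observed = [0.9, 0.5, 0.8, 0.3]
match = best_match(observed, references)
```

    Real traces vary in length and speed, so a practical matcher would need alignment and a learned model rather than a point-by-point distance.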

    Stefan Howorka, a chemical biologist at University College London, says that nanopore protein sequencing could boost a range of disciplines. But the technology isn’t quite ready for prime time. “A couple of very promising proof-of-concept papers have been published. That’s wonderful, but it’s not the end.” The accuracy of reads needs to improve, he says, and better methods will be needed to handle larger PTMs, such as bulky carbohydrate groups, that can impede the peptide’s movement through the pore.

    How easy it will be to extend the technology to the proteome level is also unclear, he says, given the vast number and wide dynamic range of proteins in the cell. But he is optimistic. “Progress in the field is moving extremely fast.”

All authors reviewed and revised the manuscript. Detailed author contributions are provided in the Supplementary Information.

  • A chemical method for selective labelling of the key amino acid tryptophan

    • RESEARCH BRIEFINGS

    A broadly applicable method allows selective, rapid and efficient chemical modification of the side chain of tryptophan amino acids in proteins. This platform enables systematic, proteome-wide identification of tryptophan residues, which can form a bond (called cation–π interaction) with positively charged molecules. Such interactions are key in many biochemical processes, including protein-mediated phase separation.

  • Chemoproteomic discovery of a covalent allosteric inhibitor of WRN helicase

  • Study reveals breakthrough in non-invasive detection of endometrial cancer

    In a recent study published in eBioMedicine, researchers evaluated proteomic signatures in blood plasma and cervicovaginal fluid to detect endometrial cancer.

    Study: Detection of endometrial cancer in cervicovaginal fluid and blood plasma: leveraging proteomics and machine learning for biomarker discovery. Image Credit: crystal light / Shutterstock.com

    Diagnosing endometrial cancer

    The prevalence of endometrial cancer, which is the most common gynecological malignancy in high-income countries, continues to rise throughout the world. Endometrial cancer is amenable to curative hysterectomy when diagnosed early, with a five-year survival rate of over 90% following treatment. Comparatively, individuals with metastatic or advanced disease often have poor outcomes, with the five-year survival rate estimated at 15%.

    Over 90% of females with endometrial cancer present with postmenopausal bleeding, thus triggering urgent investigations through sequential transvaginal ultrasound, hysteroscopy, and endometrial biopsy, all of which could be anxiety-provoking and painful procedures. Therefore, developing simple, cost-effective, and non-invasive tests for early cancer diagnosis is crucial for both patients and clinicians.

    Cervicovaginal fluid, which is a mix of vaginal, uterine, and cervical secretions, has been investigated as a source of biomarkers for inflammatory conditions of the lower reproductive tract, pregnancy-related pathologies, and cervical neoplasia. In fact, one recent study found that cervicovaginal fluid can be used to detect endometrial cancer.

    About the study

    In the present study, researchers evaluate the performance of proteomic signatures from cervicovaginal fluid and plasma for endometrial cancer detection. Cases comprised females with histopathological evidence of endometrial cancer based on hysterectomy, whereas controls included symptomatic females without endometrial cancer or atypical hyperplasia. Individuals with a history of gynecological malignancy or hysterectomy were excluded.

    Cervicovaginal fluid and blood were collected, and mass spectrometry was performed. Digitized proteomic maps were derived using sequential window acquisition of all theoretical mass spectra.

    Spectral data were converted and searched against a human plasma library and a previously published library of 19,394 peptides and 2,425 proteins in the cervicovaginal fluid. Random forest (RF) modeling was used for feature selection. The most discriminatory proteins were ranked based on the mean decrease in accuracy.

    Nested logistic regression models were built by sequentially adding proteins based on their rank. The parsimonious model was identified, and its performance was evaluated by plotting the receiver operating characteristic curve and calculating the area under the curve (AUC). Likelihood ratio tests and Akaike information criteria (AIC) were used to compare the performance of nested models.
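    The model-comparison step can be illustrated with the standard formulas for the Akaike information criterion and a likelihood ratio test between two nested models that differ by one added protein. The log-likelihood values below are invented for the example, and the closed-form P value relies on the one-degree-of-freedom case; this is a sketch of the general technique, not the study's analysis code.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def likelihood_ratio_test_one_term(loglik_small, loglik_large):
    """Compare nested models that differ by one parameter.
    For one degree of freedom, the chi-square tail equals erfc(sqrt(stat / 2))."""
    stat = 2 * (loglik_large - loglik_small)
    p_value = math.erfc(math.sqrt(max(stat, 0.0) / 2))
    return stat, p_value


# Invented log-likelihoods: a model with one protein (plus intercept)
# versus the same model with a second protein added.
aic_small = aic(-40.0, n_params=2)
aic_large = aic(-36.0, n_params=3)
stat, p_value = likelihood_ratio_test_one_term(-40.0, -36.0)
```

    The model with the lower AIC is preferred, and a small P value indicates that the added protein improves the fit by more than chance would explain.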

    Study findings

    Overall, 118 postmenopausal females with symptoms were included in the study, 53 of whom had confirmed endometrial cancer and 65 with no evidence of cancer. About 86% of the study cohort were White. Individuals with endometrial cancer tended to be older and to have a higher body mass index (BMI) than controls.

    Taken together, 597, 310, and 533 proteins were quantified in the cervicovaginal fluid supernatant, cell pellets, and plasma samples, respectively. Overall, 941 unique proteins were identified across sample types. There was evidence of separation between cancers and controls based on cervicovaginal fluid supernatant proteins.

    Classifiers were selected based on the mean decrease accuracy metric of the RF model. Principal component analyses (PCA) using the top discriminatory proteins revealed more substantial discrimination between cancers and controls.

    The model with the top five discriminatory proteins had the lowest AIC value and was selected as a parsimonious model. This model predicted endometrial cancer with AUC, sensitivity, and specificity of 0.95, 91%, and 86%, respectively.
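    For readers unfamiliar with these metrics, the sketch below shows how AUC, sensitivity and specificity are computed from predicted scores. The labels, scores and the 0.5 cut-off are made up; they are not the study's data.

```python
def auc(labels, scores):
    """Probability that a random case scores higher than a random control (ties count half)."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if c > k else 0.5 if c == k else 0.0 for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


def sensitivity_specificity(labels, scores, cutoff=0.5):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= cutoff)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < cutoff)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < cutoff)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


# Toy data: 1 = cancer, 0 = control, with hypothetical model scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
area = auc(labels, scores)
sens, spec = sensitivity_specificity(labels, scores)
```

    AUC summarizes performance across all cut-offs, whereas sensitivity and specificity describe a single chosen cut-off.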

    Feature selection analysis indicated that 38 proteins were important for discrimination between cancers and controls. Proteins in cervicovaginal fluid cell pellets were less promising as cancer biomarkers than supernatant-derived proteins.

    Fewer differentially expressed proteins were observed in plasma samples between cases and controls as compared to the cervicovaginal fluid, with little evidence of discrimination based on plasma proteins. PCA indicated a modest separation between cancers and controls. A three-plasma biomarker panel predicted endometrial cancer with AUC, sensitivity, and specificity of 0.87, 75%, and 84%, respectively.

    Feature selection analysis revealed six plasma proteins as important classifiers. Furthermore, three- and four-marker panels of cervicovaginal fluid and plasma proteins predicted early-stage endometrial cancer with AUCs of 0.92 and 0.88, respectively. Five- and six-marker panels of cervicovaginal fluid and plasma proteins predicted advanced-stage endometrial cancer with AUCs of 0.96 and 0.93, respectively.

    Conclusions

    Cervicovaginal fluid proteins were more accurate in detecting endometrial cancer than plasma proteins. The five-marker panel of cervicovaginal fluid proteins comprised the immunoglobulin heavy constant mu (IGHM), haptoglobin (HPT), fibrinogen alpha chain (FGA), lymphocyte antigen 6D (LY6D), and galectin-3-binding protein (LG3BP), whereas the three-marker panel of plasma proteins included HPT, proteasome 20S subunit alpha 7 (PSMA7), and apolipoprotein D (APOD).

    Further confirmatory studies using larger cohorts are needed to validate these findings.

    Journal reference:

    • Njoku, K., Pierce, A., Chiasserini, D., et al. (2024). Detection of endometrial cancer in cervicovaginal fluid and blood plasma: leveraging proteomics and machine learning for biomarker discovery. eBioMedicine. doi:10.1016/j.ebiom.2024.105064

  • Researchers shed light on proteins controlling the development of ovaries in mice

    Researchers at the Francis Crick Institute have shed light on the proteins controlling the development of ovaries in mice before and after birth. This could lead to a better understanding of how female infertility develops.

    Following their research identifying the gene responsible for initiating the development of ovaries in the mouse embryo, the scientists aimed to understand which genes maintain the functions of the ovaries, including producing eggs, after birth.

    Previous experiments have shown that removing a gene called Foxl2 in female (XX) mice at different points in development has different effects depending on the timing. If removed from embryos, ovaries become abnormal and the adult mice are infertile. If removed from adult mice, their ovaries begin to resemble testes. 

    In research published today in Science Advances, the team found that, while FOXL2 does play a role during embryonic development, it has the most impact after birth, where the protein regulates the activity of many more genes, including some involved in functions critical for the ovary such as egg development.

    FOXL2 is a type of protein that physically sits on top of specific regions in DNA (‘enhancers’) and influences whether and how other (target) genes are read.

    The researchers used a technique called chromatin proteomics to ‘fish out’ all of the other proteins that interact with FOXL2 when it is bound to DNA. They found that the number of protein interactions drastically increased in ovaries after birth compared to during embryonic development.

    Among many others, they identified a protein called USP7, which binds to FOXL2 when it interacts with its DNA targets. Until now, researchers weren’t aware of the interaction between USP7 and FOXL2 or of the role USP7 plays in ovary development.

    When the researchers removed the Usp7 gene from female mice, they found that the mice couldn’t develop ovaries beyond puberty, so were infertile. The team believe USP7 might be needed to stabilise FOXL2 on top of DNA. 

    FOXL2 and USP7 share some common roles in humans. People lacking one copy of the FOXL2 gene can start making eggs but don’t develop full ovaries, so have problems with fertility. USP7 mutations can also lead to infertility in people, as well as neurodevelopmental disorders.

    Genetic testing is key to diagnosing problems with sexual development, so researchers hope to find the major genetic causes of infertility and consider how gene editing techniques could help with future treatments.

    Robin Lovell-Badge, Group Leader of the Stem Cell Biology and Developmental Genetics Laboratory at the Crick, said: “In our research, we’ve come closer to answers for two major questions regarding development – what drives ovary development, and how the function of the ovary is maintained. We’ve found that FOXL2 has very different roles throughout development, and identified another crucial protein, USP7.

    “The genetic factors underlying female development haven’t been as well studied as male development, because many female developmental pathways happen at the same time rather than in an easy-to-follow sequence. Infertility is a big problem worldwide, so shedding light on the key genes and proteins responsible at each stage is vital.”

    “This is the first time we’ve been able to use these approaches to see the interactions that FOXL2, a factor critical for female fertility, establishes with other proteins whilst they are bound to DNA in mouse ovaries. Factors that actively bind to the DNA are more likely to have an impact on the regulation of genes important for the development and function of the ovary. We’ve identified USP7 through this method and the hope is that many more proteins responsible for ovary development can be found using our approach.”


    Roberta Migale, Postdoctoral Fellow at the Crick and first and co-senior author on the study

    In a Crick-wide effort, Robin and Roberta worked with several specialist teams, including the Genetic Modification Service, Bioinformatics and Biostatistics, Proteomics, Flow Cytometry, Experimental Histopathology, Light Microscopy, and the Biological Research Facility.

    The researchers will continue to study the role of the USP7 protein in sexual development.

    Journal reference:

    Migale, R., et al. (2024) FOXL2 interaction with different binding partners regulates the dynamics of ovarian development. Science Advances. doi.org/10.1126/sciadv.adl0788.
