Measure, Normalise, Analyse

Glycans are complex sugar molecules that are present on the majority of proteins in the human body. They have shown great potential as biomarkers of both biological and chronological age, as well as of various diseases, including cancer and autoimmune diseases. Biomarker identification studies require large sample sizes, and measuring that many samples can introduce differences between individuals that stem from experimental variation rather than the biological variation researchers are interested in. Glycomics is no exception: these differences must be reduced to make samples comparable, thereby avoiding unwanted bias and false positives.

Modern high-throughput glycomics data show that total glycan abundance differs widely between subjects and that glycans are highly correlated with one another. An essential step in preprocessing raw glycan data is normalisation, which transforms glycomics measurements to make them comparable between subjects. However, current normalisation methods produce compositional data, in which each glycan is expressed as a share of a total, and this makes many standard multivariate statistical methods inappropriate or inapplicable.
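To make the compositional issue concrete, here is a minimal Python sketch of row-wise total area normalisation, assuming it means expressing each glycan peak as a percentage of that subject's total signal. The function name and the peak areas are hypothetical and are not taken from the study.

```python
def total_area_normalise(samples):
    """Row-wise total area normalisation (illustrative sketch).

    Each row is one subject, each column one glycan peak. Every peak is
    rescaled to a percentage of that subject's summed signal.
    """
    normalised = []
    for peaks in samples:
        total = sum(peaks)
        if total <= 0:
            raise ValueError("each sample needs a positive total signal")
        normalised.append([100.0 * p / total for p in peaks])
    return normalised


# Hypothetical raw peak areas: rows are subjects, columns are glycan peaks.
raw = [
    [120.0, 300.0, 180.0],
    [240.0, 600.0, 360.0],  # same profile as subject 1, twice the total signal
    [100.0, 100.0, 200.0],
]

normalised = total_area_normalise(raw)

# After normalisation every subject's values sum to 100.
row_sums = [sum(row) for row in normalised]
```

In this sketch, subjects 1 and 2 end up with identical profiles, which is the intended effect of removing differences in total abundance. The side effect is that every row sums to 100, so a rise in one glycan must be offset by a fall in the others. That fixed-sum constraint is what makes the data compositional.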

The lack of consensus on an appropriate normalisation approach in glycomics motivated Uh et al. to investigate how different normalisation methods affect subsequent statistical analysis, such as variable selection for age prediction using immunoglobulin G (IgG) glycans.

The study tests six normalisation methods, applies variable selection, and uses simulations to evaluate the robustness and efficiency of each method. The researchers demonstrate that the widely used row-wise total area normalisation method performs poorly compared with the column-wise normalisation methods: glycans were falsely selected and the prediction error was large. The column-wise normalisation methods, such as MS (median scaling) and MQN (Multivariate Quantile normalisation), not only outperformed the row-wise methods but also have the advantage of preserving the correlation structure. The authors recommend applying several normalisation methods and reporting the association results that are detected by the majority of them.
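The majority recommendation can be sketched in a few lines of Python. This is only an illustration of the idea, assuming that "majority" means more than half of the methods applied. The glycan labels and the selections listed for each method are invented for the example and are not results from the study.

```python
from collections import Counter


def majority_selected(selections):
    """Return the glycans selected under more than half of the methods.

    `selections` maps a normalisation method name to the set of glycans
    that variable selection picked after that normalisation.
    """
    counts = Counter(
        glycan for selected in selections.values() for glycan in set(selected)
    )
    threshold = len(selections) / 2
    return sorted(g for g, n in counts.items() if n > threshold)


# Hypothetical selections; labels such as "GP4" are placeholders.
selections = {
    "total_area": {"GP1", "GP4", "GP9"},
    "median_scaling": {"GP4", "GP6", "GP14"},
    "mqn": {"GP4", "GP6", "GP14"},
}

consensus = majority_selected(selections)
```

With these made-up inputs, GP4, GP6 and GP14 are kept because at least two of the three methods agree on them, while GP1 and GP9, which appear under only one method, are dropped.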

Studies like this highlight the importance of initial data handling and raise awareness of the bias and false positives that an inappropriate choice of normalisation method can cause. The procedure described also offers a useful guide for studies aiming to identify robust, reproducible glycan biomarkers.
