The principle of exhaustiveness versus the principle of parsimony: a new approach for the identification of biomarkers from proteomic spot volume datasets based on principal component analysis

MARENGO, Emilio; ROBOTTI, Elisa; GOSETTI, Fabio
2010-01-01

Abstract

Biomarker discovery is one of the leading research areas in proteomics. One of the most widely used approaches identifies potential biomarkers from spot volume datasets produced by 2D gel electrophoresis. Problems arise here because each map contains a large number of spots, while only a small number of maps is available for each class (control/pathological). Multivariate methods are therefore usually applied together with variable selection procedures to provide a subset of potential candidates. The available variable selection procedures usually follow the so-called principle of parsimony: they select the smallest set of spots that provides the best classification performance. This approach is not effective in proteomics, since all potential biomarkers must be identified: not only the most discriminating spots, which are usually related to general responses to inflammatory events, but also the smallest differences and all redundant molecules, i.e. biomarkers showing similar behaviour. The principle of exhaustiveness should be pursued rather than parsimony. To this end, a new ranking and classification method, "Ranking-PCA", based on principal component analysis and variable selection in forward search, is proposed here for the exhaustive identification of all possible biomarkers. The method is successfully applied to three different proteomic datasets to demonstrate its effectiveness.
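
The abstract names the ingredients of the method (principal component analysis, forward variable selection, and an exhaustive ranking rather than a minimal subset) but not its details. The following is only a minimal sketch of how those ingredients could be combined, not the authors' published procedure. The function names `pc1_separation` and `forward_rank`, the two-class 0/1 labels, and the separation criterion (standardised distance between class means along the first principal component) are all assumptions made for illustration.

```python
import numpy as np


def pc1_separation(X_sub, y):
    """Separation of two classes along the first principal component.

    ASSUMPTION: this criterion is illustrative only; the abstract does not
    state which criterion Ranking-PCA actually uses.
    """
    Xc = X_sub - X_sub.mean(axis=0)                    # mean-centre each spot
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)   # PCA via SVD
    scores = U[:, 0] * S[0]                            # sample scores on PC1
    a, b = scores[y == 0], scores[y == 1]
    pooled = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2.0)
    return abs(a.mean() - b.mean()) / (pooled + 1e-12)


def forward_rank(X, y):
    """Rank ALL variables by forward search (hypothetical helper).

    At each step the variable that gives the best separation, when added to
    those already chosen, is appended to the ranking. The search does not
    stop at the best-performing subset: every variable receives a rank,
    which is the exhaustive (rather than parsimonious) behaviour.
    """
    remaining = list(range(X.shape[1]))
    ranked = []
    while remaining:
        best = max(
            remaining,
            key=lambda j: pc1_separation(X[:, ranked + [j]], y),
        )
        ranked.append(best)
        remaining.remove(best)
    return ranked
```

Here `X` would be a maps-by-spots matrix of spot volumes and `y` a vector of class labels (0 for control, 1 for pathological). Because every spot ends up in the ranking, redundant spots that behave like an already selected one are kept instead of being discarded.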

Use this identifier to cite or link to this document: https://hdl.handle.net/11579/30050