Review article
On the utilization of principal component analysis in laser-induced breakdown spectroscopy data analysis, a review

https://doi.org/10.1016/j.sab.2018.05.030

Highlights

  • Implementation of principal component analysis to processing of laser-induced breakdown spectroscopy data

  • Thorough review of articles on pattern recognition, classification and regression

  • Advice on data preprocessing and understanding of PCA outputs

Abstract

A fast, robust, and effective algorithm is essential in modern multivariate data analysis (MVDA). The principal component analysis (PCA) algorithm meets these requirements and is becoming popular not only in the spectroscopic community. PCA is, therefore, often used to process detected multivariate signals (characteristic spectra). Over the past decade, PCA has been adopted by the Laser-Induced Breakdown Spectroscopy (LIBS) community, and the number of scientific articles referring to PCA is steadily increasing. The interest in PCA stems not only from the basic need to visualize data quickly on a lower-dimensional scale and to inspect the most prominent variables. Most recently, PCA has also been applied in unconventional data analyses, namely the processing of large-scale LIBS maps. However, the rapid development of LIBS-related instrumentation and applications has led to non-uniform methodologies in the implementation and utilization of MVDA, including PCA. Thus, in this work, we critically assess and elaborate on the approaches to utilizing PCA in LIBS data processing. This article also aims to derive some implications and to offer advice on data preprocessing, visualization, dimensionality reduction, model building, classification, quantification, and non-conventional multivariate mapping. The review also covers MVDA algorithms other than PCA; consequently, the presented conclusions and recommendations can be generalized.

Introduction

Sample characterization using the Laser-Induced Breakdown Spectroscopy (LIBS) technique has been advancing dynamically in recent years. The parameters of conventionally utilized analytical instrumentation (lasers, spectrometers, and detectors) are constantly being improved. Moreover, lab-built systems, whether complicated or basic, have given way to sophisticated, commercially available systems that enable effortless and fast spectroscopic analysis. Contemporary state-of-the-art LIBS systems are capable of high-end performance (repetition rate, resolution, sensitivity). In certain cases, this performance is superior to that of analytical counterparts or reference techniques, such as Laser-Ablation Inductively Coupled Plasma (LA-ICP) based techniques, X-ray Fluorescence (XRF), etc.

LIBS is a well-established technique in many different applications, such as biology [[1], [2], [3], [4]], geology [5], and industry [6]. The reason is the simplicity and robustness of the LIBS instrumentation together with its capability of a fast-throughput multielemental analysis. Its potential has been repeatedly demonstrated by its high-end lab-based [7], in-situ and stand-off [8,9], and even extraterrestrial [10,11] utilization.

LIBS is an atomic emission spectroscopic technique [6,12,13] based on laser ablation sampling. Thorough articles have been published reviewing the basic theory of Laser-Induced Plasma (LIP) formation [[14], [15], [16]] and LIBS in general [[17], [18], [19], [20]].

The introduction to the basic theory of the LIBS technique is kept brief because this review focuses primarily on data processing. The reader should consult the referenced books and review articles for a more detailed background on LIBS theory before any data processing with MVDA algorithms. As emphasized by Hahn and Omenetto [17]: “advanced chemometric algorithms must be used with knowledge of what emission features (e.g. atomic or molecular emission peaks) are providing the associated discrimination.”

A typical LIBS system can provide a large number of measurements (given by its repetition rate), each described by a large number of variables (especially in the case of echelle spectrometers). High-repetition-rate systems are mentioned here because they are at the leading edge of LIBS applications; however, obtaining a large number of measurements is not strictly limited to such systems. The collected LIP spectrum is rich in information and represents the sample from which it originated, i.e. it is the chemical/spectral fingerprint of the sample [21,22]. The processing of large-scale datasets is a demanding task that can be accomplished using so-called multivariate data analysis (MVDA; often referred to as chemometrics, exploratory data analysis, or pattern recognition). It is noteworthy that unique LIP spectra are strongly affected by the matrix effect [19], which requires special attention in conventional univariate calibration and quantitative analysis. On the other hand, this dependence on the sample matrix enables the classification of samples according to their spectral fingerprints using simple MVDA algorithms. When processing large datasets, two more requirements must be met: the data should be processed in the least possible time and in the most efficient manner. Efficiency can be measured by the conservation of variance during dimensionality reduction, the sensitivity to outliers, and the specificity in discriminating between individual matrices of analytes.
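The conservation of variance during dimensionality reduction can be written down with the usual PCA bookkeeping. The following is a sketch in notation introduced here for illustration (the symbols X, C, v_i, λ_i, and R_k are assumptions, not taken from the review); it states the standard definition of the variance fraction retained by the first k principal components.

```latex
% X: mean-centered data matrix, n spectra (rows) by p variables (columns)
C = \frac{1}{n-1} X^{\mathsf{T}} X, \qquad
C v_i = \lambda_i v_i, \qquad
\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_p \ge 0

% Fraction of the total variance conserved by the first k principal components
R_k = \frac{\sum_{i=1}^{k} \lambda_i}{\sum_{i=1}^{p} \lambda_i}, \qquad 0 \le R_k \le 1
```

A value of R_k close to 1 for a small k indicates that a few components already describe most of the spectral variation in the dataset.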

MVDA algorithms are widely used throughout the LIBS community in a number of applications. It may be argued that the future of LIBS data analysis lies in the implementation of MVDA algorithms. The use of multivariate algorithms for processing spectroscopic data has already been well documented [[23], [24], [25], [26]]. Moreover, several review articles [5,17,27,28] have dealt solely with the multivariate processing of LIBS data. A full chapter in the LIBS book by Cremers and Radziemski [12] was also dedicated to this topic. Based on the literature survey, the most popular MVDA algorithm in the LIBS community is Principal Component Analysis (PCA). This simple linear algorithm provides a powerful means of data visualization and pattern recognition on a lower-dimensional scale.
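To illustrate how PCA yields a lower-dimensional view of spectra, the following minimal sketch computes PCA through a singular value decomposition with NumPy. The data are synthetic stand-ins for LIBS spectra (two hypothetical sample classes that differ in the relative intensity of two Gaussian "emission lines"), and the names `pca_svd`, `line`, `class_a`, and `class_b` are assumptions made for this example only.

```python
import numpy as np


def pca_svd(X, n_components):
    """Return scores, loadings and explained-variance ratios of X (rows = spectra)."""
    Xc = X - X.mean(axis=0)                            # center each variable (channel)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)  # Xc = U @ diag(s) @ Vt
    scores = U[:, :n_components] * s[:n_components]    # spectra projected onto the PCs
    loadings = Vt[:n_components]                       # one row per principal component
    explained = s**2 / np.sum(s**2)                    # variance fraction per component
    return scores, loadings, explained[:n_components]


# Synthetic data: 200 spectral channels, two classes of 30 noisy spectra each.
rng = np.random.default_rng(0)
channels = np.arange(200)


def line(center, width=3.0):
    """Gaussian profile used as a stand-in for an emission line."""
    return np.exp(-0.5 * ((channels - center) / width) ** 2)


class_a = 1.0 * line(50) + 0.2 * line(120)
class_b = 0.2 * line(50) + 1.0 * line(120)
X = np.vstack([
    class_a + 0.02 * rng.standard_normal((30, channels.size)),
    class_b + 0.02 * rng.standard_normal((30, channels.size)),
])

scores, loadings, explained = pca_svd(X, n_components=2)
print("explained variance ratios:", np.round(explained, 3))
```

In this toy case, the two classes separate along the first component, so a scatter plot of `scores[:, 0]` against `scores[:, 1]` would show two clusters, and the first row of `loadings` peaks at the two channels that drive the separation.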

Based on our thorough literature research, the methodological approaches to processing LIBS data with MVDA algorithms differ significantly. This is due to i) the needs of a particular application, ii) the uniqueness of the data acquisition and data size, iii) the data topology, iv) the variety of MVDA algorithms, and v) the internal methodology of each research group. Consequently, there is no unified approach, and one might not emerge in the future. Moreover, the wide range of MVDA algorithms, together with the available data-processing software, makes such analysis accessible and easy to perform. This might lead to the misguided use of these algorithms and software, i.e. when their use leads to a merely aesthetic improvement of low-quality data (high fluctuation, low sensitivity, etc.) [29]. Nevertheless, it has to be stressed that a stable and optimized analytical system providing reproducible, high-performance analysis (high-quality data) should be the cornerstone of any experimental work. The same is valid for the understanding of the theory of i) LIBS (e.g. laser ablation and plasma dynamics and properties) and ii) MVDA algorithms and their considerate and judicious implementation in the data analysis process [17].

In this work, we summarize the most common approaches to implementing PCA in LIBS data analysis for low-dimensional visualization, clustering, outlier filtering, variable selection, quantification, classification, and non-conventional multivariate mapping. Additionally, general suggestions for data preprocessing and model building, as well as a comparison with the performance of other MVDA counterparts, are given.

Section snippets

Data preprocessing

Prior to the implementation of any MVDA algorithm, such as PCA and its variations, it is strongly advised to preprocess the obtained data [28,30]. The detected multivariate signal in its raw state is burdened with unwanted background signal, fluctuations in the experimental parameters, etc. It has to be kept in mind that the data structure changes during data handling. This leads to consequent changes in the performance of the MVDA algorithm applied to the final data [31]. In general, there is a
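As an illustration of how such preprocessing might look in practice, the following minimal sketch, assuming NumPy and a hypothetical `preprocess` helper, removes a crude constant offset, normalizes each spectrum to its total intensity, and mean-centers each channel. These particular steps are assumptions chosen for illustration, not the procedure recommended in the review.

```python
import numpy as np


def preprocess(spectra):
    """Offset removal, total-intensity normalization and mean centering (rows = spectra)."""
    spectra = np.asarray(spectra, dtype=float)
    # Crude background estimate: the per-spectrum minimum (an illustrative assumption).
    no_offset = spectra - spectra.min(axis=1, keepdims=True)
    # Normalize each spectrum to unit total intensity to damp shot-to-shot fluctuation.
    totals = no_offset.sum(axis=1, keepdims=True)
    normalized = no_offset / np.where(totals == 0, 1.0, totals)
    # Center every channel so that PCA describes variation around the mean spectrum.
    return normalized - normalized.mean(axis=0)


raw = np.array([
    [10.0, 12.0, 30.0, 11.0],
    [20.0, 24.0, 60.0, 22.0],   # same spectral shape as the first row, doubled intensity
    [10.0, 30.0, 12.0, 11.0],   # different spectral shape
])
processed = preprocess(raw)
```

After this treatment, the first two rows become identical, since they differ only in overall intensity, while the third row remains distinct. This also shows how each preprocessing choice alters the data structure that a subsequent MVDA algorithm sees.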

PCA in LIBS

Advances in instrumentation enable measurements with higher repetition rates, broader spectral ranges, and better resolution. Nowadays, an analysis results in datasets with thousands of variables [78] and millions of spectra [7]. Thus, state-of-the-art LIBS systems routinely provide big datasets (a high number of spectra and variables), and so it is crucial to ensure effective and fast data processing. The MVDA algorithms must be applied into the analytical data

Summary of publications (Table 1)

Conclusion and future prospects

Based on the literature survey, LIBS combined with MVDA algorithms has proven capable of classifying unknown samples and quantifying analytes in many applications. However, the majority of the reviewed articles represent only feasibility and preliminary studies. The impact of the presented alterations in data preprocessing and MVDA algorithms on the resulting figures of merit was demonstrated on a limited number of samples, with a low number of spectra per sample, etc.

Generally, LIBS is on its rise and

Acknowledgement

The authors affiliated with CEITEC would like to acknowledge the financial support obtained from the National Sustainability program - CEITEC NPU II (LQ1061) and from the ERDFund-Project CEITEC Nano+ (CZ.02.1.01/0.0/0.0/16_013/0001728). PP is grateful to the Fulbright Commission for supporting his research at the University of Florida (E0583833).

References (193)

  • P. Geladi et al., Chemometrics in spectroscopy, Spectrochim. Acta B At. Spectrosc. (2004)
  • N. Kumar et al., Chemometrics tools used in analytical chemistry: an overview, Talanta (2014)
  • F.C.J. De Lucia et al., Rapid analysis of energetic and geo-materials using LIBS, Mater. Today (2011)
  • J. El Haddad et al., Good practices in LIBS analysis: review and advices, Spectrochim. Acta B At. Spectrosc. (2014)
  • P. de Boves Harrington, Statistical validation of classification and calibration models using bootstrapped Latin partitions, Trends Anal. Chem. (2006)
  • E.H. van Veen et al., Application of mathematical procedures to background correction and multivariate analysis in inductively coupled plasma-optical emission spectrometry, Spectrochim. Acta B At. Spectrosc. (1998)
  • P. Yaroshchyk et al., Automatic correction of continuum background in laser-induced breakdown spectroscopy using a model-free algorithm, Spectrochim. Acta B At. Spectrosc. (2014)
  • N.B. Zorov et al., A review of normalization techniques in analytical atomic spectrometry with laser sampling: from single to multivariate correction, Spectrochim. Acta B At. Spectrosc. (2010)
  • T. Takahashi et al., Quantitative methods for compensation of matrix effects and self-absorption in laser induced breakdown spectroscopy signals of solids, Spectrochim. Acta B At. Spectrosc. (2017)
  • F. Colao et al., Quarry identification of historical building materials by means of laser induced breakdown spectroscopy, X-ray fluorescence and chemometric analysis, Spectrochim. Acta B At. Spectrosc. (2010)
  • B. Bousquet et al., Towards quantitative laser-induced breakdown spectroscopy analysis of soil samples, Spectrochim. Acta B At. Spectrosc. (2007)
  • D.L. Death et al., Multi-element analysis of iron ore pellets by laser-induced breakdown spectroscopy and principal components regression, Spectrochim. Acta B At. Spectrosc. (2008)
  • D.L. Death et al., Multi-element and mineralogical analysis of mineral ores using laser induced breakdown spectroscopy and chemometric analysis, Spectrochim. Acta B At. Spectrosc. (2009)
  • A.P.M. Michel et al., Analysis of laser-induced breakdown spectroscopy spectra: the case for extreme value statistics, Spectrochim. Acta B At. Spectrosc. (2007)
  • J. Klus et al., Effect of experimental parameters and resulting analytical signal statistics in laser-induced breakdown spectroscopy, Spectrochim. Acta B At. Spectrosc. (2016)
  • G. Amato et al., Progress towards an unassisted element identification from laser induced breakdown spectra with automatic ranking techniques inspired by text retrieval, Spectrochim. Acta B At. Spectrosc. (2010)
  • R.B. Anderson et al., Clustering and training set selection methods for improving the accuracy of quantitative laser induced breakdown spectroscopy, Spectrochim. Acta B At. Spectrosc. (2012)
  • P. Pořízka et al., Laser-induced breakdown spectroscopy coupled with chemometrics for the analysis of steel: the issue of spectral outliers filtering, Spectrochim. Acta B At. Spectrosc. (2016)
  • J. Carlson et al., Limits of quantitation: yet another suggestion, Spectrochim. Acta B At. Spectrosc. (2014)
  • W. Wu et al., Kernel-PCA algorithms for wide data part II: fast cross-validation and application in classification of NIR data, Chemom. Intell. Lab. Syst. (1997)
  • M. Hubert et al., A fast method for robust principal components with applications to chemometrics, Chemom. Intell. Lab. Syst. (2002)
  • R.A. Putnam et al., A comparison of multivariate analysis techniques and variable selection strategies in a laser-induced breakdown spectroscopy bacterial classification, Spectrochim. Acta B At. Spectrosc. (2013)
  • O. Forni et al., Independent component analysis classification of laser induced breakdown spectroscopy spectra, Spectrochim. Acta B At. Spectrosc. (2013)
  • M. Defernez et al., The use and misuse of chemometrics for treating classification problems, Trends Anal. Chem. (1997)
  • B. Bousquet et al., Development of a mobile system based on laser-induced breakdown spectroscopy and dedicated to in situ analysis of polluted soils, Spectrochim. Acta B At. Spectrosc. (2008)
  • P. Pořízka et al., Laser-induced breakdown spectroscopy for in situ qualitative and quantitative analysis of mineral ores, Spectrochim. Acta B At. Spectrosc. (2014)
  • J. Klus et al., Multivariate approach to the chemical mapping of uranium in sandstone-hosted uranium ores analyzed using double pulse laser-induced breakdown spectroscopy, Spectrochim. Acta B At. Spectrosc. (2016)
  • V. Lazic et al., Analysis of explosive and other organic residues by laser induced breakdown spectroscopy, Spectrochim. Acta B At. Spectrosc. (2009)
  • J.L. Gottfried et al., Multivariate analysis of laser-induced breakdown spectroscopy chemical signatures for geomaterial classification, Spectrochim. Acta B At. Spectrosc. (2009)
  • A. Erdem et al., Characterization of Iron Age pottery from eastern Turkey by laser-induced breakdown spectroscopy (LIBS), J. Archaeol. Sci. (2008)
  • G. Vítková et al., Fast identification of biominerals by means of stand-off laser-induced breakdown spectroscopy using linear discriminant analysis and artificial neural networks, Spectrochim. Acta B At. Spectrosc. (2012)
  • P. Pořízka et al., Assessment of the most effective part of echelle laser-induced plasma spectra for further classification using Czerny-Turner spectrometer, Spectrochim. Acta B At. Spectrosc. (2016)
  • D. Prochazka et al., Combination of laser-induced breakdown spectroscopy and Raman spectroscopy for multivariate classification of bacteria, Spectrochim. Acta B At. Spectrosc. (2018)
  • S.J. Rehse et al., Laser-induced breakdown spectroscopy (LIBS): an overview of recent progress and future potential for biomedical applications, J. Med. Eng. Technol. (2012)
  • P. Pořízka et al., Algal biomass analysis by laser-based analytical techniques: a review, Sensors (2014)
  • R. Noll, Laser-induced Breakdown Spectroscopy: Fundamentals and Applications (2012)
  • J.O. Cáceres et al., Megapixel multi-elemental imaging by laser-induced breakdown spectroscopy, a technology with considerable potential for paleoclimate studies, Sci. Rep. (2017)
  • S. Maurice et al., ChemCam activities and discoveries during the nominal mission of the Mars Science Laboratory in Gale crater, Mars, J. Anal. At. Spectrom. (2016)
  • CHEMCAM Team, “ChemCam on Mars,” 2 February 2018. [Online]
  • D.A. Cremers et al., Handbook of Laser-induced Breakdown Spectroscopy (2013)