Metabolomics

Metabolomics is the scientific study of chemical processes involving metabolites. Specifically, metabolomics is the "systematic study of the unique chemical fingerprints that specific cellular processes leave behind", the study of their small-molecule metabolite profiles. The metabolome represents the collection of all metabolites in a biological cell, tissue, organ or organism, which are the end products of cellular processes. Thus, while mRNA gene expression data and proteomic analyses do not tell the whole story of what might be happening in a cell, metabolic profiling can give an instantaneous snapshot of the physiology of that cell. One of the challenges of systems biology and functional genomics is to integrate proteomic, transcriptomic, and metabolomic information to give a more complete picture of living organisms.

Origins
The idea that biological fluids reflect the health of an individual has existed for a long time. Ancient Chinese doctors used ants for the evaluation of urine of patients to detect whether the urine contained high levels of glucose, and hence detect diabetes. In the Middle Ages, "urine charts" were used to link the colours, tastes and smells of urine to various medical conditions, which are metabolic in origin.

The concept that individuals might have a "metabolic profile" that could be reflected in the makeup of their biological fluids was introduced by Roger Williams in the late 1940s, who used paper chromatography to suggest that characteristic metabolic patterns in urine and saliva were associated with diseases such as schizophrenia. However, it was only through technological advancements in the 1960s and 1970s that it became feasible to measure metabolic profiles quantitatively (as opposed to qualitatively). The term "metabolic profile" was introduced by Horning et al. in 1971 after they demonstrated that gas chromatography-mass spectrometry (GC-MS) could be used to measure compounds present in human urine and tissue extracts. The Horning group, along with that of Linus Pauling and Arthur B. Robinson, led the development of GC-MS methods to monitor the metabolites present in urine through the 1970s.

Concurrently, NMR spectroscopy, which was discovered in the 1940s, was also undergoing rapid advances. In 1974, Seeley et al. demonstrated the utility of NMR for detecting metabolites in unmodified biological samples. This first study, on muscle, highlighted the value of NMR by determining that 90% of cellular ATP is complexed with magnesium. As sensitivity has improved with higher magnetic field strengths and magic angle spinning, NMR continues to be a leading analytical tool for investigating metabolism. Recent efforts to utilize NMR for metabolomics have been largely driven by the laboratory of Dr. Jeremy Nicholson at Birkbeck College, University of London and later at Imperial College London. In 1984, Nicholson showed that 1H NMR spectroscopy could potentially be used to diagnose and treat diabetes mellitus, and later pioneered the application of pattern recognition methods to NMR spectroscopic data.

In 2005, the first metabolomics web database, METLIN, for characterizing human metabolites was developed in the Siuzdak laboratory at The Scripps Research Institute; it contained over 10,000 metabolites and tandem mass spectral data. METLIN has since grown to contain over 60,000 metabolites as well as the largest repository of tandem mass spectrometry data in metabolomics.

On 23 January 2007, the Human Metabolome Project, led by Dr. David Wishart of the University of Alberta, Canada, completed the first draft of the human metabolome, consisting of a database of approximately 2500 metabolites, 1200 drugs and 3500 food components. Similar projects have been underway for several years in several plant species, most notably Medicago truncatula and Arabidopsis thaliana.

As late as mid-2010, metabolomics was still considered an "emerging field". It was further noted that progress in the field depended in large part on the technical evolution of mass spectrometry instrumentation to address otherwise "irresolvable technical challenges".

Metabolome
Metabolome refers to the complete set of small-molecule metabolites (such as metabolic intermediates, hormones and other signaling molecules, and secondary metabolites) to be found within a biological sample, such as a single organism. The word was coined in analogy with transcriptomics and proteomics; like the transcriptome and the proteome, the metabolome is dynamic, changing from second to second. Although the metabolome can be defined readily enough, it is not currently possible to analyse the entire range of metabolites by a single analytical method. The first metabolite database (called METLIN) for searching m/z values from mass spectrometry data was developed by scientists at The Scripps Research Institute in 2005. In January 2007, scientists at the University of Alberta and the University of Calgary completed the first draft of the human metabolome. They catalogued approximately 2500 metabolites, 1200 drugs and 3500 food components that can be found in the human body, as reported in the literature. This information, available at the Human Metabolome Database (www.hmdb.ca) and based on analysis of the current scientific literature, is far from complete. In contrast, much more is known about the metabolomes of other organisms. For example, over 50,000 metabolites have been characterized from the plant kingdom, and many thousands of metabolites have been identified or characterized from single plants.
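Searching a metabolite database by m/z value amounts to comparing an observed mass-to-charge ratio against theoretical values within a tolerance. The following is a minimal sketch of that idea, not the METLIN interface: the three-entry reference table, the function names, the 10 ppm default tolerance and the restriction to singly protonated [M+H]+ ions are all assumptions made for illustration.

```python
PROTON_MASS = 1.007276  # mass of a proton in Da, used for [M+H]+ ions

# Hypothetical reference table of neutral monoisotopic masses (Da).
# A real database would hold tens of thousands of entries.
REFERENCE_MASSES = {
    "glucose": 180.06339,
    "alanine": 89.04768,
    "citric acid": 192.02700,
}

def ppm_error(observed, theoretical):
    """Relative mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6

def search_mz(observed_mz, tolerance_ppm=10.0):
    """Return (name, ppm error) for entries whose [M+H]+ m/z is within tolerance."""
    hits = []
    for name, neutral_mass in REFERENCE_MASSES.items():
        theoretical_mz = neutral_mass + PROTON_MASS
        error = ppm_error(observed_mz, theoretical_mz)
        if abs(error) <= tolerance_ppm:
            hits.append((name, round(error, 2)))
    return hits

# Example query: an observed ion near the [M+H]+ of a hexose.
matches = search_mz(181.0712)
```

A match on mass alone is only a candidate identification: isomers such as glucose and fructose share the same monoisotopic mass, which is one reason fragmentation (tandem MS) data and retention time are used alongside m/z.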

Each type of cell and tissue has a unique metabolic ‘fingerprint’ that can elucidate organ or tissue-specific information, while the study of biofluids can give more generalized though less specialized information. Commonly used biofluids are urine and plasma, as they can be obtained non-invasively or relatively non-invasively, respectively. The ease of collection facilitates high temporal resolution, and because they are always at dynamic equilibrium with the body, they can describe the host as a whole.

Metabolites
Metabolites are the intermediates and products of metabolism. Within the context of metabolomics, a metabolite is usually defined as any molecule less than 1 kDa in size. However, there are exceptions to this depending on the sample and detection method. For example, macromolecules such as lipoproteins and albumin are reliably detected in NMR-based metabolomics studies of blood plasma. In plant-based metabolomics, it is common to refer to "primary" and "secondary" metabolites. A primary metabolite is directly involved in normal growth, development, and reproduction. A secondary metabolite is not directly involved in those processes, but usually has an important ecological function. Examples include antibiotics and pigments. By contrast, in human-based metabolomics, it is more common to describe metabolites as being either endogenous (produced by the host organism) or exogenous. Metabolites of foreign substances such as drugs are termed xenometabolites.

The metabolome forms a large network of metabolic reactions, where outputs from one enzymatic chemical reaction are inputs to other chemical reactions. Such systems have been described as hypercycles.

Metabonomics
Metabonomics is defined as "the quantitative measurement of the dynamic multiparametric metabolic response of living systems to pathophysiological stimuli or genetic modification". The word origin is from the Greek μεταβολή meaning change and nomos meaning a rule set or set of laws. This approach was pioneered by Jeremy Nicholson at Imperial College London and has been used in toxicology, disease diagnosis and a number of other fields. Historically, the metabonomics approach was one of the first methods to apply the scope of systems biology to studies of metabolism.

There has been some disagreement over the exact differences between 'metabolomics' and 'metabonomics'. The difference between the two terms is not related to choice of analytical platform: although metabonomics is more associated with NMR spectroscopy and metabolomics with mass spectrometry-based techniques, this is simply because of usage among the different groups that have popularized the two terms. While there is still no absolute agreement, there is a growing consensus that 'metabolomics' places a greater emphasis on metabolic profiling at a cellular or organ level and is primarily concerned with normal endogenous metabolism. 'Metabonomics' extends metabolic profiling to include information about perturbations of metabolism caused by environmental factors (including diet and toxins), disease processes, and the involvement of extragenomic influences, such as gut microflora. This is not a trivial difference; metabolomic studies should, by definition, exclude metabolic contributions from extragenomic sources, because these are external to the system being studied. However, in practice, within the field of human disease research there is still a large degree of overlap in the way both terms are used, and they are often in effect synonymous.

Separation methods

 * Gas chromatography, especially when interfaced with mass spectrometry (GC-MS), is one of the most widely used and powerful methods. It offers very high chromatographic resolution, but requires chemical derivatization for many biomolecules: only volatile chemicals can be analysed without derivatization. (Some modern instruments allow '2D' chromatography, using a short polar column after the main analytical column, which increases the resolution still further.) Some large and polar metabolites cannot be analysed by GC.


 * High performance liquid chromatography (HPLC). Compared to GC, HPLC has lower chromatographic resolution, but it does have the advantage that a much wider range of analytes can potentially be measured.


 * Capillary electrophoresis (CE). CE has a higher theoretical separation efficiency than HPLC, and is suitable for use with a wider range of metabolite classes than is GC. As for all electrophoretic techniques, it is most appropriate for charged analytes.

Detection methods

 * Mass spectrometry (MS) is used to identify and to quantify metabolites after separation by GC, HPLC (LC-MS), or CE. GC-MS is the most 'natural' combination of the three, and was the first to be developed. In addition, mass spectral fingerprint libraries exist or can be developed that allow identification of a metabolite according to its fragmentation pattern. MS is sensitive (although, particularly for HPLC-MS, sensitivity is more of an issue because it is affected by the charge on the metabolite and can be subject to ion suppression artifacts) and can be very specific. There are also a number of studies which use MS as a stand-alone technology: the sample is infused directly into the mass spectrometer with no prior separation, and the MS serves both to separate and to detect metabolites.


 * Surface-based mass analysis has seen a resurgence in the past decade, with new MS technologies focused on increasing sensitivity, minimizing background, and reducing sample preparation. The ability to analyze metabolites directly from biofluids and tissues continues to challenge current MS technology, largely because of the limits imposed by the complexity of these samples, which contain thousands to tens of thousands of metabolites. Among the technologies being developed to address this challenge is Nanostructure-Initiator MS (NIMS), a desorption/ionization approach that does not require the application of matrix and thereby facilitates small-molecule (i.e., metabolite) identification. MALDI is also used; however, the application of a MALDI matrix can add significant background at <1000 Da that complicates analysis of the low-mass range (i.e., metabolites). In addition, the size of the resulting matrix crystals limits the spatial resolution that can be achieved in tissue imaging. Because of these limitations, several other matrix-free desorption/ionization approaches have been applied to the analysis of biofluids and tissues. Secondary ion mass spectrometry (SIMS) was one of the first matrix-free desorption/ionization approaches used to analyze metabolites from biological samples. SIMS uses a high-energy primary ion beam to desorb and generate secondary ions from a surface. The primary advantage of SIMS is its high spatial resolution (as small as 50 nm), a powerful characteristic for tissue imaging with MS. However, SIMS has yet to be readily applied to the analysis of biofluids and tissues because of its limited sensitivity at >500 Da and the analyte fragmentation generated by the high-energy primary ion beam. Desorption electrospray ionization (DESI) is a matrix-free technique for analyzing biological samples that uses a charged solvent spray to desorb ions from a surface. Advantages of DESI are that no special surface is required and the analysis is performed at ambient pressure with full access to the sample during acquisition. A limitation of DESI is spatial resolution, because "focusing" the charged solvent spray is difficult. However, a recent development termed laser ablation ESI (LAESI) is a promising approach to circumvent this limitation.


 * Nuclear magnetic resonance (NMR) spectroscopy. NMR is the only detection technique which does not rely on separation of the analytes, and the sample can thus be recovered for further analyses. All kinds of small molecule metabolites can be measured simultaneously - in this sense, NMR is close to being a universal detector. The main advantages of NMR are high analytical reproducibility and simplicity of sample preparation. Practically, however, it is relatively insensitive compared to mass spectrometry-based techniques.


 * Although NMR and MS are the most widely used techniques, other methods of detection that have been used include ion-mobility spectrometry, electrochemical detection (coupled to HPLC) and radiolabel (when combined with thin-layer chromatography).

Statistical methods
The data generated in metabolomics usually consist of measurements performed on subjects under various conditions. These measurements may be digitized spectra or a list of metabolite levels. In its simplest form this generates a matrix with rows corresponding to subjects and columns corresponding to metabolite levels. Several statistical programs are currently available for analysis of both NMR and mass spectrometry data. For mass spectrometry data, software is available that identifies molecules that vary between subject groups on the basis of mass and, depending on the experimental design, sometimes retention time. The first comprehensive software to analyze global mass spectrometry-based metabolomics datasets was developed by the Siuzdak laboratory at The Scripps Research Institute in 2006. This software, called XCMS, is freely available, has had over 20,000 downloads since its inception in 2006, and is one of the most widely cited mass spectrometry-based metabolomics software programs in the scientific literature. XCMS has now been surpassed in usage by a cloud-based version called XCMS Online. Other popular metabolomics programs for mass spectral analysis are MZmine, MetAlign and MathDAMP, which also compensate for retention time deviation during sample analysis. LCMStats is another R package for detailed analysis of liquid chromatography-mass spectrometry (LCMS) data and is helpful in the identification of co-eluting ions, especially isotopologues, from a complicated metabolic profile. It combines xcms package functions and can be used to apply many statistical functions, to correct detector saturation using Coates correction, and to create heat plots. Metabolomics data may also be analyzed by statistical projection (chemometrics) methods such as principal components analysis and partial least squares regression.
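The subject-by-metabolite matrix described above, and the idea of finding features that vary between subject groups, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure used by XCMS or any other named package: the feature labels (m/z and retention time pairs), the intensity values and the choice of Welch's t statistic as the ranking score are all hypothetical.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical feature table: each column is a feature labelled by
# (m/z, retention time in seconds); each row is one subject.
features = [(180.063, 312.0), (132.077, 95.5), (89.024, 60.2)]
control = [
    [10.0, 5.0, 7.0],
    [11.0, 5.2, 6.8],
    [9.5, 4.9, 7.1],
]
treated = [
    [20.0, 5.1, 7.0],
    [22.0, 5.0, 6.9],
    [19.0, 5.3, 7.2],
]

def column(matrix, j):
    """Extract column j (one feature across all subjects)."""
    return [row[j] for row in matrix]

def welch_t(a, b):
    """Welch's t statistic for two independent samples (b minus a)."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(b) - mean(a)) / se

def rank_features(control, treated, features):
    """Score every feature and sort by absolute t, largest first."""
    scored = []
    for j, label in enumerate(features):
        t = welch_t(column(control, j), column(treated, j))
        scored.append((label, t))
    return sorted(scored, key=lambda item: abs(item[1]), reverse=True)

ranking = rank_features(control, treated, features)
```

In this toy table only the first feature differs markedly between groups, so it ranks first. A real analysis would also need peak alignment, normalization and correction for testing many features at once, none of which is shown here.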

Once metabolic composition is determined, data reduction techniques can be used to elucidate patterns and connections. In many studies, including those evaluating drug toxicity and some disease models, the metabolites of interest are not known a priori. This makes unsupervised methods, those with no prior assumptions of class membership, a popular first choice. The most common of these methods is principal component analysis (PCA), which can efficiently reduce the dimensions of a dataset to the few that explain the greatest variation. When analyzed in the lower-dimensional PCA space, clustering of samples with similar metabolic fingerprints can be detected. This clustering can elucidate patterns and assist in the determination of disease biomarkers: metabolites that correlate most with class membership.
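The dimension reduction described above can be illustrated with a small sketch that finds only the first principal component. It assumes a hypothetical six-sample, three-metabolite matrix and uses power iteration on the covariance matrix in place of the singular value decomposition that a numerical library would normally use; all function names are illustrative.

```python
from math import sqrt

# Hypothetical data: rows are samples, columns are metabolite levels.
# The first three rows are one class, the last three another; the class
# labels are NOT given to the algorithm (PCA is unsupervised).
X = [
    [10.0, 5.0, 7.0],
    [11.0, 5.2, 6.8],
    [9.5, 4.9, 7.1],
    [20.0, 5.1, 7.0],
    [22.0, 5.0, 6.9],
    [19.0, 5.3, 7.2],
]

def center_columns(X):
    """Subtract each column's mean so every metabolite has mean zero."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    return [[row[j] - means[j] for j in range(p)] for row in X]

def covariance(Xc):
    """Sample covariance matrix of already-centered data."""
    n, p = len(Xc), len(Xc[0])
    return [[sum(Xc[i][a] * Xc[i][b] for i in range(n)) / (n - 1)
             for b in range(p)] for a in range(p)]

def first_component(C, iterations=200):
    """Approximate the leading eigenvector of C by power iteration.

    Assumes the starting vector is not orthogonal to that eigenvector.
    """
    p = len(C)
    v = [1.0 / sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(C[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def project(Xc, v):
    """Score of each sample along direction v."""
    return [sum(x * vj for x, vj in zip(row, v)) for row in Xc]

Xc = center_columns(X)
pc1 = first_component(covariance(Xc))
scores = project(Xc, pc1)
```

With these values the scores of the first three samples fall on one side of zero and those of the last three on the other, which is the kind of clustering the text refers to. The loadings in `pc1` indicate which metabolite contributes most to that separation.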

Key applications

 * Toxicity assessment/toxicology. Metabolic profiling (especially of urine or blood plasma samples) detects the physiological changes caused by toxic insult of a chemical (or mixture of chemicals). In many cases, the observed changes can be related to specific syndromes, e.g. a specific lesion in liver or kidney. This is of particular relevance to pharmaceutical companies wanting to test the toxicity of potential drug candidates: if a compound can be eliminated before it reaches clinical trials on the grounds of adverse toxicity, it saves the enormous expense of the trials.


 * Functional genomics. Metabolomics can be an excellent tool for determining the phenotype caused by a genetic manipulation, such as gene deletion or insertion. Sometimes this can be a sufficient goal in itself—for instance, to detect any phenotypic changes in a genetically-modified plant intended for human or animal consumption. More exciting is the prospect of predicting the function of unknown genes by comparison with the metabolic perturbations caused by deletion/insertion of known genes. Such advances are most likely to come from model organisms such as Saccharomyces cerevisiae and Arabidopsis thaliana. The Cravatt laboratory at The Scripps Research Institute has recently applied this technology to mammalian systems, identifying the N-acyltaurines as previously uncharacterized endogenous substrates for the enzyme fatty acid amide hydrolase (FAAH) and the monoalkylglycerol ethers (MAGEs) as endogenous substrates for the uncharacterized hydrolase KIAA1363.


 * Nutrigenomics is a generalised term which links genomics, transcriptomics, proteomics and metabolomics to human nutrition. In general, a metabolome in a given body fluid is influenced by endogenous factors such as age, sex, body composition and genetics, as well as underlying pathologies. The large bowel microflora are also a very significant potential confounder of metabolic profiles and could be classified as either an endogenous or exogenous factor. The main exogenous factors are diet and drugs. Diet can then be broken down into nutrients and non-nutrients. Metabolomics is one means to determine a biological endpoint, or metabolic fingerprint, which reflects the balance of all these forces on an individual's metabolism.

Environmental metabolomics

 * Environmental metabolomics is the application of metabolomics to characterise the interactions of organisms with their environment. This approach has many advantages for studying organism–environment interactions and for assessing organism function and health at the molecular level. As such, metabolomics is finding an increasing number of applications in the environmental sciences, ranging from understanding organismal responses to abiotic pressures, to investigating the responses of organisms to other biota. These interactions can be studied from individuals to populations, which can be related to the traditional fields of ecophysiology and ecology, and from instantaneous effects to those over evolutionary time scales, the latter enabling studies of genetic adaptation.