Applies to mineXpert2 8.1.1

# 1 Generalities

In this chapter, I wish to introduce some general concepts around the mineXpert2 program and the way data elements are named in this manual and in the program.

A mass spectrometry experiment generally involves monitoring the m/z value of analytes injected into the mass spectrometer over a certain time duration. The m/z value of each detected analyte is recorded along with the corresponding signal intensity i, so that a mass spectrum is nothing but a series of (m/z,i) pairs recorded over the acquisition duration. The precise moment at which a given analyte is detected (and its (m/z,i) pair is recorded) is called the retention time of that analyte (rt). This retention time is not to be confused with the drift time of that analyte in an ion mobility mass spectrometry experiment.

## 1.1 Citing the mineXpert2 software

Please cite the software using the following citation: mineXpert2: Full-Depth Visualization and Exploration of MSn Mass Spectrometry Data. Olivier Langella and Filippo Rusconi. Journal of the American Society for Mass Spectrometry (check for pages, as this reference was published ahead of print). DOI: 10.1021/jasms.0c00402.

## 1.2 General concepts and terminologies

Most generally, the mass spectrometer acquires a large number of spectra in, say, one second. All these spectra are combined together, however, and, on the surface, the massist only sees a slow acquisition of 1 spectrum per second. This apparent acquisition rate is configurable. At the time of writing, 1 spectrum per second is generally recorded on disk. So, if we record mass spectra for 5 minutes, we will have recorded 5 × 60 = 300 spectra.

## 1.3 Acquiring Mass Data Along Time: To Profile or Not To Profile?

As a mass spectrometry user, the reader of this manual certainly has used mass spectrometers where mass spectra are acquired and stored in different ways:

• Mass spectra are acquired and summed, each new one to the previous ones, so that one is left, at the end of the acquisition, with a single spectrum whose peak intensities have been increasing all along the acquisition. Indeed, in this mode, each new spectrum is combined with the previously acquired ones. The resulting mass spectrum that is displayed on screen and that ultimately gets stored on disk is called a combined spectrum. This is typically the way MALDI-TOF mass spectrometers are used when manually acquiring data from samples deposited onto sample plates. We refer to this kind of acquisition as an accumulation mode acquisition;

• Mass spectra are acquired and stored on disk as a single file containing all the spectra, appended one after the other. There is no combination of the spectra: each time a new spectrum is displayed on screen, that spectrum is appended to the file.[4] This is typically the case when mass spectra are acquired all along a chromatography run and is generally called a profile mode acquisition. Note that this profile mode acquisition must not be mistaken for the profile mass peak type, which is the opposite of the centroid mass peak type.

## 1.4 Mass Data Visualisation: To Combine or Not To Combine?

In the previous section, we mentioned spectrum combination a number of times. What does it mean for spectra to be combined into a single combined spectrum? Say we have 200 spectra that need to be combined into a single spectrum that summatively represents the data of these 200 spectra.

First, a new spectrum would be allocated (result spectrum), entirely empty at first. Then, the very first spectrum of the 200 spectra is literally copied into that result spectrum. At this point the combination occurs, according to an iterative process that has the following steps:

• Pick the next spectrum of the 200-spectra dataset;

1. Pick the first (m﻿/﻿z,i) pair of the currently iterated spectrum;

2. Check whether an m/z value identical to that of the current (m/z,i) pair is already present in the result spectrum;

3. If the m﻿/﻿z value is found, increment its intensity by the intensity of the (m﻿/﻿z,i) pair;

4. Else, if the m﻿/﻿z value is not found, add the current (m﻿/﻿z,i) pair to the result spectrum;

5. Iterate over all the remaining (m﻿/﻿z,i) pairs of the current spectrum and redo these steps.

• Iterate over all the 198 remaining spectra of the dataset and do the steps above for each single iterated spectrum.

At the end of the two nested loops above, the combined spectrum is still a single spectrum that summatively represents all the 200 spectra. This whole process is computationally very intensive, in particular if:

• The m/z range is large: there are lots of points in each spectrum, which means that for each new (m/z,i) pair we need to iterate through the long list of m/z values that make up the result spectrum;

• The resolving power of the mass spectrometer is high: there are many points per m﻿/﻿z range unit.
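The two nested loops described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the actual mineXpert2 code: the `Spectrum` type and the `combine_spectra` function are hypothetical names, and m/z values are compared for strict equality, exactly as in the steps above. Starting from an empty result spectrum and adding every spectrum to it is equivalent to first copying the first spectrum and then adding the remaining ones.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical representation: a spectrum is a list of (m/z, i) pairs.
Spectrum = List[Tuple[float, float]]


def combine_spectra(spectra: Iterable[Spectrum]) -> Spectrum:
    """Sum a set of spectra into a single combined spectrum."""
    result: Dict[float, float] = {}  # the result spectrum, empty at first
    for spectrum in spectra:  # outer loop: each spectrum of the dataset
        for mz, intensity in spectrum:  # inner loop: each (m/z, i) pair
            # m/z already present: increment its intensity;
            # m/z not found: add the pair to the result spectrum.
            result[mz] = result.get(mz, 0.0) + intensity
    return sorted(result.items())


spectra = [
    [(100.0, 10.0), (200.0, 5.0)],
    [(100.0, 2.0), (300.0, 7.0)],
    [(200.0, 1.0)],
]
combined = combine_spectra(spectra)
print(combined)  # [(100.0, 12.0), (200.0, 6.0), (300.0, 7.0)]
```

The sketch stores the result spectrum in a dictionary, so that looking up an m/z value does not require walking through the whole list of points. This is a design choice of the example only; it says nothing about how the lookup is implemented in the program.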

When a profile mode acquisition is performed, the user gets a very large number of distinct spectra, all appended to a single file. These unitary spectra are virtually unusable unless an initial processing step is performed. This initial processing of the spectra is called total ion current chromatogram calculation.

What is it? Let's say that the user has performed a profile mode mass spectrometry acquisition on the eluate of a chromatography column. Now, imagine that the spectrometer stores the mass data at a rate of one spectrum per second and that the chromatography gradient develops over 45 min: there would be a total of 45 × 60 = 2700 spectra in that file. The question is: how can we provide the user with a data representation that is both meaningful and useful to start exploring the data? The conventional way of doing so is to load all the mass spectra and compute the total ion current chromatogram (the TIC chromatogram).

The analogy with chromatography is evident: the TIC chromatogram is the same as the UV chromatogram, except that optical density is not the physical property that is measured over time; instead, the amount of ions detected in the mass spectrometer is measured over time. That amount is actually the sum of the intensities of all the (m/z,i) pairs detected in each spectrum: for each retention time (RT), a TIC value is computed by summing the intensities of all the (m/z,i) pairs detected at that specific RT. When mass data are acquired during a chromatography run, the total ion current chromatogram often mirrors (mimics) the UV chromatogram[5].

How is this total ion current chromatogram computed? This is an iterative process: for each spectrum, from the first one (retention time value 0 s), through the second one (retention time value 1 s), up to the last one (retention time 45 min), the program computes the sum of the intensities of all the spectrum's (m/z,i) pairs. That computation ends up with a map that relates each RT value to the corresponding TIC value. The TIC chromatogram is nothing but a plot of the TIC values as a function of RT values. In that sense, it is indeed a chromatogram.
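The iterative computation described above can be illustrated with a minimal Python sketch. The data layout (a list of `(rt, spectrum)` tuples) and the `tic_chromatogram` function are assumptions made for the sake of the example; they are not taken from the mineXpert2 source code.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: a spectrum is a list of (m/z, i) pairs.
Spectrum = List[Tuple[float, float]]


def tic_chromatogram(run: List[Tuple[float, Spectrum]]) -> Dict[float, float]:
    """Build the map relating each RT value to its TIC value."""
    tic: Dict[float, float] = {}
    for rt, spectrum in run:
        # The TIC value of a spectrum is the sum of all its intensities.
        spectrum_tic = sum(intensity for _, intensity in spectrum)
        # Accumulate, in case several spectra share the same RT value.
        tic[rt] = tic.get(rt, 0.0) + spectrum_tic
    return tic


# Three spectra stored at RT 0 s, 1 s and 2 s (the last one is empty).
run = [
    (0.0, [(100.0, 10.0), (200.0, 5.0)]),
    (1.0, [(100.0, 2.0), (300.0, 7.0)]),
    (2.0, []),
]
tic = tic_chromatogram(run)
print(tic)  # {0.0: 15.0, 1.0: 9.0, 2.0: 0.0}
```

Plotting the values of this map against its keys yields the TIC chromatogram.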

mineXpert2 works exactly in this way. When mass spectrometry data are loaded from a file, the TIC chromatogram is computed and displayed. This TIC chromatogram serves as the basis for the mass data exploration, as described in this manual. It is also the starting point for spectral combinations that can be performed in various ways. Not all of these are formally combinations, which is why I prefer the term integrations. Some of these integrations are described below:

• Integrating data from the TIC chromatogram to a single mass spectrum;

• Integrating data from the TIC chromatogram to a single drift spectrum.

Note that the reverse actions are possible (and indeed necessary for a thorough data exploration): selecting a region of a mass spectrum and asking that the TIC chromatogram be reconstituted from there; or selecting a region of a drift spectrum and asking that the TIC chromatogram be reconstituted from there also. Finally, integrations may, of course, be performed from a mass spectrum to a drift spectrum, and vice versa.

## 1.5 Examples of Various Mass Spectral Data Integrations

In the sections below, the inner workings of mineXpert2 are described for some exemplary mass data integrations. For example, when doing ion mobility mass spectrometry data exploration, it is essential to be able to characterize as finely as possible the drift time of each and every analyte. Since each analyte is actually defined as one or more (m/z,i) pairs, it is essential to be able to ask questions like the following:

• What is the drift time of the ions below this mass peak?

• What are all the drift times of all the analytes going through the mobility cell for a given retention time range?

• What are all the ions that are responsible for this shoulder in the drift spectrum?

### 1.5.1 TIC -> MZ integration

What computation does mineXpert2 actually do when a mass spectrum is computed starting from a TIC chromatogram region, say between retention time RT minute 7 and RT minute 8.5?

1. List all the mass spectra that were acquired between RT 7 and RT 8.5; this list is the spectral set. It might contain many hundreds of spectra, considering that, in ion mobility mass spectrometry, ≈ 200 spectra are acquired and stored individually every second (I mean it, every 1 s time lapse);

2. Allocate a new empty spectrum—the combined spectrum—and copy into it without modification the first spectrum of the spectral set;

3. Go to the next spectrum of the spectral set and iterate into each (m﻿/﻿z,i) pair:

• Check if the m/z value of the iterated pair is already present in the combined spectrum. If so, increment the intensity of the matching (m/z,i) pair of the combined spectrum by the intensity of the iterated (m/z,i) pair. If not, simply copy the iterated (m/z,i) pair into the combined spectrum;

• Iterate over all the remaining (m﻿/﻿z,i) pairs and perform the same action as above.

4. Iterate over all the remaining spectra of the spectral set and perform step number 3.

mineXpert2 then displays the combined spectrum.
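As a rough illustration of the steps above, the following Python sketch selects the spectra of a retention time range and combines them. The `MassSpectrum` type and the `integrate_to_mz` function are hypothetical names. Retention times are assumed to be expressed in minutes, and both range borders are assumed to be inclusive.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple


class MassSpectrum(NamedTuple):
    """Hypothetical record for one stored spectrum."""
    rt: float  # retention time, in minutes
    peaks: List[Tuple[float, float]]  # (m/z, i) pairs


def integrate_to_mz(
    run: List[MassSpectrum], rt_start: float, rt_end: float
) -> List[Tuple[float, float]]:
    """Combine all the spectra acquired in [rt_start, rt_end]."""
    # Step 1: list the spectra of the RT range (the spectral set).
    spectral_set = [s for s in run if rt_start <= s.rt <= rt_end]
    # Steps 2 to 4: accumulate every (m/z, i) pair of every spectrum.
    combined: Dict[float, float] = defaultdict(float)
    for spectrum in spectral_set:
        for mz, intensity in spectrum.peaks:
            combined[mz] += intensity
    return sorted(combined.items())


run = [
    MassSpectrum(6.9, [(500.0, 100.0)]),
    MassSpectrum(7.0, [(500.0, 1.0), (600.0, 2.0)]),
    MassSpectrum(8.0, [(500.0, 3.0)]),
    MassSpectrum(8.5, [(600.0, 4.0), (700.0, 5.0)]),
    MassSpectrum(9.0, [(700.0, 100.0)]),
]
mass_spectrum = integrate_to_mz(run, 7.0, 8.5)
print(mass_spectrum)  # [(500.0, 4.0), (600.0, 6.0), (700.0, 5.0)]
```

The spectra stored at RT 6.9 and RT 9.0 fall outside of the selected region and do not contribute to the combined spectrum.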

### 1.5.2 TIC -> DT integration

What computation does mineXpert2 actually do when a drift spectrum is computed starting from a given TIC chromatogram region, say between retention time RT minute 7 and RT minute 8.5?

What is a drift spectrum? A drift spectrum (mobilogram) is a plot where the cumulated ion current of the detected ions is plotted against the drift time at which they were detected. Let's see how that computation is handled in mineXpert2:

1. Create a map to store all the (drift time, intensity) pairs that are to be computed below, the (dt,i) map;

2. List all the mass spectra that were acquired between RT 7 and RT 8.5. The obtained list of mass spectra is called the spectral set;

3. Go to the first spectrum of the spectral set and compute its TIC value (sum of all the intensities of all the (m﻿/﻿z,i) pairs of that spectrum). Get the drift time value at which this mass spectrum was acquired. We thus have a value pair: (dt, i), that is, for drift time dt, the intensity of the total ion current is i;

At this point, we need to make a short digression: we saw earlier that, at the time of this writing, one of the commercial instruments on which the author of these lines does his experiments stores 200 spectra each second. These 200 spectra actually correspond to the way the drift cycle is divided into 200 time bins. That means that in the retention time range [7–8.5], there are 1.5 × 60 = 90 complete drift cycles. And thus there are 90 spectra with drift time x, the same number of spectra with drift time y, and so on for the remaining 198 time bins. Of course, a large number of these spectra might be almost empty, but these spectra are there and we need to cope with them.

The paragraph above must thus lead to one question about the current (dt,i) pair: has the current dt value been seen before, during the previous iterations in this loop? If not, then create the (dt, i) pair and add it to the (dt,i) map; if yes, get the dt element in the map and increment its intensity value by the TIC value computed above;

4. Iterate over all the remaining spectra of the spectral set and perform step number 3.

At the end of the loop above, we get a map in which each item relates a given drift time to a TIC value. This can be understood this way: for each drift time value, what is the accumulated ion current of all the ions having that specific drift time?

At this point, mineXpert2 displays the drift spectrum (mobilogram).
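The construction of the (dt,i) map described above may be sketched as follows. Again, this is an illustrative Python example under stated assumptions, not the program's implementation: the `MobilitySpectrum` type and the `integrate_to_dt` function are hypothetical names, retention times are in minutes, and the borders of the RT range are inclusive.

```python
from typing import Dict, List, NamedTuple, Tuple


class MobilitySpectrum(NamedTuple):
    """Hypothetical record for one spectrum of an ion mobility run."""
    rt: float  # retention time, in minutes
    dt: float  # drift time (time bin) at which the spectrum was acquired
    peaks: List[Tuple[float, float]]  # (m/z, i) pairs


def integrate_to_dt(
    run: List[MobilitySpectrum], rt_start: float, rt_end: float
) -> Dict[float, float]:
    """Build the (dt, i) map for the spectra acquired in [rt_start, rt_end]."""
    dt_map: Dict[float, float] = {}  # step 1: the (dt, i) map
    for spectrum in run:
        if not rt_start <= spectrum.rt <= rt_end:
            continue  # step 2: keep only the spectral set
        # Step 3: TIC value of the current spectrum.
        tic = sum(intensity for _, intensity in spectrum.peaks)
        # dt never seen before: create the pair; otherwise increment it.
        dt_map[spectrum.dt] = dt_map.get(spectrum.dt, 0.0) + tic
    return dt_map


run = [
    MobilitySpectrum(7.0, 1.0, [(500.0, 1.0), (600.0, 2.0)]),
    MobilitySpectrum(7.0, 2.0, [(500.0, 4.0)]),
    MobilitySpectrum(8.0, 1.0, [(700.0, 5.0)]),
    MobilitySpectrum(8.0, 2.0, []),
    MobilitySpectrum(9.0, 1.0, [(500.0, 100.0)]),
]
mobilogram = integrate_to_dt(run, 7.0, 8.5)
print(mobilogram)  # {1.0: 8.0, 2.0: 4.0}
```

Note that the empty spectrum (RT 8.0, drift time 2.0) is processed like any other one: it simply contributes a TIC value of zero to its time bin.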

### 1.5.3 Using multiple threads during mass data integrations

Whenever the computer has multiple threads available for the integrations to be performed using parallel execution, mineXpert2 manages to use all these available threads. It is possible to limit the number of threads used for the integration computations, as described for the Preferences menu in Section 2.3, “The Main Program Window Menu”.
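How the work is actually distributed among threads inside mineXpert2 is not detailed here. The Python sketch below only illustrates one common pattern for parallelizing a spectral combination, under the assumption that the spectral set can be split into independent chunks: each thread combines its own chunk into a partial spectrum, and the partial spectra are merged at the end. The `combine_chunk` and `combine_parallel` functions and the `max_threads` parameter are hypothetical names.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

# Hypothetical representation: a spectrum is a list of (m/z, i) pairs.
Spectrum = List[Tuple[float, float]]


def combine_chunk(chunk: List[Spectrum]) -> Counter:
    """Combine the spectra of one chunk into a partial spectrum."""
    partial: Counter = Counter()
    for spectrum in chunk:
        for mz, intensity in spectrum:
            partial[mz] += intensity
    return partial


def combine_parallel(
    spectra: List[Spectrum], max_threads: int
) -> List[Tuple[float, float]]:
    """Combine spectra using at most max_threads worker threads."""
    # One chunk per thread: spectrum k goes to chunk k modulo max_threads.
    chunks = [spectra[k::max_threads] for k in range(max_threads)]
    total: Counter = Counter()
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        for partial in pool.map(combine_chunk, chunks):
            total.update(partial)  # merge step: intensities are added
    return sorted(total.items())


spectra = [
    [(100.0, 1.0)],
    [(100.0, 2.0), (200.0, 3.0)],
    [(200.0, 4.0)],
    [(300.0, 5.0)],
]
combined = combine_parallel(spectra, max_threads=2)
print(combined)  # [(100.0, 3.0), (200.0, 7.0), (300.0, 5.0)]
```

In Python, threads do not speed up this kind of computation much, so the example is meant to show the split-and-merge logic and the effect of a thread limit, not to demonstrate a performance gain.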

## 1.6 Installation of the software

The installation material is available at http://msxpertsuite.org.

### 1.6.1 Installation on MS Windows and macOS systems

The installation of the software is extremely easy on the MS-Windows and macOS platforms. In both cases, the installation programs are standard and require no explanation.

### 1.6.2 Installation on Debian- and Ubuntu-based systems

The installation on Debian- and Ubuntu-based GNU/Linux platforms is also extremely easy (even more than in the above situations). mineXpert2 is indeed packaged and released in the official repositories of these distributions and the only command to run to install it is:

$ sudo apt install <package_name>

In the command above, $ is the shell prompt[6] and the typical package_name is in the form minexpert2 for the program package and minexpert2-doc for the user manual package. Once the package has been installed, the program shows up in the Science menu. It can also be launched from the shell using the following command:

$ minexpert2

###### Tip

If the Debian system onto which the program is to be installed is older than testing, that is, older than Buster (Debian 10), then using the AppImage program bundle might be a solution. See below for the method to run mineXpert2 as an AppImage bundle.

### 1.6.3 Installation with an AppImage software bundle

The AppImage software bundle format allows one to easily run a software program on any GNU/Linux-based distribution. From http://appimage.org/:

 The key idea of the AppImage format is one app = one file. Every AppImage contains an app and all the files the app needs to run. In other words, each AppImage has no dependencies other than what is included in the targeted base operating system(s). --Simon Peter

AppImage software bundles are available for download for the various mineXpert2 versions. As of writing, the software bundle has been tested on CentOS version 8.3.2011 and on Fedora version 22. These are pretty old distribution versions and thus mineXpert2 should also run on more recent versions of these computing platforms. The AppImage bundle of mineXpert2 was created on a rather current Debian version: the testing Debian 11-to-be distribution.

In order to run the mineXpert2 software AppImage bundle, download the latest version (like mineXpert2-0.7.4-x86_64.AppImage). Once the file has been downloaded to the desired directory, change to that directory and change the permissions to make it executable:

$ chmod a+x mineXpert2-0.7.4-x86_64.AppImage

Finally, execute the file that has become a normal program:

$ ./mineXpert2-0.7.4-x86_64.AppImage

###### Tip

If the program complains about a locale not being found, please modify the command line to read:

$ LC_ALL="C" ./mineXpert2-0.7.4-x86_64.AppImage

## 1.7 Building the software from source

The mineXpert2 software build is under the control of the CMake build system. There are a number of dependencies to install prior to trying to build the software, as described below.

### 1.7.1 The dependencies required to build minexpert2

The dependencies to be installed are listed here with package names matching the packages that are in Debian/Ubuntu. In RPM-based distributions, the package names are most often similar, albeit with some slight differences.

###### Dependencies

• The build system: cmake

• Conversion of svg files to png files: graphicsmagick-imagemagick-compat

• For the parallel computations: libgomp1

• For the isotopic cluster calculations: libisospec++-dev

• For all the raw mass calculations like the data model, the mass spectral combinations…: libpappsomspp-dev, libpappsomspp-widget-dev

• For all the plotting: libqcustomplot-dev

• For the C++ objects (GUI and non-GUI): qtbase5-dev, libqt5svg5-dev, qttools5-dev-tools, qtchooser

• For the man page: docbook-to-man

• For the documentation (optional, with -DMAKE_USER_MANUAL=1 as a flag to the call of cmake, see below): daps, libjeuclid-core-java, libjeuclid-fop-java, docbook-mathml, libjs-jquery, libjs-highlight.js, libjs-mathjax, fonts-mathjax, fonts-mathjax-extras, texlive-fonts-extra, fonts-ebgaramond-extra

### 1.7.2 Getting the source tarball

In the example below, the version of the software to be installed is 7.3.0. Replace that version with any later version of interest, which can be looked for at https://gitlab.com/msxpertsuite/minexpert2/-/releases.

#### 1.7.2.1 Using git

The rather convoluted command below only downloads the branch of interest. The whole git repository is very large…

$ git clone https://gitlab.com/msxpertsuite/minexpert2.git --branch master/7.3.0-1 --single-branch minexpert2-7.3.0

Alternatively, download the source tarball with wget:

wget https://gitlab.com/msxpertsuite/minexpert2/-/archive/7.3.0/minexpert2-7.3.0.tar.gz

Untar the tarball, which creates the minexpert2-7.3.0 directory:

tar xvzf minexpert2-7.3.0.tar.gz

### 1.7.3 Building of the software

Change directory:

$ cd minexpert2-7.3.0

Create a build directory:

$ mkdir build

Change directory:

$ cd build

Configure the build:

$ cmake ../ -DCMAKE_BUILD_TYPE=Release

Build the software:

$ make

[4] There certainly is, however, spectrum combination going on in the guts of the software, because the system actually acquires many more spectra than are visible on screen, and each newly displayed spectrum is actually the combination of many spectra acquired under the surface.

[5] Unless the eluted analytes do absorb UV light but fail to desorb/desolvate, to ionize, or both.

[6] The prompt character might be % in some shells, like zsh.