RMassBank demonstration and record generation

Tobias Schulze edited this page Aug 30, 2023 · 3 revisions

Based on the RMassBank vignette document.

Instructions:

  • Open RStudio
  • Open a new R Script
  • Copy and paste the code chunks to the document
  • Run the code chunks

Load the libraries

# Load libraries
library(RMassBank)
library(RMassBankData)

Open and review the vignettes. These are tutorial files.

# Review the Vignettes (advanced tutorial files)
browseVignettes("RMassBank")

Working directory

RMassBank requires a working directory to store and read data.

working_directory <- "c:/massbank" # use forward slashes in R paths, not single backslashes

# Create the directory if it does not exist yet, then switch to it
if (!dir.exists(working_directory)) {
    dir.create(working_directory)
}
setwd(working_directory)

getwd()
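The path above is Windows-specific. As a minimal sketch of a more portable variant, the chunk below wraps the same steps in a small helper. The helper name `use_working_directory` and the temporary example location are assumptions for illustration, not part of RMassBank.

```r
# Hypothetical helper: create a directory (including parents) and switch to it.
use_working_directory <- function(path) {
  # recursive = TRUE also creates missing parent folders;
  # showWarnings = FALSE keeps the call quiet if the folder already exists.
  dir.create(path, recursive = TRUE, showWarnings = FALSE)
  setwd(path)
  invisible(getwd())
}

# Example location inside the session's temporary folder;
# replace it with e.g. "~/massbank" for a persistent folder.
working_directory <- file.path(tempdir(), "massbank")
use_working_directory(working_directory)
getwd()
```

`file.path()` builds the path with the right separator on every platform, so the same script can run on Windows, macOS and Linux.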

Load the compound list and the settings

Let's load the demonstration compound list and settings.

# Load the compound list and the settings file
file.copy(system.file("list/NarcoticsDataset.csv", 
                      package="RMassBankData"), "./Compoundlist.csv")

RmbSettingsTemplate("mysettings.ini")

Review and edit the settings file

Open your working directory and review the compound list (Compoundlist.csv) and the settings file (mysettings.ini).
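The compound list can also be checked from within R. The sketch below reports which expected columns are missing from a CSV file. The column names are an assumption based on the RMassBank vignette (please verify them there), and `missing_columns` is a hypothetical helper, not an RMassBank function.

```r
# Assumed required columns (taken from the RMassBank vignette; verify there).
required_columns <- c("ID", "Name", "SMILES", "RT", "CAS")

# Hypothetical helper: return the required columns missing from a compound list.
missing_columns <- function(path, required = required_columns) {
  compounds <- read.csv(path, stringsAsFactors = FALSE, check.names = FALSE)
  setdiff(required, colnames(compounds))
}

# Only runs if the compound list was copied to the working directory.
if (file.exists("./Compoundlist.csv")) {
  print(missing_columns("./Compoundlist.csv"))
}
```

An empty result (`character(0)`) means that all assumed columns are present.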

Replace the prefix with an arbitrary one and reload the settings

If the settings file was changed, it can be read in again with:

loadRmbSettings("mysettings.ini")
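As an alternative to editing the file, the prefix can probably be changed for the current session only. This sketch assumes that the loaded settings are stored in `getOption("RMassBank")` and that the prefix lives in `annotations$entry_prefix`; check both against your settings file. `with_entry_prefix` is a hypothetical helper and "ZZ" is an arbitrary example value.

```r
# Hypothetical helper: return a copy of a settings list with a new entry prefix.
with_entry_prefix <- function(settings, prefix) {
  settings$annotations$entry_prefix <- prefix
  settings
}

# Only runs if RMassBank is installed and the settings file exists.
if (requireNamespace("RMassBank", quietly = TRUE) && file.exists("mysettings.ini")) {
  RMassBank::loadRmbSettings("mysettings.ini")
  rmbo <- getOption("RMassBank")        # assumed location of the loaded settings
  print(rmbo$annotations$entry_prefix)  # prefix as read from the file
  options(RMassBank = with_entry_prefix(rmbo, "ZZ"))  # session-only change
}
```

A change made this way is not written back to mysettings.ini, so it is lost when the settings are reloaded.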

MSMS workflow

# Create a MSMSworkspace
w <- newMsmsWorkspace()

Now, we can load the demonstration data from RMassBankData.

# load example files
files <- list.files(system.file("spectra", package="RMassBankData"),
                    ".mzML", full.names = TRUE)
basename(files)

Here, we focus on a few compounds. In the hands-on session, try all compounds: create a new MSMSworkspace and assign w@files <- files instead of running this code chunk.

# for demonstration use only two compounds
w@files <- files[1:2]

Load your custom or the demonstration compound list into the workspace

# load the compound list in the workspace w
loadList("./Compoundlist.csv")

Run the first part of the workflow (re-calibration)

# We can run it in one step:
w <- msmsWorkflow(w, mode="pH", steps=c(1:4), archivename = 
                      "pH_narcotics")

# Plot the recalibration
plotRecalibration(w)

Run the second part of the workflow (spectra aggregation and failed peaks analysis)

w <- msmsWorkflow(w, mode="pH", steps=c(5:8), archivename = 
                      "pH_narcotics")

The MSMSworkflow is finished.

The MassBank record workflow

Get annotations

As it was before:

mb <- newMbWorkspace(w)
mb <- loadInfolists(mb, system.file("infolists", package="RMassBankData"))
mb <- mbWorkflow(mb, steps=c(1:3))

Create records

Part 1: compiling the information

Step 4 compiles all metadata into the @info slot of every child item:

mb4 <- mbWorkflow(mb, 4)
# The entire record is in here:
mb4@compiled[[1]]@children[[3]]@info
# Note also: 
mb4@compiled_ok
mb4@compiled_notOk

The magic is here:

https://github.com/MassBank/RMassBank/blob/17c69e1cc69ef80d18fc0e45f02a3a2ff43549b6/R/createMassBank.R#L257-L267

More or less, for every item in mb@spectra, buildRecord is called. All "augmented" spectra are then put in the @compiled slot. Note, the additionalPeaks (anyone ever use those?) are written into the spectrum here:

res <- buildRecord(r, mbdata=mbdata, additionalPeaks=mb@additionalPeaks, filter = filterOK & best)

The entire buildRecord code is in

https://github.com/MassBank/RMassBank/blob/17c69e1cc69ef80d18fc0e45f02a3a2ff43549b6/R/buildRecord.R

Part 2: flattening to a text file

Step 5 does nothing at all, and hopefully step 6 soon won't either!

mb5 <- mbWorkflow(mb4, 5) # does nothing
mb6 <- mbWorkflow(mb5, 6) # should be removed, since molfiles are deprecated

The magic happens in step 7, using the function toMassbank:

https://github.com/MassBank/RMassBank/blob/17c69e1cc69ef80d18fc0e45f02a3a2ff43549b6/R/createMassBank.R#L1679

https://github.com/MassBank/RMassBank/blob/17c69e1cc69ef80d18fc0e45f02a3a2ff43549b6/R/createMassBank.R#L1481

mb7 <- mbWorkflow(mb6, 7) # flattens and exports
# does more or less: files <- toMassbank(mb6@compiled)

This can be done for individual RmbSpectrum2 objects, and you can compare the @info slot with the created text record:

info <- mb6@compiled[[2]]@children[[2]]@info
record <- toMassbank(mb6@compiled[[2]]@children[[2]])
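One way to make this comparison is to list the tag names that occur in the text record and hold them against the names in @info. The sketch below assumes the record is a character vector with lines of the form "TAG: value". `record_tags` is a hypothetical helper, and the example lines are made up for illustration.

```r
# Hypothetical helper: extract tag names from record lines of the form "TAG: value".
record_tags <- function(record) {
  # Keep lines that start with an upper-case tag followed by ": ".
  tagged <- grep("^[A-Z][A-Z0-9$_]*: ", record, value = TRUE)
  # Drop everything from the first ": " onwards and remove duplicates.
  unique(sub(": .*$", "", tagged))
}

# Made-up record lines for illustration only (not real MassBank data).
example_record <- c(
  "ACCESSION: XX000000",
  "CH$NAME: Example compound",
  "CH$NAME: Another name",
  "PK$NUM_PEAK: 2",
  "//"
)
record_tags(example_record)

# With the objects from the chunk above (assuming `record` is a character vector):
# setdiff(names(info), record_tags(record))
```

Entries of @info that are nested lists may appear in the record under different tag lines, so a non-empty difference is not necessarily an error.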

But in principle, any RmbSpectrum2 will print into the record format! Of course, only what is in @info will go into the record, which usually is not much unless the mbWorkflow is run.

minirecord <- toMassbank(w@spectra[[2]]@children[[3]])

SessionInfo

Run this to get information on your R version and installed packages. Including it in questions or bug reports makes communication easier.

sessionInfo()
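To share this information, it can help to save it to a text file. A minimal sketch follows; the file name sessionInfo.txt is an arbitrary choice.

```r
# Write the session information to a text file in the working directory.
session_file <- "sessionInfo.txt"  # arbitrary example file name
writeLines(capture.output(sessionInfo()), session_file)

# Print only the RMassBank version, if the package is installed.
if (requireNamespace("RMassBank", quietly = TRUE)) {
  print(packageVersion("RMassBank"))
}
```

The resulting file can be attached to an issue or an e-mail.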