Part 4: multi-strain analysis

Once gempipe derive has produced a strain-specific GSMM for each input strain, a multi-strain analysis can be performed. The Gempipe API includes functions for simple multi-strain analyses. Specifically, it is currently possible to:

  • cluster strains according to their metabolic potential, creating a “phylometabolic tree”, i.e. a dendrogram built exclusively using metabolic features.

  • visually compare these data-driven clusters against user-specified strain metadata, for example the environmental niche or the species of origin.

  • extract the metabolic features characterizing each cluster.
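
To make the idea behind the first and third points concrete, here is a minimal, self-contained sketch. It does not use the Gempipe API and is not how Gempipe implements these steps: the strain names, feature names, the choice of Jaccard distance and the average-linkage merging are all assumptions made for illustration. It uses only the Python standard library; SciPy's scipy.cluster.hierarchy module provides equivalent linkage methods for real datasets.

```python
from itertools import combinations

# Hypothetical toy data: strain -> set of features that are present (value 1).
presence = {
    "strainA": {"R1", "R2"},
    "strainB": {"R1", "R2", "R3"},
    "strainC": {"R3", "R4"},
    "strainD": {"R2", "R3", "R4"},
}

def jaccard_distance(a, b):
    """Return 1 - |intersection| / |union|; two empty sets count as identical."""
    union = a | b
    return 0.0 if not union else 1.0 - len(a & b) / len(union)

def average_linkage(presence, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    clusters = [frozenset([strain]) for strain in sorted(presence)]

    def dist(c1, c2):
        # Average of all pairwise strain-to-strain distances between clusters.
        pairs = [(x, y) for x in c1 for y in c2]
        total = sum(jaccard_distance(presence[x], presence[y]) for x, y in pairs)
        return total / len(pairs)

    while len(clusters) > n_clusters:
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: dist(*pair))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return clusters

def characteristic_features(presence, cluster):
    """Features present in every strain of the cluster and in no other strain."""
    inside = set.intersection(*(presence[s] for s in cluster))
    outside = set().union(*(presence[s] for s in presence if s not in cluster))
    return inside - outside

clusters = average_linkage(presence, n_clusters=2)
for cluster in clusters:
    print(sorted(cluster), sorted(characteristic_features(presence, cluster)))
```

On this toy input, strainA and strainB end up in one cluster (characterized by R1) and strainC and strainD in the other (characterized by R4). The strict "present in all members, absent elsewhere" rule is only one possible definition of a characterizing feature; a frequency threshold would be a softer alternative.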

The Gempipe API currently handles any number of binary metabolic features, which only represent the presence (1) or absence (0) of specific capabilities. These kinds of features include, for example, the presence of metabolic reactions, the ability to catabolize alternative substrates, or the presence of auxotrophies for amino acids and vitamins. They are provided by gempipe derive in dedicated tables: rpam.csv, cnps.csv and aux.csv, respectively (see Output files). These tables represent three distinct “layers” of binary metabolic features; however, the Gempipe API can handle any number of layers, if provided.
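
As a rough illustration of what stacking several layers means, the sketch below merges three toy layers into a single binary feature set per strain. It is not the Gempipe API and makes no assumption about the actual layout of rpam.csv, cnps.csv or aux.csv: the in-memory dictionaries, the feature names, the stack_layers helper and the "layer:feature" naming scheme are all hypothetical.

```python
# Hypothetical toy layers: layer name -> {strain -> {feature -> 0 or 1}}.
layers = {
    "rpam": {"strainA": {"R1": 1, "R2": 0}, "strainB": {"R1": 1, "R2": 1}},
    "cnps": {"strainA": {"glc": 1}, "strainB": {"glc": 0}},
    "aux": {"strainA": {"his": 0}, "strainB": {"his": 1}},
}

def stack_layers(layers):
    """Merge all layers into one {strain -> {"layer:feature" -> 0 or 1}} mapping."""
    combined = {}
    for layer_name, table in layers.items():
        for strain, row in table.items():
            for feature, value in row.items():
                # Only presence (1) / absence (0) values are meaningful here.
                if value not in (0, 1):
                    raise ValueError(
                        f"non-binary value {value!r} for {feature} in {layer_name}"
                    )
                # Prefix with the layer name so equal feature names cannot collide.
                combined.setdefault(strain, {})[f"{layer_name}:{feature}"] = value
    return combined

combined = stack_layers(layers)
print(combined["strainA"])
```

Because every layer is binary, adding a further layer only adds more keys per strain; downstream steps that work on presence/absence data do not need to change.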

Tutorials are available to show the usage of some of the Gempipe API functions: