Deconvoluting Natural Product Libraries Using High-content Screening

Primary screening with natural product extracts is uncommon because the extracts are essentially pools of unknown compounds: hit rates can be very high, with no clear path to teasing out the unique compounds of interest.  Last month Dr. Roger Linington and his group at U.C. Santa Cruz published "Integration of high-content screening and untargeted metabolomics for comprehensive functional annotation of natural product libraries" in PNAS, where they describe the Compound Activity Mapping platform.  Using high-content screening, they create synthetic phenotypic fingerprints for every active metabolite in a natural product extract library to identify compounds with interesting activity.  High Content Review caught up with Dr. Linington to learn more about this new approach.

HCR:  In your paper you had extract fingerprints (the well data) and synthetic fingerprints (the average fingerprint for each metabolite that was in one or more extracts / wells).  How were the synthetic fingerprints used in the analysis?

Dr. Linington: The synthetic fingerprint and cluster score (how similar the extract phenotypes were to one another) are used to evaluate all metabolites in every extract, and to eliminate metabolites from the network if they do not pass both the synthetic fingerprint and activity score (how different the synthetic fingerprint is from control) cutoffs. Then the network tells you which extracts and metabolites go together, and you can use the actual fingerprints of the extracts and the synthetic fingerprints of the metabolites to narrow your selection further and drive your lead selection based on mechanistic novelty (if this is your goal), similarity to mechanisms of action (MOAs) of interest, and so on.
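As a rough illustration of the scoring described above, here is a minimal Python sketch. It assumes each extract fingerprint is a vector of control-normalized feature values, so that the control is the zero vector. The function names, the Euclidean norm used for the activity score, the mean pairwise Pearson correlation used for the cluster score, the cutoff values, and the data are all illustrative assumptions, not the definitions or results from the paper.

```python
from itertools import combinations
from math import sqrt


def synthetic_fingerprint(fingerprints):
    """Average, feature by feature, the fingerprints of all extracts containing a metabolite."""
    n = len(fingerprints)
    return [sum(column) / n for column in zip(*fingerprints)]


def activity_score(fingerprint):
    """Distance from control, assuming control is the zero vector (assumption)."""
    return sqrt(sum(v * v for v in fingerprint))


def pearson(a, b):
    """Pearson correlation of two equal-length vectors; 0.0 if either is constant."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def cluster_score(fingerprints):
    """Mean pairwise similarity of the extract fingerprints that share a metabolite."""
    pairs = list(combinations(fingerprints, 2))
    if not pairs:
        return 0.0  # a single extract gives no pair to compare (assumed convention)
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


def passes_cutoffs(fingerprints, activity_cutoff, cluster_cutoff):
    """Keep a metabolite only if it clears both cutoffs."""
    synthetic = synthetic_fingerprint(fingerprints)
    return (activity_score(synthetic) >= activity_cutoff
            and cluster_score(fingerprints) >= cluster_cutoff)


# Hypothetical extract fingerprints (4 features each) and metabolite occurrences.
extract_fps = {
    "extract_1": [2.0, -1.0, 0.5, 0.0],
    "extract_2": [1.8, -1.2, 0.4, 0.1],
    "extract_3": [-0.5, 0.9, -1.5, 2.0],
}
metabolite_extracts = {
    "metabolite_A": ["extract_1", "extract_2"],  # found in two similar extracts
    "metabolite_B": ["extract_1", "extract_3"],  # found in two dissimilar extracts
}

kept = [
    name for name, extracts in metabolite_extracts.items()
    if passes_cutoffs([extract_fps[e] for e in extracts],
                      activity_cutoff=1.0, cluster_cutoff=0.8)
]
print(kept)
```

In this toy data, metabolite_A is retained because the extracts containing it look alike and differ from control, while metabolite_B is dropped because the extracts containing it disagree with one another.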

HCR:  What is the Enzo compound library training set, and how did you compare your data to it?

Dr. Linington: The Enzo library is a library of ~480 drugs of known MOA that were run using the same assay.  The compound fingerprints can be used to match fingerprints from the extracts and metabolites, therefore providing an inferred MOA for these unknown compounds.  In our paper we identified a new class of compounds, the quinocinnolinomycins, that clustered with known ER stressors.
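The matching step can be pictured as a nearest-neighbor lookup against the reference fingerprints. The sketch below is only an illustration under stated assumptions: the reference names, MOA labels, fingerprint values, the choice of Pearson correlation as the similarity measure, and the 0.8 threshold are all hypothetical and are not taken from the Enzo library or the paper.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length vectors; 0.0 if either is constant."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def infer_moa(query, references, min_similarity=0.8):
    """Return (moa, similarity) for the most similar reference compound.

    The MOA is None when even the best match falls below min_similarity,
    which flags a fingerprint unlike anything in the training set.
    """
    best_moa, best_similarity = None, -1.0
    for moa, fingerprint in references.values():
        similarity = pearson(query, fingerprint)
        if similarity > best_similarity:
            best_moa, best_similarity = moa, similarity
    if best_similarity < min_similarity:
        return None, best_similarity
    return best_moa, best_similarity


# Hypothetical reference set: compound name -> (known MOA, fingerprint).
references = {
    "ref_compound_1": ("ER stress", [2.0, -1.0, 0.5, 0.0]),
    "ref_compound_2": ("microtubule disruption", [-0.5, 0.9, -1.5, 2.0]),
}

# Hypothetical synthetic fingerprint of an unknown metabolite.
moa, similarity = infer_moa([1.9, -1.1, 0.45, 0.05], references)
print(moa)

# A fingerprint that resembles neither reference gets no inferred MOA.
novel_moa, _ = infer_moa([0.0, 1.0, 0.0, -1.0], references)
print(novel_moa)
```

Returning None for poor matches is a design choice in this sketch: a metabolite that matches no known MOA is a candidate for mechanistic novelty rather than a failed lookup.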

HCR: Are you pursuing the target of the quinocinnolinomycins or turning your attention to larger libraries?

Dr. Linington:  Yes, we have done some secondary assays that support they are ER stressors, but we are also interested in dramatically increasing the size of the library we analyze. The bigger the dataset, the less frequently compounds will be co-expressed, and the higher the resolution of the system will be.

HCR: Can you replicate the results with different cell lines or feature sets?

Dr. Linington:  Yes, but you have to run the full assay on all extracts, plus the Enzo set. In theory, this can be done with any multi-parametric bioassay system, and we are actively investigating its application for antibiotic screening.

HCR: Can you narrow down the cytological feature set for this analysis? Were all features predictive, or were some redundant?

Dr. Linington: Probably, yes. We already reduced the feature set from 400 to 248, but it is likely that a much smaller feature set would give a similar cluster resolution.

HCR:  How would Compound Activity Mapping work if you had a very diverse natural product library?  What if the active components were mixed in with many other compounds that were also active?  The synthetic signature would then become very messy and likely be "damped out" by the diverse activities.

Dr. Linington:  True. If a compound only occurs in mixtures of bioactives it can be hard to find. However, if it occurs both alone and with other bioactives it is easier to discriminate. We have seen some nice examples where we can divide larger clusters into subclusters that only contain A, only contain B, and contain both A and B.
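The subcluster idea can be sketched as a simple partition of extracts by which of two metabolites they contain. The function name, the extract names, and the metabolite labels below are hypothetical; this only illustrates the bookkeeping, not the clustering used in the paper.

```python
def split_by_content(extract_contents, a, b):
    """Group extracts by whether they contain only a, only b, or both."""
    groups = {"only_a": [], "only_b": [], "both": []}
    for extract, metabolites in extract_contents.items():
        has_a, has_b = a in metabolites, b in metabolites
        if has_a and has_b:
            groups["both"].append(extract)
        elif has_a:
            groups["only_a"].append(extract)
        elif has_b:
            groups["only_b"].append(extract)
        # extracts containing neither metabolite are ignored
    return groups


# Hypothetical extract contents (sets of metabolite labels).
extract_contents = {
    "extract_1": {"A"},
    "extract_2": {"A", "B"},
    "extract_3": {"B"},
    "extract_4": {"A", "C"},
    "extract_5": {"C"},
}

groups = split_by_content(extract_contents, "A", "B")
print(groups)
```

Comparing the fingerprints of the "only_a" and "only_b" groups against the "both" group is what would let the contribution of each metabolite be separated.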

HCR:  Did any unique metabolite cluster as a singleton?

Dr. Linington:  In this case no, but in theory this should be true for very rare bioactives, and these would be of the highest priority for further examination.

HCR: Did you eliminate outlier data to create the synthetic signature, or perhaps even identify two or three separate synthetic signatures for a metabolite using cluster analysis?

Dr. Linington: We have thought about refining this, but have not tackled this question yet.

HCR: I wonder if you might extrapolate Compound Activity Mapping to screening compound library pools, with the goal of reducing the size of large libraries when cell-based assays are too expensive to run in primary format?  How many replicates would you want for every compound to effectively use your analysis to identify the hit?

Dr. Linington:   Compound Activity Mapping works best if you see compounds several times, but the natural product extracts have an average of ~20 compounds per fraction, so it is quite good at deconvoluting the screening data from these mixtures. The immediate goal is to extend this to 6000 extracts. With this much data, the results should be a lot cleaner, and will really provide an unequivocal view of bioactive chemistry from bacterial natural products.
