BASIL (BioActive Semantic Integration and Linking Database) was developed to help users navigate the complexity of nutritional data by focusing on bioactive compounds, improving understanding and simplifying access to health insights.
Stay on this page to get an overview of our data by exploring interactive visualizations that illustrate its intricate relationships and patterns. If you're looking for specific information, visit our Search page where you can query bioactives, conditions, or foods to access detailed evidence summaries. For insights into our project's background, methodology, and the team behind BASIL, check out our About page.
The heatmap above visualizes the association between various bioactive compounds and a range of health conditions, such as Pruritus, Asthma, Rheumatoid Arthritis, and more. Compounds such as Quercetin, Curcumin, and Coenzyme Q10, among others, are arranged along the horizontal axis. Each colored cell reflects the strength of association between a compound and a health effect, with lighter colors indicating stronger relationships. For instance, there is comparatively strong evidence for the connection between Capsaicin and the concept "Pain", and between Curcumin and the concept "Inflammation".
This visualization demonstrates the Tanimoto similarity coefficients among various bioactive compounds, highlighting potential overlaps and unique properties in their biological activities. The Tanimoto coefficient, also known as the Jaccard index, is a measure of the similarity between two sets and is calculated using the formula:
T(A, B) = |A ∩ B| / |A ∪ B|
Here, A and B are sets of characteristics (e.g., chemical properties, biological activities) for two different compounds. The coefficient ranges from 0 to 1, where 0 indicates that the sets share no elements and 1 indicates identical sets. In this context, it quantifies how closely the properties of one bioactive compound resemble those of another, providing insights into their structural and functional relationships.
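The formula above can be sketched in a few lines of Python. This is a minimal illustration, not BASIL's actual implementation: the function name `tanimoto`, the feature labels, and the choice to return 0.0 for two empty sets are all assumptions made for this example.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) similarity |A ∩ B| / |A ∪ B| of two feature sets."""
    union = a | b
    if not union:
        # Both sets are empty, so the ratio is undefined.
        # Returning 0.0 is a convention chosen for this sketch.
        return 0.0
    return len(a & b) / len(union)


# Hypothetical feature sets for two compounds (labels are placeholders).
compound_x = {"f1", "f2", "f3", "f4"}
compound_y = {"f3", "f4", "f5"}

# Two shared features out of five distinct features overall.
score = tanimoto(compound_x, compound_y)
print(score)
```

Identical sets give a score of 1, and sets with no shared features give 0, matching the range described above.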
The plot shown is an MDS (Multidimensional Scaling) plot, which projects the high-dimensional similarity data into a two-dimensional space. This projection makes the relationships between compounds easy to read: the closer two points are, the more similar the compounds. The MDS plot is particularly useful for identifying clusters of bioactives with similar effects and for exploring the diversity within the dataset.
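As a sketch of what such a projection optimizes, assume that the dissimilarity between compounds \(i\) and \(j\) is taken as \(d_{ij} = 1 - T_{ij}\) and that a metric MDS stress objective is used. Both are assumptions made for illustration; this page does not state which distance or MDS variant BASIL uses. Under these assumptions, the two-dimensional positions \(x_i\) are chosen so that the distances in the plot match the dissimilarities as closely as possible:

\[
\min_{x_1, \dots, x_n \in \mathbb{R}^2} \; \sum_{i < j} \left( \lVert x_i - x_j \rVert - d_{ij} \right)^2
\]

Two compounds with \(T_{ij}\) close to 1 then have \(d_{ij}\) close to 0 and are drawn near each other.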
In our dataset, as can be seen above, we differentiate compounds such as Theanine and L-Theanine based on their molecular orientation to emphasize stereoisomerism: despite having the same molecular formula, the spatial arrangement of their atoms can differ. BASIL DB also contains a small number of compounds that are not naturally occurring. These include compounds used as dietary supplements, those inspired by natural compounds or their mechanisms, and semi-synthetic derivatives. Certain synthetic compounds, such as Telmisartan, were sourced from the original datasets and are included for their functional similarities to naturally occurring bioactives, such as their role in blood pressure regulation. This approach supports comparing the therapeutic potential and biochemical behavior of both synthetic and natural compounds.
Beyond the structural similarity between compounds, BASIL lets us identify the effect similarity between compounds. That is, we can see how similar the existing literature is for two compounds regarding their health effects. For this purpose, we define the weight between a compound and a health effect as follows:
For each research paper that discusses both a specific bioactive compound and a health condition, we assign weights based on the type of study, the number of participants, and an adjusted bias score. The weight of the study, \(E_p\), varies by the study type: Single Compound studies are weighted at 6, Combination Therapy or Comparative studies at 4, Derivative studies at 2, and all other study types at 1.
The number of participants, \(N_p\), is normalized to moderate the influence of studies with large sample sizes. We use the formula \(N_p' = \frac{\log(N_p + 1)}{\log(\max(N) + 1)}\), where \(\max(N)\) is the maximum number of participants in any study. Additionally, the bias score, \(B_p\), ranges from 0 (no bias) to 1 (maximum bias), and is adjusted to \(B_p' = 1 - B_p^{0.5}\), so that studies with higher bias contribute less weight and a study with maximum bias contributes none.
The overall weight for each paper is calculated as \(w_p = E_p \cdot N_p' \cdot B_p'\). The total weight for the relationship between a bioactive \(c\) and a health condition \(h\) is the sum of the weights of all pertinent studies, expressed as \(w(c,h) = \sum_p w_p\), with \(p\) indexing the included papers.
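The weighting scheme described above can be sketched as follows. This is a minimal illustration under stated assumptions, not BASIL's actual code: the function names, the dictionary keys, the study-type labels, and the error raised when no study has participants are all hypothetical.

```python
import math

# Study-type weights E_p as described in the text; any other type gets 1.
# The string labels are placeholders chosen for this sketch.
STUDY_TYPE_WEIGHT = {
    "single_compound": 6,
    "combination_therapy": 4,
    "comparative": 4,
    "derivative": 2,
}


def paper_weight(study_type, participants, bias, max_participants):
    """Compute w_p = E_p * N_p' * B_p' for one paper."""
    if max_participants <= 0:
        # log(0 + 1) == 0 would cause a division by zero (assumed handling).
        raise ValueError("max_participants must be positive")
    e_p = STUDY_TYPE_WEIGHT.get(study_type, 1)
    n_norm = math.log(participants + 1) / math.log(max_participants + 1)
    b_adj = 1 - bias ** 0.5
    return e_p * n_norm * b_adj


def relationship_weight(papers, max_participants):
    """Sum the paper weights for one (compound, condition) pair."""
    return sum(
        paper_weight(p["study_type"], p["participants"], p["bias"], max_participants)
        for p in papers
    )


# Two made-up papers, with 99 as the assumed largest study size.
papers = [
    {"study_type": "single_compound", "participants": 99, "bias": 0.25},
    {"study_type": "derivative", "participants": 99, "bias": 0.0},
]
total = relationship_weight(papers, max_participants=99)
print(total)
```

In this made-up example, the first paper has \(N_p' = 1\) and \(B_p' = 1 - 0.5 = 0.5\), so it contributes \(6 \cdot 1 \cdot 0.5 = 3\). The second contributes \(2 \cdot 1 \cdot 1 = 2\).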
We can explore the literature similarity of up to three compounds using the following visualization. Here, red nodes represent health effect concepts (such as "Anxiety", "Inflammation", or "Blood Pressure") and green nodes represent bioactive compounds. You can adjust the threshold of minimum edge weight using the slider.
Lastly, the visualization below allows you to interactively explore how different foods impact health. By selecting various food items, you can see a breakdown of their bioactives (which were scraped for BASIL) and how these relate to specific health conditions. The tool connects each food item to potential health outcomes based on the results of our automated literature review. To simplify or deepen your analysis, you can adjust the maximum number of most strongly related health conditions displayed for each compound. We have set the minimum weight between compounds and conditions to 15, which is why some compounds are not linked to any conditions.
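The filtering described above (a minimum weight plus a cap on the number of conditions per compound) can be sketched like this. It is an illustrative assumption about how such a filter might work, not the tool's actual code: the function name, the data layout, the inclusive threshold, and all names and weights in the example are invented.

```python
def top_conditions(weights, min_weight=15.0, max_conditions=3):
    """Keep, per compound, the strongest conditions at or above min_weight.

    weights maps compound -> {condition: weight}. Treating the threshold
    as inclusive (>=) is an assumption made for this sketch.
    """
    result = {}
    for compound, condition_weights in weights.items():
        kept = [(c, w) for c, w in condition_weights.items() if w >= min_weight]
        # Strongest relationships first, then cap the list length.
        kept.sort(key=lambda pair: pair[1], reverse=True)
        result[compound] = kept[:max_conditions]
    return result


# Invented weights for two placeholder compounds.
example = {
    "compound_a": {"cond_1": 40.0, "cond_2": 18.0, "cond_3": 22.0, "cond_4": 9.0},
    "compound_b": {"cond_1": 12.0, "cond_5": 3.5},
}
filtered = top_conditions(example, min_weight=15.0, max_conditions=2)
print(filtered)
```

Here `compound_b` ends up with an empty list because none of its weights reach the threshold, which mirrors why some compounds in the visualization show no linked conditions.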
It's important to note that the number of compounds listed for each food does not necessarily reflect its health value. Some very healthy foods may link to only one or a few compounds, while others might connect to many, due to ongoing research and data compilation. This aspect of the tool will continue to evolve as more information becomes available and is integrated into the database. For exact nutrient values in each food, please use the search tool to look up specific food items or visit FooDB for more detailed information.