Data visualization is a crucial part of experimental data analysis. The form of visualization should be chosen to highlight the essential result and tailored to personal needs.
This article describes the visualization methods available in HaDeX2 and explains how different plot types can be used to interpret HDX-MS data. It focuses on the biological questions each visualization addresses rather than implementation details.
The analyzed protein is the eEF1B\(\alpha\) subunit of the human guanine-nucleotide exchange factor (GEF) complex (eEF1B), measured in the Mass Spectrometry Lab at the Institute of Biochemistry and Biophysics, Polish Academy of Sciences (Bondarchuk et al. 2022). In the single-state plots, we focus on the pure alpha state, eEF1B\(\alpha\). The comparative analysis is conducted between eEF1B\(\alpha\) and eEF1B\(\alpha\) in the presence of eEF1B\(\gamma\).
The comparison plot presents the deuterium uptake of the peptides at a given time point, together with the length of each peptide and its position in the protein sequence. It allows the results for different biological states to be compared.
Example
In the comparison plot below, we see the fractional deuterium uptake for all three available states of the protein eEF1B: eEF1B\(\alpha\), eEF1B\(\alpha\) in the presence of eEF1B\(\gamma\), and in the presence of eEF1B\(\beta\). The values are calculated for the 1 min time point. The length of each segment represents the length of the peptide, and its placement shows the position in the protein sequence. The error bars indicate the uncertainty of the measurement.
create_state_comparison_dataset(alpha_dat, time_t = 1) %>%
  plot_state_comparison(., fractional = TRUE) +
  labs(x = "Position in sequence",
       y = "Fractional deuterium uptake [%]",
       title = "Measurement after 1 min of exchange")
Pros:
Plot variants:
The Woods plot presents the deuterium uptake difference between two biological states for each peptide. The results are shown with respect to the length of the peptide and its position in the protein sequence, for a given time point of the measurement. A statistical test is applied to determine the confidence limits at the chosen confidence level. For a fuller description, see vignette("statistics").
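The idea of a global confidence limit can be sketched in base R. This is only an illustration under stated assumptions: it takes a Houde-style limit to be a two-sided Student t quantile multiplied by the mean uncertainty of the differences and divided by the square root of the number of replicates. The helper name `houde_limit`, the number of replicates and all values are hypothetical; the definition actually used by the package is the one given in vignette("statistics").

```r
# Hypothetical helper, not part of HaDeX2: a Houde-style global limit.
# Assumes n_rep replicates per state and a two-sided t quantile.
houde_limit <- function(err_diff, confidence_level = 0.98, n_rep = 3) {
  alpha <- 1 - confidence_level
  t_value <- qt(1 - alpha / 2, df = n_rep - 1)
  limit <- t_value * mean(err_diff, na.rm = TRUE) / sqrt(n_rep)
  c(lower = -limit, upper = limit)
}

# Made-up uncertainties of the uptake difference for five peptides [%]
err_diff <- c(0.8, 1.1, 0.9, 1.4, 1.0)
limits <- houde_limit(err_diff)

# Made-up differences [%]; a peptide is flagged when it lies outside the limits
diff_uptake <- c(-6.2, 0.5, 3.1, -1.0, 9.8)
significant <- abs(diff_uptake) > limits[["upper"]]
```

With a single limit shared by all peptides, the same pair of horizontal lines can be drawn across the whole plot, which is what makes the Woods plot easy to read.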
Example
On the Woods plot below, we see the fractional deuterium uptake difference between two biological states, eEF1B\(\alpha\) and eEF1B\(\alpha\) in the presence of eEF1B\(\gamma\), for the protein eEF1B. The confidence limits indicate which differences are statistically significant at the 98% level.
calculate_diff_uptake(alpha_dat, states = c(states[3], states[1])) %>%
  plot_differential(., fractional = TRUE, show_houde_interval = TRUE) +
  labs(x = "Position in sequence",
       y = "Fractional deuterium uptake difference [%]",
       title = "Measurement after 1 min of uptake")
Pros:
Plot variants:
The butterfly plot presents the deuterium uptake for all peptides in a given state at all time points at once. Each time point of measurement is indicated by a different color. Peptides are identified by their ID (peptides are numbered in order of their start position).
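The peptide numbering can be illustrated with a short base R sketch. The sequences and positions below are made up, and breaking ties by end position is an assumption of this example rather than documented behaviour.

```r
# Made-up peptides; Start/End positions are illustrative only
peptides <- data.frame(
  Sequence = c("LKSPAGL", "GFGDLKSPAGL", "MGFGDL", "SPAGLQV"),
  Start    = c(6, 2, 1, 8),
  End      = c(12, 12, 6, 14),
  stringsAsFactors = FALSE
)

# Order by start position (ties broken by end position: an assumption)
peptides <- peptides[order(peptides$Start, peptides$End), ]

# Consecutive IDs, as used on the x-axis of ID-based plots
peptides$ID <- seq_len(nrow(peptides))
```

Plotting against the ID keeps every peptide equally wide, at the cost of losing the length and exact position that position-based plots show.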
Example
On the butterfly plot below, we see how the deuterium uptake changes over time for state eEF1B\(\alpha\) of the protein eEF1B. The exchange rates differ: for some peptides the uptake increases steadily over time, and for others there is no visible change.
Pros:
Cons:
Plot variants:
The butterfly differential plot shows the deuterium uptake difference between two biological states in the form of a butterfly plot. The results are plotted against the peptide ID (peptides are numbered in order of their start position) and are shown for all time points at once, with each time point of measurement indicated by a color.
Example
Below, we see how the fractional deuterium uptake difference between states eEF1B\(\alpha\) and eEF1B\(\alpha\) in the presence of eEF1B\(\gamma\) changes over time. For some peptides the difference decreases with time, perhaps because of back-exchange.
The measurements for 1440 min are hidden, as they are close to 0, as expected.
create_diff_uptake_dataset(alpha_dat, state_1 = states[3], state_2 = states[1]) %>%
  filter(Exposure < 1440) %>%
  plot_differential_butterfly(fractional = TRUE, show_houde_interval = TRUE)
Pros:
Cons:
Plot variants:
The chiclet plot shows the fractional deuterium uptake in the form of a heatmap for the peptides in a given biological state. Each tile represents one peptide (identified by its ID, numbered in order of start position) at one time point of measurement. The color of the tile indicates the fractional deuterium uptake (according to the legend below the plot).
Example
In the chiclet plot below, we can see the deuterium uptake values for peptides (indicated by their ID) in state eEF1B\(\alpha\) during the time course of the experiment. The cross symbols indicate the uncertainty of the measurement (the bigger the cross, the bigger the uncertainty).
create_state_uptake_dataset(alpha_dat, state = states[3]) %>%
  filter(Exposure < 1440) %>%
  plot_chiclet(show_uncertainty = TRUE, fractional = FALSE)
Pros:
Cons:
Plot variants:
The chiclet differential plot shows the deuterium uptake difference between two biological states in the form of a heatmap. Each tile represents one peptide (identified by its ID, numbered in order of start position) at one time point of measurement. The color of the tile indicates the deuterium uptake difference (according to the legend below the plot).
Example
On the chiclet differential plot below, we see the fractional deuterium uptake difference between states eEF1B\(\alpha\) and eEF1B\(\alpha\) in the presence of eEF1B\(\gamma\) for the protein eEF1B. Some peptides are protected (red), and some are deprotected (blue). The cross symbols indicate the uncertainty of the measurement (the bigger the cross, the bigger the uncertainty).
diff_uptake_dat %>%
  filter(Exposure < 1440 & Exposure > 0.001) %>%
  plot_differential_chiclet(show_uncertainty = TRUE, fractional = TRUE)
Pros:
Cons:
Plot variants:
The volcano plot shows, for each peptide, the deuterium uptake difference between two biological states together with its P-value, as used in the double (hybrid) test of statistical significance (Hageman and Weis 2019). On the x-axis, there is the deuterium uptake difference with its uncertainty (combined and propagated). On the y-axis, there is the P-value calculated for each peptide at a specific time point of measurement using an unpaired t-test at a given significance level. The test is run on the mass measurements from the replicates to indicate whether the measured mean differs significantly between the two states, as the deuterium uptake difference between states can be rewritten as
\[\Delta D = D_{A} - D_{B} = m_{t, A} - m_{0} - (m_{t, B} - m_{0}) = m_{t, A} - m_{t, B} \]
for states A and B. The values of deuterium uptake difference from all time points are shown on the plot.
The dotted red lines indicate confidence limits for the values. The horizontal line indicates the confidence limit based on the chosen confidence level and gives a threshold on the P-value. The vertical lines indicate the confidence limit from the Houde test for all time points and give a threshold on the deuterium uptake difference. The statistically significant points are in the top left and top right corners of the plot.
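The two thresholds can be combined in a few lines of base R. This is a minimal sketch for a single peptide at a single time point: the replicate masses, the undeuterated mass `m_0` and the threshold `houde_limit` are made-up values, and the choice of a Student (equal-variance) test is an assumption of the example.

```r
# Made-up replicate masses [Da] of one peptide at one time point
mass_A <- c(1205.92, 1205.95, 1205.89)  # state A
mass_B <- c(1206.41, 1206.38, 1206.45)  # state B
m_0 <- 1200.10                          # made-up undeuterated mass

# m_0 cancels, so the uptake difference equals the mass difference
diff_uptake <- mean(mass_A) - mean(mass_B)
same_as_uptake <- isTRUE(all.equal(
  (mean(mass_A) - m_0) - (mean(mass_B) - m_0), diff_uptake))

# Unpaired t-test on the masses (var.equal = TRUE is an assumption here)
test <- t.test(mass_A, mass_B, paired = FALSE, var.equal = TRUE)

confidence_level <- 0.98
p_threshold <- 1 - confidence_level  # horizontal line
houde_limit <- 0.2                   # vertical lines, made-up value [Da]

# Hybrid criterion: the point must pass both thresholds
significant <- test$p.value < p_threshold && abs(diff_uptake) > houde_limit
```

Requiring both conditions filters out differences that are statistically detectable but too small to matter, as well as large differences that are too noisy to trust.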
Example
On the volcano plot below, we see the deuterium uptake difference between states eEF1B\(\alpha\) and eEF1B\(\alpha\) in the presence of eEF1B\(\gamma\) for the protein eEF1B at all time points. The points in the upper left and upper right corners are statistically significant according to the hybrid test.
Pros:
Cons:
Plot variations:
Uptake curves show how the exchange changes over time for a specific peptide in a given state.
Example
On the uptake curve below, we see how the exchange proceeds for peptide GFGDLKSPAGL in all three states of the protein eEF1B\(\alpha\).
Pros:
Plot variations:
The uncertainty plot is a new visualization method that shows the uncertainty of the deuterium uptake measurement for each peptide, helping to spot regions where the uncertainty is higher. This plot may be used for quality control of the experiment, as discussed in vignette("quality_control").
The uncertainty is presented in daltons, with 1 Da as the proposed limit of acceptance.
Example
The plot below presents uncertainty values at multiple time points for state eEF1B\(\alpha\). The uncertainty is relatively low for all the measurements, and none of the values is suspicious.
Pros:
Plot variants:
The Manhattan plot is a novel plot that presents the P-values of the test for a difference between two states.
Example
In this example, we present the P-values of the difference between two biological states: eEF1B\(\alpha\) and eEF1B\(\alpha\) in the presence of eEF1B\(\gamma\). We can see the regions where the difference is statistically significant at the chosen confidence level (set as an option in create_p_diff_uptake_dataset; 0.98 by default).
p_diff_dat <- create_p_diff_uptake_dataset(dat = alpha_dat, diff_uptake_dat = diff_uptake_dat,
                                           state_1 = states[3], state_2 = states[1])
plot_manhattan(p_diff_dat, show_peptide_position = TRUE)
Pros:
Plot variations:
The biggest limitation of the visualization methods described so far is that the results stay at the peptide level. However, we offer a method of averaging the deuterium uptake from the peptide level to a high-resolution (residue) level, using weighted averaging (as described in vignette("statistics")). The results are then presented on a heatmap.
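The principle of the averaging can be sketched in base R. The sketch assumes that the value for a residue is a weighted mean over all peptides covering it, with weights inversely proportional to peptide length; this weighting, the helper `residue_uptake` and the data are assumptions made for illustration, and the scheme actually used is the one described in vignette("statistics").

```r
# Made-up peptide-level fractional uptake [%]
peptides <- data.frame(
  Start  = c(1, 3, 6),
  End    = c(5, 8, 10),
  uptake = c(20, 50, 80)
)

# Assumed weighting: shorter peptides carry more local information
peptides$weight <- 1 / (peptides$End - peptides$Start + 1)

# Hypothetical helper: weighted mean over peptides covering one residue
residue_uptake <- function(position, peptides) {
  covering <- peptides[peptides$Start <= position &
                         peptides$End >= position, ]
  if (nrow(covering) == 0) return(NA_real_)  # residue not covered
  weighted.mean(covering$uptake, covering$weight)
}

positions <- 1:11
aggregated <- data.frame(
  position = positions,
  uptake   = vapply(positions, residue_uptake, numeric(1),
                    peptides = peptides)
)
```

Residues covered by a single peptide simply inherit its value, while residues in overlapping regions get an intermediate value; uncovered residues stay missing rather than being interpolated.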
Example
kin_dat <- create_uptake_dataset(alpha_dat, states = "Alpha_KSCN")
aggregated_dat <- create_aggregated_uptake_dataset(kin_dat)
plot_aggregated_uptake(aggregated_dat)
Pros:
Cons:
Plot variations:
Like most of our visualization methods, the high-resolution plot has a differential version. The plot presents uptake difference values averaged using the weighted approach.
Example
On the structure, we see the fractional deuterium uptake difference after 25 minutes between states eEF1B\(\alpha\) and eEF1B\(\alpha\) in the presence of eEF1B\(\gamma\) for the protein eEF1B. Some peptides are protected (red), and some are deprotected (blue).
diff_uptake_dat <- create_diff_uptake_dataset(alpha_dat, state_1 = states[3], state_2 = states[1])
averaged_diff_dat <- create_aggregated_diff_uptake_dataset(diff_uptake_dat)
plot_aggregated_differential_uptake(averaged_diff_dat, panels = FALSE)
Pros:
Cons:
Plot variations:
High-resolution values can be presented not only linearly on the high-resolution plot (as above), but also on the 3D structure, if one is available. This way, the calculated values are connected with their spatial context. This option is available both for single-state uptake and for differential uptake.
Example
The structure below is colored by deuterium uptake values after 1 minute for the eEF1B\(\alpha\) state. Values are averaged using the weighted approach, and the color ranges from no exchange (white) to high exchange (red).
pdb_file_path <- system.file(package = "HaDeX2", "HaDeX/data/Model_eEF1Balpha.pdb")
plot_aggregated_uptake_structure(aggregated_dat,
                                 differential = FALSE,
                                 time_t = 1,
                                 pdb_file_path = pdb_file_path)
Pros:
Cons:
Plot variations:
The coverage heatmap plot is a variation of the standard coverage plot in which each peptide is colored according to a specific value. This plot is particularly useful for presenting AUC (consult ?? for the details of AUC computation) or back-exchange values, as they are defined per peptide uptake curve.
Example
The plot below presents the AUC values for eEF1B\(\alpha\). For the majority of regions the AUC value is close to 1, signifying fast exchange, with the exception of one pronounced region and a few small sub-regions.
auc_dat <- calculate_auc(create_uptake_dataset(alpha_dat))
plot_coverage_heatmap(auc_dat, value = "auc")
The AUC values for different biological states are calculated separately. To compare AUC values meaningfully, we must select the same fully deuterated control for all of them, so that they share the same reference point of 100% exchange. Otherwise, when the uptake curves have different limiting values at infinite time, the AUC values describe only the speed of exchange for each curve taken separately. In the end, this value is most useful for comparing uptake curves of a specific peptide under different biological conditions.
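Why a shared fully deuterated control matters can be seen in a small base R sketch. It assumes that the AUC is the trapezoidal area under an uptake curve whose uptake is divided by the fully deuterated value and whose time axis is rescaled linearly to [0, 1]. Both the normalization and the helper `normalized_auc` are assumptions for illustration, not the definition used by calculate_auc, and all numbers are made up.

```r
# Hypothetical helper: trapezoidal area under a normalized uptake curve
normalized_auc <- function(time, uptake, uptake_100) {
  x <- time / max(time)     # time rescaled to [0, 1] (assumed linear)
  y <- uptake / uptake_100  # uptake relative to the 100% control
  sum(diff(x) * (y[-length(y)] + y[-1]) / 2)
}

time <- c(0, 1, 5, 25, 150, 1440)      # made-up exposures [min]
fast <- c(0, 5.6, 5.8, 5.9, 6.0, 6.0)  # made-up uptake [Da]
slow <- c(0, 0.5, 1.0, 2.0, 3.5, 6.0)  # made-up uptake [Da]

# Same fully deuterated control (6 Da) for both curves
auc_fast <- normalized_auc(time, fast, uptake_100 = 6)
auc_slow <- normalized_auc(time, slow, uptake_100 = 6)
```

Under these assumptions a curve that saturates early gives a value close to 1 and a slowly rising curve gives a lower value, but the two numbers are comparable only because both are divided by the same `uptake_100`.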
Example
The coverage heatmap plot below presents the back-exchange values for peptides from eEF1B\(\alpha\). Back-exchange is believed to be close to 30% on average, as we see on the plot. Some peptides, especially the shorter ones, have greater back-exchange.
bex_dat <- calculate_back_exchange(alpha_dat, state = "Alpha_KSCN")
plot_coverage_heatmap(bex_dat, value = "back_exchange")
Pros:
Cons:
Plot variations:
Below we compare the aspects of the plots.
| types | time course | length of the peptide | uncertainty | all peptides | different states | position |
|---|---|---|---|---|---|---|
| comparison | FALSE | TRUE | TRUE | TRUE | TRUE | TRUE |
| Woods (differential) | FALSE | TRUE | TRUE | TRUE | TRUE | TRUE |
| butterfly | TRUE | FALSE | TRUE | TRUE | FALSE | FALSE |
| butterfly differential | TRUE | FALSE | TRUE | TRUE | TRUE | FALSE |
| volcano | TRUE | FALSE | TRUE | TRUE | TRUE | FALSE |
| chiclet | TRUE | FALSE | TRUE | TRUE | FALSE | FALSE |
| chiclet differential | TRUE | FALSE | TRUE | TRUE | TRUE | FALSE |
| uptake curve | TRUE | FALSE | TRUE | FALSE | TRUE | FALSE |
The columns indicate: