Variability plots - ZAR1 case study
This tutorial focuses on ZAR1, an ancient NLR shared by most flowering plants, to illustrate how NLRscape variability plots can be used to analyze a given protein.
In 2019, cryo-EM structures of ZAR1 were reported for three stages of the activation mechanism: the resting ADP-bound state, an intermediate nucleotide-free state, and the activated ATP-bound state in a pentameric conformation. These experimental structures were instrumental in understanding how sequence and structural traits interplay during activation (Wang et al. 2019a,b).
In this tutorial we will first analyze the variability plots on their own and then compare our observations with the cryo-EM structures.
ZAR1 clusters
The sequence view page of ZAR1 from Arabidopsis contains a series of precomputed clusters, and custom clusters can also be generated. Custom clusters are generally recommended when investigating the homology family centered on a given sequence of interest, whereas the precomputed clusters are generated based on the overall NLR landscape density. In precomputed clusters, the sequence of interest may therefore occupy a more marginal position, with potential homologs falling outside the cluster span.
For this tutorial we will use 2 clusters:
The variability plots of both the 50% and 30% clusters are shown below. Variability is expressed as relative entropy, with the degree of conservation proportional to the letter height (Find out more about variability plots).
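To make the relative entropy measure concrete, the following is a minimal sketch of how a per-column score could be computed from a multiple sequence alignment. The background distribution and the gap handling used by NLRscape are not stated here, so the uniform background, the skipping of gaps, and the toy alignment below are all assumptions made for illustration only.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_relative_entropy(column, background=None):
    """Relative entropy (in bits) of one alignment column against a background."""
    if background is None:
        # Assumption: uniform background over the 20 standard amino acids.
        background = {aa: 1.0 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}
    # Assumption: gaps and non-standard symbols are simply ignored.
    residues = [c for c in column.upper() if c in background]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    return sum(
        (n / total) * math.log2((n / total) / background[aa])
        for aa, n in counts.items()
    )


def alignment_relative_entropy(aligned_seqs):
    """Score every column of an alignment given as equal-length strings."""
    return [column_relative_entropy("".join(col)) for col in zip(*aligned_seqs)]


# Hypothetical toy alignment (not ZAR1 data): column 0 is invariant,
# column 1 mixes three basic residues.
toy_alignment = ["MKLE", "MRLD", "MKIE", "MHLQ"]
scores = alignment_relative_entropy(toy_alignment)
```

With a uniform background, an invariant column reaches the maximum score of log2(20) bits, and the score drops as more residue types share the column. In a logo-style plot this score would set the total letter height at each position.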
The CC domain
At a quick glance at the variability of both sets within the CC domain, we notice four regions with high helical propensity. All four helical segments show an alternating hydrophobic-hydrophilic pattern, suggesting that each segment has a solvent-exposed face and a solvent-buried face, which is consistent with the helical bundle architecture seen in the ZAR1 cryo-EM structures.
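The alternating hydrophobic-hydrophilic pattern described above can also be quantified. One common way, which is not necessarily what NLRscape does, is the helical hydrophobic moment: each residue's hydrophobicity is treated as a vector rotated by 100 degrees per position (the ideal alpha-helix turn per residue), and the length of the summed vector measures how strongly hydrophobic residues cluster on one face. The sketch below uses the Kyte-Doolittle scale; the two peptides are hypothetical toy sequences, not ZAR1 segments.

```python
import math

# Kyte-Doolittle hydropathy scale.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_moment(segment, angle_deg=100.0):
    """Mean helical hydrophobic moment of a segment.

    angle_deg is the rotation per residue (100 degrees for an ideal alpha helix).
    Residues missing from the scale are skipped.
    """
    angle = math.radians(angle_deg)
    sin_sum = 0.0
    cos_sum = 0.0
    n = 0
    for i, aa in enumerate(segment.upper()):
        h = KYTE_DOOLITTLE.get(aa)
        if h is None:
            continue
        sin_sum += h * math.sin(i * angle)
        cos_sum += h * math.cos(i * angle)
        n += 1
    return math.hypot(sin_sum, cos_sum) / n if n else 0.0


# Hypothetical peptides with the same composition but different arrangements.
amphipathic = "LKKLLKKLLKKLLKKL"  # hydrophobic residues recur every 3-4 positions
blocky = "LLLLLLLLKKKKKKKK"       # same residues, no helical periodicity

moment_amphipathic = hydrophobic_moment(amphipathic)
moment_blocky = hydrophobic_moment(blocky)
```

A segment with one buried face and one exposed face, as expected for a helix packed in a bundle, gives a clearly larger moment than a segment with the same composition but no helical periodicity.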
Helix H1
Obs. In the close family, the first half of the first helix is much more conserved and hydrophobic than the second half of the predicted helix. This could indicate that, besides being buried away from the solvent, this region is functionally important.
Cryo-EM. In the resting state, this region is buried between the CC and NBS domains (PDB: 6j5w), while in the activated form the hydrophobic N-terminal half was shown to protrude into the plasma membrane (a hydrophobic environment), which is essential for initiating the immune response.
Helix H2
Obs. The C-terminal end of the second helix is more conserved and highly positively charged (lysine and arginine residues). This could indicate that this region is involved in a protein-protein interaction with a corresponding acidic patch.
Cryo-EM. In the resting state, this region lies at the interface with the NBS domain (end of ARC1, beginning of ARC2), where several acidic positions form salt bridges with the basic residues at the end of CC-H2 (PDB: 6j5w).
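Charged patches such as the basic end of helix H2 can be screened for with a simple sliding-window net charge. The sketch below is only an illustration: the charge assignment (Lys/Arg as +1, Asp/Glu as -1, His ignored), the window length, and the toy sequence are assumptions, not values taken from NLRscape or from ZAR1.

```python
# Assumption: only K/R (+1) and D/E (-1) carry charge; histidine is ignored.
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def windowed_net_charge(seq, window=7):
    """Net charge of every full-length window along the sequence."""
    if window <= 0:
        raise ValueError("window must be positive")
    charges = [CHARGE.get(aa, 0) for aa in seq.upper()]
    n_windows = max(len(charges) - window + 1, 0)
    return [sum(charges[i:i + window]) for i in range(n_windows)]


# Hypothetical helix-like sequence with a basic C-terminal end.
toy_helix = "ALSELAGAIKRLKKR"
profile = windowed_net_charge(toy_helix, window=7)
peak_start = profile.index(max(profile))  # start of the most basic window
```

A run of strongly positive windows at one end of a helix, combined with high conservation at the same positions, is the kind of signal that would prompt a search for a complementary acidic surface on a partner domain.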
Helix H3
Obs. In both the close and distant families, the second half of the helix is highly conserved, making it the most conserved patch in the CC domain. The acidic positions face the solvent-exposed side, generating an acidic patch. This suggests that this region might be involved in a protein-protein interaction with a corresponding basic patch.
Cryo-EM. In both the resting and activated states (PDB: 6j5t, 6j5w), this region, which surrounds the EDVID motif, is in tight contact with the NBS and LRR domains. On the LRR side, the interacting counterpart is a basic (positively charged) patch formed on the lower side of the LRR domain, within the first four LRR repeats.
Helix H4
Obs. In contrast to the previous three helical segments of the CC domain, the fourth helix shows significantly higher variability and lower helical propensity, indicating potential structural instability or flexibility within this region.
Cryo-EM. This region undergoes significant conformational changes during activation. In the resting state, the first half of the helical region (H4a) is part of the helical bundle, whereas the second half (H4b) has no resolved coordinates. By contrast, in the activated state, the place of H4a in the bundle is taken by the H4b segment, while the H4a region has no resolved coordinates.
The CC-NBS linker
Obs. The first part of the CC-NBS linker displays increased variability, whereas the second half shows high conservation, significantly higher than expected for a disordered linker. This could indicate that this part of the linker has a functional role.
Cryo-EM. The conserved region of the linker follows a groove formed by the three NBS subdomains (NBD, ARC1 and ARC2), with multiple contacts potentially stabilizing the NBS subdomains. By contrast, the region corresponding to the highly variable part of the linker has no resolved coordinates, indicating that this region might be characterized by increased flexibility. Moreover, the end of the linker (residues VVG) is part of the ADP/ATP binding pocket.
The LRR domain
For the analysis of the LRR domain, a 2D plot is provided that shows the LRR repeats arranged one below the other, similar to their 3D arrangement. This layout facilitates visualisation of the relationships between residues that are located on consecutive LRR repeats, and are therefore distant in sequence, but lie in close proximity in 3D space.
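The idea behind this stacked layout can be sketched in a few lines: locate each occurrence of the LxxLxL motif and start a new row there, so that equivalent positions of consecutive repeats fall in the same column. This is a simplified illustration, not the NLRscape procedure. The regular expression, the minimum spacing between repeats, and the toy sequence are all assumptions, and real LRR delineation usually needs a more tolerant motif model.

```python
import re

# Lookahead so that overlapping candidate motifs are all reported.
LRR_MOTIF = re.compile(r"(?=L..L.L)")


def stack_lrr_repeats(seq, min_spacing=20):
    """Split a sequence into rows, each starting at an LxxLxL motif.

    min_spacing (an assumed value) rejects motif hits that fall too close to
    the previously accepted one. Residues before the first motif are dropped.
    """
    starts = []
    for match in LRR_MOTIF.finditer(seq):
        if not starts or match.start() - starts[-1] >= min_spacing:
            starts.append(match.start())
    bounds = starts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


# Hypothetical toy sequence made of three 24-residue repeats (not ZAR1).
toy_lrr = (
    "LKELDLSGNKITDAGAKSFGEAPK"
    "LRSLNLSNNQIGDEGAKAFSEGPR"
    "LTHLDLSWNRIGPEGAKAMAEAFK"
)
rows = stack_lrr_repeats(toy_lrr)
for number, row in enumerate(rows, start=1):
    print(f"LRR{number:>2}  {row}")
```

Reading the printed rows column by column then shows which positions are conserved across neighbouring repeats, which is how patches spanning several repeats become visible.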
The positively charged (basic) patch discussed above is in contact with an acidic patch on the CC domain, located on the third helical segment surrounding the EDVID motif. Both the close and distant homolog sets display these conserved regions on the CC and LRR domains (depicted with blue boxes in the figure below), suggesting that even more remote homologs of ZAR1 might share a similar interaction profile between the CC and LRR domains.
Obs. In the close homolog set, the region upstream of the LxxLxL motif shows increased conservation, especially within LRR repeats 2-6, 9-10 and 13 (indicated above with magenta boxes). These regions correspond to the upper side of the LRR domain, suggesting that this could be a protein-protein interaction site. Interestingly, this conservation profile is seen only in the close homolog group, whereas in the distant family these sites are much more variable, indicating that this potential protein-protein interface might not be shared by more remote homologs of ZAR1.
Cryo-EM. This conserved region was shown to be the contact interface with the RKS1 kinase (PDB: 6j5w, 6j5t), which consists of two main contact points on the upper side of repeats 2-6 and 9-13.
References
Wang J, Wang J, Hu M, Wu S, Qi J, Wang G, Han Z, Qi Y, Gao N, Wang HW, Zhou JM, Chai J. Ligand-triggered allosteric ADP release primes a plant NLR complex. Science. 2019 Apr 5;364(6435):eaav5868. doi: 10.1126/science.aav5868. PMID: 30948526.
Wang J, Hu M, Wang J, Qi J, Han Z, Wang G, Qi Y, Wang HW, Zhou JM, Chai J. Reconstitution and structure of a plant NLR resistosome conferring immunity. Science. 2019 Apr 5;364(6435):eaav5870. doi: 10.1126/science.aav5870. PMID: 30948527.