First, we need to choose the sequences for the alignment. We type the name of the protein (BchD) into the UniProt search window, which takes us to a list of protein sequences from different organisms.
The site shows a large number of sequences. On the left, under “Status,” you may notice that the entries are divided into “Reviewed” and “Unreviewed” sequences. This is one of my favorite features: when there is a sufficient number of reviewed sequences, I usually choose them for further analysis. These sequences have been verified to be what we expect them to be. Many automatically annotated sequences are among the Unreviewed entries, and these sometimes contain assignment errors. The panel on the left offers other helpful filters as well, such as the availability of a 3D structure, catalytic activity, etc.
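The same search can also be expressed as a URL for the UniProt REST service. The sketch below only builds the URL; the `protein_name` and `reviewed` query fields, the `size` parameter, and the helper name `build_search_url` are assumptions for illustration and should be checked against the current UniProt API documentation before use.

```python
from urllib.parse import urlencode

# Base address of the UniProtKB search endpoint (assumed here).
UNIPROT_SEARCH = "https://rest.uniprot.org/uniprotkb/search"


def build_search_url(protein_name, reviewed_only=True, fmt="fasta", size=25):
    """Return a UniProtKB search URL for a protein name.

    reviewed_only mirrors the "Reviewed" status filter on the web page.
    """
    query = f"protein_name:{protein_name}"
    if reviewed_only:
        query += " AND reviewed:true"
    params = {"query": query, "format": fmt, "size": size}
    return f"{UNIPROT_SEARCH}?{urlencode(params)}"


# Hypothetical usage: reviewed BchD entries in FASTA format.
url = build_search_url("BchD")
print(url)
```

Opening the printed URL in a browser, or passing it to `urllib.request.urlopen`, should return the matching entries, provided the query fields are valid for the current API version.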
We will choose BCHD_RHOCB (entry P26175), the BchD subunit from Rhodobacter capsulatus. In plants, the homologous subunit is called ChlD; the "B" in BchD indicates that the protein is involved in bacteriochlorophyll synthesis, while ChlD is involved in chlorophyll synthesis.
On the page that opens after clicking the entry ID P26175, we will find detailed information on magnesium chelatase: its biological function (photosynthesis, magnesium chelatase activity), the type of ligands/substrate it binds, its catalytic function (insertion of magnesium and ATP hydrolysis), links to publications, links to entries for this protein in other databases, and, of course, the amino acid sequence of BchD.
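For later analysis it is convenient to have the sequence as a plain string. The following is a minimal sketch, assuming the entry can be downloaded in FASTA format from `https://rest.uniprot.org/uniprotkb/<accession>.fasta`; the helper names `fasta_url`, `parse_fasta`, and `fetch_sequence` are made up for this example.

```python
from urllib.request import urlopen


def fasta_url(accession):
    """Build the (assumed) FASTA download URL for a UniProtKB accession."""
    return f"https://rest.uniprot.org/uniprotkb/{accession}.fasta"


def parse_fasta(text):
    """Parse FASTA text into a dict mapping header -> sequence string."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line)
    # Join the wrapped sequence lines of each record.
    return {h: "".join(parts) for h, parts in records.items()}


def fetch_sequence(accession, timeout=30):
    """Download one entry and return its sequence (needs network access)."""
    with urlopen(fasta_url(accession), timeout=timeout) as response:
        text = response.read().decode("utf-8")
    return next(iter(parse_fasta(text).values()))


# Hypothetical usage (not run here because it needs a network connection):
# bchd = fetch_sequence("P26175")
# print(len(bchd))
```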
Clicking "Family & Domains" in the menu on the left of the UniProt page shows the domain content of BchD. In this case, however, we only get the vWFA domain (von Willebrand Factor A-like domain superfamily) at the C-terminal end of the protein (residues 379-559). The InterPro database link, on the other hand, directs us to the BCHD_RHOCB page, where we find a more detailed analysis of the domain composition of the protein. Apart from the vWFA domain, InterPro also identifies a P-loop NTPase family domain in the N-terminal part of the sequence, residues 78-235. The characteristic P-loop sequence motif, which is [AG]-x(4)-G-K-[ST] according to the Prosite database, is not conserved in R. capsulatus BchD; we will look at this more closely when we analyze the sequence alignment. There is also a link to the AlphaFold predicted structure of the subunit, where we can explore the two domains and see their proposed arrangement in three dimensions.
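A Prosite pattern such as [AG]-x(4)-G-K-[ST] can be translated into a regular expression and used to scan any sequence for the motif. The sketch below handles only the simple pattern elements needed here (single residues, `x`, `[...]`, `{...}`, and repeat counts), not the full Prosite syntax; the function names and the toy sequence are invented for illustration.

```python
import re

# One PROSITE element: x, a residue, [allowed] or {forbidden}, optional (n) or (n,m).
ELEMENT = re.compile(r"^(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?$")


def prosite_to_regex(pattern):
    """Translate a simple PROSITE pattern into a Python regex string."""
    parts = []
    for element in pattern.split("-"):
        match = ELEMENT.match(element)
        if match is None:
            raise ValueError(f"unsupported PROSITE element: {element!r}")
        core, low, high = match.groups()
        if core == "x":
            core = "."                          # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"      # any residue except these
        if low is not None:
            core += "{" + low + ("," + high if high else "") + "}"
        parts.append(core)
    return "".join(parts)


def find_motif(sequence, pattern):
    """Return (1-based start, matched text) for every, possibly overlapping, hit."""
    compiled = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in compiled.finditer(sequence)]


P_LOOP = "[AG]-x(4)-G-K-[ST]"
print(prosite_to_regex(P_LOOP))              # [AG].{4}GK[ST]
print(find_motif("MMGPPGTGKTLL", P_LOOP))    # toy sequence, not BchD
```

Running `find_motif` on the real BchD sequence would be one way to confirm that the canonical motif is absent from the P-loop NTPase domain region.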
A characteristic feature of the vWFA domain is the conservation of the so-called MIDAS motif. MIDAS stands for metal ion-dependent adhesion site (see the paper by Lacy et al. 2004 for a detailed description of the structure). The conserved sequence consists of the DXSXS motif and additional threonine and aspartate (T and D) residues. The DXSXS motif residues in BchD are D385, S387, and S389, which are close to the N-terminus of the vWFA domain, while threonine T452 and aspartate D482 are located further along the sequence. The alignment in the image below (click to see it; it can also be found at the Conserved Protein Domain Family server at NCBI) shows the positions of these invariant residues (yellow boxes, also marked by a # on top of the alignment). These residues are involved in binding a metal ion, Ca²⁺ or Mg²⁺. We still do not know the exact function of this domain within the magnesium chelatase complex.
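The residue positions quoted above can be checked against a downloaded sequence with a few lines of code. This is a sketch only: the helper names are hypothetical, the positions are taken from the text, and the short sequence used in the demonstration is a made-up toy string, not BchD.

```python
import re

# MIDAS residues of R. capsulatus BchD as listed in the text (1-based positions).
MIDAS_BCHD = {385: "D", 387: "S", 389: "S", 452: "T", 482: "D"}


def check_residues(sequence, expected):
    """Return {position: True/False} for each expected residue (1-based)."""
    result = {}
    for position, residue in expected.items():
        in_range = 1 <= position <= len(sequence)
        result[position] = in_range and sequence[position - 1] == residue
    return result


def find_dxsxs(sequence):
    """Return 1-based start positions of every D-x-S-x-S occurrence."""
    return [m.start() + 1 for m in re.finditer(r"(?=D.S.S)", sequence)]


# Toy demonstration on an invented sequence.
toy = "AADGSASKK"
print(find_dxsxs(toy))                                # [3]
print(check_residues(toy, {3: "D", 5: "S", 7: "S"}))

# With the real sequence in a variable `bchd`, one would call:
# check_residues(bchd, MIDAS_BCHD)
```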