Substructure Alignment

Introduction

Glycan substructure alignment requires that a motif structure align with at least one connected substructure (subtree) of a glycan. The motif-aligned substructure is not constrained to the reducing end or non-reducing end of the glycan. Aligned motif and glycan monosaccharides and glycosidic linkages must respect the matching rules outlined below. A substructure alignment is strict when all resolutions of missing and undetermined glycan details are consistent with the motif. A non-strict alignment requires only that at least one resolution of missing and undetermined glycan details be consistent with the motif. Every monosaccharide and glycosidic bond comparison must be consistent with strict alignment for the alignment as a whole to be considered strict.

In addition, non-strict alignments permit additional phosphate and sulfate substituents on glycan monosaccharides, even when the motif does not require them. When ambiguity in glycan topology permits at least one motif-to-substructure alignment, the motif and glycan are considered to have a non-strict alignment. Strict motif alignments that fall within the determined-topology portion of an undetermined-topology glycan are still considered strict.

Substructure alignment is the default alignment strategy for motifs that do not have a reducing-end annotation of true.
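The search described in this introduction can be sketched in Python. This is a minimal illustration under stated assumptions, not GlycoMotif's implementation: the `Node` class, the function names, and the toy labels such as "GlcNAc" and "b1-4" are all hypothetical, and the sketch ignores undetermined glycan topology. The comparison functions `node_ok` and `link_ok` are supplied by the caller, so the same search can be run once with strict comparisons and once with non-strict ones.

```python
from dataclasses import dataclass, field
from itertools import permutations

@dataclass
class Node:
    """One monosaccharide; children are (linkage, Node) pairs (hypothetical model)."""
    label: str
    children: list = field(default_factory=list)

def embeds(motif, glycan, node_ok, link_ok):
    """True if the motif tree fits with its root placed on this glycan node."""
    if not node_ok(motif.label, glycan.label):
        return False
    # Each motif child needs its own glycan child; unmatched glycan children are allowed.
    for chosen in permutations(glycan.children, len(motif.children)):
        if all(link_ok(m_link, g_link) and embeds(m_child, g_child, node_ok, link_ok)
               for (m_link, m_child), (g_link, g_child) in zip(motif.children, chosen)):
            return True
    return False

def all_nodes(root):
    """Yield every node of a tree, root first."""
    yield root
    for _, child in root.children:
        yield from all_nodes(child)

def substructure_aligns(motif, glycan, node_ok, link_ok):
    """True if the motif fits anywhere in the glycan, not only at its root."""
    return any(embeds(motif, node, node_ok, link_ok) for node in all_nodes(glycan))

def same(a, b):
    """Toy comparison: labels must be equal."""
    return a == b

motif = Node("GlcNAc", [("b1-4", Node("Gal"))])
glycan = Node("Man", [("b1-2", Node("GlcNAc", [("a1-3", Node("Fuc")),
                                                ("b1-4", Node("Gal"))]))])
print(substructure_aligns(motif, glycan, same, same))  # True
```

Trying every assignment of motif children to glycan children is exponential in the worst case, which is acceptable for a sketch because monosaccharides have only a few children.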

Monosaccharide Comparison

Each specific characteristic (anomeric configuration, etc.) of the glycan and motif monosaccharides is considered to be a set. A missing characteristic represents the set containing all possible values. For a strict alignment, each glycan monosaccharide characteristic must be contained in (that is, be a subset of) the corresponding motif monosaccharide characteristic. For a non-strict alignment, each glycan monosaccharide characteristic must have a non-empty intersection with the corresponding motif monosaccharide characteristic.
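The set-based rule can be illustrated with a short Python sketch for a single characteristic. The names `ANOMERS`, `as_set`, and `compare` are hypothetical, and `None` is assumed to stand for a missing characteristic.

```python
# Assumed universe of values for one characteristic (anomeric configuration).
ANOMERS = frozenset({"a", "b"})

def as_set(value, universe):
    """A missing characteristic (None) stands for every possible value."""
    return universe if value is None else frozenset({value})

def compare(motif_value, glycan_value, universe):
    """Return 'strict', 'non-strict', or None (no alignment) for one characteristic."""
    motif_set = as_set(motif_value, universe)
    glycan_set = as_set(glycan_value, universe)
    if glycan_set <= motif_set:   # glycan values are contained in the motif values
        return "strict"
    if glycan_set & motif_set:    # non-empty intersection
        return "non-strict"
    return None

print(compare(None, "a", ANOMERS))  # motif missing, glycan a: strict
print(compare("b", None, ANOMERS))  # motif b, glycan missing: non-strict
print(compare("b", "a", ANOMERS))   # motif b, glycan a: None
```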

Since monosaccharide characteristics either have a single value or are missing, we can simplify the above rules as follows. For any alignment, strict or non-strict, aligned motif and glycan monosaccharides must:

1. have the same number of carbon atoms (HEX, PENT, etc.); and

2. have the same stem-type (Man, Glc, etc.), if the motif monosaccharide stem-type is specified; and

3. have the same orientation (D-type, L-type), if the motif monosaccharide orientation is specified.

For a strict alignment, aligned motif and glycan monosaccharides must also:

4a. have the same ring information (1:5, etc.), if the motif monosaccharide ring information is specified; and

5a. have the same anomeric configuration (α, β), if the motif monosaccharide anomeric configuration is specified.

For a non-strict alignment, aligned motif and glycan monosaccharides must also:

4b. have the same ring information (1:5, etc.), if both the glycan and motif monosaccharide ring information are specified; and

5b. have the same anomeric configuration (α, β), if both the glycan and motif monosaccharide anomeric configurations are specified.

Motif monosaccharide: b-dglc-HEX-1:5
Structure monosaccharide: b-dglc-HEX-1:5
Alignment: Yes, strict alignment. The monosaccharides are identical.

Motif monosaccharide: x-dglc-HEX-1:5
Structure monosaccharide: a-dglc-HEX-1:5
Alignment: Yes, strict alignment. All monosaccharide properties are the same, except for anomeric configuration. The motif monosaccharide anomeric configuration contains the glycan monosaccharide's value.

Motif monosaccharide: b-dglc-HEX-1:5
Structure monosaccharide: x-dglc-HEX-1:5
Alignment: Yes, non-strict alignment. The motif monosaccharide anomeric configuration is specified and the glycan monosaccharide anomeric configuration includes this value.

Motif monosaccharide: b-dglc-HEX-1:5
Structure monosaccharide: a-dglc-HEX-1:5
Alignment: No. Both the motif and glycan monosaccharide anomeric configurations are specified and they are not the same.

Motif monosaccharide: x-dglc-HEX-x:x
Structure monosaccharide: a-dglc-HEX-1:5
Alignment: Yes, strict alignment. Most monosaccharide properties are the same, with the exception of anomeric configuration and the ring information. The motif monosaccharide anomeric configuration and ring information are not specified, so the glycan monosaccharide characteristics are contained in the motif monosaccharide characteristics.
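Rules 1 to 5 can be applied directly to the notation used in the examples above. The following Python sketch is illustrative only: `Mono`, `parse_mono`, and `compare_mono` are hypothetical names, the parser assumes at most one stem field, and it treats `x` and `x:x` as missing values.

```python
from typing import NamedTuple, Optional

class Mono(NamedTuple):
    anomer: Optional[str]       # 'a', 'b', or None when written as 'x'
    orientation: Optional[str]  # 'd', 'l', or None when written as 'x' or absent
    stem: Optional[str]         # 'glc', 'gal', ... or None when absent
    superclass: str             # 'HEX', 'PEN', ...
    ring: Optional[str]         # '1:5', ... or None when written as 'x:x'

def parse_mono(text):
    """Parse the base part of e.g. 'b-dglc-HEX-1:5'; anything after '|' is ignored."""
    fields = text.split("|")[0].split("-")
    stem_field = fields[1] if len(fields) == 4 else ""
    anomer = None if fields[0] == "x" else fields[0]
    orientation = None if stem_field[:1] in ("", "x") else stem_field[0]
    stem = stem_field[1:] or None
    ring = None if fields[-1] == "x:x" else fields[-1]
    return Mono(anomer, orientation, stem, fields[-2], ring)

def compare_mono(motif_text, glycan_text):
    """Return 'strict', 'non-strict', or None (no alignment) under rules 1 to 5."""
    motif, glycan = parse_mono(motif_text), parse_mono(glycan_text)
    # Rules 1-3 apply to every alignment.
    if motif.superclass != glycan.superclass:
        return None
    if motif.stem is not None and motif.stem != glycan.stem:
        return None
    if motif.orientation is not None and motif.orientation != glycan.orientation:
        return None
    strict = True
    for m, g in ((motif.ring, glycan.ring), (motif.anomer, glycan.anomer)):
        if m is None or m == g:
            continue          # satisfies rules 4a/5a
        if g is None:
            strict = False    # satisfies only rules 4b/5b
        else:
            return None       # both specified and different
    return "strict" if strict else "non-strict"

print(compare_mono("x-dglc-HEX-1:5", "a-dglc-HEX-1:5"))  # strict
print(compare_mono("b-dglc-HEX-1:5", "x-dglc-HEX-1:5"))  # non-strict
print(compare_mono("b-dglc-HEX-1:5", "a-dglc-HEX-1:5"))  # None
```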

For a strict alignment, aligned motif and glycan monosaccharides must:

6a. have the same modifications (deoxygenation, carbonyl, etc.) at the same position.

For a non-strict alignment, aligned motif and glycan monosaccharides must:

6b. have the same modifications (deoxygenation, carbonyl, etc.) at the same position, except for glycan reducing-end alditol.

Motif monosaccharide: b-dglc-HEX-1:5
Structure monosaccharide: b-dglc-HEX-1:5
Alignment: Yes, strict alignment. The monosaccharides are identical; neither has any modifications.

Motif monosaccharide: b-dglc-HEX-1:5|6:d
Structure monosaccharide: b-dglc-HEX-1:5|6:d
Alignment: Yes, strict alignment. The base monosaccharides are the same, with the same modification at the same position.

Motif monosaccharide: b-dglc-HEX-1:5|3:d
Structure monosaccharide: b-dglc-HEX-1:5|6:d
Alignment: No. The base monosaccharides are the same, but the modification is not in the same position.

Motif monosaccharide: b-dglc-HEX-1:5|6:d
Structure monosaccharide: b-dglc-HEX-1:5|6:a
Alignment: No. The base monosaccharides are the same, but the modifications are different.

Motif monosaccharide: b-dglc-HEX-1:5|3:d
Structure monosaccharide: b-dglc-HEX-1:5|3:d|6:d
Alignment: No. The base monosaccharides are the same, but the glycan monosaccharide has an additional modification.

Motif monosaccharide: b-dglc-HEX-1:5|1:aldi
Structure monosaccharide: b-dglc-HEX-1:5|1:aldi
Alignment: Yes, strict alignment. The base monosaccharides are the same, with the same modification at the same position.

Motif monosaccharide: b-dglc-HEX-1:5
Structure monosaccharide: b-dglc-HEX-1:5|1:aldi
Alignment: Yes, non-strict alignment. The base monosaccharides are the same, and all modifications except for the alditol modification are the same in the same position. The additional alditol at the reducing end of an uncyclized glycan is permitted, even if not specified on the motif monosaccharide.

Motif monosaccharide: b-dglc-HEX-1:5|1:aldi
Structure monosaccharide: b-dglc-HEX-1:5
Alignment: No. The base monosaccharides are the same, but the motif monosaccharide has an additional modification missing from the glycan monosaccharide.
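Rules 6a and 6b can be sketched as a comparison of modification sets. This Python fragment is a hedged illustration that compares modifications only (the base monosaccharide would be checked separately); `parse_mods`, `compare_mods`, and the `glycan_at_reducing_end` flag are hypothetical, and the parser assumes modifications are the `|`-separated fields before any `||`.

```python
def parse_mods(text):
    """Return the set of 'position:type' modifications, e.g. {'6:d'}."""
    base_and_mods = text.split("||")[0]      # drop substituents, if any
    return frozenset(base_and_mods.split("|")[1:])

def compare_mods(motif_text, glycan_text, glycan_at_reducing_end=True):
    """Return 'strict', 'non-strict', or None (no alignment) for modifications."""
    motif_mods, glycan_mods = parse_mods(motif_text), parse_mods(glycan_text)
    if motif_mods == glycan_mods:
        return "strict"                      # rule 6a
    missing = motif_mods - glycan_mods       # required by the motif, absent in the glycan
    extra = glycan_mods - motif_mods         # present only in the glycan
    only_alditol = all(mod.endswith(":aldi") for mod in extra)
    if not missing and only_alditol and glycan_at_reducing_end:
        return "non-strict"                  # rule 6b
    return None

print(compare_mods("b-dglc-HEX-1:5|6:d", "b-dglc-HEX-1:5|6:d"))   # strict
print(compare_mods("b-dglc-HEX-1:5", "b-dglc-HEX-1:5|1:aldi"))    # non-strict
print(compare_mods("b-dglc-HEX-1:5|1:aldi", "b-dglc-HEX-1:5"))    # None
```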

For a strict alignment, aligned motif and glycan monosaccharides must:

7a. have the same substituents (N-acetyl, phosphate, sulfate, etc.) with the same linkage.

For a non-strict alignment, aligned motif and glycan monosaccharides must:

7b. have the same substituents (N-acetyl, phosphate, sulfate, etc.) with the same linkage, with additional glycan phosphate and sulfate substituents permitted.

Motif monosaccharide: b-dglc-HEX-1:5
Structure monosaccharide: b-dglc-HEX-1:5
Alignment: Yes, strict alignment. The monosaccharides are identical; neither has any substituents.

Motif monosaccharide: a-dgal-HEX-1:5||(2d:1)n-acetyl
Structure monosaccharide: a-dgal-HEX-1:5||(2d:1)n-acetyl
Alignment: Yes, strict alignment. The base monosaccharides are the same, with the same substituent with the same linkage.

Motif monosaccharide: x-dgal-HEX-1:5||(2d:1)n-acetyl
Structure monosaccharide: a-dgal-HEX-1:5||(2d:1)n-acetyl
Alignment: Yes, strict alignment. The base monosaccharides are the same, except for anomeric configuration, which is missing for the motif monosaccharide. Each monosaccharide has the same substituent with the same linkage.

Motif monosaccharide: a-dgal-HEX-1:5||(4d:1)n-acetyl
Structure monosaccharide: a-dgal-HEX-1:5||(2d:1)n-acetyl
Alignment: No. The base monosaccharides are the same, with the same substituent, but with different linkage.

Motif monosaccharide: a-dgal-HEX-1:5||(2d:1)n-acetyl
Structure monosaccharide: a-dgal-HEX-1:5||(2d:1)n-acetyl|(6o:1)phosphate
Alignment: Yes, non-strict alignment. The base monosaccharides are the same, and all substituents except for the phosphate are the same with the same linkage. The additional glycan phosphate is permitted, even if not indicated on the motif monosaccharide.

Motif monosaccharide: a-dgal-HEX-1:5||(2d:1)n-acetyl
Structure monosaccharide: a-dgal-HEX-1:5||(2d:1)n-acetyl|(6o:1)sulfate
Alignment: Yes, non-strict alignment. The base monosaccharides are the same, and all substituents except for the sulfate are the same with the same linkage. The additional glycan sulfate is permitted, even if not indicated on the motif monosaccharide.

Motif monosaccharide: a-dgal-HEX-1:5||(2d:1)n-acetyl|(6o:1)sulfate
Structure monosaccharide: a-dgal-HEX-1:5||(2d:1)n-acetyl
Alignment: No. The base monosaccharides are the same, but the motif monosaccharide has an additional substituent missing from the glycan monosaccharide.

Motif monosaccharide: a-dgal-HEX-1:5
Structure monosaccharide: a-dgal-HEX-1:5||(2d:1)n-acetyl
Alignment: No. The base monosaccharides are the same, but the glycan monosaccharide has an additional substituent other than phosphate or sulfate.
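Rules 7a and 7b follow the same pattern for substituents. The Python sketch below is illustrative and compares substituents only; `TOLERATED_EXTRAS`, `parse_substituents`, and `compare_substituents` are hypothetical names, and the parser assumes substituents are the `|`-separated fields after `||`, each written as linkage plus name, as in `(2d:1)n-acetyl`.

```python
# Extra glycan substituents that a non-strict alignment tolerates (rule 7b).
TOLERATED_EXTRAS = ("phosphate", "sulfate")

def parse_substituents(text):
    """Return the set of substituents, each kept as linkage plus name."""
    parts = text.split("||", 1)
    if len(parts) == 1 or not parts[1]:
        return frozenset()
    return frozenset(parts[1].split("|"))

def compare_substituents(motif_text, glycan_text):
    """Return 'strict', 'non-strict', or None (no alignment) for substituents."""
    motif_subs = parse_substituents(motif_text)
    glycan_subs = parse_substituents(glycan_text)
    if motif_subs == glycan_subs:
        return "strict"                  # rule 7a
    if motif_subs - glycan_subs:
        return None                      # a motif substituent is absent from the glycan
    extra = glycan_subs - motif_subs
    if all(sub.endswith(TOLERATED_EXTRAS) for sub in extra):
        return "non-strict"              # rule 7b
    return None

motif = "a-dgal-HEX-1:5||(2d:1)n-acetyl"
print(compare_substituents(motif, "a-dgal-HEX-1:5||(2d:1)n-acetyl"))                  # strict
print(compare_substituents(motif, "a-dgal-HEX-1:5||(2d:1)n-acetyl|(6o:1)phosphate"))  # non-strict
print(compare_substituents("a-dgal-HEX-1:5", motif))                                  # None
```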

Glycosidic Bond Comparison

Each specific characteristic of the aligned glycosidic bonds attaching a child monosaccharide to a parent monosaccharide is considered to be a set. A missing characteristic represents the set containing all possible values. For a strict alignment, each glycan glycosidic bond characteristic must be contained in (that is, be a subset of) the corresponding motif glycosidic bond characteristic. For a non-strict alignment, each glycan glycosidic bond characteristic must have a non-empty intersection with the corresponding motif glycosidic bond characteristic.

Since linkage type and child linkage position typically are either specified with a single value or are missing, we can simplify the above statements as follows:

For a strict alignment, aligned motif and glycan glycosidic bonds attaching a child monosaccharide to a parent monosaccharide must:

1a. have the same child linkage type (n,d,...), if the motif glycosidic bond child linkage type is specified; and

2a. have the same child linkage position (1,2,...), if the motif glycosidic bond child linkage position is specified; and

3a. have the same parent linkage type (d,o,...), if the motif glycosidic bond parent linkage type is specified; and

4a. have glycan parent linkage positions (2,3,4,5,6,...) that are a subset of the motif glycosidic bond parent linkage positions.

For a non-strict alignment, aligned motif and glycan glycosidic bonds attaching a child monosaccharide to a parent monosaccharide must:

1b. have the same child linkage type (n,d,...), if the motif and glycan glycosidic bond child linkage types are both specified; and

2b. have the same child linkage position (1,2,...), if the motif and glycan glycosidic bond child linkage positions are both specified; and

3b. have the same parent linkage type (d,o,...), if the motif and glycan glycosidic bond parent linkage types are both specified; and

4b. have glycan parent linkage positions (2,3,4,5,6,...) that have a non-empty intersection with the motif glycosidic bond parent linkage positions.

Motif linkage: o(3+1)d
Structure linkage: o(3+1)d
Alignment: Yes, strict alignment. Identical linkages.

Motif linkage: o(-1+1)d
Structure linkage: o(3+1)d
Alignment: Yes, strict alignment. All linkage properties specified in the motif linkage contain the properties in the glycan linkage.

Motif linkage: o(3+1)d
Structure linkage: o(3+-1)d
Alignment: Yes, non-strict alignment. Glycan child linkage positions have non-empty intersection with motif child linkage positions.

Motif linkage: o(2+1)d
Structure linkage: o(2|4+1)d
Alignment: Yes, non-strict alignment. Glycan parent linkage positions have non-empty intersection with motif parent linkage positions.

Motif linkage: o(2+1)d
Structure linkage: o(-1+1)d
Alignment: Yes, non-strict alignment. Glycan parent linkage positions have non-empty intersection with motif parent linkage positions.

Motif linkage: o(4|6+1)d
Structure linkage: o(2|4+1)d
Alignment: Yes, non-strict alignment. Glycan parent linkage positions have non-empty intersection with motif parent linkage positions.

Motif linkage: o(-1+1)d
Structure linkage: o(2|4+1)d
Alignment: Yes, strict alignment. Glycan parent linkage positions are contained in the motif parent linkage positions.
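The linkage rules can be checked against the examples above with a small Python sketch. It is illustrative only: `parse_link`, `compare_sets`, and `compare_link` are hypothetical names, `-1` is read as a missing position, and reading a linkage type of `x` as missing is an assumption not stated in this page.

```python
import re

# parent type, parent position(s), child position(s), child type, e.g. 'o(2|4+1)d'
LINK = re.compile(r"^([a-z])\(([-\d|]+)\+([-\d|]+)\)([a-z])$")

def parse_link(text):
    """Return four sets (or None where unspecified) for a linkage string."""
    parent_type, parent_pos, child_pos, child_type = LINK.match(text).groups()

    def positions(field):
        values = frozenset(int(p) for p in field.split("|"))
        return None if -1 in values else values   # -1 marks an unknown position

    def kind(field):
        return None if field == "x" else frozenset({field})  # 'x' assumed unknown

    return kind(parent_type), positions(parent_pos), positions(child_pos), kind(child_type)

def compare_sets(motif, glycan):
    """Compare one characteristic; None stands for the set of all values."""
    if motif is None:
        return "strict"        # every glycan value is contained in the motif set
    if glycan is None:
        return "non-strict"    # glycan unresolved; one resolution can match
    if glycan <= motif:
        return "strict"
    if glycan & motif:
        return "non-strict"
    return None

def compare_link(motif_text, glycan_text):
    """Return 'strict', 'non-strict', or None (no alignment) for a linkage pair."""
    results = [compare_sets(m, g)
               for m, g in zip(parse_link(motif_text), parse_link(glycan_text))]
    if None in results:
        return None
    return "strict" if all(r == "strict" for r in results) else "non-strict"

print(compare_link("o(-1+1)d", "o(2|4+1)d"))   # strict
print(compare_link("o(4|6+1)d", "o(2|4+1)d"))  # non-strict
print(compare_link("o(3+1)d", "o(6+1)d"))      # None
```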

Examples

Motif: GM.G00047MO
Glycan: G00047MO
Substructure alignment: Yes, strict alignment. A motif always aligns to itself.

Motif: GM.G00047MO
Glycan: G00048MO
Substructure alignment: Yes, strict alignment. Motif substructure alignment at the reducing end of the glycan. Motif reducing-end anomeric configuration contains the glycan reducing-end anomeric configuration.

Motif: GM.G00047MO
Glycan: G68835PZ
Substructure alignment: Yes, non-strict alignment. Motif substructure alignment at the reducing end of the glycan. All missing or undetermined details have at least one resolution consistent with the motif.

Motif: GM.G00047MO
Glycan: G12614SU
Substructure alignment: Yes, non-strict alignment. Motif substructure alignment at the reducing end of either arm. The undetermined attachment sites of the Fucose include the arm GlcNAc residues, and all other monosaccharide and linkage details of the corresponding glycan substructure match. Motif aligns outside of the determined topology core.

Motif: GM.G00047MO
Glycan: G37497LY
Substructure alignment: Yes, non-strict alignment. Non-strict substructure alignment tolerates additional sulfate substituents not specified by the motif.

Motif: GM.G00046MO
Glycan: G18740DM
Substructure alignment: Yes, non-strict alignment. Non-strict substructure alignment tolerates an additional alditol modification not specified by the motif at the reducing end of the glycan. Note that the motif also does not specify its reducing-end ring information (x:x), which permits it to match the uncyclized ring information (0:0) of the glycan.

Motif: GM.G00046MO
Glycan: G61666PA
Substructure alignment: Yes, non-strict alignment. All missing or undetermined details have at least one resolution consistent with the motif.

Motif: GM.G00046MO
Glycan: G78694IK
Substructure alignment: No. Motif parent linkage position (3) does not match glycan parent linkage position (6).

Motif: GM.G00046MO
Glycan: G04801PW
Substructure alignment: No. Motif monosaccharide anomeric configuration (α) does not match glycan monosaccharide anomeric configuration (β).

See Also

Glycan Core Alignment, Nonreducing-End Alignment, Whole-Glycan Alignment.