More careful effort has been made to normalize alternative names (typically shown in parentheses). Accessions correspond to the IDs from the so-called Cummings collection as shown in the Glydin spreadsheet. As we add more GlyGen motifs, these IDs will diverge. Some motif accessions also share the same GlyTouCan accession. We may collapse these at some point.
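As an illustration of how motif accessions sharing a GlyTouCan accession could be found before collapsing them, here is a minimal sketch. It assumes the collection is available as (motif accession, GlyTouCan accession) pairs; the function name `group_by_glytoucan` and all accession values in the example are hypothetical placeholders, not real entries.

```python
from collections import defaultdict


def group_by_glytoucan(rows):
    """Return {GlyTouCan accession: [motif accessions]} for shared accessions only.

    `rows` is an iterable of (motif_accession, glytoucan_accession) pairs.
    """
    groups = defaultdict(list)
    for motif_acc, gtc_acc in rows:
        groups[gtc_acc].append(motif_acc)
    # Keep only GlyTouCan accessions that occur under more than one motif.
    return {gtc: sorted(accs) for gtc, accs in groups.items() if len(accs) > 1}


# Hypothetical placeholder data, for illustration only.
rows = [
    ("GGM.900001", "GTC_PLACEHOLDER_A"),
    ("GGM.900002", "GTC_PLACEHOLDER_B"),
    ("GGM.900003", "GTC_PLACEHOLDER_A"),
]
duplicates = group_by_glytoucan(rows)
for gtc_acc, motif_accs in duplicates.items():
    print(gtc_acc, motif_accs)
```

Which motif accession to keep when collapsing a group is a curation decision that this sketch does not attempt to make.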
Note that motif 000048 has been removed: either its description or its IUPAC sequence was wrong, and both possible resolutions of this row of the table (to Lewis b or to Lewis y) were already represented.
Note that motif GGM.000046 should also have a Lewis y counterpart (not present in the Cummings paper), and it should be type 2 Lewis y; see Blood group A.
Note that motif 000023 has been removed: its description indicates that the sialic acid Neu5Ac is cyclic, but we have no way to represent this correctly in GlycoCT. See PMID: 9990070. The same motif without the cyclic descriptor is already represented by GGM.000027.
Motif 000052 has been removed. The IUPAC sequence in the original Cummings paper indicates a repeating unit (at least two sialic acids) that was lost when the IUPAC sequence was transliterated to GlycoCT, WURCS, and a GlyTouCan accession (see the source entry GDC.000052). We may add this back later if we can correctly handle this type of repeating structure.
Original motif 000052 has been replaced with GGM.000252. The IUPAC sequence specified by Cummings, NeuAca2-8(NeuAca2-8)n NeuAca2-3Galb1-4GlcNAcb1-R, indicates three or more sialic acid residues, and given our current algorithm for motif matching of structures or motifs with repeats, the same structures will be matched.
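To make the "three or more sialic acid residues" reading concrete, the following minimal sketch expands the repeat group in the IUPAC string and counts NeuAc residues. It assumes that the repeat count n is at least 1 and that a repeat is written as a single `(unit)n` group; the helper names `expand_repeat` and `count_residue` are hypothetical, and this is plain string handling, not the motif-matching algorithm referred to above.

```python
import re

# Matches one "(unit)n" repeat group plus any whitespace that follows it.
REPEAT = re.compile(r"\(([^()]+)\)n\s*")


def expand_repeat(iupac: str, n: int) -> str:
    """Replace the first '(unit)n' group with n literal copies of the unit."""
    if n < 0:
        raise ValueError("repeat count must be non-negative")
    return REPEAT.sub(lambda m: m.group(1) * n, iupac, count=1)


def count_residue(iupac: str, residue: str = "NeuAc") -> int:
    """Count occurrences of a residue name in an expanded IUPAC string."""
    return iupac.count(residue)


seq = "NeuAca2-8(NeuAca2-8)n NeuAca2-3Galb1-4GlcNAcb1-R"
for n in (1, 2, 3):
    expanded = expand_repeat(seq, n)
    print(n, count_residue(expanded), expanded)
```

Under the assumption n >= 1, the smallest expansion has three NeuAc residues, and each further repeat adds one more.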