The "_" (LOW LINE) is used for compatibility with filesystems that do not differentiate between uppercase and lowercase letters. It is used in this way: all uppercase letters are followed by "_" to differentiate the filenames.
Hence "aa.png" is different from "A_A_.png" which is different from "aA_.png" and so on.
For transmission and storage of codes, the "_" is not needed, but for filenames (html pages, png files...) it is necessary.
Hence all MG encoded strings will be "escaped" with "_" and "unescaped" removing it, as needed.
When parsing MG codes, " " and "_" are eliminated.
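The escaping rule above can be sketched in a few lines of Python. This is only an illustration under the assumption that "uppercase letters" means the ASCII letters A-Z; the function names mg_escape and mg_unescape are hypothetical and do not come from the MG tools.

```python
def mg_escape(code: str) -> str:
    """Append "_" after every uppercase ASCII letter (filename-safe form)."""
    return "".join(ch + "_" if "A" <= ch <= "Z" else ch for ch in code)


def mg_unescape(code: str) -> str:
    """Drop " " and "_", which carry no meaning when parsing MG codes."""
    return code.replace(" ", "").replace("_", "")


if __name__ == "__main__":
    print(mg_escape("@baH@kQC@bbt"))
    print(mg_unescape("@b aH @k QC @b bt"))
```

With this rule "aA" becomes "aA_" and "AA" becomes "A_A_", so the two names stay distinct even on a filesystem that ignores case.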
There are hence 4160 (64*64 + 1*64) possible combinations of the "glyph subset alphabet" to encode a maximum of 4160 single glyphs. We don't expect to reach this maximum number, and instead we plan to keep the number of single glyphs around 2000.
The first symbol (of the two that specify a glyph) indicates the category that the glyph belongs to.
Hence "ja" and "ji" are glyphs in the same category ("numerals").
The symbols from the "special symbols subset alphabet" all have a meaning that affects parsing, because they are used to specify composites, glyphs shifted into another category, phrases, and so on.
The trivial case: an MG string containing only symbols from the "glyph subset alphabet" can be parsed simply by splitting it into consecutive substrings of length 2. These substrings are the codes that specify the glyphs, and they point directly to the image files (.png).
E.g.: "@baH@kQC@bbt" (equivalent to "@b aH @k QC @b bt" and to "@baH_@kQ_C_@bbt") would encode 6 consecutive glyphs (5 unique) whose images are located in the "l/" directory, with filenames:
"@b.png" "aH_.png" "@k.png" "Q_C_.png" "bt.png"
('l' stands for 'library', short for 'image library').
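The trivial case can be sketched as follows. The helper names glyph_codes and glyph_filename are hypothetical, and the sketch assumes the input holds only symbols from the "glyph subset alphabet" (plus the ignorable " " and "_").

```python
def glyph_codes(mg: str) -> list[str]:
    """Split a plain MG string into consecutive 2-symbol glyph codes."""
    clean = mg.replace(" ", "").replace("_", "")
    if len(clean) % 2:
        raise ValueError("a plain MG string needs an even number of symbols")
    return [clean[i:i + 2] for i in range(0, len(clean), 2)]


def glyph_filename(code: str) -> str:
    """Build the image path in "l/", escaping uppercase letters with "_"."""
    escaped = "".join(ch + "_" if "A" <= ch <= "Z" else ch for ch in code)
    return f"l/{escaped}.png"


if __name__ == "__main__":
    for code in glyph_codes("@baH@kQC@bbt"):
        print(code, glyph_filename(code))
```

Run on the example string, this yields the six codes "@b", "aH", "@k", "QC", "@b", "bt", with "QC" mapped to "l/Q_C_.png".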
Things become slightly more complicated with the special symbols.
The square brackets specify a "phrase" inside the main sentence.
The nested "phrase" needs to be isolated and parsed in the same way as the main sentence. Phrases are commonly used arrangements of glyphs that have a translation associated (and hence an explanation page).
Form: [...]
The standard html display for phrases encloses the glyphs between square brackets ([...]) but other possibilities of displaying can be devised.
E.g.: "[@b@f]" (@b@f means "my, of me", which has its own translation and database entry in the phrases dictionary).
Note: a phrase can contain all other nested structures, i.e. other special symbols, including further "[]" pairs.
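Because phrases can nest, isolating one means finding the "]" that matches its opening "[", not merely the next "]". A minimal sketch follows; the function name extract_phrase is hypothetical, and it assumes that "[" and "]" never occur with any other meaning inside the string.

```python
def extract_phrase(mg: str, start: int) -> tuple[str, int]:
    """Return (inner_codes, index_after_phrase) for the phrase opening at mg[start]."""
    if start >= len(mg) or mg[start] != "[":
        raise ValueError('a phrase must start with "["')
    depth = 0
    for i in range(start, len(mg)):
        if mg[i] == "[":
            depth += 1          # entering a (possibly nested) phrase
        elif mg[i] == "]":
            depth -= 1          # leaving one nesting level
            if depth == 0:
                return mg[start + 1:i], i + 1
    raise ValueError("unbalanced square brackets")


if __name__ == "__main__":
    print(extract_phrase("[@b@f]", 0))
```

The returned inner string can then be handed to the same parser that handles the main sentence.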
The codes between two occurrences of the "+^" combination (PLUS SIGN followed by CIRCUMFLEX ACCENT) specify a number of glyphs forming a "multicomposite" (a composite of more than two glyphs).
Hence ONLY an even number of symbols from the "glyph subset alphabet" should appear between two "+^".
The part inside the special symbols can hence be trivially decomposed in substrings of length 2, each linking to an image.
Form: +^ABCDEF+^, where AB CD and EF are the glyph codes.
The standard html display for composites encloses the glyphs between curly brackets ({...}) but other possibilities of displaying can be devised.
The explanation pages are located in the "x/oi/" subdirectory.
E.g.: "+^xI1eq3+^" (it means "cosmogony")
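A multicomposite token can be decomposed as sketched here. The function name parse_multicomposite is hypothetical, and the input is assumed to be already unescaped (no " " or "_").

```python
def parse_multicomposite(token: str) -> list[str]:
    """Split "+^ABCDEF+^" into its 2-symbol glyph codes."""
    marker = "+^"
    if len(token) < 2 * len(marker) or not (
        token.startswith(marker) and token.endswith(marker)
    ):
        raise ValueError('expected the form "+^...+^"')
    inner = token[len(marker):-len(marker)]
    if not inner or len(inner) % 2:
        raise ValueError('an even, non-zero number of symbols must sit between the "+^"')
    return [inner[i:i + 2] for i in range(0, len(inner), 2)]


if __name__ == "__main__":
    print(parse_multicomposite("+^xI1eq3+^"))
```

For the "cosmogony" example this gives the three codes "xI", "1e" and "q3".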
A double plus sign is followed by three glyph symbols.
They indicate a shift in category, a "reclassification" of a glyph into a new category.
Form: ++XAB, where X is the category code and AB the glyph code.
The category code (also belonging to the "glyph subset alphabet") should be parsed so that the category image is shown. The glyph code is a normal substring of length two.
Category images are located in the "r/" directory ('r' stands for 'radicals').
The standard html display shows them as images having half the size of the other glyphs.
E.g.: "++x6j" (meaning "serene" for weather, with "x" being the "natural world" category)
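A category shift splits into one category symbol and one ordinary glyph code. A small sketch follows; the name parse_category_shift is hypothetical, and the sketch deliberately does not guess the filename of the category image inside "r/".

```python
def parse_category_shift(token: str) -> tuple[str, str]:
    """Split "++XAB" into (category_code, glyph_code)."""
    if len(token) != 5 or not token.startswith("++"):
        raise ValueError('expected the form "++XAB"')
    return token[2], token[3:]


if __name__ == "__main__":
    category, glyph = parse_category_shift("++x6j")
    print(category, glyph)
```

For "++x6j" the category code is "x" and the glyph code is "6j".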
A circumflex accent signals the presence of a composite of the first kind.
This composite has two glyphs coming from two different glyph categories.
It can be parsed very easily: after the "^", 4 symbols from the "glyph subset alphabet" have to be taken and these specify the two glyphs that form the composite.
Form: ^ABCD, where AB and CD are the glyph codes.
The standard html display for composites encloses the glyphs between curly brackets ({...}) but other possibilities of displaying can be devised.
E.g.: ^2kbt (meaning "speak on the phone, call")
A plus sign specifies the second kind of composite glyph.
This composite also has two glyphs, but they come from the same glyph category.
One glyph code symbol (the shared category symbol) is hence redundant and does not appear twice in the encoded string.
To parse: take the three symbols after "+" and combine them so as to produce the two substrings formed by "the first and the second symbol" and "the first and the third symbol", as shown below.
Form: +ABC, where AB and AC are the glyph codes.
The standard html display for composites encloses the glyphs between curly brackets ({...}) but other possibilities of displaying can be devised.
E.g.: "+7nq" (meaning "mediaglyph")
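The expansion of the shared first symbol can be sketched like this. The function name parse_same_category_composite is hypothetical; the check on the second character only keeps "++" and "+^" tokens from being misread as this kind of composite.

```python
def parse_same_category_composite(token: str) -> list[str]:
    """Expand "+ABC" into the two glyph codes "AB" and "AC"."""
    if len(token) != 4 or token[0] != "+" or token[1] in "+^":
        raise ValueError('expected the form "+ABC"')
    a, b, c = token[1:]
    return [a + b, a + c]


if __name__ == "__main__":
    print(parse_same_category_composite("+7nq"))
```

For "+7nq" the two glyph codes are "7n" and "7q", both in category "7".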
The double equals sign combination encloses phonetic names. Hence everything between == is not to be understood in terms of glyph codes but in terms of ascii letters.
UTF7 (UTF-7, an ascii-only encoding of Unicode) is used for phonetic names.
UTF7 encodes Unicode special letters using: [a-z][A-Z][0-9]+-/
Some slight modifications (escaping) of UTF7 are needed for filesystem compatibility, in order to use these strings as filenames.
Form: ==LNG:string==, where LNG is the language code and what follows the ":" is the utf7 escaped form of the original name.
The standard html display is: if there is an image created for the phonetic name (usually the png holds the original form in the original language and the pronunciation, in IPA alphabet, of the name), then display the png image. Otherwise, convert to UTF8 and let the browser display it.
Location for images: the "l/uu/LNG/png" directory, if there is a language code, otherwise the "l/uu/png" directory. No "=" appears in the filenames.
E.g. "==eng:James==" (English language, name of the city of James)
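Splitting a phonetic name and converting it for display can be sketched with Python's built-in "utf-7" codec. The function name parse_phonetic_name is hypothetical. The sketch assumes plain, unmodified UTF-7 (the filesystem-related escaping mentioned above is not specified here and is therefore not handled), and it assumes that a missing language code simply means there is no ":" in the string.

```python
from typing import Optional


def parse_phonetic_name(token: str) -> tuple[Optional[str], str]:
    """Split "==LNG:string==" into (language_code, decoded_name)."""
    if len(token) < 4 or not (token.startswith("==") and token.endswith("==")):
        raise ValueError('expected the form "==LNG:string=="')
    body = token[2:-2]
    lang, sep, name = body.partition(":")
    if not sep:
        # No ":" found: treat the whole body as the name, without language code.
        lang, name = None, body
    # Standard UTF-7 decoding; the result is a normal Unicode string.
    return lang, name.encode("ascii").decode("utf-7")


if __name__ == "__main__":
    print(parse_phonetic_name("==eng:James=="))
```

The decoded string can be written out as UTF8 when no png image exists for the name.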
What is between single equal signs is a name created with glyphs (or combinations of glyphs).
Hence the "=...=" case is equal to the "[...]" case, with the codes inside being treated as a subphrase and parsed accordingly.
Form: =...=
The standard html display for glyphnames encloses the glyphs between equal signs (=...=) but other possibilities of displaying can be devised.
E.g.: "=lp+eqp=" (meaning "The Little Prince", character name and book title: this is a glyph name containing one single glyph and one composite of the second kind)
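The glyph name example mixes a single glyph with a composite, so the parser has to decide at each position which form it is looking at. The sketch below shows one possible dispatch order ("++" and "+^" are tested before the single "+"); it is an assumption, not the reference algorithm. The names tokenize_flat and parse_glyph_name are hypothetical, nested "[...]" and "=...=" structures are left out, and the input is assumed to be already unescaped.

```python
Token = tuple[str, list[str]]


def _take(s: str, start: int, n: int) -> str:
    """Return exactly n symbols from s[start:], or fail on a truncated code."""
    chunk = s[start:start + n]
    if len(chunk) != n:
        raise ValueError(f"truncated code at position {start}")
    return chunk


def tokenize_flat(s: str) -> list[Token]:
    """Tokenize glyphs, both composite kinds, shifts and multicomposites."""
    tokens: list[Token] = []
    i = 0
    while i < len(s):
        if s[i] in "[]=":
            raise ValueError("nested structures are outside this sketch")
        if s.startswith("++", i):                    # ++XAB: category shift
            chunk = _take(s, i + 2, 3)
            tokens.append(("shift", [chunk[0], chunk[1:]]))
            i += 5
        elif s.startswith("+^", i):                  # +^...+^: multicomposite
            end = s.find("+^", i + 2)
            if end == -1 or (end - i) % 2:
                raise ValueError("malformed multicomposite")
            inner = s[i + 2:end]
            tokens.append(("multi", [inner[k:k + 2] for k in range(0, len(inner), 2)]))
            i = end + 2
        elif s[i] == "+":                            # +ABC: same-category composite
            a, b, c = _take(s, i + 1, 3)
            tokens.append(("composite2", [a + b, a + c]))
            i += 4
        elif s[i] == "^":                            # ^ABCD: two-category composite
            chunk = _take(s, i + 1, 4)
            tokens.append(("composite1", [chunk[:2], chunk[2:]]))
            i += 5
        else:                                        # plain 2-symbol glyph code
            tokens.append(("glyph", [_take(s, i, 2)]))
            i += 2
    return tokens


def parse_glyph_name(token: str) -> list[Token]:
    """Parse "=...=" (single equal signs) as a subphrase."""
    if len(token) < 2 or token[0] != "=" or token[-1] != "=" or token.startswith("=="):
        raise ValueError('expected the form "=...="')
    return tokenize_flat(token[1:-1])


if __name__ == "__main__":
    print(parse_glyph_name("=lp+eqp="))
```

For "=lp+eqp=" the result is one plain glyph ("lp") followed by one composite of the second kind ("eq" and "ep").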
For the whole length of the encoded string do: