
Description
Current output
китапмы ↬ +мы ↤ qst
______________китап ↤ n ⋅ nom
______________китап ↤ n ⋅ nom
______________китап ↤ n ⋅ nom
__________+и ↤ cop ⋅ p3 ⋅ pl
______________китап ↤ n ⋅ nom
__________+и ↤ cop ⋅ p3 ⋅ sg
______________китап ↤ n ⋅ nom
Expected output
китапмы ↬ китап ↤ n ⋅ nom
______________+мы ↤ qst
__________китап ↤ n ⋅ nom
______________+и ↤ cop ⋅ p3 ⋅ pl
___________________+мы ↤ qst
__________китап ↤ n ⋅ nom
______________+и ↤ cop ⋅ p3 ⋅ sg
___________________+мы ↤ qst
On the "Morphological Analysis" subpage of turkic.apertium.org [1], when I analyse words whose analyses contain sub-readings (which start with a "+" in the Apertium stream format), the order of the readings gets mixed up: sub-readings are displayed on top, with the main readings below them and indented.
In the output of all apertium-turkic transducers, the main reading is the left-most one. In the attached screenshot, "китап" is the main reading, "и" (if present) is the first sub-reading, and "мы" is the last. They should be displayed in that order: main reading on top, followed by the sub-readings, each on a separate line and indented.
The way morphological analyses are displayed on the website resembles the vislcg format. If cg-conv is used to convert the Apertium stream format into the vislcg format, then it is simply a matter of passing the -l option to cg-conv:
apertium-tat$ echo "китапмы" | apertium -d . tat-morph | cg-conv -a -l
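If the ordering is instead handled in the site's own display code, the intended layout could be sketched roughly as below. This is only an illustration in Python, not the site's actual implementation: the function names (`split_readings`, `format_reading`, `render`) are hypothetical, and the input string is an assumed stream-format analysis reconstructed from the expected output above, not copied from the transducer.

```python
import re

def split_readings(analysis):
    """Split one analysis into [main reading, sub-reading 1, ...].

    Assumption: sub-readings are joined with '+' right after a closing
    tag bracket '>', and the main reading is the left-most one.
    """
    return re.split(r"(?<=>)\+", analysis)

def format_reading(reading):
    """Turn 'lemma<tag1><tag2>' into 'lemma ↤ tag1 ⋅ tag2'."""
    lemma = reading.split("<", 1)[0]
    tags = re.findall(r"<([^>]+)>", reading)
    return lemma + " ↤ " + " ⋅ ".join(tags)

def render(analysis, indent="    "):
    """Main reading first, each sub-reading on its own, deeper-indented line."""
    lines = []
    for depth, reading in enumerate(split_readings(analysis)):
        prefix = indent * depth + ("+" if depth else "")
        lines.append(prefix + format_reading(reading))
    return lines

# Assumed example analysis (hypothetical input, see note above).
for line in render("китап<n><nom>+и<cop><p3><pl>+мы<qst>"):
    print(line)
```

The point of the sketch is only that the left-to-right order of the stream format is kept as the top-to-bottom order on screen, with indentation growing for each sub-reading.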
[1] http://turkic.apertium.org/index.eng.html?choice=tat#analyzation