Lithuanian text recognition: wrong recognition of "ų" as an "ę"
Bug #388926 reported by
Donatas Glodenis
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Cuneiform for Linux |
In Progress
|
Undecided
|
Unassigned | ||
Baltix |
New
|
Undecided
|
Unassigned |
Bug Description
Using cuneiform 0.7 on Ubuntu 9.04
When ocr-ing a lithuanian text with the switch "-l lit" a large number of letters "ų" that usually go at the end of the word get recognized as "ę".
If someone pointed me to the source file I have to check, I am pretty certain that the solution is simple, as the mistake is very simple. However, I cannot find the file: the closest match - datafiles/*lit.dat are binary and I cannot edit those...
To post a comment you must log in.
Unfortunately there is no documentation on how the code actually works. Source diving and debugger hunting are your only choices.