Article type: Research Article
Authors: Li, Zhenjiang [a], [*] | Wang, Weilan [b] | Wang, Yiqun [b], [c] | Zhang, Qianxue [a]
Affiliations: [a] School of Cyberspace Security, Gansu University of Political Science and Law, Lanzhou, Gansu, China | [b] Key Laboratory of China’s Ethnic Languages and Information Technology of Ministry of Education, Northwest Minzu University, Lanzhou, Gansu, China | [c] School of Artificial Intelligence, Gansu University of Political Science and Law, Lanzhou, Gansu, China
Correspondence: [*] Corresponding author: Zhenjiang Li, School of Cyberspace Security, Gansu University of Political Science and Law, Lanzhou, Gansu 730000, China. E-mail: [email protected].
Abstract: An offline character dataset of Tibetan historical documents in Uchen font, THCU, is presented to facilitate research on Tibetan historical document recognition. THCU includes two subsets: THCU-M and THCU-S. THCU-M is annotated manually on original document images and contains 121214 character samples in 238 character categories. THCU-S is a simulated dataset whose samples are generated by component combination. THCU-S has four subsets, with 7238, 2908, 562 and 245 character categories and 5000, 3000, 600 and 600 samples per category, respectively. We also evaluate THCU with a CNN-based model to establish baseline performance. The experiments show that adding the generated samples greatly improves the model's performance on the real data.
Keywords: Tibetan Historical document, character recognition, dataset, sample generation
DOI: 10.3233/JCM-226167
Journal: Journal of Computational Methods in Sciences and Engineering, vol. 22, no. 5, pp. 1779-1794, 2022
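To give a sense of scale, the figures quoted in the abstract can be multiplied out. The sketch below is only arithmetic on those numbers. It assumes every category in a THCU-S subset holds exactly the stated number of samples, and the subset labels (S1 to S4) are hypothetical names used for illustration, since the abstract does not name the four subsets.

```python
# Figures taken from the abstract: (character categories, samples per category).
# The labels S1..S4 are hypothetical; the abstract does not name the subsets.
thcu_s = {
    "S1": (7238, 5000),
    "S2": (2908, 3000),
    "S3": (562, 600),
    "S4": (245, 600),
}

# Total generated samples per subset, assuming every category is full.
totals = {name: cats * per_cat for name, (cats, per_cat) in thcu_s.items()}
thcu_s_total = sum(totals.values())

# THCU-M: manually annotated real samples and categories.
thcu_m_samples, thcu_m_categories = 121214, 238
thcu_m_mean = thcu_m_samples / thcu_m_categories  # mean samples per category

for name, total in totals.items():
    print(f"{name}: {total:,} samples")
print(f"THCU-S total: {thcu_s_total:,} samples")
print(f"THCU-M mean per category: {thcu_m_mean:.1f}")
```

Under these assumptions the generated subset is several hundred times larger than the manually annotated one. The THCU-M mean is only an average; the abstract does not say how the real samples are distributed across categories.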