Linguistic Resources for Handwriting Recognition
Isolated Handwritten Tamil Character Dataset hpl-tamil-iso-char
Description
The dataset contains approximately 500 isolated samples of each of
156 Tamil “characters” (details),
written by native Tamil writers, including school children, university
graduates, and adults, from the cities of Bangalore, Karnataka, India
and Salem, Tamil Nadu, India. The data was collected using HP TabletPCs
and is stored in the standard UNIPEN format.
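As a rough illustration of reading such data, the sketch below extracts pen-down strokes from UNIPEN-style text. It assumes that each stroke starts at a `.PEN_DOWN` line, ends at `.PEN_UP` (or any other keyword line), and holds one whitespace-separated integer `x y` pair per line. The actual files in this dataset may carry additional keywords or coordinate components, and `parse_unipen_strokes` is a hypothetical helper name, not part of the dataset.

```python
def parse_unipen_strokes(text):
    """Return a list of strokes, each a list of (x, y) integer tuples."""
    strokes = []
    current = None  # points of the stroke being read, or None while the pen is up
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("."):
            # Any keyword line closes the stroke in progress.
            if current:
                strokes.append(current)
            current = [] if line.split()[0] == ".PEN_DOWN" else None
            continue
        if current is not None:
            fields = line.split()
            # Assumption: the first two fields are integer X and Y coordinates.
            current.append((int(fields[0]), int(fields[1])))
    if current:
        strokes.append(current)
    return strokes


# Made-up sample data, for illustration only.
sample = """.COORD X Y
.PEN_DOWN
10 10
20 15
.PEN_UP
.PEN_DOWN
30 30
30 40
.PEN_UP
"""
strokes = parse_unipen_strokes(sample)
```

Here `strokes` holds two strokes of two points each; coordinate lines outside a `.PEN_DOWN` segment are ignored.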
An offline version of the data is also available in the form of
bi-level TIFF images, generated from the online data using simple
piecewise linear interpolation with a constant thickening factor
applied.
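The conversion described above can be sketched roughly as follows. This is not the procedure used to produce the released TIFF files; it only illustrates the idea, assuming strokes are lists of integer `(x, y)` points already scaled to the image size. The `thickness` parameter is a hypothetical stand-in for the constant thickening factor, and the result is a plain list of rows rather than a TIFF file.

```python
def rasterize(strokes, width, height, thickness=1):
    """Render strokes into a bi-level image (list of rows; 1 = ink, 0 = background)."""
    image = [[0] * width for _ in range(height)]

    def stamp(cx, cy):
        # Thicken by inking a square of side (2 * thickness - 1) around the point,
        # clipped to the image bounds.
        r = thickness - 1
        for y in range(max(0, cy - r), min(height, cy + r + 1)):
            for x in range(max(0, cx - r), min(width, cx + r + 1)):
                image[y][x] = 1

    for stroke in strokes:
        if len(stroke) == 1:
            stamp(*stroke[0])  # a single sample is drawn as a dot
        for (x0, y0), (x1, y1) in zip(stroke, stroke[1:]):
            # Linearly interpolate between consecutive pen samples.
            steps = max(abs(x1 - x0), abs(y1 - y0), 1)
            for i in range(steps + 1):
                t = i / steps
                stamp(round(x0 + t * (x1 - x0)), round(y0 + t * (y1 - y0)))
    return image


# Made-up stroke, for illustration only.
image = rasterize([[(1, 1), (4, 1), (4, 3)]], width=6, height=5)
for row in image:
    print("".join("#" if px else "." for px in row))
```

Stamping a fixed-size square at every interpolated point keeps the line width constant regardless of stroke direction, which is one simple way to read "constant thickening factor".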
The data is available only for research use. Subsets of this dataset
were used for the IWFHR 2006 Tamil Character Recognition Competition.

• Competition Home Page
• Structure of Dataset hpl-tamil-iso-char, with statistics of characters
• Details of 156 Tamil Characters with Unicode mappings
Downloading the dataset implies that you have understood and accepted
the terms of the license agreement.
- Online data, UNIPEN format, tar.gz file
Version 1.0, Released June 08, 2006, 45 MB
- Offline (image) data, Bi-level TIFF, tar.gz file
Version 1.0, Released June 08, 2006, 41 MB
Note: When these files are downloaded with Internet Explorer
on Windows XP, the filename extension is changed to ".tar.tar",
which is incorrect. Rename the file so that it ends in ".tar.gz"
once it has been downloaded.
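If the rename needs to be scripted, a minimal sketch might look like the following. The helper name `fix_extension` is hypothetical, and the code assumes the only problem is the trailing ".tar.tar" in the filename.

```python
from pathlib import Path


def fix_extension(path):
    """Rename a '.tar.tar' download to '.tar.gz' and return the resulting path."""
    path = Path(path)
    if path.name.endswith(".tar.tar"):
        # Swap only the trailing extension; the rest of the name is kept.
        target = path.with_name(path.name[: -len(".tar.tar")] + ".tar.gz")
        path.rename(target)
        return target
    return path  # already correctly named; nothing to do
```

For example, `fix_extension("downloaded-file.tar.tar")` (a placeholder filename) would rename that file to `downloaded-file.tar.gz`.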
Report an issue with this dataset