Linguistic Resources for Handwriting Recognition

Test Dataset hpl-tamil-iwfhr06-test for
IWFHR 2006 Online Tamil Handwritten Character Recognition Competition.
The dataset contains approximately 170 isolated samples each of
156 Tamil “characters” (total of 26926 samples) written
by native Tamil writers including school children, university graduates,
and adults from the cities of Bangalore, Karnataka, India and Salem,
Tamil Nadu, India. The data was collected using HP TabletPCs and
is in standard UNIPEN format.
The samples have been randomised across writers
and classes, and are serially numbered from 00000 to 26925.
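For readers unfamiliar with UNIPEN, the following is a minimal sketch of how pen strokes might be read from such a file. It relies only on the standard UNIPEN keywords .PEN_DOWN and .PEN_UP; the function name parse_unipen_strokes, the sample text, and the assumption that each coordinate line starts with integer X and Y values are illustrative and not taken from this dataset's documentation.

```python
def parse_unipen_strokes(text):
    """Return a list of strokes; each stroke is a list of (x, y) tuples.

    Assumption: every coordinate line begins with integer X and Y values.
    """
    strokes = []
    current = None  # None while the pen is up
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("."):
            keyword = line.split()[0]
            if keyword == ".PEN_DOWN":
                current = []
                strokes.append(current)
            elif keyword == ".PEN_UP":
                current = None
            # All other keywords (.VERSION, .COORD, ...) are skipped here.
            continue
        if current is not None:
            fields = line.split()
            try:
                x, y = int(fields[0]), int(fields[1])
            except (IndexError, ValueError):
                continue  # not a coordinate line; ignore it
            current.append((x, y))
    return strokes


# Hypothetical sample, not copied from the dataset.
sample = """.VERSION 1.0
.COORD X Y
.PEN_DOWN
10 20
12 24
.PEN_UP
.PEN_DOWN
30 40
.PEN_UP
"""
strokes = parse_unipen_strokes(sample)
```

Each inner list corresponds to one pen-down to pen-up stroke, which is the unit most online recognisers work with.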
An offline version of the data is also available in the form of
bi-level TIFF images, generated from the online data using simple
piecewise linear interpolation with a constant thickening factor
applied.
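The idea of rendering online strokes to a bi-level image can be sketched as follows. This is not the tool used to produce the TIFF files; it is a small illustration that assumes the stroke coordinates have already been scaled to integer pixel positions. The function name rasterise and the thickness parameter (a square stamp of fixed radius around every interpolated point) are assumptions made for the example.

```python
def rasterise(strokes, width, height, thickness=1):
    """Render strokes into a bi-level image (list of rows of 0/1 pixels).

    Consecutive points of a stroke are joined by straight segments
    (piecewise linear interpolation); each drawn point is thickened by
    stamping a square of constant radius `thickness` around it.
    """
    image = [[0] * width for _ in range(height)]

    def stamp(cx, cy):
        for y in range(cy - thickness, cy + thickness + 1):
            for x in range(cx - thickness, cx + thickness + 1):
                if 0 <= x < width and 0 <= y < height:
                    image[y][x] = 1

    def draw_segment(x0, y0, x1, y1):
        steps = max(abs(x1 - x0), abs(y1 - y0))
        if steps == 0:
            stamp(x0, y0)
            return
        for i in range(steps + 1):
            t = i / steps
            stamp(round(x0 + t * (x1 - x0)), round(y0 + t * (y1 - y0)))

    for stroke in strokes:
        if len(stroke) == 1:
            stamp(*stroke[0])  # a single dot
        for (x0, y0), (x1, y1) in zip(stroke, stroke[1:]):
            draw_segment(x0, y0, x1, y1)
    return image


# One horizontal stroke on a 6x3 canvas, without and with thickening.
thin = rasterise([[(1, 1), (4, 1)]], width=6, height=3, thickness=0)
thick = rasterise([[(1, 1), (4, 1)]], width=6, height=3, thickness=1)
```

A square stamp is the simplest possible thickening; a circular stamp or a proper line-drawing library would give smoother results.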
The data is available only for research use.
Note
- To register for the competition, please go to the competition
home page.
- Training data (collected as part of the same data collection
effort) was made available earlier, as per the competition schedule.
- The combined dataset (training + test) will be made available
on conclusion of the competition.

Downloading the dataset implies that you have understood and accepted
the terms of the license agreement.
- Online data, UNIPEN format, tar.gz file
Version 1.0, Released May 04, 2006, 15.4 MB
- Offline (image) data, Bi-level TIFF, tar.gz file
Version 1.0, Released May 04, 2006, 12.6 MB
Note: When these files are downloaded with Internet Explorer
on Windows XP, the filename extension is incorrectly changed to ".tar.tar".
It is recommended that the file be renamed so that it ends in ".tar.gz"
once downloaded.
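The rename and extraction can also be scripted. The sketch below uses only the Python standard library; the function names restore_tar_gz_name and extract_archive, and the file and directory names in the usage comment, are hypothetical and do not refer to the actual download names.

```python
import tarfile
from pathlib import Path


def restore_tar_gz_name(path):
    """Rename 'name.tar.tar' to 'name.tar.gz' and return the resulting path.

    Files that do not end in '.tar.tar' are left untouched.
    """
    path = Path(path)
    suffix = ".tar.tar"
    if path.name.endswith(suffix):
        fixed = path.with_name(path.name[: -len(suffix)] + ".tar.gz")
        path.rename(fixed)
        return fixed
    return path


def extract_archive(path, destination):
    """Extract a gzip-compressed tar archive into `destination`."""
    destination = Path(destination)
    destination.mkdir(parents=True, exist_ok=True)
    with tarfile.open(path, mode="r:gz") as archive:
        archive.extractall(destination)


# Usage (hypothetical file and directory names):
#   archive = restore_tar_gz_name("downloaded-data.tar.tar")
#   extract_archive(archive, "tamil-test-data")
```

As with any archive, extract only files obtained from a source you trust.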