Data format
The data is stored in several directories and files:
- original - the acquired xml-files
- lineStrokes - the divided lines in on-line format (example)
- lineImages - the divided lines in off-line format (example)
- writers.xml - a file containing information about the writers
- ascii - the transcriptions as plain text-files
- forms.txt - a file containing the mapping of forms to writers
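As a quick sanity check after unpacking the data, the entries listed above can be looked up under a root directory. This is a minimal sketch: the helper name `missing_entries` and the idea of a single root directory holding all entries are assumptions made for illustration, not part of the data set description.

```python
from pathlib import Path

# Entry names taken from the list above; the root directory is an assumption.
EXPECTED_ENTRIES = [
    "original",
    "lineStrokes",
    "lineImages",
    "writers.xml",
    "ascii",
    "forms.txt",
]


def missing_entries(root):
    """Return the expected entries that do not exist under `root`."""
    root = Path(root)
    return [name for name in EXPECTED_ENTRIES if not (root / name).exists()]
```

An empty result means every listed directory and file was found; the check says nothing about their contents.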
Format of the XML-files
The xml-files contain the following information:
- Form
- id - The unique form id
- writerID - the id of the writer
- saveTime - the time of saving as a single number, computed as ((((systemTime.wYear*12 + systemTime.wMonth)*31 + systemTime.wDay)*24 + systemTime.wHour)*60 + systemTime.wMinute)*60 + systemTime.wSecond + systemTime.wMilliseconds/1000.f
- CaptureTime - the time information in a more readable format
- Setting
- location - location of the recording
- producer - responsible transcriber
- system - the used recording software
- Text - the full ascii-transcription of the written data
- TextLine - a text-line with a unique id
- Word - a word with a unique id
- Char - a character with a unique id
- WhiteboardDescription
- SensorLocation - the corner where the eBeam sensor was placed
- DiagonallyOppositeCoords - the coordinates of the diagonally opposite corner (marked by the acquiring person)
- VerticallyOppositeCoords - the coordinates of the vertically opposite corner (marked by the acquiring person)
- HorizontallyOppositeCoords - the coordinates of the horizontally opposite corner (marked by the acquiring person)
- StrokeSet
- Stroke - a stroke with its colour and time information
- Point - a point with x, y and time value
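The saveTime expression given in the list treats every year as 12 months and every month as 31 days, so the value orders recordings in time but is not a count of real elapsed seconds. The sketch below mirrors that expression in Python and inverts it. The function names are hypothetical, and Python's double-precision floats are assumed, so any single-precision rounding implied by the `1000.f` literal in the original expression is not reproduced.

```python
from datetime import datetime


def save_time(t):
    """Mirror the saveTime expression for a datetime (months of 31 days)."""
    millis = t.microsecond // 1000
    return ((((t.year * 12 + t.month) * 31 + t.day) * 24 + t.hour) * 60
            + t.minute) * 60 + t.second + millis / 1000.0


def decode_save_time(value):
    """Recover (year, month, day, hour, minute, second, millis) from a value."""
    whole = int(value)
    millis = round((value - whole) * 1000)
    whole, second = divmod(whole, 60)
    whole, minute = divmod(whole, 60)
    whole, hour = divmod(whole, 24)
    # day runs from 1 to 31 and month from 1 to 12, so shift by one
    # before dividing and shift back afterwards.
    months, day = divmod(whole - 1, 31)
    year, month = divmod(months - 1, 12)
    return year, month + 1, day + 1, hour, minute, second, millis
```

Because of the fixed 31-day months, subtracting two saveTime values only gives a true duration when both fall within the same month.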
See xml-format and xml-format.dtd for a DTD-file of the format.
See strokesz.xml for an example of an xml-file.
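Reading the stroke data could look roughly like the following sketch, which uses Python's standard `xml.etree.ElementTree`. The element names Stroke and Point come from the list above; the attribute names `x`, `y` and `time`, the `colour` attribute, the `Root` wrapper element and the sample values are assumptions made for illustration, so check them against xml-format.dtd before relying on them.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; real files may use different attribute names and nesting.
SAMPLE = """<Root>
  <StrokeSet>
    <Stroke colour="black">
      <Point x="10" y="20" time="0.00"/>
      <Point x="12" y="23" time="0.01"/>
    </Stroke>
    <Stroke colour="black">
      <Point x="30" y="21" time="0.50"/>
    </Stroke>
  </StrokeSet>
</Root>"""


def read_strokes(root):
    """Collect every Stroke element with its attributes and (x, y, time) points."""
    strokes = []
    for stroke in root.iter("Stroke"):
        points = [
            (float(p.get("x")), float(p.get("y")), float(p.get("time")))
            for p in stroke.iter("Point")
        ]
        # Stroke attributes are kept as-is, without assuming their names.
        strokes.append({"attributes": dict(stroke.attrib), "points": points})
    return strokes


strokes = read_strokes(ET.fromstring(SAMPLE))
```

For a file on disk, `ET.parse(path).getroot()` can be passed to `read_strokes` in place of the parsed sample string.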