Repository for Excel add-ins used for corpus annotation.
ValidateAnnotation.xla - uses a schema file to check rules for each column header (e.g. pos, word, sentence...):
- Cells in columns with headers "word", "pos" have exactly the same spans (word=pos)
- "pos" may only contain set values (values:pos(NOUN,VERB,null))
- A worksheet "meta" must contain annotation names in the first column (require:meta(title,author,...))
- Span of "sentence" must include all rows of every overlapping "word" cell (sentence>word)
- You may also use either/or in headers like this: sent|sent_n>translation
- You can enforce that cells in a column are always filled, e.g. to ensure the data is 'plated' with line annotations, via full:anno1,anno2,...
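
The span rules above can be illustrated with a small sketch. This is not the add-in's VBA code: it is a Python model written under the assumption that each annotation column is represented as a list of (first_row, last_row) spans, and that "null" in a values rule means an empty cell (modeled here as None). The function names same_spans, covers and disallowed_values are hypothetical.

```python
# Hypothetical model: a column is a list of (first_row, last_row) spans.

def same_spans(a, b):
    """Rule like word=pos: both columns have exactly the same spans."""
    return sorted(a) == sorted(b)


def covers(outer, inner):
    """Rule like sentence>word: an outer span must fully include
    every inner span that it overlaps."""
    for o_start, o_end in outer:
        for i_start, i_end in inner:
            overlaps = i_start <= o_end and i_end >= o_start
            contained = o_start <= i_start and i_end <= o_end
            if overlaps and not contained:
                return False
    return True


def disallowed_values(cells, allowed):
    """Rule like values:pos(NOUN,VERB,null): return the offending values.
    Assumption: None stands for an empty cell ("null")."""
    return [value for value in cells if value not in allowed]


word = [(1, 1), (2, 3), (4, 4)]
pos = [(1, 1), (2, 3), (4, 4)]
sentence = [(1, 3), (4, 4)]

print(same_spans(word, pos))    # True
print(covers(sentence, word))   # True
print(disallowed_values(["NOUN", "ADJ", None], {"NOUN", "VERB", None}))  # ['ADJ']
```

A sentence span such as (1, 2) would fail the containment check here, because it overlaps the word span (2, 3) without including row 3.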
MiscAnno.xla - miscellaneous macros to facilitate annotation tasks
- Auto merge down - merge cells in selection with any empty cells under them
- Stretch into gap - stretches all cells to merge back into preceding cells if they are empty (useful if you insert a line to add a token and want to stretch all span annotations to cover the new tokens)
- Merge with content - merges selected cells and concatenates their contents into the new cell
- Delete line - programmatically delete the line that the cursor is on without marking it
- Insert line - programmatically insert a new line at the cursor position
- Fill zeros where cells in selection are empty (useful for sparse matrices in cross tabulation)
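
As a rough illustration of two of the macros above, the following Python sketch models "Auto merge down" and "Fill zeros" on plain lists. It is not the add-in's code: the representation (None for an empty cell, 1-based row numbers) and the function names merge_down and fill_zeros are assumptions made for this example, and the treatment of empty cells at the top of a selection is a guess.

```python
def merge_down(column):
    """Return (value, first_row, last_row) spans in which each filled
    cell absorbs the empty cells directly below it."""
    spans = []
    for row, value in enumerate(column, start=1):
        if value is not None:
            spans.append([value, row, row])
        elif spans:
            spans[-1][2] = row  # extend the previous span over this empty cell
        # Assumption: empty cells with nothing above them are left alone.
    return [tuple(span) for span in spans]


def fill_zeros(table):
    """Replace empty cells (None) with 0, e.g. in a sparse cross tabulation."""
    return [[0 if cell is None else cell for cell in row] for row in table]


print(merge_down(["S1", None, None, "S2", None]))
# [('S1', 1, 3), ('S2', 4, 5)]
print(fill_zeros([[3, None], [None, 1]]))
# [[3, 0], [0, 1]]
```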
exmaralda_io_0.9.9.3.xla - Corpus linguistics format converter for Windows Excel
- Somewhat misnamed, as it supports not only EXMARaLDA XML but also PTB brackets, conll10 dependencies, PAULA XML, TreeTagger SGML and more
- See documentation in README_exmaralda_io_0.9.9.3.pdf
UnderOverUse_V1.2.xla - Legacy version of a conditional formatting script for number comparisons
- Highlights values of cells in warm/cold colors by deviation from control column values
- Thresholds are configurable (see underuse_overuse.pdf)
- Mostly reproducible using conditional formatting in newer versions of Excel
Schemas for ValidateAnnotation.xla (Coptic SCRIPTORIUM only)
- validate_ms.txt validates files created from Coptic manuscripts according to Coptic SCRIPTORIUM standards
- validate_bible.txt validates files created from biblical editions (not manuscripts) such as Sahidica according to Coptic SCRIPTORIUM standards