Meeting notes

Martin Holmes edited this page Oct 30, 2024 · 188 revisions

ATOP Meeting Notes

Note that we take notes on the scratchpad, and then transfer them here.

Meeting 2024-10-30

HBS, MH, SB Questions (from SB) on assembling ODD:

  • We should probably have a generic “get the element(s) this attribute points at” function which dereferences relative paths and prefixDefs. MH made new ticket #41 for this.
  • What do we do with a <specGrp> that does not have a <specGrpRef> pointing to it? The Guidelines say: "The declarations it contains may be included in a schemaSpec or moduleSpec element only by reference (using a specGrpRef element): it may not be nested within a moduleSpec element." So we should drop it. We consider it a schema check that should warn the user when a <specGrp> is not referred to by a <specGrpRef> in the same document. It is only a warning, as another document may refer to it.
  • Should comments (and PIs) survive the transform? We decided comments should be killed -- there is no expectation that a comment will survive -- but we discussed whether to preserve PIs in case the end-user wants to act on them downstream. In the end, we didn't see a way to preserve meaningful context for them all the way to RELAXNG, and they would be thrown away anyway if the RNG were converted to RNC. So we resolved to throw away both comments and PIs, which involves no action since they're currently ignored anyway.
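
The <specGrp> warning discussed above could be prototyped roughly as follows. This is only a Python sketch for discussion, not ATOP code (the real check would presumably live in Schematron); the function name and the sample document are hypothetical, and it assumes that <specGrpRef>/@target uses same-document "#id" pointers.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

def unreferenced_spec_grps(root):
    """Return xml:ids of <specGrp>s that no same-document <specGrpRef> targets."""
    # Collect the ids pointed at by local ("#id") specGrpRef targets.
    targets = {
        ref.get("target", "")[1:]
        for ref in root.iter(TEI + "specGrpRef")
        if ref.get("target", "").startswith("#")
    }
    return [
        grp.get(XML_ID)
        for grp in root.iter(TEI + "specGrp")
        if grp.get(XML_ID) not in targets
    ]

# Hypothetical test document: one referenced specGrp and one orphan.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <specGrp xml:id="used"><elementSpec ident="a" mode="delete"/></specGrp>
  <specGrp xml:id="orphan"><elementSpec ident="b" mode="delete"/></specGrp>
  <schemaSpec ident="demo"><specGrpRef target="#used"/></schemaSpec>
</TEI>"""

for grp_id in unreferenced_spec_grps(ET.fromstring(SAMPLE)):
    # Only a warning: another document may still refer to this specGrp.
    print(f"warning: specGrp '{grp_id}' is not referenced in this document")
```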

Meeting 2024-10-16

MH, SB, HBS

  • We made a decision on testing remote retrieval of resources, and how to handle the lack of a net connection: let Ant discover whether there is a net connection (e.g. can ping tei-c.org), and if not, set a property which turns off specific tests, or switches them to using local resources. Action on MH to investigate this.
  • We looked at XSpec tests for the assemble phase.
  • Action on SB to write template for <specGrpRef> in assemble_odd.xslt, because it needs to be processed slightly differently from the other <*Ref> elements. The retrieved <specGrp> itself is not copied, just its contents; and the retrieved specGrp may contain many irrelevant structures (e.g. <lg>), which need to be discarded, but the <*Spec> elements inside them need to be retained.
  • We decided that we should always add the XInclude switch when calling Saxon from Ant, because, at least at the moment, we can not come up with any reason not to process any XInclude when first encountered. Action on MH to do this.
  • Action on HBS to write the ODD for the assembled ODD.

Meeting 2024-10-02

MH, SB, HBS

  • Branch issue_38_uris PR: merged.
  • Branch issue_37_document PR: orthogonal discussion of the case of a teidata.word attribute "data-notes" in the test: for an open valList, should we/can we use teidata.word with maxOccurs="1", or should we use teidata.enumerated? This deserves separate discussion, since it's a strange edge-case of ODD. We should raise a ticket for it on the TEI repo, and investigate what the current stylesheets do in various scenarios. Merged.
  • Next meetings: 9th is out because of TEI conference, but the rest of the month should be OK.
  • Over the next two weeks, MH will look at issue #30, HBS will create the ODD for post-assemble files, and SB will work on the assemble code.

Meeting 2024-09-18

SB, HBS, MH

  • Status of issue 30? Need to check the status; some work definitely done but nothing in the branch itself.
  • We concluded that <specGrpRef> should be resolved at assemble time
  • Further we note that since a <specGrp> may contain a <specGrpRef>, the assemble-time code needs to be recursive (or otherwise handle <specGrp> nesting)
  • See ticket #38
  • We have three tasks:
    • SB — code in assemble_odd.xslt for <specGrpRef>
    • HBS — Fix atop:resolve-private-uri() so that it returns an xs:anyURI (instead of an xs:string) and so that the input context node is optional, with tests to match (issue #38); also create an ODD for post-assemble files if time permits
    • MH — testing of assembly output (initially XSpec, in the assemble branch)
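
To make the recursion requirement concrete, here is a rough Python sketch of resolving <specGrpRef> at assemble time. It is not the template in assemble_odd.xslt; the function names and the sample are hypothetical. It assumes local "#id" targets only, copies the contents of the <specGrp> rather than the <specGrp> itself, keeps any <*Spec> element it finds, discards other structures around them, and follows nested <specGrpRef>s with a guard against circular references.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

def collect_specs(element, groups, seen):
    """Return the *Spec elements found under `element`, following nested specGrpRefs."""
    specs = []
    for child in element:
        if child.tag == TEI + "specGrpRef":
            target = child.get("target", "").lstrip("#")
            if target in seen:
                raise ValueError(f"circular specGrpRef to '{target}'")
            if target in groups:  # pointers into other documents are not handled here
                specs.extend(collect_specs(groups[target], groups, seen | {target}))
        elif child.tag.startswith(TEI) and child.tag.endswith("Spec"):
            specs.append(child)  # keep the whole specification
        else:
            # Discard the wrapper (e.g. <lg>, <p>) but keep specs inside it.
            specs.extend(collect_specs(child, groups, seen))
    return specs

def resolve_spec_grp_refs(root):
    """Replace each local <specGrpRef> child of a <schemaSpec> with the specs it points at."""
    groups = {g.get(XML_ID): g for g in root.iter(TEI + "specGrp")}
    for schema in root.iter(TEI + "schemaSpec"):
        resolved = []
        for child in list(schema):
            target = child.get("target", "").lstrip("#")
            if child.tag == TEI + "specGrpRef" and target in groups:
                resolved.extend(collect_specs(groups[target], groups, frozenset({target})))
            else:
                resolved.append(child)  # everything else is left untouched
        for child in list(schema):
            schema.remove(child)
        schema.extend(resolved)

# Hypothetical sample with a nested specGrpRef and an irrelevant <lg>.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <specGrp xml:id="outer">
    <lg><l>verse</l></lg>
    <p><elementSpec ident="a" mode="change"/></p>
    <specGrpRef target="#inner"/>
  </specGrp>
  <specGrp xml:id="inner"><classSpec ident="att.b" type="atts" mode="delete"/></specGrp>
  <schemaSpec ident="demo"><moduleRef key="tei"/><specGrpRef target="#outer"/></schemaSpec>
</TEI>"""

root = ET.fromstring(SAMPLE)
resolve_spec_grp_refs(root)
schema = root.find(TEI + "schemaSpec")
print([(c.tag[len(TEI):], c.get("ident") or c.get("key")) for c in schema])
```

The sketch leaves the source <specGrp> elements in place; whether they are dropped afterwards is a separate decision.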

Meeting 2024-09-11

SB, HBS, MH

  • Merged the issue #34 PR (validation modularization).
  • Fixed some bugs in transpile.xslt, and raised an issue for ourselves to finish documenting it and search for any more cases of missing <a:documentation> elements in the output.
  • Worked on the assembly phase, and ended up raising issue #2591 on <schemaSpec>/@source.

Meeting 2024-09-04

SB, HBS, MH

  • Started looking at SB's work on assembling, and discovered some inconsistencies in file-naming of XSLT files, so remedied that in the dev branch and merged the changes into the assembling branch.
  • HBS looked at the existing function for resolving Private URI Schemes (prefixDef) and will create a PR with an enhancement to that.

Meeting 2024-08-28

SB, HBS, MH

  • Worked through the PR for removing XProc, made some fixes and enhancements, and merged the PR.
  • Started looking at the use of @source on <schemaSpec> and e.g. <elementSpec>, trying to figure out how best to resolve these imports. We determined that we must support the mixing of random TEI versions even where this is inadvisable; we should raise a ticket on TEI to add Schematron to enforce the assertion that @source on tagdocs elements should have only one URL; and we should remember that when looking up an imported specification we need to check not only the ident but also the effective @ns value to be sure we are importing the right item.
  • It is possible that these imports need to be processed recursively, because an imported (say) module may itself contain imports. This raises the issue of relative URLs; when processing a nested import, we would have to know the original URL from which the ancestor import was taken, in order to resolve the local @source reference. SB's work on assembly doesn't yet handle namespaces or recursion, but these should be straightforward to add.
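
The relative-URL problem for nested imports can be illustrated with a small Python sketch. This is not SB's assembly code; the function name and the example.org URLs are made up. The point is simply that each @source is resolved against the URL of the document it was found in, so that URL has to be carried along through the recursion.

```python
from urllib.parse import urljoin

def resolve_source(source, containing_doc_uri):
    """Resolve an @source value against the URI of the document that contains it."""
    return urljoin(containing_doc_uri, source)

# Hypothetical chain: a customization imports a derived ODD, which imports p5subset.
customization = "https://example.org/odd/custom/my.odd"

# Level 1: @source found in the customization itself.
first = resolve_source("../base/derived.odd", customization)

# Level 2: @source found inside the imported file, so the base is now `first`,
# not the original customization URL.
second = resolve_source("p5subset.xml", first)

print(first)
print(second)
```

Resolving the second @source against the customization URL instead would point at https://example.org/odd/custom/p5subset.xml, which is the mistake the discussion above is warning about.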

Meeting 2024-08-21

SB, HBS, MH

  • MH reported on progress with issue #31 and the new issue #34, which will follow after #31 is complete.
  • Next step we're going to address is assembling, so we've created a new branch called "assembling", where that work will start immediately. Relative pointers to externals will be resolved relative to the containing file. The resulting file will be written alongside the original file, with a suffix _assembled.xml. We should have Schematron to immediately confirm that there are no remaining externals. What are the externals to be resolved?
    • <moduleRef>/@url
    • <classRef>/@source, <dataRef>/@source, <elementRef>/@source, <macroRef>/@source, <moduleRef>/@source, or <schemaSpec>/@source
    • We'll need to resolve local URI prefixes per <prefixDef>s, so we should create a function for that in the function library and add tests for it there. Action on HBS to do this in its own branch.
    • We ran up against the problem of competing @defaultExceptions attributes in the source and customization ODDs, and ticket #2357, which we should have acted on but which lacks a clear guide on how chaining should be handled. It's also not clear how <schemaSpec>/@defaultExceptions on the base ODD should be combined with a <schemaSpec>/@defaultExceptions defined on the customization ODD. Should the result be a union, or does the customization completely override the base ODD definition?
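
For reference, the <prefixDef> expansion mentioned above works roughly as in the following Python sketch. It is not the function in the ATOP library; the function name, the triple representation of a <prefixDef> (@ident, @matchPattern, @replacementPattern) and the example.org URL are assumptions for illustration. It also uses Python's regex dialect, which differs in places from the XPath one the patterns are written for.

```python
import re

def expand_private_uri(uri, prefix_defs):
    """Expand a private-scheme URI using (ident, matchPattern, replacementPattern) triples."""
    prefix, sep, rest = uri.partition(":")
    if not sep:
        return uri  # no scheme at all: nothing to expand
    for ident, match_pattern, replacement in prefix_defs:
        if ident != prefix:
            continue
        m = re.fullmatch(match_pattern, rest)
        if m:
            # Substitute $1, $2, ... with the corresponding captured groups.
            return re.sub(r"\$(\d)", lambda ref: m.group(int(ref.group(1))), replacement)
    return uri  # no matching prefixDef: return the pointer unchanged

# Hypothetical prefixDef: <prefixDef ident="bibl" matchPattern="(.+)"
#                                   replacementPattern="https://example.org/bibliography.xml#$1"/>
PREFIX_DEFS = [("bibl", "(.+)", "https://example.org/bibliography.xml#$1")]

print(expand_private_uri("bibl:Smith1999", PREFIX_DEFS))
print(expand_private_uri("https://example.org/other.xml", PREFIX_DEFS))
```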

Meeting 2024-07-24

SB, MH

  • Worked out a bug in the issue #31 branch, and specced the remaining steps to make that branch complete. MH will finish that work and then do a PR.

Meeting 2024-07-17

SB, MH

  • Brief checkin and then MH continued work on issue #31.

Meeting 2024-07-10

SB, HBS, MH

  • MH reported on the work so far completed in the issue #31 branch. One bug was discovered during the meeting. What remains is a) to fix the bug, which might simply amount to running an initial clean process to remove the products of previous test runs, and b) to add the implementations of Schematron and RELAX NG validation of (respectively) the input PLODD and the output schema.
  • SB reported on his work so far in the issue #30 branch, and we discussed what remains to be done there, which is to implement the running of the Schematron and the post-processing and comparison of actual results.
  • We all confirmed we're happy with the galley of the article, except for the one dangling citation and the missing images, which we will report.

Meeting 2024-07-03

SB, HBS, MH

Meeting 2024-06-19

SB, HBS

  • We read the proofs for the Newcastle paper to be published in JTEI.

Meeting 2024-06-19

SB, HBS, MH

  • Agreed to remove XProc, since none of us is comfortable with it. Issue #31 assigned to MH.

  • Discussed the stage at which we should resolve external references; we came to the conclusion that because external references may be relative, and it may not be possible to re-orient them to a new relative path based on the output location of the base or customization ODD, we should resolve all external references at the first point when they appear.

Meeting 2024-06-12

SB, HBS, MH

  • SB reported on recent presentations at XML Prague on XSpec, XPath, Schematron and other ODD-related things.
  • Planning future work: What steps are in assembly, and what are in deriving? (E.g., are references to the base ODD sucked in during assembly? What about references to external RNG grammars?) These seem to be the two main components of the stage prior to pruning and transpiling. This will be our discussion next time.

Meeting 2024-05-29

SB, HBS, MH

  • Discussing what to focus on next, we decided that the next-easiest step to tackle is transforming a source ODD such as p5subset.xml into a derived ODD. This helps in a number of ways: first, we can test it by then transpiling the derived ODD to see if we get tei_all.rng. Second, it enables us to clearly understand what a derived ODD looks like, and write schemas for it.
  • The question is: what are the actions we can and should take at this point which can be done discretely, without any reference to a downstream customization, and in what order should these steps happen? We also need to define the things that we don't think should appear in a source ODD; one example would be the definition of a class, and then an attempt to somehow override that definition in a subsequent classSpec with the same @ident.
  • Question: if attribute classA is @memberOf="classB", does this mean that classA inherits all attributes from classB, and can then override them with @mode="change"? (If classA is modified in a customization ODD, there's no effect -- see below.)
  • Question: If a customization ODD overrides @rend, not in att.global.rendition but in att.global itself (which is a member of att.global.rendition), what happens in the current Stylesheets, and what should actually happen? In the current Stylesheets, nothing happens; @rend is unchanged. We believe this is the correct behaviour. So: attribute classes act fundamentally as convenient grouping mechanisms; they do not incorporate any notion of a class hierarchy. In this sense their name is misleading.
  • In the process of this discussion, we find we're actually gaining a much clearer understanding of ODD. For example, we now know that class membership is at least three distinct kinds of thing: Between att classes, it's merely a grouping mechanism; between elementSpecs and att classes, it's an inheritance thing (elements inherit attributes from classes); and for model classes it's something else again (still working on that one). One approach to clarifying things here would be to suggest alternative nomenclature which would work around the confusion that comes from the abuse of the word "class" in this context.

Meeting 2024-05-22

SB, HBS, MH

  • Decided to add validation of Schematron generated during transpile against the Schematron schema, only to discover that it was not valid because it contained no patterns. See issue #29 to see how we worked around this.
  • Raised issue #30 for adding tests of the Schematron against actual instance files to complete our testing scenario.

Meeting 2024-05-15

SB, HBS, MH

  • Added validation of generated RNG files in the transpile test suite, and closed issue #26.
  • Made the PLODD schema specification valid against upcoming TEI ODD requirements (specifically @context for Schematron rules, and a wrapping <sequence> element for <content> contents).
  • Did a bunch of branch cleanup, including removing the main branch, which we see no use for.
  • Down to only three open tickets now (#25, #16, #13).

Meeting 2024-05-08

SB, HBS, MH

After a month spent working on the jTEI paper, we're back to ATOP development work.

  • Reviewed all ATOP tickets (#8 --closed, #13, #16, #25, #26).
  • Did a preliminary implementation of handling for <attRef> in the transpile stage; still awaiting full tests before we can issue a PR.
  • MH agreed to take #26.
  • HBS will merge the pre-transpile.sch rules into the PLODD ODD file.
  • We raised a new ticket, #27, concerning what looks to us like weird behaviour in handling moduleRef.

Meeting 2024-04-10

SB, MH, HBS

  • We created issue #25: we concluded that attRef should survive to the PLODD stage, and thus should be handled within transpiling.
  • We added an <attRef> scenario to the test suite, commented out for the moment until the transpiling process handles it.
  • We created issue #26: the validation of generated schemas needs to be added to the test suite.

Meeting 2024-03-27

SB, MH, HBS

Main discussion centered on whether to use XML construct(s) or XSLT map(s) to represent intermediate stages of derivation processing. While we are not 100% sure either way, all three of us are leaning towards maps. The logic is that it is easier to debug a map (i.e., either to write an XSpec to interrogate it or to write out a debugging copy of it to disk in the middle of a routine). We think map generation should happen prior to any construct manipulation. (I.e., read in customization ODD and its sources, generate map(s); process map(s) in however many passes are needed; convert map to ODD and write it out.)

We reached some decisions:

  • Constructs can be deleted only once. We think that, as far as easily possible, the TEI schema should warn against two (or more) deletions of the same construct (identified by QName) in the same file. (We think this because of the observation that the vast majority of the time this occurs it is probably a copy-and-paste error.) If the ATOP processor comes across an instruction to delete something that does not exist (either because it never existed or because it has already been deleted), a warning should be issued, but processing should continue.
  • We notice that there are no <attRef>s in our PLODD test suite, so we must go back and add those.
  • To prevent infinite loop type problems, we need to be sure on reading a customization ODD that following the chain of @source attrs (on any tagdoc elements) leads to an ODD that does not have an @source after, say, 10 (or 40,000) steps.
  • We need to test what happens if you have an <elementRef> to an element which is not included via a <moduleRef>: both what (if anything) the current chapter says about it, and what the current stylesheets do. There are six cases to be considered:
    • referring from schemaSpec/elementRef/@key to an element whose module is not referred to at all
    • referring from schemaSpec/elementRef/@key to an element whose moduleRef/@include does not mention it
    • referring from schemaSpec/elementRef/@key to an element whose moduleRef/@except explicitly says it is not included
    • referring from content//elementRef/@key to an element whose module is not referred to at all
    • referring from content//elementRef/@key to an element whose moduleRef/@include does not mention it
    • referring from content//elementRef/@key to an element whose moduleRef/@except explicitly says it is not included
  • We note that <elementRef> can occur as a child of <schemaSpec>, <specGrp>, or a descendant of <content>, the last of which has a completely different meaning from the other two. Thus we should raise a ticket to say: TEI should add a constraint (probably in Schematron) that specifies that content//elementRef can not have @source, and (schemaSpec|specGrp)/elementRef can not have either @minOccurs or @maxOccurs.
  • We also reminded ourselves of the importance of TEI issue 2306.
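
The guard against endless @source chains could look something like this Python sketch. It is illustrative only: the function name, the `source_of` lookup (which stands in for "open this ODD and read its @source") and the file names are hypothetical, and the default limit of 10 is just the number floated above.

```python
def follow_source_chain(start, source_of, max_steps=10):
    """Follow @source pointers from `start` until an ODD without @source is reached.

    `source_of(uri)` returns the resolved @source of the ODD at `uri`, or None.
    Raises RuntimeError if no terminal ODD is found within `max_steps` lookups.
    """
    chain = [start]
    current = start
    for _ in range(max_steps):
        nxt = source_of(current)
        if nxt is None:
            return chain  # reached an ODD with no @source
        chain.append(nxt)
        current = nxt
    raise RuntimeError(f"@source chain from '{start}' did not terminate in {max_steps} steps")

# Hypothetical lookup tables standing in for real documents.
GOOD = {"c.odd": "b.odd", "b.odd": "a.odd", "a.odd": None}
LOOP = {"x.odd": "y.odd", "y.odd": "x.odd"}

print(follow_source_chain("c.odd", GOOD.get))
try:
    follow_source_chain("x.odd", LOOP.get)
except RuntimeError as err:
    print(f"error: {err}")
```

A step limit catches both true cycles and implausibly long chains without needing to track visited URIs.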

Meeting 2024-03-20

SB, MH

  • We discussed the fact that the Council FtF did not get around to any ATOP tickets.
  • We discussed one approach to derivation which would start by building a map of all tagdocs elements with @mode, with the keys being the combination of @ident, <altIdent>, namespace, and ancestry of things that have @mode. This actually raises one interesting question: if you have an <attDef> with @mode="change" in an <elementSpec>, but the att itself is in a <classSpec>, how do you uniquely identify it in such a way that the map can give you the source tag that you need to actually process?

Meeting 2024-03-13

HBS, SB, MH

Meeting 2024-03-06

HBS, SB, MH

Suggestions buzzing around SB’s head:

  • Replacing <thingSpec ident=duck mode=replace><quack/></> with <thingSpec mode=delete/><thingSpec ident=duck mode=add><quack/></> could occur in the assemble stage.

  • Should modes be named like variables (e.g., either atop:mDeleteStuff or just mDeleteStuff) or like functions (e.g. atop:delete-stuff or just delete-stuff)? MDH likes the former.

    • For now go with "atop:mDeleteStuff" style names. (If we change our minds it is one global change.)
  • On <schemaSpec> in derived output:

    • The output of a derivation step is (by definition) a complete set of specifications — the customization ODD has been combined with the base ODD, and all the needed specifications should be present.
    • Unlike P5, the derived ODD is expected to be a complete language (and thus would have a <schemaSpec>, I guess)
    • Like P5, the derived ODD may be the base ODD for some other customization ODD (and thus you might think it would not have a <schemaSpec>).
    • I think it is better if the derived ODD has a <schemaSpec>. After all, we need a place to put @start, @docLang, @targetLang, @prefix, etc., right? ABSOLUTELY (MDH)!
    • That means the derivation step needs to be able to handle a base ODD that either does not have a <schemaSpec> (e.g., p5subset) or does have a <schemaSpec> (e.g., the derived ODD output of a previous customization). I am wondering a bit how different these will be, but for the moment am too tired.
  • On getting base ODD:

    • We need a function atop:get-base-odd-uri( $context as node() ) as xs:anyURI which returns the URI of the base ODD to be used. It needs to examine at least $context/@source and $context/ancestor::schemaSpec/@source.
    • We need a template atop:getBaseOdd which takes the parameter $atop:pBaseOddUri (which is typically set to atop:get-base-odd-uri(.)) and returns a single XML element (typically a <TEI>) which contains the ODD declaration elements (like <elementSpec>) of the base ODD. (Very often the <TEI> returned will be that of p5subset.)
    • Should we maybe bend our own rules and just allow a function atop:base-odd() to return the needed element? That element is going to mostly be used in XPaths. (E.g., atop:base-odd(.)//elementSpec[ @ident eq $atop:vTheOneIWant ].)
    • MDH & SB (after the fact, HBS had already left) decided we should follow our rules. So a function gets the URL; that URL gets handed to a template that tests that the thing exists and gets it; and the result of that template is an XML tree snippet that is assigned to a variable that is subsequently used in XPaths.
  • After some discussion, we decided:

    • NOT to try to handle the suggested future "clone" mode, at least not right away. Either let a specification for that emerge and tell us how it should be handled, or get a working ATOP system first before adding "clone". It's not necessary anyway, just a convenience.
    • Assembly happens first; it should be run on both the base ODD (just in case), and on the customization ODD.
    • During assembly, mode="replace" is refactored into one spec for deletion and one spec for addition.
    • Deletion definitely happens next; any subsequent attempt to use deleted components is an error.
    • Addition is next. The logic is that the added elements may contain changed elements via e.g. overrides of attributes inherited from classes.
    • Change happens next.
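
The mode="replace" refactoring decided above can be pictured with the following Python sketch. It is not ATOP code: the function name and the sample are hypothetical, namespaces are left out for brevity, and a real implementation would presumably carry @ns (and anything else that identifies the construct) onto the deletion stub as well as @ident.

```python
import copy
import xml.etree.ElementTree as ET

def split_replace(parent):
    """Rewrite each child with mode='replace' as a 'delete' stub followed by an 'add' copy."""
    # Walk a snapshot in reverse so earlier indexes stay valid while we insert.
    for index, child in reversed(list(enumerate(parent))):
        if child.get("mode") != "replace":
            continue
        deletion = ET.Element(child.tag, {"ident": child.get("ident", ""), "mode": "delete"})
        addition = copy.deepcopy(child)
        addition.set("mode", "add")
        parent.remove(child)
        parent.insert(index, deletion)
        parent.insert(index + 1, addition)

# Hypothetical input, echoing the "duck" example from these notes.
SAMPLE = """<schemaSpec ident="demo">
  <elementSpec ident="duck" mode="replace"><desc>quack</desc></elementSpec>
  <elementSpec ident="goose" mode="change"/>
</schemaSpec>"""

schema = ET.fromstring(SAMPLE)
split_replace(schema)
print([(spec.get("ident"), spec.get("mode")) for spec in schema])
```

After this step the later stages only ever see delete, add and change, which is what makes the fixed ordering workable.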

Meeting 2024-02-28

HBS, SB, MH

We discussed in detail a specific approach to the derivation process, and it would be to specify:

  1. The order in which operations should take place (delete, add, modify, for example)
  2. The order within one operation in which tagdocs components should be processed (valItems, then valLists, then etc.)
  3. A replace operation could be pre-expanded into a delete and an add.

We think that if it is possible to settle on a precise sequence for these operations, then any paradoxes in the schema specification will be either caught (you can't modify something which is already deleted) or will disappear.

If the Guidelines could incorporate such sequencing as part of the specification, we may then have something that is implementable (we hope). What we can never do is try to guess the user's intention when they do something irrational in a schemaSpec. If it is an error to try to modify an item that doesn't exist, then clearly it matters whether deletions happen before modifications.

Such a defined sequence would also make it possible for us to implement the mode="clone" proposal; if cloning was the first stage, before deletion, it would be trivial to implement and its behaviour would be unambiguous.

  1. Factoring: change mode=replace to a "delete" and an "add"
  2. Clone
  3. Delete (including deletions factored from replace operations).
  4. Override (atts from classes etc.)
  5. Modify
  6. Add

Is it an error to try to delete something which doesn't exist? We should probably warn but not fail the process. If the processing of deletions handles classSpecs first, then a subsequent attempt to delete an attribute inherited from an attribute class would generate a warning, but it isn't catastrophic because you already got what you wanted.
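
As a thought experiment, a fixed sequence with "warn but continue" for missing deletions might behave like this Python sketch. It is a toy model, not a proposal for the implementation: specifications are plain dictionaries keyed by ident, the clone and override stages are omitted, and all names are hypothetical.

```python
import warnings

# Processing order for the stages modelled here (clone and override are omitted).
ORDER = {"delete": 0, "change": 1, "add": 2}

def derive(base, instructions):
    """Apply (mode, ident, body) instructions to `base` in a fixed stage order."""
    specs = dict(base)
    # Factoring: a replace becomes a delete plus an add.
    factored = []
    for mode, ident, body in instructions:
        if mode == "replace":
            factored += [("delete", ident, None), ("add", ident, body)]
        else:
            factored.append((mode, ident, body))
    # sorted() is stable, so instructions keep their relative order within a stage.
    for mode, ident, body in sorted(factored, key=lambda item: ORDER[item[0]]):
        if mode == "delete":
            if ident not in specs:
                warnings.warn(f"cannot delete '{ident}': it does not exist")
                continue  # warn, but keep processing
            del specs[ident]
        elif mode == "change":
            if ident not in specs:
                raise KeyError(f"cannot change '{ident}': it does not exist or was deleted")
            specs[ident] = {**specs[ident], **body}
        else:  # add
            specs[ident] = body
    return specs

BASE = {"p": {"desc": "paragraph"}, "q": {"desc": "quotation"}}
INSTRUCTIONS = [
    ("add", "duck", {"desc": "new element"}),
    ("change", "p", {"desc": "para"}),
    ("delete", "q", None),
    ("delete", "ghost", None),  # triggers a warning only
]
print(derive(BASE, INSTRUCTIONS))
```

Because deletions run first, an instruction that changes something deleted in the same customization fails loudly instead of depending on document order.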

Meeting 2024-02-14

HBS, SB, MH

  • Completed our revisions to the paper, did a final read-through and proof, and submitted the results, along with an explanation of our changes.

Meeting 2024-02-07

HBS, SB, MH

  • Worked again on paper revisions, and completed a draft that addresses all reviewer comments we believe should be addressed; during the coming week we will write the response email to the editors and do a final proof.

Meeting 2024-01-31

  • More work on jTEI paper submission for the Newcastle conference (with some snippets from the Paderborn conference as well, now, per reviewer suggestion).

Meeting 2024-01-24

HBS, SB

Meeting 2024-01-17

HBS, MDH

  • No official meeting.
  • We reviewed SB’s first draft letter to Council discussing DM’s resignation.
  • We went over the summary that MDH did of the reviewers’ feedback to the Newcastle conference proceedings submission.

Meeting 2023-12-27

HBS, SB, MDH

  • Issue #2282
    • Action on SB and HBS to bring it to Council’s attention
    • Action on MH to send the approved email to the TEI mailing list to have an idea of how many ODD writers are using @name to point to RELAXNG attributes.
    • Sum-up of the discussion in this comment
    • Action on SB to create a test ODD and a new ticket concerning attributes with the same name that belong to different classes (e.g. @unit): what happens when an element is associated to these different classes? See issue number.

Meeting 2023-12-20

HBS, SB, MDH

  • Worked on Issue #13, and ended up deferring it.
  • Issue #21: closed
  • Issue #16:
    • To be modified so that it includes the localization restrictions because at this point it doesn’t correspond to any stage in our pipeline
    • New question posed about <dataRef>
  • PR #19: approved and merged.

Meeting 2023-12-13

HBS, SB, MDH

Meeting 2023-12-06

HBS, SB, MDH

  • Meetings:
    • URL to use? Agreed to one ending in mta
    • Wed 13 Dec (SB probably a little late), — will try to meet
    • Wed 20 Dec (SB probably very late, assuming we want to meet), — will try to meet &
    • Wed 27 Dec (yes, we are meeting)
    • Wed 03 Jan 24 (yes, we are meeting)
    • Wed 10 Jan 24 (50/50 for SB; others will be available)
    • Wed 17 Jan 24 (SB will be in Stowe, VT; has not yet decided what he will be doing.)
  • Issues https://github.com/TEIC/atop/issues/6 & https://github.com/TEIC/atop/issues/7 — are these settled?
    • Looked at #6 and determined that it is OK and can be closed; then looked at the peripherally-related issue of whether we should/can support multiple <altIdent> elements. We determined that the use-case is identical to that for multilingual <gloss>es etc., and so we simply added it to the prune/localize template for those elements. In other words, an incoming customization can have multiple <altIdents>, but a PLODD can have only one, and that one will be selected based on the localization rules.
  • Issue https://github.com/TEIC/atop/issues/8 and thus TEI https://github.com/TEIC/TEI/issues/2282. We looked at this and determined that using <attRef> to point to a collection of one or more attributes defined in RNG is not really a good idea; instead, there should be another way to do this (<rngRef>?). Unless and until that happens, ATOP need not concern itself with <attRef>s that don't point to ODD-defined entities, and ATOP will require both @name and @class, and we suggest (see https://github.com/TEIC/TEI/issues/2282) that TEI make the same requirement.

Meeting 2023-11-29

SB and MH

  • There was no meeting 11-22 as 3 of 4 of us could not make it.
  • Q: Any success stories from our release? (How many have been downloaded?)
  • SB believes that the consensus in Paderborn was that there should be an issue raised for requiring that a Schematron <constraintSpec> that has an <sch:assert> or a <sch:report> should be required to have a context specified (i.e., have a <sch:rule> with @context). MH created a new ticket #2510.
  • We discussed some of the ramifications of the new prune_and_localize.xslt routine for ultimate derived ODD → PLODD conversion.
    • It is now the focus of dev work for that stage of the build.
    • We updated the XSpec test to match.
    • Old routine (XSLT/prune_compiled_to_PLODD.xslt) now has a new health warning at top.
  • We have 2 PRs and 6 issues to be looking at.
    • We worked on the first of those, relating to copying of namespaces; MH merged it, and did some follow-up.

Meeting 2023-11-15

MH and HBS

  • Any finalizations to derived ODD definition? We think "last derived ODD" is fine, since any derived ODD can become an input for a further derivation.
  • For pruning stage that converts the last derived ODD to a PLODD, do we want to start with prune_compiled_to_PLODD.xslt (after all, it already does a lot of the stuff we need) or start from fresh (after all, we probably could do better than something Syd threw together in a few minutes)?
    • We started by examining the simplest part of prune_compiled_to_PLODD.xslt (<xsl:template match="desc|gloss|valDesc" as="element()?">), and we think there may be a requirement for more sophisticated logic:
      • If there is an element with the target @xml:lang, then it should be used. If there is such an element, all others should be ignored.
      • If there is no element with @xml:lang matching the target, then we should presumably choose between using the "en" sibling or using a sibling which lacks an @xml:lang; which is preferred? We settled on the latter, because that version must have been created by a customizer in the chain (it can't have come from p5subset).
      • If there is no element with @xml:lang matching the target, and no element without @xml:lang, then we should choose the "en" version.
      • If there is no element with the target language, with "en", or without @xml:lang, we should choose the first sibling.
    • We implemented the above by rewriting the template in the xslt file, and created a new XSpec test file for this transformation, with tests for the logic above. See commit: https://github.com/TEIC/atop/commit/de51ad8da1c3bfde06f24dd448afc3a0bd13fb81
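
For quick reference, the selection order above can be restated as a short Python sketch. This is not the XSLT template from the commit; the function name and the (language, text) pair representation are made up for illustration, and language matching here is exact string equality, which is an assumption.

```python
def choose_localized(candidates, target_lang):
    """Pick one text from (xml_lang_or_None, text) siblings for the target language."""
    # Preference order: target language, then no @xml:lang, then "en".
    for wanted in (target_lang, None, "en"):
        for lang, text in candidates:
            if lang == wanted:
                return text
    # Nothing matched: fall back to the first sibling, if there is one.
    return candidates[0][1] if candidates else None

SIBLINGS = [("en", "paragraph"), ("fr", "paragraphe"), (None, "our paragraph")]

print(choose_localized(SIBLINGS, "fr"))      # target language present
print(choose_localized(SIBLINGS, "de"))      # falls back to the one without @xml:lang
print(choose_localized(SIBLINGS[:2], "de"))  # falls back to "en"
```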

Meeting 2023-11-01

SB, DM, HBS, MH

  • Welcome back David!
  • Consistent message routine: we should use @use-when on <xsl:message> and have a global message level that is passed as a parameter. That parameter can be compared to a set of constants defining specific message levels. Questions:
    • How many levels? What should they be named?
    • What's the name of the parameter?
  • Syd’s tweaks to transpile.xslt. Action on SB to create PR for his transpile.xslt tweaks.
  • Name for penultimate ODD (aka terminal derived ODD). Our terminology page says "The last derived ODD in the chain is then pruned (unnecessary components removed, some or all references resolved) and localized (based on selected language) to create a PLODD. We call this stage pruning." But it doesn't name this specific derived ODD.
  • Returning to what we do at the assembly and pruning stages: we believe that we only need to care about <specGrpRef>s which are inside the <schemaSpec>.
  • SB believes that instead of starting from p5subset, a user could, of course, start from the entire P5 source.
  • <elementRef>, <macroRef>, <classRef>, <dataRef> (and of course <moduleRef>) as a child of <schemaSpec> get resolved at assemble time; as a descendant of <elementSpec> they would be resolved by the transpile step (we think — <attRef> probably resolved in derivation step; need to sort out, e.g., what happens with an <attRef> to a deleted <attDef> or an <attDef mode='delete'> referring to an attribute included via <attRef>).

Meeting 2023-10-25

MH, SB, HBS

  • MH suggests renaming of PLODD tests to transpile tests, because that's what they are; PLODD tests will be testing the prior step that we're writing next. All agree. Action on MH to do that.
  • Things we need to do in the pruning stage (i.e., the stage that converts last derived ODD, i.e. the terminal derived ODD, to a PLODD):
    • Localization, which means: given a single input language parameter, we
      1. select the item with that language specified if there is one; or
      2. select the English version if there is no target language version; or
      3. select the version with no specified language if neither of those exist.
    • <specGrpRef>s get resolved
    • <moduleRef>s get resolved
    • <attRef>s get resolved (see below for some stats; also TEI ticket #2282). [Because the transpile step does not deal with <attRef>s.] This presumably means some potential duplication.
    • Prose gets nuked

By next week, come up with a decent term for the last derived ODD in the chain, being the one that gets pruned.

N.B.: 447 <attRef>s in in_vivo, all of which have both @name and @class (they occur only in ~10 files)

Meeting 2023-10-11

HBS, SB, MH

  • We created an alpha release of the PLODD transpiler, and posted announcements in Slack and TEI-L.

Meeting 2023-10-04

SB, MH

  • Evaluated, tested and merged HBS's PR from August which fixed a lot of in-vivo ODD invalidities and added new examples.
  • Tested the PR for the release-package builder, found several bugs and omissions, fixed them, tested, and merged. We're now in a position to build a release package, which MH will do some time soon.

Meeting 2023-09-27

HBS, SB, MH

  • Release package branch: https://github.com/TEIC/atop/tree/release-package
  • ant -f buildRelease.xml creates a package and a zip of that package.
  • Releases will be versioned. Packages will then incorporate the version number as well as the date into folder name and zip file name.
  • Trying to run a transpile using the release package as currently built generates this error:
    • <c:errors xmlns:c="http://www.w3.org/ns/xproc-step"><c:error code="err:XD0011" href="file:///home/mholmes/tei/atop/release/2023-09-27/Util/pipeline.xpl" line="18" column="86" xmlns:err="http://www.w3.org/ns/xproc-error"><message>Error while loading. Resource not found: /home/mholmes/tei/atop/release/2023-09-27/Tests/resources/in_vitro_ODDs/transpile.plodd</message></c:error></c:errors>
    • Why should we require transpile.plodd here? What is it used for?
    • IIRC, it was used as the default input—Syd
    • We find 2 definitions of $inputTestPlodd, one of which is transpile.odd
    • We have fixed this and PLODD building is now working.
  • During the meeting we added some basic end-user documentation as well as incorporating the version file, tested, and issued a pull request to merge the branch into dev.

Meeting 2023-09-20

HBS, SB, DM, MH

  • Next meeting (09-27), SB needs to leave early
  • @context will eventually be required; action on MH to look at the current TEI source and do one or more pull requests adding all the missing contexts.
  • Progress (if any) on TEI tickets tagged ATOP
  • Progress on attribute alternation testing. Works well; still need to add another test for co-occurrence constraints. Action on MH to do that.
  • Interleave testing, which should be simple, since we don't need to worry about DTDs.
  • Action on HBS to edit the PLODD schema to disallow the use of @generate on <classSpec>.
  • Building a release package for the PLODD transpiler, with documentation: we think it would be a good idea, and we should write documentation for it in Markdown. Action on MH to make a list of what should be included and excluded, and figure out how the documentation might be organized. This release should include (and document) our code to generate a PLODD, but we would want to avoid problems being reported in that stage, which is only a temporary interim thing, rather than in the transpiler.
  • Setting up some basic timelines for getting our test suite settled and moving on to penultimate ODD → PLODD. Next step will be prune.xslt. This should be fairly simple to do; p5odds schema should be sufficient for validation, right?
  • Future meetings: DM cannot make regular weekly meetings, so we will continue weekly meetings but expect him to be available only once a month or so. Communication will be in the Slack channels.

Meeting 2023-09-13

SB, HBS, MH

  • SB reports he is almost done with new PLODD file for testing <attList> with org="choice". Still has to create expected results, valid instances, invalid instances. Hopes to have this done by next meeting.
  • Post-mortem on Paderborn presentation.
  • Arising from Paderborn: there is a general consensus among those who care that making @context mandatory is a good idea, so we will proceed on that basis. Action on MH and SB to add @context to all our in-vivo ODDs; MH will also look at the TEI source to see if there are any missing there, and do a pull request for Council with added @contexts, so that we don't end up with the TEI source out of sync with ATOP.
  • Also from Paderborn: there is general support for a new <interleave> element, so we should assume it will appear, and add examples and code for it in the transpiler. MH and SB will meet to look more closely at this before next week's meeting.

Meeting 2023-08-23

SB, HBS, DM, MH

  • Announcement: TEI now uses Mausatron (at least in a branch) [SB, 1 min]
  • Whether we have an XProc front-end or not, we still need to have a front-end that does not rely on XProc (but ant is OK), because oXygen does not run XProc 3 (or does it?) [SB, 2 mins]
  • MH to create ant pipeline for the transpile step to parallel the XProc pipeline
  • DM proposes creating a specific repo for the transpiler; this would enable people who already write ODDs with no dependence on TEI at all to use it now, and it could also be used as an alternative to the final stage of the current Stylesheets to improve them. HBS agrees; MH and SB both feel a little concerned that this would make current ATOP work more difficult to proceed with, and require that we create an external "application" to act as a front end to the codebase from more than one repository. This may be a good idea down the road, though.
  • Review & discussion of transpilation process: An interesting discussion around what should happen with moduleRefs pointing at external schemas; an example from Exemplars/tei:svg.odd is this:
        <moduleRef url="https://www.tei-c.org/release/xml/tei/custom/schema/relaxng/svg11.rng">
          <content>
             <rng:define name="tei_model.graphicLike" combine="choice">
                <rng:ref name="svg"/>
            </rng:define>
          </content>
        </moduleRef>
  • The plan we have come up with is:
    • A moduleRef like this will be converted into an rng:div child of the schemaSpec.
    • The content of its <content> model will be copied into that rng:div.
    • The content of the external schema will be copied into the rng:div. We believe that the order of this content does not matter.
    • The transpiler is expected simply to copy any such content directly to the output schema.
    • We will start by constructing a test set for this in the PLODD tests, and discover what happens, then fix any problems.
  • Currently unresolved questions:
    • It's not yet clear whether we should copy the entire <grammar> element from the external schema into the PLODD, or whether we should only pull in its children.
    • If the latter, we should presumably also transfer its namespace declarations to the rng:div container element in the PLODD, but what about attributes such as @datatypeLibrary?
    • Our initial test can avoid some of these complexities, but eventually of course we have to handle them; at this stage we only need to know what components we expect to find in the PLODD before transpilation.
  • We intended to discuss this week's test failure, but did not get time.
  • Discussion of the Paderborn paper.
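
The plan for external-schema moduleRefs described above (an rng:div holding the <content> children plus the external schema's content) might look roughly like this. It is a Python sketch under stated assumptions, not the ATOP XSLT: `module_ref_to_div` is a hypothetical name, the external grammar is passed in already parsed rather than fetched from the @url, and the sketch arbitrarily takes the "children only" side of the unresolved question, ignoring namespace declarations and @datatypeLibrary.

```python
import copy
import xml.etree.ElementTree as ET

TEI = "http://www.tei-c.org/ns/1.0"
RNG = "http://relaxng.org/ns/structure/1.0"


def module_ref_to_div(module_ref: ET.Element, external_grammar: ET.Element) -> ET.Element:
    """Build an rng:div from a moduleRef's <content> and an external grammar."""
    div = ET.Element(f"{{{RNG}}}div")
    content = module_ref.find(f"{{{TEI}}}content")
    if content is not None:
        for child in content:
            div.append(copy.deepcopy(child))  # copy <content> children as-is
    for child in external_grammar:
        div.append(copy.deepcopy(child))  # assumption: children only, not <grammar>
    return div


module_ref = ET.fromstring(
    f'<moduleRef xmlns="{TEI}" xmlns:rng="{RNG}" url="svg11.rng">'
    '<content><rng:define name="tei_model.graphicLike" combine="choice">'
    '<rng:ref name="svg"/></rng:define></content></moduleRef>'
)
# Stand-in for the external schema; its content here is invented for the example.
grammar = ET.fromstring(
    f'<grammar xmlns="{RNG}"><define name="svg">'
    '<element name="svg"><text/></element></define></grammar>'
)
div = module_ref_to_div(module_ref, grammar)
print([child.get("name") for child in div])
```

Copying the whole <grammar> instead would only change the second loop, which is one reason a test set in the PLODD tests seems a sensible first step.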

Meeting 2023-08-16

SB, HBS

  • MH and DM unavoidably absent.
  • Discussion of Paderborn paper, and creation of a doc to collaborate on it.

Meeting 2023-08-09

MH, SB, DM, HBS

  • Report from Balisage: the paper was not delivered in the end.
  • Review of to-do items:
    • #2454 https://github.com/TEIC/TEI/issues/2454
    • In discussion we realize that <constraintDecl>s will need to be chained, anyway, so allowing in both <encodingDesc> and <schemaSpec> should not be (more of) a problem
  • Creation of a Schematron SIG. See https://tei-c.org/activities/sig/sig-rules-and-regulations for more information. Those at the Paderborn conference will sound out other people to see who might want to be involved.
    • Write a proposal including the SIG’s mission and the contact details of the person responsible for the SIG
    • Send proposal to Technical Council’s chair
    • DM to draft initial broad mission statement, to be fleshed out by folks chatting in Paderborn.
  • TEI Conference paper rendering problem: MH is in touch with PS about how best to make the final tweaks.

Meeting 2023-07-26

MH, SB, DM, HBS

  • The extended abstract for Paderborn is due soon. DM found the HTML rendering of the paper and it is badly structured; nested lists are not working. Action on MH to get this fixed via the PC. The first few paragraphs also need to be changed: "We propose a “panel session” centered on" -> "This panel is centered on", "The session will start" -> "The session starts"; "would include" -> "include". DM also points out that the keywords are not always separated from one another correctly in the program listing.
  • Reviews are in e-mail dated 2023-06-08; we decided not to change anything material based on the reviews.
  • The 2022 paper from Newcastle is going to be re-presented at Balisage if a scheduled paper drops out. SB will check in the revised version of the slides, probably in < 36 hours.
  • Inclusion of common/conventional namespace bindings in Schematron output: We discussed whether we should include xsl: and xd: automatically, and we should suggest adding to the Guidelines section a link to a working paper which describes in detail how Schematron should be included in ODD, which proposes that tei:, rng:, xsl:, and xs: bindings will be added automatically and people don't need to specify these bindings themselves. Action on SB to raise a ticket on the TEI repo, with the ATOP flag, asking Council whether some namespaces should be declared automatically, and if so, which they should be.
  • HBS suggests that a TEI SIG should be created for other users interested in specific aspects of Schematron in TEI, so that they could provide us with input of how (for instance) multilingual output messages could be encoded and processed. Action on HBS: prepare some notes for next meeting about the formalities related to creation of a new SIG.
  • Action on HBS: add MEI ODDs to our in vivo folder

Meeting 2023-07-19

DM, MH

  • DM walked MH through the process of adding the new Schematron extraction step to the XProc; we tested that, and then removed the Ant equivalent as well as a redundant Ant validation call which is already covered by the XProc pipeline.
  • We looked at the possibility of validating the Schematron file, and what to validate it with; DM's own Schematron-schematron is one component, the rnc schema for Schematron, and the schematron.sch schema, both from the Schematron project. These three validation steps can be combined in a single XProc step, and their XVRL results combined into a single file which can then be parsed in a subsequent step, which could then choose whether to fail the build or not, and what level of info reporting to provide. Action on DM to write a step that validates the Schematron.
  • Action on SB and MH to continue building out the test suite for PLODD.
  • Action on ALL for the next meeting: Discuss the organization of materials in the Lib and Schemas folder. Should all external dependencies go in Lib, or should all schemas based on standards go in the Schemas folder, and all functional code drawn from external sources go in Lib?

Meeting 2023-07-12

SB & DM

Most of our time was spent discussing Schematron enhancement proposal issue #62. SB to write up a proposed response to R. Jelliffe’s comments and run it by DM before posting.

DM to put pointer to his MarkupUK talk in Slack, SB will put into repo.

We notice some errors (some minor, several serious) on the workgroups page:

  1. ATOP is referred to as “The Stylesheets Task Force”, which name is too close to “The Stylesheets Group”. Either “The Stylesheets Re-write Task Force” or just ATOP.
  2. DM is not listed on the ATOP charge page.
  3. The blurb contains “It is aimed at converting ODD to the TEI Guidelines, schemas, and customization ODDs into schemas and customized output.” which is simply incorrect. We are aimed only at converting ODD (whether a single ODD, a customization ODD, or a chain of customization ODDs) to schemas.
  4. The Documentation group that is busy using TEI Publisher to generate HTML Guidelines and customized documentation from ODDs is not mentioned. (And should be.)
  5. On the ATOP charge page I see 5 spaces (U+0020) between “these should be” and “dedicated primarily to”, both in the HTML source and on the page as I see it. I do not understand this, as I thought browsers generally reduced sequences of whitespace characters to a single space.

Meeting 2023-06-28

HBS, SB, MH

  • Fix for macroSpec for PLODD schema: see atop-internal chat.
  • Once that's fixed, we could make the rest of the schematron_in_contexts.plodd file valid
  • We hereby make a decision: at each stage when a customization is applied to an existing ODD, all sch:ns declarations will be collected into the not-yet-available constraintDecl element, and if we find two namespaces declared with the same prefix, we stop and raise an error. We don't care if multiple prefixes are used for the same namespace.
  • The current iterative function for generating ns declarations should be turned into a named template which generates a collection of sch:ns elements, which we create as a global variable, and then we just write that out and look stuff up in it. We can generate all the elements in a single pass through the document, and then prune the dupes in a second pass. At that point we can error out if the same prefix is mapped to different namespaces, and we can also simply retain cases where the same ns has multiple prefixes.
  • We looked at the @start attribute, which is optional. SB thinks that @start on schemaSpec is not right, and it would have been better for elementSpecs to declare themselves as root elements, but that's not what we have. We are tending towards thinking that by the time we get to PLODD, @start should already have been specified, but that's just kicking the problem back along the chain.
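
The sch:ns decision above (collect everything, error on a prefix bound to two namespaces, tolerate several prefixes for one namespace) can be sketched like this. It is an illustrative Python model only: `collect_ns` and `PrefixConflict` are hypothetical names, and declarations are modelled as plain `(prefix, uri)` pairs rather than sch:ns elements.

```python
from typing import Dict, Iterable, Tuple


class PrefixConflict(Exception):
    """Raised when one prefix is declared for two different namespaces."""


def collect_ns(declarations: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Return prefix -> URI, pruning exact duplicates and failing on conflicts."""
    prefix_to_uri: Dict[str, str] = {}
    for prefix, uri in declarations:
        known = prefix_to_uri.get(prefix)
        if known is None:
            prefix_to_uri[prefix] = uri  # first sighting of this prefix
        elif known != uri:
            raise PrefixConflict(f"prefix '{prefix}' bound to both {known} and {uri}")
        # same prefix, same URI: a duplicate, silently dropped
    return prefix_to_uri


# Two prefixes for the TEI namespace are fine; the repeated 'tei' is pruned.
harvested = [
    ("tei", "http://www.tei-c.org/ns/1.0"),
    ("t", "http://www.tei-c.org/ns/1.0"),
    ("tei", "http://www.tei-c.org/ns/1.0"),
    ("rng", "http://relaxng.org/ns/structure/1.0"),
]
print(collect_ns(harvested))
```

This version does collection and pruning in one loop; the two-pass approach described in the notes would give the same result.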

Meeting 2023-06-20

HBS, SB, MH

  • Progress (if any) on testing of transpiler: We looked briefly at the transpiler and noted that expand-text="yes" is set on the root of the XSLT file. This makes MH and SB uneasy, since we typically only use it in specific contexts where we define it precisely where it's used, but this is presumably DM's normal practice. We should discuss this to see if it should be ATOP-wide, or should be avoided; using it in some places but not others seems risky.
  • Schematron enhancement proposal issue #62: https://github.com/Schematron/schematron-enhancement-proposals/issues/62 . We looked at this, and none of us actually understand RJ's response. We THINK it means that the XML namespace standard says nothing about qualified QNames used in the context of attribute values, and therefore the standard mechanism for specifying prefixes for element and attribute names has nothing to say about Schematron attributes, but if so that still doesn't explain why Schematron doesn't follow XSLT's lead in simply saying that for attributes bearing XPath-type content, xmlns[:] declarations will be in force. In choosing to have its own mechanism Schematron is making life a bit more complicated and introducing potential confusion. But we want to check in with DM about this before responding.
  • DM’s slides from MUK: DM missing today so we did not look at these.
  • Meeting of Wed 05 Jul (SB cannot make it on time, if at all; MH and HBS will meet)
  • Decision on NS prefixes for Schematron: If there is a prefix defined for a namespace, use it; if there isn't, make one up; if there's more than one, also make one up, to avoid having to decide which to use.
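
The prefix decision in the last item can be sketched as below. This is a Python illustration under assumptions: `assign_prefixes` is a hypothetical name, the input is a map from namespace URI to the prefixes declared for it, and invented prefixes borrow the atopns1, atopns2 pattern from the 2023-03-22 notes. It assumes prefix conflicts have already been rejected elsewhere.

```python
from itertools import count
from typing import Dict, List


def assign_prefixes(declared: Dict[str, List[str]]) -> Dict[str, str]:
    """Choose one prefix per namespace URI.

    Exactly one declared prefix: use it. None, or more than one: invent one.
    """
    taken = {p for prefixes in declared.values() for p in prefixes}
    counter = count(1)
    chosen: Dict[str, str] = {}
    for uri, prefixes in declared.items():
        unique = sorted(set(prefixes))
        if len(unique) == 1:
            chosen[uri] = unique[0]
            continue
        candidate = f"atopns{next(counter)}"
        while candidate in taken:  # never reuse a prefix someone declared
            candidate = f"atopns{next(counter)}"
        taken.add(candidate)
        chosen[uri] = candidate
    return chosen


# Hypothetical input: one clean case, one undeclared, one ambiguous.
declared = {
    "http://www.tei-c.org/ns/1.0": ["tei"],
    "http://relaxng.org/ns/structure/1.0": [],
    "http://www.w3.org/1999/xhtml": ["h", "html"],
}
print(assign_prefixes(declared))
```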

Meeting 2023-06-13

MH, SB, DM, HBS

  • #2306 — “tagdocs elements are available in silly contexts”: Made GREEN for HC (with SB contributing) to implement with 6 month deprecation.
  • #2173 — “Add warning when using <constraintSpec> inside <classSpec>”:
    • Council supports solution proposed by SB. See: https://github.com/TEIC/TEI/issues/2173#issuecomment-917810897
    • Council may later propose/decide to make @context mandatory for all cases
  • #2369 — “Need to clarify the relationship between classSpec/@generate and classRef/@expand”: The notes do not indicate a definitive answer, but SB and HBS recall that there was an agreement to ignore @generate.
  • #2426 — “Usage of <dataRef>+<dataSpec> and <macroRef>+<macroSpec>”: Assigned to SB; status: Go for implementing Schematron. (To implement exactly what, I am not entirely sure. But at least to disallow dataSpec//anyElement and dataSpec//elementRef, I’m pretty sure :-)
  • MH was writing XSpec test for get-schematron-test() and was using a global map, when he should have of course passed the map as a parameter. We decided that as a core principle we should keep modules discrete so that their tests do not need to depend on other resources.
  • A possible point to make at the end of the Paderborn conference panel: the process of creating ATOP should hopefully free us from the fear of extending and enhancing ODD, which we have suffered from to some degree simply because it was so difficult to contemplate adding new features to the old Stylesheets codebase.

Meeting 2023-06-07

MH, SB, DM, HBS

  • Report from Markup UK
  • review of TEI tickets important to us (in prep for TEI Council meeting, Fri 09): https://github.com/TEIC/TEI/issues?q=is%3Aopen+is%3Aissue+label%3Aatop
  • Selection of issues:
    • 2357 = low priority
    • 2381 = v. low priority
    • 2426 = med priority and easy
    • 2371 = complicated issue that needs more discussion on ticket 1st
    • 2369 = med priority and easy
    • 2330 = complicated (because of chaining), we want to discuss more internally
    • 2306 = high priority and easy
    • 2285 = done
    • 2282 = moderately complicated, we want to discuss more internally

Meeting 2023-05-31

MH, SB, DM, HBS

  • We talked about DM's Markup UK slides, which are close to finished, and which we will try to do a final proof on soon.
  • We discussed the progress on Schematron contexts; MH triggered an interesting discussion on the XML Slack which gave us a programmatic solution to parsing assert/@test and @context attributes should we need to use it, but we're coming to a consensus that we should propose to the TEI that they:
    • specify exactly which contexts allow Schematron without an explicit @context, and
    • for each of those contexts, clarify exactly how a Schematron processor should act.
  • Pending such clarification, we agreed to try to supply @context only in the case of elementSpec/child, elementSpec/attList/attDef/child, and schemaSpec/child. We will attempt to harvest and sort out ns declarations using all three mechanisms (sch:ns, tei:*/@ns, and xmlns) and supply at least one prefix for each, globally.

Meeting 2023-05-24

MH, HBS, SB

  • Non-DTD constructs
    • We decided we should not warn about structures which cannot be converted to DTDs, because that suggests we provide some kind of support for DTDs and might encourage people to use them.
  • More on Schematron contexts (i.e., namespaces)
    • Long discussion on the Schematron namespace problem; action on MH to ask the XML Slack group if it's going to be possible to parse XPath and expand the namespaces used.
    • The current map-building code does not yet account for TEI @ns attributes; nor does it handle namespaces which are associated with a prefix that is also declared for another namespace. Action on MH to keep working on this.
  • Progress on test suite
    • Decision: unnecessary to test all possible combinations, just a reasonable subset.
  • Debugging Syd's build... we ran out of time.

Meeting 2023-05-17

SB, MH, HBS, DM

  • Report from the Council meeting
    • atop issues were not discussed but a TEIC meeting will be dedicated to them either in June or July. atop members will be invited to this meeting.
  • What needs to be done to get testing of transpile step in good condition in < 3 weeks?
  • We have a problem with the ODD for PLODD files, which doesn't have a git-tracked RNG output, meaning that everyone will need to build it locally, and will build it differently.
    • Decision: to have the schemas in the repo until we have a complete build process. MH has done this for the PLODD schema.
  • Building a set of namespaces and prefixes per MH post on Slack.
    • Original post: I've been laying the groundwork for the Schematron rendering in the transpile step, avoiding all the difficult issues for now. For our meeting tomorrow, I'd like to just run through that with everyone to make sure I'm not missing anything obvious, but also I think we need to figure out a way to build a map of all active namespaces and their prefixes in the PLODD, which we can use to create a set of sch:ns elements, and also to look up a working prefix for any namespace that shows up when we're constructing the context. We can build a map using in-scope-prefixes() and namespace-uri-for-prefix(), but I'm a little hazy on the details; ideally we include every prefix which is likely to be used in an inferred context, but also every prefix that has actually been used in an explicit prefix. I'd like to come away from tomorrow's meeting with a clear plan for how to proceed on that.
    • Decision on the first task:
      • Build a global map from which we can generate all the ns elements, and which we can use to look up an appropriate prefix for any ns encountered in a Schematron fragment.
  • Schematron talk @ Markup UK? There is probably plenty of interesting stuff for a talk based on simply presenting how TEI embeds Schematron rules and fragments into the Pure ODD structures, and the processing problems this generates. DM created a Google Slides document so we can collaborate [link in the Framapad].

Meeting 2023-05-03

SB, DM, MH

  • We started by looking at the question of whether <dataRef>s should be resolved BEFORE the PLODD stage, or whether they should persist in the PLODD, along with their required <dataSpec>s. Our decision is that we should assume for now that <dataRef> and <dataSpec> should persist until the PLODD stage, and MH has tested that this works. We also note that an error should be raised if a <dataSpec> contains an element reference when the <dataSpec> is referenced from a <dataRef> used in an attribute definition. However, we believe we should raise a ticket to make a clear distinction between macros and <dataSpec>s, such that while both macros and <dataSpec>s can be used in the definitions of elements and attributes, only <dataSpec> should be used for attributes, and therefore <elementRef> should not be allowed in it. Action on MH to raise a ticket for this (DONE: https://github.com/TEIC/TEI/issues/2426). The idea would be to add Schematron which prevents the use of <elementRef> or <anyElement> inside <content> in a <dataSpec>.

  • This could or should be checked at the PLODD stage, using a Schematron rule which uses a recursive function to check for any eventual chain of <dataSpec>s which end up causing the use of <elementRef> or <anyElement> in an attribute definition. This could result in an infinite loop, so the recursion needs a check on that.

  • Next meeting will be cancelled because three members cannot attend.
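
The recursive check with a loop guard mentioned above could be modelled as follows. This is a Python sketch of the idea rather than the Schematron/XPath function itself: the dictionary model of <dataSpec> references and the name `reaches_element_content` are assumptions for illustration.

```python
from typing import Dict, List, Optional, Set


def reaches_element_content(
    ident: str,
    refs: Dict[str, List[str]],
    has_element_content: Set[str],
    _seen: Optional[Set[str]] = None,
) -> bool:
    """True if this dataSpec, or any dataSpec it references, allows elements.

    refs maps a dataSpec ident to the idents its dataRefs point at;
    has_element_content lists idents that directly contain <elementRef>
    or <anyElement>.
    """
    seen = set() if _seen is None else _seen
    if ident in seen:
        return False  # already visited: this is what stops a circular chain
    seen.add(ident)
    if ident in has_element_content:
        return True
    return any(
        reaches_element_content(target, refs, has_element_content, seen)
        for target in refs.get(ident, [])
    )


# Hypothetical idents: a -> b -> c, where only c contains an elementRef.
chain = {"a": ["b"], "b": ["c"]}
print(reaches_element_content("a", chain, {"c"}))

# A circular chain with no element content terminates and reports False.
loop = {"a": ["b"], "b": ["a"]}
print(reaches_element_content("a", loop, set()))
```

The shared `seen` set is the guard: without it, the second call would recurse forever.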

Meeting 2023-04-26

SB, DM, HBS, MH

  • We worked on the panel proposal for TEI 2023.

Meeting 2023-04-19

SB, DM, HBS, MH

  • Action on SB to fix the remaining test to make the Jenkins build work.
  • Action on MH to add a test target to build.xml which calls our existing tests:
    • ant -f buildTest.xml test.xspec
    • ant -f buildTest.xml runPloddTests
  • Action on everyone to provide bios (here in the framapad) and on MH to insert into our repo and somehow into the submission process.
  • Paderborn proposal: We decided to split the proposal into two, one being the proposal (which needs to be concise and conform to the submission limit) and one being the actual presentation script, which can be expansive and detailed.
  • Action on everyone to read and edit the proposal to flesh it out a bit, and add links to relevant tickets.

Meeting 2023-04-12

SB, DM, and HBS

We send our warmest wishes to MH and his family.

  • Paderborn paper proposal
    • Discussion about the topics that can be addressed in a paper for Paderborn, since the list created on March 15th is too large to handle in a single paper.
    • A panel seems the most appropriate format. Action on SB: to create the first draft of the panel.
    • The idea of also presenting a workshop was dropped. The panel could include, within the introduction of our work, some instructions on how to test the available code.
  • JTEI paper proposal (Newcastle conference)
    • Action on SB: to upload a TEI version of the JTEI paper by Friday.

Meeting 2023-04-05

SB, HBS, and DM met for an abbreviated time on Zoom (as 3+ people cannot Huddle on free Slack)

All send their best wishes to MH and family. He will probably also miss 04-12 meeting, will probably be at 04-19 meeting.

Both the write-up of the Newcastle 2022 presentation as a jTEI paper and the TEI conference proposal for Paderborn 2023 are due shortly after our next meeting (Wed 12 Apr). [To be precise: proposals for Paderborn are due Sun 04-16 22:59Z (an hour earlier than you might expect); jTEI paper is due Mon 04-17, no time specified.]

DM notes he has extra availability to review on Tue 11 Apr.

To-do list:

  • SB to write Paderborn proposal.
  • DM to write up a draft proposal for a 1-day “Taking ATOP for a Spin” workshop at Paderborn. People bring TEI customizations, we see what ATOP generates. Discussions of problems designing ODD processors likely to occur as side conversations.
  • SB to write jTEI paper from Newcastle presentation in some sharable medium. (Probably start in Framapad, but perhaps as a TEI document in our repo, which is what it will end up as, anyway.)

Meeting 2023-03-22

SB, HBS, and MH.

  • Minute taking. The last few minutes of each meeting should be spent reviewing the minutes together to assure that all important decisions are recorded (while they are still fresh).
  • Test suite advancements. SB presents tests for @namespace and <altIdent>. The RELAX NG files generated from running the testing PLODD both with the old Stylesheets and the atop processor are equivalent.
  • Schematron context derivation: on-demand or decorate-them-all? On-demand for now, at least. We discussed the question of how to handle situations in which a single constraintSpec wraps both fully-realized Schematron rules and Schematron fragments which need to be enclosed in a rule with a derived @context. The approach should be: 1. apply-templates to all rules; 2. Create a single rule wrapper for all non-wrapped sch content. We also discussed the issue of how to discover a) the namespace required (which is working already in the function MH has written), and b) an in-scope prefix for that namespace; we determined that the best approach might be to generate a map of all namespace URIs to the in-scope (for the target element) prefixes for that namespace, and then take the first of those if there is one; if there is no in-scope prefix, we should invent one with the pattern atopns1, atopns2 etc., using an accumulator to ensure we don't duplicate.
  • Following up from 2023-03-08, we need to clarify the decision we reached about @key and @name in PLODDs: We do (and we should) allow both @key and @name, but the Schematron for PLODD files should check that there is a referent for every @key. Action on MH to implement that Schematron.
  • Schedule for paper production: not discussed due to lack of time
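
The two-step approach to mixed Schematron content discussed above (pass complete rules through, then wrap everything else in a single rule with a derived @context) can be sketched like this. It is a Python illustration, not the transpiler code: `wrap_loose_content` is a hypothetical name, the derived context is simply passed in as a string, and the sketch assumes the container holds only sch:rule elements and loose assert/report content.

```python
import copy
import xml.etree.ElementTree as ET

SCH = "http://purl.oclc.org/dsdl/schematron"
RULE = f"{{{SCH}}}rule"


def wrap_loose_content(constraint: ET.Element, derived_context: str) -> list:
    """Return complete rules unchanged, plus one wrapper rule for loose content."""
    rules, loose = [], []
    for child in constraint:
        (rules if child.tag == RULE else loose).append(copy.deepcopy(child))
    if loose:
        wrapper = ET.Element(RULE, {"context": derived_context})
        wrapper.extend(loose)  # a single wrapper for all non-wrapped content
        rules.append(wrapper)
    return rules


# Invented example: one full rule and two loose fragments.
constraint = ET.fromstring(
    f'<constraint xmlns="http://www.tei-c.org/ns/1.0" xmlns:sch="{SCH}">'
    '<sch:rule context="tei:div"><sch:assert test="@type">typed</sch:assert></sch:rule>'
    '<sch:assert test="@n">numbered</sch:assert>'
    '<sch:report test="@rend">rendered</sch:report>'
    '</constraint>'
)
result = wrap_loose_content(constraint, "tei:p")
print([rule.get("context") for rule in result])
```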

Meeting 2023-03-15

DM, SB, HBS and MH.

Things to cover in the paper for TEI2023:

  • Schematron contexts: when can we reliably derive a context, and when should we force users to supply a context?
  • @generate and @expand on content models.
  • <interleave> (or <bag> or <verschachteln>?)
  • no duplicate /attList/attDef/@ident
  • specify Schematron query language binding (see: https://github.com/TEIC/TEI/issues/2330)
  • tagdoc elements allowed most anywhere
  • Schematron to avoid contradictory use of @require and @except
  • three functions of an ODD file: documentation, customization, schema modelling. tagdocs elements may occur in various contexts for different purposes. How should ATOP transformation handle that? In an early stage, we could pull all such elements into specific contexts in the schemaSpec. e.g. classRef has different semantics when it's inside a content model from when it's not. Not to mention the processing model stuff (which ATOP can ignore).

Meeting 2023-03-08

MH, DM and SB.

  • CI build — what does it do?
    • MH: It runs:
      • ant test.xspec
      • ant runPloddTests
    • If any test fails, the build fails, and SB and MH receive an email.

  • PLODD schema: <gloss> in <attDef> should be allowed, no? Both <gloss> and <desc> are used in constructing the annotations that e.g. Oxygen uses to provide context-sensitive help. This problem turned out to be caused by a recent change to P5 which removed the model.glossLike class; the PLODD odd needs to be updated to handle that.

  • Namespaces for testing: we need more than 1. MH has two additional ones in the nascent Schematron test, but a general policy might be helpful. We decided that all test namespaces should be based on http://www.tei-c.org/ns/atop/test_suite. Action on MH and SB to fix any test namespaces they have already created.

  • Should a <dataRef> in a PLODD have a @key, or only @name? In other words, should all @keys be expanded to something else by the PLODD stage? Is that possible? (Note: this question arose because MH inadvertently created a PLODD with @key without including the required <dataSpec>.) @key should work fine as long as the target <dataSpec> is there.

Meeting 2023-03-01

All present.

  • We need distinct Schematron for checking RELAX NG files resulting from the transpile step. This should be a distinct file, separate from the PLODD file. So far MH has not come up with any cases in the existing three tests which need Schematron, but there will be cases down the road.
  • We confirmed the earlier decision that Schematron expanded and extracted from the PLODD file should go in a single div at the end of the RELAX NG file.
  • Re usage of EQName in Schematron: For query binding xslt3 this is possible in all XPath expressions (context, test, select etc.).

"The query language used shall be the extended version of XPath3 specified in XSLT3 with backwards compatibility mode as false. Consequently, the data model used shall be the data model of XDM constructed from an infoset or a PSVI. All namespaces, prefixes, functions and operators defined by XPath3 Functions shall be available. An implementation may allow user-written functions and extensions, in the appropriate namespace."

  • Action on MH to create a sample PLODD file which exemplifies every context in a PLODD file in which we may plausibly find a partially-specified Schematron context. This would be the input to some tests, which we would then use to build the function.
  • Action on SB to implement the next two PLODD tests, for <elementSpec>s with an <altIdent>, an @ns, and both, covering @ns on and absent from <schemaSpec>.
  • We will discuss submitting a paper to the TEI conference on generating Schematron contexts, particularly with regard to <classSpec>s.

Meeting 2023-02-22

Only SB and MH available.

  • Structure of test ODDs: MH would like to suggest that they be constructed such that a single instance file can be constructed which tests most of the things they're testing. SB agrees, and the current elements test is a good example.
  • We reviewed the new pair of test targets for the element tests, and considered how to handle the invalid instances. We agreed that the simplest thing for now is just to channel the error messages into an actual-results file and diff against expected-results. Action on MH to set this up.
  • We had other discussions on Ant and ways to minimize runtime and resources used.

Meeting 2023-02-15

HBS not available.

  • GitHub issue labels on the ATOP repo: we tweaked some of them and may make further changes. We added a "blocked" label.
  • Reorganization of ant build files (MH):
    • The ant build is now broken up into several files, but user still expected to interact with build.xml.
    • Action on MH (DONE) to fold build.properties into the XML build files (the logic being all of our users are happy reading XML)
  • Discussion: What p5subset should we make the default? Live release (requiring Internet)? Download it locally, and then default to local copy if we can't get it from the web? (I think I prefer that. Of course we would allow override.) Action on MH to lay out the algorithm for this.
  • Discussion on the relationship and interdependency of P5 and ATOP. Council does not want ATOP in the TEI repo; MH likes idea that ATOP is wrapped up with TEI release, so that compatible versions always ship together.
  • SB suggests each release of ATOP should explicitly list the versions of P5 it works with; for an older version not in the list, give an error, and for a future version not in the list, give a warning. The compatibility could be checked by running all our tests with each specific version of TEI.
  • Testing of the transpiler step:
    • What do we call the test RNG generated in each of the test folders? test.rng?
    • Test suite structure: build .plodd to .rng in the same folder. instances_valid and instances_invalid folders should be inside each test folder, and the test build should validate those with the built schema. These test files should be heavily commented. The generated rng should be validated against the RELAX NG schema, and should also be validated against a single global project-level Schematron file that tests generic things, and against a local Schematron file that checks things that are specific to the test context.
  • GitHub issues:
  • Action on MH to check whether the current Stylesheets support defining one TEI version on the <schemaSpec> and a completely different one on (say) an <elementSpec>. If the current Stylesheets don't support this, DM and MH would like to suggest that we raise an issue that suggests removing this from the Guidelines, because it's underspecified and possibly unimplementable. SB is not sure.
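SB's suggested compatibility policy (error for an older unlisted P5 version, warning for a newer unlisted one) might be sketched as below. The function name, the return strings, and the assumption that versions are purely numeric dotted strings are all ours; a version that falls between two listed versions is treated as "older" here, which the notes do not specify.

```python
def check_p5_version(version: str, supported: list) -> str:
    """Classify a P5 version against the list an ATOP release was tested with.

    Returns "ok", "warning" (newer than anything listed), or "error".
    """

    def key(v: str) -> tuple:
        # Assumes purely numeric dotted versions such as "4.5.0".
        return tuple(int(part) for part in v.split("."))

    known = {key(v) for v in supported}
    if key(version) in known:
        return "ok"
    if key(version) > max(known):
        return "warning"  # future release: probably fine, but untested
    return "error"        # old or in-between release that was never tested
```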

Meeting 2023-02-08

HBS not available.

  • Continuing the discussion of Schematron, we touched on: can we, and should we try, to generate missing @context attributes for constraints defined in the context of a <classSpec>? If so, we can only do this at the final stage, transpile, because only then do we know all the element members of a class.
  • We also determined that earlier in the chain we can generate @context attributes for the elementSpec/constraintSpec and elementSpec//attDef/constraintSpec contexts. It was suggested we should probably use Clark notation (Q{ns}ident) at that stage, and then only at the transpile stage convert to a convenient prefix.
    • SB addendum: except for those prefixes, if any, that are defined in the <constraintDecl> (if & when TEI creates that element), no?
  • DM suggests the “Schematron State of the Union” talk from XML Prague 2022: https://www.youtube.com/watch?v=pb5oHBQpGM0
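To illustrate the two-stage idea above (namespace-qualified Q{ns}ident names early, convenient prefixes only at transpile), here is a rough sketch. The helper names and the regular expression are our own simplifications; a real implementation would need a proper XPath tokenizer rather than a regex, and would have to respect any prefixes declared by the user.

```python
import re


def to_eqname(ns: str, ident: str) -> str:
    """Build the Q{ns}ident form used at the early stages."""
    return f"Q{{{ns}}}{ident}"


# Simplified: a name ends at whitespace or common XPath punctuation.
_EQNAME = re.compile(r"Q\{([^{}]*)\}([^\s/\[\]()@|=,]+)")


def to_prefixed(context: str, prefixes: dict) -> str:
    """Rewrite Q{ns}local names in a context using a namespace-to-prefix map."""

    def repl(match):
        ns, local = match.group(1), match.group(2)
        if ns == "":
            return local  # no namespace, so no prefix
        return f"{prefixes[ns]}:{local}"

    return _EQNAME.sub(repl, context)
```

For example, with the TEI namespace mapped to the prefix tei, a context built from two `to_eqname()` calls joined by "/" comes out as tei:div/tei:p.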

Meeting 2023-02-01

HBS not available.

  • Pull-Request: Remove support for classRef/@expand and classSpec/@generate: https://github.com/TEIC/atop/pull/19

  • We discussed TEI issue https://github.com/TEIC/TEI/issues/2369, trying to frame a position with regard to these attributes that we would support, because it would be easily/logically processable. We are leaning towards a position where @generate is removed, and @expand would get a default of alternate, along with the interleave-related values from RELAX NG, because only interleave values make sense; there is no logical sequence which can be defined or inferred from an ODD structure. Action on SB: Look into this more carefully, especially with respect to sequence, and then post his conclusions on the ticket.

  • We then went on to discuss the issue of Schematron: at which stage should inferred contexts be expanded, and where should the resulting Schematron be stored in the output RNG? The current stylesheets expand inferred contexts, but retain the location of a Schematron rule from an elementSpec such that it appears within the define/element context in the output RNG. We may prefer a simpler approach, where the Schematron is all grouped under a div with an explanatory attribute (suggestions for name?), making for slightly easier discovery and/or export to create a Schematron schema. As this ticket indicates: https://github.com/TEIC/TEI/issues/2173 there are contexts in which it is very difficult to infer context, so ideally we can make the case that rules without @context should not be allowed in difficult or ambiguous contexts. That would make it easy to make all contexts explicit much earlier in the process, so all the transpile step needs to do is to spit out the rules.

Meeting 2023-01-18

HBS not available.

  • This was a shorter meeting than usual because we would rather spend the time actually doing things, and have more things to do than things to discuss.
  • SB introduced an XSLT script he's writing that will potentially generate test PLODDs rather than our having to manually construct them. This may or may not prove efficient.
  • Action on MH to write the XSpec tests for the atop:repeat-content() function.
  • We agreed we do not need every single combination of @minOccurs & @maxOccurs in our test PLODDs. Rather, most of the heavy lifting will be done by XSpec, and the PLODDs need only have at least one case each of unspecified, 0, 1, 2, and unbounded. (They do not need the combinatorial nightmare.)
  • Action on SB to spend a bit of time testing an upper limit that causes performance problems.
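For orientation, the occurrence cases listed above could map onto RELAX NG compact-syntax patterns roughly as follows. This is not the atop:repeat-content() function, only an illustrative sketch; the function name is ours, None stands for "unbounded", and the defaults of 1 and 1 for unspecified values are an assumption.

```python
from typing import Optional


def repeat_pattern(name: str, min_occurs: int = 1,
                   max_occurs: Optional[int] = 1) -> str:
    """Expand an occurrence range into a RELAX NG compact-syntax sequence."""
    if min_occurs < 0:
        raise ValueError("min_occurs must not be negative")
    if max_occurs is None:  # unbounded
        if min_occurs == 0:
            return f"{name}*"
        # (min - 1) required copies, then one-or-more
        return ", ".join([name] * (min_occurs - 1) + [f"{name}+"])
    if max_occurs < 1 or max_occurs < min_occurs:
        raise ValueError("max_occurs must be >= 1 and >= min_occurs")
    # min required copies followed by (max - min) optional copies
    parts = [name] * min_occurs + [f"{name}?"] * (max_occurs - min_occurs)
    return ", ".join(parts)
```

Because the bounded case emits one particle per permitted occurrence, the pattern grows linearly with the maximum, which is one reason a very large upper limit could plausibly hurt performance.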

Meeting 2023-01-11

HBS not available.

  • We agreed that we need an ant target (or XProc pipeline) for the Tests/PLODD_test_suite/*/*.plodd files, i.e. a single process that will collect those files, transpile them, and test the outputs.
  • Although it is a reasonable idea to allow PLODD files to have a <revisionDesc> just so we can store changes to our own test files, we agreed not to allow them: the PLODD schema (i.e., Schemas/ploddSchemaSpecification.rnc) should continue to disallow <revisionDesc> in a PLODD, on the theory that a) they are often inaccurate, b) we want PLODDs to be as trim as possible, and c) we have no changes we want users to know about yet, anyway. (We do want each other to know, but we have git commit msgs and Slack and email and…)
  • During the meeting, we reorganized the transpile target (of build.xml) so that it expects a .plodd file as input (specified on the -DinputTestPlodd= switch) and outputs a .rng to the same folder.
  • Consider: Do we want to have the content of each of the child directories of Test/PLODD_test_suite/ have a consistent naming convention (like thing2test.plodd, validInputs/ invalidInputs/, whatever). The consensus is that the names within the folder should all be the same (i.e. we should not repeat the directory name in the filename).

Action items:

  • MH: create ant target for transpiling all of the PLODD test suite files, validating the result (against both RELAX NG and any special-purpose schemata), and validating any test instances (both those intended to be valid and those intended to be invalid) against the result.
  • SB: add Schematron to the ploddSchemaSpecification.odd to insist that for every element referenced in schemaSpec/@start, an <elementSpec> with that @ident must exist.
  • SB: change said schema so that <gloss> and <desc> are not allowed as children of <schemaSpec>, <classSpec>, nor <macroSpec>.
  • MH: add a step to inputTestPlodd target which converts the output .rng to .rnc alongside it.
  • MH: set up a document type for PLODDs in Oxygen and associated with our PLODD schema.
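The schemaSpec/@start constraint in SB's action item would be written in Schematron inside ploddSchemaSpecification.odd; the Python below only spells out the logic being checked. It assumes @start is a whitespace-separated list of element identifiers with no prefixes, which may not hold for every ODD, and the function name is ours.

```python
import xml.etree.ElementTree as ET

TEI = "http://www.tei-c.org/ns/1.0"


def missing_start_elements(plodd_xml: str) -> list:
    """List names in schemaSpec/@start that have no matching elementSpec/@ident."""
    root = ET.fromstring(plodd_xml)
    missing = []
    for schema in root.iter(f"{{{TEI}}}schemaSpec"):
        idents = {spec.get("ident")
                  for spec in schema.iter(f"{{{TEI}}}elementSpec")}
        for name in (schema.get("start") or "").split():
            if name not in idents:
                missing.append(name)
    return missing
```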

Meeting 2023-01-04

HBS not available.

  • Discussion of whether our distribution should be simple and not include (for example) all our tests, or whether we should expect users to deal with the entire repo. The consensus is that we should aim towards creating a much simpler distro for end-users with good instructions, without the distractions of our own code. 

  • The distribution could allow for multiple controllers; it might run with XProc or Ant, or the user might write their own driver.

  • Should we distribute Morgana and/or Ant and/or Jing or anything else as part of the package? Should we validate a user's RELAXNG or not? 

  • Should we have a single build file that does everything, or separate out (for instance) the test build file from the main build file? The consensus is to start with a single build file, so MH will add the test targets to the existing build.xml file.

  • During the meeting, SB moved targets from his PLODD-generation build file into the main build file; Action on MH (DONE as far as is possible given that they rely on *NIX bin files) to make these targets cross-platform, to conform with our general principle that we want to support Windows as a matter of course, even though these specific targets are temporary dev work and will eventually be deleted. Consensus is to create/delete a tmp folder inside the atop folder so that results of transformations can be seen in Oxygen and by other processes more easily.

  • These changes are now made and the build basically runs, except that the requirement for the ploddSchemaSpecification.rng file is not yet met; Action on MH (DONE) to build a conditional task into the build file which checks for this file and builds it from its ODD if it's not available.

  • Global decision: when we define a property or variable which contains a path to a folder, we do NOT add a terminating slash; when files inside the folder are addressed elsewhere, the slash is added as necessary.

Meeting 2022-12-21

  • Test suite: XSpec or ant & jing? We agree on both. XSpec tests the output of a transformation; jing will test the validity (and invalidity) of the test files. Ant can run the entire process.

  • Action on MH: adapt an ant file from the Test2 set to run all three types of test below.

  • Three levels of testing:

    • XSpec tests output of transformation;
    • validating the generated RELAXNG; and
    • running validation/invalidation tests on instance files against the generated RELAXNG.
  • Action on SB: add a comment to issue #2381 to the effect that we also would like any block of RELAXNG code included in a TEI <content> element to be wrapped in an <rng:div> element, so the single-child model can be consistent. DONE. (This may require that our pre-PLODD processing add this element to handle legacy ODD files, which should be added to the Things to Remember page.)

  • We will use the .plodd extension for all our PLODD files, so that a Document Type can be created in Oxygen to validate them against our schema.

  • Should we allow <front> and <revisionDesc> in a PLODD?— Yes. Action on SB: Add them to ploddSchemaSpecification.odd. — DONE.

  • During the meeting, we created the first test file, covering points 1 to 5 in the things-to-test list. @sydb will commit the results when they are complete. We put the introductory explanation into the <front> element, and the <schemaSpec> in the <body>, as required by the PLODD schema.

Meeting 2022-12-14

  • Progress reports:
    • SB: <content> now has only 1 child when generated by our PLODD-generation XSLT
    • SB: raised issue https://github.com/TEIC/TEI/issues/2381.
    • MH reported on getting Morgana working with Saxon 10.8, and getting XSpec working (both on all platforms).
  • Action on MH to start digging into the XSpec work and report at the next meeting on what it does now, and how we should develop it out.
  • Review and fleshing out list of PLODD test-suite cases we developed and briefly went over last week.
  • Assignments for building out PLODD test suite.
  • Ant file management. Put PLODD generation into build.xml.
  • Maybe brief discussion of #2380 — any problem as far as ATOP is concerned? My instinct is no, none of that stuff will survive to the end-stage ODD (PLODD) that is transpiled, so we just don’t care. But I am tired and may be missing something.
  • Are we meeting Wed 28 Dec 22? — NO.
  • Looking at MH’s comments on #25 in the list of PLODD features we need the transpile stage to handle, we think it makes sense to ensure that by the time of the PLODD, all Schematron constraints should have fully-realized sch:rule elements with @context attributes, so that Schematron output is less problematic for the transpiler.
  • Action on SB to turn the list of PLODD features for testing the transpiler into a separate wiki page, and point to it from the meeting notes — done by HBS.
  • We discussed the structure of PLODD test suites, and determined that we should have:
    • A folder in Test called PLODD_test_suite
    • A folder inside there for each collection of components that can be conveniently tested with a single PLODD (so, for example, basic_elementSpec)
    • A single PLODD file set up to test those things
    • A folder for Schematron which interrogates the resulting RNG to determine whether it matches expectations
    • A folder for instance documents (instances)
    • Folders called valid and invalid inside the instances folder, each containing one or more instance documents which are intended to be valid or invalid against the generated schema.
  • The test would then consist of:
    • Transpile the PLODD
    • Schematron the output RNG and fail if it fails a test.
    • Validate the valid instance files and fail if any are not valid.
    • Validate the invalid instance files and fail if any are valid.
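The four test steps above can be sketched as a small driver. The real implementation is expected to be an Ant target calling the transpiler, Schematron, and jing; here those are stand-in callables (`transpile`, `schematron_ok`, `is_valid`) passed in by the caller, and all names are hypothetical.

```python
from typing import Callable, Iterable, List


def run_plodd_test(
    plodd: str,
    valid: Iterable[str],
    invalid: Iterable[str],
    transpile: Callable[[str], str],        # PLODD path -> RNG path
    schematron_ok: Callable[[str], bool],   # does the RNG meet expectations?
    is_valid: Callable[[str, str], bool],   # (RNG path, instance path) -> valid?
) -> List[str]:
    """Run the steps for one test folder and return a list of failure messages."""
    failures = []
    rng = transpile(plodd)
    if not schematron_ok(rng):
        failures.append("generated RNG failed its Schematron expectations")
    for doc in valid:
        if not is_valid(rng, doc):
            failures.append(f"valid instance rejected: {doc}")
    for doc in invalid:
        if is_valid(rng, doc):
            failures.append(f"invalid instance accepted: {doc}")
    return failures
```

An empty list means the folder passed; collecting messages rather than stopping at the first problem makes one run report every failing instance.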

Meeting 2022-12-07

  • Any thoughts on the Morgana ticket? https://sourceforge.net/p/morganaxproc-iiise/tickets/120/
  • It looks like this might be caused by the old included stylesheets having version 2.0; Action on MH to test this by using first a single identity transform at version 3.0; if that works, then create a version of the old stylesheets which are all 3.0 and see if that works. Apparently Saxon has stopped supporting 2.0.
  • MDH: Testing with XSpec: external files for both testing inputs and expected results, and the whitespace complications. It is straightforward to use external files or collections using doc(), collection() etc. as inputs to XSpec tests, defining them as variables, and it's also possible to define expected results in the same way. One complication is that when you do that, whitespace is deemed significant. This means that matching expected results against actual results is a bit more awkward. However, for cases where the expected result is one or more components in an existing input file (if you're testing, for example, a complex retrieval process), then it works fine of course, and if you have a canonical expected-results file or collection, then it's less of a problem. I would also like to consider the idea of writing per-file Schematron to check expected-results rather than doing diffing as we currently do. The Schematron can express exactly what we deem significant about a test result and ignore any of the complicating incidental/accidental differences that may occur (generated ids etc.).— Complete agreement on comparing XML constructs rather than diffing serializations; general agreement on using Schematron to do so, as (it turns out) XProc test suite already does it this way.
  • Arising out of discussion of list below: Action on MH (DONE) to add to the wiki page of things we will deal with later: we would like to thoroughly normalize the content of a <content> element at the early stages, so that it only has a single child, removing some of the ambiguity where (for instance) a single sequence element means the same as its children being the direct children of <content>.
  • Action on SB to make generated PLODDs have a single child element, and to raise with Council the issue (perhaps on a ticket) that this should be a general rule for Pure ODD.
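The point about comparing XML constructs rather than diffing serializations can be illustrated outside XSpec with a small tree comparison. This sketch collapses all whitespace in text nodes, which is a coarse choice that would be wrong wherever whitespace is significant; the function names are ours.

```python
import xml.etree.ElementTree as ET


def _norm(text) -> str:
    """Collapse runs of whitespace and trim; None counts as empty."""
    return " ".join((text or "").split())


def same_xml(a: ET.Element, b: ET.Element) -> bool:
    """Compare two element trees, ignoring incidental whitespace differences."""
    if a.tag != b.tag or a.attrib != b.attrib:
        return False
    if _norm(a.text) != _norm(b.text) or _norm(a.tail) != _norm(b.tail):
        return False
    if len(a) != len(b):
        return False
    return all(same_xml(x, y) for x, y in zip(a, b))
```

A Schematron-based check goes further than this, since it can state exactly which properties of the result matter and ignore everything else (generated ids, for instance).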

List of PLODD features we need the transpile stage to handle

Meeting 2022-11-30

  • Catch-up after vacations and absences.

  • Actions from last week:

  • Action on MH to create a branch where the transpileMorgana2 step becomes the transpile step and the other two are removed, then create a pull request. We hereby basically commit to Morgana (although we could reverse this later if we needed, perhaps for end-user interface functionality). DONE.

  • Reaffirm action on MH to post the ticket to the Morgana repo (https://sourceforge.net/p/morganaxproc-iiise/tickets/) on the fact that Morgana III will not work with anything after Saxon 10.5. DONE: https://sourceforge.net/p/morganaxproc-iiise/tickets/120/

  • We are now at a point where we should be able to enumerate all the individual features of a PLODD that we need to test. This will enable us to first specify what we expect from each individual feature in terms of output. So this might take the form of a table in the wiki initially -- a list of descriptions, each with a note -- and then later be converted into an XSpec file that will do the testing.

  • This will also surface a lot of edge cases in ODD each of which may require that we either ask for changes to ODD which make it invalid; or we decide that they need to be handled prior to the PLODD stage; or we have to force clarity from Council along with a clear description of how they should be processed. This list should be created here in the Framapad, and we should all contribute in advance of next week's meeting, with the objective of that meeting being to finalize that list. Action on everyone to do this.

Meeting 2022-11-16

DM was absent.

  • Progress reports
    • SB & HBS had no opportunity to bring up ATOP tickets at TEI meeting (which was jam-packed)
    • Results of Syd’s transpilation testing, iff I get a chance to redo it this afternoon
    • MH: multiple transpilations at once? No chance to work on this yet.
    • HBS: any luck on avoiding -lib Lib/saxon on the ant commandline? Dirty solution exists, but we would prefer to be able to use the XSpec build file unchanged, so we hope that the script solutions may solve this.
    • MH: Morgana ticket? Hasn't had a chance to do it yet.
    • MH: .sh and .bat drivers? No time to work on it yet. Windows VM is ready to use for testing, though.
  • Can we start on attDef melding?
  • Testing the transpile step, we found an interesting issue: An attDef with @ident="prev " (with trailing space) is valid (correctly), but generates an error because it's never normalized. We should normalize-space() this (and other similar key-like attributes which are supposed to be QNames) at an earlier stage in the processing, to avoid having to call normalize-space() all over the place in our transpilation functions and keys. Action on MH to figure out how/where to record decisions like this, which we can't yet act on but mustn't forget. Wiki page for "Notes on future work", organized by processing step.
  • We found an instance in the vmachine.odd file where an attDef was duplicated inside an attList; this caused a transpile failure. The current Stylesheets are robust against this, possibly by accident, but we believe it should be made invalid by the addition of a Schematron rule to TEI which says that you can't have two attDefs with the same @ident in the same attList. Action on MH to raise the TEI issue for this. (Although there is a counterargument that says user may wish to modify <desc> here and <content> there.) Action on SB: Comment on ticket.
  • Action on SB: Change test PLODD generation to normalize-space( @ident ), and to validate PLODD.
  • Action on SB: Validate RNG result of transpilation, send results to group.
  • Action on HBS to validate all the in-vivo ODDs, and fix any that are not valid against tei_odds or tei_all.
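Two of the points above, normalizing key-like attributes early and detecting duplicated attDefs in one attList, can be sketched together. In ATOP these would be XSLT and Schematron respectively; the Python below only shows the logic, with `normalize_space` imitating XPath's normalize-space() and `duplicate_att_idents` being a hypothetical helper name.

```python
import re
from collections import Counter
import xml.etree.ElementTree as ET

TEI = "http://www.tei-c.org/ns/1.0"


def normalize_space(value: str) -> str:
    """Imitate XPath normalize-space(): collapse XML whitespace runs and trim."""
    return re.sub(r"[ \t\r\n]+", " ", value).strip(" ")


def duplicate_att_idents(att_list: ET.Element) -> list:
    """Return normalized @ident values used by more than one child attDef."""
    counts = Counter(
        normalize_space(att_def.get("ident", ""))
        for att_def in att_list.findall(f"{{{TEI}}}attDef")
    )
    return sorted(ident for ident, n in counts.items() if n > 1)
```

Note that without the normalization step, "prev " and "prev" would be counted as different attributes and the duplicate would go unnoticed.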

Meeting 2022-11-09

DM was absent.

  • We talked about the problems of classpaths when we're running various Java tasks (subant, Morgana and so on) from ant within an already instantiated JVM. MH feels that ultimately the only real solution is to provide .sh and .bat files which set up all the classpaths we need and pass them to the main Ant build file, from which (we think) they can be successfully inherited when running e.g. XSpec or Morgana. MH will try to test this.
  • We discussed the issue of expansion of overridden content models and particularly attDef, and our collective sense is that by the time we get to the PLODD we should have a completely expanded version of any overridden attribute, meaning that at transpile time, all we need to do is either point to a pattern or have a complete fresh attDef we can process. Having thought through the case of an ODD chain in which an attribute is partially overridden at one stage, but then a later ODD changes something in the original class definition which was not part of the override, we want the overridden attDef to inherit the change to the class. Therefore we can only do this expansion ("melding"?) at either prune-and-localize time or at transpile time. MH and HBS believe that since pruning and localizing is designed to produce an ODD which is the simplest possible input to the transpile process (separation of concerns), we should do this work at the prune/localize stage. SB is not so convinced. He believes we should write this process as a separate module, which we could include at either stage: at the end of pruning or the beginning of transpiling. Modularizing it will make testing easier too. Terminology: we could call a melded PLODD an MPLODD, so "imploding" would be the verb for generating a PLODD.
  • We looked at two of the atop-labelled tickets on the TEI repo, in preparation for HBS and SB to discuss them at the TEI Council meeting tomorrow.
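The reason melding has to happen late can be shown with a deliberately tiny model in which an attDef is just a dictionary keyed by child name. The real thing would operate on TEI elements and would need rules for each child; `meld_att_def` and the sample keys are purely illustrative.

```python
def meld_att_def(class_def: dict, override: dict) -> dict:
    """Return a complete attDef: class definition with the override applied on top."""
    melded = dict(class_def)   # start from everything the class defines
    melded.update(override)    # overridden parts win
    return melded


# The element overrides only the datatype.
class_type = {"ident": "type", "desc": "characterizes the element",
              "datatype": "teidata.enumerated"}
override = {"datatype": "teidata.word"}

# A later ODD in the chain changes the class's desc, which was not overridden.
class_type["desc"] = "revised description"
melded = meld_att_def(class_type, override)
```

Because the meld runs after the whole chain has been applied, `melded` carries the revised desc along with the element's own datatype. Had the attDef been expanded at the first override, the later change to the class would have been lost.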

Meeting 2022-11-02

Only MH, HBS and DM were available. SB left reports on his work in the agenda.

  • SB: Loosened PLODD schema from ( gloss?, desc? ) => ( gloss? & desc? ).
  • SB: Looked at changes to <altIdent>… they should be something that requires a minor change in our schema but we (TEI Council, primarily me) failed to get the <altIdent> changes into release 4.5.0, so it is not an issue for another ~6 months.
  • SB: MH, it would be very useful to set up ant build so that we could run multiple PLODDs through transpilation at one time (preferably in parallel), no? (I realize users probably don't need that feature, but would make our work faster. :-) Now we have figured out the next issue, Action on MH to do that.
  • MH could not get the Morgana task to run via an Ant java task. We tried various solutions in the meeting, and it turned out that Morgana fails with Saxon 10.8, and works with 10.5. Action on MH to raise a ticket in SF for this. Another alternative is to switch in Ant based on the OS and run the .bat file when on Windows, which might work much better. Action on MH to try this.
  • We looked at the issue of classSpec/@generate and classRef/@expand, and we still can't figure out a use-case for @generate that makes sense. The old stylesheets output it into documentation and into the processing model, and reference it when doing oddtodtd, but other than that it does not seem to be used. One test we should run is to create instances of ODD where @generate apparently disallows the value that is used in @expand, and where it doesn't disallow it, and see if the current stylesheets do anything different. Action on DM to do that.
  • We worked on running the xspec tests through Ant, and discovered that we do not seem to be able to pass a classpath to the subant task; however, if you run ant -lib Lib/saxon test.xspec, it works because the classpath specified on the command line is inherited by the subant. It seems to be the case that the java task and the xslt task both create new forked JVMs, which means that classpaths can be passed to them, but the subant task does not, so if we want to avoid specifying that -lib on the command line we'll have to find another approach to doing this. Action on HBS to investigate this.

Meeting 2022-10-26

Only MH and DM were available.

  • (MH) Saxon 11 required an xmlresolver jar which was not in our repo; it has to be in a lib subfolder of the folder containing the Saxon jar. I've set that up. I wonder if Saxon should be in its own subfolder like Morgana is, so we can include all the license info etc. DM agrees: Action on MH to do this. (Done 2022-11-01.)
  • (MH) The transpile target in the build.xml file errored out with this: [xslt] ERROR: Cannot expand members of the class 'class-model' as 'sequenceRepeatable' because the class only allows 'sequenceOptionalRepeatable' MH fixed it and it seems to be working now. However, I did that by changing the @expand in the classRef to match the @generate in the classSpec; prior to this, the classSpec/@generate was "sequenceOptionalRepeatable" and the classRef/@expand was "sequenceRepeatable". It's not clear to us from the tagdocs documentation whether it was an error to have a classRef/@expand that did not match the classSpec/@generate, in which case there should be a Schematron rule to prevent it, or whether it wasn't actually wrong at all, and the transpilation should not have failed. We discussed this and we believe that the former is the case. Action on MH to raise a ticket on the TEI repo to ask for a) confirmation of our interpretation that a classRef which references a classSpec which has @generate must have @expand that has a value which is one of the values appearing in the @generate, and b) if we are correct, ask for a Schematron rule to catch such cases. (Done 2022-11-01.) Note that this can't catch cases where the classRef is in a customization ODD and the classSpec is in a source ODD, but it would still be useful.
  • MH has tweaked the build.xml file so that it can now run the transpiler either directly in Saxon (ant transpile) or through Morgana (ant transpileMorgana). By default, it will process the transpile.odd file and put the result in output/transpile.rng, then validate it there, but if you provide a path to a different ODD file, it will use that instead (ant transpile -DinputTestOdd=Tests/resources/in_vivo_ODDs/dracor.odd). In the process, MH noticed a discrepancy in the naming of two folders (...ODDS vs ...ODDs), so he renamed the first to match the second. DM provided this target which shows how to run Morgana using the java lib, which should allow us to avoid trying to figure out the platform and switch between the .sh and .bat files. Action on MH to test this and see if it actually does work on Windows.
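    <!-- Example target from DM's own project; the jar, config, and pipeline paths are specific to that project. -->
    <target name="entities" description="Creates the file 'daten/entities.xml' with all recognized entities">
      <!-- Runs the Morgana jar in a forked JVM (Ant requires fork="true" when the jar attribute is used). -->
      <java failonerror="true" fork="true" jar="bin/morgana/MorganaXProc-IIIse.jar" dir="bin/morgana">
        <jvmarg value="-javaagent:MorganaXProc-IIIse_lib/quasar-core-0.7.9.jar"/>
        <arg value="-config=config.xml"/>
        <arg value="../../../src/entities/entities.xpl"/>
      </java>
    </target>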
  • DM has worked on the transpiler and we can now generate valid RNG from ODDs using TEI modules. We have a problem processing attLists; especially getting rid of duplicate definitions for inherited attributes, and attributes with the same name but different datatypes in the same attList. For example a type attribute specified in an elementSpec which [partially] overrides a definition from a classSpec of which the element is a member. We first need terminology to explain to ourselves what overriding actually is, both in the abstract (in a customization ODD or a source ODD), when it behaves hierarchically like object-based inheritance, and at our PLODD level, when we may find it much more convenient to make every definition of an attribute complete wherever it appears. So for example when we have attDef[@mode="change"] in an elementSpec and the attList is replaced but the gloss and desc remain unchanged at the earlier stage, by the time we get to the PLODD, we might simply have imported the gloss and desc explicitly into the attDef as though they had also been overridden. We could call this process (for example) "attribute expansion". We might also create at the element level a complete attList which is constructed of all the inherited, native, and overridden attributes. This would be easier than trying to use patterns because determining when a pattern can be used is quite complicated. It would result in a RELAX NG file that is larger; however, a subsequent pass through the RELAX NG could detect identical patterns and abstract them. This uncouples the TEI class structure from RELAX NG patterns, which is good because they're really not the same thing at all. We should discuss this next week. Action on HBS and SB to think about this before the next meeting.
  • Build problem with ant test.xspec: on MH's Ubuntu 22.04 with Java 17, this fails because the Saxon class is not found, but it works for DM on Gentoo with Java 17, because DM has Saxon in his global classpath, whereas MH doesn't. If Saxon is removed from the classpath, both the initial transform creating the xspec-runner.xml file and the subsequent xspec process fail. Action on MH to follow up on this and figure out the problem.
  • We need to discuss our policy on giving feedback to users with problematic ODDs. Everything we can delegate to TEI itself (RELAX NG or Schematron) we should delegate, but there are some things that you can only know when you do the transformation, such as misnamed patterns in RELAX NG code in a content model, or pointers to local files which should be URLs. Our transformation should fail with useful, explicit errors wherever possible. For our in-vivo ODDs, we should go back to the original authors to get them fixed if possible, and if that's not possible, we should stop using them. We don't have an obligation to successfully process a broken ODD, even if the old stylesheets might handle it.
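The interpretation of classRef/@expand versus classSpec/@generate that we settled on above (pending confirmation on the TEI ticket) amounts to a simple membership test. The sketch below assumes @generate is a whitespace-separated list and that a classSpec without @generate imposes no restriction; both are our reading, not confirmed behaviour, and the function name is hypothetical.

```python
from typing import Optional


def expand_allowed(expand: str, generate: Optional[str]) -> bool:
    """True if a classRef/@expand value is permitted by classSpec/@generate."""
    if not generate:
        return True  # assumption: no @generate means no restriction
    return expand in generate.split()
```

For the case that triggered the discussion, an @expand of sequenceRepeatable against a @generate of sequenceOptionalRepeatable yields False, matching the transpiler error. As noted above, such a check cannot see a classSpec that lives in a different source ODD.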

Meeting 2022-10-19

  • SB has managed to get the Morgana-run XProc working at the command line, but had some difficulty until he used the Morgana and Saxon versions DM put into our repo. MH has been trying to get it running as a transformation scenario in Oxygen, but gets errors suggesting perhaps a class collision between in-scope Oxygen java classes and something in Morgana or Saxon 11.
  • On checking errors from the XProc output: MH added removal of <paramList> from schema during pruning process; this structure is part of the processing model stuff, and has no bearing on schema generation.
  • One of our test ODDs has an empty <content/> element, but this is not allowed in current TEI. Should ATOP have to fix this by inserting an <empty/> element, or should we assume we don't have to handle old versions of ODD?
  • The pruning process should remove any <constraintSpec> whose constraint element reduces to an empty string; this is valid TEI but not actionable at the transpile stage, so removing it makes life simpler. Action on MDH to do this.
  • Action on HBS to fix the invalidity in vmachine.odd. It has a duplicated <valList>. Also there are other invalidities in other test ODDs, which HBS will fix.
  • The PLODD schema should not insist on the order of gloss and desc in <*Spec> and <valItem> elements, because the TEI schema does not. We do want to expect a maximum of one <gloss> and/or one <desc>, because by the PLODD stage we should be working with only one language. SB is acting on this. Action on SB to check what fallout the <altIdent> changes in TEI 4.5 will cause.
  • Do we need a process for removing unreferenced classes in the earlier stages of the process so that we don't get unreferenced classes in the PLODD? This would have to be a recursive process.
  • From DM: Command line to run Morgana XProc transpile step:
   Lib/morgana/Morgana.sh Util/pipeline.xpl -option:teiOddSpecification=Tests/resources/in_vivo_ODDs/msdesc.odd -output:result=odd.rng
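On the question above about removing unreferenced classes recursively: one way to think about it is as a reachability computation rather than repeated deletion. The sketch assumes we already know which classes are used directly (by element membership or a classRef) and which classes each class is itself a member of; the data shapes and the function name are hypothetical.

```python
def referenced_classes(member_of: dict, used: set) -> set:
    """Return every class reachable from the directly used ones via membership.

    member_of maps a class ident to the set of classes it is a member of.
    Any class not in the result is unreferenced and can be dropped.
    """
    keep = set()
    todo = list(used)
    while todo:
        cls = todo.pop()
        if cls in keep:
            continue
        keep.add(cls)
        # A used class keeps its superclasses alive too.
        todo.extend(member_of.get(cls, ()))
    return keep
```

Computing the set to keep in one pass avoids the recursion of "delete a class, then re-check whether its superclass has become unreferenced".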

Meeting 2022-10-12

<anyElement> and its @require and @except

General agreement that DM’s suggested approach seems like the way to go. We note that this process will function correctly whether TEI decides to enforce @require and @except as mutually exclusive, or to allow both with “can’t do stupid things” caveats.

Unit Testing

MH reports no significant progress worth reporting.

PLODDs for Testing

SB reports that he generated PLODDs from our in vivo ODDs and ran them through transpile.xslt, with 22 errors. DM points out they should be run through Util/transpile.xpl, not XSLT/transpile.xslt directly. SB to re-run and report on remaining errors. (Both SB & DM suspect there will be far fewer, if any, errors.)

TODOs

  • DM: Put MorganaXProc… into repo. — DONE
  • DM: Exemplify ant target for running XProc (presumably using MorganaXProc) from an ant build file.
  • SB: Hand-simulate DM’s approach and post any problems noticed to the atop-internal channel.
  • DM: Implement approach.
  • MH: Prepare to catch us up on how testing framework is going next week.
  • SB: Re-test in vivo PLODDs using proper XProc, not directly running XSLT.
  • SB or DM: Add 1st step to transpile.xpl to validate PLODD against what is now called ploddSchemaSpecification.odd (i.e., against ploddSchemaSpecification.rng and ploddSchemaSpecification.isosch).

Meeting 2022-10-05

  • Only DM and MH available.
  • Began looking at the processing of @defaultExceptions, and discovered some errors in the O'Reilly RELAX NG book; it's quite difficult to figure out how @require and @except on <anyElement> should be handled, especially since it's possible to create irrational combinations using these two attributes (require a namespace, then exclude all elements from that namespace, for example).
  • We should suggest a Schematron rule for TEI which says that the same namespace should not appear in both @require and @except, unless it's acting as a prefix for an element name in @except. In other words, you can't simultaneously require a namespace and then exclude it. TODO: MH to create a ticket for that.
  • In order to construct a self-contained <schemaSpec>, we need to be able to put imported blocks of RELAX NG code somewhere. The only natural option really is in <xenoData>, but since <xenoData> is not allowed inside <schemaSpec>, our notion of a self-contained schema would have to be expanded to a complete TEI ODD file. Is that reasonable, or should we ask that <xenoData> be allowed in <schemaSpec>? We'll discuss this with HBS and SB next week.
  • We also discussed the case of multiple <schemaSpec>s which refer to each other in the same file; we believe that at the assembly stage, the target should be to assemble one or other of these, completing it by importing anything from the other <schemaSpec> which is required/pointed at; then, from the Assembled ODD stage onwards, we are only dealing with a single <schemaSpec>.
  • TODO: DM to rework the <anyElement> handling in transpile.xslt to see if he can work in handling of @require, and we can then write some tests for that.
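The Schematron rule suggested above (the same namespace should not appear in both @require and @except, unless it only acts as a prefix for an element name in @except) can be illustrated with a small check. This is a Python sketch of the logic only, not Schematron. It assumes both attributes are whitespace-separated token lists, and that any token shaped like `prefix:local` is a prefixed element name while everything else is a namespace URI; that heuristic would misread URIs such as `urn:example`. The function name is hypothetical.

```python
import re
from typing import Set

# Assumption: a token shaped like prefix:local is a prefixed element name.
# "http://..." does not match because "/" cannot start a local name.
_PREFIXED_NAME = re.compile(r"^[A-Za-z_][\w.\-]*:[A-Za-z_][\w.\-]*$")


def conflicting_namespaces(require: str, except_: str) -> Set[str]:
    """Return namespaces that are both required and excluded wholesale."""
    required = set(require.split())
    excluded_namespaces = {
        token for token in except_.split() if not _PREFIXED_NAME.match(token)
    }
    return required & excluded_namespaces


TEI = "http://www.tei-c.org/ns/1.0"

# Requiring a namespace and excluding the whole namespace is irrational.
print(conflicting_namespaces(TEI, TEI))
# Requiring a namespace and excluding one prefixed element name is fine.
print(conflicting_namespaces(TEI, "tei:egXML"))
```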

Meeting 2022-09-28

  • Slides for TEI2022 will be added by SB to the Documentation folder, together with the logo. HBS will add the one from Balisage Open Mic.
  • It looks like the simplification process will get us most of the way towards being able to compare two schemas for equivalence, but there are some cases, particularly in content models, where equivalent structures do not emerge identically. It's possible we could extend the canonical simplification process described in the RELAX NG specification to deal with some or most of these cases, and this would get us closer to being able to write tests which are fundamentally semantic rather than just text diffs.
  • TODO: SB will check in presentations and logo.
  • TODO: DM and SB will collaborate on DM's implementation of the schema simplification process, which he is implementing as a multi-step process. We may extend this after the spec's step 22, to add further conversions and possibly canonicalization of XML or conversion to compact syntax, in order to get us closer to comparability between two schemas.
  • TODO: MH will continue to add documentation and tests to DM's transpile.xslt and associated files.
  • TODO: DM will make a proposal on how assembling works for schemaSpecs with a single source schema and a single customization. This assembly process will give us more options for testing and debugging before the application of the customization to create a PLODD (deriving the PLODD). New vocabulary: a derivation statement is a spec element in the customization with a @mode attribute which proposes a change to the base schema.
  • TODO: HBS update terminology and look for naming conventions and create a centralised document with it.

Meeting 2022-09-21

  • Following discussion, SB merged DM's transpiler branch into dev. The transpiler work will now continue in dev, with temporary branches created for experimental work.
  • We started figuring out for ourselves how the $x:result map works, and various ways to query it. At least MH seems to understand it. We also provided evidence that the following are equivalent:
  • We discussed various approaches to storing XSpec contexts and variables externally to the tests, for clarity and convenience.
  • We discussed canonical identity of RNG schemas: whether it's possible to canonicalize two schemas such that they are provably equivalent, even in the case where they may have different pattern names/prefixes. We will ask on the xml.com Slack to see if anyone else has approached this problem. SB will test out this simplification routine: https://github.com/maxtoroq/rng.xsl/blob/master/src/rng-simplify.xsl

Meeting 2022-08-24

Maps vs keys: to be decided on a case-by-case basis.

DM's initial transpiler code uses keys, and has a recursive function to discover class membership; action on MH to rewrite this in a branch as an xsl:map, so we can compare the two approaches.
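For comparison purposes, the map-based approach to class membership might look something like the sketch below. It is written in Python rather than XSLT, with a dictionary standing in for an xsl:map. The class idents and the shape of the table are invented for illustration; the actual transpiler code may be organized quite differently.

```python
from typing import Dict, List, Optional, Set

# Hypothetical table: each class ident mapped to the classes it directly
# belongs to (what a classes/memberOf lookup would yield).
DIRECT_MEMBERSHIPS: Dict[str, List[str]] = {
    "class.a": ["class.b", "class.c"],
    "class.b": ["class.d"],
    "class.c": [],
    "class.d": [],
    # A deliberate cycle, to show that the recursion still terminates.
    "class.x": ["class.y"],
    "class.y": ["class.x"],
}


def all_memberships(
    ident: str,
    table: Dict[str, List[str]],
    seen: Optional[Set[str]] = None,
) -> Set[str]:
    """Return every class reachable from ident, directly or indirectly."""
    seen = set() if seen is None else seen
    for parent in table.get(ident, []):
        if parent not in seen:
            seen.add(parent)
            all_memberships(parent, table, seen)
    return seen


print(sorted(all_memberships("class.a", DIRECT_MEMBERSHIPS)))
```

Building the table once and recursing over it is the map analogue of the key-based lookup; which of the two reads better in XSLT is exactly what the comparison branch is meant to show.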

DM points out that according to the TEI Guidelines for schemaSpec, the @prefix is only applied to elementSpecs; SB believes that attributes also get prefixes with the current stylesheets, which presumably shouldn't happen according to the Guidelines; and DM discovered that if prefixes were applied to other constructs such as classes, then the DHQ ODD (for example) would fail. Action on WHO? to look into this, figure out what the Guidelines mean/should mean, what the stylesheets do, and whether we think the documentation and/or the processing should change. Instinct says that any prefix defined should be used on all constructs. The terms "all TEI elements" and "pattern" seem ambiguous in the current Guidelines.

MH did some test documentation of the xsl:keys DM created, and we determined that a) xd:ref with only @name and no content will be processed identically by the Oxygen documentation generator to xd:ref with the name repeated in the content, so there's no need to provide content for xd:ref, and b) we agree that a single block of documentation for a set of constructs which xd:refs them all and provides a single explanation is an effective approach where it's suitable (as in this case).

Meeting 2022-08-17

  • DM’s transpiler: https://github.com/TEIC/atop/tree/dmaus/transpile-draft

    • XSLT/transpile.xslt is the transpiler that creates a valid RelaxNG from Tests/resources/in_vitro_ODDS/transpile.odd
    • Util/transpile.xpl is an XProc 3.0 pipeline that transpiles transpile.odd and checks the result against the RelaxNG grammar Schemas/relaxng.rnc
    • XSLT/modules/functions_module.xslt got more functions and some indexes
    • atop:repeat-content is moved from content_module.xslt to functions_module.xslt because there are minOccurs and maxOccurs outside of content, too
    • Issues due to the underspecification of ODD (e.g. ambiguities that need to be discussed).
  • Decision: @validUntil and <desc type="deprecationInfo"> should make it through to the PLODD, and the latter should be prepended to the <a:documentation> element in the RELAXNG schema.

  • Worked together to produce an outline of the TEI presentation, which SB will now turn into a Google presentation, where we can all contribute.

Meeting 2022-08-10

We would like to be generating some RELAX NG before the September conference. DM has already made a start on the <content> element. We briefly discussed the possibility of issues arising out of mixed RELAX NG and Pure ODD content, but having checked, it seems that we only have to deal with Pure ODD or RELAX NG, not both together.

Discussion about modularization. DM would like to start with a single transpile.xslt file, and then when things are a little developed, start modularizing. Division by grouped elements seems the most practical approach: attDef + attList + attRef in one file, for example, or all the *Ref elements in one file. DM will check in his transpilation work in the next week or so, and then we can all work on expanding comments and possible modularization. The master file transpile.xslt will go in the XSLT folder directly; any modules will be split out into modules/transpile/*_module.xslt.

Action on DM: push the (draft) transpiler he has already developed to atop repository

Discussion about Schematron, especially query language binding: we believe there should be a feature request for a TEI mechanism (an att on schemaSpec?) to specify this.

Action on SB: revise Tests/resources/in_vitro_ODDS/vitro_content.odd, including:

  • @minOccurs / @maxOccurs mis-match errors (“when maxOccurs is not specified minOccurs must be 0 or 1”; also maybe "unbounded" is not being treated properly)
  • only 1 thing in <sequence>

Agenda item for next meeting: TEI Conference presentation

Meeting 2022-07-27

Present: SB (briefly), MH, HBS, & DM. MH volunteers to do notes→minutes conversion this week.

  • DM presented on using XSpec to test error states and error codes (https://gist.github.com/dmj/703daeda74340ba816f6a858c4132996#file-example-xsl). We identified one function in our current small set which is amenable to testing in this way, because it has a terminate="yes" message; ACTION ON MDH to follow DM's example code to implement an XSpec test for this function, both as a proof of concept/example in our codebase and because it may well be useful.

  • SB: "I think the derivedSchemaSpecification.odd is pretty much ready for usage. I am sure we will find things we want to improve as we go along, but I think it will do for a starting point, now. Others should chime in." The group have these questions for SB:

    • Why is all the non-schema-specific stuff (biblStruct, postCode, mentioned) in the PLODD schema? Would we not be pruning all of that out at an earlier stage?
    • In this bit: <sch:let name="elementSpecs" value="for $e in ./tei:elementSpec return concat('{', ($e/ancestor-or-self::*/@ns)[last()], '}', $e/@ident )"/> We think the @ident is used to create the pattern name and the <altIdent> to create the element name, so the namespace is not relevant here; RELAX NG doesn't allow namespace distinctions between patterns. Important conclusion after much discussion: We believe that @ident should be unique for each type of *Spec, so we should not need to include the namespace in this set of values. We don't know what we think about attributes at this point; there will be many attDefs with the same @ident (presumably), so any enumeration of attDefs will have to include further contextualizing information, but namespace would not be the important thing here. DM points to TEI issue #2282, which suggests that it is only possible to refer to an attribute when it is a member of a class, so attRefs can always be resolved based on class, and we could have a Schematron rule that requires both @class and @name on any attRef (cf https://github.com/TEIC/atop/issues/8).
    • Why not delete @rend, @rendition and @style from their global classes rather than from individual elements?
    • HBS points out that all classRef elements should probably have been replaced by their imported classes by the time we get to PLODD, so classRef does not need to be in the PLODD schema. But we also note that classRef can be used in e.g. <content>, and the same class may be used in multiple places, so presumably classRef is required to reference the same classSpec from multiple contexts within the PLODD.

Meeting 2022-07-20

Present: SB, MH, & DM. SB volunteers to do notes→minutes conversion this week.

  • We confirmed that (despite SB’s instinct otherwise) the rules for parameter names apply not only to internal but also to external params users will use.
  • We decided (or confirmed, if we had already decided) that the atop: prefix should be used for global variables and global parameters. This means a global parameter looks like atop:pThing and a global variable looks like atop:vOther_Thing. We will revisit this if users object, but otherwise …
  • Syd asserted that as Chair he expects a certain level of workload (like chairing meetings, being liaison to Council & the world, writing reports, requesting funding if we ever do that, etc.), but simultaneously intends to do things (and thinks we all should do things) in a completely transparent manner. That means that if Syd gets hit by a bus, one of the other Task Force members should be able to pick up with little or no difficulty. But that means having some shared space for notes (e.g., recent notes on how to generate pseudo-PLODDs from our collection of customization ODDs). We agreed to create a root-level Notes/ directory in the GitHub repo for such things.
  • XSpec — David plans to present an XSpec-for-atop tutorial (including testing cases that are intended to fail) next week.
  • We agreed to treat classSpec/constraintSpec/constraint/sch:assert (or sch:report) as if it had the same constraints as macroSpec/constraintSpec/constraint/sch:assert (or sch:report) — i.e., as if the TEI schema issues a warning (because the sch:rule/@context is missing). That is we will not feel compelled to do anything intelligent if it occurs, but won’t crash. We could even abort with an error, but no crashing.
  • On schema for PLODDs (Schemas/derivedSchemaSpecification.odd):
    • Any given element should never have > 1 <altIdent> child.
    • For every element type E and @ident attribute value I there should be only one occurrence of E with an @ident of I. (There might be more than 1 @ident of I, e.g. one could imagine both a <gender> and a @gender, thus an <elementSpec ident="gender"> and an <attDef ident="gender">.)
  • On routine to convert an old Stylesheet “compiled ODD” into a pseudo-PLODD:
    • [Current name is XSLT/pare_down_compiled_to_PLODD.xslt; SB plans to rename to XSLT/prune_compiled_to_PLODD.xslt]
    • Action on SB: Update names to match correct ATOP conventions.
    • We decided that <desc> elements with a @type of "deprecationInfo" should be dropped from the PLODD.
  • On <altIdent>:
    • We convinced ourselves that an @ident is used to create a pattern name, and an <altIdent> to create the element name in the output schema, so when we wish to create (for example) two versions of a <p> element for use in different contexts, we create two <elementSpec>s with different @idents but both having <altIdent>p</altIdent>.
    • Any given <*Spec> in a PLODD may have at most one <altIdent>. Multiple <altIdent>s at earlier stages are permitted for the purposes of differentiating element names between language versions, but this should have been processed out by the time we get to the PLODD, which is dealing with only one language.
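The conclusion above about @ident and <altIdent> (the @ident supplies the pattern name, the <altIdent> the element name) can be made concrete with a toy generator of RELAX NG compact syntax definitions. This is an illustrative Python sketch only: the `ElementSpec` tuple, the already-transpiled content strings, and the omission of any schema prefix are all simplifying assumptions.

```python
from typing import List, NamedTuple, Optional


class ElementSpec(NamedTuple):
    """Hypothetical, flattened view of an <elementSpec> in a PLODD."""
    ident: str                # value of @ident: becomes the pattern name
    alt_ident: Optional[str]  # content of <altIdent>, if any: the element name
    content: str              # content model, assumed already in RNC form


def rnc_definition(spec: ElementSpec) -> str:
    """Build one RNC definition: pattern from @ident, element name from altIdent."""
    element_name = spec.alt_ident or spec.ident
    return f"{spec.ident} = element {element_name} {{ {spec.content} }}"


# Two versions of <p> for different contexts, plus an element with no altIdent.
specs: List[ElementSpec] = [
    ElementSpec("metaP", "p", "text"),
    ElementSpec("textP", "p", "text | name | emph"),
    ElementSpec("emph", None, "text"),
]

definitions = [rnc_definition(s) for s in specs]
for line in definitions:
    print(line)
```

Both `metaP` and `textP` yield an element named `p`, but under distinct pattern names, which is what allows the two definitions to coexist in one schema.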

Meeting 2022-07-13

Still need to discuss <moduleRef> of a foreign schema some more. If there is a <moduleRef url="http://www.example.edu/FML.rng"/> in the input, what ends up in the PLODD? (FML stands for Fun Markup Language :-) Worse, what if there is a

    <moduleRef url="http://www.example.edu/FML.rng">
      <content>
        <rng:define name="tei_model.pLike" combine="choice">
          <rng:ref name="fun"/>
        </rng:define>
      </content>
    </moduleRef>

What does that even mean?

After some investigation, we think the following:

  • <moduleRef> with @url should be resolved prior to PLODD.

  • The pointed-at content from the external schema should be retrieved and placed alongside any RNG code which is inside the moduleRef/content.

  • The @url attribute URI should be replaced with a period, to signify that the content is here and now, not external.
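The three steps above could be sketched as follows. This is a hedged Python illustration, not the planned XSLT. It assumes the external schema has already been retrieved and parsed by the caller, and that "placed alongside" means appending the retrieved root element as a sibling of the existing RNG code inside <content>. The function name and sample schema are hypothetical.

```python
import xml.etree.ElementTree as ET

TEI = "http://www.tei-c.org/ns/1.0"
RNG = "http://relaxng.org/ns/structure/1.0"


def inline_external_module(module_ref: ET.Element, external_schema: ET.Element) -> None:
    """Place an already-retrieved external schema inside moduleRef/content.

    The @url value is then replaced with "." to signal that the content
    is now local rather than external.
    """
    content = module_ref.find(f"{{{TEI}}}content")
    if content is None:
        # No local RNG code was supplied, so create the container.
        content = ET.SubElement(module_ref, f"{{{TEI}}}content")
    content.append(external_schema)
    module_ref.set("url", ".")


MODULE_REF = f"""
<moduleRef xmlns="{TEI}" xmlns:rng="{RNG}" url="http://www.example.edu/FML.rng">
  <content>
    <rng:define name="tei_model.pLike" combine="choice">
      <rng:ref name="fun"/>
    </rng:define>
  </content>
</moduleRef>
"""

# Stand-in for the retrieved FML.rng; the real file's content is unknown.
EXTERNAL = f"""
<grammar xmlns="{RNG}">
  <define name="fun"><element name="fun"><text/></element></define>
</grammar>
"""

module_ref = ET.fromstring(MODULE_REF)
inline_external_module(module_ref, ET.fromstring(EXTERNAL))
content_children = [child.tag for child in module_ref.find(f"{{{TEI}}}content")]
print(module_ref.get("url"), content_children)
```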

A parallel issue is the question of <dataRef>. This can point to an XSD datatype using @name, which is fine, but it can also in theory point to an externally-defined datatype using @ref (not @url). The current stylesheets do not handle this, and we don't intend to either (for the moment) because there is no agreed-upon way of defining an external datatype library.

From 2022-06-29 SB was supposed to check into <constraintSpec> inside <classSpec>. A quick test says 1) yes, any <constraint> inside a <classSpec> should have an <sch:rule>, else the context is "*" (in the .isosch) or "tei:" (i.e., an error, in the RNG); and 2) there is no constraint that says that (there is one for <macroSpec>). (Note that when you generate the .isosch file you get the message "INFO: constraint for XXXX class does not have a context=. Resulting rule is applied to *all* elements.".)

Meeting 2022-07-06

Questions arising from SB and MH trying to create schemas for PLODDs.

  • One and only one <valDesc>? (With an @xml:lang). Decision: no <valDesc> allowed in PLODDs.
  • Always <gloss> first, then <desc>. Decision: Yes, that order, zero or one of each
  • @xml:lang values of "" and "und". Decision: going with only "und" in a PLODD (because they are essentially identical, see https://www.w3.org/International/questions/qa-no-language). So the PLODDing process will replace empty @xml:lang with "und". And the PLODD schemas should not permit xml:lang="".
  • Because it's reasonable to have multiple <elementSpec>s etc. with different <altIdent>s, our previous decision that there should be only one of each with the same @ident is presumably wrong. What we think is the case now is that there can only be one <*Spec> with the same @ident without any <altIdent>; every other <*Spec> with the same @ident must have a distinct <altIdent>. This can only be enforced with Schematron.
  • Our final discussion centred on the case of <moduleRef> having a <content> element with some RELAX NG, and also @url pointing to an external RNG schema. When you do this, are you effectively including the entirety of that target schema in the content element and then modifying/drawing on it to build the definition that's in your <content> element? We think this is essentially what this means, so it might make sense for pre-PLODD processing to import that schema and just dump it in place. [Aside: what if the @url points to something which is not a RELAX NG schema? — Then it is invalid: “@url refers to a non-TEI module of RELAX NG code by external location”].
    # A tiny schema that has different definitions of <p>, one for
    # metadata (in the <header>) and one for content (in the <stuff>).
    start = element test {
      header, stuff
    }
    header = element header { metaP+ }
    stuff = element stuff { textP+ }
    metaP = element p { text }
    textP = element p { text | name | emph }
    name = element name { text }
    emph = element emph { text }
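The revised uniqueness rule from this meeting (per @ident, at most one <*Spec> without an <altIdent>, and every other one with a distinct <altIdent>) would be enforced with Schematron in practice. The Python sketch below only illustrates the logic. The tuple representation of a spec, the grouping by spec type, and the function name are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Tuple

# Hypothetical flattened spec: (spec type, @ident, altIdent content or None).
Spec = Tuple[str, str, Optional[str]]


def alt_ident_violations(specs: Iterable[Spec]) -> List[str]:
    """Report groups of same-type, same-@ident specs that break the rule."""
    groups: Dict[Tuple[str, str], List[Optional[str]]] = defaultdict(list)
    for spec_type, ident, alt_ident in specs:
        groups[(spec_type, ident)].append(alt_ident)

    problems: List[str] = []
    for spec_type, ident in sorted(groups):
        alt_idents = groups[(spec_type, ident)]
        # At most one spec in the group may lack an altIdent.
        if alt_idents.count(None) > 1:
            problems.append(f"{spec_type} '{ident}': more than one without altIdent")
        # Every altIdent that is present must be distinct within the group.
        named = [a for a in alt_idents if a is not None]
        if len(named) != len(set(named)):
            problems.append(f"{spec_type} '{ident}': duplicate altIdent")
    return problems


ok = [("elementSpec", "p", None), ("elementSpec", "p", "para"), ("attDef", "p", None)]
bad = [("elementSpec", "p", None), ("elementSpec", "p", None),
       ("elementSpec", "q", "x"), ("elementSpec", "q", "x")]

print(alt_ident_violations(ok))
print(alt_ident_violations(bad))
```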

Meeting 2022-06-29

  • Back to nomenclature. We're beginning to figure out what a derived ODD is, but we don't yet have nomenclature to separate the two stages before it: combining an input subset with a new customization, then filtering out unwanted language materials and prose. We could call the last step localization: it prunes the tree to remove unwanted stuff, and that includes material unwanted because it's not in the target language.

    • customization ODD + language ODD -> derived ODD
    • derived ODD pruned & localized (i.e., lang selection) -> pruned, localized ODD
    • We read the PLODD for creating RNG (it has @xml:lang, which gets copied to <a:annotation>).
    • We will probably separate the localization and pruning stages.
  • We should follow the pattern of the XInclude language fix-up specification: in the early phases of the process, we should decorate every element with @xml:lang. We can also call this process language fix-up. We may actually have to do this at multiple stages in the process.

  • Action on MDH to update the wiki nomenclature page.

  • Question we have to consider: Do we need to perform xml:base fix-up when using, e.g., <elementRef>? Consensus is yes, we do.

  • Suggestion: absolutify every relative URI early in the processing, so there are no more relative URIs to worry about. (Which is the same as base-uri fixup?)

  • Suggestion: we generate a base-uri for every piece of incoming content, make use of that throughout our own processing, but then we discard the base-uri in the output RELAX NG, leaving any remaining issues to the user, since we can't control where the output RELAX NG will end up.

  • Another possible ticket to improve ODD specification: when you define Schematron, you should EITHER depend entirely on the context in the ODD to provide the context for the Schematron rule, OR you should provide a fully-fledged @context attribute. Action on SB to investigate whether a constraintSpec without @context in the context of a classSpec should trigger an error, because the inherent context is meaningless.

  • Suggestion: We should define quite clearly in which contexts a constraint has a useful context which we put on @context (and what it is), and thus in which contexts it does not.

  • Attributes needed on schemaSpec in a derived ODD are only: start, xml:lang, xml:id, prefix, defaultExceptions.
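The language fix-up idea discussed above (decorating every element with @xml:lang) might be sketched like this. It is a Python illustration under stated assumptions: an element without @xml:lang inherits its nearest ancestor's value, the root falls back to "und" when nothing is declared, and an explicit empty value is also mapped to "und". The function name is hypothetical.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def language_fixup(element: ET.Element, inherited: str = "und") -> None:
    """Put an explicit xml:lang on element and on all of its descendants."""
    lang = element.get(XML_LANG)
    if lang is None:
        # Nothing declared here: inherit from the nearest ancestor.
        lang = inherited
    elif lang == "":
        # An empty value resets the language; record it as undetermined.
        lang = "und"
    element.set(XML_LANG, lang)
    for child in element:
        language_fixup(child, lang)


SAMPLE = """
<a xml:lang="en">
  <b><c xml:lang="de"><d/></c></b>
  <e xml:lang=""/>
  <f/>
</a>
"""

root = ET.fromstring(SAMPLE)
language_fixup(root)
langs = [(el.tag, el.get(XML_LANG)) for el in root.iter()]
print(langs)
```

Once every element carries its own value, later stages can move or copy elements between documents without losing track of their language.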

Meeting 2022-06-22

  • Decision: we will use the schemaSpec as the root TEI element for derived ODDs to differentiate between them and input ODDs, so that we can associate different schemas with them in the Oxygen environment. Action on MH to test this and set it up using tei_all for both cases, anticipating that we will have a special schema to validate derived ODDs in the future. We should probably develop the derived ODD schema in both RELAX NG and ODD, to see if we come up with differences.

  • The chain now becomes slightly more complex:

  • Take a set of source ODDs and

    • a) assemble (include external stuff in the ODD);

    • b) apply change instructions based on @mode (which may be a lather-rinse-repeat cycle, repeated until there are no mode=change left to apply). This gives us a pre-derived ODD, and this would be the input to any documentation generation process.

    • c) discard everything which is not schemaSpec, and simplify the use of TEI elements in the prose of glosses, descs, remarks and so on.

    • d) validate the resulting schemaSpec against our special derived ODD schema.

    • e) process the derived ODD to create the RELAX NG.

  • Action on MH to start building a RELAX NG schema directly for the schemaSpec-rooted derived ODD (derivedSchemaSpec.rng), and Action on SB to do the same using ODD (derivedSchemaSpec.odd).

  • XSpec and dynamic errors: https://github.com/xspec/xspec/wiki/Testing-Dynamic-Errors

Meeting 2022-06-15

  • One week remains for submission to TEI Members Meeting. We don't believe we have enough to say for a full-length session, as suggested by James, but we can do a 20-minute paper. But SB will check with Council whether they want a formal report, and if so whether we can get 30 minutes on the schedule for that, or whether we should submit a formal paper. We are tending towards submitting a paper because we will get useful feedback from the review process, and we may be able to engage more people.
    • ATOP -- A new TEI ODD Processor
    • Keywords: ODD, schema, RELAX NG, XSLT Stylesheets, ODD chaining
    • Abstract:
      • Who is involved?
      • Why is it needed?
      • How far have we got?
        • Decisions so far
        • Processing sequence
          • pre-flight schema
          • assembling customization ODD & base ODD (and chaining)
          • [double-check validity of derived ODD]
          • XSLT derived ODD -> RNG
        • Plan of work (derived ODD -> RELAX NG XML, then go back to work on deriving/chaining)
        • XSpec testing framework along the way
      • demo (if we’re lucky)
  • How to handle large @minOccurs and @maxOccurs — Do we set a limit (as current Stylesheets do; it is 400 by default, search for $maxint in odds/teiodds.xsl), or just let user get an enormous RNG schema if she is silly enough to use minOccurs=100 maxOccurs=10000? (SB)
    • Decision— At least at first (until it’s a problem), no upper bound. HOWEVER, the pre-flight input schema check should issue a warning if @maxOccurs > 100 (or whatever).
    • Note— Another use of this input schema would be to check @source attr is resolvable.
    • I (DM) am not certain we need an internal representation of minOccurs/maxOccurs (cf. atop:min-max-to-int) at all. Why not use a named template that is called with RelaxNG content, its min and max occurrence and returns a corresponding RelaxNG pattern?
<xd:doc>
  <xd:desc>
    <xd:p>Given element content, an optional minimum, and an optional maximum occurrence, return a corresponding RelaxNG pattern.</xd:p>
  </xd:desc>
  <xd:param name="pContent">Element content</xd:param>
  <xd:param name="pMinOccurrence">Minimum occurrence, defaults to 1.</xd:param>
  <xd:param name="pMaxOccurrence">Maximum occurrence, defaults to 1.</xd:param>
  <xd:return>RelaxNG pattern</xd:return>
</xd:doc>

<xsl:template name="atop:repeat-content" as="element()*">
  <xsl:param name="pContent" as="element()*"/>
  <xsl:param name="pMinOccurrence" as="xs:integer?"/>
  <xsl:param name="pMaxOccurrence" as="xs:string?"/>
  <xsl:if test="exists($pContent)">
    <xsl:variable name="vMinOccurrence" as="xs:integer" select="($pMinOccurrence, 1)[1]"/>
    <xsl:variable name="vMaxOccurrence" as="xs:string" select="($pMaxOccurrence, '1')[1]"/>
    <xsl:for-each select="1 to $vMinOccurrence">
      <xsl:sequence select="$pContent"/>
    </xsl:for-each>
    <xsl:choose>
      <xsl:when test="$vMaxOccurrence eq 'unbounded'">
        <rng:zeroOrMore>
          <xsl:sequence select="$pContent"/>
        </rng:zeroOrMore>
      </xsl:when>
      <xsl:otherwise>
        <xsl:for-each select="($vMinOccurrence + 1) to xs:integer($vMaxOccurrence)">
          <rng:optional>
            <xsl:sequence select="$pContent"/>
          </rng:optional>
        </xsl:for-each>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:if>
</xsl:template>
  • Reconsider usage of modules (DM)
    • Take for example a function that uses a global variable to avoid a magic string. The function is defined in functions_module.xslt, the variable in global_vars_module.xslt. If this is the case, then functions_module.xslt has an implicit dependency on global_vars_module.xslt. This is not good. [Note that Oxygen handles this with Main/Master Files.]
      • One consequence: XSpec tests for functions cannot run against functions_module.xslt but e.g. against a stylesheet that includes both files.
    • Decision: We keep the modules and revisit this issue later
  • New functions (DM)
    • atop:get-element-QName ($elementSpec as element(elementSpec)) as xs:QName
    • atop:get-attribute-QName ($elementSpec as element(elementSpec)) as xs:QName
  • New terminology: Assembling (DM)
    • Assembling is the process by which references to external components are merged into a customization ODD.
  • Action on MDH: Raise a TEI issue to question why e.g. elementSpec is available in all sorts of ridiculous contexts. It's arguable that most tagdocs elements should only be available in other tagdocs elements, and if we can constrain this (e.g. elementSpec can only be a child of specGrp or schemaSpec) our work will be much easier. (Done 2022-06-15.)
  • Action on MDH: Create a content_module.xslt file to contain the named templates above, and more as we write it. (Done 2022-06-15.)
  • Action on HBS: Create an in-vitro ODD intended for testing processing.
  • Action on DM: Expand comments in functions_module to explain the logic around ident sequence.
  • Action on DM: Check XSpec tests for calls that should fail? (SB,DM; see functions_module.xspec)
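To make the behaviour of the atop:repeat-content named template above easier to reason about, here is a rough Python transliteration, together with the pre-flight warning for large @maxOccurs values. It is a sketch only: patterns are represented as RELAX NG compact-syntax strings rather than as rng:* elements, and the function names and the default threshold of 100 (the "or whatever" figure from the notes) are illustrative.

```python
from typing import List, Optional


def repeat_content(
    pattern: str,
    min_occurs: Optional[int] = None,
    max_occurs: Optional[str] = None,
) -> List[str]:
    """Expand a pattern into required, optional and repeatable copies.

    Mirrors the template: min defaults to 1, max defaults to "1",
    and max may be the string "unbounded".
    """
    minimum = 1 if min_occurs is None else min_occurs
    maximum = "1" if max_occurs is None else max_occurs
    # One required copy per minimum occurrence.
    result = [pattern] * minimum
    if maximum == "unbounded":
        # Any number of further copies.
        result.append(f"{pattern}*")
    else:
        # One optional copy for each occurrence between min and max.
        result.extend([f"{pattern}?"] * max(0, int(maximum) - minimum))
    return result


def max_occurs_is_large(max_occurs: Optional[str], threshold: int = 100) -> bool:
    """Pre-flight check: True if a bounded @maxOccurs exceeds the threshold."""
    if max_occurs is None or max_occurs == "unbounded":
        return False
    return int(max_occurs) > threshold


print(repeat_content("p", 2, "4"))
print(repeat_content("p", 0, "unbounded"))
print(max_occurs_is_large("10000"))
```

The expansion grows linearly with the bounded @maxOccurs value, which is why the silly-but-legal minOccurs=100 maxOccurs=10000 case produces an enormous schema and merits a warning rather than a hard limit.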

Meeting 2022-06-08

Present: DM, HBS, SB, MDH

Agenda

  • Initial discussion relating to @SB's unique-ident() function; both MDH and DM have objections, but these may be moot since we may not need the function anyway, and if we do, can we not just use generate-id(), since we believe that a derived ODD should only have one *Spec element for any given item.

  • The arrangement of the in vivo ODDs. Action on: HBS: add complex odds to sample directory.

  • The in vitro ODD(s) for <content>. (SB)

  • Getting ant to run XSpec (DM): https://github.com/TEIC/atop/pull/12. Universally admired and merged.

  • XSpec tests for calls that should fail? (SB,DM; see functions_module.xspec)

  • How to handle large @minOccurs and @maxOccurs — Do we set a limit (as current Stylesheets do; it is 400 by default, search for $maxint in odds/teiodds.xsl), or just let user get an enormous RNG schema if she is silly enough to use minOccurs=100 maxOccurs=10000? (SB). Not addressed because of time.

  • One week remains for submission to TEI Members Meeting. We don't believe we have enough to say for a full-length session, as suggested by JC, but we can do a 20-minute paper. But SB will check with Council whether they want a formal report, and if so whether we can get 30 minutes on the schedule for that, or whether we should submit a formal paper. We are tending towards submitting a paper because we will get useful feedback from the review process, and we may be able to engage more people.

Action items:

  • SB to talk to Council re Members Meeting report.
  • DM to look into XSpec tests that are supposed to test failure.
  • Everyone to think about what should go into a Members Meeting paper or report.

Meeting 2022-06-01

  • As of Oct 22 Nick will find it nearly impossible to make the Wed 15:15Z meetings, so by Sep 22 we should schedule a new meeting time. Action on SB by 2022-09-15: send out Doodle poll or whatever to schedule new meeting time.
  • Action on MH by 2022-09-01: Replace Saxon 10 with Saxon 11 (to make sure that we don’t build in any features which will break, e.g. use of document-uri() from the context of a collection).
  • We agreed to defer much of the proposed agenda, including folder naming conventions, to next week (2022-06-08) in order to thoroughly discuss atop:unique-ident(). We came up with some code for it [which SB re-wrote after the meeting]: results.

Meeting 2022-05-25

Meeting 2022-05-18

ODD searches

  • How many ODDs do we want that just include modules? None (other than those in Exemplars/ already)
  • MH to continue at least with Category:Customization, if not project ODDs, too.
  • HBS looked at repos with ".odd" files, and created a list of repos to look at later. She skipped those that were forks of TEI-C repos, or were a copy of tei_lite, or named “Old TEI”, etc.
  • HBS plans to ask each repo if we can copy their ODD (despite the file, by definition, being open source) to be polite.
  • SB wrote a draft to post to TEI-L, to which there were no objections.

ODD test suite

  • In addition to gathering ODDs per above, we also wish to generate a test suite of a whole variety of possible customization ODDs.
  • Our plan is to generate a set of tests for the thing we are about to create. E.g., make a whole suite of tests (probably in multiple ODD files) that express the various possible combinations of content of <content> first, as we plan to work on derived ODD to RNG content models first. Then someday later , etc.
  • This is a lot of work.

Musings on testing output RNG

  • Generate RNG, ensure output is valid
  • Use XSpec for named templates and functions
  • Test that RNG validates what we want?

From 04-20 meeting

  • Concerning the things to be checked in the derived ODD:
    • <*Ref> things that cannot be found?
    • @preserveOrder?
    • EOW1: Multiple <content> or <valList> children of <macroSpec> or <dataSpec>
    • EOW1: Multiple <elementSpec>s with same @ident
  • Other questions to be discussed concerning the derived ODD:
    • Does it have warnings (in which case transpilation continues) or just errors (in which case no further processing)? In other words: “stop at first sign of problem” vs “soldier on if there is even a chance of generating usable output”
    • What should it be written in?
    • What directory does the schema file go in?
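
A minimal sketch of how the two EOW1 checks in the list above might look, written in Python with the standard library only for illustration (the real checks would presumably live in XSLT or Schematron). The function name `eow1_problems` and the message wording are hypothetical; only the TEI namespace URI and the element names come from TEI itself.

```python
import xml.etree.ElementTree as ET
from collections import Counter

TEI = "{http://www.tei-c.org/ns/1.0}"

def eow1_problems(odd_xml):
    """Return one message per EOW1 condition found in a derived ODD (hypothetical helper)."""
    root = ET.fromstring(odd_xml)
    problems = []

    # EOW1: multiple <elementSpec>s with the same @ident
    idents = Counter(spec.get("ident", "") for spec in root.iter(TEI + "elementSpec"))
    for ident, count in sorted(idents.items()):
        if count > 1:
            problems.append(f"<elementSpec ident='{ident}'> occurs {count} times")

    # EOW1: multiple <content> or <valList> children of <macroSpec> or <dataSpec>
    for spec_name in ("macroSpec", "dataSpec"):
        for spec in root.iter(TEI + spec_name):
            for child_name in ("content", "valList"):
                count = len(spec.findall(TEI + child_name))
                if count > 1:
                    problems.append(
                        f"<{spec_name} ident='{spec.get('ident', '')}'> "
                        f"has {count} <{child_name}> children"
                    )
    return problems
```

Whether each message ends up as an error, or as a warning that only the first occurrence is processed, is exactly the open question; the sketch only finds the conditions.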

Structure for XSLT folder

  • XSLT/
    • modules/ (functions_module.xslt, global_vars_module.xslt, params_module.xslt, messages_module.xslt, etc.)
    • derivedODD_to_rng_master.xslt
    • customization_to_derived_rung.xslt (or whatever)

Meeting 2022-05-04

  • Results of MH tests concerning @docLang vs @targetLang:
    • @docLang changed <gloss> (and presumably <desc>) and the schema annotations
    • @targetLang changed the element name (i.e., used <altIdent xml:lang=MATCH>)
    • Description of the attributes is correct and the Stylesheets process them correctly
    • Currently the processor does not handle more than one value on @docLang (although it’s allowed)
    • Discussion: Does ATOP handle more than one @docLang value? In principle, yes.
    • Thus we use <xsl:result-document/>, because it allows us to generate multiple schemas from one transformation process, where (for example) @docLang has multiple languages.
  • We still want a user to be able to run the transform on the command line (although we may drop this later), so for now presume the ant driver sends in a wacky parameter to tell the XSLT it is being run from ant. (Thus if it does not see that param, it can give the user info.)
  • Action on Syd: contact the people working in TEIGarage to ask whether ant and ant-contrib can be integrated/supported.
  • Assign ODD collection roles:
    • SB will ask on TEI-L.
    • HBS will search on GitHub for likely candidates.
    • MH will examine the ones already on the wiki.
  • Ask ODD providers how they want to be credited; anonymity being an option. A list of anonymous contributors can be kept on Slack if necessary so we can still contact any particular contributor.
  • ODD header: keep provenance (except if anonymous, in which case just use an <idno> that corresponds to that list on Slack), license, and modifications implemented by us.
  • We decided last week that step 1 of transpiling a derived ODD into RELAX NG is a validation stage. Preliminary list of things that will be checked (EOW1 means either error, or a warning that only the first will be processed, others will be ignored):
    • Error: More than 1 <schemaSpec> (which does not preclude more than 1 in a customization ODD; we will decide that later)
    • Error: contains @source on ODD element(s)
    • Error: contains <specGrpRef> or <xi:include>
    • Warn: contains <specGrp>
      • We are not sure that <specGrp> and <specGrpRef> should ever appear in the schemaSpec of a derived ODD, because they are essentially a grouping and inclusion mechanism for organizing documentation (the prose). MH to check this understanding by a) generating some derived ODDs, and b) contacting James, who is an expert on this.
    • Something we can’t handle on @docLang or @targetLang per above conversation. We do not have to worry about multiple values on @targetLang, since only 1 is allowed. Do we care if one of the values of @docLang is not a proper BCP 47 language tag? As long as it matches an @xml:lang, it will work, ey?
    • Check @xml:lang values in pertinent elements and create a warning if there is no match
    • EOW1: Multiple <altIdent> siblings without @xml:lang or with the same @xml:lang
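
The Error and Warn items in the preliminary list above could be sketched roughly as follows. This is an illustration in Python (standard library only), not the planned implementation. The function name `check_derived_odd` and the message strings are hypothetical, and treating "ODD element(s)" as any TEI element whose name ends in "Spec" or "Ref" is our assumption.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"
XI = "{http://www.w3.org/2001/XInclude}"

def check_derived_odd(odd_xml):
    """Return (errors, warnings) for the checks listed above (hypothetical helper)."""
    elements = list(ET.fromstring(odd_xml).iter())
    tags = [el.tag for el in elements]
    errors, warnings = [], []

    # Error: more than 1 <schemaSpec>
    if tags.count(TEI + "schemaSpec") > 1:
        errors.append("more than one <schemaSpec>")

    # Error: @source on an ODD element.
    # Assumption: "ODD element" means a TEI element named *Spec or *Ref.
    for el in elements:
        local = el.tag[len(TEI):] if el.tag.startswith(TEI) else ""
        if local.endswith(("Spec", "Ref")) and el.get("source") is not None:
            errors.append(f"@source on <{local}>")

    # Error: <specGrpRef> or <xi:include>
    if TEI + "specGrpRef" in tags:
        errors.append("contains <specGrpRef>")
    if XI + "include" in tags:
        errors.append("contains <xi:include>")

    # Warn: <specGrp>
    if TEI + "specGrp" in tags:
        warnings.append("contains <specGrp>")

    return errors, warnings
```

The @docLang and @targetLang checks are left out because we have not yet settled what they should test.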

Elements to discuss in the following meeting:

  • Concerning the things to be checked in the derived ODD:
    • <*Ref> things that cannot be found?
    • @preserveOrder?
    • EOW1: Multiple <content> or <valList> children of <macroSpec> or <dataSpec>
    • EOW1: Multiple <elementSpec>s with same @ident
  • Other questions to be discussed concerning the derived ODD:
    • Does it have warnings (in which case transpilation continues) or just errors (in which case no further processing)? In other words: “stop at first sign of problem” vs “soldier on if there is even a chance of generating usable output”
    • What should it be written in?
    • What directory does the schema file go in?
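
To make the "stop at first sign of problem" vs "soldier on" question concrete, here is a small Python sketch of the two policies. Everything in it is hypothetical (the name `run_checks`, the `(severity, message)` pairs, the idea that each check is a callable); it only illustrates the difference in control flow.

```python
def run_checks(checks, stop_at_first_problem):
    """Run checks in order and return (proceed, findings).

    Each check is a callable returning a list of (severity, message) pairs,
    where severity is "error" or "warning". (Hypothetical interface.)
    """
    findings = []
    for check in checks:
        found = check()
        findings.extend(found)
        # Policy 1: any problem at all ends processing immediately.
        if stop_at_first_problem and found:
            return False, findings
    # Policy 2: collect everything; only an error blocks transpilation.
    proceed = all(severity != "error" for severity, _ in findings)
    return proceed, findings
```

Under the second policy the user sees every message in one run, at the cost of possibly reporting problems that are only consequences of an earlier one.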

Meeting 2022-04-27

  • We have decided to try out Framapad for our notes.
  • Task: MH will determine which attribute (@docLang or @targetLang) controls the language of the annotations in the RELAX NG output by 2022-05-04.
  • Task: MH will copy the scratchpad into the wiki page after the meeting by 2022-05-04.
  • How to deal with required enhancements in the current ODD?
  • We need to amass a collection of ODDs from TEI users. We should write to TEI-L for that. If we store these in a GitHub repo, we can check them out as part of our own testing and run them all through the process. We should put these ODDs in our own ATOP repo. We could divide them into those we expect to be successfully processed, and those which contain features which we expect to raise an error or a warning for. One person should go through the ODDs that are already on the TEI wiki; someone else should seek out ODDs on GitHub and ask people if we can use them; a third person should post to TEI-L asking for contributions. We need to add metadata to those ODDs with the users' permission.
  • We're OK with not supporting something (e.g. @preserveOrder), especially if we're explicit about it, and we raise a ticket with a particular category that says that we don't support it, asking for input from Council.
  • What should happen with multiple <schemaSpec> elements in one file? The current Stylesheets don't handle them; we also don't believe we should handle them in the sense that a derived ODD should only ever have one <schemaSpec>, and we believe that in the real world, nobody actually has multiple schemaSpecs in the same file, so perhaps we should just make a feature request to TEI to disallow multiple schemaSpecs. We don't have to handle this until we're writing the first stage of the process, though.
  • The first part of step 2 (transpiling) should actually be a validation stage where we check the derived ODD against a set of rules which cover things we believe to be wrong and are justified in rejecting, and things we are not yet able to handle, which we can warn about.

Meeting 2022-04-20

Agenda

Namespace URI

We pick http://www.tei-c.org/ns/atop as the namespace URI.

Open PRs

All PRs are accepted; #2 and #4 require a rebase and will be merged by @dmj after this.

Discuss derived ODD

Issues #6 and #7 lead to a discussion about altIdent and support for different languages for documentation. These are two different issues:

  1. We need at most one altIdent to obtain a QName of a specified element or attribute. A preprocessing step might resolve any ambiguity by removing all other altIdent or annotating the one to use.

  2. We have to check how far support for different (documentation) languages can go with RELAX NG. The ODD chaining component should preserve all language information if possible.
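
As an illustration of point 1, a preprocessing step that settles on a single name per spec might look something like the Python sketch below (standard library only). The function name `resolve_name` and the selection policy are assumptions, not decisions: prefer an altIdent whose own @xml:lang matches the requested language, then an altIdent with no @xml:lang, then fall back to @ident. Inherited @xml:lang values are ignored for simplicity.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def resolve_name(spec, target_lang=None):
    """Pick the single name to use for an elementSpec or attDef (hypothetical policy)."""
    alt_idents = spec.findall(TEI + "altIdent")
    # 1. An <altIdent> whose own @xml:lang matches the requested language.
    if target_lang is not None:
        for alt in alt_idents:
            if alt.get(XML_LANG) == target_lang:
                return (alt.text or "").strip()
    # 2. An <altIdent> that carries no @xml:lang at all.
    for alt in alt_idents:
        if alt.get(XML_LANG) is None:
            return (alt.text or "").strip()
    # 3. Otherwise the spec's own @ident.
    return spec.get("ident")
```

Because the first match wins, multiple altIdent siblings with the same (or no) @xml:lang are silently reduced to the first one here; a real preprocessing step would probably want to report that.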

Action Items:

  • Syd to create a ticket about better documentation of the usage and constraints of altIdent
  • David to rebase PR #2 and #4 and merge them to dev