Chapter 25. Constraint Grammar Keywords

Table of Contents

ADD
ADDCOHORT
ADDRELATION
ADDRELATIONS
AFTER-SECTIONS
ALL
AND
APPEND
BARRIER
BEFORE-SECTIONS
CBARRIER
CONSTRAINTS
COPY
CORRECTIONS
DELIMIT
DELIMITERS
END
EXTERNAL
IF
IFF
INCLUDE
LINK
LIST
MAP
MAPPINGS
MAPPING-PREFIX
MOVE
NEGATE
NONE
NOT
NULL-SECTION
OPTIONS
PREFERRED-TARGETS
REMCOHORT
REMOVE
REMRELATION
REMRELATIONS
REPLACE
SECTION
SELECT
SET
SETCHILD
SETPARENT
SETRELATION
SETRELATIONS
SETS
SOFT-DELIMITERS
STATIC-SETS
STRICT-TAGS
SUBSTITUTE
SWITCH
TARGET
TEMPLATE
TEXT-DELIMITERS
TO
UNDEF-SETS
UNMAP

You should avoid using these keywords as set names or similar.

ADD

Singles out a reading from the cohort that matches the target, and if all contextual tests are satisfied it will add the listed tags to the reading. Unlike MAP it will not block further MAP, ADD, or REPLACE rules from operating on the reading.

      ADD (tags) targetset (-1* ("someword")) ;
    

ADDCOHORT

Inserts a new cohort before or after the target.

      ADDCOHORT ("<wordform>" "baseform" tags) BEFORE (@waffles) ;
      ADDCOHORT ("<wordform>" "baseform" tags) AFTER (@waffles) ;
    

ADDRELATION

ADDRELATION creates a one-way named relation from the current cohort to the found cohort. The name must be an alphanumeric string with no whitespace.

      ADDRELATION (name) targetset (-1* ("someword"))
        TO (1* (@candidate)) (2 SomeSet) ;
    

ADDRELATIONS

ADDRELATIONS creates two one-way named relations: one from the current cohort to the found cohort, and one in the other direction. The names can be the same if so desired.

      ADDRELATIONS (name) (name) targetset (-1* ("someword"))
        TO (1* (@candidate)) (2 SomeSet) ;
    

AFTER-SECTIONS

Same as SECTION, except it is only run a single time per window, and only after all normal SECTIONs have run.
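
A minimal sketch of where such a block could sit in a grammar, reusing the placeholder names (targetset, tags, tag) from the other entries in this chapter; the rules themselves are illustrative only:

      SECTION
      SELECT targetset (-1* ("someword")) ;

      AFTER-SECTIONS
      ADD (tags) targetset (1 (tag)) ;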

ALL

An inline keyword put at the start of a contextual test to mandate that all cohorts found via the dependency or relation must match the set. This is a more readable way of saying 'C'.

      SELECT targetset (ALL c (tag)) ;
    

AND

Deprecated: use "LINK 0" instead. An inline keyword put between contextual tests as shorthand for "LINK 0".
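
As a sketch of the recommended replacement, a second test on the same cohort can be chained with LINK 0. The names targetset, tag, and othertag are placeholders:

      SELECT targetset (-1 (tag) LINK 0 (othertag)) ;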

APPEND

Singles out a reading from the cohort that matches the target, and if all contextual tests are satisfied it will create and append a new reading from the listed tags. Since this creates a raw reading you must include a baseform in the tag list.

      APPEND ("baseform" tags) targetset (-1* ("someword")) ;
    

BARRIER

An inline keyword, part of a contextual test, that halts a scan if the barrier is encountered. Only meaningful in scanning contexts.

      SELECT targetset (-1* ("someword") BARRIER (tag)) ;
    

BEFORE-SECTIONS

Same as SECTION, except it is only run a single time per window, and only before all normal SECTIONs have run.
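
A minimal sketch of where such a block could sit in a grammar, reusing the placeholder names (targetset, tags) from the other entries in this chapter; the rules themselves are illustrative only:

      BEFORE-SECTIONS
      ADD (tags) targetset (-1* ("someword")) ;

      SECTION
      SELECT targetset (-1* ("someword")) ;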

CBARRIER

Careful version of BARRIER: the scan is halted only if all readings of the cohort match the barrier set. Only meaningful in scanning contexts. See CBARRIER.

      SELECT targetset (-1* ("someword") CBARRIER (tag)) ;
    

CONSTRAINTS

Deprecated: use SECTION instead. A section of the grammar that can contain SELECT, REMOVE, and IFF entries.

COPY

Duplicates a reading and adds tags to it. If you don't want to copy previously copied readings, you will have to keep track of that yourself by adding a marker tag.

      COPY (€copy tags) TARGET (target) - (€copy) ;
    

CORRECTIONS

Deprecated: use BEFORE-SECTIONS instead. A section of the grammar that can contain APPEND and SUBSTITUTE entries.

DELIMIT

If it finds a reading which satisfies the target and the contextual tests, DELIMIT will cut the disambiguation window immediately after the cohort the reading is in. After delimiting in this manner, CG-3 will bail out and disambiguate the newly formed window from the start. This should not be used instead of DELIMITERS unless you know what you are doing.

      DELIMIT targetset (-1* ("someword")) ;
    

DELIMITERS

Sets a list of hard delimiters. If one of these is found, the disambiguation window is cut immediately after the cohort it was found in. If no delimiters are defined or the window exceeds the hard limit (defaults to 500 cohorts), the window will be cut arbitrarily. Internally, this is converted to the magic set _S_DELIMITERS_.

      DELIMITERS = "<$.>" "<$?>" "<$!>" "<$:>" "<$\;>" ;
    

END

Denotes the end of the grammar. Nothing after this keyword is read. Useful for debugging.
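
A sketch of the debugging use, with placeholder rules: everything after END is skipped, so trailing rules can be switched off without deleting them.

      SELECT targetset (-1* ("someword")) ;

      END

      # Not read, because it comes after END.
      REMOVE targetset (1 (tag)) ;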

EXTERNAL

Opens up a persistent pipe to the program and passes it the current window.

      EXTERNAL ONCE /usr/local/bin/waffles (V) (-1 N) ;
      EXTERNAL ALWAYS program-in-path (V) (-1 N) ;
      EXTERNAL ONCE "program with spaces" (V) (-1 N) ;
    

IF

An optional inline keyword put before the first contextual test of a rule.
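
For illustration, using the same placeholder rule as the other entries, written with the optional keyword:

      SELECT targetset IF (-1* ("someword")) ;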

IFF

Singles out a reading from the cohort that matches the target, and if all contextual tests are satisfied it will behave as a SELECT rule. If the tests are not satisfied it will behave as a REMOVE rule.

      IFF targetset (-1* ("someword")) ;
    

INCLUDE

Loads and parses another grammar file as if it had been pasted in on the line of the INCLUDE statement.

      INCLUDE other-file-name ;
      

LINK

An inline keyword, part of a contextual test, that chains to another contextual test if the current one is satisfied. The chained contextual test will operate from the current position in the window, as opposed to the position of the original cohort that initiated the chain. The chain can be extended to any depth.

      SELECT targetset (-1* ("someword") LINK 3 (tag)) ;
    

LIST

Defines a new set based on a list of tags, or appends to an existing set.

      LIST setname = tag othertag (mtag htag) ltag ;

      LIST setname += even more tags ;
    

MAP

Singles out a reading from the cohort that matches the target, and if all contextual tests are satisfied it will add the listed tags to the reading and block further MAP, ADD, or REPLACE rules from operating on the reading.

      MAP (tags) targetset (-1* ("someword")) ;
    

MAPPINGS

Deprecated: use BEFORE-SECTIONS instead. A section of the grammar that can contain MAP, ADD, and REPLACE entries.

MAPPING-PREFIX

Defines the single prefix character that should determine whether a tag is considered a mapping tag or not. Defaults to @.

      MAPPING-PREFIX = @ ;
    

MOVE

Moves cohorts and optionally all children of the cohort to a different position in the window.

      MOVE targetset (-1* ("someword")) AFTER (1* ("buffalo")) (-1 ("water")) ;
      MOVE WITHCHILD (*) targetset (-1* ("someword")) BEFORE (1* ("buffalo")) (-1 ("water")) ;
      MOVE targetset (-1* ("someword")) AFTER WITHCHILD (*) (1* ("buffalo")) (-1 ("water")) ;
      MOVE WITHCHILD (*) targetset (-1* ("someword")) BEFORE WITHCHILD (*) (1* ("buffalo")) (-1 ("water")) ;
    

NEGATE

An inline keyword put at the start of a contextual test to invert the combined result of all following contextual tests. Similar to, but not the same as, NOT.

      SELECT targetset (NEGATE -1* ("someword") LINK NOT 1 (tag)) ;
    

NONE

An inline keyword put at the start of a contextual test to mandate that none of the cohorts found via the dependency or relation may match the set. This is a more readable way of saying 'NOT'.

      SELECT targetset (NONE c (tag)) ;
    

NOT

An inline keyword put at the start of a contextual test to invert the result of it. Similar to, but not the same as, NEGATE.

      SELECT targetset (NEGATE -1* ("someword") LINK NOT 1 (tag)) ;
    

NULL-SECTION

Same as SECTION, except it is not actually run. Used for containing ANCHOR'ed lists of rules that you don't want run in the normal course of rule application.

OPTIONS

Global options that affect the grammar parsing.

      OPTIONS += no-inline-sets ;
    

PREFERRED-TARGETS

If the preferred targets are defined, this will influence SELECT, REMOVE, and IFF rules. Normally, these rules will operate until one reading remains in the cohort. If there are preferred targets, these rules are allowed to operate until there are no readings left, after which the preferred target list is consulted to find a reading to "bring back from the dead" and pass on as the final reading to survive the round. Due to its nature of defying the rule application order, this is bad voodoo. I recommend only using this if you know what you are doing. This currently has no effect in CG-3, but will in the future.

      PREFERRED-TARGETS = tag othertag etctag ;
    

REMCOHORT

If it finds a reading which satisfies the target and the contextual tests, REMCOHORT will remove the cohort and all its readings from the current disambiguation window.

      REMCOHORT targetset (-1* ("someword")) ;
    

REMOVE

Singles out a reading from the cohort that matches the target, and if all contextual tests are satisfied it will delete the matched reading.

      REMOVE targetset (-1* ("someword")) ;
    

REMRELATION

Destroys one direction of a relation previously created with either SETRELATION or SETRELATIONS.

      REMRELATION (name) targetset (-1* ("someword"))
        TO (1* (@candidate)) (2 SomeSet) ;
    

REMRELATIONS

Destroys both directions of a relation previously created with either SETRELATION or SETRELATIONS.

      REMRELATIONS (name) (name) targetset (-1* ("someword"))
        TO (1* (@candidate)) (2 SomeSet) ;
    

REPLACE

Singles out a reading from the cohort that matches the target, and if all contextual tests are satisfied it will remove all existing tags from the reading, then add the listed tags to the reading and block further MAP, ADD, or REPLACE rules from operating on the reading.

      REPLACE (tags) targetset (-1* ("someword")) ;
    

SECTION

A section of the grammar that can contain all types of rule and set definition entries.
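
A minimal sketch of a grammar body split into two sections, reusing the placeholder names from the other entries in this chapter; the rules are illustrative only:

      SECTION
      SELECT targetset (-1* ("someword")) ;

      SECTION
      REMOVE targetset (1 (tag)) ;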

SELECT

Singles out a reading from the cohort that matches the target, and if all contextual tests are satisfied it will delete all other readings except the matched one.

      SELECT targetset (-1* ("someword")) ;
    

SET

Defines a new set based on operations between existing sets.

      SET setname = someset + someotherset - (tag) ;
    

SETCHILD

Attaches the matching reading to the contextually targeted cohort as the parent. The last link of the contextual test is used as target.

      SETCHILD targetset (-1* ("someword"))
        TO (1* (step) LINK 1* (candidate)) (2 SomeSet) ;
    

SETPARENT

Attaches the matching reading to the contextually targeted cohort as a child. The last link of the contextual test is used as target.

      SETPARENT targetset (-1* ("someword"))
        TO (1* (step) LINK 1* (candidate)) (2 SomeSet) ;
    

SETRELATION

Creates a one-way named relation from the current cohort to the found cohort. The name must be an alphanumeric string with no whitespace.

      SETRELATION (name) targetset (-1* ("someword"))
        TO (1* (@candidate)) (2 SomeSet) ;
    

SETRELATIONS

Creates two one-way named relations: one from the current cohort to the found cohort, and one in the other direction. The names can be the same if so desired.

      SETRELATIONS (name) (name) targetset (-1* ("someword"))
        TO (1* (@candidate)) (2 SomeSet) ;
    

SETS

Deprecated: has no effect in CG-3. A section of the grammar that can contain SET and LIST entries.

SOFT-DELIMITERS

Sets a list of soft delimiters. If a disambiguation window is approaching the soft-limit (defaults to 300 cohorts), CG-3 will begin to look for a soft delimiter to cut the window after. Internally, this is converted to the magic set _S_SOFT_DELIMITERS_.

      SOFT-DELIMITERS = "<$,>" ;
    

STATIC-SETS

A list of set names that need to be preserved at runtime to be used with advanced variable strings.

      STATIC-SETS = VINF ADV ;
    

STRICT-TAGS

A whitelist of allowed tags.

      STRICT-TAGS += N V ADJ ;
    

SUBSTITUTE

Singles out a reading from the cohort that matches the target, and if all contextual tests are satisfied it will remove the tags from the search list, then add the listed tags to the reading. No guarantee is currently made as to where the replacement tags are inserted, but in the future the idea is that the tags will be inserted in place of the last found tag from the search list. This is a moot point for CG-3 as the tag order does not matter internally, but external tools may expect a specific order.

      SUBSTITUTE (search tags) (new tags) targetset (-1* ("someword")) ;
    

SWITCH

Switches the position of two cohorts in the window.

      SWITCH targetset (-1* ("someword")) WITH (1* ("buffalo")) (-1 ("water")) ;
    

TARGET

An optional inline keyword put before the target of a rule.
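
For illustration, using the same placeholder rule as the other entries, written with the optional keyword, alone and combined with IF:

      SELECT TARGET targetset (-1* ("someword")) ;
      SELECT TARGET targetset IF (-1* ("someword")) ;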

TEMPLATE

Sets up templates of alternative contextual tests which can later be referred to by multiple rules or templates.

      TEMPLATE name = (1 (N) LINK 1 (V)) OR (-1 (N) LINK 1 (V)) ;
      TEMPLATE other = (T:name LINK 1 (P)) OR (1 (C)) ;
      SELECT (x) IF ((T:name) OR (T:other)) ;
    

TEXT-DELIMITERS

Sets a list of non-CG text delimiters. If any of the patterns match non-CG text between cohorts, the window will be delimited at that point. Internally, this is converted to the magic set _S_TEXT_DELIMITERS_. If cmdline flag -T is passed, that will override any pattern set in the grammar.

      TEXT-DELIMITERS = /(^|\n)<s/r ;
    

TO

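An inline keyword put before the contextual target of a SETPARENT or SETCHILD rule. The relation rules in this chapter (ADDRELATION, SETRELATION, REMRELATION, and their plural forms) use it in the same position.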

UNDEF-SETS

A list of set names that will be undefined and made available for redefinition. See set manipulation.

      UNDEF-SETS = VINF ADV ;
    

UNMAP

Removes the mapping tag of a reading and lets ADD and MAP target the reading again. By default it will only act if the cohort has exactly one reading, but marking the rule UNSAFE lets it act on multiple readings.

      UNMAP (TAG) ;
      UNMAP UNSAFE (TAG) ;