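How can I get data from other lines that sit above or below an annotated keyword? I can annotate the keyword, but I cannot get the information that belongs to it.
Sample text: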
Underwriter's Name Appraiser's Name Appraisal Company Name
Alice Wheaton Cooper Bruce Banner Stark Industries
Code:
TYPESYSTEM utils.PlainTextTypeSystem;
ENGINE utils.PlainTextAnnotator;
EXEC(PlainTextAnnotator, {Line});
ADDRETAINTYPE(WS);
Line{->TRIM(WS)};
REMOVERETAINTYPE(WS);
Document{->FILTERTYPE(SPECIAL)};
DECLARE UnderWriterKeyword, NameKeyword, UnderWriterNameKeyword;
DECLARE UnderWriterName(String label, String value);
CW{REGEXP("\\bUnderwriter") -> UnderWriterKeyword};
CW{REGEXP("Name")->NameKeyword};
(UnderWriterKeyword SW NameKeyword){->UnderWriterNameKeyword};
ADDRETAINTYPE(SPACE);
Line{CONTAINS(UnderWriterNameKeyword)} Line -> {
(CW SPACE)+ {-> MARK(UnderWriterName)};
};
REMOVERETAINTYPE(SPACE);
Expected output:
Underwriter's Name: Alice Wheaton Cooper
Appraiser's Name: Bruce Banner
Appraisal Company Name: Stark Industries
Please advise whether this is possible in RUTA. If so, how can I get the data?
TYPESYSTEM utils.PlainTextTypeSystem;
ENGINE utils.PlainTextAnnotator;
DECLARE Header;
DECLARE ColumnDelimiter;
DECLARE Cell(INT column);
DECLARE Keyword (STRING label);
DECLARE Keyword UnderWriterNameKeyword, AppraiserNameLicenseKeyword, AppraisalCompanyNameKeyword;
"Underwriter's Name" -> UnderWriterNameKeyword ("label" = "UnderWriter Name");
"Appraiser's Name/License" -> AppraiserNameLicenseKeyword ("label" = "Appraiser Name");
"Appraisal Company Name" -> AppraisalCompanyNameKeyword ("label" = "Appraisal Company Name");
DECLARE Entry(Keyword keyword);
EXEC(PlainTextAnnotator, {Line,Paragraph});
ADDRETAINTYPE(WS);
Line{->TRIM(WS)};
Paragraph{->TRIM(WS)};
SPACE[3,100]{-PARTOF(ColumnDelimiter) -> ColumnDelimiter};
Line -> {ANY+{-PARTOF(Cell),-PARTOF(ColumnDelimiter) -> Cell};};
REMOVERETAINTYPE(WS);
INT index = 0;
BLOCK(structure) Line{}{
    ASSIGN(index, 0);
    Line{STARTSWITH(Paragraph) -> Header};
    c:Cell{-> c.column = index, index = index + 1};
}
Header<-{hc:Cell{hc.column == c.column}<-{k:Keyword;};}
# c:@Cell{-PARTOF(Header) -> e:Entry, e.keyword = k};
DECLARE Entity (STRING label, STRING value);
DECLARE Entity UnderWriterName, AppraiserNameLicense, AppraisalCompanyName;
FOREACH(entry) Entry{}{
    entry{-> CREATE(UnderWriterName, "label" = k.label, "value" = entry.ct)}
        <-{k:entry.keyword{PARTOF(UnderWriterNameKeyword)};};
    entry{-> CREATE(AppraiserNameLicense, "label" = k.label, "value" = entry.ct)}
        <-{k:entry.keyword{PARTOF(AppraiserNameLicenseKeyword)};};
    entry{-> CREATE(AppraisalCompanyName, "label" = k.label, "value" = entry.ct)}
        <-{k:entry.keyword{PARTOF(AppraisalCompanyNameKeyword)};};
}
The most important rule is the following:
Header<-{hc:Cell{hc.column == c.column}<-{k:Keyword;};}
# c:@Cell{-PARTOF(Header) -> e:Entry, e.keyword = k};
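It contains three rule elements, Header, # and Cell, and works this way:
1. The rule starts matching with the Cell rule element, because that element is marked as the anchor with @.
2. The Cell rule element matches every Cell annotation that is not part of a Header annotation. It starts with the first Cell annotation that fulfils this condition and remembers it as "c".
3. The wildcard # matches until the next rule element is able to match.
4. The next rule element matches a Header annotation if its inlined rule is able to match within the span of that Header annotation.
5. The inlined rule matches Cell annotations within that span, remembered as "hc", that have the same value for the feature column as "c".
6. The match succeeds if such a cell contains a Keyword annotation, which is remembered as "k".
7. An Entry annotation, called "e", is created on the span of the Cell annotation "c".
8. The Keyword annotation "k" is assigned to the feature keyword of that Entry annotation.
To sum up, the rule creates an Entry annotation for each Cell annotation that is not part of the header and assigns the header keyword of the corresponding column to it, which defines the type of the entry.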
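The column-matching idea behind that rule can also be sketched outside Ruta. The Python snippet below is only an illustration of the logic, not a translation of the Ruta script: it assumes that columns are separated by runs of at least three spaces (as the ColumnDelimiter rule expects), that the first line is the header, and it uses the header text itself as the label instead of looking up Keyword annotations. The names split_cells and pair_entries are hypothetical helpers introduced for this example.

```python
import re

# Assumption: columns are separated by runs of 3 or more spaces,
# mirroring the SPACE[3,100] ColumnDelimiter rule in the Ruta script.
COLUMN_DELIMITER = re.compile(r" {3,}")


def split_cells(line):
    """Split one line into trimmed cells; the list index plays the role of Cell.column."""
    parts = COLUMN_DELIMITER.split(line.strip())
    return [part.strip() for part in parts if part.strip()]


def pair_entries(header_line, data_lines):
    """Pair every non-header cell with the header cell that has the same column index."""
    header_cells = split_cells(header_line)
    entries = []
    for line in data_lines:
        for column, value in enumerate(split_cells(line)):
            # Cells without a header cell in the same column are skipped.
            if column < len(header_cells):
                entries.append((header_cells[column], value))
    return entries


# Hypothetical input: the sample text with wide gaps between the columns.
sample = [
    "Underwriter's Name      Appraiser's Name      Appraisal Company Name",
    "Alice Wheaton Cooper    Bruce Banner          Stark Industries",
]

entries = pair_entries(sample[0], sample[1:])
for label, value in entries:
    print(f"{label}: {value}")
```

As in the Ruta rule, the pairing depends only on the column index, so it keeps working when more data lines follow the header, provided each line has the same number of columns.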