Commit 9badefd

Finish 0.6.3
2 parents 3243d77 + f94192f commit 9badefd

File tree

18 files changed

+1802
-13446
lines changed

README.md

Lines changed: 5 additions & 5 deletions
@@ -18,7 +18,7 @@ This is a pure-Ruby library for working with the [Shape Expressions Language][Sh
 
 The ShEx gem implements a [ShEx][ShExSpec] Shape Expression engine version 2.0.
 
-* `ShEx::Parser` parses ShExC and ShExJ formatted documents generating executable operators which can be serialized as [S-Expressions](https://en.wikipedia.org/wiki/S-expression).
+* `ShEx::Parser` parses ShExC and ShExJ formatted documents generating executable operators which can be serialized as [S-Expressions][].
 * `ShEx::Algebra` executes operators against Any `RDF::Graph`, including compliant [RDF.rb][].
 * [Implementation Report](file.earl.html)
 

@@ -183,11 +183,9 @@ Example usage:
 
 
 ## Implementation Notes
-The ShExC parser uses the [EBNF][] gem to generate first, follow and branch tables, and uses the `Parser` and `Lexer` modules to implement the ShExC parser.
+The ShExC parser uses the [EBNF][] gem to generate a [PEG][] parser.
 
-The parser takes branch and follow tables generated from the [ShEx Grammar](file.shex.html) described in the [specification][ShExSpec]. Branch and Follow tables are specified in the generated {ShEx::Meta}.
-
-The result of parsing either ShExC or ShExJ is the creation of a set of executable {ShEx::Algebra} Operators which are directly executed to perform shape validation.
+The parser uses the executable [S-Expressions][] generated from the EBNF ShExC grammar to create a set of executable {ShEx::Algebra} Operators which are directly executed to perform shape validation.
 
 ## Dependencies
 

@@ -260,3 +258,5 @@ see <https://unlicense.org/> or the accompanying {file:LICENSE} file.
 [YARD]: https://yardoc.org/
 [YARD-GS]: https://rubydoc.info/docs/yard/file/docs/GettingStarted.md
 [PDD]: https://unlicense.org/#unlicensing-contributions
+[PEG]: https://en.wikipedia.org/wiki/Parsing_expression_grammar "Parsing Expression Grammar"
+[S-Expression]: https://en.wikipedia.org/wiki/S-expression

Rakefile

File mode: 100755 → 100644
Lines changed: 9 additions & 7 deletions
@@ -22,39 +22,41 @@ task default: :spec
 RSpec::Core::RakeTask.new(:spec)
 
 desc 'Create versions of ebnf files in etc'
-task etc: %w{etc/shex.sxp etc/shex.html etc/shex.ll1.sxp}
+task etc: %w{etc/shex.sxp etc/shex.html etc/shex.peg.sxp}
 
 desc 'Build first, follow and branch tables'
 task meta: "lib/shex/meta.rb"
 
 file "lib/shex/meta.rb" => "etc/shex.ebnf" do |t|
   sh %{
-    ebnf --ll1 shexDoc --format rb \
+    ebnf --peg --format rb \
+      --input-format native \
       --mod-name ShEx::Meta \
       --output lib/shex/meta.rb \
       etc/shex.ebnf
   }
 end
 
-file "etc/shex.ll1.sxp" => "etc/shex.ebnf" do |t|
+file "etc/shex.peg.sxp" => "etc/shex.ebnf" do |t|
   sh %{
-    ebnf --ll1 shexDoc --format sxp \
-      --output etc/shex.ll1.sxp \
+    ebnf --peg --format sxp \
+      --input-format native \
+      --output etc/shex.peg.sxp \
       etc/shex.ebnf
   }
 end
 
 file "etc/shex.sxp" => "etc/shex.ebnf" do |t|
   sh %{
-    ebnf --bnf --format sxp \
+    ebnf --input-format native --format sxp \
       --output etc/shex.sxp \
       etc/shex.ebnf
   }
 end
 
 file "etc/shex.html" => "etc/shex.ebnf" do |t|
   sh %{
-    ebnf --format html \
+    ebnf --input-format native --format html \
       --output etc/shex.html \
       etc/shex.ebnf
   }

VERSION

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-0.6.2
+0.6.3

etc/shex.ebnf

Lines changed: 20 additions & 94 deletions
@@ -1,75 +1,3 @@
-# Yacker grammer for (Sh)ape (Ex)pressions (C)ompact language
-#
-# Copyright 2015, Eric Prud'hommeax, Harold Solbrig, Iovka Boneva, Jose Labra Gayo
-# All rights reserved; please contact copyright holders for any use outside of the ShEx semantics.
-# http://www.w3.org/2005/01/yacker/uploads/ShEx2/bnf
-#
-# yacker: https://www.w3.org/2005/01/yacker/uploads/ShEx3?lang=perl
-# repo: https://github.com/shexSpec/shexTest/
-# branch: TEisSE
-# TODO:
-# annotations on shapes
-# look hard at EXTERNAL valueClassDefinitions
-#
-# Changes from yacker ShEx2:
-#
-# + '*' in 2nd pos of REPEAT_RANGE so e.g. {2,*} === {2,}
-# ~ code names are iris. this changes existing semantic actions as follows:
-# <S> { :p1 . %GenX{ blah blah blah %} }
-# PREFIX GenX: <http://...GenX> <S> { :p1 . %GenX:{ blah blah blah %} }
-# empty code names %{ ... %} are parsed as empty relative URLs: %<>{ ... %}
-# + ATPNAME_NS, ATPNAME_LN terminals to parse e.g. @ex:foo which looks like a LANGTAG
-# ~ a is an iri (not just for predicates) enabling e.g. <reif> { rdf:predicate (a) }
-# ~ groupShapeConstr can only have ORs (no ANDs) so no <S1> { :p1 @<S2> AND @<S3> }
-# + includeSet parallels inclPropertySet (&<A><B><C>) e.g. <S3> &<S1> <S2> { ... }
-# + factored out (labeled) productions for startActions and semanticActions
-# ~ BNodes can have stringFacets
-# - ANON terminal (not referenced)
-# ~ tripleConstraint: swapped position of annotations and cardinality (EGP 20150727)
-# + valueClass: add xsFacet* to datatype (EGP 20150727)
-# + escapes (\\, \uxxxx, \Uxxxxxxxx) in CODE (EGP 20150730)
-# ~ moved | RDF_TYPE from iri: to predicate: (disables e.g. %a{ ... %}) (EGP 20150807)
-# + tried to intercept non-terminated STRING_LITERAL_LONG{1,2} (EGP 20150807)
-# - unaryShape: id (EGP 20150818)
-# + unaryShape: annotation* (EGP 20150818)
-# + valueClassLabel, valueClassDefinition, valueClassExpr, valueClassOrRef (EGP 20150818)
-# + factored out notStartAction (EGP 20150818)
-# - oneOf (EGP 20150820)
-# ~ split out encapsulatedShape (EGP 20150831)
-# - numericRange takes a numericLiteral, was INTEGER (EGP 20150917)
-# + innerShape ::= multiElementGroup | multiElementSomeOf (EGP 20151022)
-# ~ annotation ::= ';' predicate (iri | literal) -- was iri (EGP 20151022)
-# + generalized AND/OR expressions on valueClasses (EGP 20151031)
-# - groupShapeConstr (shapeOrRef (OR shapeOrRef)*) (EGP 20151031)
-# + stringFacet* on negatableValueClass ::= shapeOrRef (EGP 20151101)
-# ~ vcand and vcor are mutually exclusive (i.e. not recursive) (EGP 20151111)
-# ~ s/valueClassDefinition/valueExprDefinition/ (EGP 20151120)
-# ~ s/valueClassLabel/valueExprLabel/ (EGP 20151120)
-# ~ change valueSet delimiters from ()s to []s (EGP 20160104)
-# ~ codeDecls can have no code (implies code ref) (EGP 20160128)
-# + ()s around valueExprs (EGP 20160520)
-# + shapeDefinition += nonLiteralKind? stringFacet* (EGP 20160520)
-# - shape == "VIRTUAL"? (EGP 20160520)
-# + ';' separator for groups (EGP 20160524)
-# ~ s{;}{//} for annotations (EGP 20160524)
-# + ("AND" shapeDefinition)* (EGP 20160524)
-# + ("OR" shapeDefinition)* (EGP 20160615)
-# ~ generalized shapeExpression to include valueExprDefinition (EGP 20160708)
-# ~ reordered (EGP 20160913)
-# ~ s/ShapeDisjunction/shapeOr/ (EGP 20160913)
-# ~ s/shapeConjunction/shapeAnd/ (EGP 20160913)
-# ~ s/negShapeAtom/shapeNot/ (EGP 20160913)
-# ~ s/inclPropertySet/extraPropertySet/ (EGP 20160913)
-# ~ s/value/valueSetValue/ (EGP 20160913)
-# ~ updated production labels (EGP 20160913)
-# ~ reorder to align with spec (EGP 20160930)
-# ~ s/someOfShape/someOfTripleExpr/ (EGP 20160930)
-# ~ s/innerShape/innerTripleExpr/ (EGP 20160930)
-# ~ s/groupShape/groupTripleExpr/ (EGP 20160930)
-# ~ s/unaryShape/unaryTripleExpr/ (EGP 20160930)
-# ~ s/encapsulatedShape/bracketedTripleExpr/ (EGP 20160930)
-# ~ s/SomeOf/OneOf/g (EGP 20161201)
-
 # Notation:
 # in-line terminals in ""s are case-insensitive
 # production numbers ending in t or s are from Turtle or SPARQL.
@@ -87,15 +15,13 @@
 
 [8] statement ::= directive | notStartAction
 
-[9] shapeExprDecl ::= shapeLabel (shapeExpression|"EXTERNAL")
-[10] shapeExpression ::= shapeAtomNoRef shapeOr?
-| "NOT" (shapeAtomNoRef | shapeRef) shapeOr?
+[9] shapeExprDecl ::= shapeExprLabel (shapeExpression | "EXTERNAL")
+[10] shapeExpression ::= "NOT"? shapeAtomNoRef shapeOr?
+| "NOT" shapeRef shapeOr?
 | shapeRef shapeOr
 [11] inlineShapeExpression ::= inlineShapeOr
-# [12] shapeOr ::= shapeAnd ("OR" shapeAnd)*
-[12] shapeOr ::= shapeOrA | shapeOrB shapeOrA?
-[12a] shapeOrA ::= ("OR" shapeAnd)+
-[12b] shapeOrB ::= ("AND" shapeNot)+
+[12] shapeOr ::= ("OR" shapeAnd)+
+| ("AND" shapeNot)+ ("OR" shapeAnd)*
 [13] inlineShapeOr ::= inlineShapeAnd ("OR" inlineShapeAnd)*
 [14] shapeAnd ::= shapeNot ("AND" shapeNot)*
 [15] inlineShapeAnd ::= inlineShapeNot ("AND" inlineShapeNot)*
@@ -119,9 +45,10 @@
 
 [21] shapeOrRef ::= shapeDefinition | shapeRef
 [22] inlineShapeOrRef ::= inlineShapeDefinition | shapeRef
-[23] shapeRef ::= ATPNAME_LN | ATPNAME_NS | '@' shapeLabel
+[23] shapeRef ::= ATPNAME_LN | ATPNAME_NS | '@' shapeExprLabel
 
 [24] litNodeConstraint ::= "LITERAL" xsFacet*
+| nonLiteralKind stringFacet*
 | datatype xsFacet*
 | valueSet xsFacet*
 | numericFacet+
@@ -137,20 +64,19 @@
 [31] numericRange ::= "MININCLUSIVE" | "MINEXCLUSIVE" | "MAXINCLUSIVE" | "MAXEXCLUSIVE"
 [32] numericLength ::= "TOTALDIGITS" | "FRACTIONDIGITS"
 
-[33] shapeDefinition ::= (includeSet | extraPropertySet | "CLOSED")* '{' tripleExpression? '}' annotation* semanticActions
-[34] inlineShapeDefinition ::= (includeSet | extraPropertySet | "CLOSED")* '{' tripleExpression? '}'
+[33] shapeDefinition ::= (extraPropertySet | "CLOSED")* '{' tripleExpression? '}' annotation* semanticActions
+[34] inlineShapeDefinition ::= (extraPropertySet | "CLOSED")* '{' tripleExpression? '}'
 [35] extraPropertySet ::= "EXTRA" predicate+
 
 [36] tripleExpression ::= oneOfTripleExpr
 
-# First/First Conflicts on "&", "$", "(", "^", :RDF_TYPE , :IRIREF, :PNAME_LN and :PNAME_NS
+# oneOfTripleExpr and multiElementOneOf both start with groupTripleExpr
 #[37] oneOfTripleExpr ::= groupTripleExpr | multiElementOneOf
 #[38] multiElementOneOf ::= groupTripleExpr ('|' groupTripleExpr)+
 #[39] innerTripleExpr ::= multiElementGroup | multiElementOneOf
 [37] oneOfTripleExpr ::= groupTripleExpr ('|' groupTripleExpr)*
 
-# First/First Conflicts on "&", "$", "(", "^", :RDF_TYPE , :IRIREF, :PNAME_LN and :PNAME_NS
-# First/Follow Conflict on ";"
+# singleElementGroup and multiElementGroup both start with unaryTripleExpr
 #[40] groupTripleExpr ::= singleElementGroup | multiElementGroup
 #[41] singleElementGroup ::= unaryTripleExpr ';'?
 #[42] multiElementGroup ::= unaryTripleExpr (';' unaryTripleExpr)+ ';'?
@@ -179,7 +105,7 @@
 [55] languageRange ::= LANGTAG ('~' languageExclusion*)?
 [56] languageExclusion ::= '-' LANGTAG '~'?
 
-[57] include ::= '&' shapeLabel
+[57] include ::= '&' tripleExprLabel
 
 [58] annotation ::= '//' predicate (iri | literal)
 [59] semanticActions ::= codeDecl*
@@ -188,20 +114,20 @@
 [13t] literal ::= rdfLiteral | numericLiteral | booleanLiteral
 [61] predicate ::= iri | RDF_TYPE
 [62] datatype ::= iri
-[63] shapeLabel ::= iri | blankNode
-[16t] numericLiteral ::= INTEGER | DECIMAL | DOUBLE
-[129s] rdfLiteral ::= langString | string ('^^' datatype)?
+[63] shapeExprLabel ::= iri | blankNode
+[64] tripleExprLabel ::= iri | blankNode
+
+[16t] numericLiteral ::= DOUBLE | DECIMAL | INTEGER
+[65] rdfLiteral ::= langString | string ('^^' datatype)?
 [134s] booleanLiteral ::= 'true' | 'false'
-[135s] string ::= STRING_LITERAL1 | STRING_LITERAL_LONG1
-| STRING_LITERAL2 | STRING_LITERAL_LONG2
-[66] langString ::= LANG_STRING_LITERAL1 | LANG_STRING_LITERAL_LONG1
+[135s] string ::= STRING_LITERAL_LONG1 | STRING_LITERAL_LONG2
+| STRING_LITERAL1 | STRING_LITERAL2
+[66] langString ::= LANG_STRING_LITERAL1 | LANG_STRING_LITERAL_LONG1
 | LANG_STRING_LITERAL2 | LANG_STRING_LITERAL_LONG2
 [136s] iri ::= IRIREF | prefixedName
 [137s] prefixedName ::= PNAME_LN | PNAME_NS
 [138s] blankNode ::= BLANK_NODE_LABEL
 
-# Reserved for future use
-#[65] includeSet ::= "&" (shapeLabel)+
 @terminals
 
 [67] CODE ::= '{' ([^%\\] | '\\' [%\\] | UCHAR)* '%' '}'
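
Note on the reordered alternatives in `numericLiteral` and `string`: a PEG parser tries alternatives in order and commits to the first one that matches, so a shorter terminal listed first can shadow a longer one. That is presumably why the new grammar lists `DOUBLE` before `DECIMAL` before `INTEGER`, and the `_LONG` string terminals before the short ones. The Ruby sketch below illustrates the effect under stated assumptions: the regular expressions are simplified stand-ins for the Turtle numeric terminals, and `ordered_choice` is a hypothetical helper, not part of the ShEx or EBNF gems.

```ruby
# Simplified stand-ins for the numeric terminals
# (assumption: these are not the gem's actual patterns).
INTEGER = /\A[+-]?[0-9]+/
DECIMAL = /\A[+-]?[0-9]*\.[0-9]+/
DOUBLE  = /\A[+-]?(?:[0-9]+\.[0-9]*|\.[0-9]+|[0-9]+)[eE][+-]?[0-9]+/

# Order of alternatives in numericLiteral before and after this commit.
OLD_ORDER = [[:INTEGER, INTEGER], [:DECIMAL, DECIMAL], [:DOUBLE, DOUBLE]].freeze
NEW_ORDER = [[:DOUBLE, DOUBLE], [:DECIMAL, DECIMAL], [:INTEGER, INTEGER]].freeze

# Hypothetical helper implementing PEG-style ordered choice: returns the name
# and matched text of the first alternative that matches at the start of
# +input+, or nil when no alternative matches.
def ordered_choice(input, alternatives)
  alternatives.each do |name, pattern|
    match = pattern.match(input)
    return [name, match[0]] if match
  end
  nil
end

p ordered_choice("1.5e3", OLD_ORDER) # => [:INTEGER, "1"]
p ordered_choice("1.5e3", NEW_ORDER) # => [:DOUBLE, "1.5e3"]
p ordered_choice("42", NEW_ORDER)    # => [:INTEGER, "42"]
```

Under ordered choice, the pre-commit order would consume only `1` from `1.5e3` and leave `.5e3` behind, whereas the post-commit order consumes the whole literal and still falls through to `INTEGER` for plain integers.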
