Mti

This is my web parser, written in Julia using the Genie framework.

Parser Utility

I'm developing a parser for the L203 classes that can be fed rules.

Design specs

Original Page Layout

If generate is selected, the lexical entries and syntactic rules are used to generate a number of random sentences in the bottom-left box. If parse is selected, the sentence or sentences in the bottom-left box are parsed using the specified rules, and the resulting SVG is placed in the box on the right. The depth limit keeps recursive rules from causing problems in the generated parse.

There was also a timeout: if parsing a sentence took too long, the parse was simply aborted.
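To make the role of the depth limit concrete, here is a minimal sketch of depth-limited random generation. It is written in Python purely for illustration and is not the project's Julia code; the names `generate`, `rules`, `lexicon`, `depth_limit` and `rng`, and the choice to return `None` when the limit is hit, are all assumptions.

```python
import random

def generate(symbol, rules, lexicon, depth_limit, rng):
    """Randomly expand `symbol` into a list of words.

    Returns None when the expansion would go deeper than `depth_limit`
    (an assumed convention, not taken from the project).
    """
    if depth_limit < 0:
        return None
    # A symbol may be rewritten as a word (lexical rule) or as a
    # sequence of constituents (syntactic rule).
    options = [("word", w) for w in lexicon.get(symbol, [])]
    options += [("rule", rhs) for rhs in rules.get(symbol, [])]
    if not options:
        raise KeyError(f"no rule for symbol {symbol!r}")
    kind, choice = rng.choice(options)
    if kind == "word":
        return [choice]
    words = []
    for child in choice:
        sub = generate(child, rules, lexicon, depth_limit - 1, rng)
        if sub is None:  # limit reached somewhere below: give up
            return None
        words.extend(sub)
    return words

# Hypothetical example grammar.
example_rules = {"S": [["NP", "VP"]]}
example_lexicon = {"NP": ["John", "Mary"], "VP": ["eat", "sleep"]}
sentence = generate("S", example_rules, example_lexicon, 5, random.Random(0))
print(" ".join(sentence))
```

A purely recursive rule such as X -> X X never reaches a word, so with this sketch it returns `None` once the limit is exhausted instead of recursing forever.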

Rule format

The rule input could be of two types: lexical rules and syntactic rules.

Lexical rules

These rules consist of a lexical category label, a colon, and then a word that belongs to that category.

For example:

N : dog

In addition, a set can be specified:

N : {dog, cat}

This means that both dog and cat are Ns.
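The lexical rule format can be read with a few lines of string handling. The sketch below is in Python for illustration only and is not the project's Julia code; `parse_lexical_rule` is a hypothetical name, and the handling of whitespace around the colon and braces is an assumption.

```python
def parse_lexical_rule(line):
    """Parse 'N : dog' or 'N : {dog, cat}' into (category, [words])."""
    category, sep, body = line.partition(":")
    category = category.strip()
    body = body.strip()
    if not sep or not category:
        raise ValueError(f"not a lexical rule: {line!r}")
    if body.startswith("{") and body.endswith("}"):
        # Set form: split the members on commas.
        words = [w.strip() for w in body[1:-1].split(",")]
    else:
        # Single-word form.
        words = [body]
    words = [w for w in words if w]
    if not words:
        raise ValueError(f"lexical rule has no words: {line!r}")
    return category, words

print(parse_lexical_rule("N : dog"))
print(parse_lexical_rule("N : {dog, cat}"))
```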

Syntactic rules

These rules consist of a syntactic category label, an arrow, and then the component constituents. For example:

NP -> Det N
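A syntactic rule can be split in the same spirit. Again this is an illustrative Python sketch rather than the project's Julia code; `parse_syntactic_rule` is a hypothetical name, and it assumes constituents are separated by whitespace and carry no optionality markers.

```python
def parse_syntactic_rule(line):
    """Parse 'NP -> Det N' into (label, [constituents])."""
    lhs, sep, rhs = line.partition("->")
    label = lhs.strip()
    constituents = rhs.split()
    if not sep or not label or not constituents:
        raise ValueError(f"not a syntactic rule: {line!r}")
    return label, constituents

print(parse_syntactic_rule("NP -> Det N"))
```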

Scope of rules

Should be able to handle

Gameplan (in order of priority)

Utilities

Sentence parsing

Tree generation

Sentence generation

Handling Optionality

The issue is that optional pieces can occur in rules of any length. For example, we could have NP -> (D)(Adj)(Mod)N.

The possible options for this are:

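This can be seen as applying a negated bit mask over the optional elements, with the non-optional elements interpolated.

If the bit at position n in the list of optionals is 1, then the optional at that position is removed. A rule with k optional elements therefore expands into 2^k plain rules, one per mask value.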
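The mask idea can be sketched as follows. This is illustrative Python, not the project's Julia code; `expand_optionals` and `TOKEN` are hypothetical names, and treating bit 0 as the first optional in the rule is an assumption about the bit order.

```python
import re

# An optional constituent is written in parentheses, e.g. "(Adj)".
TOKEN = re.compile(r"\(([^()\s]+)\)|([^()\s]+)")

def expand_optionals(rhs):
    """Expand a right-hand side with optionals into all plain variants."""
    tokens = [(opt or req, bool(opt)) for opt, req in TOKEN.findall(rhs)]
    n_optional = sum(1 for _, is_optional in tokens if is_optional)
    expansions = []
    for mask in range(2 ** n_optional):
        variant = []
        position = 0  # index into the list of optionals
        for symbol, is_optional in tokens:
            if is_optional:
                removed = (mask >> position) & 1
                position += 1
                if removed:
                    continue  # bit is 1: drop this optional
            variant.append(symbol)  # required, or optional that is kept
        expansions.append(variant)
    return expansions

for variant in expand_optionals("(D)(Adj)(Mod)N"):
    print("NP ->", " ".join(variant))
```

Mask 0 keeps every optional and the all-ones mask leaves only the required N, so the rule above yields eight plain rules.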

Hai's notes

  1. Upper limit for the number of generations:

S -> NP VP
NP:{John, Mary}
VP:{eat, sleep}

When I ask it to generate 20 sentences, it generates many duplicates. Of course, this simple grammar can only produce 4 distinct sentences. Do we want to allow duplicates? Or maybe add a parameter that controls whether duplicates are allowed.
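One possible shape for such a parameter is sketched below. It is illustrative Python, not the project's Julia code; `sample_sentences`, `allow_duplicates`, `max_attempts` and `tiny_sampler` are hypothetical names. The attempt cap is an assumption, added so that asking for more distinct sentences than the grammar can produce still terminates.

```python
import random

def sample_sentences(sample_one, count, allow_duplicates=True, max_attempts=1000):
    """Collect up to `count` sentences from the zero-argument `sample_one`."""
    results = []
    seen = set()
    attempts = 0
    while len(results) < count and attempts < max_attempts:
        attempts += 1
        sentence = sample_one()
        if not allow_duplicates:
            if sentence in seen:
                continue  # skip a sentence we already produced
            seen.add(sentence)
        results.append(sentence)
    return results

def tiny_sampler(rng):
    """Sampler for the four-sentence grammar S -> NP VP from the note."""
    return lambda: f"{rng.choice(['John', 'Mary'])} {rng.choice(['eat', 'sleep'])}"

unique = sample_sentences(tiny_sampler(random.Random(0)), 20, allow_duplicates=False)
print(len(unique), "distinct sentences")
```

With duplicates disallowed the caller may receive fewer sentences than requested, so the interface would need to report that case in some way.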

  2. It doesn't seem to handle Chinese:

The same small grammar, but in Chinese:

S -> NP VP
NP:{张三, 李四}
VP:{吃饭, 睡觉}

This results in:

image.png

  3. Empty lines are not allowed? Any grammar with empty lines gives an error. I think we may want to allow empty lines.

I have mostly fixed 1 and 2. Some small changes still need to be made to CFG.jl to use a font that supports more non-ASCII characters, such as Chinese characters.

3 is not resolved at the moment.
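One possible way to resolve 3 is to drop blank lines before the rules are classified. The sketch below is illustrative Python, not the code in CFG.jl; `split_grammar_lines` is a hypothetical name, and telling the two rule types apart by the presence of an arrow or a colon is an assumption.

```python
def split_grammar_lines(text):
    """Return (kind, line) pairs for a grammar, ignoring empty lines."""
    rules = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # empty or whitespace-only line: skip, do not error
        if "->" in line:
            kind = "syntactic"
        elif ":" in line:
            kind = "lexical"
        else:
            raise ValueError(f"line {number}: unrecognised rule {line!r}")
        rules.append((kind, line))
    return rules

grammar = "S -> NP VP\n\nNP:{John, Mary}\n   \nVP:{eat, sleep}\n"
for kind, line in split_grammar_lines(grammar):
    print(kind, "|", line)
```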

Genie tidbits

CSS and other frontend assets should be modified in the public folder, not the assets/css folder.