Command line interface
This page will thoroughly present the command line interface of Chatette.
Chatette is available for both Python 2.7 and 3.x (>=3.3).
In all the commands shown below, we use python as an interpreter. Depending on your operating system and particular settings, this command can run any version of Python. Some operating systems use python3 to directly refer to Python 3.x.
As explained in the readme, you can install this program from PyPI using the following command:
pip install chatette
Alternatively, you can clone this repository and install the requirements using pip:
pip install -r requirements/common.txt
You can then install the project (as an editable package) using pip, by executing the following command from the directory Chatette/chatette/:
pip install -e .
You will then be able to use the other commands presented here by going to the root directory of the project.
To execute the script on a template file, you can run the following command:
python -m chatette <path-to-template-file>
Several flags can be used with this command:
- -h or --help prints the help for the Chatette package.
- -v or --version prints the version number of the Chatette package.
- -o or --output, followed by the directory in which the output files should be saved.
- -s or --seed, followed by any string of characters (without whitespace). This string is used as a seed for the random generator: using the same seed on the same input file(s) generates the same output(s).
- -l or --local changes the output path to be specified with respect to the directory in which the template file is, rather than the current working directory.
- -a or --adapter, followed by the name of an adapter, changes the format of the generated examples. Two adapters are currently available: jsonl and rasa. They are described in the next section.
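The flags above can be combined in a single call. As a rough sketch (not part of Chatette itself), the hypothetical helper below assembles the argument list for such a call from Python. The helper name, its parameters and the file names in the usage note are assumptions made for illustration; only the flags come from the list above.

```python
def build_chatette_command(template_path, output_dir=None, seed=None,
                           adapter=None, local=False, interpreter="python"):
    """Assemble the argument list for a `python -m chatette` call.

    Hypothetical helper: it only uses the flags documented above.
    """
    command = [interpreter, "-m", "chatette", template_path]
    if output_dir is not None:
        command += ["-o", output_dir]
    if seed is not None:
        # The documentation states that the seed contains no whitespace.
        if any(char.isspace() for char in seed):
            raise ValueError("the seed must not contain whitespace")
        command += ["-s", seed]
    if local:
        command.append("-l")
    if adapter is not None:
        # Only the two adapters named above are accepted here.
        if adapter not in ("jsonl", "rasa"):
            raise ValueError("unknown adapter: " + adapter)
        command += ["-a", adapter]
    return command
```

For instance, build_chatette_command("master.chatette", output_dir="data", seed="my-seed", adapter="rasa") returns the arguments of python -m chatette master.chatette -o data -s my-seed -a rasa, which could then be handed to subprocess.run.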
The jsonl adapter writes the output to two files: one contains the generated examples and has the extension .jsonl; the other contains the lists of synonyms and is named synonyms.json. The examples are written in their internal representation, as JSON with little redundancy in the data. If no synonyms exist, the file synonyms.json isn't created.
The rasa adapter produces a .json file which can be directly fed to Rasa NLU: it contains the generated examples and the lists of synonyms in the JSON format that Rasa NLU uses.
If no output directory is provided, the outputs are written by default to the directory output/ (within the directory Chatette was called from). Within this directory (or the provided directory), two new directories are created: train/, which contains the training datasets, and test/, which contains the testing datasets (if any). The datasets contain the generated examples (and possibly a list of entity synonyms) and are stored in a file called output.json or output.jsonl. If more than 10,000 examples are generated, they are split across several files named output.X.json or output.X.jsonl, where X is a number.
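To make this layout concrete, here is a minimal sketch that lists the dataset files found under train/ and test/ after a run. The function name is hypothetical, and the sketch only relies on the naming scheme described above (output.json, output.jsonl, output.X.json, output.X.jsonl); it deliberately ignores any other file.

```python
from pathlib import Path


def list_generated_files(output_dir="output"):
    """Return the dataset file names found in train/ and test/.

    Hypothetical helper: it only looks for files following the naming
    scheme described above and returns an empty list for a missing folder.
    """
    found = {}
    for split in ("train", "test"):
        split_dir = Path(output_dir) / split
        names = []
        if split_dir.is_dir():
            names = sorted(
                path.name
                for path in split_dir.iterdir()
                if path.name.startswith("output.")
                and path.suffix in (".json", ".jsonl")
            )
        found[split] = names
    return found
```

Calling list_generated_files() with no argument inspects the default output/ directory relative to the current working directory.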
Starting with version 1.4.0, Chatette implements an interactive mode, available from the command line.
To enter this mode, execute Chatette with the program option -i or --interactive, as such:
python -m chatette -i <path-to-template-file>
or
python -m chatette --interactive <path-to-template-file>
In this mode, no output will be generated (unless you ask it to).
The program parses the template file(s) and we get to a command prompt:
Chatette v1.4.0 running in *interactive mode*.
[DBG] Parsing master file: <path-to-template-file>
[DBG] Parsing finished!
>>> _
It is possible that arrow keys don't work in this prompt depending on which terminal emulator you use.
There, several commands can be run.
All commands start with a word that corresponds to the specific operation to be executed.
Except for a few specific commands, most of them are formatted in the following way:
<operation> <unit-type> "<unit-name>" or <operation> <unit-type> /<unit-name>/<flags>.
<unit-type> can be one of the following:
- alias and ~ refer to aliases
- slot and @ refer to slots
- intent and % refer to intents
You can append > <redirection-file-path> or >> <redirection-file-path> to any command in order for its output (that would normally be printed onto stdout) to be redirected to a file. If the file doesn't exist, it will be created. > will truncate this file, while >> appends to it. If you append > or >> to a command without providing a redirection file, the command runs in silent mode and prints no output anywhere (this is equivalent to using > /dev/null on Unix systems).
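The three redirection cases can be summarised with a small sketch. This is not how Chatette parses commands internally; it is a hypothetical helper that only illustrates the rules above, under the simplifying assumption that the command itself contains no > character.

```python
def split_redirection(line):
    """Split a command line into (command, mode, target).

    mode is None (no redirection), "truncate" (>) or "append" (>>).
    target is None when no file is given, which means silent mode.
    Illustrative only: assumes the command itself contains no '>'.
    """
    command, separator, target = line.partition(">")
    if not separator:
        return line.strip(), None, None
    mode = "truncate"
    if target.startswith(">"):
        mode = "append"
        target = target[1:]
    target = target.strip()
    return command.strip(), mode, target or None
```

With this helper, a bare trailing > yields a mode but no target, which corresponds to the silent mode described above.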
Here is an exhaustive list of the commands that are currently available:
- exit or Ctrl+D (EOF) stops the interactive mode (and the script). Ctrl+C works as well, but stops Chatette abruptly without any exit message.
- stats displays statistics about the parsed file(s), namely the number of parsed files, defined units, aliases, slots and intents.
- parse <file-path> parses the file at <file-path> (relative to the directory the program is executed from).
- exist alias "<alias-name>" asks the program whether an alias named <alias-name> was parsed. If it was, the program gives information about the alias (namely its name, modifiers and number of rules); otherwise, it says that the alias doesn't exist. The same kind of command exists for slots and intents: exist slot "<slot-name>" and exist intent "<intent-name>".
- show alias "<alias-name>" shows the rules that define the alias <alias-name>. At most 12 rules are printed out. If the alias doesn't exist, an error is printed. The same command exists for slots and intents: show slot "<slot-name>" and show intent "<intent-name>".
- rename alias "<alias-name>" "<new-alias-name>" changes the name of the alias <alias-name> to <new-alias-name>. Similar commands exist for slots and intents.
- delete alias "<alias-name>" completely removes the alias <alias-name> from the parser's memory (as if it hadn't been in the parsed template file(s)). The same thing can be done for slots and intents.
- hide alias "<alias-name>" temporarily removes the alias <alias-name> from the parser's memory. unhide alias "<alias-name>" restores this alias definition (and can be executed after any other units have been hidden). The same thing can be done for slots and intents.
- examples alias "<alias-name>" lists all the possible strings generated when referring to the alias <alias-name>. examples slot "<slot-name>" and examples intent "<intent-name>" exist as well. An error is printed if the alias/slot/intent doesn't exist.
  If you add a number X at the end of one of those commands (separated from the command by a whitespace), the answer is limited to X strings (selected randomly from the possible strings). If X is larger than the number of possible strings, the command simply returns all the possible strings.
- generate <adapter> alias "<alias-name>" generates all possible examples and formats them using the adapter <adapter>. Similar commands exist for slots and intents. As for the command examples, you can add a number at the end of the command (separated from it by a whitespace) to limit the number of examples to be generated.
  Two adapters currently exist: rasa and jsonl.
  generate alone executes the generation that would have run in non-interactive mode. Adding the adapter's name makes this generation use this adapter. The output is written to the output path given as a program option (or to the default path if none was provided).
- declare alias "<alias-name>" creates a new alias named <alias-name> in the parser. It will have no rules and no modifiers. Similar commands exist for slots and intents.
- add-rule alias "<alias-name>" "<rule>" adds a rule to the definition of an alias named <alias-name> (if it exists). Similar commands exist for slots and intents.
- set-modifier alias "<alias-name>" <modifier-name> "<value>" changes the value of the modifier <modifier-name> of the alias <alias-name>. Similar commands exist for slots and intents. Since those are unit declarations, the only modifiers they accept are "case generation" (with values "True" or "False", case-insensitive) and "argument identifier" (with any string as a value). Modifier names are thus casegen and arg (or alternatively & and $, if you want to be more concise).
  Beware that you could get exceptions later on if you set modifiers to invalid or incoherent values. The checks on those values are rather permissive.
- rule "<rule>" generates the rule using all the units that have been defined in the template file(s). Its output can be redirected to a file as described above. If you need to use double quotes in the rule, escape them with a backslash (\).
- save <file-path> creates (or truncates) a file at <file-path> and saves the configuration of the parser as templates. You will then be able to use this file as a template file later on, with all the units you declared and modified during the interactive session.
- execute "<path-to-file>" reads the file <path-to-file> and executes all the commands inside it sequentially. The commands and their results are printed on the command line as they are executed, unless this command or each command in the file is redirected. The commands in the file should be separated by newlines (thus, one command per line). Lines starting with // are ignored.
  The execution of a file stops as soon as an exit command is executed from the file or when the file has been entirely read.
  Chatette can also directly read the commands from this file by calling the script with:
  python -m chatette <path-to-template-file> -I <path-to-command-file>
  or
  python -m chatette <path-to-template-file> --interactive-commands-file <path-to-command-file>
  Adding the option -i or --interactive has no effect in this case.
  The interactive mode is entered after all the commands in the file have been executed, unless an exit command is executed in the file (in which case the script stops immediately).
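A command file for execute or -I is plain text with one command per line. As an illustration, the hypothetical helper below renders such a file from a list of commands; the helper name and the sample commands in the usage note are assumptions, while the one-command-per-line layout and the // comment prefix come from the description above.

```python
def render_command_file(commands, comments=()):
    """Return the text of a command file: comments first, then commands.

    Hypothetical helper: each command must fit on one line, since
    commands in the file are separated by newlines.
    """
    lines = ["// " + comment for comment in comments]
    for command in commands:
        if "\n" in command:
            raise ValueError("each command must fit on a single line")
        lines.append(command)
    return "\n".join(lines) + "\n"
```

The returned text can be written to a file of your choice and passed to Chatette with -I, or run from the prompt with execute. For example, render_command_file(["stats", "exit"], comments=["quick check"]) produces a three-line file whose first line is ignored as a comment.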
For all those commands (except exit), appending > <filename> or >> <filename> will respectively write the results into a file named <filename> (creating the file if it doesn't exist, overwriting it if it does) or append the results into a file named <filename> (creating it if it doesn't exist).
Moreover, in all those commands, you can replace the keywords alias, slot and intent by ~, @ and % respectively. The whitespace around those characters should still be present.
When selecting a unit in most of the commands presented above, you need to provide the name of a unit. If you are talking about a specific unit, simply use the syntax "<unit-name>" where the name of the unit is surrounded with double quotes.
Alternatively, you can use regexes in all those commands, using the form /regex/flags (no double quotes), where regex is the regular expression (written as the Python package re expects it) and flags can be any combination of the following:
- i: the regex is case-insensitive
- g: the search looks for all matches and not just a match at the beginning of each unit name
As examples, /s.me/ would match units named some and same; /s.me/i could match some, same and SOME; /s.me/g would match some, same and awesome; /s.me/ig would match some, same, SOME, awesome and Awesome.
The command is then run against all the units whose name contains at least one match.
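The effect of the two flags can be reproduced with the re package. The sketch below is an approximation and not Chatette's own code: it assumes that, without g, the pattern must match at the beginning of the unit name (as re.match does) and that, with g, it may match anywhere in the name (as re.search does). The function name and the sample unit names are hypothetical.

```python
import re


def select_units(unit_names, pattern, flags=""):
    """Return the unit names selected by /pattern/flags.

    Approximation of the behaviour described above:
    - "i" makes the regex case-insensitive;
    - "g" searches the whole name instead of only its beginning.
    """
    regex = re.compile(pattern, re.IGNORECASE if "i" in flags else 0)
    matcher = regex.search if "g" in flags else regex.match
    return [name for name in unit_names if matcher(name)]


names = ["some", "same", "SOME", "awesome", "AWESOME"]
print(select_units(names, "s.me"))        # ['some', 'same']
print(select_units(names, "s.me", "i"))   # ['some', 'same', 'SOME']
print(select_units(names, "s.me", "g"))   # ['some', 'same', 'awesome']
print(select_units(names, "s.me", "ig"))  # all five names
```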