Command line interface
This page will thoroughly present the command line interface of Chatette.
Chatette is available for both Python 2.7 and 3.x (>=3.4).
In all the commands shown below, we use python as an interpreter. Depending on your operating system and particular settings, this command can run any version of Python. Some operating systems use python3 to directly refer to Python 3.x.
As explained in the readme, you can install this program from PyPI using the following command:
pip install chatette
Alternatively, you can clone this repository and install the requirements using pip:
pip install -r requirements/common.txt
You can then install the project (as an editable package) using pip, by executing the following command from the directory Chatette/chatette/:
pip install -e .
You will then be able to use the other commands presented here by going to the root directory of the project.
To execute the script on a template file, you can run the following command:
python -m chatette <path-to-template-file>
Several flags can be used with this command:
- -h or --help prints the help for the Chatette package.
- -v or --version prints the version number of the Chatette package.
- -o or --output, followed by the directory in which the output files should be saved.
- -s or --seed, followed by any string of characters (without whitespace). This string will be used as a seed for the random generator: using the same seed on the same input file(s) will generate the same output(s).
- -l or --local changes the output path to be specified with respect to the directory in which the template file is, rather than the current working directory.
- -a or --adapter, followed by the name of an adapter, changes the format of the generated examples. Three adapters are currently available: jsonl, rasa and rasa-md (or rasamd). They are described in the next section.
- --base-file, followed by the path to a base file, which will be extended. Currently only usable with the Rasa adapter. Base files are described here.
- -f or --force prevents Chatette from asking for confirmation before overwriting output files and folders.
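If you want to call the command line from a Python script, the flags above can be assembled programmatically. The following is only a sketch: the helper name build_chatette_command and its parameters are hypothetical (they are not part of Chatette), and it only builds the argument list described on this page.

```python
import sys

# Adapter names listed on this page.
KNOWN_ADAPTERS = {"jsonl", "rasa", "rasa-md", "rasamd"}


def build_chatette_command(template_path, output_dir=None, seed=None,
                           adapter=None, local=False, force=False):
    """Build the argument list for `python -m chatette` (hypothetical helper)."""
    command = [sys.executable, "-m", "chatette", template_path]
    if output_dir is not None:
        command += ["-o", output_dir]
    if seed is not None:
        # The seed must not contain whitespace.
        if any(char.isspace() for char in seed):
            raise ValueError("the seed must not contain whitespace")
        command += ["-s", seed]
    if adapter is not None:
        if adapter not in KNOWN_ADAPTERS:
            raise ValueError("unknown adapter: " + adapter)
        command += ["-a", adapter]
    if local:
        command.append("-l")
    if force:
        command.append("-f")
    return command


# Possible usage (requires Chatette to be installed):
#   import subprocess
#   subprocess.run(build_chatette_command("template.chatette", seed="abc"), check=True)
```

Using sys.executable rather than a literal python makes sure the command runs with the same interpreter as the calling script.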
The files that will be created for both the training and testing data depend on which adapter is used. Currently, all adapters organize the training data and the test data using the same file types and formats. See the following section for a description of how the whole output is organized.
The jsonl adapter writes the output to two files: one contains the generated examples and has the extension .jsonl, the other contains the lists of synonyms and is named synonyms.json. The examples are written in their internal representation, as JSON with little redundancy in the data. If no synonyms exist, the file synonyms.json isn't created.
The rasa adapter makes a .json file which can be directly fed to Rasa NLU: it contains the generated examples and lists of examples in the JSON format that Rasa NLU uses.
The rasa-md or rasamd adapter makes a .md file which can be used as an input to Rasa NLU: it uses the Markdown format described in the official Rasa documentation.
If no output directory was provided, the outputs will by default be in the directory output/ (within the directory Chatette was called from). Within this directory (or the provided directory), two new directories will be created: train/, which will contain the training datasets, and test/, which will contain the testing datasets (if any). The datasets contain the generated examples (and possibly a list of entity synonyms) and will be in a file called output.json or output.jsonl. If more than 10,000 examples are generated, several files will contain them and will be named output.X.json or output.X.jsonl, where X is a number.
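To illustrate the layout described above, here is a small sketch that collects the dataset files from an output directory. The function name find_dataset_files is hypothetical, and the sketch only looks for the file names mentioned in this section (output.json, output.jsonl and their numbered variants).

```python
from pathlib import Path


def find_dataset_files(output_dir="output"):
    """Return the dataset files found in train/ and test/ (hypothetical helper)."""
    root = Path(output_dir)
    found = {}
    for split in ("train", "test"):
        split_dir = root / split
        files = []
        # test/ may be absent if no testing dataset was generated.
        if split_dir.is_dir():
            for path in sorted(split_dir.iterdir()):
                # Matches output.json, output.jsonl, output.X.json and output.X.jsonl;
                # other files such as synonyms.json are skipped.
                if path.name.startswith("output.") and path.suffix in (".json", ".jsonl"):
                    files.append(path)
        found[split] = files
    return found
```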
If you need the program to have a more specific behavior, you will need to use Chatette in your own code, as described on this page.
Starting with version 1.4.0, Chatette implements an interactive mode, available from the command line.
To enter this mode, execute Chatette with the program option -i or --interactive, as follows:
python -m chatette -i <path-to-template-file>
or
python -m chatette --interactive <path-to-template-file>
In this mode, no output will be generated (unless you ask for it).
The program parses the template file(s) and then displays a command prompt:
Chatette v1.4.0 running in *interactive mode*.
[DBG] Parsing master file: <path-to-template-file>
[DBG] Parsing finished!
>>> _
Depending on which terminal emulator you use, arrow keys may not work in this prompt.
Note that these features are subject to change. In particular, the layout of command outputs may very well change between versions of the program.
Several commands can be run from this prompt.
All commands start with a word that corresponds to the specific operation to be executed.
Except for a few specific commands, most of them are formatted in one of the following ways:
<operation> <unit-type> "<unit-name>"
or
<operation> <unit-type> /<unit-name>/<flags>
<unit-type> can be one of the following:
- alias and ~ refer to aliases
- slot and @ refer to slots
- intent and % refer to intents
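The general command shape can be illustrated with a short tokenizer. This is only a sketch of the quoted-name form: the function split_command is hypothetical, it is not how Chatette parses commands internally, and it does not cover the regex form or commands with a different layout (such as generate, which takes an adapter name first).

```python
import shlex

# Both the keyword and the symbol designate the same unit type.
UNIT_TYPES = {
    "alias": "alias", "~": "alias",
    "slot": "slot", "@": "slot",
    "intent": "intent", "%": "intent",
}


def split_command(line):
    """Split '<operation> <unit-type> "<unit-name>" ...' into its parts."""
    tokens = shlex.split(line)  # handles the double quotes around names
    if len(tokens) < 3:
        raise ValueError("expected an operation, a unit type and a unit name")
    operation, unit_type, unit_name = tokens[0], tokens[1], tokens[2]
    if unit_type not in UNIT_TYPES:
        raise ValueError("unknown unit type: " + unit_type)
    # Any remaining tokens are extra arguments (e.g. the new name for rename).
    return operation, UNIT_TYPES[unit_type], unit_name, tokens[3:]
```

For instance, split_command('show ~ "my alias"') and split_command('show alias "my alias"') produce the same tuple.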
You can append > <redirection-file-path> or >> <redirection-file-path> to any command in order for its output (that would normally be printed to stdout) to be redirected to a file. If the file doesn't exist, it will be created. > will truncate this file, while >> appends to it. If you just append > or >> to a command without providing a redirection file, the command will run in silent mode, not printing any output anywhere (this is equivalent to using > /dev/null on Unix systems).
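The redirection rules above can be summarized in code. The sketch below is an illustration under assumptions, not Chatette's implementation: the function split_redirection is hypothetical, and it assumes that a > character inside double quotes belongs to the command rather than to a redirection.

```python
def split_redirection(line):
    """Return (command, mode, target) for a command line.

    mode is "truncate" for >, "append" for >>, or None without redirection.
    A mode with target None corresponds to the silent mode.
    """
    in_quotes = False
    escaped = False
    for index, char in enumerate(line):
        if escaped:
            escaped = False          # the character after a backslash is literal
        elif char == "\\":
            escaped = True
        elif char == '"':
            in_quotes = not in_quotes
        elif char == ">" and not in_quotes:
            command = line[:index].strip()
            rest = line[index:]
            append = rest.startswith(">>")
            target = rest[2 if append else 1:].strip()
            return command, ("append" if append else "truncate"), (target or None)
    return line.strip(), None, None
```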
Here is an exhaustive list of the commands that are currently available:
- exit or Ctrl+D (EOF) stops the interactive mode (and the script). Ctrl+C would work as well, but stops Chatette abruptly without any exit message.
- stats will display statistics about the parsed file(s), namely the number of parsed files, defined units, aliases, slots and intents.
- parse <file-path> will parse the file at <file-path> (relative to the directory the program is executed from).
- exist alias "<alias-name>" will ask the program if an alias named <alias-name> was parsed. The program will give information about the alias (namely its name, modifiers and number of rules) if it exists, and say it doesn't exist otherwise. The same kind of commands exist for slots and intents: exist slot "<slot-name>" and exist intent "<intent-name>".
- show alias "<alias-name>" will show the rules that define alias <alias-name>. At most 12 rules are printed out. If the alias doesn't exist, an error is printed. The same commands exist for slots and intents: show slot "<slot-name>" and show intent "<intent-name>".
- rename alias "<alias-name>" "<new-alias-name>" will change the name of alias <alias-name> to <new-alias-name>. Similar commands exist for slots and intents.
- delete alias "<alias-name>" will completely remove the alias <alias-name> from the parser's memory (as if it hadn't been in the parsed template file(s)). The same thing can be done for slots and intents.
- hide alias "<alias-name>" will temporarily remove the alias <alias-name> from the parser's memory. unhide alias "<alias-name>" restores this alias definition (and can be executed after any other units have been hidden). The same thing can be done for slots and intents.
- examples alias "<alias-name>" will ask for all the possible strings generated when referring to alias <alias-name>. examples slot "<slot-name>" and examples intent "<intent-name>" exist as well. An error is printed if the alias/slot/intent doesn't exist.
  If a number X is added at the end of one of those commands (separated from the command by whitespace), the answer is limited to X strings (selected randomly from the possible strings). If X is larger than the number of possible strings, the command simply returns all the possible strings.
- generate <adapter> alias "<alias-name>" generates all possible examples and formats them using adapter <adapter>. Similar commands for slots and intents exist. As with the command examples, you can add a number at the end of the command (separated from it by whitespace) to limit the number of examples generated.
  Two adapters currently exist: rasa and jsonl.
  generate alone will execute the generation that would have run in non-interactive mode. Adding the adapter's name will make this generation use this adapter. The output is written to the output path given as a program option (or the default path if none was provided).
- declare alias "<alias-name>" creates a new alias named <alias-name> in the parser. It will have no rules and no modifiers. Similar commands exist for slots and intents.
- add-rule alias "<alias-name>" "<rule>" adds a rule to the definition of an alias named <alias-name> (if it exists). Similar commands exist for slots and intents.
- set-modifier alias "<alias-name>" <modifier-name> "<value>" changes the value of the modifier <modifier-name> of the alias <alias-name>. Similar commands exist for slots and intents. Since those are unit declarations, the only modifiers they accept are "case generation" (with values "True" or "False" (case-insensitive)) and "argument identifier" (with any string as a value). Modifier names are thus casegen and arg (or alternatively & and $, if you want to be more concise).
  Beware that you could get exceptions later on if you set modifiers to invalid or incoherent values. The checks on those values are rather permissive.
- rule "<rule>" will generate the rule using all the units that have been defined in the template file(s). Its output can be redirected to a file in the same way as for the other commands. If you need to use double quotes in the rule, escape them with a backslash \.
- save <file-path> will create (or truncate) a file at <file-path> and save the configuration of the parser as templates. You will then be able to use this file as a template file later on with all the units you declared and modified during the interactive session.
- execute "<path-to-file>" will read the file <path-to-file> and execute all the commands that are inside it sequentially. The commands and results will be printed on the command line as those executions are made, unless this command or each command in the file is redirected. The commands in the file should be separated by newlines (thus, one command per line). Lines starting with // are ignored.
  The execution of a file stops as soon as an exit command is executed from the file or when the file has been entirely read.
  Chatette can also directly read the commands from this file by calling the script with:
  python -m chatette <path-to-template-file> -I <path-to-command-file>
  or
  python -m chatette <path-to-template-file> --interactive-commands-file <path-to-command-file>
  Adding the option -i or --interactive has no effect in this case.
  The interactive mode is entered after all the commands in the file have been executed, unless an exit command is executed in the file (the script is stopped immediately).
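The way a command file is read (one command per line, // lines ignored, execution stopping at exit) can be sketched as follows. The function iter_commands is hypothetical and not part of Chatette; skipping blank lines is an assumption made here for convenience.

```python
def iter_commands(lines):
    """Yield the commands of a command file in execution order."""
    for raw_line in lines:
        line = raw_line.strip()
        # Assumption: blank lines are skipped, like the // comment lines.
        if not line or line.startswith("//"):
            continue
        yield line
        # Nothing after an exit command is executed.
        if line.split()[0] == "exit":
            break
```

Applied to the lines of a file containing a comment, stats, exit and a further command, the sketch yields only stats and exit.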
For all those commands (except exit), appending > <filename> or >> <filename> will respectively write the results into a file named <filename> (creating the file if it doesn't exist, overwriting it if it does) or append the results to a file named <filename> (creating it if it doesn't exist).
Moreover, in all those commands, you can replace the keywords alias, slot and intent by ~, @ and % respectively. The whitespace around those characters should still be present.
When selecting a unit in most of the commands presented above, you need to provide the name of a unit. If you are talking about a specific unit, simply use the syntax "<unit-name>" where the name of the unit is surrounded with double quotes.
For some of the commands presented earlier, you can specify the variation you are referring to using the same syntax as in template files, that is "<unit-name>#<variation-name>". This syntax is usable with the following commands: exist, show, delete, hide, unhide, examples, generate and add-rule.
Alternatively, you can use regexes in all those commands, in the following way: /regex/flags (no double quotes), where regex is the regex (written as expected by the Python package re) and flags can be any combination of the following:
- i: the regex is case-insensitive
- g: the search looks for all matches and not just a match at the beginning of each unit name
For instance, /s.me/ would match units named some and same; /s.me/i could match some, same and SOME; /s.me/g would match some, same and awesome; /s.me/ig would match some, same, SOME, awesome and Awesome.
The command executed will then be run against all the units whose name contains at least one match.
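The effect of the two flags can be reproduced with the re package. The sketch below is one possible reading of the description above, not Chatette's actual code: the helper select_units is hypothetical, and it assumes that the default behavior corresponds to re.match (match at the beginning of the name) while the g flag corresponds to re.search (match anywhere in the name).

```python
import re


def select_units(selector, unit_names):
    """Return the unit names selected by a '/regex/flags' selector."""
    if not selector.startswith("/"):
        raise ValueError("a regex selector must start with /")
    # Everything after the last / is the flag string.
    pattern, separator, flags = selector[1:].rpartition("/")
    if not separator:
        raise ValueError("missing closing / in regex selector")
    if set(flags) - {"i", "g"}:
        raise ValueError("unknown flags: " + flags)
    compiled = re.compile(pattern, re.IGNORECASE if "i" in flags else 0)
    # Assumption: without g, only a match at the start of the name counts.
    finder = compiled.search if "g" in flags else compiled.match
    return [name for name in unit_names if finder(name)]


names = ["some", "same", "SOME", "awesome", "AWESOME"]
print(select_units("/s.me/", names))    # ['some', 'same']
print(select_units("/s.me/i", names))   # ['some', 'same', 'SOME']
print(select_units("/s.me/g", names))   # ['some', 'same', 'awesome']
print(select_units("/s.me/ig", names))  # all five names
```

Under this reading, a name such as Awesome would also be selected by /s.me/g, since it contains the lowercase substring some.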