Commit f378a6a

Merge pull request #3 from starling-lab/packaging
This pull request converts this project into a Python package: `rnlp`.

### Major Changes

* Renamed `parseInputCorpus.py` -> `rnlp/parse.py`
* Removed the call to `main()` at the last line of `parse.py`.
* Created `docs` directory with `Makefile` and source directory for building pages with Sphinx.
* Added `setup.py` and `setup.cfg` files for publishing to PyPI.

### Minor, But Worth Mentioning

* New file: `rnlp/corpus.py`, currently empty but will contain example documents for testing and helping to get up-and-running.
* Removed `train` directory since it contained learned models for the toy dataset rather than anything that would generally be useful here.

2 parents b77cdf6 + 575db3a commit f378a6a

36 files changed: +575 -4684 lines changed

.gitignore

Lines changed: 111 additions & 0 deletions
@@ -0,0 +1,111 @@
+*~
+train
+test
+
+# === The following are based on the github/gitignore/Python.gitignore === #
+# === Made available under a Creative Commons Zero v1.0 Universal ======== #
+
+# Byte-compiled / optimized / DLL files
+__pycache__/
+*.py[cod]
+*$py.class
+
+# C extensions
+*.so
+
+# Distribution / packaging
+.Python
+build/
+develop-eggs/
+dist/
+downloads/
+eggs/
+.eggs/
+lib/
+lib64/
+parts/
+sdist/
+var/
+wheels/
+*.egg-info/
+.installed.cfg
+*.egg
+MANIFEST
+
+# PyInstaller
+# Usually these files are written by a python script from a template
+# before PyInstaller builds the exe, so as to inject date/other infos into it.
+*.manifest
+*.spec
+
+# Installer logs
+pip-log.txt
+pip-delete-this-directory.txt
+
+# Unit test / coverage reports
+htmlcov/
+.tox/
+.coverage
+.coverage.*
+.cache
+nosetests.xml
+coverage.xml
+*.cover
+.hypothesis/
+.pytest_cache/
+
+# Translations
+*.mo
+*.pot
+
+# Django stuff:
+*.log
+local_settings.py
+db.sqlite3
+
+# Flask stuff:
+instance/
+.webassets-cache
+
+# Scrapy stuff:
+.scrapy
+
+# Sphinx documentation
+docs/_build/
+
+# PyBuilder
+target/
+
+# Jupyter Notebook
+.ipynb_checkpoints
+
+# pyenv
+.python-version
+
+# celery beat schedule file
+celerybeat-schedule
+
+# SageMath parsed files
+*.sage.py
+
+# Environments
+.env
+.venv
+env/
+venv/
+ENV/
+env.bak/
+venv.bak/
+
+# Spyder project settings
+.spyderproject
+.spyproject
+
+# Rope project settings
+.ropeproject
+
+# mkdocs documentation
+/site
+
+# mypy
+.mypy_cache/
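
The glob patterns in this `.gitignore` can be explored from Python. The following is a minimal sketch, assuming that `fnmatch` semantics are a close enough approximation for simple basename patterns; real gitignore matching also handles directory-only patterns, negation, and path anchoring, none of which are modeled here. The `PATTERNS` subset and the helper `is_ignored` are hypothetical names used only for illustration.

```python
from fnmatch import fnmatchcase

# A few basename patterns copied from the .gitignore above.
PATTERNS = ["*~", "*.py[cod]", "*$py.class", "*.so", "*.egg", "*.log"]


def is_ignored(filename: str) -> bool:
    """Return True if the basename matches any pattern in PATTERNS.

    Approximation only: gitignore also handles directories, negation
    and path anchoring, which fnmatch does not model.
    """
    return any(fnmatchcase(filename, pattern) for pattern in PATTERNS)


for name in ["parse.pyc", "parse.py", "notes.txt~", "build.log"]:
    print(name, is_ignored(name))
```

The character class in `*.py[cod]` is why `parse.pyc` matches while `parse.py` does not: the pattern requires exactly one more character after `.py`.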

README.md

Lines changed: 3 additions & 3 deletions
@@ -2,7 +2,7 @@
 
 Processes text from a file (or set of files) into relational facts.
 
-Pre-requisites:
+Pre-requisites:
 
 - Python 2.7
 
@@ -14,7 +14,7 @@ Usage:
 
 The code will prompt for a file or folder input of text files to convert to relational facts.
 
-The Relationals encoded are:
+The Relations encoded are:
 
 - between blocks of size n (i.e. 'n' sentences) and sentences in the blocks.
 
@@ -48,7 +48,7 @@ The relationships currently encoded are:
 
 Files contain a toy corpus (`files/`) and an image of a BoostSRL tree for predicting if a word in a sentence is the word "you"
 
-![BoostSRL tree for predicting if a word in a sentence is the word "you."](https://raw.githubusercontent.com/boost-starai/Natural-Language-Processing/master/output.png)
+![BoostSRL tree for predicting if a word in a sentence is the word "you."](https://raw.githubusercontent.com/boost-starai/Natural-Language-Processing/master/docs/img/output.png)
 
 The tree says that if the word string contained in word 'b' is "you" then 'b' is the word "you". (This is of course true).
 A more interesting inference is the False branch that says that if word 'b' is an early word in sentence 'a' and word 'anon12035' is also an early word in sentence 'a' and if the word string contained in word 'anon12035' is "Thank", then the word 'b' has decent chance of being the word "you". (The model was able to learn that the word "you" often occurs with the word "Thank" in the same sentence when "Thank" appears early in that sentence).

docs/Makefile

Lines changed: 20 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,20 @@
+# Minimal makefile for Sphinx documentation
+#
+
+# You can set these variables from the command line.
+SPHINXOPTS =
+SPHINXBUILD = python -msphinx
+SPHINXPROJ = rnlp
+SOURCEDIR = source
+BUILDDIR = build
+
+# Put it first so that "make" without argument is like "make help".
+help:
+	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
+
+.PHONY: help Makefile
+
+# Catch-all target: route all unknown targets to Sphinx using the new
+# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
+%: Makefile
+	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
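
The catch-all rule in `docs/Makefile` forwards any target name to Sphinx's "make mode". As a rough sketch of how a target such as `html` expands, assuming the default variable values shown in the Makefile: the helper `sphinx_command` is a hypothetical name for illustration and is not part of this commit.

```python
import shlex

# Defaults copied from docs/Makefile.
SPHINXBUILD = "python -msphinx"
SOURCEDIR = "source"
BUILDDIR = "build"


def sphinx_command(target: str, sphinxopts: str = "", o: str = "") -> list:
    """Build the argv list that `make <target>` would run via the catch-all rule."""
    argv = shlex.split(SPHINXBUILD) + ["-M", target, SOURCEDIR, BUILDDIR]
    # $(SPHINXOPTS) and $(O) are appended unquoted, so they split on whitespace.
    argv += shlex.split(sphinxopts) + shlex.split(o)
    return argv


print(sphinx_command("html"))
# ['python', '-msphinx', '-M', 'html', 'source', 'build']
```

Run from the `docs` directory, `make html O="-W"` would therefore append `-W` to the same command line.
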
File renamed without changes.

docs/source/conf.py

Lines changed: 173 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,173 @@
+#!/usr/bin/env python3
+# -*- coding: utf-8 -*-
+#
+# rnlp documentation build configuration file, created by
+# sphinx-quickstart on Thu May 17 10:12:50 2018.
+#
+# This file is execfile()d with the current directory set to its
+# containing dir.
+#
+# Note that not all possible configuration values are present in this
+# autogenerated file.
+#
+# All configuration values have a default; values that are commented out
+# serve to show the default.
+
+# If extensions (or modules to document with autodoc) are in another directory,
+# add these directories to sys.path here. If the directory is relative to the
+# documentation root, use os.path.abspath to make it absolute, like shown here.
+#
+import os
+import sys
+sys.path.insert(0, os.path.abspath('../..'))
+
+
+# -- General configuration ------------------------------------------------
+
+# If your documentation needs a minimal Sphinx version, state it here.
+#
+# needs_sphinx = '1.0'
+
+# Add any Sphinx extension module names here, as strings. They can be
+# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
+# ones.
+extensions = ['sphinx.ext.autodoc',
+    'sphinx.ext.githubpages']
+
+# Add any paths that contain templates here, relative to this directory.
+templates_path = ['_templates']
+
+# The suffix(es) of source filenames.
+# You can specify multiple suffix as a list of string:
+#
+# source_suffix = ['.rst', '.md']
+source_suffix = '.rst'
+
+# The master toctree document.
+master_doc = 'index'
+
+# General information about the project.
+project = 'rnlp'
+copyright = '2017-2018, StARLinG Lab'
+author = 'Alexander L. Hayes (@batflyer), Kaushik Roy (@kkroy36)'
+
+# The version info for the project you're documenting, acts as replacement for
+# |version| and |release|, also used in various other places throughout the
+# built documents.
+#
+# The short X.Y version.
+version = '0.1.0'
+# The full version, including alpha/beta/rc tags.
+release = '0.1.0'
+
+# The language for content autogenerated by Sphinx. Refer to documentation
+# for a list of supported languages.
+#
+# This is also used if you do content translation via gettext catalogs.
+# Usually you set "language" from the command line for these cases.
+language = None
+
+# List of patterns, relative to source directory, that match files and
+# directories to ignore when looking for source files.
+# This patterns also effect to html_static_path and html_extra_path
+exclude_patterns = []
+
+# The name of the Pygments (syntax highlighting) style to use.
+pygments_style = 'sphinx'
+
+# If true, `todo` and `todoList` produce output, else they produce nothing.
+todo_include_todos = False
+
+
+# -- Options for HTML output ----------------------------------------------
+
+# The theme to use for HTML and HTML Help pages. See the documentation for
+# a list of builtin themes.
+#
+html_theme = 'alabaster'
+
+# Theme options are theme-specific and customize the look and feel of a theme
+# further. For a list of options available for each theme, see the
+# documentation.
+#
+# html_theme_options = {}
+
+# Add any paths that contain custom static files (such as style sheets) here,
+# relative to this directory. They are copied after the builtin static files,
+# so a file named "default.css" will overwrite the builtin "default.css".
+html_static_path = ['_static']
+
+# Custom sidebar templates, must be a dictionary that maps document names
+# to template names.
+#
+# This is required for the alabaster theme
+# refs: http://alabaster.readthedocs.io/en/latest/installation.html#sidebars
+html_sidebars = {
+    '**': [
+        'about.html',
+        'navigation.html',
+        'relations.html',  # needs 'show_related': True theme option to display
+        'searchbox.html',
+        'donate.html',
+    ]
+}
+
+
+# -- Options for HTMLHelp output ------------------------------------------
+
+# Output file base name for HTML help builder.
+htmlhelp_basename = 'rnlpdoc'
+
+
+# -- Options for LaTeX output ---------------------------------------------
+
+latex_elements = {
+    # The paper size ('letterpaper' or 'a4paper').
+    #
+    # 'papersize': 'letterpaper',
+
+    # The font size ('10pt', '11pt' or '12pt').
+    #
+    # 'pointsize': '10pt',
+
+    # Additional stuff for the LaTeX preamble.
+    #
+    # 'preamble': '',
+
+    # Latex figure (float) alignment
+    #
+    # 'figure_align': 'htbp',
+}
+
+# Grouping the document tree into LaTeX files. List of tuples
+# (source start file, target name, title,
+#  author, documentclass [howto, manual, or own class]).
+latex_documents = [
+    (master_doc, 'rnlp.tex', 'rnlp Documentation',
+     'Alexander L. Hayes (@batflyer), Kaushik Roy (@kkroy36)', 'manual'),
+]
+
+
+# -- Options for manual page output ---------------------------------------
+
+# One entry per manual page. List of tuples
+# (source start file, name, description, authors, manual section).
+man_pages = [
+    (master_doc, 'rnlp', 'rnlp Documentation',
+     [author], 1)
+]
+
+
+# -- Options for Texinfo output -------------------------------------------
+
+# Grouping the document tree into Texinfo files. List of tuples
+# (source start file, target name, title, author,
+#  dir menu entry, description, category)
+texinfo_documents = [
+    (master_doc, 'rnlp', 'rnlp Documentation',
+     author, 'rnlp', 'One line description of project.',
+     'Miscellaneous'),
+]
+
+
+
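
The `sys.path.insert(0, os.path.abspath('../..'))` line in `conf.py` is what lets autodoc import the `rnlp` package: with the working directory at `docs/source`, two levels up is the repository root. A small sketch of that path arithmetic follows. The helper `project_root` and the checkout location are hypothetical, and the sketch uses `normpath` on an explicit directory instead of relying on the current working directory.

```python
import os


def project_root(conf_dir: str) -> str:
    """Resolve '../..' relative to conf_dir.

    Mirrors what os.path.abspath('../..') yields when the working
    directory is conf_dir, without depending on the real cwd.
    """
    return os.path.normpath(os.path.join(conf_dir, "..", ".."))


# Hypothetical checkout location, for illustration only.
conf_dir = os.path.join("checkout", "rnlp", "docs", "source")
print(project_root(conf_dir))
```

If `conf.py` were moved to a different depth inside the repository, the relative path would need to change to match.
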

docs/source/index.rst

Lines changed: 20 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,20 @@
+.. rnlp documentation master file, created by
+   sphinx-quickstart on Thu May 17 10:12:50 2018.
+   You can adapt this file completely to your liking, but it should at least
+   contain the root `toctree` directive.
+
+Welcome to rnlp's documentation!
+================================
+
+.. toctree::
+   :maxdepth: 2
+   :caption: Contents:
+
+
+
+Indices and tables
+==================
+
+* :ref:`genindex`
+* :ref:`modindex`
+* :ref:`search`

docs/source/modules.rst

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,7 @@
+rnlp
+====
+
+.. toctree::
+   :maxdepth: 4
+
+   rnlp

docs/source/rnlp.rst

Lines changed: 30 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,30 @@
+rnlp package
+============
+
+Submodules
+----------
+
+rnlp\.corpus module
+-------------------
+
+.. automodule:: rnlp.corpus
+    :members:
+    :undoc-members:
+    :show-inheritance:
+
+rnlp\.parseInputCorpus module
+-----------------------------
+
+.. automodule:: rnlp.parse
+    :members:
+    :undoc-members:
+    :show-inheritance:
+
+
+Module contents
+---------------
+
+.. automodule:: rnlp
+    :members:
+    :undoc-members:
+    :show-inheritance:
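
Each `automodule` directive with `:members:` asks autodoc to import the named module and document what it finds there. The sketch below is only a loose approximation of that selection, written with `importlib` and `inspect`; the real extension applies more rules (for example `__all__` and its own handling of private names). It uses the standard-library module `textwrap` as a stand-in, since `rnlp` may not be installed, and `public_members` is a hypothetical helper name.

```python
import importlib
import inspect


def public_members(module_name: str) -> list:
    """List names of public functions and classes defined in the module itself.

    Rough stand-in for autodoc's `:members:` selection; names imported
    from other modules are skipped by comparing `__module__`.
    """
    module = importlib.import_module(module_name)
    names = []
    for name, obj in inspect.getmembers(module):
        if name.startswith("_"):
            continue
        if not (inspect.isfunction(obj) or inspect.isclass(obj)):
            continue
        if getattr(obj, "__module__", None) == module.__name__:
            names.append(name)
    return names


print(public_members("textwrap"))
```

Because the module is imported to be inspected, any code that runs at import time also runs during the docs build, which is one practical reason to avoid a bare `main()` call at the bottom of a module.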
