refactor NN graph building (included in #43) #21
Draft
naspert wants to merge 33 commits into epfl-lts2:master from naspert:nn_refactor
Changes from 3 commits
Commits
1ee85b5  attempt to refactor nn graph building  (naspert)
4bacd5c  update tests  (naspert)
00bbcdd  fix typo  (naspert)
b822333  fix tests (avoiding not implemented combinations)  (naspert)
38aebd0  - fix missing space after colon in dictionary  (naspert)
524c60f  fix (matlab) GSP url  (naspert)
ae83814  throw exception when using FLANN + max_dist (produces incorrect results)  (naspert)
62fc0ce  update test case to fit FLANN & max_dist exception  (naspert)
6f473fa  implement nn graph using pdist using radius  (naspert)
25ec6d2  implement radius nn graph with flann  (naspert)
96b628e  flann returns the squared distance when called with 'euclidean' dista…  (naspert)
09bbff4  compute sqrt of list properly  (naspert)
27b9a03  use cyflann instead of pyflann (radius search not working)  (naspert)
8a1f9b9  check nn graphs building against pdist reference  (naspert)
6e9e2ac  cyflann needs the flann library to be installed on the system  (naspert)
811de06  check nn graphs building against pdist reference  (naspert)
813fe39  backport stuff from cyflann branch  (naspert)
4a4d597  flann should (mostly) work for knn graphs  (naspert)
53dffc1  fix pdist warnings  (naspert)
1309e92  implement and use scipy-ckdtree as default (faster than kdtree)  (naspert)
90ae9a8  Merge remote-tracking branch 'origin-nas/nn_cyflann' into nn_refactor  (naspert)
648fa91  backport README changes from master  (naspert)
96fa5f6  Merge branch 'master' of https://github.com/epfl-lts2/pygsp into nn_r…  (naspert)
c26e449  Merge branch 'master' into nn_refactor  (naspert)
8e7c553  add nmslib  (naspert)
b83e467  test flann when not on windows  (naspert)
28b7858  use the same code to build sparse matrix for knn and radius  (naspert)
188c4a6  building the graph with rescale/center=False should also work  (naspert)
59c131a  Merge pull request #1 from naspert/nmslib  (naspert)
8e98b77  update doc for nmslib  (naspert)
08ae29f  enable multithreading with ckdtree/nmslib  (naspert)
57e9661  Merge branch 'master' into nn_refactor  (naspert)
a562896  fix _get_extra_repr  (naspert)
@@ -10,6 +10,21 @@

 _logger = utils.build_logger(__name__)

+# conversion between the FLANN conventions and the various backend functions
+_dist_translation = {
+    'scipy-kdtree': {
+        'euclidean': 2,
+        'manhattan': 1,
+        'max_dist': np.inf
+    },
+    'scipy-pdist' : {
+        'euclidean': 'euclidean',
+        'manhattan': 'cityblock',
+        'max_dist': 'chebyshev',
+        'minkowski': 'minkowski'
+    },
+
+}

 def _import_pfl():
     try:
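The `_dist_translation` table in this hunk maps one distance name to what each SciPy backend expects: a Minkowski order `p` for the k-d tree query and a metric string for `pdist`. A minimal sketch, assuming only NumPy and SciPy, that checks both conventions give the same nearest-neighbor distances. `DIST_TRANSLATION` is an illustrative copy of the mapping and `nearest_distances` is a hypothetical helper; neither is part of the PR.

```python
import numpy as np
from scipy import spatial

# Illustrative copy of the mapping in the diff (not imported from PyGSP).
DIST_TRANSLATION = {
    'scipy-kdtree': {'euclidean': 2, 'manhattan': 1, 'max_dist': np.inf},
    'scipy-pdist': {'euclidean': 'euclidean', 'manhattan': 'cityblock',
                    'max_dist': 'chebyshev'},
}


def nearest_distances(X, dist_type):
    """Return the distance to the nearest other point, once per backend."""
    # k-d tree: column 0 is the point itself (distance 0), column 1 its neighbor.
    p = DIST_TRANSLATION['scipy-kdtree'][dist_type]
    d_tree, _ = spatial.KDTree(X).query(X, k=2, p=p)
    # pdist: dense matrix, so mask the zero diagonal before taking the minimum.
    metric = DIST_TRANSLATION['scipy-pdist'][dist_type]
    full = spatial.distance.squareform(spatial.distance.pdist(X, metric))
    np.fill_diagonal(full, np.inf)
    return d_tree[:, 1], full.min(axis=1)


rng = np.random.default_rng(0)
X = rng.random((50, 3))
results = {name: nearest_distances(X, name)
           for name in ('euclidean', 'manhattan', 'max_dist')}
```

If the two columns of `results[name]` ever disagree, the translation entry for that name is the first place to look.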
@@ -20,6 +35,46 @@ def _import_pfl():
                           'pip (or conda) install pyflann (or pyflann3).')
     return pfl

+
+
+def _knn_sp_kdtree(_X, _num_neighbors, _dist_type, _order=0):
+    kdt = spatial.KDTree(_X)
+    D, NN = kdt.query(_X, k=(_num_neighbors + 1),
+                      p=_dist_translation['scipy-kdtree'][_dist_type])
+    return NN, D
+
+def _knn_flann(_X, _num_neighbors, _dist_type, _order):
+    pfl = _import_pfl()
+    pfl.set_distance_type(_dist_type, order=_order)
+    flann = pfl.FLANN()
+
+    # Default FLANN parameters (I tried changing the algorithm and
+    # testing performance on huge matrices, but the default one
+    # seems to work best).
+    NN, D = flann.nn(_X, _X, num_neighbors=(_num_neighbors + 1),
+                     algorithm='kdtree')
+    return NN, D
+
+def _radius_sp_kdtree(_X, _epsilon, _dist_type, order=0):
+    kdt = spatial.KDTree(_X)
+    D, NN = kdt.query(_X, k=None, distance_upper_bound=_epsilon,
+                      p=_dist_translation['scipy-kdtree'][_dist_type])
+    return NN, D
+
+def _knn_sp_pdist(_X, _num_neighbors, _dist_type, _order):
+    pd = spatial.distance.squareform(
+        spatial.distance.pdist(_X,
+                               _dist_translation['scipy-pdist'][_dist_type],
+                               p=_order))
+    pds = np.sort(pd)[:, 0:_num_neighbors+1]
+    pdi = pd.argsort()[:, 0:_num_neighbors+1]
+    return pdi, pds
+
+def _radius_sp_pdist():
+    raise NotImplementedError()
+
+def _radius_flann():
+    raise NotImplementedError()

 class NNGraph(Graph):
     r"""Nearest-neighbor graph from given point cloud.
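`_radius_sp_kdtree` in this hunk relies on `query(..., k=None)`, while the pdist and FLANN radius variants are still stubs at this commit. A hedged sketch of a radius search built on `cKDTree.query_ball_point` instead; this is a substitution, not the PR's code, and `radius_neighbors` is a hypothetical name. Whether `k=None` is accepted depends on the installed SciPy version, which is one reason to sketch the alternative.

```python
import numpy as np
from scipy import spatial


def radius_neighbors(X, epsilon, p=2):
    """Return (NN, D): per-point neighbor indices and distances within epsilon."""
    tree = spatial.cKDTree(X)
    # One list of indices per query point; each list includes the point itself.
    neighbors = tree.query_ball_point(X, r=epsilon, p=p)
    NN, D = [], []
    for i, idx in enumerate(neighbors):
        idx = np.asarray(idx, dtype=int)
        dist = np.linalg.norm(X[idx] - X[i], ord=p, axis=1)
        order = np.argsort(dist, kind='stable')  # nearest first
        NN.append(idx[order])
        D.append(dist[order])
    return NN, D


X = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0]])
NN, D = radius_neighbors(X, epsilon=0.5)
```

The result is ragged: `NN[i]` and `D[i]` have one entry per neighbor of point `i`, so an isolated point gets a list holding only itself.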
@@ -33,9 +88,11 @@ class NNGraph(Graph):
         Type of nearest neighbor graph to create. The options are 'knn' for
         k-Nearest Neighbors or 'radius' for epsilon-Nearest Neighbors (default
         is 'knn').
-    use_flann : bool, optional
-        Use Fast Library for Approximate Nearest Neighbors (FLANN) or not.
-        (default is False)
+    backend : {'scipy-kdtree', 'scipy-pdist', 'flann'}
+        Type of the backend for graph construction.
+        - 'scipy-kdtree'(default) will use scipy.spatial.KDTree
+        - 'scipy-pdist' will use scipy.spatial.distance.pdist (slowest but exact)
+        - 'flann' use Fast Library for Approximate Nearest Neighbors (FLANN)
     center : bool, optional
         Center the data so that it has zero mean (default is True)
     rescale : bool, optional
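The docstring describes 'scipy-pdist' as the slowest but exact backend. A minimal sketch of why: a brute-force k-NN over the full pairwise distance matrix costs O(N^2) memory, but it can serve as a reference for the tree-based search. `knn_pdist` is a hypothetical helper written for this illustration; it takes indices and distances from a single `argsort` so the two stay aligned, and it is not the PR's `_knn_sp_pdist`.

```python
import numpy as np
from scipy import spatial


def knn_pdist(X, k, metric='euclidean'):
    """Brute-force k-NN: returns (indices, distances), self in column 0."""
    full = spatial.distance.squareform(spatial.distance.pdist(X, metric))
    # A single argsort keeps indices and distances consistent with each other.
    idx = np.argsort(full, axis=1, kind='stable')[:, :k + 1]
    dist = np.take_along_axis(full, idx, axis=1)
    return idx, dist


rng = np.random.default_rng(1)
X = rng.random((200, 3))
k = 5

idx_brute, dist_brute = knn_pdist(X, k)
dist_tree, idx_tree = spatial.cKDTree(X).query(X, k=k + 1)
```

Comparing `dist_brute` with `dist_tree` is the kind of check the commit "check nn graphs building against pdist reference" refers to, though the actual test in the PR may differ.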
@@ -74,20 +131,34 @@ class NNGraph(Graph):

     """

-    def __init__(self, Xin, NNtype='knn', use_flann=False, center=True,
+    def __init__(self, Xin, NNtype='knn', backend='scipy-kdtree', center=True,
                  rescale=True, k=10, sigma=0.1, epsilon=0.01, gtype=None,
                  plotting={}, symmetrize_type='average', dist_type='euclidean',
                  order=0, **kwargs):

         self.Xin = Xin
         self.NNtype = NNtype
-        self.use_flann = use_flann
+        self.backend = backend
         self.center = center
         self.rescale = rescale
         self.k = k
         self.sigma = sigma
         self.epsilon = epsilon

+        _dist_translation['scipy-kdtree']['minkowski'] = order
+
+        self._nn_functions = {
+            'knn': {
+                'scipy-kdtree':_knn_sp_kdtree,
+
+                'scipy-pdist': _knn_sp_pdist,
+                'flann': _knn_flann
+            },
+            'radius': {
+                'scipy-kdtree':_radius_sp_kdtree,
+                'scipy-pdist': _radius_sp_pdist,
+                'flann': _radius_flann
+            },
+        }

         if gtype is None:
             gtype = 'nearest neighbors'
         else:
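The nested `_nn_functions` dictionary in this hunk selects a search routine by graph type first and backend second. A minimal sketch of that dispatch pattern with explicit validation, so an unknown combination fails with a readable message rather than a bare `KeyError`. The stub functions and the `select_nn_function` helper are hypothetical and only mimic the shape of the PR's table.

```python
def _knn_stub(X, k, dist_type, order):
    """Stand-in for a k-NN search routine."""
    return 'knn', k


def _radius_stub(X, epsilon, dist_type, order):
    """Stand-in for a radius search routine."""
    return 'radius', epsilon


# Same two-level layout as in the diff: graph type first, backend second.
NN_FUNCTIONS = {
    'knn': {'scipy-kdtree': _knn_stub, 'scipy-pdist': _knn_stub},
    'radius': {'scipy-kdtree': _radius_stub},
}


def select_nn_function(nn_type, backend):
    """Look up the search routine, raising ValueError on unknown names."""
    try:
        by_backend = NN_FUNCTIONS[nn_type]
    except KeyError:
        raise ValueError('Unknown NNtype {!r}'.format(nn_type)) from None
    try:
        return by_backend[backend]
    except KeyError:
        raise ValueError('Backend {!r} does not support NNtype {!r}'
                         .format(backend, nn_type)) from None


search = select_nn_function('knn', 'scipy-kdtree')
result = search(None, 10, 'euclidean', 0)
```

Leaving unsupported combinations out of the table, as done here, is one option; the diff instead registers stubs that raise `NotImplementedError` when called.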
@@ -108,33 +179,15 @@ def __init__(self, Xin, NNtype='knn', use_flann=False, center=True,
             scale = np.power(N, 1. / float(min(d, 3))) / 10.
             Xout *= scale / bounding_radius

-        # Translate distance type string to corresponding Minkowski order.
-        dist_translation = {"euclidean": 2,
-                            "manhattan": 1,
-                            "max_dist": np.inf,
-                            "minkowski": order
-                            }

+
         if self.NNtype == 'knn':
             spi = np.zeros((N * k))
             spj = np.zeros((N * k))
             spv = np.zeros((N * k))

-            if self.use_flann:
-                pfl = _import_pfl()
-                pfl.set_distance_type(dist_type, order=order)
-                flann = pfl.FLANN()
-
-                # Default FLANN parameters (I tried changing the algorithm and
-                # testing performance on huge matrices, but the default one
-                # seems to work best).
-                NN, D = flann.nn(Xout, Xout, num_neighbors=(k + 1),
-                                 algorithm='kdtree')
-
-            else:
-                kdt = spatial.KDTree(Xout)
-                D, NN = kdt.query(Xout, k=(k + 1),
-                                  p=dist_translation[dist_type])
+            NN, D = self._nn_functions[NNtype][backend](Xout, k,
+                                                        dist_type, order)

             for i in range(N):
                 spi[i * k:(i + 1) * k] = np.kron(np.ones((k)), i)
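The loop at the end of this hunk fills `spi` with each row index repeated `k` times via `np.kron`. A minimal sketch of the same index layout built without a Python loop, assuming `NN` and `D` have shape `(N, k + 1)` with the query point itself in column 0. Storing raw distances as edge values is a simplification for illustration, since the weighting applied in the PR is not visible in this hunk.

```python
import numpy as np
from scipy import sparse, spatial

rng = np.random.default_rng(0)
X = rng.random((20, 2))
N, k = X.shape[0], 3

D, NN = spatial.cKDTree(X).query(X, k=k + 1)

spi = np.repeat(np.arange(N), k)   # row i repeated k times
spj = NN[:, 1:].ravel()            # neighbor indices, self column dropped
spv = D[:, 1:].ravel()             # matching distances

W = sparse.csc_matrix((spv, (spi, spj)), shape=(N, N))

# Loop version from the diff, kept here for comparison.
spi_loop = np.zeros(N * k)
for i in range(N):
    spi_loop[i * k:(i + 1) * k] = np.kron(np.ones(k), i)
```

`W` is generally not symmetric at this stage, because k-NN relations are not mutual; the `symmetrize_type` argument in the constructor signature suggests that is handled afterwards.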
@@ -144,13 +197,10 @@ def __init__(self, Xin, NNtype='knn', use_flann=False, center=True,

         elif self.NNtype == 'radius':

-            kdt = spatial.KDTree(Xout)
-            D, NN = kdt.query(Xout, k=None, distance_upper_bound=epsilon,
-                              p=dist_translation[dist_type])
-            count = 0
-            for i in range(N):
-                count = count + len(NN[i])
-
+            NN, D = self._nn_functions[NNtype][backend](Xout, epsilon,
+                                                        dist_type, order)
+            count = sum(map(len, NN))
+
             spi = np.zeros((count))
             spj = np.zeros((count))
             spv = np.zeros((count))
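In the radius case each point has a different number of neighbors, so the total `count` sizes the coordinate arrays. A sketch of assembling those arrays from ragged neighbor lists, assuming the lists come from `cKDTree.query_ball_point` (a substitution for the `k=None` query in the diff). Those lists include the query point itself, which is filtered out explicitly here; raw distances are again used as edge values for illustration only.

```python
import numpy as np
from scipy import sparse, spatial

X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [1.0, 1.0]])
N, epsilon = X.shape[0], 0.5

# Ragged result: one list of neighbor indices per point (self included).
NN = spatial.cKDTree(X).query_ball_point(X, r=epsilon)

count = sum(map(len, NN)) - N      # total neighbors, excluding each point itself
spi = np.zeros(count, dtype=int)
spj = np.zeros(count, dtype=int)
spv = np.zeros(count)

start = 0
for i, idx in enumerate(NN):
    others = np.array([j for j in idx if j != i], dtype=int)
    stop = start + len(others)
    spi[start:stop] = i
    spj[start:stop] = others
    spv[start:stop] = np.linalg.norm(X[others] - X[i], axis=1)
    start = stop

W = sparse.csc_matrix((spv, (spi, spj)), shape=(N, N))
```

Unlike the k-NN case, a radius graph is symmetric by construction, and isolated points such as the last one here end up with an empty row.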