Skip to content

Conversation

sumny
Copy link
Member

@sumny sumny commented Jan 31, 2021

closes #538
todo:

  • varargs
  • better docs
  • more tests
  • complementary function that gives us the IDs of all pipeops that are "between" two (or more) pipeops.

@sumny sumny added the Status: Needs Discussion We still need to think about what the solution should look like label Feb 4, 2021
@sumny
Copy link
Member Author

sumny commented Feb 4, 2021

Something that bothers me with vararg inputs:
Suppose $replace_subgraph() should work more or less like %>>%, i..e, the ids to replace constitute an actual valid subgraph of the current graph and are in topological order.
In this case replacing a single PipeOp that used to be in a gunion with another one and connected to a vararg input will fail:

gr = po("branch", c("pca", "nop")) %>>% gunion(list(po("pca"), po("nop"))) %>>% po("unbranch")
gr$replace_subgraph("pca", substitute = po("ica"))

after having removed pca, the edges, input and output look like this:

gr$edges
   src_id src_channel   dst_id dst_channel
1: branch         nop      nop       input
2:    nop      output unbranch         ...

gr$input
           name train predict  op.id channel.name
1: branch.input     *       * branch        input

gr$output
              name train predict    op.id channel.name
1:      branch.pca     *       *   branch          pca
2: unbranch.output     *       * unbranch       output

i.e., the graph does not remember that there was a second edge connecting to "unbranch" (to be fair, how could he?)

image

Now suppose we would do it like %>>%, i.e., gr %>>% po("ica"); this would fail because the graphs have mismatching numbers of inputs / outputs; if we ignore this and connect nevertheless, the output graph would look like this:

image

:(

A workaround would be to do:

gr$replace_subgraph(c("pca", "nop"), substitute = gunion(list(po("ica"), po("nop"))))

This works because the vararg input of unbranch is now an actual free input again that can be connected to.
In my opinion this is not satisfying because replacing a single branching option is something that should easily be doable.
(With the current $replace_subgraph() it is doable if we would have constructed unbranch without a vararg input.)
I feel like for $replace_subgraph() to be this flexible, the implemenation has to go beyond what %>>% does, i.e, one could restrict the substitute to inherit the input and output connections of the subgraph to be replaced and match the pipeops via their ordered ids.
But maybe @mb706 @pfistfl have some better ideas?

@mb706
Copy link
Collaborator

mb706 commented Oct 6, 2021

image

image

image

Status:

  • varargs are not treated in a special way: the number of connections of varargs of the prior graph stays the same. otherwise we could end up "removing" a connection entirely.
  • allow one-to-many-matching: if prior graph has one output or inserted graph has one output, match to number of corresponding inputs
  • we still don't agree on a name for subgraph completion method, working title completify_graph
  • export(?) but at least test completify_graph on its own.
  • definition of completify_graph would be: "list all POs that need to be removed so that all POs not connected to prior input / output are gone" or "... so that all POs in a resulting disconnected component not connected to I / O are removed"
  • calculating "dangling" edges: see Replace Sugraph  #538
  • see if we write a helper for type checks that is also used in %>>%.

@sebffischer
Copy link
Member

This could be used in torch for transfer learning @mb706

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Needs Discussion We still need to think about what the solution should look like
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Replace Sugraph
3 participants