I've wondered for a while how Deepmind manages their (Python & C++) codebase. This is a small window into that I think, and you can see that they're likely using internal 'Blaze' Python rules and rely on the maintenance of compatibility with the open-source Bazel rules. `py_library` is used in `tree/BUILD`, but it is not loaded explicitly as is expected with Bazel these days; the rule definition is loaded by Bazel implicitly.
Granted, you also need CI that has most Python versions/platforms and the Python CMake extension by scikit team too. Ex. MacOS Travis with brew [2]
[1] https://github.com/filipArena/elkai/blob/ef8ee98d8114240b204...
[2] https://github.com/filipArena/elkai/blob/master/.travis.yml
> tree has originally been part of TensorFlow and is available as tf.nest.
The tf.nest docs can be found here and may be more useful for now: https://www.tensorflow.org/api_docs/python/tf/nest
I'm glad this is being moved out of TensorFlow. It's a really useful library on its own and I've found myself reaching for it on projects without wanting to import all of TensorFlow.
Isn't this a pretty common error to occur in Python programs? It would be good to at least have a checked mode available, otherwise the "dragons may fly out of your nose" semantics feels like a pretty aggressive compromise in favour of performance vs correctness.
While I agree that a checked mode would be useful, and would likely be the best compromise for a "one-off" use, if you're preserving the structure like this does, then I feel it's quite likely you'd want to do multiple passes. If so, then pre-processing to ensure that it's acyclic would be useful: rather than a checked mode, perhaps an is_acyclic() method, similar to their is_nested(), would be the better compromise? That would allow it to be a nicely optimised function for just that, rather than adding overhead for "nested structures" that are known to be acyclic.
Adding is_acyclic() also means you can handle "sanitising" it yourself. I say that because it may be non-trivial depending on what you're doing and so not best-handled by an passed-through lambda etc. that may need to modify the structure mid "iteration", which would be awkward. Also, you're then doing it up-front, rather than handling an exception, which may have interrupted side-effects, and then makes it awkward because how do you modify and resume without repeating work / side-effects?
As to whether it's pretty common error or not, we'd probably need a wider study to see whether this has caused errors -- after all, sometimes cyclic / self-referential structures are correct. So I don't feel that this is an aggressive compromise, though I do agree that doing something to help with this case (in case of cyclic structures being erroneous) would be the "correct" thing to do.
While is_acyclic() seems better to me, a checked mode would be good for one-shot uses of this library, but as this would be pointless overhead on known tree / tree-like structures (what I would expect to use personally) then as a kwarg I'd prefer check_acyclic=False by default... and we'd likely need multiple cycle detection options, since space/time tradeoffs can be important and what's optimal will likely vary depending on the actual structure used. Similar options for is_acyclic() would be needed by the same argument, of course.
My biggest issue with check_acyclic= would be that it's an additional kwarg to almost every single function in the library, which is a bigger change (I feel) than an independent function. Occam's razor / separation-of-concerns and the like means that I feel the compromise of a checked mode outweighs its benefits, whereas adding is_acyclic() only has the compromise of a new, independent function (and the costs associated with implementing it), and otherwise has the same benefits.