Luigi: How To Pass Different Arguments To Leaf Tasks?
Solution 1:
So what you're trying to do is create tasks with params without passing these params to the parent class. That is completely understandable, and I have been annoyed at times in trying to handle this.
Firstly, you are using the config class incorrectly. When using a Config class, as noted in https://luigi.readthedocs.io/en/stable/configuration.html#configuration-classes, you need to instantiate the object. So, instead of:
class Task0(Task):
    path = Parameter(default=config.path)
    ...
you would use:
class Task0(Task):
    path = Parameter(default=config().path)
    ...
While this now ensures you are using a value and not a Parameter object, it still does not solve your problem. config().path is evaluated once, when the class Task0 is created, so path is not bound to a reference to config().path but to the value it had at that moment (which will always be defaultpath.txt). When the class is used in the correct manner, Luigi constructs each Task object by setting only the luigi.Parameter attributes of the class as attributes on the new instance, as seen here: https://github.com/spotify/luigi/blob/master/luigi/task.py#L436
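The evaluation-order point can be shown without Luigi at all. The sketch below uses plain-Python stand-ins (FakeConfig and FakeParameter are hypothetical names, not Luigi classes) only to show that a default written in a class body is captured when the class is created, so changing the config afterwards has no effect:

```python
class FakeConfig:
    # Stand-in for the config class; NOT luigi.Config.
    path = "defaultpath.txt"


class FakeParameter:
    # Stand-in for luigi.Parameter; it only stores the default it was given.
    def __init__(self, default):
        self.default = default


class Task0:
    # This expression runs once, while the class body is being executed.
    path = FakeParameter(default=FakeConfig.path)


# Changing the config afterwards does not reach the already-stored default.
FakeConfig.path = "newpath_1"
print(Task0.path.default)  # defaultpath.txt
```

The same reasoning applies to Parameter(default=config().path): the default is a snapshot of the value, not a live reference to it.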
So, I see a few possible paths forward.
1.) The first is to set the config path at runtime like you had, except set it to be a Parameter object like this:
config.path = luigi.Parameter(f"newpath_{i}")
However, it would take a lot of work to get your tasks that use config.path working, as they would now need to take in their parameters differently (the defaults can't be evaluated when the class is created).
2.) The much easier way is to simply specify the arguments for your classes in the config file. If you look at https://github.com/spotify/luigi/blob/master/luigi/task.py#L825, you'll see that the Config class in Luigi is actually just a Task class, so you can do anything with it that you could do with a Task, and vice versa. Therefore, you could just have this in your config file:
[Task0]
path = newpath_1
...
3.) But since you seem to want to run multiple tasks with different arguments for each, I would just recommend passing the args in through the parents, as Luigi encourages you to do. Then you could run everything with:
luigi.build([TaskC(arg=i) for i in range(3)])
4.) Finally, if you really need to get rid of passing dependencies, you can create a ParameterizedTaskParameter that extends luigi.ObjectParameter and uses the pickle of a task instance as the object.
Of the above solutions, I highly suggest either 2 or 3. Option 1 would be difficult to program around, and option 4 would create some very ugly parameters and is a bit more advanced.
Edit: Solutions 1 and 2 are more hacks than anything, and the recommended approach is simply to bundle the parameters in a DictParameter.