A workflow management system for reproducible data analyses
You’ve probably organized your project in a folder structure where each folder contains scripts carrying out specific tasks.
But how do you execute all tasks or keep your project in sync?
Tasks are functions starting with task_. Use decorators to specify the dependencies and products of a task. Using depends_on and produces as function args, you access the paths to the files in the function body.
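To make the pattern concrete, here is a minimal sketch of what such a task could look like. The file names text.txt and word_count.txt are made up for illustration. The decorators mentioned above are left out so that the snippet runs as plain Python without pytask installed; the default arguments only stand in for the paths that pytask would pass in.

```python
from pathlib import Path


# Hypothetical example task: the name starts with "task_", and the two
# arguments carry the paths of the dependency and the product.
# In a real project, the decorators described above would attach these
# paths; the defaults here are only stand-ins for that mechanism.
def task_count_words(
    depends_on=Path("text.txt"),
    produces=Path("word_count.txt"),
):
    """Read the dependency, count its words, and write the product."""
    text = Path(depends_on).read_text()
    n_words = len(text.split())
    Path(produces).write_text(str(n_words))
```

Inside the function body, depends_on and produces are ordinary paths, so the task can be called by hand while you develop it.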
Type pytask in your terminal, and it will automatically collect and execute all tasks.
👉 Automation reduces errors and increases reproducibility.
👉 The build process is documented in code.
👉 You can iterate faster and be more productive.
Is pytask used for actual research? Yes!
Here is a Covid-19 forecast project with an agent-based model, 10+ datasets, many different policy scenarios and 1,000+ simulations.
pytask is also part of a graduate course, teaching economists programming and best practices for research projects.
👉 https://github.com/OpenSourceEconomics/econ-project-templates
Scale your project by repeating tasks! 🚀
For example, create ten different datasets with randomly generated data.
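The idea behind repeating a task can be sketched in plain Python: one function, called once per (path, seed) pair. This is only an illustration of the concept, not pytask's own repetition mechanism, whose syntax is not shown here. The names create_random_data and build_specs, the CSV layout, and the file names data_0.csv to data_9.csv are all hypothetical.

```python
import csv
import random
from pathlib import Path


def create_random_data(produces, seed, n_rows=100):
    """Write n_rows of seeded random draws to a CSV file at `produces`."""
    rng = random.Random(seed)  # a fixed seed makes each dataset reproducible
    with open(produces, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["x", "y"])  # header row
        for _ in range(n_rows):
            writer.writerow([rng.gauss(0, 1), rng.gauss(0, 1)])


def build_specs(directory):
    """Return one (path, seed) pair for each of the ten datasets."""
    return [(Path(directory) / f"data_{i}.csv", i) for i in range(10)]
```

Looping over build_specs(".") and calling create_random_data(produces, seed) for each pair writes the ten files. Because each dataset has its own seed, rerunning the loop reproduces the same data.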
Start a new project from a template!
A minimal template: https://github.com/pytask-dev/cookiecutter-pytask-project
A template for reproducible economics projects: https://github.com/OpenSourceEconomics/econ-project-templates
Enter the debugger if one of your tasks fails and you want to find out why! 🏗️
You can find out more about pytask in the documentation: https://pytask-dev.readthedocs.io/.
Follow the tutorials for a step-by-step introduction: https://pytask-dev.readthedocs.io/en/stable/tutorials/index.html
pytask is also part of a more extensive ecosystem of research tools developed at @open_econ.
We will soon write about tools like estimagic, a package for complex numerical optimization and for the estimation and calibration of scientific models.
Thanks for staying with me until the end! Finally, some shout-outs to amazing people and projects.
Thanks to @kroehrl, @JanosGabler, and @econ_hmg, who helped me build this tool in endless and fruitful discussions! 🙇
pytask stands on the shoulders of these projects. Thank you!🙏