YODA principles #113

Merged: 31 commits, Sep 1, 2019

@adswa adswa commented Aug 19, 2019

I'm adding a placeholder for a section on the YODA principles and organizational best practices for data analyses in DataLad datasets. This could be woven into the narrative by using one of the datasets in here for a "midterm" data analysis project. Afterwards, in conjunction with #111, we could share these results somewhere that is not on the same file system.

@adswa adswa changed the title [WIP] YODA principles YODA principles Aug 26, 2019

adswa commented Aug 26, 2019

First draft is done.

@mih mih self-requested a review August 26, 2019 10:02
The lecturer hands out the requirements: The project needs to

- be prepared in the form of a DataLad dataset
- contain a data analysis performed with Python tools
Collaborator

haven't read in full yet, but this wants to tell me that something in here is Python-specific, and I can ignore it for other projects

Contributor Author

That's true, I get your point. I will try to reduce that feeling, if we keep this.
Can I get your take on the general idea? It was to put the YODA principles into the context of the narrative, which I thought would be easiest in the context of a data analysis. My initial idea was:

However, thinking about this now, it also feels like a lot for a single chapter (YODA, Python API, datalad publish). The alternative would be to have the individual parts in their own chapters or as parts of other chapters, and then combine and apply them in a single section.

I'm still undecided, so if anyone has preferences...

docs/basics/101-123-yoda.rst
The section :ref:`run` already outlined the problem of associating
a result with an input and a script. It can be difficult to link a
figure from your data analysis project with an input data file or a
script, even if you created this figure yourself.
Collaborator

In that sense it is yet another linkage aspect (in addition to P2): provenance capture. Only executing through `datalad run` links the code and the inputs to the outputs. I'm uncertain whether we want to attempt merging P2 and P3, or how to delineate them better. This is not directly relevant to your writing, but it is a possible conceptual weakness of the principles.

In a sense the P2 summary is also a good summary for P3....
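
To make the linkage point concrete, here is a minimal sketch of how a single `datalad run` call names the code, its inputs, and its outputs together. It only assembles the command line and executes nothing. The helper `build_datalad_run` and all file names are hypothetical, chosen for illustration; the `-m`, `--input`, and `--output` flags are the ones `datalad run` provides.

```python
import shlex


def build_datalad_run(cmd, inputs, outputs, message):
    """Assemble a `datalad run` invocation as an argument list.

    Hypothetical helper for illustration only: nothing is executed here.
    """
    args = ["datalad", "run", "-m", message]
    # Each input and output is declared explicitly, so the resulting
    # run record ties them to the command that used or produced them.
    for path in inputs:
        args += ["--input", path]
    for path in outputs:
        args += ["--output", path]
    # The command to execute is passed as one trailing argument.
    args.append(cmd)
    return args


# Assumed, made-up paths following a code/, data/, figures/ layout.
argv = build_datalad_run(
    cmd="python code/plot.py",
    inputs=["data/raw.csv"],
    outputs=["figures/result.png"],
    message="Create result figure",
)
print(shlex.join(argv))
```

Running the figure script directly would produce the same file, but without a record connecting it to the script and the input data, which is the gap described above.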

about containerized computational environments for reproducible data science,
check out `this section <https://the-turing-way.netlify.com/reproducible_environments/06/containers#Containers_section>`_
in the wonderful book `The Turing Way <https://the-turing-way.netlify.com/introduction/introduction>`_,
a comprehensive guide to reproducible data science.
Collaborator

eventually move into the future section

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This handbook will teach you simple and yet advanced principles of data
management for reproducible, comprehensible, transparent, and FAIR data
Collaborator

Link FAIR to their page?

@mih mih left a comment

I like! Works nicely with the principles being outlined before being explained later.

@mih mih merged commit e25a68d into master Sep 1, 2019
@mih mih deleted the yoda branch September 1, 2019 04:34