Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Code import CommandError #17

Merged
merged 21 commits into from
Jun 11, 2024
Merged

Code import CommandError #17

merged 21 commits into from
Jun 11, 2024

Conversation

mih
Copy link
Member

@mih mih commented Jun 6, 2024

Originally taken from datalad-core. The PR documents how it was modified/gutted.

TODO

@mih mih force-pushed the cmderro branch 3 times, most recently from bf4041e to f960fae Compare June 6, 2024 14:56
@codecov-commenter
Copy link

codecov-commenter commented Jun 6, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.00%. Comparing base (b500c1d) to head (79e499d).
Report is 8 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##             main       #17       +/-   ##
============================================
+ Coverage   50.00%   100.00%   +50.00%     
============================================
  Files           2         3        +1     
  Lines           4        78       +74     
  Branches        0        13       +13     
============================================
+ Hits            2        78       +76     
+ Misses          2         0        -2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@mih mih force-pushed the cmderro branch 6 times, most recently from ff5694d to e5a3c15 Compare June 7, 2024 08:00
mih added 20 commits June 11, 2024 12:53
Verbatim code import from DataLad
84b2bd574e9edf6a8edf37e22021a0ebbc744e66 (v1.0.1)

Refs: #16
Use a plain `assert ... == ...` instead.
The rationale is in the diff. In short, the previous implementation
was based on assumptions that do not hold in the general case.
It is coming in as a `str`, and it goes out as a `str`.
In its previous implementation, `CommandError.__str__()` could output
large amounts of test, due to the inclusion of provided `stdout` or
`stderr` as `str` or `bytes`. This is not compliant with the idea of
that function:

    compute the “informal” or nicely printable string representation
    of an object

This change removes the rendering of captured outputs. A future change
may re-add support for it with additional safe-guard on maximum length,
or rendering a summary (e.g., in the case of byte-strings) rather than
risking the rendering of large blobs of (human-)unreadable output.
While it was not clear from the documentation, this implementation only
made sense for JSON-lines encoded output produced by git-annex. Such
special casing it better implemented in a dedicated class.
Both forms have been used in the original code.
This is a sever limitation for future API changes for such a core class.
Client code can reasonable start claiming the entirely namespace,
leaving little room for well-described future additions.

It is unclear what use cases a `**kwargs` supports that would not be
(better) supported by a dedicated subclass for proper(ly typed)
arguments.
This is a more standard name that better communicates what code is
referred to. This is also aligned with `subprocess.CalledProcessError`
and maps its argument names 1:1 now.
Previously it said "under", which felt weird.

Now it clearly states that the reported value is the CWD of the command
execution.
This brings code coverage to 100% and reveal suboptimal `str()`
rendering: `CommandError:`
It makes little sense to report an exception on a command execution
failure without including information on the command that failed.

It is safe to assume that we know under all circumstances which command
was executed that lead to the failure. If we do not know this, the use
of `CommandError` is likely inappropriate
Without this implementation ``CommandError` would use ``RuntimeError``'s
variant and ignore all additional structured information except for
``.msg`` -- which is frequently empty and confuses with a
`CommandError('')` display.

This implementation limits the rendering output by truncating captured
``stdout`` and ``stderr``. For ``str``-type outputs the start and end
of the string is shown, with an indication how much text was cut in the
middle. For ``bytes`` an indication of the number of bytes is given,
only.

The handling of byte-strings is different from the one implemented in
datalad-core for this purpose. There, it tries to force-decode bytes to
unicode, while guessing the encoding, and raising an exception if that
is not possible. I consider it inappropriate to raise an exception
while handling a ``CommandError`` exception, because any client code
will need to deal with this complication. Moreover, it is not clear
that such decoding is possible or even meaningful in the general case
(the output could well be an actual data byte-stream, with no meaningful
human-readable rendering).

The existing test for exception rendering has been extended to cover
both ``str()`` and ``repr()``.

This change obsoletes the ``commanderror`` patch in datalad-next.

Refs: https://github.com/datalad/datalad-next/blob/cee9c8e2565555966e2b32c3ffb0eb39a4a7b86a/datalad_next/patches/commanderror.py
…decoding

This change aligns the string rendering of `CommandError` with that of
`subprocess.CalledProcessError`.

`CommandError` reports more information, but for the matching subset
style and wording are aligned. This should give users a more familiar
behavior.

Moreover, this change also adopts the decoding of signal names done
in subprocess. For example, translating -6 to SIGABRT. In the case where
a signal report carries useful information, this could save someone a
search query
In a previous implementation this method used to have parameters, but
presently it does not. Nevertheless it represented a public API
components without a real purpose. it was removed and the same
functionality continues to be available via the standard `__str__()`
method.
Typically, the machinery that executes a command and raises a
`CommandError` is not aware of why a (failed) command was executed,
or whether there are probably causes for a command failure.

This change tests and documents a pattern that allows for adding
a context message to a `CommandError` instance.

Such messages can be used to help users sort out errors.
Copy link

codeclimate bot commented Jun 11, 2024

Code Climate has analyzed commit 9c72427 and detected 3 issues on this pull request.

Here's the issue category breakdown:

Category Count
Complexity 3

View more on Code Climate.

@mih mih merged commit 14736a9 into main Jun 11, 2024
8 checks passed
@mih mih deleted the cmderro branch June 11, 2024 10:59
mih added a commit to mih/datalad that referenced this pull request Jun 17, 2024
This change bases the essential `CommandError` class on the analog
class in the new `datasalad` library. This primarily improves
the user-facing messaging, and debugging. For example, a
`CommandError.__repr__()` no longer returns an empty string, which
has hindered investigations in the past.

However, most reporting improvements (e.g., alignment with Python's
`subprocess` behavior) have been overwritten again, to maintain
the original behavior.

The primary motivation for this change is that in countless places
type comparisions with `CommandError` are made, and until all used
`CommandError` implementations are using a common basis migration
of other code is blocked.

Refs: datalad/datasalad#17, datalad/datalad-next#728
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Import candidate: CommandError
2 participants