Code import `CommandError` #17

mih · 2024-06-06T14:36:42Z

Originally taken from datalad-core. The PR documents how it was modified/gutted.

Closes Import candidate: CommandError #16

TODO

Make https://github.com/datalad/datalad-next/blob/main/datalad_next/patches/commanderror.py obsolete
Align messaging with subprocess

codecov-commenter · 2024-06-06T14:59:26Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.00%. Comparing base (b500c1d) to head (79e499d).
Report is 8 commits behind head on main.

Additional details and impacted files

@@             Coverage Diff              @@
##             main       #17       +/-   ##
============================================
+ Coverage   50.00%   100.00%   +50.00%     
============================================
  Files           2         3        +1     
  Lines           4        78       +74     
  Branches        0        13       +13     
============================================
+ Hits            2        78       +76     
+ Misses          2         0        -2

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Verbatim code import from DataLad 84b2bd574e9edf6a8edf37e22021a0ebbc744e66 (v1.0.1) Refs: #16

Use a plain `assert ... == ...` instead.

The rationale is in the diff. In short, the previous implementation was based on assumptions that do not hold in the general case.

It is coming in as a `str`, and it goes out as a `str`.

In its previous implementation, `CommandError.__str__()` could output large amounts of test, due to the inclusion of provided `stdout` or `stderr` as `str` or `bytes`. This is not compliant with the idea of that function: compute the “informal” or nicely printable string representation of an object This change removes the rendering of captured outputs. A future change may re-add support for it with additional safe-guard on maximum length, or rendering a summary (e.g., in the case of byte-strings) rather than risking the rendering of large blobs of (human-)unreadable output.

While it was not clear from the documentation, this implementation only made sense for JSON-lines encoded output produced by git-annex. Such special casing it better implemented in a dedicated class.

Both forms have been used in the original code.

This is a sever limitation for future API changes for such a core class. Client code can reasonable start claiming the entirely namespace, leaving little room for well-described future additions. It is unclear what use cases a `**kwargs` supports that would not be (better) supported by a dedicated subclass for proper(ly typed) arguments.

This is a more standard name that better communicates what code is referred to. This is also aligned with `subprocess.CalledProcessError` and maps its argument names 1:1 now.

Previously it said "under", which felt weird. Now it clearly states that the reported value is the CWD of the command execution.

This brings code coverage to 100% and reveal suboptimal `str()` rendering: `CommandError:`

It makes little sense to report an exception on a command execution failure without including information on the command that failed. It is safe to assume that we know under all circumstances which command was executed that lead to the failure. If we do not know this, the use of `CommandError` is likely inappropriate

Without this implementation ``CommandError` would use ``RuntimeError``'s variant and ignore all additional structured information except for ``.msg`` -- which is frequently empty and confuses with a `CommandError('')` display. This implementation limits the rendering output by truncating captured ``stdout`` and ``stderr``. For ``str``-type outputs the start and end of the string is shown, with an indication how much text was cut in the middle. For ``bytes`` an indication of the number of bytes is given, only. The handling of byte-strings is different from the one implemented in datalad-core for this purpose. There, it tries to force-decode bytes to unicode, while guessing the encoding, and raising an exception if that is not possible. I consider it inappropriate to raise an exception while handling a ``CommandError`` exception, because any client code will need to deal with this complication. Moreover, it is not clear that such decoding is possible or even meaningful in the general case (the output could well be an actual data byte-stream, with no meaningful human-readable rendering). The existing test for exception rendering has been extended to cover both ``str()`` and ``repr()``. This change obsoletes the ``commanderror`` patch in datalad-next. Refs: https://github.com/datalad/datalad-next/blob/cee9c8e2565555966e2b32c3ffb0eb39a4a7b86a/datalad_next/patches/commanderror.py

…decoding This change aligns the string rendering of `CommandError` with that of `subprocess.CalledProcessError`. `CommandError` reports more information, but for the matching subset style and wording are aligned. This should give users a more familiar behavior. Moreover, this change also adopts the decoding of signal names done in subprocess. For example, translating -6 to SIGABRT. In the case where a signal report carries useful information, this could save someone a search query

In a previous implementation this method used to have parameters, but presently it does not. Nevertheless it represented a public API components without a real purpose. it was removed and the same functionality continues to be available via the standard `__str__()` method.

Typically, the machinery that executes a command and raises a `CommandError` is not aware of why a (failed) command was executed, or whether there are probably causes for a command failure. This change tests and documents a pattern that allows for adding a context message to a `CommandError` instance. Such messages can be used to help users sort out errors.

codeclimate · 2024-06-11T10:55:52Z

Code Climate has analyzed commit 9c72427 and detected 3 issues on this pull request.

Here's the issue category breakdown:

Category	Count
Complexity	3

View more on Code Climate.

This change bases the essential `CommandError` class on the analog class in the new `datasalad` library. This primarily improves the user-facing messaging, and debugging. For example, a `CommandError.__repr__()` no longer returns an empty string, which has hindered investigations in the past. However, most reporting improvements (e.g., alignment with Python's `subprocess` behavior) have been overwritten again, to maintain the original behavior. The primary motivation for this change is that in countless places type comparisions with `CommandError` are made, and until all used `CommandError` implementations are using a common basis migration of other code is blocked. Refs: datalad/datasalad#17, datalad/datalad-next#728

mih force-pushed the cmderro branch 3 times, most recently from bf4041e to f960fae Compare June 6, 2024 14:56

mih force-pushed the cmderro branch 6 times, most recently from ff5694d to e5a3c15 Compare June 7, 2024 08:00

mih added 20 commits June 11, 2024 12:53

chore(codeimport): CommandError implementation from DataLad

f8938fe

Verbatim code import from DataLad 84b2bd574e9edf6a8edf37e22021a0ebbc744e66 (v1.0.1) Refs: #16

test(helpers): discontinue use of nose-age assert_equal

ffb8ea4

Use a plain `assert ... == ...` instead.

rf(CommandError): discontinue command beautification

295a6d4

The rationale is in the diff. In short, the previous implementation was based on assumptions that do not hold in the general case.

rf(CommandError): discontinue encoding change of message

62c1a8d

It is coming in as a `str`, and it goes out as a `str`.

rf(CommandError): remove rendering of JSON-encoded error output

0f41ace

While it was not clear from the documentation, this implementation only made sense for JSON-lines encoded output produced by git-annex. Such special casing it better implemented in a dedicated class.

style(typeannotation): standardize on | None for optional arguments

e8d819d

Both forms have been used in the original code.

rf(CommandError): rename argument code -> returncode

c4dbf82

This is a more standard name that better communicates what code is referred to. This is also aligned with `subprocess.CalledProcessError` and maps its argument names 1:1 now.

rf(CommandError): switch from .format() to f-strings

df8ed6a

rf(CommandError): clarify CWD reporting

ee36263

Previously it said "under", which felt weird. Now it clearly states that the reported value is the CWD of the command execution.

test(CommandError): check equivalence of str() and to_str()

468244a

rf(CommandError): drop unnecessary logging import and logger creation

886b2f8

test(CommandError): test least-information CommandError

2cb8c48

This brings code coverage to 100% and reveal suboptimal `str()` rendering: `CommandError:`

chore: remove dummy code used for initial package setup

2dcbc26

doc(CommandError): an initial write-up of the class documentation

b187ad2

mih force-pushed the cmderro branch from 79e499d to 9c72427 Compare June 11, 2024 10:55

mih merged commit 14736a9 into main Jun 11, 2024
8 checks passed

mih deleted the cmderro branch June 11, 2024 10:59

mih mentioned this pull request Jun 17, 2024

rf(CommandError): now based on datasalad's CommandError datalad/datalad#7618

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Code import `CommandError` #17

Code import `CommandError` #17

mih commented Jun 6, 2024 •

edited

Loading

codecov-commenter commented Jun 6, 2024 •

edited

Loading

codeclimate bot commented Jun 11, 2024

Code import CommandError #17

Code import CommandError #17

Conversation

mih commented Jun 6, 2024 • edited Loading

codecov-commenter commented Jun 6, 2024 • edited Loading

Codecov Report

codeclimate bot commented Jun 11, 2024

Code import `CommandError` #17

Code import `CommandError` #17

mih commented Jun 6, 2024 •

edited

Loading

codecov-commenter commented Jun 6, 2024 •

edited

Loading