-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ct_image_reader #79
ct_image_reader #79
Conversation
Codecov Report
@@ Coverage Diff @@
## master #79 +/- ##
==========================================
+ Coverage 87.89% 88.66% +0.77%
==========================================
Files 14 16 +2
Lines 735 803 +68
==========================================
+ Hits 646 712 +66
- Misses 89 91 +2
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
Hi @Piyush13y and @hunterhector I am facing this new error with already merged processors. E AttributeError: module 'numpy' has no attribute 'object' is there something we can do ? |
Could you try to narrow down the error? Most likely there are some new environment upgrades that cause this. I cannot easily figure it out just looking at the error. |
Run coverage run -m pytest tests/fortex/health/processors/scispacy_processor_test.py ==================================== ERRORS ==================================== /opt/hostedtoolcache/Python/3.8.15/x64/lib/python3.8/site-packages/tensorflow/python/pywrap_tensorflow_internal.py:15 /opt/hostedtoolcache/Python/3.8.15/x64/lib/python3.8/site-packages/tensorflow/python/framework/dtypes.py:223 /opt/hostedtoolcache/Python/3.8.15/x64/lib/python3.8/site-packages/tensorflow/python/framework/dtypes.py:513 -- Docs: https://docs.pytest.org/en/latest/warnings.html |
The same error with xray processor test case |
After a quick glance it looks like it is the spacy and numpy have some version conflicts. I would recommend recreating this issue locally and find out the right version combination. To be honest, we haven't upgraded some library versions for a long time. We should spend a sprint on that alone. |
also got it with x-ray processor test too. it was added last month. Run coverage run -m pytest tests/fortex/health/processors/xray_processor_test.py ==================================== ERRORS ==================================== The above exception was the direct cause of the following exception: /opt/hostedtoolcache/Python/3.8.15/x64/lib/python3.8/site-packages/tensorflow/python/framework/dtypes.py:223 /opt/hostedtoolcache/Python/3.8.15/x64/lib/python3.8/site-packages/tensorflow/python/framework/dtypes.py:513 -- Docs: https://docs.pytest.org/en/latest/warnings.html |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my comments inline.
An additional note on the binary files. These binary example files are very useful but they will make the master branch larger and larger. Could you add them in another PR using the following trick:
https://gist.github.com/mcroach/811c8308f4bd78570918844258841942
Basically, you can create an asset branch (you can use another name, say data
branch) that holds the content, and download them when needed. This way we can keep the master branch lean.
.github/workflows/main.yml
Outdated
- { testfile: tests/fortex/health/readers/mimic3_note_reader_test.py } | ||
- { testfile: tests/fortex/health/readers/xray_image_reader_test.py } | ||
- { testfile: tests/fortex/health/processors/xray_processor_test.py } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's locate these errors locally and try to figure them out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
resolved with numpy version downgrade
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hunterhector I don't have access to create data branch in Forte repo. Either you can create a branch or give me the necessary access.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have created a complete empty data branch: https://github.com/asyml/forte/tree/data
|
||
For more information for the dataset, visit: | ||
https://data.mendeley.com/datasets/jctsfj2sfn/1 | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
additional spaces
|
||
|
||
class CTimageReader(PackReader): | ||
r""":class:`CTimageReader` is designed to read CT image files from a given folder.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should describe this more clearly, what types of data, what's the folder structure?
|
||
def __init__(self): | ||
super().__init__() | ||
self.pydicom = pydicom |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need a self.pydicom
? Can we simply us pydicom
later?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have removed the self.pydicom and used pydicom directly.
resolved
super().__init__() | ||
self.pydicom = pydicom | ||
|
||
def _collect(self, image_directory) -> Iterator[Any]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
iterator type is too broad.
img = self.pydicom.dcmread( | ||
file_path, **(self.configs.read_kwargs or {}) | ||
) | ||
img.PhotometricInterpretation = "YBR_FULL" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should document these choices (why YBR_FULL)
img.PhotometricInterpretation = "YBR_FULL" | ||
pixel_data = img.pixel_array | ||
pack.add_image(image=pixel_data) | ||
pack.pack_name = file_path |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pack_name
is better to be a non-path string (not containing /
)
setup.py
Outdated
@@ -31,6 +31,7 @@ | |||
"transformers==4.18.0", | |||
"protobuf==3.19.4", | |||
"Pillow==8.4.0", | |||
"pydicom==2.3.1", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This shouldn't be an entry in test
, it should have its own keyword in extras_require
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
def test_reader(self): | ||
for pack in self.pl.process_dataset(self.orig_image_pth): | ||
self.assertTrue( | ||
pack.pack_name.split("/")[-1] in self.expected_image_path |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pack_name
isn't the place for storing paths. If we want to store path, we have discussed to use Payload
cf68e4d
to
3963a7b
Compare
This PR fixes #78 .
Description of changes
This a new CT image reader implementation using pydicom lib.
Possible influences of this PR.
The sample data is a bit huge, even for one instance.
Test Conducted
the test case verifies the existence of the image path from the reader in the source folder.