I think that step just references a set of files (presumably from ftp.ncbi.nih.gov/refseq/H_sapiens/mRNA_Prot/) that are expected to already be available under that subdirectory/glob. The docs could mention this, or include a basic curl/wget command to fetch an example.
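To illustrate the point: in bash, a glob that matches nothing is handed to the program unchanged, which would explain why the literal `human.*.gz` appears in the FileNotFoundError. The sketch below works in a temporary directory with an empty placeholder file. The file name `human.1.rna.fna.gz` is only a stand-in, and the commented-out wget line is an untested assumption about how the mirror could be populated.

```shell
#!/usr/bin/env bash
# Sketch: an unmatched glob is passed through literally (default bash
# behavior, i.e. with neither nullglob nor failglob set).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
dir=mirror/ftp.ncbi.nih.gov/refseq/H_sapiens/mRNA_Prot

# Before any download the directory does not exist, so nothing matches
# and the pattern itself is what the command would receive.
before=$(printf '%s\n' $dir/human.*.gz)

# Assumption (untested): something like this could populate the mirror.
#   wget -r -np -A 'human.*.gz' -P mirror https://ftp.ncbi.nih.gov/refseq/H_sapiens/mRNA_Prot/

# Stand-in for a downloaded file; the real file names may differ.
mkdir -p "$dir"
: > "$dir/human.1.rna.fna.gz"
after=$(printf '%s\n' $dir/human.*.gz)

echo "before download: $before"
echo "after download:  $after"
```

Once at least one file matches, the shell expands the pattern to real paths before `seqrepo load` ever sees it.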
That said, the next command does raise an error for me. Note the path given in the exception: it looks like it forcibly checks under a latest/ subdirectory within the directory given by the --root-directory option.
[ main ⚙ venv] ~/code/seqrepo % seqrepo --root-directory $SEQREPO_ROOT/master show-status
Traceback (most recent call last):
  File "/Users/jamesstevenson/code/seqrepo/venv/bin/seqrepo", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/Users/jamesstevenson/code/seqrepo/src/biocommons/seqrepo/cli.py", line 733, in main
    opts.func(opts)
  File "/Users/jamesstevenson/code/seqrepo/src/biocommons/seqrepo/cli.py", line 580, in show_status
    sr = SeqRepo(seqrepo_dir)
         ^^^^^^^^^^^^^^^^^^^^
  File "/Users/jamesstevenson/code/seqrepo/src/biocommons/seqrepo/seqrepo.py", line 120, in __init__
    raise OSError("Unable to open SeqRepo directory {}".format(self._root_dir))
OSError: Unable to open SeqRepo directory /usr/local/share/seqrepo/master/latest
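The path in the exception is consistent with the CLI joining the root directory with an instance name that defaults to `latest` for read-only commands. A minimal sketch of that behavior, assuming this is how `cli.py` builds the path (the helper `instance_dir` is hypothetical and only mimics the join; it is not part of seqrepo):

```shell
#!/usr/bin/env bash
# Hypothetical helper mimicking how the CLI appears to build the instance
# path: <root-directory>/<instance-name>, instance defaulting to "latest".
instance_dir() {
  local root=$1 instance=${2:-latest}
  printf '%s/%s\n' "${root%/}" "$instance"
}

# Example root, taken from the path shown in the exception.
example_root=/usr/local/share/seqrepo

# The root already ends in "master", so the default instance is appended
# to it, reproducing the path from the traceback.
instance_dir "$example_root/master"    # /usr/local/share/seqrepo/master/latest

# Passing the parent directory and naming the instance explicitly gives
# the directory that was presumably intended.
instance_dir "$example_root" master    # /usr/local/share/seqrepo/master
```

If that assumption holds, passing `$SEQREPO_ROOT` as the root directory and selecting the instance with `-i master` (the same `-i` option used with `fetch-load` in the issue) may avoid the error; whether `show-status` accepts `-i` should be checked against its `--help` output.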
Describe the bug
`docs/cli.rst` has `load` command examples. However, the `load` command example doesn't run correctly.

To Reproduce
Steps to reproduce the behavior:
1. Copy the `load` command in the Loading section:
   seqrepo --root-directory $SEQREPO_ROOT/master load -n NCBI mirror/ftp.ncbi.nih.gov/refseq/H_sapiens/mRNA_Prot/human.*.gz
2. Paste it into a shell terminal.
3. It results in the error:
   FileNotFoundError: [Errno 2] No such file or directory: 'mirror/ftp.ncbi.nih.gov/refseq/H_sapiens/mRNA_Prot/human.*.gz'

I tried trimming off the leading `mirror/` and replacing it with `ftp://`, but neither worked.

Expected behavior
The CLI documentation is a few years old and could use an update. There are additional CLI commands that are not covered, and the short descriptions from `--help` are hard to get started with.

I am particularly interested in loading individual sequences into an existing instance. For example, transcript NM_001387679.1 doesn't seem to be in the latest data pull, 2023-09-28. It would be nice to know how to add it. It looks like `fetch-load` is the likely command, but cli.rst doesn't mention it or a few other commands. I tried a few times, but none of my attempts worked:
    seqrepo fetch-load -i 2023-09-28 -n NCBI NM_001387679.1

Additional context
I am using seqrepo in conjunction with UTA or Cdot for validating variants. On some occasions the transcript ID is annotated in UTA or Cdot but its sequence cannot be retrieved from seqrepo. I am hoping to load those few missing transcripts with the command-line tools, and am looking for similar use cases.
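For the single-transcript case, one possible route (a sketch, not a documented workflow) is to obtain the record as a FASTA file, check that the file holds exactly the accession expected, and then hand it to the `load` subcommand. The helper `fasta_has_only` is hypothetical. The demo file uses a made-up accession and sequence so that no real record is faked. The final `seqrepo load` line is left commented out because the writable instance name (`master` here) and the `-i` option on `load` are assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if FILE contains exactly one FASTA
# record and the ID on its first line (text after ">" up to the first
# whitespace) equals ACCESSION.
fasta_has_only() {
  local file=$1 acc=$2
  [ "$(grep -c '^>' "$file")" -eq 1 ] || return 1
  awk -v acc="$acc" '
    NR == 1 { id = substr($1, 2); exit }
    END     { exit (id == acc) ? 0 : 1 }
  ' "$file"
}

# Made-up record purely for illustration; not a real accession or sequence.
demo=$(mktemp)
printf '>DEMO_0001.1 made-up record for illustration\nACGTACGT\n' > "$demo"

if fasta_has_only "$demo" DEMO_0001.1; then
  echo "ok to load: $demo"
  # Assumption: the writable instance is named "master" and load accepts -i.
  # seqrepo --root-directory "$SEQREPO_ROOT" load -i master -n NCBI "$demo"
fi
```

Whether `fetch-load` can write into a dated snapshot such as 2023-09-28 is not clear from the docs; if the snapshot directory is read-only, a writable instance may be needed for either command.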