New source: BiGG Models Metabolite Database #124

Open
wants to merge 14 commits into base: main
63 changes: 63 additions & 0 deletions src/pyobo/sources/bigg.py
@@ -0,0 +1,63 @@
# -*- coding: utf-8 -*-

"""Converter for bigg."""
Review comment (Member): fix the stylization: "BiGG", not "bigg".


from typing import Iterable, Optional

import bioversions

from pyobo.struct import Obo, Reference, SynonymTypeDef, Term, TypeDef

from ..utils.path import ensure_df

HEADER = ["bigg_id", "universal_bigg_id", "name", "model_list", "database_links", "old_bigg_ids"]
PREFIX = "bigg.metabolite"

URL = "http://bigg.ucsd.edu/static/namespace/bigg_models_metabolites.txt"

alias_type = SynonymTypeDef(id="alias", name="alias")
has_role = TypeDef(reference=Reference(prefix="bigg", identifier="has_role"))


def get_obo(force: bool = False) -> Obo:
    """Get bigg as OBO."""
Review comment (Member): stylization

    version = bioversions.get_version("bigg")
    # version = '1.2'
Review comment (Member): remove dead code

    return Obo(
        ontology=PREFIX,
        name="bigg models metabolites database",
Review comment (Member): use title case and the "BiGG" stylization.

        iter_terms=get_terms,
        iter_terms_kwargs=dict(force=force, version=version),
        typedefs=[has_role],
Review comment (Member): why is this included?

        synonym_typedefs=[alias_type],
        auto_generated_by=f"bio2obo:{PREFIX}",
        data_version=version,
    )


def get_terms(force: bool = False, version: Optional[str] = None) -> Iterable[Term]:
Review comment (Member): check CI; this function needs a docstring.

    bigg_df = ensure_df(
        prefix=PREFIX,
        url=URL,
        sep="\t",
        skiprows=18,
        header=None,
        names=HEADER,
        usecols=['bigg_id', 'name'],
Review comment (Member): what about the other columns? database_links, for example, has lots of information worth parsing out, and old_bigg_ids can go in the alternative identifier field of the pyobo.Term.
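To make that suggestion concrete, here is a rough sketch of how the two extra columns might be split up before they are attached to a term. It rests on assumptions that should be checked against the actual file: that a `database_links` cell holds entries of the form `Name: URL` separated by `; `, and that `old_bigg_ids` is a `;`-separated list. The helper names `parse_database_links` and `parse_old_ids` are hypothetical and not part of pyobo.

```python
from typing import List, Tuple


def parse_database_links(cell: object) -> List[Tuple[str, str]]:
    """Split a database_links cell into (database name, URL) pairs.

    Assumption: entries look like "Name: URL" and are separated by "; ".
    """
    # Empty cells come back from pandas as float NaN, so skip non-strings.
    if not isinstance(cell, str):
        return []
    pairs = []
    for entry in cell.split("; "):
        entry = entry.strip()
        if not entry:
            continue
        # Split on the first ": " only, because the URL contains colons too.
        name, sep, url = entry.partition(": ")
        if sep:
            pairs.append((name.strip(), url.strip()))
    return pairs


def parse_old_ids(cell: object) -> List[str]:
    """Split an old_bigg_ids cell into identifiers (assumed ";"-separated)."""
    if not isinstance(cell, str):
        return []
    return [part.strip() for part in cell.split(";") if part.strip()]
```

The pairs from `parse_database_links` could then be mapped to cross-references and the output of `parse_old_ids` to alternative identifiers, once `usecols` is widened to include both columns.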

        force=force,
        version=version,
    )

    for v in bigg_df.values:
Review comment (Member): `v` is a bad variable name. It is better to use tuple unpacking.

Suggested change:
-    for v in bigg_df.values:
+    for bigg_id, name in bigg_df.values:
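A small sketch of the unpacking pattern, using a plain list as a stand-in for `bigg_df.values` so it runs without pandas. The helper name `iter_id_name` and the sample rows are illustrative only. Note that unpacking into two names requires every row to have exactly two items, which holds here only because `usecols` selects two columns.

```python
from typing import Iterable, Iterator, Sequence, Tuple


def iter_id_name(rows: Iterable[Sequence[str]]) -> Iterator[Tuple[str, str]]:
    """Yield (bigg_id, name) pairs from two-column rows."""
    # Unpacking names each field directly instead of indexing v[0], v[1].
    # A row with any other number of items raises ValueError.
    for bigg_id, name in rows:
        yield bigg_id, name


# Illustrative rows standing in for bigg_df.values
rows = [["h2o_c", "H2O"], ["atp_c", "ATP"]]
pairs = list(iter_id_name(rows))
```

If more columns are parsed later, the unpacking target has to grow with them.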

        bigg_id = v[0]
        name = v[1]
        synonyms = []
Review comment (Member): why include this blank list?

        term = Term(
            reference=Reference(prefix=PREFIX, identifier=bigg_id, name=name),
            synonyms=synonyms,
        )
        yield term


if __name__ == "__main__":
    get_obo(force=True).cli()