Manager

Manager for Bio2BEL Entrez.

class bio2bel_entrez.manager.Manager(*args, **kwargs)[source]

Genes and orthologies.

namespace_model

alias of bio2bel_entrez.models.Gene

is_populated() → bool[source]

Check if the database is already populated.

get_or_create_species(taxonomy_id: str, **kwargs) → bio2bel_entrez.models.Species[source]

Get or create a Species model.

Parameters

taxonomy_id – NCBI taxonomy identifier

get_gene_by_entrez_id(entrez_id: str) → Optional[bio2bel_entrez.models.Gene][source]

Get a gene with the given Entrez Gene identifier, if it exists.

Parameters

entrez_id – Entrez Gene identifier
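Because the return type is Optional, callers need to handle the case where no gene matches. The following is a minimal sketch, assuming `manager` is an already constructed `Manager`. The helper name `require_gene` is hypothetical and not part of the library.

```python
from typing import Any


def require_gene(manager: Any, entrez_id: str) -> Any:
    """Return the gene for ``entrez_id`` or raise if it is not in the database.

    ``require_gene`` is a hypothetical helper, not part of bio2bel_entrez.
    """
    gene = manager.get_gene_by_entrez_id(entrez_id)
    if gene is None:
        raise KeyError(f'No gene with Entrez Gene identifier {entrez_id!r}')
    return gene
```

Raising early keeps a missing identifier from surfacing later as an attribute error on `None`.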

get_genes_by_name(name: str) → List[bio2bel_entrez.models.Gene][source]

Get a list of genes with the given name (case insensitive).

Parameters

name – A gene name

get_gene_by_rgd_name(name: str) → Optional[bio2bel_entrez.models.Gene][source]

Get a gene by its RGD (Rat Genome Database) gene symbol.

Parameters

name – RGD gene symbol

get_gene_by_mgi_name(name: str) → Optional[bio2bel_entrez.models.Gene][source]

Get a gene by its MGI (Mouse Genome Informatics) gene symbol.

Parameters

name – MGI gene symbol

get_gene_by_hgnc_name(name: str) → Optional[bio2bel_entrez.models.Gene][source]

Get a gene by its HGNC gene symbol.

Parameters

name – HGNC gene symbol
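The three symbol lookups share one shape, so a caller that knows which nomenclature a symbol comes from can dispatch on it. This is a sketch under assumptions: `get_gene_by_symbol` and its lowercase namespace keys are hypothetical, and only the three `get_gene_by_*_name` calls come from the documented API.

```python
from typing import Any, Optional


def get_gene_by_symbol(manager: Any, namespace: str, name: str) -> Optional[Any]:
    """Dispatch a symbol lookup to the matching ``Manager`` method.

    Hypothetical helper; the namespace keys below are an assumption.
    """
    lookups = {
        'hgnc': manager.get_gene_by_hgnc_name,
        'mgi': manager.get_gene_by_mgi_name,
        'rgd': manager.get_gene_by_rgd_name,
    }
    lookup = lookups.get(namespace.lower())
    if lookup is None:
        return None  # unsupported namespace
    return lookup(name)
```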

get_or_create_gene(entrez_id: str, **kwargs) → bio2bel_entrez.models.Gene[source]

Get or create a Gene model.

Parameters

entrez_id – Entrez Gene identifier

get_or_create_homologene(homologene_id: str, **kwargs) → bio2bel_entrez.models.Homologene[source]

Get or create a HomoloGene model.

Parameters

homologene_id – HomoloGene identifier
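The get_or_create_* methods follow the common get-or-create pattern: return the existing row for an identifier, or build and store a new one. The sketch below illustrates the pattern with an in-memory dictionary; it is not the library's implementation. The class `InMemoryStore` is hypothetical, and ignoring the keyword arguments for an existing row is an assumption.

```python
from typing import Any, Dict


class InMemoryStore:
    """Stand-in for a database table keyed by an identifier (illustration only)."""

    def __init__(self) -> None:
        self._rows: Dict[str, Dict[str, Any]] = {}

    def get_or_create(self, identifier: str, **kwargs: Any) -> Dict[str, Any]:
        """Return the row for ``identifier``, creating it on first use."""
        row = self._rows.get(identifier)
        if row is None:
            # Assumption: extra keyword arguments only apply to new rows.
            row = {'id': identifier, **kwargs}
            self._rows[identifier] = row
        return row
```

Calling `get_or_create` twice with the same identifier returns the same object, which keeps repeated inserts from creating duplicates.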

populate_homologene(url=None, cache=True, force_download=False, tax_id_filter=None) → None[source]

Populate the database with HomoloGene data.

Parameters
  • url (Optional[str]) – HomoloGene data URL

  • cache (bool) – If true, the data is downloaded to the file system and loaded from there; otherwise it is loaded directly from the internet

  • force_download (bool) – If true, overwrites a previously cached file

  • tax_id_filter (Optional[Iterable[str]]) – Species to keep

populate_gene_info(url: Optional[str] = None, cache: bool = True, force_download: bool = False, interval: Optional[int] = None, tax_id_filter: Iterable[str] = None)[source]

Populate the database with gene information.

Parameters
  • url – A custom url to download

  • interval – The number of records to commit at a time

  • cache – If true, the data is downloaded to the file system, else it is loaded from the internet

  • force_download – If true, overwrites a previously cached file

  • tax_id_filter – Species to keep
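To make the `tax_id_filter` and `interval` parameters concrete, the following sketch filters records by taxonomy identifier and groups the survivors into batches of `interval` records. None of it is taken from the library's source. The `Record` shape and both function names are hypothetical, and treating `None` as "keep everything" is an assumption.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Optional, Tuple

# Hypothetical record shape: (taxonomy identifier, Entrez Gene identifier).
Record = Tuple[str, str]


def filter_by_taxonomy(
    records: Iterable[Record],
    tax_id_filter: Optional[Iterable[str]],
) -> Iterator[Record]:
    """Yield only records whose taxonomy identifier is in the filter."""
    if tax_id_filter is None:
        # Assumption: no filter means every species is kept.
        yield from records
        return
    keep = set(tax_id_filter)
    for record in records:
        if record[0] in keep:
            yield record


def batches(records: Iterable[Record], interval: int) -> Iterator[List[Record]]:
    """Group records into lists of at most ``interval`` items."""
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, interval))
        if not batch:
            return
        yield batch
```

Each yielded batch corresponds to one point at which a loader could commit its session, which bounds how many pending rows are held in memory.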

populate(gene_info_url: Optional[str] = None, interval: Optional[int] = None, tax_id_filter: Iterable[str] = ('9606', '10090', '10116', '7227', '4932', '6239', '7955', '9913', '9615'), homologene_url: Optional[str] = None)[source]

Populate the database with gene information and HomoloGene data.

Parameters
  • gene_info_url – A custom URL for the gene information download

  • interval – The number of records to commit at a time

  • tax_id_filter – Species to keep. Defaults to 9606 (human), 10090 (mouse), 10116 (rat), 7227 (fly), 4932 (yeast), 6239 (nematode), 7955 (zebrafish), 9913 (cow), and 9615 (dog). Explicitly set to None to get all taxonomies.

  • homologene_url – A custom URL for the HomoloGene download
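A typical use of `populate` is to fill the database once and skip the work on later runs. This is a minimal sketch, assuming `manager` is an already constructed `Manager`. The helper `ensure_populated` and the constant `HUMAN_MOUSE_RAT` are hypothetical; only `is_populated()` and `populate(tax_id_filter=...)` come from the documented API.

```python
from typing import Any, Iterable, Optional

# Human, mouse, and rat, taken from the documented default filter.
HUMAN_MOUSE_RAT = ('9606', '10090', '10116')


def ensure_populated(
    manager: Any,
    tax_id_filter: Optional[Iterable[str]] = HUMAN_MOUSE_RAT,
) -> bool:
    """Populate the database only if it is empty.

    Returns True if ``populate`` was called, False if it was skipped.
    """
    if manager.is_populated():
        return False
    manager.populate(tax_id_filter=tax_id_filter)
    return True
```

Restricting the filter to a few species keeps the initial load smaller than the nine-species default.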

lookup_node(node: pybel.dsl.node_classes.BaseEntity) → Optional[bio2bel_entrez.models.Gene][source]

Look up the gene corresponding to a PyBEL node, if one can be found.

iter_genes(graph: pybel.struct.graph.BELGraph, use_tqdm: bool = False) → Iterable[Tuple[pybel.dsl.node_classes.BaseEntity, bio2bel_entrez.models.Gene]][source]

Iterate over genes in the graph that can be mapped to an Entrez gene.
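Since `iter_genes` yields (node, gene) pairs, the result can be collected into a lookup table. This sketch assumes `manager` is a `Manager` and `graph` is a `BELGraph`; the helper name `map_nodes_to_genes` is hypothetical.

```python
from typing import Any, Dict


def map_nodes_to_genes(manager: Any, graph: Any) -> Dict[Any, Any]:
    """Collect the (node, gene) pairs yielded by ``iter_genes`` into a dict.

    Hypothetical helper; nodes that cannot be mapped are simply absent.
    """
    return {node: gene for node, gene in manager.iter_genes(graph)}
```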

normalize_genes(graph: pybel.struct.graph.BELGraph, use_tqdm: bool = False) → None[source]

Add identifiers to all Entrez genes.

enrich_genes_with_homologenes(graph: pybel.struct.graph.BELGraph) → None[source]

Enrich the nodes in a graph with their HomoloGene parents.

enrich_equivalences(graph: pybel.struct.graph.BELGraph) → None[source]

Add equivalent node information.

enrich_orthologies(graph: pybel.struct.graph.BELGraph) → None[source]

Add orthology relationships to the graph.
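The normalization and enrichment methods all return None and modify the graph in place, so they can be chained. The order below is an assumption rather than a documented requirement, and the helper name `annotate_graph` is hypothetical.

```python
from typing import Any


def annotate_graph(manager: Any, graph: Any) -> None:
    """Apply the documented normalization and enrichment steps in place.

    Hypothetical helper; the call order is an assumption.
    """
    manager.normalize_genes(graph)
    manager.enrich_genes_with_homologenes(graph)
    manager.enrich_equivalences(graph)
    manager.enrich_orthologies(graph)
```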

add_homologene_namespace_to_graph(graph: pybel.struct.graph.BELGraph) → pybel.manager.models.Namespace[source]

Add the HomoloGene namespace to the graph.

count_genes() → int[source]

Count the genes in the database.

count_homologenes() → int[source]

Count the HomoloGenes in the database.

count_species() → int[source]

Count the species in the database.

list_species() → List[bio2bel_entrez.models.Species][source]

List all species in the database.

list_homologenes() → List[bio2bel_entrez.models.Homologene][source]

List all HomoloGenes in the database.

summarize() → Dict[str, int][source]

Return a summary dictionary over the content of the database.
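The summary dictionary maps names to counts, so it is easy to render as a small report. This is a sketch only: `format_summary` is a hypothetical helper, and the keys that `summarize()` actually returns are not listed here, so the function makes no assumption about them.

```python
from typing import Dict


def format_summary(summary: Dict[str, int]) -> str:
    """Render a ``Dict[str, int]`` as aligned ``name  count`` lines, sorted by name."""
    width = max((len(key) for key in summary), default=0)
    return '\n'.join(
        f'{key:<{width}}  {count:>10,}'
        for key, count in sorted(summary.items())
    )
```

With a `Manager` instance, `print(format_summary(manager.summarize()))` would print one line per entry.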

list_genes(limit: Optional[int] = None, offset: Optional[int] = None) → List[bio2bel_entrez.models.Gene][source]

List genes in the database.
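The `limit` and `offset` parameters allow the gene table to be read one page at a time instead of loading every gene at once. This is a minimal sketch, assuming `manager` is a `Manager` and that an empty list signals the end of the table. The helper `iter_gene_pages` and its default page size are hypothetical.

```python
from typing import Any, Iterator, List


def iter_gene_pages(manager: Any, page_size: int = 1000) -> Iterator[List[Any]]:
    """Yield successive pages of genes using ``limit`` and ``offset``.

    Hypothetical helper; stops at the first empty page.
    """
    offset = 0
    while True:
        page = manager.list_genes(limit=page_size, offset=offset)
        if not page:
            return
        yield page
        offset += len(page)
```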