Usage

API for assembling biomedical lexica.

class Configuration(*, inputs: List[Input], excludes: List[str] | None = None)

A configuration for construction of a lexicon.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

model_computed_fields: ClassVar[dict[str, ComputedFieldInfo]] = {}: A dictionary of computed field names and their corresponding ComputedFieldInfo objects.

model_config: ClassVar[ConfigDict] = {}: Configuration for the model; this should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[dict[str, FieldInfo]] = {'excludes': FieldInfo(annotation=Union[List[str], NoneType], required=False, default=None, description='A list of CURIEs to exclude after processing is complete'), 'inputs': FieldInfo(annotation=List[biolexica.api.Input], required=True)}

Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.
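A minimal sketch of building a Configuration from plain data. It assumes the class is importable from `biolexica.api`, as the `biolexica.api.Input` annotation above suggests. The prefixes and CURIEs are placeholders, and `build_configuration` is a hypothetical helper, not part of the library.

```python
from typing import Any, Dict

# Illustrative values only: the prefixes and CURIEs below are placeholders,
# not a recommended lexicon definition.
CONFIG_DATA: Dict[str, Any] = {
    "inputs": [
        {"processor": "pyobo", "source": "mesh", "ancestors": ["mesh:D007239"]},
        {"processor": "bioontologies", "source": "doid"},
    ],
    "excludes": ["doid:4"],
}


def build_configuration(data: Dict[str, Any]):
    """Validate ``data`` into a Configuration (requires biolexica to be installed)."""
    # Imported lazily so the data above can be inspected without the dependency.
    # Assumption: Configuration lives in biolexica.api next to Input.
    from biolexica.api import Configuration

    # model_validate is the standard Pydantic V2 entry point; it raises
    # pydantic_core.ValidationError when a field is missing or mistyped.
    return Configuration.model_validate(data)
```

Keeping the definition as plain data makes it easy to store in JSON or YAML and validate only at the point of use.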

class Input(*, processor: Literal['pyobo', 'bioontologies', 'biosynonyms', 'gilda'], source: str, ancestors: None | str | List[str] = None, kwargs: Dict[str, Any] | None = None)

An input to lexicon assembly.

model_computed_fields: ClassVar[dict[str, ComputedFieldInfo]] = {}: A dictionary of computed field names and their corresponding ComputedFieldInfo objects.

model_config: ClassVar[ConfigDict] = {}: Configuration for the model; this should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[dict[str, FieldInfo]] = {'ancestors': FieldInfo(annotation=Union[NoneType, str, List[str]], required=False, default=None), 'kwargs': FieldInfo(annotation=Union[Dict[str, Any], NoneType], required=False, default=None), 'processor': FieldInfo(annotation=Literal['pyobo', 'bioontologies', 'biosynonyms', 'gilda'], required=True), 'source': FieldInfo(annotation=str, required=True)}

Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.
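The `ancestors` field accepts `None`, a single string, or a list of strings. Code that consumes an Input usually wants one shape. The helper below is a hypothetical illustration of collapsing the three forms into a list; it does not claim to reproduce how the library handles the field internally.

```python
from typing import List, Union


def normalize_ancestors(ancestors: Union[None, str, List[str]]) -> List[str]:
    """Return the ancestors as a list, whichever of the three forms was given."""
    if ancestors is None:
        return []
    if isinstance(ancestors, str):
        # A bare string is one ancestor, not an iterable of characters.
        return [ancestors]
    return list(ancestors)
```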

assemble_terms(configuration: Configuration, mappings: List[semra.Mapping] | None = None, *, extra_terms: List[gilda.Term] | None = None, include_biosynonyms: bool = True, raw_path: Path | None = None, processed_path: Path | None = None) → List[Term]: Assemble terms from multiple resources.
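A sketch of calling assemble_terms with both optional paths supplied, assuming the function and Configuration are importable from `biolexica.api`. The file names and both helpers are hypothetical; the signature above only says that the paths are optional and keyword-only.

```python
from pathlib import Path
from typing import Any, Dict


def cache_paths(directory: Path) -> Dict[str, Path]:
    """Build the two optional path keyword arguments for assemble_terms."""
    # Hypothetical file names; the documentation does not prescribe any.
    return {
        "raw_path": directory / "raw_terms.tsv.gz",
        "processed_path": directory / "terms.tsv.gz",
    }


def assemble_to_directory(data: Dict[str, Any], directory: Path):
    """Validate ``data`` and assemble terms, passing paths under ``directory``."""
    # Lazy import: requires biolexica to be installed.
    from biolexica.api import Configuration, assemble_terms

    configuration = Configuration.model_validate(data)
    directory.mkdir(parents=True, exist_ok=True)
    # Everything after ``mappings`` is keyword-only in the documented signature.
    return assemble_terms(configuration, **cache_paths(directory))
```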

iter_terms_by_prefix(prefix: str, *, ancestors: None | str | List[str] = None, processor: Literal['pyobo', 'bioontologies', 'biosynonyms', 'gilda'], **kwargs) → Iterable[Term]: Iterate over all terms from a given prefix.
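Because the function returns an iterable, a quick preview only needs the first few items. The sketch below assumes `iter_terms_by_prefix` is importable from `biolexica.api`; `take` and `preview_terms` are hypothetical helpers.

```python
from itertools import islice
from typing import Iterable, List, TypeVar

T = TypeVar("T")


def take(iterable: Iterable[T], n: int) -> List[T]:
    """Return at most the first ``n`` items of ``iterable``."""
    return list(islice(iterable, n))


def preview_terms(prefix: str, n: int = 5):
    """Return the first ``n`` terms for ``prefix`` using the pyobo processor."""
    # Lazy import: requires biolexica to be installed.
    from biolexica.api import iter_terms_by_prefix

    # ``processor`` is keyword-only and has no default in the documented signature.
    return take(iter_terms_by_prefix(prefix, processor="pyobo"), n)
```

If the returned iterable is lazy, this avoids materializing every term for large resources.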

load_grounder(grounder: Grounder | str | Path) → Grounder: Load a gilda grounder, potentially from a remote location.

get_mesh_category_curies(letter, skip=None) → List[str]: Get the MeSH LUIDs for a category, by letter (e.g., “A”).

class Annotation(*, text: str, start: int, end: int, match: Match)

Data about an annotation.

property reference: Reference: Get the match’s reference.

property name: str: Get the match’s entry name.

property curie: str: Get the match’s CURIE.

property score: float: Get the match’s score.

property substr: str: Get the substring that was matched.

model_computed_fields: ClassVar[dict[str, ComputedFieldInfo]] = {}: A dictionary of computed field names and their corresponding ComputedFieldInfo objects.

model_config: ClassVar[ConfigDict] = {}: Configuration for the model; this should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[dict[str, FieldInfo]] = {'end': FieldInfo(annotation=int, required=True), 'match': FieldInfo(annotation=Match, required=True), 'start': FieldInfo(annotation=int, required=True), 'text': FieldInfo(annotation=str, required=True)}

Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.
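The `start` and `end` fields locate the match inside `text`. The sketch below shows how they would relate to the matched substring, assuming `start` is inclusive and `end` is exclusive, as in a Python slice; the documentation does not state this explicitly. A stand-in object with the same field names is used instead of the real Annotation class.

```python
from types import SimpleNamespace


def matched_substring(annotation) -> str:
    """Slice the annotated text using the annotation's offsets."""
    # Assumption: start is inclusive and end is exclusive, as in Python slices.
    return annotation.text[annotation.start:annotation.end]


# Stand-in with the same field names as Annotation (not the real class).
example = SimpleNamespace(text="TP53 regulates apoptosis", start=0, end=4)
print(matched_substring(example))  # prints "TP53"
```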

class Match(*, reference: Reference, name: str, score: float)

Model a scored match from Gilda.

property curie: str: Get the reference’s CURIE.

classmethod from_gilda(scored_match: ScoredMatch): Construct a match from a Gilda object.

model_computed_fields: ClassVar[dict[str, ComputedFieldInfo]] = {}: A dictionary of computed field names and their corresponding ComputedFieldInfo objects.

model_config: ClassVar[ConfigDict] = {}: Configuration for the model; this should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[dict[str, FieldInfo]] = {'name': FieldInfo(annotation=str, required=True), 'reference': FieldInfo(annotation=Reference, required=True), 'score': FieldInfo(annotation=float, required=True)}

Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.

class Grounder

Wrap a Gilda grounder with additional functionality.

get_matches(s: str, context: str | None = None, organisms: List[str] | None = None, namespaces: List[str] | None = None) → List[Match]: Get matches in Biolexica’s format.

get_best_match(s: str, context: str | None = None, organisms: List[str] | None = None, namespaces: List[str] | None = None) → Match | None: Get the best match in Biolexica’s format.

annotate(text: str, **kwargs: Any) → List[Annotation]: Annotate the text.
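A sketch of an end-to-end annotation step: load a grounder, annotate a text, and reduce the results to plain tuples. It assumes `load_grounder` is importable from `biolexica.api`; `summarize` and `annotate_with_lexicon` are hypothetical helpers, and the demonstration uses stand-in objects that only mimic the documented Annotation attributes.

```python
from types import SimpleNamespace
from typing import Any, Iterable, List, Tuple


def summarize(annotations: Iterable[Any]) -> List[Tuple[int, int, str, float]]:
    """Reduce annotations to (start, end, curie, score) tuples, best score first."""
    rows = [(a.start, a.end, a.curie, a.score) for a in annotations]
    return sorted(rows, key=lambda row: row[3], reverse=True)


def annotate_with_lexicon(lexicon: str, text: str):
    """Load a grounder from ``lexicon`` and summarize its annotations of ``text``."""
    # Lazy import: requires biolexica to be installed.
    from biolexica.api import load_grounder

    grounder = load_grounder(lexicon)
    return summarize(grounder.annotate(text))


# Stand-ins exposing the documented Annotation attributes (not the real class).
demo = [
    SimpleNamespace(start=0, end=4, curie="ex:1", score=0.5),
    SimpleNamespace(start=5, end=9, curie="ex:2", score=0.9),
]
print(summarize(demo))
```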