forayer.knowledge_graph.KG
- class forayer.knowledge_graph.KG(entities: Dict[Any, Dict[Any, Any]], rel: Optional[Dict[Any, Dict[Any, Any]]] = None, name: Optional[str] = None)
KG class holding entities and their attributes and relations between entities.
- __init__(entities: Dict[Any, Dict[Any, Any]], rel: Optional[Dict[Any, Dict[Any, Any]]] = None, name: Optional[str] = None)
Initialize a KG object.
- entities: Dict[Any, Dict[Any, Any]]
Entity information with entity ids as keys and attribute dictionaries as values. Each attribute dictionary has attribute ids as keys and attribute values as values.
- rel: Dict[Any, Dict[Any, Any]], optional
Relation triples with one entity id as key; the value is a dictionary with the other entity id as key and the relation id as value.
- name: str, optional
Name of the KG. Default is None.
>>> from forayer.knowledge_graph import KG
>>> entities = {
...     "e1": {"a1": "first entity", "a2": 123},
...     "e2": {"a1": "second ent"},
...     "e3": {"a2": 124},
... }
>>> relations = {"e1": {"e3": "somerelation"}}
>>> kg = KG(entities, relations, "mykg")
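To make the two dictionary layouts concrete, the following sketch uses only plain Python dictionaries (no forayer import) and flattens a rel mapping into (source, relation, target) triples. The helper name iter_triples is hypothetical and is not part of forayer.

```python
from typing import Any, Dict, Iterator, Tuple

# Same layout as the constructor arguments described above.
entities: Dict[Any, Dict[Any, Any]] = {
    "e1": {"a1": "first entity", "a2": 123},
    "e2": {"a1": "second ent"},
    "e3": {"a2": 124},
}
rel: Dict[Any, Dict[Any, Any]] = {"e1": {"e3": "somerelation"}}


def iter_triples(rel: Dict[Any, Dict[Any, Any]]) -> Iterator[Tuple[Any, Any, Any]]:
    """Yield (source, relation, target) for every entry of a rel mapping.

    Hypothetical helper for illustration only; not part of forayer.
    """
    for source, targets in rel.items():
        for target, relation in targets.items():
            yield source, relation, target


triples = list(iter_triples(rel))
print(triples)  # [('e1', 'somerelation', 'e3')]
```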
Methods
- __init__(entities[, rel, name]): Initialize a KG object.
- add_entity(e_id, e_attr[, overwrite]): Add an entity to the knowledge graph.
- add_rel(source, target, value[, overwrite]): Add relationship with value.
- cleaned_entities([key, prefix_mapping, ...]): Return cleaned entity information of specified entities.
- clone(): Create a clone of this object.
- info(): Print general information about this object.
- neighbors(entity_id[, only_id]): Get neighbors of an entity.
- remove_entity(e_id): Remove the entity with the id.
- remove_rel(source, target[, value]): Remove relationship or relationship value.
- sample(n[, seed]): Return a sample of the knowledge graph with n entities.
- search(query[, attr, exact]): Search for entities with specific attribute value.
- subgraph(wanted): Return a subgraph with only wanted entities.
- to_rdflib([prefix, attr_mapping]): Transform to rdflib graph.
- with_attr(attr): Search for entities with specific attribute.
Attributes
- attribute_names: Return all attribute names.
- attribute_values: Return all attribute values.
- entity_ids: Return ids of all entities.
- rel_triples
- relation_names: Return all relation names.
- add_entity(e_id: str, e_attr: Dict, overwrite: bool = False)
Add an entity to the knowledge graph.
- e_id: str
Id of the entity you want to add.
- e_attr: Dict
Attributes of the entity you want to add.
- overwrite: bool
If true, overwrite an existing entity with the same id.
- ValueError
Raised if the entity id is already present and overwrite is false.
- add_rel(source: str, target: str, value, overwrite: bool = False) -> bool
Add relationship with value.
- source: str
Entity id of source.
- target: str
Entity id of target.
- value
Value of relation, e.g. relation name.
- overwrite: bool
If true, overwrite the existing values of an already present relationship; otherwise append the value to the existing ones.
- bool
True if new information was added, else False.
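The overwrite and append behaviour described above can be sketched on a plain rel dictionary. This is one possible reading, not forayer's implementation: the function name add_rel_sketch is hypothetical, and storing multiple values of one relationship in a set is an assumption.

```python
from typing import Any, Dict


def add_rel_sketch(
    rel: Dict[Any, Dict[Any, Any]],
    source: Any,
    target: Any,
    value: Any,
    overwrite: bool = False,
) -> bool:
    """Add value to rel[source][target]; return True if rel changed.

    Hypothetical helper, not part of forayer.
    """
    targets = rel.setdefault(source, {})
    if target not in targets or overwrite:
        # New relationship, or replace whatever was stored before.
        changed = targets.get(target) != value
        targets[target] = value
        return changed
    existing = targets[target]
    # Assumption: several values for one relationship are kept in a set.
    values = existing if isinstance(existing, set) else {existing}
    if value in values:
        return False  # nothing new to add
    targets[target] = values | {value}
    return True


rel: Dict[Any, Dict[Any, Any]] = {}
print(add_rel_sketch(rel, "e1", "e3", "somerelation"))  # True
print(add_rel_sketch(rel, "e1", "e3", "somerelation"))  # False
```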
- property attribute_names: Set[str]
Return all attribute names.
- Set[str]
Attribute names as set.
- property attribute_values: Set[Any]
Return all attribute values.
- Set[Any]
Attribute values as set.
- cleaned_entities(key: Optional[Union[str, List[str]]] = None, prefix_mapping: Optional[Dict[str, str]] = None, clean_fun: Optional[Callable] = None) -> Dict[Any, Any]
Return cleaned entity information of specified entities.
By default, datatype and language tags are removed and URIs are shortened via prefixes.
- Parameters
key – Wanted entity ids, or None to get all entities
prefix_mapping – Mappings for IRI namespaces; if None, commonly used prefixes from prefix.cc are used
clean_fun – Function to clean attributes; if None, datatype and language tags are removed
- Returns
Cleaned entity info
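As an illustration of what removing datatype and language tags can mean, the sketch below strips them from N-Triples-style literal strings. The literal format, the regular expression and the name strip_tags are all assumptions made for this example; forayer's default cleaning and the expected signature of clean_fun may differ.

```python
import re
from typing import Any

# Assumed literal shape: "lexical form", optionally followed by
# @lang or ^^<datatype IRI>, as in N-Triples.
_LITERAL = re.compile(
    r'^"(?P<lexical>.*)"(?:@[A-Za-z]+(?:-[A-Za-z0-9]+)*|\^\^<[^>]*>)?$',
    re.DOTALL,
)


def strip_tags(value: Any) -> Any:
    """Return the lexical form of a tagged literal, else the value unchanged.

    Hypothetical helper, not part of forayer.
    """
    if not isinstance(value, str):
        return value
    match = _LITERAL.match(value)
    return match.group("lexical") if match else value


raw = {
    "e1": {
        "name": '"Berlin"@de',
        "population": '"3645000"^^<http://www.w3.org/2001/XMLSchema#integer>',
    }
}
cleaned = {
    e_id: {name: strip_tags(value) for name, value in attrs.items()}
    for e_id, attrs in raw.items()
}
print(cleaned)  # {'e1': {'name': 'Berlin', 'population': '3645000'}}
```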
- clone() -> forayer.knowledge_graph.kg.KG
Create a clone of this object.
- clone: KG
Cloned KG.
- property entity_ids: Set[Any]
Return ids of all entities.
- Set[Any]
Ids of all entities.
- info() -> str
Print general information about this object.
- str
Information about the number of entities, attributes and values.
- neighbors(entity_id: Any, only_id: bool = False) -> Union[Set[Any], Dict[Any, Dict[Any, Any]]]
Get neighbors of an entity.
- entity_id: Any
The id of the entity whose neighbors are wanted.
- only_id: bool
If true, only ids are returned.
- neighbors: Union[Set[Any], Dict[Any, Dict[Any, Any]]]
Entity dict of the neighbors; if only_id is true, the neighbor ids as a set.
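The two return shapes can be sketched on plain dictionaries as follows. The name neighbors_sketch is hypothetical, and treating relations as undirected (an entity is a neighbor whether it is the source or the target) is an assumption of this sketch, not something the documentation above states.

```python
from typing import Any, Dict, Set, Union

entities = {
    "e1": {"a1": "first entity", "a2": 123},
    "e2": {"a1": "second ent"},
    "e3": {"a2": 124},
}
rel = {"e1": {"e3": "somerelation"}}


def neighbors_sketch(
    entities: Dict[Any, Dict[Any, Any]],
    rel: Dict[Any, Dict[Any, Any]],
    entity_id: Any,
    only_id: bool = False,
) -> Union[Set[Any], Dict[Any, Dict[Any, Any]]]:
    """Collect neighbors of entity_id (hypothetical helper, not forayer)."""
    # Outgoing edges: entity_id is the source.
    ids = set(rel.get(entity_id, {}))
    # Incoming edges (assumption: relations are treated as undirected).
    ids.update(src for src, targets in rel.items() if entity_id in targets)
    if only_id:
        return ids
    # Entities without stored attributes get an empty attribute dict.
    return {n_id: entities.get(n_id, {}) for n_id in ids}


print(neighbors_sketch(entities, rel, "e1", only_id=True))  # {'e3'}
print(neighbors_sketch(entities, rel, "e1"))  # {'e3': {'a2': 124}}
```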
- property relation_names: Set[Any]
Return all relation names.
- Set[str]
Relation names as set.
- remove_entity(e_id: str)
Remove the entity with the id.
- e_id: str
Id of the entity you want to remove.
- KeyError
Raised if no entity with this id exists.
- remove_rel(source: str, target: str, value=None)
Remove relationship or relationship value.
- source: str
Entity id of source.
- target: str
Entity id of target.
- value
If provided, remove only this specific value.
- KeyError
Raised if the relationship does not exist.
- ValueError
Raised if the value does not exist in the relationship.
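The difference between removing a whole relationship and removing a single value, together with the two error cases, might look like the sketch below. It works on a plain rel dictionary; the name remove_rel_sketch is hypothetical, and keeping multiple values in a set is an assumption rather than documented forayer behaviour.

```python
from typing import Any, Dict


def remove_rel_sketch(
    rel: Dict[Any, Dict[Any, Any]], source: Any, target: Any, value: Any = None
) -> None:
    """Remove rel[source][target], or only one of its values.

    Hypothetical helper, not part of forayer.
    """
    if source not in rel or target not in rel[source]:
        raise KeyError(f"no relationship between {source!r} and {target!r}")
    targets = rel[source]
    existing = targets[target]
    if value is None:
        del targets[target]  # drop the whole relationship
    elif isinstance(existing, set):  # assumption: multiple values live in a set
        if value not in existing:
            raise ValueError(f"{value!r} is not a value of this relationship")
        existing.remove(value)
        if not existing:
            del targets[target]
    elif existing == value:
        del targets[target]
    else:
        raise ValueError(f"{value!r} is not a value of this relationship")
    if not targets:
        del rel[source]  # no outgoing relationships left for this source


rel = {"e1": {"e3": {"r1", "r2"}}}
remove_rel_sketch(rel, "e1", "e3", "r1")
print(rel)  # {'e1': {'e3': {'r2'}}}
```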
- sample(n: int, seed: Optional[Union[int, random.Random]] = None) -> forayer.knowledge_graph.kg.KG
Return a sample of the knowledge graph with n entities.
- n: int
Number of entities to return.
- seed: Union[int, random.Random]
Seed for randomness or seeded random.Random object. Default is None.
- KG
Knowledge graph with n entities.
>>> from forayer.knowledge_graph import KG
>>> entities = {
...     "e1": {"a1": "first entity", "a2": 123},
...     "e2": {"a1": "second ent"},
...     "e3": {"a2": 124},
... }
>>> kg = KG(entities)
>>> kg.sample(2)
KG(entities={'e1': {'a1': 'first entity', 'a2': 123}, 'e2': {'a1': 'second ent'}},rel=None,name=None)
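Because seed may be either an int or a random.Random object, a caller can get reproducible samples in two ways. The sketch below shows one way such a seed argument can be normalized, using only the standard library on a plain entity dictionary. The name sample_entities is hypothetical, and forayer's own sampling may select entities differently.

```python
import random
from typing import Any, Dict, Optional, Union

entities = {
    "e1": {"a1": "first entity", "a2": 123},
    "e2": {"a1": "second ent"},
    "e3": {"a2": 124},
}


def sample_entities(
    entities: Dict[Any, Dict[Any, Any]],
    n: int,
    seed: Optional[Union[int, random.Random]] = None,
) -> Dict[Any, Dict[Any, Any]]:
    """Pick n entities at random (hypothetical helper, not forayer)."""
    # Accept an existing generator as is, otherwise build one from the seed.
    rng = seed if isinstance(seed, random.Random) else random.Random(seed)
    chosen = rng.sample(list(entities), n)
    return {e_id: entities[e_id] for e_id in chosen}


first = sample_entities(entities, 2, seed=42)
second = sample_entities(entities, 2, seed=random.Random(42))
print(first == second)  # True
```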
- search(query, attr=None, exact=False)
Search for entities with specific attribute value.
- query
Attribute value that is searched for.
- attr: Union[str, List]
Only look in specific attribute(s).
- exact: bool
If True, only consider exact matches.
- result: Dict[str, Dict[str, Any]]
Entities that have attribute values that match the query.
>>> from forayer.knowledge_graph import KG
>>> entities = {
...     "e1": {"a1": "first entity", "a2": 123},
...     "e2": {"a1": "second ent"},
...     "e3": {"a2": 124},
... }
>>> kg = KG(entities)
>>> kg.search("first")
{'e1': {'a1': 'first entity', 'a2': 123}}
>>> kg.search("first", exact=True)
{}
>>> kg.search("first", attr="a2")
{}
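The three calls above differ only in how a match is decided. A minimal sketch of that logic on a plain entity dictionary follows. The name search_sketch is hypothetical, and comparing values through their string form (so that non-string values such as 123 can be searched) is an assumption of this sketch.

```python
from typing import Any, Dict, List, Optional, Union

entities = {
    "e1": {"a1": "first entity", "a2": 123},
    "e2": {"a1": "second ent"},
    "e3": {"a2": 124},
}


def search_sketch(
    entities: Dict[Any, Dict[Any, Any]],
    query: Any,
    attr: Optional[Union[str, List[str]]] = None,
    exact: bool = False,
) -> Dict[Any, Dict[Any, Any]]:
    """Return entities with a matching attribute value (not forayer code)."""
    if attr is None:
        wanted = None  # look in every attribute
    elif isinstance(attr, str):
        wanted = {attr}
    else:
        wanted = set(attr)
    needle = str(query)  # assumption: values are compared as strings
    result = {}
    for e_id, attrs in entities.items():
        for name, value in attrs.items():
            if wanted is not None and name not in wanted:
                continue
            text = str(value)
            matched = text == needle if exact else needle in text
            if matched:
                result[e_id] = attrs
                break  # one matching attribute is enough
    return result


print(search_sketch(entities, "first"))  # {'e1': {'a1': 'first entity', 'a2': 123}}
print(search_sketch(entities, "first", exact=True))  # {}
print(search_sketch(entities, "first", attr="a2"))  # {}
```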
- subgraph(wanted: Iterable[str])
Return a subgraph with only wanted entities.
Creates a subgraph with the wanted entities that contains only relationships between wanted entities. Wanted entities without attributes (possibly not contained in self.entities), such as entities that only occur in relationships, are added to the resulting KG's entities without attributes.
- wanted: Iterable[str]
Ids of wanted entities.
- KG
subgraph with only wanted entities
>>> from forayer.knowledge_graph import KG
>>> entities = {"e1": {"a": 1}, "e2": {"a": 3}}
>>> rel = {"e1": {"e2": "rel", "e3": "rel"}}
>>> kg = KG(entities, rel)
>>> kg.subgraph(["e1", "e3"])
KG(entities={'e1': {'a': 1}, 'e3': {}}, rel={'e1': {'e3': 'rel'}}, name=None)
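The filtering shown in this example can be reproduced on plain dictionaries. The sketch below is written to match the example output only; the name subgraph_sketch is hypothetical, and it returns an (entities, rel) pair instead of a KG object.

```python
from typing import Any, Dict, Iterable, Tuple

entities = {"e1": {"a": 1}, "e2": {"a": 3}}
rel = {"e1": {"e2": "rel", "e3": "rel"}}


def subgraph_sketch(
    entities: Dict[Any, Dict[Any, Any]],
    rel: Dict[Any, Dict[Any, Any]],
    wanted: Iterable[Any],
) -> Tuple[Dict[Any, Dict[Any, Any]], Dict[Any, Dict[Any, Any]]]:
    """Restrict entities and rel to the wanted ids (not forayer code)."""
    wanted_ids = list(dict.fromkeys(wanted))  # deduplicate, keep order
    wanted_set = set(wanted_ids)
    # Wanted ids without stored attributes get an empty attribute dict.
    sub_entities = {e_id: dict(entities.get(e_id, {})) for e_id in wanted_ids}
    sub_rel: Dict[Any, Dict[Any, Any]] = {}
    for source, targets in rel.items():
        if source not in wanted_set:
            continue
        # Keep only relationships whose target is also wanted.
        kept = {t: v for t, v in targets.items() if t in wanted_set}
        if kept:
            sub_rel[source] = kept
    return sub_entities, sub_rel


sub_entities, sub_rel = subgraph_sketch(entities, rel, ["e1", "e3"])
print(sub_entities)  # {'e1': {'a': 1}, 'e3': {}}
print(sub_rel)  # {'e1': {'e3': 'rel'}}
```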
- to_rdflib(prefix: str = '', attr_mapping: Optional[dict] = None)
Transform to rdflib graph.
- prefix: str
Prefix to prepend to each entity id.
- attr_mapping: dict
Mapping of attribute names to URIs. Mapping values can be str or rdflib.term.URIRef. This is also used to map relation predicates.
- rdf_g
rdflib Graph
>>> from forayer.knowledge_graph import KG
>>> entities = {
...     "e1": {"a1": "first entity", "a2": 123},
...     "e2": {"a1": "second ent"},
...     "e3": {"a2": {124, "1223"}},
... }
>>> kg = KG(entities, {"e1": {"e3": "somerelation"}})
>>> rdf_g = kg.to_rdflib()
>>> from rdflib import URIRef
>>> rdf_g.value(URIRef("e1"), URIRef("a1"))
rdflib.term.Literal('first entity')
You can use custom prefixes, and rdflib namespaces or strings for mappings:
>>> from rdflib.namespace import FOAF
>>> my_prefix = "http://example.org/"
>>> my_mapping = {"a1": FOAF.name, "a2": "http://example.org/attr"}
>>> rdf_g = kg.to_rdflib(prefix=my_prefix, attr_mapping=my_mapping)
>>> rdf_g.value(URIRef(my_prefix + "e1"), FOAF.name)
rdflib.term.Literal('first entity')
- with_attr(attr: str)
Search for entities with specific attribute.
- attr: str
Attribute name.
- result: Dict[str, Dict[str, Any]]
Entities that have the attribute.
>>> from forayer.knowledge_graph import KG
>>> entities = {
...     "e1": {"a1": "first entity", "a2": 123},
...     "e2": {"a1": "second ent"},
...     "e3": {"a2": 124},
... }
>>> kg = KG(entities)
>>> kg.with_attr("a1")
{'e1': {'a1': 'first entity', 'a2': 123}, 'e2': {'a1': 'second ent'}}