"\u001b[0;32m<ipython-input-2-94d6ac817c95>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0;31m# export\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 2\u001b[0;31m \u001b[0;32mclass\u001b[0m \u001b[0mPodClient\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 3\u001b[0m \u001b[0;31m# Mapping from python type to schema type\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 4\u001b[0m \u001b[0;31m# TODO move to data.schema once schema is refactored\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 5\u001b[0m TYPE_TO_SCHEMA = {\n",
4 # TODO move to data.schema once schema is refactored
5 TYPE_TO_SCHEMA = {
<ipython-input-2-94d6ac817c95> in PodClient()
8 int: "Integer",
9 float: "Real",
---> 10 datetime: "DateTime",
11 }
12
NameError: name 'datetime' is not defined
%% Cell type:markdown id: tags:
Pymemri communicates with the pod via the `PodClient`. The `PodClient` requires a [database key](https://gitlab.memri.io/memri/pod/-/blob/dev/docs/HTTP_API.md#user-content-api-authentication-credentials) and an [owner key](https://gitlab.memri.io/memri/pod/-/blob/dev/docs/HTTP_API.md#user-content-api-authentication-credentials). During development you don't have to worry about these keys: if you omit them when initializing the `PodClient`, it creates a new user with randomly generated keys.
If you want to use the same keys for different `PodClient` instances, you can store a random key pair locally with the `store_keys` CLI, and create a new client with `PodClient.from_local_keys()`. When you are using the app, the app itself takes care of setting the keys in the pod and passing them to a plugin when it is called.
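As a rough illustration of what "defining random keys" can look like, the sketch below generates a key pair with the standard library. The 64-character hexadecimal format and the helper name `random_key_pair` are assumptions made for illustration; they are not taken from the pymemri source.

```python
import secrets
from typing import Tuple


def random_key_pair(n_bytes: int = 32) -> Tuple[str, str]:
    """Return a (database_key, owner_key) pair of random hex strings.

    Hypothetical helper: the key length and hex encoding are assumptions.
    """
    database_key = secrets.token_hex(n_bytes)  # 2 hex characters per byte
    owner_key = secrets.token_hex(n_bytes)
    return database_key, owner_key


database_key, owner_key = random_key_pair()
print(len(database_key), len(owner_key))
```

A pair like this could then be stored and reused across clients; check the `PodClient` constructor for the exact parameter names before passing the keys in.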
%% Cell type:code id: tags:
``` python
client = PodClient()
client.registered_classes["Photo"]
```
%% Output
<class 'pymemri.data.photo.Photo'>
pymemri.data.photo.Photo
%% Cell type:code id: tags:
``` python
# hide
success = client.api.test_connection()
assert success
```
%% Cell type:markdown id: tags:
## Creating Items and Edges
%% Cell type:markdown id: tags:
Now that we have access to the pod, we can create items here and upload them to the pod. All items are defined in the schema of the pod, so before you can create an item in the pod, you have to add its schema first. Schemas can be added as follows:
The types of items in the pod are not limited to the definitions in the pymemri schema. We can easily define our own types, or overwrite existing item definitions, with the same `add_to_schema` method.
Note that all keyword arguments need to be added to the `properties` class variable to let the pod know what the properties of our item are. Additionally, properties in the Pod are statically typed, and their types are inferred from the type annotations of our `__init__` method.
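To make the inference step concrete, here is a minimal sketch of how property types could be derived from `__init__` annotations. It is not pymemri's implementation: `infer_schema_types` and this `Dog` class are hypothetical, and the mapping only repeats the `int`, `float`, and `datetime` entries of `TYPE_TO_SCHEMA`, plus an assumed `str: "Text"` entry.

```python
import inspect
from datetime import datetime
from typing import Dict

# Mapping from Python type to schema type. The int, float and datetime
# entries mirror TYPE_TO_SCHEMA; the str entry is an assumption.
TYPE_TO_SCHEMA = {
    int: "Integer",
    float: "Real",
    datetime: "DateTime",
    str: "Text",  # assumed, for illustration only
}


class Dog:
    """Hypothetical item type used only for this example."""

    properties = ["name", "age", "weight", "born"]

    def __init__(self, name: str = None, age: int = None,
                 weight: float = None, born: datetime = None):
        self.name = name
        self.age = age
        self.weight = weight
        self.born = born


def infer_schema_types(cls) -> Dict[str, str]:
    """Map each name in `cls.properties` to a schema type via its annotation."""
    params = inspect.signature(cls.__init__).parameters
    schema = {}
    for prop in cls.properties:
        # A property without a matching, annotated __init__ argument cannot be typed.
        annotation = params[prop].annotation if prop in params else inspect.Parameter.empty
        if annotation not in TYPE_TO_SCHEMA:
            raise TypeError(f"Cannot infer a schema type for property {prop!r}")
        schema[prop] = TYPE_TO_SCHEMA[annotation]
    return schema


print(infer_schema_types(Dog))
# {'name': 'Text', 'age': 'Integer', 'weight': 'Real', 'born': 'DateTime'}
```

The sketch raises an error for a property that lacks an annotation, which is one way to surface a missing type early instead of sending an incomplete schema.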
[{'item': Person (#c95c1c89663c06971de7560462e7ce3d), 'name': 'sender'}]
[{'item': Person (#33b03cddf56811350e3eee3e8910afd2), 'name': 'sender'}]
%% Cell type:code id: tags:
``` python
# hide
assert item_succes
assert edge_succes
```
%% Cell type:markdown id: tags:
If we use the normal `client.get` (without `expanded=False`), we also get items directly connected to the Item.
%% Cell type:code id: tags:
``` python
email_from_db = client.get(email_item.id)
print(email_from_db.sender)
```
%% Output
[Person (#c95c1c89663c06971de7560462e7ce3d)]
[Person (#33b03cddf56811350e3eee3e8910afd2)]
%% Cell type:code id: tags:
``` python
# hide
assert isinstance(email_from_db.sender[0], Person)
```
%% Cell type:markdown id: tags:
# Fetching and updating Items
%% Cell type:markdown id: tags:
## Normal Items
%% Cell type:markdown id: tags:
We can use the client to fetch data from the database. This is particularly useful for indexers, which often use data in the database as input for their models. The simplest way to query the database is to fetch items in the pod by their id (unique identifier).
When we don't know the ids of the items we want to fetch, we can also search by property. This is useful, for instance, when we want to query all items of a particular type in order to index them. We can get all `Person` Items from the db by:
%% Cell type:markdown id: tags:
## Search
%% Cell type:markdown id: tags:
The `PodClient` can search through the pod with the `search` or `search_paginate` methods, which return the results of a search as a list or a generator, respectively. Both take the same arguments as the Pod search API, which is documented [here](https://gitlab.memri.io/memri/pod/-/blob/dev/docs/HTTP_API.md#post-v4owner_keysearch).
To show how search works, we first add a few new items:
To handle large volumes of Items, the `PodClient.search_paginate` method can search through the pod and return a generator which yields batches of items. This method uses the same search arguments as the `search` method:
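The batching behaviour can be pictured with a small, self-contained generator. This is only a sketch of the general pattern, not the `search_paginate` implementation: `fake_search`, `paginate`, and the `limit`/`offset` parameters are hypothetical stand-ins for a paged search call, and the items live in an in-memory list instead of a pod.

```python
from typing import Dict, Iterator, List

# In-memory stand-in for the items stored in a pod.
ITEMS = [{"type": "Person", "firstName": f"person {i}"} for i in range(7)]
ITEMS.append({"type": "Dog", "name": "Rita"})


def fake_search(fields: Dict, limit: int, offset: int) -> List[Dict]:
    """Stand-in for one paged search request: filter by property, then slice."""
    matches = [item for item in ITEMS
               if all(item.get(key) == value for key, value in fields.items())]
    return matches[offset:offset + limit]


def paginate(fields: Dict, limit: int = 3) -> Iterator[List[Dict]]:
    """Yield batches of at most `limit` matching items until none remain."""
    offset = 0
    while True:
        batch = fake_search(fields, limit=limit, offset=offset)
        if not batch:
            return
        yield batch
        offset += len(batch)


for batch in paginate({"type": "Person"}, limit=3):
    print(len(batch))
# prints 3, 3, 1
```

Because the generator requests the next batch only when the caller asks for it, at most one batch has to be held in memory at a time.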
In the near future, the Pod will also support searching by user-defined properties, which will allow for the following. **Warning: this is currently not supported.**
To work with files like Photos or Videos, the `PodClient` has a separate file API. This API works by posting a blob to the `upload_file` endpoint and creating an Item whose sha256 property equals the sha256 used in the endpoint.
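The hash that links the uploaded blob to its Item can be computed with the standard library. The sketch below shows only that step; `sha256_hex` is a hypothetical helper, and exchanging the hash as a lowercase hex digest is an assumption rather than something stated here.

```python
import hashlib


def sha256_hex(blob: bytes) -> str:
    """Return the SHA-256 digest of `blob` as a 64-character hex string."""
    return hashlib.sha256(blob).hexdigest()


blob = b"hello pod"          # stand-in for the raw bytes of a photo or video
digest = sha256_hex(blob)
print(len(digest))           # 64
```

The same bytes always produce the same digest, which is what allows an Item to refer to a previously uploaded blob by its hash.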
For example, we can upload a photo with the file API as follows:
%% Cell type:code id: tags:
``` python
import numpy as np
from pymemri.data.photo import Photo
x = np.random.randint(0, 255+1, size=(640, 640), dtype=np.uint8)
photo = Photo.from_np(x)
file = photo.file[0]
succes = client.create(file)
succes2 = client._upload_image(x)
succes2 = client._upload_image(photo.data)
```
%% Cell type:code id: tags:
``` python
# hide
assert succes
assert succes2
data = client.get_file(file.sha256)
arr = np.frombuffer(data, dtype=np.uint8)
assert (arr.reshape(640,640) == x).all()
photo.data = data
arr = photo.to_np()
assert (arr == x).all()
```
%% Cell type:markdown id: tags:
### Photo API
%% Cell type:markdown id: tags:
The `PodClient` also implements a simpler, dedicated API for photos, which uses the same file API under the hood.
%% Cell type:code id: tags:
``` python
print(client.registered_classes["Photo"])
# client.add_to_schema(Photo)
x = np.random.randint(0, 255+1, size=(640, 640), dtype=np.uint8)
photo = Photo.from_np(x)
client.create_photo(photo);
photo.file
```
%% Output
<class 'pymemri.data.photo.Photo'>
BULK: Writing 3/3 items/edges
Completed Bulk action, written 3 items/edges
[File (#76bbf5ecd3a140a999d793f71187c340)]
%% Cell type:code id: tags:
``` python
# hide
res = client.get_photo(photo.id, size=640)
assert (res.data == x).all()
res = client.get_photo(photo.id)
print(res.id)
res.file[0].sha256
assert (res.to_np() == x).all()
```
%% Output
d7af0a4a291d42f9a25a38443d07bc91
%% Cell type:markdown id: tags:
Some photos come as bytes, for example when downloading them from a third party service. We can use `photo.from_bytes` to initialize these photos:
Adding each item separately to the pod with the `create` method can take a lot of time, so using the bulk API is faster and more convenient in most cases. Here we show how to create items and edges; updating and deleting items is also possible.
%% Cell type:code id: tags:
``` python
# Create 100 Dogs to add to the pod, and two edges to a new person
dogs = [Dog(name=f"dog number {i}") for i in range(100)]
person = Person(firstName="Alice")
edge1 = Edge(dogs[0], person, "label")
edge2 = Edge(dogs[1], person, "label")
# Simultaneously add the dogs, person, and edges with the bulk API