> 🚧 Under construction...

## Key Resources

- [Memri GitLab](https://gitlab.memri.io/) - Houses the code and documentation for all Memri plug-ins. **UPDATE DESCRIPTION**
- [PyMemri](https://pypi.org/project/pymemri/) - pymemri is a Python client for the Memri Personal Online Datastore (Pod) that you can use to build plugins in Python. Plugins connect to your Pod and add information to it. Plugins that import your data from external services are called importers (Gmail, WhatsApp, etc.). Plugins that connect new data to existing data are called indexers (face recognition, spam detection, object detection, etc.). Lastly, there are plugins that execute actions (sending messages, uploading files).
- [Memri Community Wiki](https://gitlab.memri.io/memri/memri/-/wikis/Home#roles) - How to get started, plus details about the key roles in the plug-in development process.
- [Memri Docs](http://memri.docs.memri.io/docs.memri.io/)

# **Installing Memri Pod**

The first step is to install and run a local copy of the Memri Pod. The Pod, or **P**ersonal **O**nline **D**atastore, is a server written in Rust with a SQLite database that exposes a graph API to store data safely and securely. It encrypts the database with encryption keys that need to be sent with every API request. It also has a microservices architecture based on Docker that lets plugins be started dynamically based on triggers: adding certain items to the Pod triggers a plugin run. More on this later. For now we'll follow the instructions [in the Pod repo here](https://gitlab.memri.io/memri/pod) to get started.

**1. Install Rust and sqlcipher**

- On macOS: `brew install rust sqlcipher`
- On Arch Linux: `pacman -S --needed rust sqlcipher base-devel`
- On Ubuntu and Debian:

```
apt-get install sqlcipher build-essential libsqlcipher-dev python3-dev
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
```

**2. Clone the Pod to your local disk**

```
cd path/to/development/folder
git clone https://gitlab.memri.io/memri/pod.git
cd pod
git checkout dev
```

**3. Run the Pod in development mode**

Run the Pod with this command: `./examples/run_development.sh`

It will first install all the dependencies using cargo (Rust's package manager). When it's done, the output should look something like this:

![https://blog.memri.io/content/images/2021/05/Screen-Shot-2021-05-11-at-7.28.52-AM.png](https://blog.memri.io/content/images/2021/05/Screen-Shot-2021-05-11-at-7.28.52-AM.png)

Once the Pod is up and running, you can move on to creating your first plugin.

# **Creating a Plugin**

To create a plugin, we'll start by creating a new directory and initializing a git repo. We'll then install the pyMemri dependency and create the equivalent of a Hello World plugin.

**1. Install the plugin template**

The plugin template gives you a standard directory structure and takes care of a few of the other acceptance criteria for a Memri plugin. You can read more about the plugin template here: [https://gitlab.memri.io/plugins/plugin-template](https://gitlab.memri.io/plugins/plugin-template)

```
cd path/to/development/folder
git clone https://gitlab.memri.io/plugins/plugin-template.git
mv plugin-template demo-plugin
cd demo-plugin
rm -Rf .git
git init .
```

This should set you up with a new git repo in the `demo-plugin` directory based on the plugin template. Please note that, as a practice, we at Memri no longer use `master` as the default branch name, as it may be associated with a culture of oppression. We therefore recommend setting the default branch to `dev` before calling `git init .` for the first time, using the following command: `git config --global init.defaultBranch dev`.

**2. Installing pyMemri and other dependencies**

The pyMemri library provides an easy-to-use client API for the Memri Pod. It [is available on PyPI](https://pypi.org/project/pymemri/) (the Python package repository). Installation is simple with pip, run from the `demo-plugin` directory. The plugin template also makes sure you have the other dependencies installed.

```
pip install .
```

> N.B. Make sure to update pyproject.toml and setup.py with your dependencies and update the author fields!

**3. A hello world plugin**

When the Pod runs a plugin, it runs it in a separate Docker container, isolated from other plugins and from the Pod server itself. During development this is not needed, and we will simply run the plugin via Python in our default environment.

In production, the Pod passes the plugin a secret, encoded in a binary blob, that describes what access the plugin has to the Pod API. This is an important security measure following the [Principle of Least Privilege](https://en.wikipedia.org/wiki/Principle_of_least_privilege): if, in a worst-case scenario, a plugin is malicious or becomes compromised, the amount of damage it can do is extremely limited. For instance, an importer plugin may only need access to the last item it imported and the ability to create new items.

All this logic is handled by the `PodClient` implementation in the pyMemri package. We will simply use this client to access the Pod API. If the plugin does not have the rights to use one of the APIs, the call will return an error. In development, all APIs are available to us plugin creators.

To include the pyMemri library and expose it to our plugin, create a new file called `plugin.py` in your favorite editor and add the following lines.

```
import pymemri
from pymemri.data.schema import *
from pymemri.pod.client import *
from pymemri.data.photo import *
```

From here we can create the client. For testing purposes we'll include some variables that tell the client how to connect to the pod. In production, the `database_key` and `owner_key` will be passed through environment variables from the Pod. For testing purposes we'll use `"key"` as a temporary value for both.

```
DEFAULT_POD_ADDRESS = "http://localhost:3030"
POD_VERSION = "v3"

client = PodClient(database_key="key", owner_key="key")
```

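Since the keys come from environment variables in production but are just `"key"` during testing, one way to avoid editing the plugin between the two is to read the environment with a fallback. The following is only a sketch: the variable names `EXAMPLE_POD_DATABASE_KEY` and `EXAMPLE_POD_OWNER_KEY` and the helper `load_pod_keys` are made up for illustration, so check the Pod documentation for the names the Pod actually sets.

```python
import os

# NOTE: these environment variable names are placeholders invented for this
# sketch; they are NOT taken from the Pod or pymemri documentation.
DATABASE_KEY_ENV = "EXAMPLE_POD_DATABASE_KEY"
OWNER_KEY_ENV = "EXAMPLE_POD_OWNER_KEY"


def load_pod_keys(environ=None, fallback="key"):
    """Return (database_key, owner_key), using `fallback` when a variable is unset."""
    env = os.environ if environ is None else environ
    return env.get(DATABASE_KEY_ENV, fallback), env.get(OWNER_KEY_ENV, fallback)


database_key, owner_key = load_pod_keys()
# The two values can then be handed to the client in place of the literal
# "key" strings, e.g. PodClient(database_key=database_key, owner_key=owner_key)
```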
Alright, now we're ready to start interacting with the pod. You can use the following code to test if everything is working. This code defines a new type called `Dog` and adds it to the schema, then creates a 'dog', and lastly fetches that 'dog' from the pod to check that the create call did its job.

```
class Dog(Item):
    def __init__(self, name, age, id=None, deleted=None):
        super().__init__(id=id, deleted=deleted)
        self.name = name
        self.age = age

    @classmethod
    def from_json(cls, json):
        id = json.get("id", None)
        name = json.get("name", None)
        age = json.get("age", None)
        return cls(id=id, name=name, age=age)


# Add the new type to the schema, using an example instance
dog = Dog("max", 2)
client.add_to_schema(dog)

# Create a dog item on the pod
dog2 = Dog("bob", 3)
client.create(dog2)

# Fetch it back to check that the create call worked
dog_from_db = client.get(dog2.id, expanded=False)
print(dog_from_db)
```

You can now run this via your editor or on the command line. The output should look something like this:

```
% python plugin.py
Dog (#9c97feace45561c1cc9c0eec302bfd07)
```

If you are seeing this, then you have successfully created your first plugin. Yay!

### **Debugging**

If the connection to the pod is not working, check the following to make sure your configuration is set up correctly:

- Make sure the pod is on the `dev` branch
- Check the pod output to see if a connection is being made
- Try curl-ing the pod to see if you can connect to it: `curl http://localhost:3030/`
- Is there any other process running on port 3030?
- Do a `git pull` to check that you have the latest version of the `dev` branch

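
If you'd rather do the connectivity check from Python than with curl, a small helper using only the standard library can tell you whether anything is accepting TCP connections on the pod's port. This is a sketch; `is_port_open` is a name chosen here and is not part of pyMemri. Note that a `True` result only means something is listening on that port, which could be the pod or another process.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 3030 is the port from DEFAULT_POD_ADDRESS used earlier in this guide
    print("Something is listening on port 3030:", is_port_open("localhost", 3030))
```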
# **More advanced use cases**

To make building plugins easy and straightforward, the `PodClient` API supports various use cases. In this section we'll go through some of them. For a complete overview of all supported API calls, please check out the [pyMemri documentation](http://memri.docs.memri.io/pymemri/). You can also go one level deeper and check out the [REST API for the Pod here](https://gitlab.memri.io/memri/pod/-/blob/dev/docs/HTTP_API.md).

### **Installation**

During installation of your plugin, you add the items to the pod that your plugin needs in order to function. Please see the [install_plugin function](https://gitlab.memri.io/plugins/plugin-template/-/blob/master/plugin_template/plugin_flow/plugin_flow.py#L87) in the template. The bare minimum to install is the `Plugin` item, which contains the information about your plugin. You may also want to add CVU screens that are needed for authentication flows (more on this in a later blog post).

### **Plugin Run Metadata**

Each time a plugin is run, a new item is added to the pod. In fact, the pod uses triggers, and the trigger to run a plugin is the creation of a `PluginRun` item. The ID of this item is passed to the plugin and is used by the plugin base class `PluginFlow` to get access to metadata for the plugin, such as authentication credentials.

### **Authentication**

To access an external service, your plugin usually needs to authenticate. The credentials for authentication can be read from the Pod. These credentials should be stored in an `Account` item that is connected to the `Plugin` item by an `account` edge.

```
PluginRun -> edge(plugin) -> Plugin -> edge(account) -> Account
```

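To make the chain above concrete, here is a toy illustration of following it from a `PluginRun` to its `Account`. It uses plain dictionaries rather than the pyMemri API; the item IDs, the `identifier` field and the `follow` helper are all invented for this sketch.

```python
# Toy in-memory graph: item id -> {"type": ..., "edges": {edge type: target id}}.
# Illustrative data only; this is not how pymemri represents items.
items = {
    "run-1": {"type": "PluginRun", "edges": {"plugin": "plugin-1"}},
    "plugin-1": {"type": "Plugin", "edges": {"account": "account-1"}},
    "account-1": {"type": "Account", "edges": {}, "identifier": "alice"},
}


def follow(items, start_id, *edge_types):
    """Follow one edge of each given type, in order, and return the final item."""
    current = items[start_id]
    for edge_type in edge_types:
        current = items[current["edges"][edge_type]]
    return current


# PluginRun -> edge(plugin) -> Plugin -> edge(account) -> Account
account = follow(items, "run-1", "plugin", "account")
print(account["type"], account["identifier"])
```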
You can store credentials such as the username and password on the `Account` item, as well as temporary credentials such as a session token or an OAuth token.

The plugin template contains two example flows for authentication. Choose the one that fits your use case and adjust it to your needs. The [first example](https://gitlab.memri.io/plugins/plugin-template/-/blob/master/plugin_template/plugin_template.py#L26) flow first asks the user for a username and password, and then asks for the two-factor authentication (2FA) code before completing the authentication flow. The [second example](https://gitlab.memri.io/plugins/plugin-template/-/blob/master/plugin_template/plugin_template.py#L51) completes a simpler OAuth flow.

When a login fails, you can store the error information on the `errorMessage` property of the `Account` item. The CVU for your plugin can then display this message.

### **Schema**

When you are creating a new plugin, start by checking whether there are existing schema definitions that you can reuse. Reusing existing schema definitions keeps your plugin compatible with other plugins that import similar data and with user interfaces that depend on those definitions. You can find the existing schema using the schema browser in [the schema repository](https://gitlab.memri.io/memri/schema).

If you find that you need to add a new property, item or edge for your plugin, you can do that as specified above. Please [join our discord](https://discord.gg/NUQSRFkKwv) to discuss any schema that you think may be more generic, so that the community can reach consensus on the best way to implement a certain data type.

**Person items**

It's important to note that when creating an importer, we distinguish between `Account` items and `Person` items. For instance, for the Twitter plugin each follower would be an `Account` with a `follows` edge to the Twitter `Account` of the user of the pod. Each account is attached to a `Person` item. Deduplication of `Person` items is quite complex and is handled separately by the [person deduplication plugin](https://gitlab.memri.io/plugins/person-deduplication). Each importer plugin should simply create a `Person` item for each `Account` item it adds to the pod, with an `owner` edge in between. The deduplication plugin will take care of the rest!

### **Incremental runs**

All importer plugins should support importing data incrementally, to avoid duplicating data on each run. A plugin can use various techniques to determine what to import.

**Continue from the last imported item**

The following method returns the last added item of a certain type. You can use this to find the last imported item and start importing from there.

```
client.search_last_added(type="Person")
```

*In the near future you will be able to filter based on the account that imported the data as well. We will update this article to reflect that when that lands.*

**Check if an item already exists in the pod**

The following code returns all `Person` items, which you can then search using a for loop.

```
all_people = client.search({"type": "Person"})
```

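A minimal sketch of such a loop, wrapped in a helper so that it lives in one place in your plugin. `find_matching` is a name chosen here, not a pyMemri function, and the `SimpleNamespace` objects are stand-ins for whatever the search call returns. We assume the returned items expose their properties as attributes such as `firstName` and `lastName`.

```python
from types import SimpleNamespace


def find_matching(items, **fields):
    """Return the items whose attributes equal all of the given field values."""
    return [
        item
        for item in items
        if all(getattr(item, name, None) == value for name, value in fields.items())
    ]


# Stand-ins for the objects a search call might return (not real pymemri items)
all_people = [
    SimpleNamespace(firstName="Alice", lastName="X"),
    SimpleNamespace(firstName="Bob", lastName="Y"),
]

# Only import Alice if she is not in the pod yet
already_imported = bool(find_matching(all_people, firstName="Alice", lastName="X"))
print(already_imported)
```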
*In the future the search API will support searching on fields and edges to speed up this process. We recommend creating a helper function in your plugin that can be replaced later when a more complete search API is available.*

**Alternative methods**

There are several alternative methods to implement incremental runs. You could store additional fields in the `Account` item (e.g. timestamp, id-field) and use that to continue where you left off.

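
As a rough sketch of the timestamp variant: keep the timestamp of the newest record you imported, and on the next run only import records that are newer. The record layout and the `select_new_records` helper are assumptions made for this example. Where you persist the cursor (for instance a field on the `Account` item) is up to your plugin.

```python
def select_new_records(records, last_imported_at):
    """Return records newer than the stored cursor, oldest first, plus the new cursor."""
    new_records = sorted(
        (r for r in records if r["timestamp"] > last_imported_at),
        key=lambda r: r["timestamp"],
    )
    # If nothing is new, keep the old cursor so the next run starts from the same point
    new_cursor = new_records[-1]["timestamp"] if new_records else last_imported_at
    return new_records, new_cursor


# Records as they might come back from an external service (made-up data)
fetched = [
    {"id": "a", "timestamp": 100},
    {"id": "c", "timestamp": 300},
    {"id": "b", "timestamp": 250},
]

to_import, cursor = select_new_records(fetched, last_imported_at=200)
print([r["id"] for r in to_import], cursor)
```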
### **Edges**

The Pod API exposes a graph database that allows you to connect items to each other with edges. Each edge has a type, which is a string that describes the relationship between the two items. An edge is directional, which means it goes from one item to the other, but not back again. Use edges to connect the items you create in your plugin.

In the following example we create an edge from an `Email` to a `Person` via an edge of the type `sender`. That way we store who the sender of the email is.

```
# `email_item` is assumed to be an Email item that already exists on the pod
person_item = Person.from_data(firstName="Alice", lastName="X")
item_success = client.create(person_item)
edge = Edge(email_item, person_item, "sender")
edge_success = client.create_edge(edge)
```

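Two properties of edges are easy to trip over: they only point one way, and they can only be created once both items exist. The toy class below mimics both in memory so you can see the behavior without a running pod. `TinyGraph` is invented for this sketch and is not the Pod or pyMemri implementation.

```python
class TinyGraph:
    """In-memory stand-in for a graph store with typed, directional edges."""

    def __init__(self):
        self.items = set()
        self.edges = set()  # (source id, target id, edge type)

    def create(self, item_id):
        self.items.add(item_id)

    def create_edge(self, source, target, edge_type):
        # Refuse edges whose endpoints have not been created yet
        if source not in self.items or target not in self.items:
            return False
        self.edges.add((source, target, edge_type))
        return True

    def targets(self, source, edge_type):
        """Return the ids reachable from `source` over edges of `edge_type`."""
        return {t for s, t, kind in self.edges if s == source and kind == edge_type}


graph = TinyGraph()
graph.create("email-1")
first_try = graph.create_edge("email-1", "person-1", "sender")   # person-1 is missing
graph.create("person-1")
second_try = graph.create_edge("email-1", "person-1", "sender")  # both items exist now
print(first_try, second_try)
print(graph.targets("email-1", "sender"), graph.targets("person-1", "sender"))
```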
> (!) Be sure to create the items before creating the edge. Creating an edge will fail if the items it points to do not exist on the pod.

### **Uploading files**

Photos and files can be uploaded to the Pod and attached to other items via edges.

**Photos**

The following example creates a random photo and adds it to the pod.

```
import numpy as np

# Random 640x640 grayscale image
x = np.random.randint(0, 255+1, size=(640, 640), dtype=np.uint8)
photo = IPhoto.from_np(x)
success = client.add_to_schema(IPhoto.from_np(x))

# Store the photo
client.create(photo)

# Retrieve the photo
res = client.get_photo(photo.id, size=640)
```

**Files**

The following example reads a file from disk and uploads it to the pod. It then creates a `File` item and sets the sha256 hash of the file as its identifier. Attach the `File` item to another item via an edge (for instance to an `EmailMessage` via an `attachment` edge).

```
from hashlib import sha256

# Read the whole file as bytes
with open("video.mp4", "rb") as f:
    file_bytes = f.read()

# Upload to the pod
client.upload_file(file_bytes)

# Create SHA
digest = sha256(file_bytes).hexdigest()

# Create File
file = File(sha256=digest)
client.create(file)
```

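For large files such as videos you may not want to hold the whole file in memory just to compute the hash. The sketch below computes the same sha256 digest by reading the file in chunks with the standard library. `sha256_of_file` is a helper name chosen here, and the 1 MiB chunk size is an arbitrary default.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it returns b""
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```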
# **Join our Discord and share your plugins**

When you start building plugins, you'll likely have many questions. Please [join our discord](https://discord.gg/NUQSRFkKwv) and ask our community and the Memri engineers. We look forward to helping you out and working together towards giving people back control over their data!