Aziz Berkay Yesilyurt authored984dc0cf
---
title: Email importer
keywords: fastai
sidebar: home_sidebar
nb_path: "nbs/importers.GmailImporter.ipynb"
---
<!--
#################################################
### THIS FILE WAS AUTOGENERATED! DO NOT EDIT! ###
#################################################
# file to edit: nbs/importers.GmailImporter.ipynb
# command to build the docs after a change: nbdev_build_docs
-->
<div class="container" id="notebook-container">
{% raw %}
<div class="cell border-box-sizing code_cell rendered">
</div>
{% endraw %}
{% raw %}
<div class="cell border-box-sizing code_cell rendered">
</div>
{% endraw %}
<div class="cell border-box-sizing text_cell rendered"><div class="inner_cell">
<div class="text_cell_render border-box-sizing rendered_html">
<p>This importer fetches your emails and accounts over IMAP. It uses Python's built-in IMAP client (<code>imaplib</code>) together with some convenience functions for easier usage, batching, and importing to the pod. The importer requires you to log in with your email address and an app password. It is tested on Gmail, but should work with other IMAP servers.</p>
</div>
</div>
</div>
<div class="cell border-box-sizing text_cell rendered"><div class="inner_cell">
<div class="text_cell_render border-box-sizing rendered_html">
<p>{% include note.html content='<strong>For Gmail, the recommended setup is to enable two-factor authentication. In that case, make sure you allow <a href="https://www.gmass.co/blog/gmail-smtp/">SMTP connections</a> and set an application password (explained at the same link).</strong>' %}</p>
</div>
</div>
</div>
<div class="cell border-box-sizing text_cell rendered"><div class="inner_cell">
<div class="text_cell_render border-box-sizing rendered_html">
<h2 id="ImapClient">ImapClient<a class="anchor-link" href="#ImapClient"> </a></h2><p>The <a href="/gmail_imap/importers.GmailImporter.html#EmailImporter"><code>EmailImporter</code></a> communicates with email providers over IMAP. We created a convenience class around Python's <code>imaplib</code>, called <code>IMAPClient</code>, that lets you list your mailboxes, retrieve your mails, and get their content.</p>
</div>
</div>
</div>
{% raw %}
<div class="cell border-box-sizing code_cell rendered">
<div class="output_wrapper">
<div class="output">
<div class="output_area">
<div class="output_markdown rendered_html output_subarea ">
<h2 id="IMAPClient" class="doc_header"><code>class</code> <code>IMAPClient</code><a href="https://gitlab.memri.io/memri/gmail_imap/tree/prod/gmail_imap/gmail.py#L22" class="source_link" style="float:right">[source]</a></h2><blockquote><p><code>IMAPClient</code>(<strong><code>username</code></strong>, <strong><code>app_pw</code></strong>, <strong><code>host</code></strong>=<em><code>'imap.gmail.com'</code></em>, <strong><code>port</code></strong>=<em><code>993</code></em>, <strong><code>inbox</code></strong>=<em><code>'"[Gmail]/All Mail"'</code></em>)</p>
</blockquote>
</div>
</div>
</div>
</div>
</div>
{% endraw %}
{% raw %}
<div class="cell border-box-sizing code_cell rendered">
<div class="output_wrapper">
<div class="output">
<div class="output_area">
<div class="output_markdown rendered_html output_subarea ">
<h4 id="IMAPClient.list_mailboxes" class="doc_header"><code>IMAPClient.list_mailboxes</code><a href="https://gitlab.memri.io/memri/gmail_imap/tree/prod/gmail_imap/gmail.py#L30" class="source_link" style="float:right">[source]</a></h4><blockquote><p><code>IMAPClient.list_mailboxes</code>()</p>
</blockquote>
<p>Lists all available mailboxes.</p>
</div>
</div>
</div>
</div>
</div>
{% endraw %}
{% raw %}
<div class="cell border-box-sizing code_cell rendered">
<div class="output_wrapper">
<div class="output">
<div class="output_area">
<div class="output_markdown rendered_html output_subarea ">
<h4 id="IMAPClient.get_all_mail_ids" class="doc_header"><code>IMAPClient.get_all_mail_ids</code><a href="https://gitlab.memri.io/memri/gmail_imap/tree/prod/gmail_imap/gmail.py#L34" class="source_link" style="float:right">[source]</a></h4><blockquote><p><code>IMAPClient.get_all_mail_ids</code>()</p>
</blockquote>
<p>Retrieves all mail IDs from the selected mailbox.</p>
</div>
</div>
</div>
</div>
</div>
{% endraw %}
{% raw %}
<div class="cell border-box-sizing code_cell rendered">
<div class="output_wrapper">
<div class="output">
<div class="output_area">
<div class="output_markdown rendered_html output_subarea ">
<h4 id="IMAPClient.get_mail" class="doc_header"><code>IMAPClient.get_mail</code><a href="https://gitlab.memri.io/memri/gmail_imap/tree/prod/gmail_imap/gmail.py#L42" class="source_link" style="float:right">[source]</a></h4><blockquote><p><code>IMAPClient.get_mail</code>(<strong><code>id</code></strong>)</p>
</blockquote>
<p>Fetches a mail given an ID and returns <code>(raw_mail, thread_id)</code>.</p>
</div>
</div>
</div>
</div>
</div>
{% endraw %}
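<div class="cell border-box-sizing text_cell rendered"><div class="inner_cell">
<div class="text_cell_render border-box-sizing rendered_html">
<p>To make the three operations above concrete, here is a minimal sketch of how they could be written directly against Python's <code>imaplib</code>. It is not the source of <code>IMAPClient</code>: the function names <code>get_raw_mail</code> and <code>connect</code> are assumptions made for illustration, and the Gmail thread ID that <code>get_mail</code> also returns is left out.</p>

```python
import imaplib


def list_mailboxes(conn):
    """Return the raw LIST response lines, one per mailbox."""
    status, data = conn.list()
    if status != "OK":
        raise RuntimeError("LIST failed")
    return data


def get_all_mail_ids(conn, mailbox='"[Gmail]/All Mail"'):
    """Select a mailbox and return the IDs of every message in it."""
    status, _ = conn.select(mailbox, readonly=True)
    if status != "OK":
        raise RuntimeError("SELECT failed")
    status, data = conn.search(None, "ALL")
    if status != "OK":
        raise RuntimeError("SEARCH failed")
    # SEARCH returns one space-separated bytes string of message IDs.
    return data[0].split()


def get_raw_mail(conn, mail_id):
    """Fetch the full RFC 822 source of one message as bytes."""
    status, data = conn.fetch(mail_id, "(RFC822)")
    if status != "OK":
        raise RuntimeError("FETCH failed")
    # The first response part is an (envelope, payload) tuple.
    return data[0][1]


def connect(username, app_pw, host="imap.gmail.com", port=993):
    """Open an SSL connection and log in (requires network access)."""
    conn = imaplib.IMAP4_SSL(host, port)
    conn.login(username, app_pw)
    return conn
```

<p>Because every helper takes the connection as an argument, the same functions can be exercised against a stub object in tests, without contacting a mail server.</p>
</div>
</div>
</div>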
<div class="cell border-box-sizing text_cell rendered"><div class="inner_cell">
<div class="text_cell_render border-box-sizing rendered_html">
<h2 id="EmailImporter">EmailImporter<a class="anchor-link" href="#EmailImporter"> </a></h2>
</div>
</div>
</div>
{% raw %}
<div class="cell border-box-sizing code_cell rendered">
<div class="input">
<div class="inner_cell">
<div class="input_area">
<div class=" highlight hl-ipython3"><pre><span></span><span class="k">class</span> <span class="nc">Account</span><span class="p">(</span><span class="n">Item</span><span class="p">):</span>
    <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">dateAccessed</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> <span class="n">dateCreated</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> <span class="n">dateModified</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> <span class="n">deleted</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span>
                 <span class="n">externalId</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> <span class="n">itemDescription</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> <span class="n">starred</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> <span class="n">version</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> <span class="nb">id</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> <span class="n">importJson</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span>
                 <span class="n">handle</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> <span class="n">displayName</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> <span class="n">service</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> <span class="n">itemType</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> <span class="n">avatarUrl</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> <span class="n">changelog</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span>
                 <span class="n">label</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> <span class="n">genericAttribute</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> <span class="n">measure</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> <span class="n">sharedWith</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> <span class="n">belongsTo</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> <span class="n">price</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span>
                 <span class="n">location</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> <span class="n">organization</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> <span class="n">owner</span><span class="o">=</span><span class="kc">None</span><span class="p">):</span>
        <span class="nb">super</span><span class="p">()</span><span class="o">.</span><span class="fm">__init__</span><span class="p">(</span><span class="n">dateAccessed</span><span class="o">=</span><span class="n">dateAccessed</span><span class="p">,</span> <span class="n">dateCreated</span><span class="o">=</span><span class="n">dateCreated</span><span class="p">,</span> <span class="n">dateModified</span><span class="o">=</span><span class="n">dateModified</span><span class="p">,</span>
                         <span class="n">deleted</span><span class="o">=</span><span class="n">deleted</span><span class="p">,</span> <span class="n">externalId</span><span class="o">=</span><span class="n">externalId</span><span class="p">,</span> <span class="n">itemDescription</span><span class="o">=</span><span class="n">itemDescription</span><span class="p">,</span> <span class="n">starred</span><span class="o">=</span><span class="n">starred</span><span class="p">,</span>
                         <span class="n">version</span><span class="o">=</span><span class="n">version</span><span class="p">,</span> <span class="nb">id</span><span class="o">=</span><span class="nb">id</span><span class="p">,</span> <span class="n">importJson</span><span class="o">=</span><span class="n">importJson</span><span class="p">,</span> <span class="n">changelog</span><span class="o">=</span><span class="n">changelog</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="n">label</span><span class="p">,</span>
                         <span class="n">genericAttribute</span><span class="o">=</span><span class="n">genericAttribute</span><span class="p">,</span> <span class="n">measure</span><span class="o">=</span><span class="n">measure</span><span class="p">,</span> <span class="n">sharedWith</span><span class="o">=</span><span class="n">sharedWith</span><span class="p">)</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">handle</span> <span class="o">=</span> <span class="n">handle</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">displayName</span> <span class="o">=</span> <span class="n">displayName</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">service</span> <span class="o">=</span> <span class="n">service</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">itemType</span> <span class="o">=</span> <span class="n">itemType</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">avatarUrl</span> <span class="o">=</span> <span class="n">avatarUrl</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">belongsTo</span> <span class="o">=</span> <span class="n">belongsTo</span> <span class="k">if</span> <span class="n">belongsTo</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span> <span class="k">else</span> <span class="p">[]</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">price</span> <span class="o">=</span> <span class="n">price</span> <span class="k">if</span> <span class="n">price</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span> <span class="k">else</span> <span class="p">[]</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">location</span> <span class="o">=</span> <span class="n">location</span> <span class="k">if</span> <span class="n">location</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span> <span class="k">else</span> <span class="p">[]</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">organization</span> <span class="o">=</span> <span class="n">organization</span> <span class="k">if</span> <span class="n">organization</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span> <span class="k">else</span> <span class="p">[]</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">owner</span> <span class="o">=</span> <span class="n">owner</span> <span class="k">if</span> <span class="n">owner</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span> <span class="k">else</span> <span class="p">[]</span>
</pre></div>
</div>
</div>
</div>
</div>
{% endraw %}
{% raw %}
<div class="cell border-box-sizing code_cell rendered">
<div class="output_wrapper">
<div class="output">
<div class="output_area">
<div class="output_markdown rendered_html output_subarea ">
<h2 id="EmailImporter" class="doc_header"><code>class</code> <code>EmailImporter</code><a href="https://gitlab.memri.io/memri/gmail_imap/tree/prod/gmail_imap/gmail.py#L105" class="source_link" style="float:right">[source]</a></h2><blockquote><p><code>EmailImporter</code>(<strong>*<code>args</code></strong>, <strong>**<code>kwargs</code></strong>) :: <code>ImporterBase</code></p>
</blockquote>
<p>Imports emails over IMAP.</p>
</div>
</div>
</div>
</div>
</div>
{% endraw %}
{% raw %}
<div class="cell border-box-sizing code_cell rendered">
</div>
{% endraw %}
<div class="cell border-box-sizing text_cell rendered"><div class="inner_cell">
<div class="text_cell_render border-box-sizing rendered_html">
<p>The email importer has the following parameters:</p>
<ul>
<li><strong>username</strong> Your email address.</li>
<li><strong>password</strong> Your email password. If you are using Gmail, use your application password.</li>
<li><em>generic attributes</em></li>
<li><strong>host</strong> The hostname of the IMAP server (defaults to imap.gmail.com).</li>
<li><strong>port</strong> The port of the server (defaults to 993, the port Gmail uses).</li>
<li><strong>max_number</strong> The maximum number of emails to download. Leave unset for no limit.</li>
</ul>
</div>
</div>
</div>
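<div class="cell border-box-sizing text_cell rendered"><div class="inner_cell">
<div class="text_cell_render border-box-sizing rendered_html">
<p>The defaults listed above can be summarized in a small settings object. The sketch below is only an illustration: <code>ImporterSettings</code> and <code>limit_ids</code> are hypothetical names that do not exist in the importer, and the way the real importer stores its parameters may differ.</p>

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ImporterSettings:  # hypothetical name, not part of the importer
    username: str
    password: str  # for Gmail, the application password
    host: str = "imap.gmail.com"
    port: int = 993
    max_number: Optional[int] = None  # None means "no limit"


def limit_ids(mail_ids, max_number=None):
    """Keep at most max_number IDs, or all of them when it is None."""
    ids = list(mail_ids)
    if max_number is None:
        return ids
    return ids[:max_number]
```

<p>With these defaults, only the username and password have to be supplied for a Gmail account.</p>
</div>
</div>
</div>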
<div class="cell border-box-sizing text_cell rendered"><div class="inner_cell">
<div class="text_cell_render border-box-sizing rendered_html">
<h2 id="Methods">Methods<a class="anchor-link" href="#Methods"> </a></h2>
</div>
</div>
</div>
{% raw %}
<div class="cell border-box-sizing code_cell rendered">
<div class="output_wrapper">
<div class="output">
<div class="output_area">
<div class="output_markdown rendered_html output_subarea ">
<h4 id="EmailImporter.get_content" class="doc_header"><code>EmailImporter.get_content</code><a href="https://gitlab.memri.io/memri/gmail_imap/tree/prod/gmail_imap/gmail.py#L134" class="source_link" style="float:right">[source]</a></h4><blockquote><p><code>EmailImporter.get_content</code>(<strong><code>message</code></strong>)</p>
</blockquote>
<p>Extracts the content from a Python email message.</p>
</div>
</div>
</div>
</div>
</div>
{% endraw %}
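<div class="cell border-box-sizing text_cell rendered"><div class="inner_cell">
<div class="text_cell_render border-box-sizing rendered_html">
<p>As a rough illustration of content extraction, the sketch below uses the standard library's <code>email</code> package to pull a text body out of raw message bytes. The function name <code>extract_text</code> and the preference for plain text over HTML are assumptions; <code>get_content</code> itself may handle parts differently.</p>

```python
from email import message_from_bytes, policy


def extract_text(raw_mail):
    """Parse raw message bytes and return the best text body, or ""."""
    message = message_from_bytes(raw_mail, policy=policy.default)
    # Prefer the plain-text part and fall back to HTML if there is none.
    body = message.get_body(preferencelist=("plain", "html"))
    if body is None:
        return ""
    return body.get_content()
```

<p>Parsing with <code>policy.default</code> yields the modern <code>EmailMessage</code> API, which is what provides <code>get_body</code> and charset-aware decoding.</p>
</div>
</div>
</div>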
{% raw %}
<div class="cell border-box-sizing code_cell rendered">
<div class="output_wrapper">
<div class="output">
<div class="output_area">
<div class="output_markdown rendered_html output_subarea ">
<h4 id="EmailImporter.create_item_from_mail" class="doc_header"><code>EmailImporter.create_item_from_mail</code><a href="https://gitlab.memri.io/memri/gmail_imap/tree/prod/gmail_imap/gmail.py#L157" class="source_link" style="float:right">[source]</a></h4><blockquote><p><code>EmailImporter.create_item_from_mail</code>(<strong><code>mail</code></strong>, <strong><code>thread_id</code></strong>=<em><code>None</code></em>)</p>
</blockquote>
<p>Creates a schema item from an existing mail.</p>
</div>
</div>
</div>
</div>
</div>
{% endraw %}
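<div class="cell border-box-sizing text_cell rendered"><div class="inner_cell">
<div class="text_cell_render border-box-sizing rendered_html">
<p>The mapping from a mail to item fields can be sketched with a plain dictionary. The field names <code>externalId</code>, <code>subject</code>, <code>dateSent</code>, and <code>content</code> are taken from the usage example on this page; everything else is an assumption, including the function name <code>mail_to_fields</code>, the <code>threadId</code> key, and the choice to store dates as integer milliseconds since the epoch.</p>

```python
from email import message_from_bytes, policy
from email.utils import parsedate_to_datetime


def mail_to_fields(raw_mail, thread_id=None):
    """Map a raw message to a plain dict of schema-like fields."""
    message = message_from_bytes(raw_mail, policy=policy.default)
    date_header = message["Date"]
    # Assumption: dates are stored as integer milliseconds since the epoch.
    date_sent = None
    if date_header is not None:
        parsed = parsedate_to_datetime(str(date_header))
        date_sent = int(parsed.timestamp() * 1000)
    body = message.get_body(preferencelist=("plain", "html"))
    return {
        "externalId": message["Message-ID"],
        "subject": message["Subject"],
        "dateSent": date_sent,
        "content": body.get_content() if body is not None else "",
        "threadId": thread_id,  # hypothetical field name
    }
```

<p>The real method builds a schema item rather than a dictionary, so treat this only as an outline of which headers feed which fields.</p>
</div>
</div>
</div>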
{% raw %}
<div class="cell border-box-sizing code_cell rendered">
<div class="output_wrapper">
<div class="output">
<div class="output_area">
<div class="output_markdown rendered_html output_subarea ">
<h4 id="EmailImporter.run" class="doc_header"><code>EmailImporter.run</code><a href="https://gitlab.memri.io/memri/gmail_imap/tree/prod/gmail_imap/gmail.py#L211" class="source_link" style="float:right">[source]</a></h4><blockquote><p><code>EmailImporter.run</code>(<strong><code>importer_run</code></strong>=<em><code>None</code></em>, <strong><code>pod_client</code></strong>=<em><code>None</code></em>, <strong><code>verbose</code></strong>=<em><code>True</code></em>)</p>
</blockquote>
<p>This is the main function of the email importer. It runs the importer using the information
provided in the importer run. If you pass a pod client, it adds the new items to the graph.</p>
</div>
</div>
</div>
</div>
</div>
{% endraw %}
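<div class="cell border-box-sizing text_cell rendered"><div class="inner_cell">
<div class="text_cell_render border-box-sizing rendered_html">
<p>The overall flow of a run (collect IDs, fetch each mail, build an item, optionally store it) can be outlined as follows. This is a sketch under stated assumptions, not the body of <code>run</code>: <code>run_import</code> and its callable parameters are hypothetical, and storing is passed in as a plain <code>store</code> function so that no pod client method has to be guessed.</p>

```python
def run_import(fetch_ids, fetch_mail, make_item, store=None,
               max_number=None, verbose=True):
    """Fetch mails, turn each into an item, and optionally store it."""
    mail_ids = list(fetch_ids())
    if max_number is not None:
        mail_ids = mail_ids[:max_number]
    items = []
    for index, mail_id in enumerate(mail_ids, start=1):
        # fetch_mail is expected to return (raw_mail, thread_id).
        raw_mail, thread_id = fetch_mail(mail_id)
        item = make_item(raw_mail, thread_id)
        if store is not None:
            store(item)  # e.g. a function that adds the item to the pod
        items.append(item)
        if verbose:
            print(f"imported {index}/{len(mail_ids)}")
    return items
```

<p>Leaving <code>store</code> unset mirrors running the importer without a pod client: items are still created and returned, but nothing is written to the graph.</p>
</div>
</div>
</div>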
<div class="cell border-box-sizing text_cell rendered"><div class="inner_cell">
<div class="text_cell_render border-box-sizing rendered_html">
<h2 id="Usage">Usage<a class="anchor-link" href="#Usage"> </a></h2>
</div>
</div>
</div>
<div class="cell border-box-sizing text_cell rendered"><div class="inner_cell">
<div class="text_cell_render border-box-sizing rendered_html">
<h3 id="Download-all-mails-from-your-account">Download all mails from your account<a class="anchor-link" href="#Download-all-mails-from-your-account"> </a></h3>
</div>
</div>
</div>
{% raw %}
<div class="cell border-box-sizing code_cell rendered">
<div class="input">
<div class="inner_cell">
<div class="input_area">
<div class=" highlight hl-ipython3"><pre><span></span><span class="n">pod_client</span> <span class="o">=</span> <span class="n">PodClient</span><span class="p">()</span>
<span class="k">assert</span> <span class="n">pod_client</span><span class="o">.</span><span class="n">add_to_schema</span><span class="p">(</span><span class="n">EmailMessage</span><span class="p">(</span><span class="n">externalId</span><span class="o">=</span><span class="s2">"x"</span><span class="p">,</span> <span class="n">subject</span><span class="o">=</span><span class="s2">"x"</span><span class="p">,</span> <span class="n">dateSent</span><span class="o">=</span><span class="mi">2</span><span class="p">,</span> <span class="n">content</span><span class="o">=</span><span class="s2">"x"</span><span class="p">))</span>