Using Google APIs in Python: Part 2 – Gmail API

Introduction

In this part of the tutorial series we will learn how to authenticate against the Gmail API, retrieve a user's emails, and perform other operations using Google's official Python client library, google-api-python-client.

This post provides the scaffolding and setup code needed to access Gmail. To understand that code completely, or to learn how to acquire OAuth credentials, see Using Google API in Python: Introduction and Setup.

Setup Code

Imports

from __future__ import print_function
import httplib2
import os

from googleapiclient import discovery  # 'apiclient' is the older alias of this package
from oauth2client import client
from oauth2client import tools
from oauth2client.file import Storage

try:
    import argparse
    flags = argparse.ArgumentParser(parents=[tools.argparser]).parse_args()
except ImportError:
    flags = None

First acquire a Google OAuth client ID from the Google developer console, then use the code below to authenticate and connect to the API. For the full walkthrough see Using Google API in Python: Part 1 – Introduction and Setup.

Code to get credentials

The SCOPES variable defines the permissions our application requests from the API. The Gmail API offers several scopes besides the read-only one used here. To request more than one scope, make SCOPES a Python list of strings instead of a single string.

The setup below requests only the read-only scope:

# If modifying these scopes, delete your previously saved credentials
# at ~/.credentials/gmail-python-quickstart.json
SCOPES = 'https://www.googleapis.com/auth/gmail.readonly'
CLIENT_SECRET_FILE = 'client_secret.json'
APPLICATION_NAME = 'Gmail API Python Quickstart'


def get_credentials():
    """Gets valid user credentials from storage.

    If nothing has been stored, or if the stored credentials are invalid,
    the OAuth2 flow is completed to obtain the new credentials.

    Returns:
        Credentials, the obtained credential.
    """
    home_dir = os.path.expanduser('~')
    credential_dir = os.path.join(home_dir, '.credentials')
    if not os.path.exists(credential_dir):
        os.makedirs(credential_dir)
    credential_path = os.path.join(credential_dir,
                                   'gmail-python-quickstart.json')

    store = Storage(credential_path)
    credentials = store.get()
    if not credentials or credentials.invalid:
        flow = client.flow_from_clientsecrets(CLIENT_SECRET_FILE, SCOPES)
        flow.user_agent = APPLICATION_NAME
        if flags:
            credentials = tools.run_flow(flow, store, flags)
        else: # Needed only for compatibility with Python 2.6
            credentials = tools.run(flow, store)
        print('Storing credentials to ' + credential_path)
    return credentials
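As a rough sketch of the list form of SCOPES described earlier, assume an application that needs to both read and send mail. The choice of scopes and the helper scopes_changed are illustrative assumptions, not part of the original setup code:

```python
# Hypothetical multi-scope setup: read mail and send mail.
# Request only the scopes your application really needs.
SCOPES = [
    'https://www.googleapis.com/auth/gmail.readonly',
    'https://www.googleapis.com/auth/gmail.send',
]


def scopes_changed(saved_scopes, wanted_scopes):
    """Return True when the saved scopes do not cover every wanted scope.

    Hypothetical helper: it works on plain lists of scope URLs and
    does not touch the stored credentials file itself.
    """
    return not set(wanted_scopes).issubset(set(saved_scopes))
```

If this returns True for the scopes you authorized earlier, that is the situation in which the saved credentials file should be deleted so the OAuth flow runs again.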

Connecting and Getting credentials

#getting credentials
credentials = get_credentials()

#authorization of credentials
http = credentials.authorize(httplib2.Http())

#service variable is the access point to complete gmail API
service = discovery.build('gmail', 'v1', http=http)
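One quick way to check that the connection works is to ask for the mailbox profile. The sketch below assumes the service object built above; describeMailbox is a hypothetical helper name, while users().getProfile() and its emailAddress and messagesTotal fields come from the Gmail API:

```python
def describeMailbox(service, userId='me'):
    """Return a one-line summary of the authorized mailbox (hypothetical helper)."""
    # getProfile returns a dict with emailAddress, messagesTotal and a few other fields
    profile = service.users().getProfile(userId=userId).execute()
    return '%s: %d messages' % (profile['emailAddress'], profile['messagesTotal'])
```

Calling print(describeMailbox(service)) should print the address of the account that granted access.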

Basics

Retrieving Emails

Getting email IDs

#getting email ids (one page of results; 'messages' is absent when nothing matches)
ids = service.users().messages().list(userId='me').execute().get('messages', [])
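A single list call returns only one page of results. The sketch below follows the nextPageToken field of each response to collect every ID. The helper name listAllMessageIds and its parameters are assumptions made for illustration:

```python
def listAllMessageIds(service, userId='me', query=None):
    """Collect message IDs across all result pages (hypothetical helper)."""
    ids = []
    pageToken = None
    while True:
        kwargs = {'userId': userId}
        if query:
            kwargs['q'] = query
        if pageToken:
            kwargs['pageToken'] = pageToken
        response = service.users().messages().list(**kwargs).execute()
        # 'messages' is missing from the response when a page is empty
        ids.extend(m['id'] for m in response.get('messages', []))
        pageToken = response.get('nextPageToken')
        if not pageToken:
            return ids
```

On a large mailbox this makes many requests, so it is usually worth passing a query to narrow the results.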

Parsing email body as HTML

  1. First we will fetch the message in RAW form
  2. Then we will base64url decode it to get the raw message text (headers plus body, which contains the HTML)
  3. Then we will print prettified HTML using BeautifulSoup

from bs4 import BeautifulSoup
import base64

for i, msg_ref in enumerate(ids):
    #getting message in raw format
    body = service.users().messages().get(userId='me', id=msg_ref['id'], format='raw').execute()

    #decoding the base64url string to get the raw message
    html = base64.urlsafe_b64decode(body['raw'].encode('ASCII'))

    #if you have lxml installed you can use that too instead of html5lib
    soup = BeautifulSoup(html, 'html5lib')

    #printing prettified HTML
    print(soup.prettify())
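The decoded raw message still contains mail headers, and its parts may be encoded as base64 or quoted-printable. As an alternative sketch, Python's standard email module can pull out just the HTML part before it is handed to BeautifulSoup. This assumes Python 3, and extractHtml is a hypothetical helper name:

```python
import base64
import email


def extractHtml(rawB64url):
    """Return the first text/html part of a raw Gmail message, or None."""
    rawBytes = base64.urlsafe_b64decode(rawB64url.encode('ASCII'))
    message = email.message_from_bytes(rawBytes)
    # walk() visits the message itself and then every nested MIME part
    for part in message.walk():
        if part.get_content_type() == 'text/html':
            # decode=True undoes base64 / quoted-printable transfer encoding
            payload = part.get_payload(decode=True)
            charset = part.get_content_charset() or 'utf-8'
            return payload.decode(charset, errors='replace')
    return None
```

With this helper, extractHtml(body['raw']) gives a string that can be passed to BeautifulSoup. A plain-text-only message yields None.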

Extracting text data from email body

We cannot just convert a BeautifulSoup instance to text; we first have to do some cleanup:

  1. Remove style, script, meta, document, head and title.
  2. Break the text into lines and remove leading/trailing spaces
  3. Drop the blank lines that are left over

def htmlToText(html):
    soup = BeautifulSoup(html, 'html5lib')

    #removing scripts, styles and other useless tags
    for element in soup(['style', 'script', 'meta', '[document]', 'head', 'title']):
        element.extract()

    #getting text from html
    text = soup.get_text()

    #removing leading/trailing spaces
    lines = [line.strip() for line in text.splitlines()]

    #breaking multi-headlines (separated by two spaces) into a line each
    chunks = [phrase.strip() for line in lines for phrase in line.split('  ')]

    #dropping blank lines
    text = '\n'.join([chunk for chunk in chunks if chunk])

    return text

Sending Emails

There are two ways to send email using the Gmail API:

  • You can send it directly using the messages.send method.
  • You can send it from a draft, using the drafts.send method.

Emails are sent as base64url encoded strings within the raw property of a message resource. The high-level workflow to send an email is to:

  1. Create the email content in some convenient way and encode it as a base64url string.
  2. Create a new message resource and set its raw property to the base64url string you just created.
  3. Call messages.send or, if sending a draft, drafts.send to send the message.

Creating Email

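The Gmail API requires mail messages to be base64url encoded.

import base64
from email.mime.text import MIMEText

def createMessage(sender, to, subject, message):
    message = MIMEText(message)
    message['to'] = to
    message['from'] = sender
    message['subject'] = subject
    #encode to bytes before base64url encoding, then decode so 'raw' is a plain string
    raw = base64.urlsafe_b64encode(message.as_string().encode('utf-8'))
    return {'raw': raw.decode('ascii')}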
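The same pattern extends to HTML mail by passing a subtype to MIMEText. This is a minimal sketch that assumes Python 3; createHtmlMessage is a hypothetical variant, not a function from the client library:

```python
import base64
from email.mime.text import MIMEText


def createHtmlMessage(sender, to, subject, html):
    """Build a message resource whose body is HTML (hypothetical helper)."""
    # the second argument sets the MIME subtype, giving Content-Type: text/html
    message = MIMEText(html, 'html')
    message['to'] = to
    message['from'] = sender
    message['subject'] = subject
    raw = base64.urlsafe_b64encode(message.as_bytes()).decode('ascii')
    return {'raw': raw}
```

The returned dictionary has the same shape as a plain-text message resource, so it can be sent or saved as a draft in the same way.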

Sending Email

from googleapiclient import errors

def sendMail(service, userId, message):
    try:
        #message is the {'raw': ...} dictionary built above
        message = (service.users().messages().send(userId=userId, body=message).execute())
        return message
    except errors.HttpError as error:
        print(error)

Creating Drafts

Creating a draft is very similar to sending an email: instead of messages() and send() we use drafts() and create(). The message resource has to be wrapped in a draft resource under the message key.

from googleapiclient import errors

def createDraft(service, userId, message):
    try:
        #a draft resource wraps the message resource
        draft = (service.users().drafts().create(userId=userId, body={'message': message}).execute())
        return draft
    except errors.HttpError as error:
        print(error)
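The second way of sending mentioned earlier, drafts.send, can be sketched as follows. It assumes you already hold the ID of a saved draft, for example the id field of the resource returned by drafts().create(). The name sendDraft is a hypothetical helper:

```python
def sendDraft(service, userId, draftId):
    """Send an existing draft by its ID (hypothetical helper)."""
    # drafts().send takes a draft resource; here only its id is supplied
    return service.users().drafts().send(userId=userId, body={'id': draftId}).execute()
```

Error handling is left out to keep the sketch short; in practice it could be wrapped in the same try/except used for sending messages.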

Filtering Messages

messages.list takes an optional parameter q, a search string in the same format as the Gmail search box, which we can use to filter messages.

messages = service.users().messages().list(userId='me', q='from:someuser@example.com').execute()
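Several search operators can be combined in one q string by separating them with spaces. The sketch below joins a few common ones. buildQuery and its parameters are hypothetical; the operators themselves (from:, is:unread, has:attachment, after:) are standard Gmail search operators:

```python
def buildQuery(sender=None, unread=False, hasAttachment=False, after=None):
    """Join Gmail search operators into one q string (hypothetical helper)."""
    terms = []
    if sender:
        terms.append('from:' + sender)
    if unread:
        terms.append('is:unread')
    if hasAttachment:
        terms.append('has:attachment')
    if after:
        # Gmail expects dates in YYYY/MM/DD form
        terms.append('after:' + after)
    return ' '.join(terms)
```

The result can be passed straight to the q parameter, for example q=buildQuery(sender='someuser@example.com', unread=True).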

Working with labels

You can use labels to tag, organize, and categorize messages and threads in Gmail. A label has a many-to-many relationship with messages and threads: a single message or thread may have multiple labels applied to it and a single label may be applied to multiple messages or threads.

Types of Labels

Gmail provides a number of built-in system labels. A few of them are:

  • SPAM
  • INBOX
  • TRASH
  • UNREAD
  • IMPORTANT
  • STARRED

and many more…

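A few labels cannot be applied manually; Gmail applies them automatically. Those labels are:

  • SENT
  • DRAFT

def getSpamMails(service, userId, labelId):
    #listing messages that carry the given label, e.g. labelId='SPAM'
    #includeSpamTrash makes sure SPAM and TRASH messages are not left out of the results
    spams = service.users().messages().list(userId=userId, labelIds=[labelId], includeSpamTrash=True).execute()
    return spams.get('messages', [])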
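Labels can also be added to or removed from a message with messages().modify(). As a minimal sketch, marking a message as read means removing the UNREAD system label. This assumes the credentials were granted a scope that allows changes, such as gmail.modify, rather than the read-only scope used in the setup code. markAsRead is a hypothetical helper name:

```python
def markAsRead(service, userId, messageId):
    """Remove the UNREAD system label from one message (hypothetical helper)."""
    body = {'removeLabelIds': ['UNREAD'], 'addLabelIds': []}
    return service.users().messages().modify(userId=userId, id=messageId, body=body).execute()
```

The same call with 'addLabelIds': ['STARRED'] would star the message instead.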

 
