Using MongoDB in Python with PyMongo

Introduction

PyMongo is a Python distribution containing tools for working with MongoDB, and it is the recommended way to work with MongoDB from Python. To follow this guide locally, you need MongoDB installed on your machine. We will instead be using a cloud service called mLab.com, which provides enough free space to learn, test, experiment, or deploy small-scale MongoDB applications.

Installation

pip install pymongo

mLab.com Setup

First, sign in to mLab, or create a new account if you don’t already have one.

1 – Select a service provider. All of them have free space available, so select any free plan.

2 – Select a region.

3 – Choose a database name.

4 – On the next screen, confirm the details and click Submit Order. Don’t worry, it’s free if you selected a free plan in the first step.

5 – Now go to Home and select your database to see its information. The host name for this particular instance is in the second URL shown.

Basics

First we have to initialize a connection to our MongoDB database. We create an instance of the MongoClient class and pass it the host and port at which the MongoDB instance is running. If you are running MongoDB locally, the host is localhost and the default MongoDB port is 27017. Since I am using mLab, my host name and port come from mLab.com.

from pymongo import MongoClient

# the host is a string; the port must be an integer, not a string
client = MongoClient('localhost', 27017)

# or you can pass the MongoDB connection string
client = MongoClient('mongodb://<dbuser>:<dbpassword>@ds117615.mlab.com:17615/mongo-tutorial')
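If the user name or password contains characters such as @, : or /, they need to be percent-escaped before going into a connection string. The following is a minimal sketch of one way to do that with the standard library. The helper name build_mongo_uri and the credentials are hypothetical and only for illustration.

```python
from urllib.parse import quote_plus


def build_mongo_uri(user, password, host, port, database):
    """Build a mongodb:// connection string with percent-escaped credentials."""
    # quote_plus escapes characters such as '@', ':' and '/' that would
    # otherwise be misread as URI delimiters.
    return 'mongodb://{}:{}@{}:{}/{}'.format(
        quote_plus(user), quote_plus(password), host, port, database
    )


# Placeholder credentials for illustration only.
uri = build_mongo_uri('dbuser', 'p@ss/word', 'ds117615.mlab.com', 17615, 'mongo-tutorial')
print(uri)
```

The resulting string can then be passed to MongoClient in place of a hand-written one.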

Accessing Databases

We can get a database instance by using the database name as an attribute of our client variable.

db = client.myproject

We can also get a database by using its name as a key. This is the option to use when a valid database name is not a valid Python attribute name.

db = client['myproject']

Accessing Collections

Just like databases, we can get a collection by using its name as a property of our database variable or as a key.

usersCollection = db['users']
#or
usersCollection = db.users

Dropping Collections

usersCollection.drop()

PyMongo Operations

Insertion Operations

There are two main operations, insert_one() and insert_many(), for single and multiple document insertion. To retrieve the id(s) of the inserted documents, use the inserted_id property of the insert_one() result or the inserted_ids property of the insert_many() result.

newUser = {'rollno': 20, 'name': 'Sahil Sehwag', 'hobbies': 'programming'}
insertedId = usersCollection.insert_one(newUser).inserted_id

# bulk insertion
newUsers = [
    {'rollno': 10, 'name': 'Sagar Sehwag', 'hobbies': 'web development'},
    {'rollno': 15, 'name': 'Karan', 'hobbies': 'bakchodi'},
    {'rollno': 24, 'name': 'ram', 'hobbies': 'programming'},
]
ids = usersCollection.insert_many(newUsers).inserted_ids
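Because the two insert methods return differently shaped results, it can be convenient to wrap them. The sketch below is one possible wrapper, not part of PyMongo: the name insert_users is hypothetical, and collection is assumed to be a PyMongo collection such as usersCollection.

```python
def insert_users(collection, users):
    """Insert one or many user documents and return a list of their _id values.

    `collection` is assumed to be a PyMongo Collection (for example usersCollection).
    """
    users = list(users)
    if not users:
        # insert_many() rejects an empty list, so return early.
        return []
    if len(users) == 1:
        # insert_one() returns a result object with a single inserted_id.
        return [collection.insert_one(users[0]).inserted_id]
    # insert_many() returns a result object with a list of inserted_ids.
    return collection.insert_many(users).inserted_ids
```

With a live connection, this would be called as insert_users(usersCollection, newUsers).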

Querying Operations

There are two basic functions for querying a collection: find() and find_one(). They do what their names say: find() returns a cursor over all documents matching the query, and find_one() returns the first matching document, or None if nothing matches.

#querying for a single document
document = usersCollection.find_one({'name': 'Sahil Sehwag'})

#getting multiple documents
documents = usersCollection.find({'rollno': {'$gt': 12}})

Just like in the MongoDB shell, we can apply operations such as sort and count to the matched documents.

#count of documents matching a filter (Cursor.count() was removed in PyMongo 4)
totalUsers = usersCollection.count_documents({})

#getting the sorted results
sortedUsers = usersCollection.find().sort('name')
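Cursor methods can be chained, so sorting is often combined with a projection and a limit. The following is a small sketch under the assumption that collection is a PyMongo collection whose documents have a name field. The helper name first_users_by_name is hypothetical.

```python
def first_users_by_name(collection, limit=10):
    """Return up to `limit` user names in ascending alphabetical order.

    `collection` is assumed to be a PyMongo Collection (for example usersCollection).
    """
    # The second argument to find() is a projection: return only the name field.
    cursor = collection.find({}, {'_id': 0, 'name': 1})
    # 1 means ascending order (the same value as pymongo.ASCENDING).
    cursor = cursor.sort('name', 1).limit(limit)
    return [doc['name'] for doc in cursor]
```

Note that MongoDB's default string ordering is case-sensitive, so uppercase names sort before lowercase ones unless a collation is specified.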

Query Operators

There are different operators for operations such as logical AND and logical OR. Other operators include $lt, $gt, $lte, etc.

#OR operation
usersCollection.find({'$or':[{'rollno': {'$lt': 15}}, {'hobbies':'programming'}]})
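Since a query is just a Python dictionary, it can be assembled programmatically before being passed to find(). Here is a minimal sketch that builds an OR filter like the one used in this section from optional arguments. The function name build_user_filter and its parameters are hypothetical.

```python
def build_user_filter(max_rollno=None, hobby=None):
    """Return a filter matching users with rollno < max_rollno OR the given hobby."""
    conditions = []
    if max_rollno is not None:
        conditions.append({'rollno': {'$lt': max_rollno}})
    if hobby is not None:
        conditions.append({'hobbies': hobby})
    if not conditions:
        # An empty filter matches every document.
        return {}
    if len(conditions) == 1:
        # A single condition does not need to be wrapped in $or.
        return conditions[0]
    return {'$or': conditions}


print(build_user_filter(15, 'programming'))
```

The returned dictionary can be passed directly as the first argument to find() or find_one().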

Deletion Operations

Just like the operations above, PyMongo has two operations for document removal: one for removing a single document and one for bulk deletion.

deletedCount = usersCollection.delete_many({'name': 'Sahil Sehwag'}).deleted_count

Similarly, the delete_one operation is used for single document deletion.
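As a sketch of single document deletion, the helper below deletes at most one user by roll number and reports whether anything was removed. The name delete_user_by_rollno is hypothetical, and collection is assumed to be a PyMongo collection such as usersCollection.

```python
def delete_user_by_rollno(collection, rollno):
    """Delete at most one user with the given rollno; return True if one was deleted.

    `collection` is assumed to be a PyMongo Collection (for example usersCollection).
    """
    # delete_one() removes only the first matching document.
    result = collection.delete_one({'rollno': rollno})
    # deleted_count is 0 when no document matched the filter.
    return result.deleted_count == 1
```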

Update Operations

The first argument to the update methods is the query (filter) document, and the second is an update document, built from operators such as $set, that describes how to change the matched document.

usersCollection.update_one({'rollno': 20}, {'$set': {'name': 'Sahil'}})

Similarly, there is an update_many operation, which applies the same update document to every document that matches the filter.

totalDocumentsModified = usersCollection.update_many({'hobbies': 'programming'}, { '$set': {'hobbies':'development'}}).modified_count

The modified_count property gives the number of documents that were actually modified, not their ids. If you want to replace a document with a completely new one, use replace_one() and pass the new document; update_one() and update_many() require update operators like $set.
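Whole-document replacement is done with replace_one(), which takes a filter and a plain replacement document. The sketch below also passes upsert=True so that the document is inserted when no match exists. The helper name replace_user is hypothetical, and collection is assumed to be a PyMongo collection.

```python
def replace_user(collection, rollno, new_document):
    """Replace the user with the given rollno, inserting it if it does not exist.

    `collection` is assumed to be a PyMongo Collection (for example usersCollection).
    Returns (matched_count, modified_count, upserted_id).
    """
    # The replacement is a plain document with no $ operators.
    result = collection.replace_one({'rollno': rollno}, new_document, upsert=True)
    # upserted_id is None unless a new document was inserted.
    return result.matched_count, result.modified_count, result.upserted_id
```

The _id of an existing document is preserved by a replacement, while every other field is overwritten by the new document.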

Indexing Operations

Indexes are used to speed up access operations. We create indexes on fields that are frequently queried.

import pymongo
usersCollection.create_index([('rollno', pymongo.ASCENDING)], unique=True)

Conclusion

PyMongo has a very familiar API structure if you already know MongoDB. The API consistency between PyMongo and MongoDB makes it easy to learn and use. For a more in-depth understanding or for reference, read the PyMongo documentation.
