The Ultimate Guide to Python-MongoDB ‘Pro-Tips’! Make Database Operations a Breeze!

amy 02/12/2025

MongoDB is a non-relational database written in C++. It is an open-source database system based on distributed file storage. Its content is stored as JSON-like documents. Field values can contain other documents, arrays, and arrays of documents, which makes the data model highly flexible. In this article, CloudDude (Yunduojun) walks through MongoDB storage operations in Python 3 with you.

1. Preparations

Before starting, please ensure MongoDB is installed, its service is started, and the PyMongo library for Python is installed.

2. Connecting to MongoDB

To connect to MongoDB, we need to use MongoClient from the PyMongo library. Generally, you need to pass the IP address and port of MongoDB. The first parameter is the host address, and the second is the port (if not specified, it defaults to 27017):

python

import pymongo
client = pymongo.MongoClient(host='localhost', port=27017)

This creates a MongoDB connection object.

Alternatively, the first parameter host of MongoClient can also accept a MongoDB connection string starting with mongodb://, for example:

python

client = pymongo.MongoClient('mongodb://localhost:27017/')

This achieves the same connection result.
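When the server requires authentication, the credentials go into the same connection string. The following is a minimal sketch, assuming a hypothetical helper build_mongo_uri (not part of PyMongo) and made-up credentials. The username and password are percent-encoded with the standard library so that characters such as @ or / do not break the URI:

```python
from urllib.parse import quote_plus


def build_mongo_uri(host, port=27017, username=None, password=None):
    """Build a mongodb:// connection string (hypothetical helper)."""
    credentials = ''
    if username is not None and password is not None:
        # Reserved characters in credentials must be percent-encoded.
        credentials = f'{quote_plus(username)}:{quote_plus(password)}@'
    return f'mongodb://{credentials}{host}:{port}/'


print(build_mongo_uri('localhost'))
# mongodb://localhost:27017/
print(build_mongo_uri('localhost', 27017, 'amy', 'p@ss/word'))
# mongodb://amy:p%40ss%2Fword@localhost:27017/
```

The resulting string can be passed to pymongo.MongoClient() in the same way as the literal connection string above.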

3. Specifying a Database

MongoDB can have multiple databases. Next, we need to specify which database to operate on. Here, we’ll use the test database as an example. Specify the database to use in your program:

python

db = client.test

Here, calling the test attribute of client returns the test database. Alternatively, you can specify it like this:

python

db = client['test']

Both methods are equivalent.

4. Specifying a Collection

Each MongoDB database contains many collections, which are similar to tables in relational databases.

Next, specify the collection to operate on. Here, we’ll use a collection named students. Similar to specifying a database, there are two ways to specify a collection:

python

collection = db.students
# or
collection = db['students']

This declares a Collection object.

5. Inserting Data

Now, you can insert data. For the students collection, let’s create a new student record represented as a dictionary:

python

student = {
    'id': '20170101',
    'name': 'Jordan',
    'age': 20,
    'gender': 'male'
}

This specifies the student’s ID, name, age, and gender. Next, simply call the insert() method of the collection to insert the data:

python

result = collection.insert(student)
print(result)

In MongoDB, each document has an _id field that uniquely identifies it. If this field is not specified explicitly, an _id of type ObjectId is generated automatically. The insert() method returns the _id value after execution.

Output:

text

5932a68615c2606814c91f3d
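The 24 hexadecimal characters of an ObjectId are not arbitrary: the first 4 bytes store the creation time as seconds since the Unix epoch. A small sketch using only the standard library decodes that timestamp from the string form (the helper name objectid_created_at is made up for illustration):

```python
from datetime import datetime, timezone


def objectid_created_at(hex_id):
    """Return the creation time embedded in a 24-character ObjectId hex string."""
    if len(hex_id) != 24:
        raise ValueError('an ObjectId has 24 hexadecimal characters')
    seconds = int(hex_id[:8], 16)  # first 4 bytes = seconds since the epoch
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


print(objectid_created_at('5932a68615c2606814c91f3d'))
```

PyMongo's ObjectId class exposes the same information through its generation_time attribute.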

You can also insert multiple documents at once by passing a list:

python

student1 = {
    'id': '20170101',
    'name': 'Jordan',
    'age': 20,
    'gender': 'male'
}

student2 = {
    'id': '20170202',
    'name': 'Mike',
    'age': 21,
    'gender': 'male'
}

result = collection.insert([student1, student2])
print(result)

The returned result is a list of the corresponding _id values:

text

[ObjectId('5932a80115c2606a59e8a048'), ObjectId('5932a80115c2606a59e8a049')]

Note that the insert() method is deprecated in PyMongo 3.x (it still works there) and was removed in PyMongo 4.0. The recommended methods are insert_one() and insert_many() for inserting single and multiple records, respectively. Example:

python

student = {
    'id': '20170101',
    'name': 'Jordan',
    'age': 20,
    'gender': 'male'
}

result = collection.insert_one(student)
print(result)
print(result.inserted_id)

Output:

text

<pymongo.results.InsertOneResult object at 0x10d68b558>
5932ab0f15c2606f0c1cf6c5

Unlike insert(), this returns an InsertOneResult object. You can access its inserted_id attribute to get the _id.

For insert_many(), pass the data as a list:

python

student1 = {...}  # As above
student2 = {...}  # As above

result = collection.insert_many([student1, student2])
print(result)
print(result.inserted_ids)

Output:

text

<pymongo.results.InsertManyResult object at 0x101dea558>
[ObjectId('5932abf415c2607083d3b2ac'), ObjectId('5932abf415c2607083d3b2ad')]

This method returns an InsertManyResult object. Its inserted_ids attribute holds a list of the _id values of the inserted documents.

6. Querying

After inserting data, you can query using find_one() or find(). find_one() returns a single result, while find() returns a Cursor object that you iterate over, much like a generator. Example:

python

result = collection.find_one({'name': 'Mike'})
print(type(result))
print(result)

Here we query data where the name is ‘Mike’. The return type is a dictionary.

Output:

text

<class 'dict'>
{'_id': ObjectId('5932a80115c2606a59e8a049'), 'id': '20170202', 'name': 'Mike', 'age': 21, 'gender': 'male'}

Note the added _id property, which MongoDB automatically adds during insertion.

You can also query by ObjectId, which requires importing ObjectId from bson.objectid:

python

from bson.objectid import ObjectId

result = collection.find_one({'_id': ObjectId('593278c115c2602667ec6bae')})
print(result)

The query result is still a dictionary. If no result is found, None is returned.
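Because find_one() may return None, indexing into the result directly raises a TypeError when nothing matches. A minimal sketch of a guard, assuming a hypothetical helper find_student_age and the students collection used in this article:

```python
def find_student_age(collection, name, default=None):
    """Return the age of the first student with this name, or `default`."""
    student = collection.find_one({'name': name})
    if student is None:
        # No document matched, so there is nothing to index into.
        return default
    return student.get('age', default)


# Usage sketch:
# age = find_student_age(collection, 'Mike')
```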

For querying multiple documents, use the find() method. For example, to find data where age is 20:

python

results = collection.find({'age': 20})
print(results)
for result in results:
    print(result)

Output:

text

<pymongo.cursor.Cursor object at 0x1032d5128>
{'_id': ObjectId('593278c115c2602667ec6bae'), 'id': '20170101', 'name': 'Jordan', 'age': 20, 'gender': 'male'}
{'_id': ObjectId('593278c815c2602678bb2b8d'), 'id': '20170102', 'name': 'Kevin', 'age': 20, 'gender': 'male'}
{'_id': ObjectId('593278d815c260269d7645a8'), 'id': '20170103', 'name': 'Harden', 'age': 20, 'gender': 'male'}

The return type is Cursor, which acts like a generator. You need to iterate through it to get all results, each being a dictionary.

To query for data where age is greater than 20:

python

results = collection.find({'age': {'$gt': 20}})

Here the query condition key’s value is not a simple number but a dictionary with the key $gt (greater than) and value 20.

Here’s a summary table of comparison operators:

| Symbol | Meaning | Example |
| --- | --- | --- |
| $lt | Less than | {'age': {'$lt': 20}} |
| $gt | Greater than | {'age': {'$gt': 20}} |
| $lte | Less than or equal | {'age': {'$lte': 20}} |
| $gte | Greater than or equal | {'age': {'$gte': 20}} |
| $ne | Not equal | {'age': {'$ne': 20}} |
| $in | In array | {'age': {'$in': [20, 23]}} |
| $nin | Not in array | {'age': {'$nin': [20, 23]}} |
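Several operators from the table can be combined in one condition dictionary for the same field, in which case all of them must hold. A minimal sketch, with a hypothetical helper age_between that builds a half-open range condition:

```python
def age_between(min_age, max_age):
    """Condition for min_age <= age < max_age (hypothetical helper)."""
    return {'age': {'$gte': min_age, '$lt': max_age}}


print(age_between(20, 23))
# {'age': {'$gte': 20, '$lt': 23}}

# Usage sketch:
# results = collection.find(age_between(20, 23))
```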

You can also perform regular expression queries. For example, to find students whose names start with ‘M’:

python

results = collection.find({'name': {'$regex': '^M.*'}})

Here, $regex specifies the regular expression match. ^M.* is a regex meaning “starts with M”.
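If the prefix comes from user input, characters such as . or * would be interpreted as regex syntax. The sketch below, assuming a hypothetical helper starts_with, escapes the prefix with Python's re.escape before building the condition. The trailing .* is optional, since ^M alone already matches any name beginning with M:

```python
import re


def starts_with(field, prefix):
    """Build a condition matching values of `field` that begin with `prefix`."""
    return {field: {'$regex': '^' + re.escape(prefix)}}


print(starts_with('name', 'M'))
# {'name': {'$regex': '^M'}}

# Usage sketch:
# results = collection.find(starts_with('name', 'A.J'))
```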

Here’s a table summarizing some functional operators:

| Symbol | Meaning | Example | Explanation |
| --- | --- | --- | --- |
| $regex | Match regex | {'name': {'$regex': '^M.*'}} | Name starts with M |
| $exists | Property exists | {'name': {'$exists': True}} | Name property exists |
| $type | Type check | {'age': {'$type': 'int'}} | Age is of type int |
| $mod | Modulo operation | {'age': {'$mod': [5, 0]}} | Age modulo 5 equals 0 |
| $text | Text search | {'$text': {'$search': 'Mike'}} | Text-type property contains string ‘Mike’ |
| $where | Advanced conditional query | {'$where': 'obj.fans_count == obj.follows_count'} | Own follower count equals following count |

For more detailed usage of these operators, refer to the official MongoDB documentation:
https://docs.mongodb.com/manual/reference/operator/query/

7. Counting

To count the documents matching a query, use the collection's count_documents() method. The older Cursor.count() call, as in collection.find().count(), was deprecated in PyMongo 3.7 and removed in PyMongo 4.0. For example, to count all documents:

python

count = collection.count_documents({})
print(count)

Or to count documents matching a condition:

python

count = collection.count_documents({'age': 20})
print(count)

The result is an integer representing the count.

8. Sorting

To sort, call the sort() method, passing the field to sort by and the sort order flag. Example:

python

results = collection.find().sort('name', pymongo.ASCENDING)
print([result['name'] for result in results])

Output:

text

['Harden', 'Jordan', 'Kevin', 'Mark', 'Mike']

Here, pymongo.ASCENDING specifies ascending order. For descending order, use pymongo.DESCENDING.
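sort() also accepts a list of (field, direction) pairs when the results should be ordered by more than one field. A minimal sketch, assuming a hypothetical helper oldest_first. The two constants are defined locally with the same values as pymongo.ASCENDING and pymongo.DESCENDING so the snippet does not depend on an import:

```python
ASCENDING, DESCENDING = 1, -1  # same values as pymongo.ASCENDING / pymongo.DESCENDING


def oldest_first(collection):
    """Return students sorted by age (descending), then by name (ascending)."""
    cursor = collection.find().sort([('age', DESCENDING), ('name', ASCENDING)])
    return list(cursor)


# Usage sketch:
# for student in oldest_first(collection):
#     print(student['name'], student['age'])
```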

9. Offset and Limit

In some cases, you might want to skip a number of results. Use skip() to offset. For example, to skip the first two results:

python

results = collection.find().sort('name', pymongo.ASCENDING).skip(2)
print([result['name'] for result in results])

Output:

text

['Kevin', 'Mark', 'Mike']

You can also use limit() to specify the number of results to return:

python

results = collection.find().sort('name', pymongo.ASCENDING).skip(2).limit(2)
print([result['name'] for result in results])

Output:

text

['Kevin', 'Mark']

Without limit(), three results would be returned. With the limit, only two are returned.
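skip() and limit() together give simple page-based access. A small sketch, assuming a hypothetical helper page_window that turns a 1-based page number into the two values:

```python
def page_window(page, page_size):
    """Translate a 1-based page number into (skip, limit) values."""
    if page < 1 or page_size < 1:
        raise ValueError('page and page_size must be positive')
    return (page - 1) * page_size, page_size


skip, limit = page_window(page=2, page_size=2)
print(skip, limit)
# 2 2

# Usage sketch:
# results = collection.find().sort('name', pymongo.ASCENDING).skip(skip).limit(limit)
```

With a page size of 2, page 2 corresponds to skip(2).limit(2), the same call as in the example above.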

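Note: When dealing with very large datasets (tens of millions or billions of documents), avoid large offsets. The server still has to walk past every skipped document, so the query slows down as the offset grows and can cause memory issues. Instead, filter on _id so that the query starts right after the last document already seen: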

python

from bson.objectid import ObjectId
collection.find({'_id': {'$gt': ObjectId('593278c815c2602678bb2b8d')}})

This requires keeping track of the last _id from the previous query.
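Put together, this approach might look like the sketch below. The helper name fetch_page_after and the default page size are assumptions for illustration. The results are sorted by _id so that "the last _id" is well defined, and ASCENDING is defined locally with the same value as pymongo.ASCENDING:

```python
ASCENDING = 1  # same value as pymongo.ASCENDING


def fetch_page_after(collection, last_id=None, page_size=100):
    """Return up to page_size documents whose _id is greater than last_id."""
    condition = {} if last_id is None else {'_id': {'$gt': last_id}}
    cursor = collection.find(condition).sort('_id', ASCENDING).limit(page_size)
    return list(cursor)


# Usage sketch:
# page = fetch_page_after(collection)
# while page:
#     for doc in page:
#         print(doc)
#     page = fetch_page_after(collection, last_id=page[-1]['_id'])
```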

10. Updating

To update data, use the update() method, specifying the condition and the new data. Example:

python

condition = {'name': 'Kevin'}
student = collection.find_one(condition)
student['age'] = 25
result = collection.update(condition, student)
print(result)

Here we update the age for the student named ‘Kevin’: first query the data, modify the age, then call update() with the original condition and the modified data.

Output:

text

{'ok': 1, 'nModified': 1, 'n': 1, 'updatedExisting': True}

The result is a dictionary. ok indicates success, nModified indicates the number of documents affected.

Alternatively, you can use the $set operator to update data:

python

result = collection.update(condition, {'$set': student})

This updates only the fields present in the student dictionary. Other existing fields remain unchanged and are not deleted. Without $set, the entire document would be replaced by the student dictionary, potentially deleting other fields.
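The difference can be modeled with plain dictionaries. The two functions below only illustrate the semantics described above; they are not how MongoDB or PyMongo implement updates, and the function names and sample document are made up:

```python
def replace_document(stored, new_doc):
    """Model of a full replacement: only _id survives from the stored document."""
    return {'_id': stored['_id'], **new_doc}


def set_fields(stored, fields):
    """Model of {'$set': fields}: listed fields change, all others are kept."""
    return {**stored, **fields}


stored = {'_id': 1, 'name': 'Kevin', 'age': 20, 'gender': 'male'}
print(replace_document(stored, {'name': 'Kevin', 'age': 25}))
# {'_id': 1, 'name': 'Kevin', 'age': 25}
print(set_fields(stored, {'age': 25}))
# {'_id': 1, 'name': 'Kevin', 'age': 25, 'gender': 'male'}
```

In the replacement case the gender field is gone, while the $set model keeps it.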

The update() method is likewise deprecated in PyMongo 3.x and was removed in PyMongo 4.0. The recommended methods are update_one() and update_many(). Their second parameter must be an update document whose keys are operators such as $set. Example:

python

condition = {'name': 'Kevin'}
student = collection.find_one(condition)
student['age'] = 26
result = collection.update_one(condition, {'$set': student})
print(result)
print(result.matched_count, result.modified_count)

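This calls update_one(). The second parameter is {'$set': student}. The return type is UpdateResult. Its matched_count and modified_count attributes give the number of matched and modified documents. modified_count can be lower than matched_count when a matched document already holds the new values, for example when the same update is run a second time.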

Output:

text

<pymongo.results.UpdateResult object at 0x10d17b678>
1 0

Another example:

python

condition = {'age': {'$gt': 20}}
result = collection.update_one(condition, {'$inc': {'age': 1}})
print(result)
print(result.matched_count, result.modified_count)

Here, the query condition is age > 20, and the update operation is {'$inc': {'age': 1}} (increment age by 1). This increments the age of the first matching document by 1.

Output:

text

<pymongo.results.UpdateResult object at 0x10b8874c8>
1 1

If update_many() is called, all matching documents are updated:

python

condition = {'age': {'$gt': 20}}
result = collection.update_many(condition, {'$inc': {'age': 1}})
print(result)
print(result.matched_count, result.modified_count)

Output:

text

<pymongo.results.UpdateResult object at 0x10c6384c8>
3 3

11. Deleting

Deletion is straightforward. Call remove() with the deletion condition. All matching documents will be deleted. Example:

python

result = collection.remove({'name': 'Kevin'})
print(result)

Output:

text

{'ok': 1, 'n': 1}

remove() is deprecated in PyMongo 3.x as well and was removed in PyMongo 4.0. The two recommended methods are delete_one() and delete_many(). Example:

python

result = collection.delete_one({'name': 'Kevin'})
print(result)
print(result.deleted_count)
result = collection.delete_many({'age': {'$lt': 25}})
print(result.deleted_count)

Output:

text

<pymongo.results.DeleteResult object at 0x10e6ba4c8>
1
4

delete_one() deletes the first matching document. delete_many() deletes all matching documents. Both return a DeleteResult object, and the deleted_count attribute gives the number of deleted documents.
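Because delete_many() removes everything that matches, an empty condition {} deletes every document in the collection. A defensive sketch, assuming a hypothetical wrapper delete_matching that refuses an empty condition:

```python
def delete_matching(collection, condition):
    """Delete all documents matching `condition`; refuse an empty condition."""
    if not condition:
        # {} matches every document, which is rarely what the caller wants.
        raise ValueError('refusing to delete with an empty condition')
    result = collection.delete_many(condition)
    return result.deleted_count


# Usage sketch:
# deleted = delete_matching(collection, {'age': {'$lt': 25}})
```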

12. Other Operations

PyMongo also provides combined methods such as find_one_and_delete(), find_one_and_replace(), and find_one_and_update() for find-and-delete, find-and-replace, and find-and-update operations. Their usage is similar to the methods above.

You can also perform index operations using methods such as create_index(), create_indexes(), and drop_index().
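As a brief illustration of create_index(), the sketch below builds a unique index on the id field used in the student documents. The helper name ensure_student_indexes is hypothetical, and the call is expected to fail if the collection already contains duplicate id values:

```python
def ensure_student_indexes(collection):
    """Create a unique ascending index on the student 'id' field."""
    # 1 means ascending, the same value as pymongo.ASCENDING.
    return collection.create_index([('id', 1)], unique=True)


# Usage sketch:
# index_name = ensure_student_indexes(collection)
# print(index_name)
```

create_index() returns the name of the index as a string.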

For detailed usage of PyMongo, refer to the official documentation: http://api.mongodb.com/python/current/api/pymongo/collection.html.

Operations on databases and collections themselves are also supported. Refer to the official documentation for more: http://api.mongodb.com/python/current/api/pymongo/.