Converting a Mongo cursor into a JSON object in Python
Welcome to our tutorial on serializing PyMongo cursors into JSON. In this article, we will cover how to properly handle ObjectId and datetime objects, as well as any other object, using a custom JSONEncoder.
One common task when working with PyMongo is serializing data for storage or for transfer over a network. In this tutorial, we will look at how to serialize a PyMongo cursor, the iterable object that a MongoDB query returns, into JSON format.
A commonly reported error when doing so is the following TypeError:
TypeError: ObjectId('') is not JSON serializable
We will also delve into how to properly handle complex data types such as ObjectId and datetime objects, which cannot be serialized directly into JSON. We will show you how to use a custom JSONEncoder to properly handle these objects and any other custom object types you may have in your PyMongo cursor.
So if you want to learn how to serialize PyMongo cursors into JSON and handle complex data types, keep reading!
Creating a custom JSONEncoder
The JSONEncoder, which is a member of the json module of the standard library, is an extensible JSON encoder for Python data structures. By default, it supports the following serialisations:
+----------------------------------------+--------+
| Python                                 | JSON   |
+----------------------------------------+--------+
| dict                                   | object |
| list, tuple                            | array  |
| str                                    | string |
| int, float, int- & float-derived Enums | number |
| True                                   | true   |
| False                                  | false  |
| None                                   | null   |
+----------------------------------------+--------+
This means that whenever the encoder encounters an object of a data type that is not listed in the table above, a TypeError will be raised.
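The behaviour described above can be seen with the standard library alone. The following is a minimal sketch; the dictionaries are made-up sample data, and the exact wording of the error message may differ between Python versions.

```python
import json
from datetime import datetime

# Every value here has a type listed in the table, so this works out of the box.
ok_json = json.dumps({"name": "example", "tags": ("a", "b"), "active": True, "score": None})
print(ok_json)

# datetime is not in the table, so the default encoder raises TypeError.
try:
    json.dumps({"created_at": datetime(2023, 1, 15)})
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

Note that the tuple is written out as a JSON array, as the table indicates.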
When working with documents in Mongo, every document is assigned by default an _id that serves as its unique identifier within a collection. Whenever you query a Mongo collection, a cursor is returned that points to the retrieved documents, and each of these documents carries an _id field of type ObjectId.
Therefore, if you attempt to serialise such documents using the default JSONEncoder, you will end up getting the error mentioned in the introduction of this tutorial:
TypeError: ObjectId('') is not JSON serializable
In order to serialise such objects contained in the PyMongo cursor, we need to extend the default JSONEncoder so that it handles these data types the way we want. To achieve this, we need to override the default method so that it returns the mapping we wish, as specified in the documentation:
To extend this to recognize other objects, subclass and implement a default() method with another method that returns a serializable object for o if possible, otherwise it should call the superclass implementation (to raise TypeError).
In our custom JSONEncoder, we serialise any instance of bson.ObjectId and datetime.datetime to str. Depending on the documents contained in your own Mongo cursor, you may have to handle additional (or perhaps fewer) data types.
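import json
from datetime import datetime
from typing import Any

from bson import ObjectId


class MongoJSONEncoder(json.JSONEncoder):
    def default(self, o: Any) -> Any:
        # ObjectId is written out as its string representation.
        if isinstance(o, ObjectId):
            return str(o)
        # datetime is written out via str(), e.g. "2023-01-15 10:30:00".
        if isinstance(o, datetime):
            return str(o)
        # Anything else goes to the base class, which raises TypeError.
        return json.JSONEncoder.default(self, o)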
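To put the encoder to use, it can be passed to json.dumps() through the cls argument. The following is a minimal sketch under a few assumptions: the docs list is hypothetical sample data standing in for the result of a real query, the encoder is repeated in compact form so that the snippet runs on its own, and the fallback ObjectId class is only a stand-in for environments where PyMongo is not installed.

```python
import json
from datetime import datetime
from typing import Any

try:
    from bson import ObjectId  # installed together with PyMongo
except ImportError:
    # Stand-in so the sketch runs without PyMongo; this is not the real class.
    class ObjectId:
        def __init__(self, oid: str) -> None:
            self._oid = oid

        def __str__(self) -> str:
            return self._oid


class MongoJSONEncoder(json.JSONEncoder):
    # Same logic as the encoder above, repeated so the snippet is self-contained.
    def default(self, o: Any) -> Any:
        if isinstance(o, (ObjectId, datetime)):
            return str(o)
        return super().default(o)


# Hypothetical documents; with a real cursor this would be list(cursor).
docs = [
    {
        "_id": ObjectId("64b7f0c2a1b2c3d4e5f60718"),
        "name": "example",
        "created_at": datetime(2023, 1, 15, 10, 30),
    },
]

# A cursor is not a list, so it is materialised first and then encoded.
data_json = json.dumps(docs, cls=MongoJSONEncoder)
print(data_json)
```

Calling list() on a cursor loads every matching document into memory, which is fine for small result sets but worth keeping in mind for large ones.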
Creating a Python object out of the JSON object
Finally, if you wish to convert the newly created JSON string back into a Python object (that is, a list of dictionaries containing the key-value pairs of the documents in the Mongo cursor), all you need to call is the json.loads() function:
data_obj = json.loads(data_json)
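One thing to keep in mind is that the round trip does not restore the original types: values that were converted to str by the encoder come back as plain strings. The sketch below illustrates this with a hypothetical JSON string shaped like the encoder's output, and shows one possible way to parse the timestamp back, assuming it was written with str() on a datetime.

```python
import json
from datetime import datetime

# Hypothetical JSON string, shaped like the output of the custom encoder.
data_json = '[{"_id": "64b7f0c2a1b2c3d4e5f60718", "created_at": "2023-01-15 10:30:00"}]'

data_obj = json.loads(data_json)
doc = data_obj[0]

# Both values are ordinary strings after decoding.
print(type(doc["_id"]).__name__, type(doc["created_at"]).__name__)

# If a datetime is needed again, it has to be parsed explicitly.
created_at = datetime.fromisoformat(doc["created_at"])
print(created_at.year)
```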
Final Thoughts
In this tutorial, we learned how to serialize PyMongo cursors into JSON and properly handle complex data types such as ObjectId and datetime objects. We accomplished this by creating a custom JSONEncoder that extended the default JSONEncoder and implemented a default() method.
We then used this custom encoder to encode the PyMongo cursor, and finally, we converted the resulting JSON string back into a Python object using the json.loads() function. This tutorial demonstrated how to handle ObjectId and datetime objects, but the custom JSONEncoder can also be extended to handle any other custom object types that may be present in the PyMongo cursor.