What is the preferred way to add many fields to all documents in a MongoDB collection?


ChemLynx

I have a Python application that iterates over every document in a MongoDB (3.0.2) collection (typically between 10K and 1M documents) and adds new fields, probably doubling or tripling the number of fields in each document.

My initial thought was to upsert the entire revised documents (using PyMongo), but now I'm questioning that:

Given that the revised documents are significantly bigger should I be inserting only the new fields, or just replacing the document?
Also, is it better to perform a write to the collection on a document by document basis or in bulk?

GHETTO.CHiLD

This is actually a great question that can be solved a few different ways, depending on how you are managing your data.

If you are upserting additional fields, does this mean the only change to your data over time is the addition of those fields? If so, you could insert the revised documents as new documents and set a TTL on the old ones so that they drop off over time. Keep in mind that if you do this, you will want an index that sorts your results by descending _id, so that the most recent additions are selected before the older ones.

The benefit of doing it this way is that you are continually writing data as opposed to seeking and updating data, so it is faster.
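A minimal sketch of the insert-and-expire approach described above, assuming PyMongo 3.x. The field names `doc_key` and `created_at` and the three helper names are hypothetical and not part of the original answer; `collection` is assumed to be a PyMongo collection object. MongoDB's TTL feature works through an index on a date field, which is why each revision is stamped with a timestamp here.

```python
import datetime


def ensure_indexes(collection, keep_seconds):
    """Create the two indexes this pattern relies on (safe to call repeatedly)."""
    # TTL index: the server deletes a document once its created_at value
    # is more than keep_seconds in the past.
    collection.create_index("created_at", expireAfterSeconds=keep_seconds)
    # Compound index so that "newest revision for this key" can be answered
    # from the index (1 = ascending, -1 = descending).
    collection.create_index([("doc_key", 1), ("_id", -1)])


def insert_revision(collection, doc_key, fields):
    """Insert a new revision instead of updating the existing document."""
    revision = dict(fields)  # copy so the caller's dict is not modified
    revision["doc_key"] = doc_key
    revision["created_at"] = datetime.datetime.now(datetime.timezone.utc)
    return collection.insert_one(revision)


def latest_revision(collection, doc_key):
    """Return the most recently inserted revision for doc_key, or None."""
    return collection.find_one({"doc_key": doc_key}, sort=[("_id", -1)])
```

With a real connection this might be called as `ensure_indexes(db.document, 7 * 24 * 3600)` once at startup, then `insert_revision(db.document, key, fields)` per write. Note that a TTL index removes every document whose timestamp is old enough, including the newest revision of a key, so this only suits data that is rewritten more often than the expiry window.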

In regards to upserts vs. bulk inserts: bulk inserts are generally faster than upserts, since upserting requires the server to find the original document first.

Given that the revised documents are significantly bigger should I be inserting only the new fields, or just replacing the document?
- You really need to understand your data fully to determine what is best, but if the only change to the data is additional fields, or changes that only need to be considered from that point forward, then bulk inserting and setting a TTL on your older data is the better method from the standpoint of write operations, as opposed to seek, find and update. When using this method you will want to use db.document.find_one() as opposed to db.document.find() so that only your current record is returned.
Also, is it better to perform a write to the collection on a document by document basis or in bulk?
- Bulk inserts will be faster than inserting each one sequentially.
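To make the two options from the question concrete, here is a rough sketch of batched writes, assuming PyMongo 3.x. The helper names, the `compute_fields` callback, and the batch size of 1000 are illustrative assumptions, not anything from the original posts. `insert_in_batches` follows the bulk-insert route recommended above; `set_fields_in_batches` is the alternative of sending only the new fields with `$set`.

```python
from itertools import islice


def chunked(iterable, size):
    """Yield lists of at most `size` items from `iterable`."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def insert_in_batches(collection, documents, batch_size=1000):
    """Insert documents with one insert_many call per batch; return the count."""
    inserted = 0
    for batch in chunked(documents, batch_size):
        # ordered=False lets the server continue past an individual failure.
        result = collection.insert_many(batch, ordered=False)
        inserted += len(result.inserted_ids)
    return inserted


def set_fields_in_batches(collection, compute_fields, batch_size=1000):
    """Add fields to existing documents with $set, one bulk_write per batch.

    compute_fields (hypothetical) takes a document and returns a dict
    containing only the new fields for it.
    """
    # Imported here so the helpers above can be used without the driver.
    from pymongo import UpdateOne

    modified = 0
    for batch in chunked(collection.find(), batch_size):
        operations = []
        for doc in batch:
            fields = compute_fields(doc)
            if fields:  # skip documents with nothing to add
                operations.append(UpdateOne({"_id": doc["_id"]}, {"$set": fields}))
        if operations:
            result = collection.bulk_write(operations, ordered=False)
            modified += result.modified_count
    return modified
```

Either way, the point is the same as in the answer: one round trip per batch rather than one per document. If `compute_fields` is deterministic, re-applying the same `$set` to a document is harmless, which helps if a document is returned more than once while the collection is being modified.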


Edited 2021-07-2
