1

I have two model instances, rec1 and rec2, but each has different fields updated:

rec1 = MyModel(id=1, name="John")  # Only 'name' is changed  
rec2 = MyModel(id=2, age=30)       # Only 'age' is changed  

Now, I want to update both records efficiently using bulk_update(). However, bulk_update() requires a fixed list of fields, meaning:

updated_fields = ['name', 'age']
model.objects.bulk_update([rec1, rec2], fields=updated_fields)

This updates both fields in both records, even though:

  • rec1 only needs to update name
  • rec2 only needs to update age

Goal: Update only the changed fields for each record in a single query. Avoid unnecessary updates to unchanged fields.

Issue: bulk_update() applies the same field list to all records, leading to redundant updates.

Question: Is there a way to update only the modified fields per record efficiently in a single query? Or is there an alternative approach to handle this in Django ORM?

model.objects.bulk_update(instances_to_update, fields=updated_fields)

where updated_fields is a list.

i was expecting updated_fields as a iteration type function.

1
  • cant do, you can update instance where same field are getting used in the function call Commented Mar 13 at 20:56

3 Answers 3

0

You can do this, in fact the Django ORM will, for .bulk_update(..) essentially make a query that looks like:

from django.db.models import Case, Value, When

model.objects.filter(pk__in=[rec1.pk, rec2.pk]).update(
    name=Case(
        When(pk=rec1.pk, then=Value(rec1.name)),
        When(pk=rec2.pk, then=Value(rec2.name)),
        default=F('name'),
    ),
    age=Case(
        When(pk=rec1.pk, then=Value(rec1.age)),
        When(pk=rec2.pk, then=Value(rec2.age)),
        default=F('age'),
    ),
)

This is not very efficient: most databases will fall back to a "linear" search through the CASE-WHENs, and thus updating the records might be very slow.

so we can limit the cases to only the ones we want to update with:

model.objects.filter(pk__in=[rec1.pk, rec2.pk]).update(
    name=Case(When(pk=rec1.pk, then=Value(rec1.name)), default=F('name')),
    age=Case(When(pk=rec2.pk, then=Value(rec2.age)), default=F('age')),
)
Sign up to request clarification or add additional context in comments.

4 Comments

This answer is technically incorrect and misunderstands how Django's bulk_update() method works at the SQL level. When you use bulk_update(), Django generates SQL using CASE-WHEN statements per field and per record.
@KdPisda: If you took a look at the discussion ticket you took a look to, you could see, I've started that discussion :p. It indeed does, per field and per row, that is the first piece of code, and if you see ho|
@KdPisda: you can take a look how Django does this here: github.com/django/django/blob/… it: the core part is the for field in fields and Case(*[When(pk=obj.pk, then=attr)]) where it wraps the attr in a value if it is not an SQL expression. So sure, the mapping is not one-on-one, I decided to show the key part, and then show how to only update a certain column for a certain row.
@willem-van-onsem just realized it is you! :P I will look into the code again.
0

TLDR;

We want to update different fields for different records in a single bulk_update() operation. This is indeed a limitation with Django's current implementation.

The key issue is that Django's bulk_update() method requires you to specify a fixed list of fields that will be applied to ALL objects in your update batch. There's no built-in way to say "only update the 'name' field for record 1, but only update the 'age' field for record 2" in a single operation.

When we use something like

rec1 = MyModel(id=1, name="John")  # Only 'name' is changed
rec2 = MyModel(id=2, age=30)       # Only 'age' is changed
model.objects.bulk_update([rec1, rec2], fields=['name', 'age'])

Django will generate SQL that updates both fields for both records, which isn't what you want.

It looks something like

UPDATE my_table SET 
  name = CASE WHEN id=1 THEN 'John' WHEN id=2 THEN name END,
  age = CASE WHEN id=1 THEN age WHEN id=2 THEN 30 END
WHERE id IN (1,2)

It's updating both fields for both records, even though you only wanted to update one field per record. This is the inherent limitation.

As of now, there are several approaches to handle this scenario:

Group objects by the fields that need updating and perform separate bulk_update() calls:

name_updates = [obj for obj in all_objects if obj.name_changed]
age_updates = [obj for obj in all_objects if obj.age_changed]

MyModel.objects.bulk_update(name_updates, ['name'])
MyModel.objects.bulk_update(age_updates, ['age'])

Unfortunately, Django just doesn't have built-in support for this use case. According to multiple discussions, including those on Reddit, "Django and most ORMs don't bother with this. They either use all fields in the query or just the ones you specify"

There have been one discussion Reduce the "Case-When" sequence for a bulk_update when the values for a certain field are the same. an year ago about the same. You are free to use this approach with a custom manager/queryset.

1 Comment

Your approach makes two roundtrips to the database, and makes the updates, also for a small number of rows, non-atomic.
0

I think the most efficient way to update the objects is to use bulk_update twice. Since the sets in your example are disjoint, each object will be updated only once.

Running a single bulk_update would require an if condition to handle different cases, which introduces additional processing overhead

.

1 Comment

That's already the case, even for a single field, since Django, for a .bulk_update(..) generates a long CASE-WHEN sequence for every object: github.com/django/django/blob/… Often .bulk_update(..)s when the number of records is quite large, take seconds.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.