
A Few Django ORM Mistakes

See if you can figure out what's wrong with the code snippets below! Ask yourself what the problem is, what effect it will have, and how you can fix it.

These examples are for Django, but probably apply to many other ORMs.

Bug 1

def create():
    with transaction.atomic():
        thing = Thing.objects.create(foo=1, bar=1)
        set_foo(thing.id)
        thing.bar = 2
        thing.save()

def set_foo(id):
    thing = Thing.objects.get(id=id)
    thing.foo = 2
    thing.save()

Hint

The save method saves all attributes.

Solution

The problem with this code is that two Python instances of the same database row exist. Here's the annotated source:

def create():
    with transaction.atomic():
        # Create database row with foo=1, bar=1
        thing = Thing.objects.create(foo=1, bar=1)
        set_foo(thing.id)
        # The database row has been updated to foo=2, bar=1, but this
        # instance still has foo=1, bar=1 as it hasn't been reloaded
        thing.bar = 2
        thing.save()  # Writes foo=1, bar=2
        # The foo=2 write has been lost

def set_foo(id):
    # Look up the same Thing, but create a new Python instance
    thing = Thing.objects.get(id=id)
    thing.foo = 2
    thing.save()  # Writes foo=2, bar=1

The result is a single Thing with a foo of 1 and a bar of 2. A write has been lost!

Here's one possible fix:

def create():
    with transaction.atomic():
        thing = Thing.objects.create(foo=1, bar=1)
        # Pass the instance itself, so both functions work on the same object
        set_foo(thing)
        thing.bar = 2
        thing.save()  # Writes foo=2, bar=2

def set_foo(thing):
    thing.foo = 2
    thing.save()
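
Another option (a sketch on my part, not from the original article) is to keep passing the id but reload the instance before making further changes, so the update written by set_foo is not clobbered:

def create():
    with transaction.atomic():
        thing = Thing.objects.create(foo=1, bar=1)
        set_foo(thing.id)
        thing.refresh_from_db()  # pick up foo=2 written by set_foo
        thing.bar = 2
        thing.save()             # writes foo=2, bar=2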

Bug 2

class Thing(Model):
    foo = ...
    bar = ...

def thing_set_foo(id, value):
    thing = Thing.objects.get(id=id)
    thing.foo = value
    thing.save()

def thing_set_bar(id, value):
    thing = Thing.objects.get(id=id)
    thing.bar = value
    thing.save()

Hint

Assume thing_set_foo and thing_set_bar can happen simultaneously.

Solution

It's possible for one thread to read the row just before another thread writes to it. Both thing_set_foo and thing_set_bar load the original row, and because save() writes every column, whichever save runs last silently overwrites the other thread's update with the stale value it read earlier.

Here's one possible solution:

def thing_set_foo(id, value):
    thing = Thing.objects.get(id=id)
    thing.foo = value
    thing.save(update_fields=["foo"])

def thing_set_bar(id, value):
    thing = Thing.objects.get(id=id)
    thing.bar = value
    thing.save(update_fields=["bar"])

Bug 3

def increment(id):
    counter = Counter.objects.get(id=id)
    counter.count = counter.count + 1
    counter.save()

Solution

This is very much like bug 2, but the twist is that the increment function can conflict with itself. If two threads call it concurrently, both may read the same value (say 0), both add 1, and both write back 1: increment ran twice, but the count only went up by one.

One way to fix this is to make the increment operation atomic.



The way to do this in the Django ORM is to use F objects:

from django.db.models import F

def increment(id):
    counter = Counter.objects.get(id=id)
    counter.count = F('count') + 1
    counter.save()
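
An equivalent single-statement variant (a sketch, assuming the same Counter model) pushes the increment into one UPDATE via QuerySet.update(), which also avoids leaving an F expression sitting on the in-memory instance:

from django.db.models import F

def increment(id):
    # Emits: UPDATE counter SET count = count + 1 WHERE id = %s
    Counter.objects.filter(id=id).update(count=F('count') + 1)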

Isolation Levels

READ COMMITTED Isolation Level

This is the default for PostgreSQL. Transactions can read updates from other transactions after they have been committed.


REPEATABLE READ Isolation Level

This is the default for MySQL. A snapshot is established on the first read in the transaction, and all subsequent reads are served from that snapshot.



Going forward, assume we are using MySQL in its default configuration.
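
As a rough illustration of the difference (a sketch with a hypothetical Account model, not from the original article):

from django.db import transaction

def read_twice(id):
    with transaction.atomic():
        first = Account.objects.get(id=id).balance
        # ...suppose another session commits balance=200 at this point...
        second = Account.objects.get(id=id).balance
        # READ COMMITTED (PostgreSQL default): second sees 200
        # REPEATABLE READ (MySQL default): second still sees the value
        # from the snapshot established by the first read
        return first, second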

Bug 4

def state_transition(id):
    with transaction.atomic():
        stateful = Stateful.objects.get(id=id)
        if stateful.state == DONE:
            raise AlreadyDone
        do_state_transition()
        stateful.state = DONE
        stateful.save()

Solution

It is possible for do_state_transition to be executed twice if the state transition runs concurrently: both transactions read the row before either has committed, so both see a state that isn't DONE and both perform the transition. This could be a problem if your state transition includes side effects!

One simple solution to this problem is to lock the object:

def state_transition(id):
    with transaction.atomic():
        stateful = Stateful.objects.select_for_update().get(id=id)
        if stateful.state == DONE:
            raise AlreadyDone
        do_state_transition()
        stateful.state = DONE
        stateful.save()

But, generally, you should try to avoid performing side effects inside transactions!
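
If the side effect is something external (an email, a webhook), one common pattern is to defer it with transaction.on_commit(), so it only runs if the transaction actually commits. A sketch, assuming a hypothetical notify_done() helper:

from django.db import transaction

def state_transition(id):
    with transaction.atomic():
        stateful = Stateful.objects.select_for_update().get(id=id)
        if stateful.state == DONE:
            raise AlreadyDone
        do_state_transition()  # keep this free of external side effects
        stateful.state = DONE
        stateful.save()
        # notify_done (hypothetical) runs after COMMIT, never on rollback
        transaction.on_commit(notify_done)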

Bug 5

def create_payment(collection_id, amount):
    with transaction.atomic():
        payment_collection = PaymentCollection.objects.get(id=collection_id)
        Payment.objects.create(
            amount=amount, payment_collection=payment_collection)
        payment_collection.total = (
            payment_collection.payment_set.all()
            .aggregate(total=Sum('amount'))['total'])
        payment_collection.save()

Solution

If executed concurrently, one transaction will not see the Payment newly created in the other transaction, so each computes a total that is missing a payment. The last write wins and the stored total ends up inconsistent.

In addition, on MySQL this can deadlock, causing your transaction to be rolled back entirely! Creating the Payment takes a lock that blocks the other transaction's aggregation read.

Both issues can be fixed by locking the model being updated (not the Payment!) at the start of the transaction:

def create_payment(collection_id, amount):
    with transaction.atomic():
        payment_collection = (PaymentCollection.objects
                              .select_for_update().get(id=collection_id))
        Payment.objects.create(
            amount=amount, payment_collection=payment_collection)
        payment_collection.total = (
            payment_collection.payment_set.all()
            .aggregate(total=Sum('amount'))['total'])
        payment_collection.save()

Or, alternatively, making the update atomic:

def create_payment(collection_id, amount):
    payment_collection = PaymentCollection.objects.get(id=collection_id)
    Payment.objects.create(amount=amount, payment_collection=payment_collection)
    with connection.cursor() as cursor:
        cursor.execute("""
            UPDATE payment_collection pc,
                   (SELECT payment_collection_id, SUM(amount) AS total
                    FROM payment
                    GROUP BY payment_collection_id) totals
            SET pc.total = totals.total
            WHERE totals.payment_collection_id = pc.id
              AND pc.id = %s
        """, [payment_collection.id])

Note that this cannot be in a transaction, or the deadlock issues will remain!

In my opinion, the safest way to handle this is to use a SQL view instead of storing the total. Unfortunately, views can be awkward to use with Django.

CREATE VIEW payment_collection_totals AS
SELECT payment_collection_id, SUM(amount) AS total
FROM payment
GROUP BY payment_collection_id;

CREATE VIEW payment_collection_with_total AS
SELECT payment_collection.*, COALESCE(totals.total, 0) AS total
FROM payment_collection
LEFT JOIN payment_collection_totals totals
    ON totals.payment_collection_id = payment_collection.id;
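
For completeness, one way to read the payment_collection_with_total view from Django (a sketch on my part, not from the article) is an unmanaged model:

from django.db import models

class PaymentCollectionWithTotal(models.Model):
    # Field type is an assumption; match it to the real total column.
    total = models.DecimalField(max_digits=12, decimal_places=2)

    class Meta:
        managed = False  # Django never creates or migrates this "table"
        db_table = "payment_collection_with_total"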

Bug 6

def foo(id):
    with transaction.atomic():
        foo = Foo.objects.get(id=id)
        bar = Bar.objects.create(...)
        with lock():
            foo.refresh_from_db()
            # If foo.bar has already been set in another thread,
            # raise an exception and roll back the transaction
            if foo.bar:
                raise Exception
            foo.bar = bar
            foo.save()

Hint

This code is buggy when run on MySQL, but not on PostgreSQL.

Solution

This bug is a result of the REPEATABLE READ isolation level. The first read after the transaction starts establishes the snapshot, so when refresh_from_db runs after waiting for the lock, it reads from that snapshot rather than the most recently committed value.

This means the foo.bar check is performed against potentially stale data. Multiple Bar rows can end up being created, with only one of them linked to the Foo.

Confusingly, replacing with lock() with a select_for_update() works on MySQL, because MySQL has a quirk where locking reads do not read from the snapshot. Under REPEATABLE READ, PostgreSQL would throw a serialization error instead.

The preferred fix is either to move the lock to the top, outside of the transaction, or to use select_for_update() as follows:

def foo(id):
    with transaction.atomic():
        foo = Foo.objects.select_for_update().get(id=id)
        bar = Bar.objects.create(...)
        # If foo.bar has already been set in another thread,
        # raise an exception and roll back the transaction
        if foo.bar:
            raise Exception
        foo.bar = bar
        foo.save()
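
The other option mentioned above, taking the lock before the transaction's first read, would look roughly like this (a sketch reusing the article's hypothetical lock() helper and eliding Bar's fields as the original does; because the first read happens while holding the lock, the snapshot is fresh):

def foo(id):
    with lock():
        with transaction.atomic():
            foo = Foo.objects.get(id=id)  # first read, taken under the lock
            bar = Bar.objects.create(...)
            if foo.bar:
                raise Exception
            foo.bar = bar
            foo.save()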

Tips

- Remember the ORM is an in-memory cache. ORMs can obscure bugs; look at the SQL!
- Avoid read-modify-write. If you don't, you'll probably need a lock.
- If it's not locked, it's not up to date.
- Lock before reads to avoid weird MySQL behaviour: locking reads don't use the snapshot in MySQL.
- Prefer immutable database design if practical. See: Immutable Data.
- Consider using a serializable isolation level. You don't have to worry about locking if you use serializable transactions, though it has other drawbacks. The PostgreSQL implementation is nicer than the MySQL one, in my opinion.
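
On that last point, the isolation level can be set per database connection in Django's settings. A minimal sketch for PostgreSQL (assuming the psycopg2 driver; the rest of the DATABASES entry is omitted):

import psycopg2.extensions

DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        # NAME, USER, PASSWORD, etc. omitted
        "OPTIONS": {
            "isolation_level": psycopg2.extensions.ISOLATION_LEVEL_SERIALIZABLE,
        },
    }
}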
