Django comes with "batteries included" to make CRUD (create, read, update, delete) operations easy. It's nice that the CR part (create and read) of CRUD is so easy, but have you ever paused to think about the UD part (update and delete)?
Let's look at delete. All you need to do is this:
    ReallyImportantModel.objects.get(id=32).delete()  # gone from the database forever

Just one line, and your data is gone forever. It can be done accidentally. Or you can do it deliberately, only to realise later that your old data was valuable too.
Now what about updating?
Updating is deleting in disguise.
When you update, you're deleting the old data and replacing it with something new. It's still deletion.
    important = ReallyImportantModel.objects.get(id=32)
    important.data = {'new_data': 'This is new data'}
    important.save()  # OLD DATA GONE FOREVER

Okay, but why do we care?
Let's say we want to know the state of ReallyImportantModel 6 months ago. Oh that's right, you've deleted it, so you can't get it back.
Well, that's not exactly true: you can recreate your data from backups (if you don't back up your database, stop reading right now and fix that immediately). But that's clumsy.
So by only storing the current state of the object, you lose all the contextual information on how the object arrived at this current state. Not only that, you make it difficult to make projections about the future.
Event sourcing can help with that.
Event sourcing

The basic concept of event sourcing is this:

- Instead of just storing the current state, we also store the events that lead up to the current state.
- Events are replayable. We can travel back in time to any point by replaying every event up to that point in time.
- That also means we can recover the current state just by replaying every event, even if the current state was accidentally deleted.
- Events are append-only.

To gain an intuition, let's look at an event sourcing system you're familiar with: your bank account.
Your "state" is your account balance, while your "events" are your transactions (deposit, withdrawal, etc.).
Can you imagine a bank account that only shows you the current balance?
That is clearly unacceptable ("Why do I only have $50? Where did my money go? If only I could see the history."). So we always store the history of transfers as the source of truth.
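To make the bank account analogy concrete, here is a minimal sketch in plain Python (no Django yet). The `Transaction` class, the `balance_at` helper, and the sample dates and amounts are all made up for illustration; the point is only that the balance at any moment can be derived by replaying the history up to that moment.

```python
from dataclasses import dataclass
from datetime import datetime
from decimal import Decimal


@dataclass(frozen=True)
class Transaction:
    """One immutable event: positive amounts are deposits, negative ones withdrawals."""
    when: datetime
    amount: Decimal


def balance_at(transactions, moment):
    """Replay every transaction up to and including `moment`."""
    return sum(
        (t.amount for t in transactions if t.when <= moment),
        Decimal("0"),
    )


# Hypothetical history for one account.
history = [
    Transaction(datetime(2024, 1, 1), Decimal("200")),
    Transaction(datetime(2024, 2, 1), Decimal("-120")),
    Transaction(datetime(2024, 3, 1), Decimal("-30")),
]

print(balance_at(history, datetime(2024, 1, 15)))   # 200 (state in mid-January)
print(balance_at(history, datetime(2024, 12, 31)))  # 50 (current state)
```

The history answers "where did my money go?" directly: the $50 balance is simply what is left after replaying all three transactions.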
Implementing event sourcing in Django

Let's look at a few ways to do this in Django.
Ad-hoc models

If you only have one or two important models, you probably don't need a generalizable event sourcing solution that applies to all models.
You could do it on an ad-hoc basis like this, if you can have a relationship that makes sense:
    # in an app called 'account'
    from django.conf import settings
    from django.db import models


    class Account(models.Model):
        """Bank account"""
        balance = models.DecimalField(max_digits=19, decimal_places=6)
        owner = models.ForeignKey(
            settings.AUTH_USER_MODEL,
            on_delete=models.PROTECT,  # on_delete is required since Django 2.0
            related_name='account',
        )


    class Transfer(models.Model):
        """
        Represents a transfer in or out of an account. A positive amount
        indicates that it is a transfer into the account, whereas a negative
        amount indicates that it is a transfer out of the account.
        """
        account = models.ForeignKey(
            'account.Account',
            on_delete=models.PROTECT,
            related_name='transfers',
        )
        amount = models.DecimalField(max_digits=19, decimal_places=6)
        date = models.DateTimeField()

In this case your "state" is in your Account model, whereas your Transfer model contains the "events".
Having Transfer objects makes it trivial to recreate any account.
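As a rough illustration of what "recreate" means here, the sketch below rebuilds balances from nothing but transfer rows. It uses plain `(account_id, amount)` tuples as a stand-in for Transfer objects so it runs without a database; `rebuild_balances` is a hypothetical helper, not part of Django. In a real project you would more likely let the database do the summing, for example with an aggregate over `account.transfers`.

```python
from collections import defaultdict
from decimal import Decimal


def rebuild_balances(transfers):
    """Recompute every account balance from (account_id, amount) pairs."""
    balances = defaultdict(Decimal)  # Decimal() is Decimal('0')
    for account_id, amount in transfers:
        balances[account_id] += amount
    return dict(balances)


# Stand-in rows for the Transfer table: positive = in, negative = out.
transfers = [
    (1, Decimal("100.00")),
    (2, Decimal("40.00")),
    (1, Decimal("-25.50")),
]

rebuilt = rebuild_balances(transfers)
print(rebuilt)  # {1: Decimal('74.50'), 2: Decimal('40.00')}
```

The same function doubles as a consistency check: if a stored `Account.balance` disagrees with the rebuilt value, the events win, because they are the source of truth.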
Using an Event Store

You could also use a single Event model to store every possible event in any model. A nice way to do this is to encode the changes in a JSON field.
This example uses Postgres:
    from django.contrib.contenttypes.fields import GenericForeignKey
    from django.contrib.contenttypes.models import ContentType
    # On Django 3.1+, django.db.models.JSONField is the backend-agnostic replacement.
    from django.contrib.postgres.fields import JSONField
    from django.db import models


    class Event(models.Model):
        """Event table that stores all model changes"""
        content_type = models.ForeignKey(ContentType, on_delete=models.PROTECT)
        object_id = models.PositiveIntegerField()
        time_created = models.DateTimeField()
        content_object = GenericForeignKey('content_type', 'object_id')
        body = JSONField()

You can then add methods to any model that mutate the state:
    from django.conf import settings
    from django.db import models
    from django.utils import timezone


    class Account(models.Model):
        balance = models.DecimalField(max_digits=19, decimal_places=6, default=0)
        owner = models.ForeignKey(
            settings.AUTH_USER_MODEL,
            on_delete=models.PROTECT,  # on_delete is required since Django 2.0
            related_name='account',
        )

        def make_deposit(self, amount):
            """Deposit money into account"""
            Event.objects.create(
                content_object=self,
                time_created=timezone.now(),
                # Pass a dict and let JSONField serialise it. Decimal is not
                # JSON-serialisable, so the amount is stored as a string.
                body={'type': 'made_deposit', 'amount': str(amount)},
            )
            self.balance += amount
            self.save()

        def make_withdrawal(self, amount):
            """Withdraw money from account"""
            Event.objects.create(
                content_object=self,
                time_created=timezone.now(),
                # withdraw = negative amount
                body={'type': 'made_withdrawal', 'amount': str(-amount)},
            )
            self.balance -= amount
            self.save()

        @classmethod
        def create_account(cls, owner):
            """Create an account"""
            account = cls.objects.create(owner=owner, balance=0)
            Event.objects.create(
                content_object=account,
                time_created=timezone.now(),
                body={
                    'type': 'created_account',
                    'id': account.id,
                    'owner_id': owner.id,
                },
            )
            return account

So now you can do this:

    import decimal

    from django.contrib.auth import get_user_model
    from django.contrib.contenttypes.models import ContentType

    User = get_user_model()

    account = Account.create_account(owner=User.objects.first())
    account.make_deposit(decimal.Decimal('50'))
    account.make_deposit(decimal.Decimal('125'))
    account.make_withdrawal(decimal.Decimal('75'))

    events = Event.objects.filter(
        content_type=ContentType.objects.get_for_model(account),
        object_id=account.id,
    ).order_by('time_created')
    for event in events:
        print(event.body)

Which should give you something like this (your ids will differ, and the key order within each body can vary because Postgres does not preserve it in `jsonb` columns):

    {'type': 'created_account', 'id': 2, 'owner_id': 1}
    {'type': 'made_deposit', 'amount': '50'}
    {'type': 'made_deposit', 'amount': '125'}
    {'type': 'made_withdrawal', 'amount': '-75'}

Again, this makes it trivial to write utility methods that recreate any instance of Account, even if you accidentally dropped the whole accounts table.
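One possible shape for such a utility is sketched below. It works on event bodies that have already been decoded into dicts (so it runs without Django), and `apply_event` and `replay` are hypothetical names. The event types and fields are the ones used above; amounts are passed through `Decimal(str(...))` so that both numeric and string amounts are accepted.

```python
from decimal import Decimal


def apply_event(state, body):
    """Return the account state after applying a single event body."""
    kind = body["type"]
    if kind == "created_account":
        return {
            "id": body["id"],
            "owner_id": body["owner_id"],
            "balance": Decimal("0"),
        }
    if kind in ("made_deposit", "made_withdrawal"):
        # Withdrawals are stored with a negative amount, so both cases add.
        amount = Decimal(str(body["amount"]))
        return {**state, "balance": state["balance"] + amount}
    raise ValueError(f"unknown event type: {kind}")


def replay(bodies):
    """Fold the events, oldest first, into the final account state."""
    state = None
    for body in bodies:
        state = apply_event(state, body)
    return state


# Event bodies for one account, in the order they were recorded.
bodies = [
    {"type": "created_account", "id": 2, "owner_id": 1},
    {"type": "made_deposit", "amount": "50"},
    {"type": "made_deposit", "amount": "125"},
    {"type": "made_withdrawal", "amount": "-75"},
]

print(replay(bodies)["balance"])  # 100
```

Because `apply_event` never mutates its input, replaying only a prefix of `bodies` gives the account's state at that earlier point in its history.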
Snapshotting

There will come a time when you have too many events to efficiently replay the entire history. In this case, a good optimisation step would be snapshots taken at various points in history. For example, in our accounting example one could save snapshots of the account in an AccountBalance model, which is a snapshot of the account's state at a point in time.
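The idea can be sketched in plain Python as follows. The `Snapshot` class here is only a stand-in for something like the AccountBalance model, and its `event_count` field is an assumption made for this example: it records how many events the snapshot already covers, so that only later events need replaying.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Snapshot:
    """Balance after applying the first `event_count` events."""
    event_count: int
    balance: Decimal


def take_snapshot(amounts):
    """Replay everything once and remember the result."""
    return Snapshot(event_count=len(amounts), balance=sum(amounts, Decimal("0")))


def current_balance(snapshot, amounts):
    """Start from the snapshot and replay only the events recorded after it."""
    return sum(amounts[snapshot.event_count:], snapshot.balance)


# Signed transfer amounts, oldest first.
amounts = [Decimal("50"), Decimal("125")]
snapshot = take_snapshot(amounts)   # covers the first two events
amounts.append(Decimal("-75"))      # a new event arrives after the snapshot

print(snapshot.balance)                    # 175
print(current_balance(snapshot, amounts))  # 100
```

A snapshot is only a cache. The events remain the source of truth, so a snapshot can be thrown away and rebuilt from the full history at any time.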