Comment 0 for bug 430359

Robert Collins (lifeless) wrote:

So this primarily affects large data imports: find/get/execute can all cause a flush(), but until some flush() happens, objects the user no longer holds any reference to are still held in memory.

store._dirty acts like a write cache: it holds pending objects and then pushes them out to the backend on flush.

I think it would make storm's behaviour clearer if this write cache were made into an explicit cache, much as the read cache is an explicit cache whose policy can be controlled.

Creating a WriteCache object which had:
 - add
 - flush
 - clear
 - block_implicit_flush
 - allow_implicit_flush

methods would move all the logic related to the _dirty dict into a clear, self-contained object, as sketched below.
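
A minimal sketch of what such an object might look like (the flush_one callable, the constructor signature, and the size-cap logic are my own illustrative assumptions, not storm's actual internals):

    class WriteCache(object):

        def __init__(self, flush_one, max_size=-1):
            # flush_one: a callable that writes one object to the backend.
            # max_size: cap on pending objects; -1 means no cap (today's
            # behaviour).
            self._flush_one = flush_one
            self._dirty = {}  # id(obj) -> obj, as store._dirty holds today
            self._implicit_blocked = False
            self._max_size = max_size

        def add(self, obj):
            self._dirty[id(obj)] = obj
            # An optional size cap triggers an implicit flush, unless
            # implicit flushes have been blocked.
            if (self._max_size >= 0 and len(self._dirty) > self._max_size
                    and not self._implicit_blocked):
                self.flush()

        def flush(self):
            for obj in list(self._dirty.values()):
                self._flush_one(obj)
            self._dirty.clear()

        def clear(self):
            # Discard pending objects without writing them out.
            self._dirty.clear()

        def block_implicit_flush(self):
            self._implicit_blocked = True

        def allow_implicit_flush(self):
            self._implicit_blocked = False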

This could then be parameterised:
 - set a maximum size (with -1 meaning no cap)
 - set a maximum time

(and so on, if/when a user chooses to).
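
For example, hypothetical usage of the sketch above (flush_to_backend stands in for the store's real per-object write logic):

    def flush_to_backend(obj):
        pass  # stand-in for the real backend write

    # Cap pending writes at 1000 objects, so a large import flushes in
    # batches rather than accumulating everything in memory.
    cache = WriteCache(flush_to_backend, max_size=1000)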

*Even if* no one chooses to use it, it would make the behaviour clearer: that storm accumulates objects until it is explicitly told to flush, or until find/get/execute are run.

I hope this makes sense.

James Henstridge says that a default policy with a size cap could cause hard-to-debug issues for users, particularly if an implicit flush happened at the wrong time while constructing a new object that doesn't yet meet database constraints. I think this is a good reason to keep the current behaviour as the default. That said, find/get/execute can already cause such bugs; being able to run with a write cache size of 0 would let developers find out immediately about code that adds invalid objects to the store :)
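
For instance (hypothetical usage, with flush_to_backend as defined above):

    # max_size=0 makes every add() flush immediately, so an object that
    # violates a database constraint fails at the add() call that
    # introduced it, not at some later find/get/execute.
    debug_cache = WriteCache(flush_to_backend, max_size=0)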