store._dirty can hide large temporary leaks
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Storm | New | Undecided | Unassigned |
Bug Description
I'm writing this after debugging why a Storm-using script was using 3 GB of memory, even though the entire script logic was built around iterators. This primarily affects large data imports: find/get/execute can all cause a flush(), but until a flush() happens, objects which the user holds no references to are kept in memory.
store._dirty acts like a write-cache: it holds objects in the cache and then pushes them out to the backend.
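To illustrate the behaviour described above, here is a minimal toy sketch (not Storm's actual code; ToyStore and its methods are invented for illustration) of how a _dirty dict acting as an unbounded write cache pins objects the caller has already dropped:

```python
# Toy sketch of a write cache: added objects stay referenced by the
# store's _dirty dict until an explicit flush(), even if the caller
# keeps no reference to them.
class ToyStore:
    def __init__(self):
        self._dirty = {}  # id(obj) -> obj, written out lazily

    def add(self, obj):
        self._dirty[id(obj)] = obj  # object is now pinned in memory

    def flush(self):
        # In a real store this would push rows to the backend.
        flushed = len(self._dirty)
        self._dirty.clear()
        return flushed


store = ToyStore()
for i in range(100_000):
    store.add({"row": i})  # caller keeps no reference, but the store does

print(len(store._dirty))  # 100000 objects pinned until flush()
store.flush()
print(len(store._dirty))  # 0
```

Even with iterator-based logic on the caller's side, the memory high-water mark is set by how many objects accumulate between flushes.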
I think it would make Storm's behaviour clearer if this write cache were made into an explicit cache, much as the read cache is an explicit cache whose policy can be controlled.
Creating a WriteCache object which had:
- add
- flush
- clear
- block_implicit_
- allow_implicit_
methods would move all the logic related to the _dirty dict into a clear, self-contained object.
This could then be parameterised:
- set a maximum size (with -1 meaning no cap)
- set a maximum time
(and so on, if/when a user chooses to).
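A rough sketch of what such an object might look like (the class and parameter names here are hypothetical, not part of Storm's API):

```python
# Hypothetical WriteCache sketch for the proposal above. flush_callback
# stands in for whatever pushes an object out to the backend.
class WriteCache:
    def __init__(self, flush_callback, max_size=-1):
        self._flush_callback = flush_callback
        self._objects = {}       # id(obj) -> obj
        self._max_size = max_size  # -1 means no cap (current behaviour)
        self._implicit_blocked = False

    def add(self, obj):
        self._objects[id(obj)] = obj
        # A size cap turns the unbounded accumulation into a bounded cache.
        if (self._max_size >= 0 and len(self._objects) > self._max_size
                and not self._implicit_blocked):
            self.flush()

    def flush(self):
        for obj in self._objects.values():
            self._flush_callback(obj)
        self._objects.clear()

    def clear(self):
        self._objects.clear()  # drop without writing

    def block_implicit(self):
        self._implicit_blocked = True

    def allow_implicit(self):
        self._implicit_blocked = False


flushed_rows = []
cache = WriteCache(flushed_rows.append, max_size=2)
for i in range(5):
    cache.add({"row": i})
print(len(flushed_rows), len(cache._objects))  # 3 2
```

With max_size=0, every add() would flush immediately; with max_size=-1, nothing is flushed until explicitly requested, matching the current behaviour.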
*Even if* no one chooses to use it, it would make the behaviour clearer (that Storm accumulates objects until explicitly told to flush, or until find/get/execute are run).
I hope this makes sense.
James Henstridge says that a default policy with a size cap could cause hard-to-debug issues for users, particularly when creating a new object that doesn't meet database constraints, if an implicit flush happened at the wrong time. I think this is a good reason to keep the current behaviour as the default. That said, find/get/execute can all cause such bugs already; being able to run with a write cache size of 0 would let developers find out immediately about code that adds invalid objects to the store :)
Not being clear and upfront about the cache also leads to hard-to-debug issues for users; that's what I experienced.
I think this is a serious flaw that affects every user.
In my understanding it is related not only to store._dirty but also to store._cache.
It seems to me that the cache has a circular reference to the store it is connected to. Without a cache.clear(), each allocated store remains allocated indefinitely. This causes each one to keep the other alive, unless a store.invalidate() triggers the internal code store._
(bug spotted while working on the GlobaLeaks project)
This bug may even affect the security of software using Storm (Launchpad?), where a DoS attack could easily result in memory exhaustion.
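The circular-reference claim in the comment above can be illustrated with a standalone sketch (illustrative classes, not Storm's actual internals): a store↔cache reference cycle is invisible to plain reference counting, so the store stays alive until Python's cycle collector runs or the cycle is broken explicitly, e.g. by clearing the cache.

```python
import gc
import weakref

class Cache:
    def __init__(self, store):
        self.store = store   # back-reference: cache -> store
        self.entries = []

class Store:
    def __init__(self):
        self._cache = Cache(self)  # store -> cache completes the cycle


gc.disable()                 # rule out the automatic cycle collector
store = Store()
ref = weakref.ref(store)
del store                    # refcounting alone cannot free the cycle
alive_without_gc = ref() is not None  # still alive: True
gc.collect()                 # only the cycle collector reclaims it
print(alive_without_gc, ref() is None)  # True True
gc.enable()
```

A long-lived process that keeps creating stores and never breaks these cycles (or never gives the collector a chance to run) can therefore accumulate memory in the way described.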