Discussion:
[Zope3-Users] MemoryError Evolving a ZODB
Jeroen Michiel
2012-12-11 16:33:15 UTC
Permalink
Hi,

I'm having serious trouble getting my DB evolved to a new version. I'm
running a Grok 1.4 site using ZODB 3.10.2.
The problem happens when I add a new index to a new catalog.
As soon as the index is added, a subscriber from zope.catalog (I believe)
will automatically loop over all objects in the DB trying to index them. For
some reason, it apparently tries to keep all these objects in memory,
while only a very small part of them actually needs indexing, and even
then, indexing shouldn't touch them.
After some time of running I see the process taking 1.9 GB of memory on
Windows (or 3 GB on Linux), and then I first get these errors:

2012-12-11 16:52:56,617 ERROR [ZODB.Connection] Couldn't load state for
0x0a45a2
Traceback (most recent call last):
File
"c:\users\jm.traficon-int\.buildout\eggs\zodb3-3.10.2-py2.6-win32.egg\ZODB\Connection.py",
line 856, in setstate
self._setstate(obj)
File
"c:\users\jm.traficon-int\.buildout\eggs\zodb3-3.10.2-py2.6-win32.egg\ZODB\Connection.py",
line 910, in _setstate
self._reader.setGhostState(obj, p)
File
"c:\users\jm.traficon-int\.buildout\eggs\zodb3-3.10.2-py2.6-win32.egg\ZODB\serialize.py",
line 612, in setGhostState
state = self.getState(pickle)
File
"c:\users\jm.traficon-int\.buildout\eggs\zodb3-3.10.2-py2.6-win32.egg\ZODB\serialize.py",
line 605, in getState
return unpickler.load()
MemoryError

I have not a single clue why it would need that much memory.
I tried using savepoints, but that doesn't help.
How can I see what exactly is eating all that memory? Where do I start
debugging this?

ANY help appreciated!
--
View this message in context: http://old.nabble.com/MemoryError-Evolving-a-ZODB-tp34784598p34784598.html
Sent from the Zope3 - users mailing list archive at Nabble.com.
Adam GROSZER
2012-12-11 16:46:27 UTC
Permalink
Post by Jeroen Michiel
Hi,
I'm having serious trouble getting my DB evolved to a new version. I'm
running a Grok 1.4 site using ZODB 3.10.2.
The problem happens when I add a new index to a new catalog.
As soon as the index is added, a subscriber from zope.catalog (I believe)
will automatically loop over all objects in the DB trying to index them. For
some reason, it apparently tries to keep all these objects in memory,
while only a very small part of them actually needs indexing, and even
then, indexing shouldn't touch them.
After some time of running I see the process taking 1.9 GB of memory on Windows,
2012-12-11 16:52:56,617 ERROR [ZODB.Connection] Couldn't load state for
0x0a45a2
File
"c:\users\jm.traficon-int\.buildout\eggs\zodb3-3.10.2-py2.6-win32.egg\ZODB\Connection.py",
line 856, in setstate
self._setstate(obj)
File
"c:\users\jm.traficon-int\.buildout\eggs\zodb3-3.10.2-py2.6-win32.egg\ZODB\Connection.py",
line 910, in _setstate
self._reader.setGhostState(obj, p)
File
"c:\users\jm.traficon-int\.buildout\eggs\zodb3-3.10.2-py2.6-win32.egg\ZODB\serialize.py",
line 612, in setGhostState
state = self.getState(pickle)
File
"c:\users\jm.traficon-int\.buildout\eggs\zodb3-3.10.2-py2.6-win32.egg\ZODB\serialize.py",
line 605, in getState
return unpickler.load()
MemoryError
I have not a single clue why it would need that much memory.
I tried using savepoints, but that doesn't help.
How can I see what exactly is eating all that memory, where do I start
debugging this?
ANY help appreciated!
Well, it loads too many objects in a single transaction.
Doing this every so many iterations (10k? depends on your object sizes)
usually helps:

import transaction

def forceSavepoint(anyPersistentObject=None):
    # Write out modified objects so they can be released from memory.
    transaction.savepoint(optimistic=True)

    if anyPersistentObject is not None:
        # ... and clear the connection's pickle cache.
        conn = anyPersistentObject._p_jar
        conn.cacheGC()
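
For example, you would call it every so many objects inside the indexing or
evolve loop. A minimal sketch (the iterable, the catalog call and the chunk
size here are hypothetical, not part of the code above):

    # hypothetical loop; forceSavepoint() is defined above
    for count, obj in enumerate(objects_to_index):
        catalog.index_doc(intids.getId(obj), obj)  # the per-object work
        if count % 10000 == 0:
            forceSavepoint(obj)
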
--
Best regards,
Adam GROSZER
--
Quote of the day:
A liberal is someone too poor to be a capitalist and too rich to be a
communist.
Jeroen Michiel
2012-12-12 08:39:18 UTC
Permalink
Thanks for the reply!

I already tried
transaction.savepoint()
every minute, but that didn't help: I only saw the memory usage dropping the
first time, but never after.

I changed the code to what you suggested, but it still doesn't seem to help.
Something must be wrong somewhere along the line, but I don't have a clue
where to begin looking.
Would using something like guppy (or heapy, or whatever it's called) reveal
something?

Could it be something about objects with circular references not being able
to be garbage-collected?
The objects in my DB are quite complex, so something like that might
actually be happening.
Post by Adam GROSZER
Well it loads too many objects in a single transaction.
Doing this after some iterations (10k?, depends on your object sizes)
transaction.savepoint(optimistic=True)
#and clear picklecache
conn = anyPersistentObject._p_jar
conn.cacheGC()
--
Best regards,
Adam GROSZER
--
A liberal is someone too poor to be a capitalist and too rich to be a
communist.
_______________________________________________
Zope3-users mailing list
https://mail.zope.org/mailman/listinfo/zope3-users
--
View this message in context: http://old.nabble.com/MemoryError-Evolving-a-ZODB-tp34784598p34787382.html
Sent from the Zope3 - users mailing list archive at Nabble.com.
Adam GROSZER
2012-12-12 09:08:45 UTC
Permalink
Hello,

That approach works for us on DBs over 100 GB.

Let's CC zodb-dev, which seems to be the better place to discuss this.
Post by Jeroen Michiel
Thanks for the reply!
I already tried
transaction.savepoint()
every minute, but that didn't help: I only saw the memory usage dropping the
first time, but never after.
I changed the code to what you suggested, but it still doesn't seem to help.
Something must be wrong somewhere along the line, but I don't have a clue
where to begin looking.
Would using something like guppy (or heapy, or what it's called) reveal
something?
Could it be something about objects with circular references not being able
to be garbage-collected?
The objects in my DB are quite complex, so something like that might
actually be happening.
Post by Adam GROSZER
Well it loads too many objects in a single transaction.
Doing this after some iterations (10k?, depends on your object sizes)
transaction.savepoint(optimistic=True)
#and clear picklecache
conn = anyPersistentObject._p_jar
conn.cacheGC()
--
Best regards,
Adam GROSZER
--
A liberal is someone too poor to be a capitalist and too rich to be a
communist.
_______________________________________________
Zope3-users mailing list
https://mail.zope.org/mailman/listinfo/zope3-users
--
Best regards,
Adam GROSZER
--
Quote of the day:
The Atomic Age is here to stay - but are we? - Bennett Cerf
Marius Gedminas
2012-12-12 12:49:04 UTC
Permalink
Post by Jeroen Michiel
Thanks for the reply!
I already tried
transaction.savepoint()
every minute, but that didn't help: I only saw the memory usage dropping the
first time, but never after.
I changed the code to what you suggested, but it still doesn't seem to help.
Something must be wrong somewhere along the line, but I don't have a clue
where to begin looking.
Would using something like guppy (or heapy, or what it's called) reveal
something?
(I tried to figure out guppy/heapy once, gave up.)
Post by Jeroen Michiel
Could it be something about objects with circular references not being able
to be garbage-collected?
The objects in my DB are quite complex, so something like that might
actually be happening.
Usually when a Python process eats too much memory, your server ends up
in swappy death land. My gut feeling is that MemoryError means a
corrupted pickle that tries to allocate a large amount of memory (e.g.
multiple gigabytes) all in one go.

Can you add a try:/except MemoryError: import pdb; pdb.set_trace() in
there? See if the process memory usage is really big. Write down the
object OID (ZODB.utils.u64(obj._p_oid)), then try to load it from a separate
Python script/process (with the same sys.path, so custom classes can be
unpickled):

import ZODB.DB
import ZODB.FileStorage
import ZODB.utils

db = ZODB.DB.DB(ZODB.FileStorage.FileStorage('Data.fs', read_only=True))
conn = db.open()
obj = conn.get(ZODB.utils.p64(0xXXXXX))  # creates a ghost
try:
    obj._p_activate()  # tries to load it
except MemoryError:
    import pdb; pdb.set_trace()

See if you get a memory error there. If so, maybe try to disassemble the
pickle (pickletools.dis) -- if you've got pdb, you can find the pickle
itself one stack frame up.
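
For instance, once pdb has dropped you at the MemoryError, something along
these lines (a sketch only, assuming the frames from the traceback above,
where the raw record is the local variable `pickle` in setGhostState; note
that a ZODB record is really two pickles back to back, class metadata first,
then the state):

    (Pdb) up                         # go up into ZODB.serialize.setGhostState
    (Pdb) len(pickle)                # how big is the raw record on disk?
    (Pdb) import pickletools
    (Pdb) pickletools.dis(pickle)    # opcodes of the first (class metadata) pickle

A corrupted record typically shows up as an opcode claiming an absurdly long
string or sequence.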

Marius Gedminas
--
I want patience, and I WANT IT NOW!
Alexandre Garel
2012-12-13 14:53:19 UTC
Permalink
Post by Jeroen Michiel
Thanks for the reply!
I already tried
transaction.savepoint()
every minute, but that didn't help: I only saw the memory usage dropping the
first time, but never after.
I changed the code to what you suggested, but it still doesn't seem to help.
Something must be wrong somewhere along the line, but I don't have a clue
where to begin looking.
Would using something like guppy (or heapy, or what it's called) reveal
something?
Could it be something about objects with circular references not being able
to be garbage-collected?
The objects in my DB are quite complex, so something like that might
actually be happening.
Hello

my suggestions might be silly, but just in case:

1- is it that you modify a lot of objects (and big objects)? In that case
savepoints may not save you (my wild guess is that savepoints will only
drop objects that participated in the computation but were not modified).
If it's just re-indexing, it's strange, as the only thing changing would
normally be the index.

2- if it's a kind of migration for your database, do you really need to
have it done in one transaction? You could back up your database, run your
migration with multiple commits (transaction.commit() instead of
transaction.savepoint(); see the sketch below), and then if it goes wrong,
restore the old file, and if it's ok, well, it's ok :-)
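
A minimal sketch of that idea (the iterable, the migrate() helper and the
chunk size are made up for illustration; the point is committing every so
often and trimming the connection cache):

    import transaction

    CHUNK = 5000
    for count, obj in enumerate(objects_to_migrate):  # hypothetical iterable
        migrate(obj)                                   # hypothetical per-object work
        if count % CHUNK == 0:
            transaction.commit()
            obj._p_jar.cacheMinimize()  # ghostify unmodified objects to free memory
    transaction.commit()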

Hope it helps,

Alex
--
Alexandre Garel
06 78 33 15 37
Jeroen Michiel
2012-12-14 11:09:25 UTC
Permalink
I was thinking along the same lines.
I have found a spot in the code with circular references, and indeed (using
heapy) it seems those are the objects (which also happen to be quite big)
taking most of the memory. The main problem now is to get rid of them while
staying within memory limits. It's the part of the code I implemented
first in this site, back when I started working with Zope/Grok, so I would
approach things quite differently now, knowing what I know about the ZODB
that I didn't know then...
I guess multiple transactions will be needed, indeed.
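
For anyone curious how heapy fits in, the basic usage is tiny (a sketch,
nothing site-specific; guppy provides hpy()):

    from guppy import hpy

    h = hpy()
    h.setrelheap()    # only measure what gets allocated after this point
    # ... run the part of the migration that blows up memory ...
    print h.heap()    # live objects grouped by type, biggest consumers first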

It originally was just a matter of indexing objects, but since I want to get
rid of the circular references, it has become a DB migration...

Thanks for the advice. I'll keep you posted on how I fix it (if I ever
do); it might be interesting for other people, too.

Regards,
Jeroen
Post by Alexandre Garel
Post by Jeroen Michiel
Thanks for the reply!
I already tried
transaction.savepoint()
every minute, but that didn't help: I only saw the memory usage dropping the
first time, but never after.
I changed the code to what you suggested, but it still doesn't seem to help.
Something must be wrong somewhere along the line, but I don't have a clue
where to begin looking.
Would using something like guppy (or heapy, or what it's called) reveal
something?
Could it be something about objects with circular references not being able
to be garbage-collected?
The objects in my DB are quite complex, so something like that might
actually be happening.
Hello
1- is it that you modify a lot of objects (and big objects) in which
case savepoints may not save you (as my wild guess is that savepoints
will only drop objects participating in computation but not modified).
If it's just re-indexing it's strange as the only thing changing would
normally be the index.
2- if it's a kind of migration for your database, do you really need to
have it done in one transaction. Could you save your database, run your
migration in multiple commit (transaction.commit() instead of
transaction.savepoint()) then if it goes wrong, restore old file and if
it's ok, well it's ok :-)
Hope it helps,
Alex
--
Alexandre Garel
06 78 33 15 37
_______________________________________________
Zope3-users mailing list
https://mail.zope.org/mailman/listinfo/zope3-users
--
View this message in context: http://old.nabble.com/MemoryError-Evolving-a-ZODB-tp34784598p34797018.html
Sent from the Zope3 - users mailing list archive at Nabble.com.
Jeroen Michiel
2012-12-18 09:18:17 UTC
Permalink
I found the actual issue!
I am running a client/server setup: the client (a C++ tool) runs tests on
devices and pushes the results to the server (Grok) with REST requests.
There was a mechanism whereby the server validated the results and could
reject them if they didn't validate, so the client would rerun the test and
try to push it again. Also, if pushing the results failed for some other
reason, the client would keep trying. However, there was a bug in the server:
if validation failed, the wrong error was returned, but the results were
still committed to the DB. So the client kept retrying, and the server kept
storing the results but rejecting them with the wrong error...

I had found and solved this issue a long time ago on my test setup while
developing, but I had no idea it had actually happened on my live setup,
too.

Because of the circular references, I ended up with one massive set of data
that didn't fit into memory while doing findObjectsProviding. I managed to
remove the redundant data in chunks, committing in between. Even committing
didn't drop the memory usage, so I had to restart the script multiple times.
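
One way such a chunked cleanup could look, as a sketch (the interface name,
the attribute being removed and the per-run limit are hypothetical; the
import path of findObjectsProviding depends on whether you have
zope.generations or the older zope.app.generations):

    from itertools import islice
    import transaction
    from zope.generations.utility import findObjectsProviding

    # Handle at most N objects per run; rerun the script until nothing is left.
    N = 1000
    for count, obj in enumerate(islice(findObjectsProviding(root, IHasRedundantData), N)):
        del obj.redundant_data          # hypothetical cleanup of the circular reference
        if count % 100 == 0:
            transaction.commit()
            obj._p_jar.cacheMinimize()  # ghostify unmodified objects to free memory
    transaction.commit()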

Now my DB migration is running OK!

thx!
Post by Jeroen Michiel
I was thinking along the same lines.
I have found a spot in the code with circular references, and indeed
(using heapy) it seems those are the objects (which happen to be quite big
also) taking most of the memory. The main problem is now to get rid of
them while staying within memory boundaries. It's a part of the code I
implemented first in this site, and it was also the time I started working
with Zope/Grok, so I would approach things quite differently now, knowing
what I know about the ZODB that I didn't know then...
I guess multiple transactions will be needed, indeed.
It originally was just a matter of indexing objects, but wanting to get
rid of the circular references, it has become a db migration...
Thanks for the advice, I'll keep you posted on how I fixed it (if I ever
do), might be interesting for other people, too.
Regards,
Jeroen
Post by Alexandre Garel
Post by Jeroen Michiel
Thanks for the reply!
I already tried
transaction.savepoint()
every minute, but that didn't help: I only saw the memory usage dropping the
first time, but never after.
I changed the code to what you suggested, but it still doesn't seem to help.
Something must be wrong somewhere along the line, but I don't have a clue
where to begin looking.
Would using something like guppy (or heapy, or what it's called) reveal
something?
Could it be something about objects with circular references not being able
to be garbage-collected?
The objects in my DB are quite complex, so something like that might
actually be happening.
Hello
1- is it that you modify a lot of objects (and big objects) in which
case savepoints may not save you (as my wild guess is that savepoints
will only drop objects participating in computation but not modified).
If it's just re-indexing it's strange as the only thing changing would
normally be the index.
2- if it's a kind of migration for your database, do you really need to
have it done in one transaction. Could you save your database, run your
migration in multiple commit (transaction.commit() instead of
transaction.savepoint()) then if it goes wrong, restore old file and if
it's ok, well it's ok :-)
Hope it helps,
Alex
--
Alexandre Garel
06 78 33 15 37
_______________________________________________
Zope3-users mailing list
https://mail.zope.org/mailman/listinfo/zope3-users
--
View this message in context: http://old.nabble.com/MemoryError-Evolving-a-ZODB-tp34784598p34809613.html
Sent from the Zope3 - users mailing list archive at Nabble.com.