Note: This is a beta release of Red Hat Bugzilla 5.0. The data contained within is a snapshot of the live data so any changes you make will not be reflected in the production Bugzilla. Also email is disabled so feel free to test any aspect of the site that you want. File any problems you find or give feedback here.
Bug 1062601 - Evolution hangs again
Summary: Evolution hangs again
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: evolution
Version: 20
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
Assignee: Matthew Barnes
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
: 1077184 (view as bug list)
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2014-02-07 12:36 UTC by Mikhail
Modified: 2014-04-22 11:33 UTC (History)
4 users (show)

Fixed In Version: evolution-ews-3.12.1
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-03-26 16:10:52 UTC


Attachments (Terms of Use)
screenshot (deleted)
2014-02-07 12:36 UTC, Mikhail
no flags Details
backtraces of evolution (deleted)
2014-02-07 12:37 UTC, Mikhail
no flags Details
valgrind log (deleted)
2014-03-21 09:23 UTC, Mikhail
no flags Details
valgrind log with --leak-check=full (deleted)
2014-03-24 07:05 UTC, Mikhail
no flags Details
valgrind another log with --leak-check=full (deleted)
2014-03-26 05:48 UTC, Mikhail
no flags Details
valgrind log with --leak-check=full (30 min) (deleted)
2014-03-26 08:30 UTC, Mikhail
no flags Details
photo of unbottable system (deleted)
2014-03-31 21:41 UTC, Mikhail
no flags Details
valgrind log of evolution 3.12.1-1 (deleted)
2014-04-18 06:31 UTC, Mikhail
no flags Details

Description Mikhail 2014-02-07 12:36:23 UTC
Created attachment 860502 [details]
screenshot

Description of problem:
Evolution hangs again

Comment 1 Mikhail 2014-02-07 12:37:36 UTC
Created attachment 860503 [details]
backtraces of evolution

Comment 2 Mikhail 2014-02-07 12:38:19 UTC
Console out:


[mikhail@ws166 ~]$ evolution

(evolution:2576): GLib-GIO-CRITICAL **: g_simple_async_result_set_error: assertion 'G_IS_SIMPLE_ASYNC_RESULT (simple)' failed

(evolution:2576): GLib-GObject-WARNING **: invalid unclassed pointer in cast to 'SoupMessage'

(evolution:2576): libeews-CRITICAL **: ews_connection_schedule_cancel_message: assertion 'SOUP_IS_MESSAGE (message)' failed

(evolution:2576): e-data-server-WARNING **: (e-data-server-util.c:532):e_util_utf8_data_make_valid: runtime check failed: (g_utf8_validate (string->str, -1, NULL))

(evolution:2576): evolution-util-CRITICAL **: e_contact_store_get_contact: assertion 'ITER_IS_VALID (contact_store, iter)' failed

(evolution:2576): evolution-util-CRITICAL **: e_contact_store_get_contact: assertion 'ITER_IS_VALID (contact_store, iter)' failed

(evolution:2576): libebook-contacts-CRITICAL **: e_contact_get: assertion 'contact && E_IS_CONTACT (contact)' failed

(evolution:2576): libebook-contacts-CRITICAL **: e_contact_get: assertion 'contact && E_IS_CONTACT (contact)' failed

(evolution:2576): libebook-contacts-CRITICAL **: e_contact_get: assertion 'contact && E_IS_CONTACT (contact)' failed

(evolution:2576): evolution-util-CRITICAL **: e_contact_store_get_contact: assertion 'ITER_IS_VALID (contact_store, iter)' failed

(evolution:2576): evolution-util-CRITICAL **: e_contact_store_get_contact: assertion 'ITER_IS_VALID (contact_store, iter)' failed

(evolution:2576): libebook-contacts-CRITICAL **: e_contact_get: assertion 'contact && E_IS_CONTACT (contact)' failed

(evolution:2576): libebook-contacts-CRITICAL **: e_contact_get: assertion 'contact && E_IS_CONTACT (contact)' failed

(evolution:2576): libebook-contacts-CRITICAL **: e_contact_get: assertion 'contact && E_IS_CONTACT (contact)' failed
Created new window in existing browser session.

(evolution:2576): dconf-WARNING **: failed to commit changes to dconf: Timeout was reached

(evolution:2576): dconf-WARNING **: failed to commit changes to dconf: Timeout was reached

(evolution:2576): dconf-WARNING **: failed to commit changes to dconf: Timeout was reached

(evolution:2576): dconf-WARNING **: failed to commit changes to dconf: Timeout was reached

(evolution:2576): dconf-WARNING **: failed to commit changes to dconf: Timeout was reached

(evolution:2576): dconf-WARNING **: failed to commit changes to dconf: Timeout was reached

(evolution:2576): evolution-util-CRITICAL **: e_contact_store_get_contact: assertion 'ITER_IS_VALID (contact_store, iter)' failed

(evolution:2576): evolution-util-CRITICAL **: e_contact_store_get_contact: assertion 'ITER_IS_VALID (contact_store, iter)' failed

(evolution:2576): libebook-contacts-CRITICAL **: e_contact_get: assertion 'contact && E_IS_CONTACT (contact)' failed

(evolution:2576): libebook-contacts-CRITICAL **: e_contact_get: assertion 'contact && E_IS_CONTACT (contact)' failed

(evolution:2576): libebook-contacts-CRITICAL **: e_contact_get: assertion 'contact && E_IS_CONTACT (contact)' failed

(evolution:2576): evolution-util-CRITICAL **: e_contact_store_get_contact: assertion 'ITER_IS_VALID (contact_store, iter)' failed

(evolution:2576): evolution-util-CRITICAL **: e_contact_store_get_contact: assertion 'ITER_IS_VALID (contact_store, iter)' failed

(evolution:2576): libebook-contacts-CRITICAL **: e_contact_get: assertion 'contact && E_IS_CONTACT (contact)' failed

(evolution:2576): libebook-contacts-CRITICAL **: e_contact_get: assertion 'contact && E_IS_CONTACT (contact)' failed

(evolution:2576): libebook-contacts-CRITICAL **: e_contact_get: assertion 'contact && E_IS_CONTACT (contact)' failed

(evolution:2576): evolution-util-CRITICAL **: e_contact_store_get_contact: assertion 'ITER_IS_VALID (contact_store, iter)' failed

(evolution:2576): libebook-contacts-CRITICAL **: e_contact_get: assertion 'contact && E_IS_CONTACT (contact)' failed

(evolution:2576): libebook-contacts-CRITICAL **: e_contact_get: assertion 'contact && E_IS_CONTACT (contact)' failed

(evolution:2576): libebook-contacts-CRITICAL **: e_contact_get: assertion 'contact && E_IS_CONTACT (contact)' failed

(evolution:2576): evolution-util-CRITICAL **: e_contact_store_get_contact: assertion 'ITER_IS_VALID (contact_store, iter)' failed

(evolution:2576): dconf-WARNING **: failed to commit changes to dconf: Timeout was reached

(evolution:2576): dconf-WARNING **: failed to commit changes to dconf: Timeout was reached

(evolution:2576): dconf-WARNING **: failed to commit changes to dconf: Timeout was reached

(evolution:2576): dconf-WARNING **: failed to commit changes to dconf: Timeout was reached

(evolution:2576): dconf-WARNING **: failed to commit changes to dconf: Timeout was reached

(evolution:2576): dconf-WARNING **: failed to commit changes to dconf: Timeout was reached


(evolution:2576): evolution-util-CRITICAL **: e_contact_store_get_contact: assertion 'ITER_IS_VALID (contact_store, iter)' failed

(evolution:2576): libebook-contacts-CRITICAL **: e_contact_get: assertion 'contact && E_IS_CONTACT (contact)' failed

(evolution:2576): libebook-contacts-CRITICAL **: e_contact_get: assertion 'contact && E_IS_CONTACT (contact)' failed

(evolution:2576): libebook-contacts-CRITICAL **: e_contact_get: assertion 'contact && E_IS_CONTACT (contact)' failed

(evolution:2576): evolution-util-CRITICAL **: e_contact_store_get_contact: assertion 'ITER_IS_VALID (contact_store, iter)' failed
Killed

Comment 3 Milan Crha 2014-02-10 11:57:09 UTC
Thanks for a bug report. The backtrace of evolution (others are basically idle) shows it busy in a GSource dispatcher/code:
 
> Thread 1 (Thread 0xb0e93900 (LWP 2576)):
> #0 0xb3a29786 in g_source_iter_next () from /lib/libglib-2.0.so.0
> #1 0xb3a2c1a9 in g_main_context_check () from /lib/libglib-2.0.so.0
> #2 0xb3a2c822 in g_main_context_iterate.isra.23 () from /lib/libglib-2.0.so.0
> #3 0xb3a2cdc3 in g_main_loop_run () from /lib/libglib-2.0.so.0
> #4 0xb481115d in gtk_main () from /lib/libgtk-3.so.0
> #5 0x0804ab44 in main (argc=1, argv=0xbfcd2944) at main.c:683

I cannot tell why that happened, but from your console output I'd guess the first critical warnings are the most suspicious, at least to me:
(In reply to Mikhail from comment #2)
> (evolution:2576): GLib-GIO-CRITICAL **: g_simple_async_result_set_error:
> assertion 'G_IS_SIMPLE_ASYNC_RESULT (simple)' failed
> 
> (evolution:2576): GLib-GObject-WARNING **: invalid unclassed pointer in cast
> to 'SoupMessage'
> 
> (evolution:2576): libeews-CRITICAL **:
> ews_connection_schedule_cancel_message: assertion 'SOUP_IS_MESSAGE
> (message)' failed

It might be good to figure out what is causing them. Do you see that every start?

The other console warnings are most like from a time when you compose a new message, using autocompletion of email addresses.

Finally those dconf timeouts are most likely caused by the main thread bing busy, avoid processing of the D-Bus signals.

I'm also wondering whether your issue isn't, at least partially, caused by an issue reported in bug #1063216.

Comment 4 Milan Crha 2014-03-17 16:35:25 UTC
*** Bug 1077184 has been marked as a duplicate of this bug. ***

Comment 5 Milan Crha 2014-03-19 08:07:31 UTC
I asked a GLib developer for an opinion on this and he thinks that there happened a memory corruption (like if the process writes to an already freed memory, or any such software/programmer error), which caused the issue with the g_main_context_iterate(). I agree with him, especially with the following runtime warnings mentioned at comment #3.

I would like to ask you for a help on the investigation, because your use case is more user-environmental than mine.

Could you install valgrind, debuginfo packages for libsoup, evolution-data-server, evolution and evolution-ews (make sure their versions will match versions of the binary packages (rpm -qa | grep evolution | sort) and then run evolution under valgrind like this, please?
   $ G_SLICE=always-malloc valgrind --num-callers=50 evolution &>log.txt
It will be very slow, due to all memory checking, and with high CPU usage, but it may eventually show where the actual memory corruption happened. Please try to use evolution for some time when running under valgrind, or keep it running during a lunch, over night or such. The longer you'll keep evolution running the better. Even if it'll not crash, the log can contain the information about incorrect memory usage, thus please upload it here anyway. The log may not contain any private information, but better to check for it. I usually search for "pass" (quoted for clarity only).

Comment 6 Mikhail 2014-03-21 09:23:16 UTC
Created attachment 877179 [details]
valgrind log

Comment 7 Milan Crha 2014-03-21 13:13:13 UTC
Thanks for the update. I never saw valgrind getting out of memory. Till now. Apart of this the log shows only usual complaints, which are supposed to be harmless. Maybe that out-of-memory crash is the reason, something is leaking in the related evolution(-*) code.

If you add also --leak-check=full into the valgrind arguments and run evolution for a shorted period (I do not know how long you've it running under valgrind), then a correct close (it is important to see evolution closed correctly) should show leaking memory. Such log will be huge, thus better to attach it compressed.

Comment 8 Mikhail 2014-03-24 07:05:23 UTC
Created attachment 877944 [details]
valgrind log with --leak-check=full

Comment 9 Milan Crha 2014-03-25 08:09:41 UTC
Thanks for the update. If you look at the very end of the log you'll see that valgrind "crashed" again, when it failed to allocate a memory block. The information at the end of the log also contains some hints what could be wrong. Anyway, I'm wondering, did you have evolution running under valgrind for a long time (several hours), or it closed itself after a short period of time (several minutes)? If I recall correctly, then you have basically only one account, an evolution-ews account, which may not cause such significant memory usage to for valgrind to get out of memory.

I tried to run evolution under valgrind on my machine too, and it doesn't abort, but I also didn't find any significant memory leaks. I'm afraid that the memory leaks also depend on the actual message type being viewed, or certain actions. I will try something along these lines and let you know my findings.

Comment 10 Mikhail 2014-03-25 08:19:09 UTC
(In reply to Milan Crha from comment #9)
> Thanks for the update. If you look at the very end of the log you'll see
> that valgrind "crashed" again, when it failed to allocate a memory block.
> The information at the end of the log also contains some hints what could be
> wrong. Anyway, I'm wondering, did you have evolution running under valgrind
> for a long time (several hours), or it closed itself after a short period of
> time (several minutes)?

Evolution under valgrind worked all weekend (2 days) + half a working day of what we can conclude that falls due to lack of memory when I'm working. Although this machine I have 3Gb RAM + 16Gb swap and PAE kernel.

> If I recall correctly, then you have basically only
> one account, an evolution-ews account, which may not cause such significant
> memory usage to for valgrind to get out of memory.
>


yes

> I tried to run evolution under valgrind on my machine too, and it doesn't
> abort, but I also didn't find any significant memory leaks. I'm afraid that
> the memory leaks also depend on the actual message type being viewed, or
> certain actions. I will try something along these lines and let you know my
> findings.

Comment 11 Milan Crha 2014-03-25 13:07:02 UTC
I made this a go on my machine, and even I found another minor leaks (minor in a sense that there leaks something on a object which is global and used only one for the whole life-time of evolution), then I did not find anything significant. I left evolution running for about 30 minutes, while it checked for new messages in an EWS account each 4-5 minutes, though my test account has basically no activity, with compare of your work account, where I suppose you've many mails received each day. There is also one more difference, I run git master of evolution (and other evolution core packages), which are currently quite close to the just released 3.12.0 version. Thus maybe the most significant leaks had been fixed meanwhile. Though it's only a wild guess.

Maybe if you can give the valgrind the last try, have it running for 30 minutes or an hour only, and the close it properly (no crash, no abort, nothing like that), then it may, hopefully, verify whether you face any already fixed leaks. The correct finish of valgrind logging may end with a summary of memory usage at the end of evolution, basically something like this:

>  LEAK SUMMARY:
>     definitely lost: 70,989 bytes in 189 blocks
>     indirectly lost: 81,385 bytes in 3,024 blocks
>       possibly lost: 505,665 bytes in 7,034 blocks
>     still reachable: 56,441,906 bytes in 897,221 blocks
>          suppressed: 0 bytes in 0 blocks

Comment 12 Mikhail 2014-03-26 05:48:09 UTC
Created attachment 878863 [details]
valgrind another log with --leak-check=full

Comment 13 Mikhail 2014-03-26 08:30:51 UTC
Created attachment 878905 [details]
valgrind log with --leak-check=full (30 min)

Comment 14 Milan Crha 2014-03-26 16:10:52 UTC
Thanks for the update. I found and fixed couple of the leaks from your logs, I even didn't know there are such, as some of them were not shown to me. This all resulted in a set of commits into evolution-data-server, evolution and evolution-ews, everything will be available in time of 3.12.1+.

Thank you for your help with this. I cannot tell for sure whether the memory leak fixes will address the initial issue of this bug report, but I hope that this is the right beginning. Some leaking parts were quite often, especially on folder updates, message movements and certain message views. I'm closing this for now.

If you want to test 3.12.0 (not the version with leak fixes, but still the more recent code that the 3.10.4 is) in Fedora 20, then you can use instructions from here:
https://copr.fedoraproject.org/coprs/rhughes/f20-gnome-3-12/

Comment 15 Mikhail 2014-03-28 09:43:48 UTC
This fixes would be ported on Evolution 3.10.4?

Comment 16 Milan Crha 2014-03-28 13:59:18 UTC
(In reply to Mikhail from comment #15)
> This fixes would be ported on Evolution 3.10.4?

Unfortunately no, that's why I pointed to the copr repo, where can be downloaded 3.12.0, before Fedora 21 is out (which will take several months).

Comment 17 Mikhail 2014-03-28 14:28:32 UTC
I'm afraid of losing support here if set repository copr.

> Active Releases
>
> The following unofficial repositories are provided as-is by owner of this
> project. Contact the owner directly for bugs or issues (IE: not bugzilla).

Comment 18 Milan Crha 2014-03-31 09:30:19 UTC
(In reply to Mikhail from comment #17)
> I'm afraid of losing support here if set repository copr.

It depends. If you mean by support that you would not be eligible to ask about issues related to evolution here, then I do not have any problem if you use the COPR repository, just the opposite, because you get more recent code and you still have an option to downgrade in case it works worse than before for you. I also do not believe that any other GNOME-related project maintainer will not be willing to help, the provided packages are "only" rebuilt against Fedora 20 packages, taken from rawhide, which includes current stable release of GNOME projects.

In any case, I'm fine if you do not want to install it, it was only an idea how to get to more recent code earlier than when Fedora 21 is out, which means several months.

Comment 19 Mikhail 2014-03-31 21:38:19 UTC
After upgrade gnome-shell with CORP repository my system become unbootable :( And I don't know easy way how rollback system to state which was before this upgrade :(

Comment 20 Mikhail 2014-03-31 21:41:22 UTC
Created attachment 881108 [details]
photo of unbottable system

Comment 21 Milan Crha 2014-04-02 07:18:45 UTC
Oh, that's bad :( Weird there is no error message shown, it's probably waiting for something. Could you boot a previous kernel version, or the Rescue option from the boot options screen? That should work. Could you Ctrl+Alt+F2 in this stage, and get a login prompt? Alternatively, edit the boot command at the login screen and add number 3 to the init parameters. It is supposed to run the system in text-only mode, which used to work this way some time ago. I'm not sure how exactly it works with all the systemd changes. Yet another option is to boot a live CD and choose the Rescue option from there. Once you get to a login prompt, log in as a root, remove the copr repo file and 'yum downgrade' the changed packages to their standard fedora 20 versions. You can see list of updated packages at /var/log/yum.log file.

Weird it failed for you, I have one Fedora 20 installation with the COPR repo activated and the system fully updated and it can boot properly. Do you use any 3rd-party video drivers? Maybe that's it, the video drivers may require rebuild for the new kernel (in which case you should be able to Ctrl+Alt+F2 to the login prompt in the stage from your photo).

Comment 22 Mikhail 2014-04-13 19:51:07 UTC
My second attempt was more successful and I moved to GNOME 3.12
And I see regression: 'Of select folder dialog removed network connections from places'. Example GNOME 3.10 how is was usefull: https://drive.google.com/file/d/0B0nwzlfiB4aQRTVDU2V5VWVOODg/edit?usp=sharing
I don't to know what package relate with it for writing a bug report.

Comment 23 Milan Crha 2014-04-14 10:34:53 UTC
(In reply to Mikhail from comment #22)
> My second attempt was more successful and I moved to GNOME 3.12

Good. I was told that the COPR repository had some dependency issues, which probably caused your trouble. Good you have the machine (and system) back.

> And I see regression: 'Of select folder dialog removed network connections
> from places'.

The dialog is from gtk3, so it should go there, probably, or to the actual application which uses it, because I tried to do a similar thing with gtk3-3.12.1 (while I expect 3.12.0 to be the same) in gedit, and gedit shows network places, while evolution does not. gedit uses "Open file", though in evolution I do not see network places in both "open file" and "open folder" dialogs, neither if I add a network place to bookmarks.

Comment 24 Mikhail 2014-04-18 06:31:03 UTC
Today update evolution
$ rpm -qa | grep evolution
evolution-data-server-3.12.1-1.fc20.i686
evolution-debuginfo-3.12.1-1.fc20.i686
evolution-data-server-debuginfo-3.12.1-1.fc20.i686
evolution-3.12.1-1.fc20.i686
evolution-ews-debuginfo-3.12.1-1.fc20.i686
evolution-ews-3.12.1-1.fc20.i686

I attach new valgrind log so you can make sure everything is fixed

Comment 25 Mikhail 2014-04-18 06:31:49 UTC
Created attachment 887428 [details]
valgrind log of evolution 3.12.1-1

Comment 26 Milan Crha 2014-04-22 11:33:33 UTC
Thanks for the log. Despite valgrind being killed by "Out of memory" error, the actual result doesn't show anything significant:
 LEAK SUMMARY:
    definitely lost: 43,214 bytes in 568 blocks
    indirectly lost: 441,016 bytes in 14,724 blocks
      possibly lost: 4,553,053 bytes in 21,669 blocks
    still reachable: 45,974,099 bytes in 1,018,805 blocks
         suppressed: 0 bytes in 0 blocks

Having allocated about 51MB is nothing large by any means these days. I see in the log one weird report, which I didn't see before, thus I'll investigate on it. I do not think there is more we can do about this right now, thus feel free to stop using the valgrind for now, let your machine relax.


Note You need to log in before you can comment on or make changes to this bug.