
Catch crashes in modular applications

The User
Hi!

There is a way to keep a whole application from crashing without running every module in a separate process. Plasma and Konqueror in particular have problems with crashing Plasmoids/KParts (look here (Plasmoids as sep and maybe here (rekonq vs. Konqueror)). It was suggested to put them in separate processes, but this solution would need slow and ugly code. So we should find a way to handle SIGSEGV and similar signals in C++.
[quote="The User" pid='67703' dateline='1240866922']Hey Ignacio! (and all the others)
It's possible to stop a crash!
I'll say Python but it could be any scripting language in this example:
A common error is a "division by zero". In Python you'll get a "ZeroDivisionError". The execution can't be continued. It's like a crash at script level. Maybe even more common than divisions by zero is SIGSEGV, a null-pointer dereference. This means an unbound reference was used to access some data. It's like sending a letter without specifying the address. In Python you can also try to dereference the "None" reference. You'll get an error because the None type can't be used for anything. Again the Python Plasmoid will crash but the process will survive. In C++ you would get SIGSEGV.
But!
In C there is a function signal() (try `man signal`). You can specify the code executed after a SIGSEGV or any other signal. And there is a second error concept (not signals) in C++ called exceptions. A "thrown" exception also rolls back the execution, and if nothing catches it, it ends in SIGABRT. But exceptions can be stopped. When the application receives a signal you can throw an exception. In another location in the program the exception can be handled, e.g. in Corona. So the crash would be stopped, but Corona could also show a KCrash dialog for reporting the bug and could remove the Plasmoid. Plasma would continue running.

The User

Edit:
C++-Hackers, try it!
Somebody gave me the information and a few minutes ago I could write some code using those exceptions. It's pretty simple:
Code: Select all
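#include <csignal>
#include <iostream>
#include <stdexcept>
#include <string>
using namespace std;

// Exception type carrying the number of the signal that was caught.
struct seg_error : public runtime_error
{
  int key;
  seg_error(const string &message = "", int _key = 0) : runtime_error(message), key(_key)
  {}
};

// Signal handler: turns SIGSEGV into a C++ exception.
// Throwing from a signal handler is not covered by the C++ standard; it relies
// on the g++ flags used in the compile command.
void segmentationFault(int key)
{
        throw seg_error("Segmentation Fault! Error key is: ", key);
}

int main()
{
        signal(SIGSEGV, segmentationFault);
        try
        {
                int *x = 0;
                cout << *x;  // Impossible, crash!
        }
        catch(const seg_error &e)
        {
                cout << e.what() << e.key << endl;
        }
        cout << "still alive..." << endl;
        return 0;
}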
[/quote]

You can compile and run it like this:
Code: Select all
g++ -o catchsigsegv ./catchsigsegv.cpp -fnon-call-exceptions -fasynchronous-unwind-tables -funwind-tables
./catchsigsegv


In very stable environments this feature could perhaps be turned off for even higher performance. A SIGSEGV would still be a bug which should be reported and fixed. But with this technique bugs could be located more easily and the running Plasma could be kept alive.
It would need a standard way in KDELibs, libkdecore. Windows doesn't support the signals, although they are part of ANSI C. For Windows support SEH would be needed. But first of all a clean Unix implementation would already be great.
In other applications, maybe KGoldrunner, it would be less useful, because there is no consistent location for catching the exceptions. So the crash handling should not be used in such apps.

The User

Last edited by The User on Sun May 03, 2009 12:13 am, edited 1 time in total.
The User
No feedback?
Everybody discussed the "separate-processes-idea", but now... There are still crashing Plasmoids! :D :D :D
job
I see 2 problems with this approach:

1. Inside the handler, you have no idea where the signal came from (so no idea which plasmoid to remove).
2. What should be done inside the handler? Jump right back into the event loop? I don't think that is possible.

I think these kinds of handlers are intended to clean up after something went wrong, not to resume the program (especially for the SIGSEGV signal). Note that if you don't exit in the handler, execution will resume at the instruction that caused the signal, resulting in an infinite loop.
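
For comparison, here is a minimal sketch of one way to leave a handler without returning to the faulting instruction. It uses POSIX sigsetjmp/siglongjmp instead of the exceptions proposed in this thread, so it is a different technique, shown only to illustrate the "jump back" question. The names runGuarded, onFault, g_recoveryPoint and g_guardActive are made up for this example, and it assumes the guarded code holds no C++ objects whose destructors would be skipped by the jump.

```cpp
#include <setjmp.h>
#include <signal.h>

// Hypothetical helper names; not part of KDE or Qt.
static sigjmp_buf g_recoveryPoint;               // where the handler jumps back to
static volatile sig_atomic_t g_guardActive = 0;  // set only while guarded code runs

static void onFault(int sig)
{
    if (g_guardActive) {
        g_guardActive = 0;
        siglongjmp(g_recoveryPoint, 1);  // does not return to the faulting code
    }
    // Not inside guarded code: restore the default action and crash normally.
    signal(sig, SIG_DFL);
    raise(sig);
}

// Runs fn with SIGSEGV redirected to onFault.
// Returns true if fn finished, false if it was interrupted by SIGSEGV.
bool runGuarded(void (*fn)())
{
    struct sigaction sa = {};
    struct sigaction old = {};
    sa.sa_handler = onFault;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, &old);

    bool finished;
    if (sigsetjmp(g_recoveryPoint, 1) == 0) {  // 1: also save the signal mask
        g_guardActive = 1;
        fn();
        g_guardActive = 0;
        finished = true;
    } else {
        finished = false;                      // arrived here via siglongjmp
    }
    sigaction(SIGSEGV, &old, nullptr);         // put the previous handler back
    return finished;
}
```

Saving the signal mask in sigsetjmp matters here: the signal is blocked while the handler runs, and without restoring the mask a second fault could no longer be caught.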

Please correct me if I'm wrong:)

Cheers
The User
Normally they are intended to clean up some data.
E.g. you can press ctrl+c for "ping 192.168.1.1" and you'll get the results.
But it's possible to throw exceptions. When you catch those exceptions you can determine the broken Plasmoid. This could be done somewhere in the event loop. But for that to work, the error has to occur in the main thread. When it happens in another thread you can determine the module using Qt. I think you could determine the owner of the QThread object. It could be a Plasmoid, and then you have solved it. I'm quite sure there is also a possibility to get the stack trace.
So the application should give information about the handleable objects (e.g. Plasma::Applet) to the crash handler.
When there is no such information, the application shouldn't use the crash handler. It's right: the crash handler won't be able to handle every crash. But with some information about Plasma::Applet or KParts::Part there could be good results.
For errors which aren't module-specific the crash handler should abort the execution.
The rules could contain this information:
-handleable class
-a list of function pointers called for this class
-handleable threads, e.g. some threads could catch exceptions, so it would be good to be able to tell the handler "throw the sigsegv exception for thread 1234"
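
The "rules" idea could look roughly like the following sketch. It covers only the per-thread part of the list, and everything in it (the class CrashRules, its methods, the module names) is hypothetical and not an existing KDE API. It is not thread-safe as written; a real version would need locking.

```cpp
#include <functional>
#include <map>
#include <string>
#include <thread>
#include <utility>

// Hypothetical registry; the class and method names are illustrative only.
class CrashRules
{
public:
    using Cleanup = std::function<void()>;

    // Declare that crashes in the given thread belong to the named module.
    void registerThread(std::thread::id thread, const std::string &module, Cleanup cleanup)
    {
        rules_[thread] = Rule{module, std::move(cleanup)};
    }

    // Called from the place that caught the exception. Runs the cleanup and
    // returns the module name, or returns an empty string when no rule
    // matches (in that case the caller should abort as usual).
    std::string handle(std::thread::id thread)
    {
        auto it = rules_.find(thread);
        if (it == rules_.end())
            return std::string();
        const std::string module = it->second.module;
        if (it->second.cleanup)
            it->second.cleanup();
        rules_.erase(it);
        return module;
    }

private:
    struct Rule
    {
        std::string module;
        Cleanup cleanup;
    };
    std::map<std::thread::id, Rule> rules_;
};
```

An application would register its module threads at startup and call handle(std::this_thread::get_id()) from the catch block; an empty result means the crash is not module-specific.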

May the moc be with you!

The User

Last edited by The User on Sun May 03, 2009 10:17 am, edited 1 time in total.
ivan
While I like the general idea, I'm not sure whether it is doable.

You said that we could catch the exceptions in order to know which plasmoid caused the crash. But, if devs of plasmoids that are crashing knew where the sigsegv occurs, they would fix the code, and not throw the exception.


The User
Execute it using gdb and you'll see the location where it crashed.
But that doesn't imply you know the reason.
The User
Update: Somebody told me that it works with Windows. ;)
Dario_Andres
I have extended the code a bit to integrate it with the Qt events. The applications could detect this "crashCatched" signal, identify the faulty widget/object/applet to invalidate/remove it, save the data to disk and gracefully exit.

http://darioandres.pastebin.com/d6cee37cd
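
Since the pastebin may not be reachable, here is a Qt-free sketch of the general idea described above: wrap each dispatch in try/catch, report the faulty module and remove it while the others keep running. It is not Dario's code. Dispatcher, addModule, onCrash and the module names are invented for this example, and the fault is simulated with an ordinary throw instead of a real signal.

```cpp
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Stand-in for the seg_error type used earlier in the thread.
struct seg_error : std::runtime_error
{
    using std::runtime_error::runtime_error;
};

// Hypothetical miniature event loop: each module has a name and a handler.
class Dispatcher
{
public:
    using Handler = std::function<void()>;
    using CrashCallback = std::function<void(const std::string &)>;

    void addModule(const std::string &name, Handler handler) { modules_[name] = std::move(handler); }
    void onCrash(CrashCallback callback) { crashCallback_ = std::move(callback); }
    bool hasModule(const std::string &name) const { return modules_.count(name) != 0; }

    // Deliver one event to every module. A module that throws seg_error is
    // reported through the callback and removed; the others are unaffected.
    void dispatchAll()
    {
        std::vector<std::string> faulty;
        for (auto &entry : modules_) {
            try {
                entry.second();
            } catch (const seg_error &) {
                faulty.push_back(entry.first);
            }
        }
        for (const std::string &name : faulty) {
            if (crashCallback_)
                crashCallback_(name);
            modules_.erase(name);
        }
    }

private:
    std::map<std::string, Handler> modules_;
    CrashCallback crashCallback_;
};
```

In a Qt application the try/catch would sit in a reimplemented notify() and the callback would be a Qt signal, but the control flow is the same.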
ivan

Sun Dec 27, 2009 10:27 pm
Very interesting, but this is not a very reliable way of determining the cause of crash.

And it doesn't seem to stop every crash:

http://ivan.pastebin.com/m48c3d36f

(maybe I missed something...)


Dario_Andres

Sun Dec 27, 2009 11:21 pm
No, it is just a hackish hack :P, I haven't tested it on all the situations..
Consider it "code to play with for a few hours".. nothing really reliable..
ivan
Ok then, for a proof of concept, it is kinda neat :)


The User

Wed Dec 30, 2009 12:00 am
Plasma-example:
http://ivan.pastebin.com/f29b573d5
(it's something like pseudo-code, no main-function or something like that, only the concept ;))

@Dario
I can't access your version.

PS:
The throw won't provide a useful backtrace.
It would be possible to use a throw without parameters, but I've tested it with gdb:
I get an incomplete backtrace containing a function called __cxa_rethrow.
It won't provide the information we need.
Maybe there's a way to store the context.
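
One possible way to store the context, sketched under the assumption of a glibc system where backtrace() and backtrace_symbols() from <execinfo.h> are available: record the call stack inside the exception object at the moment it is constructed, so the catch site can still see where the throw happened. The class name traced_error is made up for this example. Whether it is safe to call backtrace() from inside a signal handler is not guaranteed, so treat this as an experiment rather than a solution.

```cpp
#include <execinfo.h>  // backtrace(), backtrace_symbols(): glibc extension (assumed available)
#include <cstddef>
#include <cstdlib>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical exception type that records the call stack when it is created.
class traced_error : public std::runtime_error
{
public:
    explicit traced_error(const std::string &message)
        : std::runtime_error(message)
    {
        void *buffer[64];
        const int count = backtrace(buffer, 64);  // raw return addresses
        frames_.assign(buffer, buffer + count);
    }

    const std::vector<void *> &frames() const { return frames_; }

    // Human-readable frames; function names usually need linking with -rdynamic.
    std::vector<std::string> symbols() const
    {
        std::vector<std::string> result;
        if (frames_.empty())
            return result;
        char **names = backtrace_symbols(frames_.data(), static_cast<int>(frames_.size()));
        if (names == nullptr)
            return result;
        for (std::size_t i = 0; i < frames_.size(); ++i)
            result.push_back(names[i]);
        std::free(names);  // backtrace_symbols() allocates one block for all names
        return result;
    }

private:
    std::vector<void *> frames_;
};
```

The catch block could then print e.symbols() or pass the addresses on to a crash reporter.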
majewsky
I think the main problem is that Qt was not designed for exceptions (for a reason: I'm not sure if zero-cost exceptions have arrived on all platforms yet).

So the first chance for you to catch the exception is the QCoreApplication::exec() call in main(), because 99% of the relevant exceptions would be thrown in function calls dispatched by the event loop.

Maybe I'm sounding overly pessimistic when I say that stuff like multi-process Plasma won't be possible without a managed runtime platform. Please disabuse me if you can, I'm a big fan of compiled languages.


Proud kdegames developer since 2008, and member of the KDE forums since March 2009
The User

Wed Jan 06, 2010 10:18 pm
We should forget about ugly stuff like names containing two underscores and ugly character sequences like cxa. ;)
It is quite simple to invoke DrKonqi.
The idea for Plasma:
-Register a signal handler for SIGSEGV, SIGILL, SIGABRT, SIGFPE and maybe others
-Before doing anything else the signal handler should invoke DrKonqi (DrKonqi is a separate program, so you can simply start it with some command-line arguments)
-The handler should throw an exception
-This exception will be caught by notify
-The Corona can handle the Qt signal and find out which Plasmoid is evil; otherwise it will rethrow the exception
-Anti-recursion code is needed: when handling fails, the application should crash normally, without DrKonqi or exceptions. Therefore we would need a global variable indicating the state of error handling:
Code: Select all
bool crashHandled = true;

-In "notify" there should be such code:
Code: Select all
try
{
    ... // Code from Ivan
}
catch(seg_error& err)
{
    try
    {
        crashHandled = true;
        ... // emit-stuff
    }
    catch(...)
    {
        crashHandled = false;
        throw;
    }
}

-The crash-handler should look like this:
Code: Select all
if(crashHandled)
{
    crashHandled = false;
    ... // DrKonqi, throw...
}


Do you see any problems? Should "crashHandled" be put into a QThreadStorage or not?
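
As one possible answer to the per-thread question, here is a minimal sketch that keeps the flag in a C++11 thread_local variable, which plays the same role as a QThreadStorage entry. The names handlerArmed, FaultAction, decideOnFault and recoveryFinished are invented for this example, and the DrKonqi call and the actual throw are left out. Note that accessing thread-local storage from a signal handler is not formally guaranteed to be safe.

```cpp
// One flag per thread, playing the role of crashHandled from the post above.
// true means: the next fault in this thread may be converted into an exception.
thread_local bool handlerArmed = true;

enum class FaultAction { ThrowException, CrashNormally };

// Decision the signal handler would take before doing anything else.
FaultAction decideOnFault()
{
    if (!handlerArmed)
        return FaultAction::CrashNormally;  // a fault during recovery: give up
    handlerArmed = false;                   // disarm until recovery has finished
    return FaultAction::ThrowException;
}

// Called by the catch block (e.g. in notify) once the faulty module is gone.
void recoveryFinished()
{
    handlerArmed = true;
}
```

With a per-thread flag, a fault in one thread's recovery code does not disable crash handling for the other threads, which a single global bool would do.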

The User
The User

Thu Jan 07, 2010 11:02 pm
The proofs of concept do not work properly because of multi-threading (signal() is unsafe with multiple threads). But this version really works:
Code: Select all
#include <stdlib.h>
#include <signal.h>
#include <iostream>
#include <stdexcept>
#include <pthread.h>

#include <QtCore>
#include <QtGui>
#include <QtDebug>
#include <QThreadStorage>

/* Define signal catching stuff */
using namespace std;

struct seg_error : public runtime_error
{
    int key;
    seg_error(string message = "", int _key = 0) : runtime_error(message), key(_key) {}
};

class CrashFreeApp : public QApplication
{
    Q_OBJECT
public:
    static struct sigaction sa;
    static QThreadStorage<bool*> useHandler;

    static void fault(int key)
    {
        if(!useHandler.hasLocalData() || *useHandler.localData())
        {
            useHandler.setLocalData(new bool(false));
            QMessageBox *box = new QMessageBox(QMessageBox::Critical, QString("Error"), QString("This dialog should be replaced by DrKonqi!"), QMessageBox::Ok);
            box->setAttribute(Qt::WA_DeleteOnClose);
            box->show();
            throw seg_error("Segmentation Fault! Error key is: ", key);
        }
        else
        {
            // Handling already failed once: restore the default action and
            // re-raise the signal so the process crashes normally.
            // (SIG_DFL is a constant, not a function, so it cannot be called.)
            signal(key, SIG_DFL);
            raise(key);
        }
    }

    static void installSignalHandlers()
    {
//      sa.sa_flags = SA_SIGINFO;
        sa.sa_flags = SA_NODEFER;
        sigemptyset(&sa.sa_mask);
        sa.sa_handler = fault;

        sigset_t mask;
        sigemptyset(&mask);

        #ifdef SIGSEGV
        sigaction(SIGSEGV, &sa, 0);
        sigaddset(&mask, SIGSEGV);
        #endif
        #ifdef SIGFPE
        sigaction(SIGFPE, &sa, 0);
        sigaddset(&mask, SIGFPE);
        #endif
        #ifdef SIGILL
        sigaction(SIGILL, &sa, 0);
        sigaddset(&mask, SIGILL);
        #endif

        pthread_sigmask(SIG_UNBLOCK, &mask, 0);
    }

    // QApplication keeps a reference to argc, so it is taken by reference here.
    CrashFreeApp(int & argc, char ** argv) : QApplication(argc, argv)
    {
        //Install signal handlers
        installSignalHandlers();
        //signal(SIGABRT, segmentationFault);
    }

    bool notify(QObject * receiver, QEvent * event)
    {
        try
        {
            return QApplication::notify(receiver, event);
        }
        catch (seg_error & e)
        {   //Catch possible crashes
            try
            {
                useHandler.setLocalData(new bool(true));
                cout << e.what() << endl;
                emit crashCatched(receiver, event);
            }
            catch(...)
            {
                useHandler.setLocalData(new bool(false));
                throw;
            }
        }
        return false;
    }

Q_SIGNALS:
    void crashCatched(QObject*, QEvent*);
};

struct sigaction CrashFreeApp::sa;
QThreadStorage<bool*> CrashFreeApp::useHandler;

class CrashingWidget: public QTextEdit
{
    Q_OBJECT
public:
    CrashingWidget() : QTextEdit() {
        // timer.start(3000, & timerHandler);
        timer.start(3000, this);
    }

private:
    void timerEvent(QTimerEvent * ev) {
        int *x = 0;
        cout << *x;
    }

    class TimerHandler: public QObject {
    public:
        void timerEvent(QTimerEvent * ev) {
            int *x = 0;
            cout << *x;
        }
    };

    QBasicTimer timer;
    TimerHandler timerHandler;
};

class CrashTest: public QWidget
{
    Q_OBJECT
public:
    CrashTest(QApplication * app) : QWidget() {
        connect(app, SIGNAL(crashCatched(QObject*, QEvent*)), this, SLOT(processCrash(QObject *, QEvent *)), Qt::DirectConnection);
        resize(200, 200);
        QHBoxLayout * lay = new QHBoxLayout(this);

        widget1 = new CrashingWidget();
        lay->addWidget(widget1);

        widget2 = new CrashingWidget();
        lay->addWidget(widget2);

        setLayout(lay);
    }

public Q_SLOTS:
    void processCrash(QObject * obj, QEvent * evt) {
        //Handle the crash on the app itself
        qDebug() << "It crashed on" << obj << evt;
        if (obj->inherits("QWidget")) {
            qDebug() << "It is a QWidget, let's hide it";
            QWidget * widget = qobject_cast<QWidget*>(obj);
            widget->deleteLater();
        } else {
            qDebug() << "Not a QWidget";
        }
    }

private:
    CrashingWidget * widget1;
    CrashingWidget * widget2;
};

int main(int argc, char ** argv)
{
    CrashFreeApp app(argc, argv);

    CrashTest * test = new CrashTest(&app);
    test->show();

    return app.exec();
}

#include "main.moc"


Try it: The signal handler will show a dialog and the slot will remove the widgets.

I'm working on a Plasma-like example.

