
Analyzing Application Core Dump on Solaris Made Simple



Of late, I have been trying to find the source of a crash on Solaris. We have a C++ application that runs on both Windows and Solaris, and we searched lots of sites to learn how to debug a crash on Solaris. On Windows, with our previous experience, it was simple, as we knew more or less all of the tools and were comfortable with them. Solaris was somewhat new to us as a development platform, as we were not very experienced with it. From the sites I searched for help and from some of my own experiences, I have assembled some tips and tricks for arriving at the source of a crash.

On a developer machine where you have dbx installed

Get the core file and the binary from the customer site and copy them to a machine that has dbx installed (it comes with Sun Studio). Load the core file using dbx:

shellPrompt>>dbx myBinary myWorrisomeCoreDump
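
If you do not yet have a customer core at hand and want to rehearse these steps first, a deliberately broken program will produce one. This is a minimal sketch, assuming Sun Studio's CC compiler and a ksh/bash-style shell; the file name crashme.cc is invented for illustration:

// crashme.cc: crashes on purpose so you can practise core-dump analysis
#include <stdio.h>

int main()
{
    int *p = 0;            /* null pointer                    */
    printf("%d\n", *p);    /* dereferencing it raises SIGSEGV */
    return 0;
}

shellPrompt>>CC -g crashme.cc -o crashme
shellPrompt>>ulimit -c unlimited
shellPrompt>>./crashme

The resulting core file can be loaded into dbx exactly as above.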

You can use dbx to determine the abnormal thread that caused the core dump by typing the following at the dbx prompt:

dbxPrompt>>threads

This command gives you the list of threads that were active when the process crashed. The thread marked with “o” is the culprit. Switch to the thread that caused the crash inside the dbx prompt itself:

dbxPrompt>>thread t@5 (assuming, for example, that thread t@5 is the one that crashed)

Now you can see where the thread crashed. To display what caused the error, type:

dbxPrompt>>where

And you now have some amount of stack trace to work with.
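
From here, dbx's frame-navigation commands are handy for inspecting the crashed thread more closely (up, down, and print are standard dbx commands; myVariable is just a placeholder name):

dbxPrompt>>up (move one frame up the call stack)

dbxPrompt>>down (move one frame back down)

dbxPrompt>>print myVariable (print a variable visible in the current frame)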

On the machine itself, which does not have dbx

====================================

You can copy the dbx binary (just a single file sufficed for me) to some temporary location and do the same as above.

OR

See which thread caused the core dump using pflags:

shellPrompt>>pflags myWorrisomeCoreFile

/1: flags = STOPPED why = PR_SUSPENDED sigmask = 0x00000004,0x00000000

/2: flags = STOPPED recv(0x4,0x2e0fee8,0x1,0x0) why = PR_SUSPENDED sigmask = 0x00000004,0x00000000

/3: flags = STOPPED lwp_park(0x4,0x0,0x0) why = PR_SUSPENDED sigmask = 0x00000004,0x00000000

/4: flags = 0 sigmask = 0xffffbefc,0x0000ffff cursig = SIGABRT
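
If the core has many LWPs, it can be quicker to filter the pflags output for the signal line directly (same hypothetical core file name as above):

shellPrompt>>pflags myWorrisomeCoreFile | grep cursig

Only the thread that took the fatal signal carries a cursig entry, so this prints just the /4 line from the listing above.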

Check which of the threads has something like cursig = SIG*; that thread is the one that caused the core dump. Here it is thread 4 that caused the crash. Now run pstack on the core file to see the stacks:

shellPrompt>>pstack myWorrisomeCoreFile

----------------- lwp# 4 / thread# 4 --------------------

005edafc __1cH__rwstdJ__rb_tree4nGString_nDstdEpair4Ckn0B_n0B___n0AL__select1st4n0D_n0B___n0CEless4n0B___n0CJallocator4n0D____Oconst_iterator2i6M_r5_ (fe579c8c, fffffff4, 270f4a0, 29a2a00, 0, fe579ce8) + dc

005edd60 __1cHAddressFtoUri6kM_nGString__ (fe57a2b4, 298da08, 258ce31, fe579c8c, 2499a8a, 6) + 1f0

005ecc48 __1cHAddressItoString6kM_nGString__ (fe57a2b4, 298da08, fe57a3e8, 25a019c, 0, 270f518) + 10

016563c0 __1cMXOConnectionOhandleMessage36MrknCsp4nHMessage____b_ (29852d0, fe57b188, 258ce31, 258ce4d, 2499a8a, 6) + 248

01655820 __1cMXOConnectionOhandleMessage26MrknCsp4nHMessage____b_ (29852d0, fe57b188, 3eeedec, feaa2a00, 298d9e8, 6) + 98

01655758 __1cMXOConnectionNhandleMessage6MknCsp4nHMessage____b_ (29852d0, fe57b188, fed73700, feaa2a00, 298d9f0, 6) + 10

You can redirect this output to a text file, then go through the per-thread stack traces and correlate them with the thread identified in the pflags output.
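
For example, with the same hypothetical file names:

shellPrompt>>pstack myWorrisomeCoreFile > stacks.txt

shellPrompt>>pflags myWorrisomeCoreFile > flags.txt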

Alternatively, you can also verify with adb:

shellPrompt>>adb myWorrisomeCoreFile

Once the core file is loaded, check the trace of where it crashed by typing the following at the adb prompt:

adbPrompt>>$c
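
On recent Solaris releases, adb is provided as a compatibility mode of mdb, the Modular Debugger, so the same check works there too; $c prints the stack trace at the mdb prompt as well:

shellPrompt>>mdb myWorrisomeCoreFile

mdbPrompt>>$c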

Understanding the stack trace: __1cHAddressFtoUri6kM_nGString__ means class “Address”, method “toUri”, returning “String”. I cannot tell you right now how to distinguish a return value from a parameter in the encoding: __1cH “Address” F “toUri” 6kM_nG “String” __.
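
Rather than decoding Sun's name mangling by hand, you can let a tool do it. Sun Studio ships a demangler, dem, alongside the compilers (its location and availability may vary by Sun Studio version); given a mangled symbol, it prints the demangled C++ signature:

shellPrompt>>dem __1cHAddressFtoUri6kM_nGString__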

If you want to know where in your application the crash occurred, instead of just identifying the function name

====================================================================================

Suppose you have the stack trace in dbx and the address of the call that crashed, something like:

005edafc __1cH__rwstdJ__rb_tree4nGString_nDstdEpair4Ckn0B_n0B___n0AL__select1st4n0D_n0B___n0CEless4n0B___n0CJallocator4n0D____Oconst_iterator2i6M_r5_ (fe579c8c, fffffff4, 270f4a0, 29a2a00, 0, fe579ce8) + dc

005edd60 __1cHAddressFtoUri6kM_nGString__ (fe57a2b4, 298da08, 258ce31, fe579c8c, 2499a8a, 6) + 1f0

005ecc48 __1cHAddressItoString6kM_nGString__ (fe57a2b4, 298da08, fe57a3e8, 25a019c, 0, 270f518) + 10

016563c0 __1cMXOConnectionOhandleMessage36MrknCsp4nHMessage____b_ (29852d0, fe57b188, 258ce31, 258ce4d, 2499a8a, 6) + 248

01655820 __1cMXOConnectionOhandleMessage26MrknCsp4nHMessage____b_ (29852d0, fe57b188, 3eeedec, feaa2a00, 298d9e8, 6) + 98

Assume we are interested in digging further into the XOConnection handleMessage3 function, since the functions above it in the stack belong to tried-and-tested classes and are unlikely to contain the error. Load the binary and the core file in dbx as described above, and then type:

dbxPrompt>>dis handleMessage3 /300
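
As a quick check on the address arithmetic: the pstack frame reads 016563c0 ... + 248, which means the faulting program counter is hex 248 bytes past the entry point of handleMessage3. The function therefore starts at 0x16563c0 - 0x248 = 0x1656178, and the instruction to hunt for in the disassembly is the one at address 16563c0.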

We are now disassembling the next 300 instructions from the start of handleMessage3, since we know the crash occurs at a memory offset of hex 248. Copy the disassembly into an editor and search for the address 16563c0; it sits exactly at an offset of hex 248 from the beginning of the function. From here you will need some experience, some assembly-language knowledge, and some judgment to work out where the crash happened. Search for recognizable names a few lines before and after the crashing instruction, and you will be in a somewhat better position to understand where the problem could lie. In my case, the lines above contained “operator++” and some lines below contained “getLogLevel”, both of them user-defined functions of mine. So I went back to my handleMessage3 source code and examined the lines between the “operator++” and the “getLogLevel” call for a possible problem.
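
To make the payoff concrete: the top frame here is an rb-tree iterator increment inside a map traversal, and a classic source of such crashes is invalidating an iterator mid-loop. The snippet below is purely a hypothetical sketch of that bug class; the method and helper names echo the frames above, while sessions_ and the surrounding structure are invented for illustration, not the application's real code:

// Hypothetical sketch of an iterator-invalidation bug of the kind this
// workflow can surface. NOT the real code from the crashing application.
#include <map>
#include <string>

class XOConnection
{
    std::map<std::string, int> sessions_;   // invented member

    int getLogLevel() const { return 0; }   // invented helper

public:
    void handleMessage3(const std::string& key)
    {
        std::map<std::string, int>::iterator it = sessions_.begin();
        while (it != sessions_.end())
        {
            if (it->first == key)
                sessions_.erase(it);   // BUG: 'it' is now invalid...
            ++it;                      // ...so this operator++ can crash
            (void)getLogLevel();       // the call seen below the crash site
        }
    }
};

The classic pre-C++11 fix is to write sessions_.erase(it++) and skip the unconditional increment for that iteration, so the iterator is advanced before the erased element disappears.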

I have had some respite, but there is still a lot left to solve. As the saying goes, “Miles to go before I sleep”.


Source: Asif Khan R
