I hope that you all had the time to get your favourite OOo flavour built before this post. We will now enter the real hacking part. As an example hack, I will use the issue #2838 about auto-correction replacements and caps. We will need the following steps to fix the bug:
- Locate the concerned code
- Understand it and fix it
- Create a patch
Show me the code!
How is it organized?
First of all, you need a global understanding of how the OpenOffice.org and Go-oo source code is organized. We will start with the organization of the OpenOffice.org sources as these are included in Go-oo. There is a lot of code in OpenOffice.org, and it has all been split into what are often called modules. These come from the CVS days and are the root folders of the OOo sources. You probably won't go through all of them, so here are some important ones:
- sw, sd, sc, starmath: these contain the sources of the main applications of OOo, respectively Writer, Impress/Draw, Calc and Math
- sfx2, svx, svtools, editeng, svl, framework: these contain code common to all the apps. A lot of interesting stuff sits here
- offapi: contains all the IDL files for the UNO API definition
- xmloff: contains the ODF filters code
- instsetoo_native: the top module, which depends on all the others. It's the one doing the final packaging
- solenv: contains the build macros and tools
For the other modules, the best is to discover them when you need them. There is a special folder named solver gathering all the build results: of course this is not really a module. You can just ignore everything in binfilter as this is quite old and more or less duplicated code.
On the Go-oo side, things are pretty much the same, except that the upstream sources are contained in the build directory. The folder name is the milestone name and looks like dev300-mXX or oooXXX-mYY. Here are the important folders for Go-oo sources:
- build: contains all the build results, including the upstream sources unpacked and patched
- patches: contains the patches to apply. The patches/dev300/apply file describes which patch to apply during the build
- src: contains all the needed sources before unpacking them
- bin: contains the scripts used to build the sources as well as some other useful tools
- doc: contains some hackers notes and patches to explain some parts of the code
How do I find the code I need to hack?
One of the easy ways to find some piece of code is to use the UI strings. It's much better to use an en-US build (the default one) to ease the task. In the case of the #2838 bug, here are a few steps leading to useful information:
- Run OpenOffice.org and open the Tools > Autocorrect options menu. Then go to the Options tab. The string "Use replacement table" is very likely to lead us to the option enabling the replacement... and then to the replacement code itself.
- In OpenGrok, run a full search with that string. Don't forget to add the double quotes to restrict the results. For the OpenGrok search you need to specify the milestone of your sources; in most cases there won't be any big difference... but who knows?
- Spot the .src files in the results of the previous search: these contain the en-US messages for the UI. In our case there are two of them: one in sw and the other in cui (or svx for ooo320-mXX milestones). A search for the identifier of the string (ST_USE_REPLACE in our case) will help you find the cxx file in which this string is used (cui/source/tabpages/autocdlg.cxx in our case).
- Now that we have some starting point in the code, we can read the surrounding code to find out the interesting place to hack. In our case, the UI string is used in the OfaSwAutoFmtOptionsPage class representing the tab page of the dialog. This class is using SvxSwAutoFmtFlags::bAutoCorrect and SvxAutoCorrect to store the option: searching for them in the code will lead us to the code actually doing the replacement.
- Using that method we discovered that all the auto-corrections are applied in SwAutoFormat::AutoCorrect and SvxAutoCorrect::AutoCorrect methods. We now only have to implement the changes in the methods SwAutoCorrDoc::ChgAutoCorrWord and SvxAutoCorrDoc::ChgAutoCorrWord.
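If you prefer to search your local checkout rather than OpenGrok, the same two lookups can be sketched with grep. This is only a minimal sketch: the helper names find_ui_string and find_string_users are made up for this example, and it assumes a grep that supports -r and --include (GNU grep does).

```shell
#!/bin/sh
# Sketch: trace a UI string to the C++ code that uses it.
# find_ui_string and find_string_users are hypothetical helper names.

# List the .src resource files that contain the exact UI string.
find_ui_string() {
    # $1: source root, $2: UI string as shown in the en-US build
    grep -rlF --include='*.src' -- "$2" "$1"
}

# List the C++ files that reference a resource identifier found in a .src file.
find_string_users() {
    # $1: source root, $2: identifier, e.g. ST_USE_REPLACE
    grep -rlw --include='*.cxx' --include='*.hxx' -- "$2" "$1"
}

# Example, run from the root of the sources:
#   find_ui_string . "Use replacement table"
#   find_string_users . ST_USE_REPLACE
```

The first call plays the role of the quoted full search, the second one the role of the identifier search. On a complete source tree this is much slower than OpenGrok, so restricting the root to a few modules helps.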
Understanding the code and fixing it
If you are hacking Go-oo, you need to run some commands before changing any line in the code. Jump to the next section to ease your life!
ctags, opengrok and doxygen
Understanding the code means being able to get the correct information about the surrounding code. There are several tools to help the developer in that task. I'm mostly using ctags and the Go-oo doxygen documentation (and of course OpenGrok) to understand the OOo code. To easily navigate in the source code, I usually create a ctags database and use it in VIm (you can use it in Emacs as well but I don't know how it works).
In any case installing ctags is mandatory:
sudo zypper in ctags
To create your ctags database, on Go-oo:
make tags
On OpenOffice.org you will need to do basically the same, but for this you will need to download the script and run it from the root folder of the sources.
Then in VIm you can use Ctrl+] to jump to the definition of the selected type, and Ctrl+T to jump back to the previous place. The command :ts SomeType will also give all the matching places found in the ctags database. Generating the tags database will also allow you to have automatic completion in VIm thanks to OmniCppComplete. You should search for the previous VIm tips I posted: they are really useful when hacking OOo.
For VIm to find your tags file, don't forget to add this to your ~/.vimrc file:
" Tags files are searched first relative to the current file, then relative to
" the current working directory, and last in the $HOME directory.
set tags=./tags,./../tags,./../../tags,./../../../tags,./../../../../tags,./../../../../../tags,tags,../tags,../../tags,../../../tags,../../../../tags,../../../../../tags,~/tags
Hacking
After the hack you will need to rebuild and test. There is no need to rebuild everything as described in the previous blog post: you only need to rebuild the modules you have changed. Of course some modules depend on others and you may need to rebuild more modules. Before anything else you need to set up the environment variables for building OpenOffice.org. If you are hacking Go-oo it all happens in the build/<milestone> folder containing the upstream sources.
To set up the environment:
. Linux*Env.Set.sh
Go into the module to rebuild and run the following command. Note that the arguments provided here are useless if you aren't building with icecream:
build -P6 -- -P6
Now the hacked sources are built and the results have been generated in the $INPATH folder of the module. If you are hacking Go-oo there is nothing more to do to test as the dev-install links to these folders. If you are hacking vanilla OpenOffice.org you need to copy the .so and .res files to the appropriate place in the installation (you will almost certainly need to give your user write permission on the existing files).
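For the vanilla OpenOffice.org case, the copy step could look like the sketch below. The helper name install_module_output is made up, and the folder layout is an assumption to check against your own tree: libraries in the lib subfolder and resources in the bin subfolder of the module output, with resources installed in a resource folder next to the libraries.

```shell
#!/bin/sh
# Sketch: copy freshly built libraries and resources into an existing install.
# install_module_output is a hypothetical helper; the lib/ and bin/ output
# subfolders and the resource/ target folder are assumptions about the layout.

install_module_output() {
    # $1: module output folder (the module's $INPATH folder)
    # $2: program folder of the installation to update
    out=$1
    prog=$2
    for lib in "$out"/lib/*.so; do
        [ -e "$lib" ] || continue      # no library built: nothing to copy
        cp -f "$lib" "$prog/" || return 1
    done
    for res in "$out"/bin/*.res; do
        [ -e "$res" ] || continue      # no resource file built
        cp -f "$res" "$prog/resource/" || return 1
    done
}

# Example, from inside the rebuilt module (the install path is a placeholder):
#   install_module_output "$INPATH" /path/to/install/program
```

Keeping a backup of the original files before the first copy makes it easy to get back to a working install.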
In any case, if you want to be sure that all the dependent modules are rebuilt, you need to run the following command in the instsetoo_native module (and reinstall OOo for the upstream version):
build --all -P6 -- -P6
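When you know which modules you touched, a lighter alternative to the full rebuild is to rebuild just those modules one after the other. Here is a minimal sketch of that idea: rebuild_modules and the BUILD_CMD variable are names invented for this example, and the order of the modules is yours to get right (lowest-level module first).

```shell
#!/bin/sh
# Sketch: rebuild several modules in order, stopping at the first failure.
# rebuild_modules and BUILD_CMD are hypothetical; BUILD_CMD defaults to the
# per-module build command used in this post.

rebuild_modules() {
    # $@: module folders relative to the current folder, lowest-level first
    for module in "$@"; do
        echo "Rebuilding $module"
        # The subshell keeps the caller's working directory unchanged.
        ( cd "$module" && ${BUILD_CMD:-build -P6 -- -P6} ) || {
            echo "Build failed in $module" >&2
            return 1
        }
    done
}

# Example, from the source root after sourcing the environment script
# (the module names are only an illustration):
#   rebuild_modules editeng sw
```

Unlike build --all, this does not rebuild the modules depending on the ones you changed, so it is only safe when your change doesn't alter anything those modules rely on, such as exported headers.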
Debugging
Debugging is often necessary to see what is going wrong. As you may have seen, we haven't used the configure options to build the debug symbols: that was on purpose to avoid building loads of unnecessary debug symbols (they take up space on the hard disk and slow down the build). Here is how to get them for a given module.
First remove the output folder for the module
rm -r $INPATH
Then rebuild with the debug symbols
build debug=t -P6 -- -P6
Now that the symbols are generated, you will be able to use gdb to actually debug. The goal of this tutorial isn't to make a master debugger of you, but here are some useful commands and tips.
- You can find a copy of my .gdbinit file on the Go-oo git repository. It contains useful functions grabbed here and there. Have a close look at the ptu and pou functions: they will help you print OpenOffice.org strings (which aren't just char*).
- To start gdb with OOo, go to the program folder of your install and run gdb ./soffice.bin. If you are running a dev-install of Go-oo, sourcing ooenv is mandatory before that.
- To place a breakpoint, use the b command in gdb. The easiest way is to give the filename and line where to stop, for example: b doc.cxx:501
- The basic commands to know are run, step, next, continue and print. I hope their names are clear enough for you (or you will need to read gdb help).
You can find a lot of debugging tricks on the Go-oo DebuggingIt page.
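Typing the same breakpoints at every session gets boring quickly, so they can be kept in a gdb command file and loaded with gdb -x. The sketch below generates such a file; write_gdb_commands is a made-up helper name and /tmp/ooo.gdb is just an example path. It turns on pending breakpoints because most of the OOo code lives in shared libraries that are not loaded yet when gdb starts.

```shell
#!/bin/sh
# Sketch: write a gdb command file with the breakpoints for a debugging session.
# write_gdb_commands is a hypothetical helper name.

write_gdb_commands() {
    # $1: command file to create, remaining arguments: breakpoint locations
    file=$1
    shift
    # Accept breakpoints in libraries that are loaded later, without prompting.
    echo "set breakpoint pending on" > "$file"
    for location in "$@"; do
        echo "break $location" >> "$file"
    done
    echo "run" >> "$file"
}

# Example, from the program folder of the install:
#   write_gdb_commands /tmp/ooo.gdb doc.cxx:501
#   gdb -x /tmp/ooo.gdb ./soffice.bin
```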
Creating the patch
Now that you have hacked OOo to get your feature implemented, you will need a patch to show to the other devs. This is done differently depending on whether you are hacking OpenOffice.org or Go-oo. Note that Go-oo users need to run some commands before actually hacking.
The OpenOffice.org way
As the sources you have hacked are a local mercurial repository, you simply need to go through the following steps:
hg add path/to/each/new/file
hg commit #You will need to type a message after that
hg export tip >myhack.diff
The last command may need to vary if you ran several commits.
The Go-oo way
Before changing anything in the code, we need to create a new git repository in the build/<milestone> folder: that will simplify the patch creation later.
cd build/<milestone>
../../bin/create-gitignores.sh && git add . && git commit -m "initial commit" && git checkout -b myhack
Now everything is ready: you can hack quietly and continue with these steps when your changes are ready to be shown to the world.
git add path/to/each/new/file
git commit -a #You will need to type a message after that
git diff --no-prefix master >myhack.diff
In this case you will be working in a new branch and keep the master branch untouched. You can commit as many times as you want: the command to create the diff will always be the same. There are plenty of ways to handle the patches; a very nice one is described by Thorsten using stg.
The most complex task is left to you: getting familiar with the parts of the OOo code that interest you. Of course nobody knows it all (or I would really be impressed), but the more you hack, the more you will learn. Don't forget to ask other developers on IRC or on mailing lists: others can often have ideas to help you.
I still haven't really thought about the next step, stay tuned...
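Before sending the diff around, it is worth checking that it applies to the untouched master branch. Here is a minimal sketch of such a check: patch_applies is a made-up helper name, and it assumes all your changes are committed so that switching branches is safe.

```shell
#!/bin/sh
# Sketch: check that a diff made with --no-prefix applies to a clean branch.
# patch_applies is a hypothetical helper name.

patch_applies() {
    # $1: patch file, $2: branch to test against (master in this post)
    patch=$1
    branch=$2
    current=$(git rev-parse --abbrev-ref HEAD) || return 1
    git checkout -q "$branch" || return 1
    # -p0 is needed because --no-prefix drops the a/ and b/ path prefixes.
    # --check only tests the patch, it does not modify any file.
    git apply --check -p0 "$patch"
    status=$?
    git checkout -q "$current" || return 1
    return $status
}

# Example, from the build/<milestone> folder:
#   patch_applies myhack.diff master && echo "patch is fine"
```

Anybody applying your diff with git will need the same -p0 option.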