From: Johnny Lee (firstname.lastname@example.org)
Date: Fri Oct 03 2003 - 01:25:47 EDT
There are several speedups here and some size reduction with these changes.
Import and building the RTF doc has been reduced from 23 seconds to ~15
Some of the changes may not be acceptable since I'm concentrating on
speeding up loading the RTF spec and I haven't been able to determine if
some of my assumptions are valid.
I'm just proposing a solution. It may not be the best solution since I
barely know the code and the coding style. I did increase the size of
fp_Container objects by 4 bytes and the size of UT_Stringbuf objects by 4
bytes. Feel free to code up a different solution.
The code for change #3 is in the attached bug5291e.txt.
The code for change #1, #2, and #4 is in bug5291f.txt.
After profiling Abiword with all the previous 5291 patches, the function
listed at the top was fp_TableContainer::getCellAtRowColumn.
Looking at the code for this function, there's a linear search thru the
cells in a table to find the cell in the appropriate spot.
When I dumped the position portion of the cell objects, they turned up in
sorted order, from top to bottom, left to right.
*** If this assumption is not correct, then the patch will probably fail to
work correctly. ***
Using that sorted array, I replaced the linear search with a binary search.
This saved about 4 seconds on building the document.
Profiling the new version, the top function listed was
fp_Container::clearBrokenContainers. This is a recursive function which is
called millions of times while building the document.
It took me a long time to figure out the function of 'broken containers'.
Robert Wilhelm had a patch so that clearBrokenContainers would only write to
the m_pMyBrokenContainer field if it was non-NULL.
I added code to see how often the code NULLed out this field.
Less then 20 times for the RTF spec! Millions of calls to NULL a variable
less than 20 times.
The reason it has to do this recursively is that the parent container
doesn't know anything about broken containers in its children.
And how many broken containers are there?
fp_Container::setMyBrokenContainer is only called about 120 times.
So I added a field to fp_Container which represents the number of broken
containers in the current container and any children.
When fp_Container::setMyBrokenContainer is called, the code goes from the
current container up to its parent, grandparent, etc. and increments the
count of broken containers in each of them.
When fp_Container::clearBrokenContainers is called, if m_pMyBrokenContainer
is cleared, the broken--container-count is decremented in the current
container, and in its parent and grandparent, etc.
When it loops through the children containers, the code checks their
broken-container-count. If it is zero, then we don't bother calling
clearBrokenContainers on them.
This saves a couple of seconds building the RTF spec.
Profiling again, showed the following functions near the top:
Looking at the code, UT_StringPtrMap::pick creates a UT_String object which
allocates space for the string, copies the string, and gets passed to
UT_StringPtrMap::find_slot. Afterwards, the UT_String is destroyed which
includes freeing the allocated string.
I modified UT_Stringbuf and UT_String so that you can pass in a
null-terminated string and no memory is allocated for the string, the code
just references the string passed in - ideal when used for a read-only
string, such as checking if something is in a hash table.
My code changes add a few checks to the base string classes so that you
can't modify the external string. You may want to come up with another
solution that doesn't involve such extensive changes in the class. The idea
is to get rid of the memory allocation/deallocation, etc. when the string is
only examined. There are other places in ut_hash.cpp where this occurs as
Finally, a patch that deals more with reducing size than speed. I noticed in
fp_TableContainer.cpp that there's lotsa code similar to this:
getNthRow(i)->requisition = 0;
getNthRow(i)->allocation = 0;
getNthRow(i)->spacing = m_iRowSpacing;
getNthRow(i)->need_expand = 0;
getNthRow(i)->need_shrink = 0;
getNthRow(i)->expand = 0;
getNthRow(i)->shrink = 0;
I looked at the disassembly of the code. The compiler didn't realize that
the result from getNthRow(i) was the same for every line in this block of
code. So the compiler generated code which would calculate the result for
I created a variable and assigned the result from getNthRow/getNthCol to the
variable and used the variable instead.
This saved about 16 bytes each time with VC6.
Instant message during games with MSN Messenger 6.0. Download it now FREE!
This archive was generated by hypermail 2.1.4 : Fri Oct 03 2003 - 01:41:20 EDT