Re: commit: UT_String (longish)


Subject: Re: commit: UT_String (longish)
From: Mike Nordell (tamlin@algonet.se)
Date: Sun Jun 03 2001 - 02:45:08 CDT


Dom Lachowicz wrote:

> I still fail to see why adding a vftable is bad. I doubt that you'll ever
> convince me of this.

In that case I will stop even trying. I can, however, give you a hint. It has
to do with the design principle of C++ itself: if you don't use it, you
don't pay for it. Interfaces should be minimal and complete.
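
To put a number on "pay for it", here is a rough sketch. The two structs are made up for illustration and are not AbiWord classes, and the exact sizes are implementation-specific. It only shows that, on mainstream compilers, a single virtual function adds a hidden vftable pointer to every object:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical string layout without any virtual functions.
struct PlainString {
    char*       buf;
    std::size_t len;
};

// Same data, plus one virtual function. Typical compilers add a
// hidden vftable pointer to each object (an implementation detail,
// not something the C++ standard spells out).
struct VirtualString {
    char*       buf;
    std::size_t len;
    virtual ~VirtualString() {}
};

// Per-object overhead in bytes on the current platform.
inline std::size_t vftableOverhead() {
    return sizeof(VirtualString) - sizeof(PlainString);
}
```

On common 64-bit platforms vftableOverhead() typically comes out as one pointer per object, which every user of the class pays whether or not they ever call through the vftable.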

> [snip]
> > No no no no no no no! Absolutely NOT!
> >
> > UT_HashableString
[snip]
>
> You still haven't addressed my problems, or perhaps you still don't
> see them:

I think I see them, and I also know that in Java you have to use dynamic
cast. You are by the very design of the language forced to. You don't have
compile time type safety in the same way you have in C++. But C++ isn't
Java.

When instantiating std::map, you are instantiating a new class based upon
the given datatypes. Since we don't use templates we can't do this, but I
see no reason why we can't wrap type safe implementations around the current
<void*, void*> map. These classes can be as simple as inline functions that
merely cast and forward to the UT_hash_map class, or they can be as complex
as one wants. The point is that with this approach we maintain type safety.
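
A minimal sketch of such a wrapper follows. All names are invented for the example: UntypedMap is only a stand-in for the existing <void*, void*> map, not its real interface, and Widget is a hypothetical value type. The casts live in exactly one place, and callers only ever see typed signatures:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for the untyped <void*, void*> map.
// A fixed-size linear table, just enough to show the wrapper idea.
class UntypedMap {
public:
    UntypedMap() : m_count(0) {}

    // Stores the two pointers; the pointed-to data is not owned.
    bool insert(const void* key, void* value) {
        if (m_count == kCapacity) return false;
        m_keys[m_count]   = key;
        m_values[m_count] = value;
        ++m_count;
        return true;
    }

    // Keys are opaque here, so the caller supplies the equality test.
    void* find(const void* key,
               bool (*equal)(const void*, const void*)) const {
        for (std::size_t i = 0; i < m_count; ++i)
            if (equal(m_keys[i], key)) return m_values[i];
        return 0;
    }

private:
    enum { kCapacity = 16 };
    const void* m_keys[kCapacity];
    void*       m_values[kCapacity];
    std::size_t m_count;
};

struct Widget { int id; };  // hypothetical value type

// Type-safe facade: inline functions that cast and forward.
class StringToWidgetMap {
public:
    bool insert(const char* key, Widget* value) {
        return m_map.insert(key, value);
    }
    Widget* find(const char* key) const {
        return static_cast<Widget*>(
            m_map.find(key, &StringToWidgetMap::keysEqual));
    }

private:
    static bool keysEqual(const void* a, const void* b) {
        return std::strcmp(static_cast<const char*>(a),
                           static_cast<const char*>(b)) == 0;
    }
    UntypedMap m_map;
};
```

With this in place, something like map.insert(42, &widget) or map.insert("key", &someOtherType) is rejected by the compiler instead of silently ending up in the table.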

Take UT_String as the key, as an example:
Worried about memory management of the keys? Well, if we are only going to use
the hash values, the key should simply be a UT_uint32. If we also need to be
able to extract the original key values by iterating the map, let the class
UT_stringkey_hashmap handle that. Whether UT_String gets ref-counting itself or
UT_stringkey_hashmap takes care of it is of little or no concern
right now. (If UT_String should be ref-counted, let me know and I'll take
care of it.)

Perhaps it can be expressed like this: a map owns both the key and the value.
Should either of these data types be a pointer, the map owns only the
pointers, not the data.

This can be done without forcing another inheritance hierarchy, and without
adding unnecessary vftables to classes that don't need them.

> 1) Only pre-known types can be keys in the HashMap (or else you need a
> good bit of trickery to get around this)

Do you think that's a shortcoming? Do you really want to use unknown
datatypes as keys in a map designed and implemented to hold e.g. UT_Strings?
I sure don't. Static type safety is a *feature* of C++, not an obstacle to
overcome.

> 2) You still have my scenario where you want things other than strings to
> be keys....

As I see it there are a number of different maps here, differing in at least
key type. Apply my solution mentioned above with inline *typesafe*
forwarders, probably with specialization for calculating the hash values.
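
As a rough illustration of per-key-type hash calculation without templates, plain overloading is enough. The names and the hash formulas below are my own choices for the sketch, not existing AbiWord code, and HashValue assumes a 32-bit unsigned int:

```cpp
#include <cassert>

typedef unsigned int HashValue;  // stand-in for a 32-bit hash type

// One overload per supported key type. The compiler selects the
// right one from the static type of the key; an unsupported key
// type is a compile error rather than a runtime surprise.

// Integer keys: multiplicative hash (wraps modulo 2^32).
inline HashValue hashKey(unsigned int key) {
    return key * 2654435761u;
}

// C-string keys: djb2-style hash.
inline HashValue hashKey(const char* key) {
    HashValue h = 5381u;
    for (; *key; ++key)
        h = h * 33u + static_cast<unsigned char>(*key);
    return h;
}
```

Each typed map wrapper would then call hashKey() on its own key type and forward the resulting value to the shared untyped map.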
Please ask if anything is unclear.

> Wow.. take away templates and the whole world goes amuck?

Well, yes of course!

> rtti is great if we ever need to use it.

Sure, I agree. If we ever need to use it. I'm happy to say that so far I
haven't seen such a need in AbiWord, and I doubt I will.

> Considering we're not using type-safe values either, I
> fail to see your point.

Just because we currently use a lot of non-type-safe stuff, should we
continue down that track? As I hope is apparent, I think not.

> > First let me explain that these are two separate issues. You're trying
> > to cram two unrelated behaviors into one class. That is a design
> > error.
>
> This isn't necessarily the case, unless you're frowning on the whole
> concept of multiple inheritance too (in the C++ or Java-interface POV).

You are suggesting *implementing*, not inheriting, two different behaviors
in the same class. But OK, maybe I should have expressed myself as "I and
many others feel it is a design error to implement more than one behaviour
in a class". Inheriting behaviour through the use of mixin classes is
another matter.

> Here I probably am doing something wrong and it is probably a design
> error. In general, it need not be. There's a *very* important distinction
> between the two.

I'd still categorically say that in at least 99%, probably 100%, of the
cases where a class *implements* more than one behavior it's a design error. :-)

> > 4. It contains the Java construction "equals". Silly, and dangerous,
[...]
>
> #4 - what i was aiming for is perhaps the following Java-ish example:
>
> bool String.equals (Object o) {
> if (! (o instanceof String) )
> return false;
> else if (strcmp (this.buf, ((String)o).buf) == 0)
> return true;
> return false;
> }

I think I understand what you were aiming for, and I still disagree with it.
This is the type of code frowned upon by C++ developers. This is use of
RTTI, and it's conceptually wrong. Imagine you pass an OutputStream to that
method. It simply doesn't make any sense. There's no way an OutputStream
object can be the same as a String object.

In C++ we can *enforce* compile time safety. Let's use that to our
advantage.
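
For contrast, a sketch of the compile-time alternative. Text and OutputStream are hypothetical classes written only for this example; Text does not own its buffer, to keep the sketch short:

```cpp
#include <cassert>
#include <cstring>

// Hypothetical minimal string class. The constructor is explicit so
// that nothing gets converted to Text behind the caller's back.
class Text {
public:
    explicit Text(const char* s) : m_buf(s) {}
    const char* c_str() const { return m_buf; }
private:
    const char* m_buf;  // not owned in this sketch
};

class OutputStream {};  // an unrelated type

// Equality is defined for Text against Text and nothing else.
inline bool operator==(const Text& a, const Text& b) {
    return std::strcmp(a.c_str(), b.c_str()) == 0;
}

// Text t("x");
// OutputStream out;
// bool same = (t == out);  // rejected by the compiler: there is no
//                          // operator==(Text, OutputStream)
```

The nonsensical comparison never reaches a runtime type check, because it never compiles in the first place.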

> #5 - The argument about the copy constructor/clone is well taken. However,
> it's important to note you don't need to make copies of an object to
> ref-count it,

True, but it's kind of pointless to ref-count a single-instance object. It's
like ref-counting a singleton. Pointless.

The whole point of reference counted objects is that upon copy they increase
their ref-count, and upon destruction the ref-count decreases. That's the
way C++ works. I don't care how they do it in Snobol, Lisp or Java. Java has
no notion of object lifetime as C++ has, making comparisons pointless.
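
A bare-bones sketch of that mechanism, with invented names (RefString is not the actual UT_String), not thread-safe and with no copy-on-write:

```cpp
#include <cassert>
#include <cstring>

// Hypothetical ref-counted string: copies share one buffer.
class RefString {
public:
    explicit RefString(const char* s) : m_rep(new Rep(s)) {}

    // Copying shares the representation and bumps the count.
    RefString(const RefString& other) : m_rep(other.m_rep) {
        ++m_rep->refs;
    }

    RefString& operator=(const RefString& other) {
        ++other.m_rep->refs;  // increment first: safe on self-assignment
        release();
        m_rep = other.m_rep;
        return *this;
    }

    // Destruction drops the count; the last owner frees the buffer.
    ~RefString() { release(); }

    const char* c_str() const { return m_rep->text; }
    unsigned refCount() const { return m_rep->refs; }

private:
    struct Rep {
        explicit Rep(const char* s)
            : refs(1), text(new char[std::strlen(s) + 1]) {
            std::strcpy(text, s);
        }
        ~Rep() { delete[] text; }
        unsigned refs;
        char*    text;
    };

    void release() {
        if (--m_rep->refs == 0) delete m_rep;
    }

    Rep* m_rep;
};
```

The counting happens entirely in the copy constructor, the assignment operator and the destructor, so user code never touches the count by hand.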

Manual reference counting *is* error prone; you have seen that too.

> though your argument holds for Objects created on the stack instead of the
> heap. I'm not a big fan of creating Objects on the stack, though...

My argument holds its own weight, even if you create heap objects.
Correction, *especially* when you create heap objects. That's where you *get*
memory leaks. (I have yet to see a stack object leak...)
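
A small sketch of the difference, using a hypothetical instance-counting class (Tracked and the two helper functions are made up for this example):

```cpp
#include <cassert>

// Hypothetical class that counts how many instances are alive.
class Tracked {
public:
    Tracked() { ++s_live; }
    Tracked(const Tracked&) { ++s_live; }
    ~Tracked() { --s_live; }
    static int live() { return s_live; }
private:
    static int s_live;
};
int Tracked::s_live = 0;

// Stack object: destroyed at the closing brace, no action required.
inline int liveAfterStackUse() {
    {
        Tracked t;
    }
    return Tracked::live();
}

// Heap object: stays alive until somebody remembers to delete it.
inline Tracked* makeOnHeap() {
    return new Tracked;
}
```

After liveAfterStackUse() the live count is back where it started. After makeOnHeap() it stays one higher until the caller deletes the returned pointer, and a forgotten delete is exactly the leak.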

/Mike


