Wednesday, October 27

UML, OO models and Abstraction

There's a hierarchy of sorts of abstraction away from machine hardware - you start with assembler, working with registers, raw memory, I/O ports, interrupts etc. - but it gets more woolly and disintegrates into something resembling a spaghetti ATN the further you work your way up.

C takes you one layer up from that - it gives you functions, typing and normal maths notation, but you are still working with raw lumps of memory (structs) and there's pretty much a one-to-one mapping between expressions and machine instructions - the size of the assembler output is going to be roughly proportional to the size of the source input.

"C with classes" C++ gives you some more layers - You can take a hunk of raw memory and associate it with a set of defined operations and acess it through those operations. You can do this in the above lanuages, too with sufficently disciplined programming, but C++ gives you encapsulation and operator overloading to make it more syntatically "nice". You still have to worry about the consequences of allocating things in memory and where they live - stack, heap, static..

"Polymorphic/OO C++" gives you other tools: the ability to pretend one set of objects are all the same thing - polymorphism, virtual functions, interfaces, inheritance. This saves a lot of repeat coding and lets you abstract away implementation details - for example a car, a character, a prop are all "objects in the world" and can be manipulated via a common "objects in the world" interface : no specific per - case coding.

This leads to..

"Template/Boost/STL C++" gives you the ability to do something like polymorphism at compile time - templates that take parameters, which leads to a lot of compile-time optimisations, and for the first time in the C++ learing curve, code which outperforms C (if written *properly*). Also, the STL takes the level of abstraction offered by the previous level one step further, seperating data and algorthims - on the one hand there's a set of containers: list, vector, map; and on the other hand a set of sorting, searching algorithims that work for every container and don't have
to be specialised for particular containers. STL has a weakness that's inherent to C++ - it has to manage memory allocation and does so nearly exclusively on the heap, although there is a mechanism to abstract away the allocations (an allocator object) and make the code independent of memory-management mechanisms (ie, not care wether the container is on the heap, static, or on the stack), it's flawed..(or at least so it's claimed by people who should know).
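The "algorithms on one side, containers on the other" split can be sketched roughly like this. It's Python, so everything is resolved at run time and it says nothing about the compile-time side of templates; count_if and find_first are just names I've picked for the illustration.

```python
from collections import deque


def count_if(items, predicate):
    """Count the elements for which predicate(element) is true."""
    return sum(1 for item in items if predicate(item))


def find_first(items, predicate):
    """Return the first matching element, or None if there isn't one."""
    for item in items:
        if predicate(item):
            return item
    return None
```

The same two functions work unchanged on a list, a tuple or a deque, because all they ask of the container is that it can be iterated over.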


Python gives you a slightly different set of abstractions that do the same job as C++'s: it doesn't do static typing and pushes function call and object resolution to run time, so if you call object.method(), as long as your object implements method() it won't care what kind of object it is. It also frees you from the problem of allocating/deallocating raw memory, which C++ doesn't. It's implemented to be easily interfaceable to C++/C and reads like "executable pseudo code". It's therefore an ideal prototyping language.
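A tiny sketch of the "as long as it implements method() it won't care" point. Duck, Robot, Rock and make_it_speak are invented for the example; the classes share no base class at all.

```python
class Duck:
    def speak(self):
        return "Quack"


class Robot:
    def speak(self):
        return "Beep"


class Rock:
    pass  # deliberately has no speak() method


def make_it_speak(thing):
    # No type check here: .speak is looked up on the object at run time.
    return thing.speak()
```

make_it_speak happily takes a Duck or a Robot. Hand it a Rock and nothing complains until the call is actually made, at which point you get an AttributeError. That's the price of pushing the resolution to run time.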

Lisp takes a completely different approach to providing these abstractions, one that is still boiling my brain. Rather than trying to provide abstractions that decouple code and data, code *is* data - a function is just a list of instructions, manipulable by list-manipulating primitives, assignable to variables, etc. I suspect it to be ideal for genetic programming and self-modifying structures at a high level of abstraction, hence its niche in AI and 'hard' research problems. The nice thing about this approach is that, like C++, your degree of abstraction is entirely up to you - you can go with typeless programming à la Python, or you can declare your variables typed and get full-on C-type optimisation. You can construct code in lists at compile time that has the same effect as template expansion. I haven't got into the object system yet... but I fully expect to find it already doing things the way it took C/C++ the last 20 years to discover.
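Here's a rough Python stand-in for the "code is data" idea. It is not Lisp and doesn't pretend to cover macros: it just represents a little program as a nested list in prefix form, evaluates it, and rewrites it with ordinary list operations. The names OPS, evaluate and swap_operator are my own for the sketch.

```python
import operator

# Assumed toy operator table; a real Lisp has far more than this.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate(expr):
    """Evaluate a nested prefix list such as ["+", 1, ["*", 2, 3]]."""
    if not isinstance(expr, list):
        return expr  # a bare number evaluates to itself
    op, *args = expr
    values = [evaluate(arg) for arg in args]
    result = values[0]
    for value in values[1:]:
        result = OPS[op](result, value)
    return result


def swap_operator(expr, old, new):
    """Return a copy of the program with every `old` operator replaced by `new`."""
    if not isinstance(expr, list):
        return expr
    op, *args = expr
    head = new if op == old else op
    return [head] + [swap_operator(arg, old, new) for arg in args]
```

With program = ["+", 1, ["*", 2, 3]], evaluate(program) gives 7, and evaluate(swap_operator(program, "*", "+")) gives 6. The program was mutated the same way you'd edit any other list, which is the sort of thing I imagine genetic programming wants.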

Finally: I'm saying that the mechanism a language uses to provide abstraction away from the machine hardware has consequences, that they're inescapable, and that it may also close off other possible abstractions.

Where does UML fit into this hierarchy? It's another step up, isn't it, away from the machine? The fact that it's OO-oriented troubles me though. LISP, Python and C++ all differ in the way they implement OO - I can't see UML being free of assumptions about how OO is implemented.

C++ has private members, Python makes everything public, LISP allows you to decide the precedence of multiply inherited classes on a case-by-case basis, Java doesn't do multiple inheritance. It's like the old days when everyone had their own BASIC - everyone has their own OO model.
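To show what one of these OO models looks like in practice, here's a small sketch of Python's choices only (I'm not attempting the C++, Lisp or Java equivalents). Account, Base, Left, Right and Both are invented names.

```python
class Account:
    def __init__(self, balance, pin):
        self._balance = balance  # one underscore: "private" by convention only
        self.__pin = pin         # two underscores: name-mangled to _Account__pin


class Base:
    def who(self):
        return "Base"


class Left(Base):
    def who(self):
        return "Left"


class Right(Base):
    def who(self):
        return "Right"


class Both(Left, Right):
    pass  # inherits who() from whichever base comes first in the MRO
```

Nothing stops outside code from reading acct._balance, or even acct._Account__pin if it knows the mangled name. And Both().who() returns "Left", because Python fixes the lookup order from the order the bases are listed in (Both.__mro__ runs Both, Left, Right, Base, object) rather than letting you pick per case.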

Wotta pain. Ouch!
