Best Practices for OO Perl (OSCON2005 Tutorial)

Damian Conway is giving a tutorial on Best Practices in Object Oriented Perl based on his new book: Perl Best Practices. What is a “best practice?”
- Same as the rest of Perl
Seek code that
- minimizes chances of “enbugging”
- makes it easy to detect failed edge cases
- scales to larger datasets
- Robust (use techniques that extend to incorporate new functionality)
- Efficient (play to Perl’s strengths and avoid its weaknesses while minimizing resource usage)
- Maintainable (optimize for comprehension)
Make OO a choice, not a default - Choose OO when
- The system to be built will be large
- Data can be aggregated into obvious structures and there’s lots of data in each aggregate
- The various types of data form a natural hierarchy that facilitates inheritance and polymorphism
- The implementation of high-level operations on data varies according to data type (polymorphism is a big benefit here)
- It’s likely you’ll have to add new data types later
- Interactions between data are best represented by operators
- You have a piece of data on which many different operations are applied
- And, those operations have standard names, regardless of the type of data they’re applied to
- Implementation of individual components is likely to change, especially in the same program
- The system design is already object-oriented
- Large numbers of clients will use your code
Don’t use pseudohashes or restricted hashes - Pseudohashes are prone to subtle errors, especially when used in inheritance hierarchies. Restricted hashes were developed to replace pseudohashes, but they can be unreliable. So...
Always use fully encapsulated objects - Put the contents of the class in a block (scope the variables). Bless a reference to a lexical scalar:
{
    use Class::Std::Utils;    # exports ident()

    my %root_of;    # ...properties that are locally scoped

    sub new {
        my ($class, $root) = @_;
        my $new_object = bless \do { my $anon_scalar }, $class;

        # initialize the object's "root" attribute
        $root_of{ident $new_object} = $root;    # ident from Class::Std::Utils
        return $new_object;
    }

    sub get_files {
        my ($self) = @_;
        my $root = $root_of{ident $self};    # look up this object's root
        # ...
    }
}
Damian calls this an “inside-out” object: normally an object is a hash with the information inside it, whereas here the information lives in hashes inside the class, keyed by object, and the object itself is empty.
The differences in the above code are minor, but the combined effect is enormous. The client code gets nothing but an empty scalar which can’t be messed with.
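To see what the client actually ends up holding, here is a minimal sketch. The class name Demo::Hierarchy and the get_root method are made up for illustration, and refaddr from the core Scalar::Util module stands in for ident so the example runs without installing Class::Std::Utils.

```perl
use strict;
use warnings;

{
    package Demo::Hierarchy;    # hypothetical class, for illustration only
    use Scalar::Util qw(refaddr);    # assumption: stand-in for ident()

    my %root_of;    # attribute storage, visible only inside this block

    sub new {
        my ($class, $root) = @_;
        my $new_object = bless \do { my $anon_scalar }, $class;
        $root_of{ refaddr $new_object } = $root;
        return $new_object;
    }

    sub get_root {
        my ($self) = @_;
        return $root_of{ refaddr $self };
    }
}

my $tree = Demo::Hierarchy->new('/tmp');

# The accessor can reach the attribute...
print $tree->get_root(), "\n";

# ...but dereferencing the object itself yields nothing useful.
my $contents = defined ${$tree} ? 'something' : 'an empty scalar';
print "The object holds $contents\n";
```

Client code has no way to reach %root_of, so the only route to the data is through the methods the class chooses to provide.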
Give every constructor the same standard name - There is only one acceptable name: new. It’s short, accurate, and predictable. This makes it comprehensible in six months’ time.
Always provide a destructor for every inside-out class - Since inside-out objects always have external resources, they must manage them explicitly to prevent memory leaks. The destructor should remove references for that object:
sub DESTROY {
    my ($self) = @_;
    delete $root_of{ident $self};    # drop this object's entry
    # ...
    return;
}
This need for a destructor is the only disadvantage of inside-out objects over blessed hashes and other methods.
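A small sketch of the destructor doing its job. Demo::Tracked and its live_count method are hypothetical names added only to make the cleanup observable, and refaddr from Scalar::Util again stands in for ident.

```perl
use strict;
use warnings;

{
    package Demo::Tracked;    # hypothetical class, for illustration only
    use Scalar::Util qw(refaddr);    # assumption: stand-in for ident()

    my %name_of;

    sub new {
        my ($class, $name) = @_;
        my $new_object = bless \do { my $anon_scalar }, $class;
        $name_of{ refaddr $new_object } = $name;
        return $new_object;
    }

    # Hypothetical helper: how many objects still have stored attributes?
    sub live_count {
        return scalar keys %name_of;
    }

    sub DESTROY {
        my ($self) = @_;
        delete $name_of{ refaddr $self };
        return;
    }
}

{
    my $temp = Demo::Tracked->new('short-lived');
    print Demo::Tracked->live_count(), "\n";    # 1 while $temp is alive
}

# $temp went out of scope, so DESTROY removed its entry
print Demo::Tracked->live_count(), "\n";        # 0
```

Without the DESTROY method the second count would stay at 1, and every object ever created would leave an entry behind in %name_of.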
Methods should, in general, have fewer arguments than subroutines since methods have access to the data in the object. If that’s not true, you should re-evaluate your design. Ordinarily, it’s unacceptable to name subroutines after built-ins, but that’s not true of methods since they’re called with a different syntax and there’s no ambiguity.
Provide separate read and write accessors - Use setters and getters rather than a single overloaded method. If you only have one, every time you run the method, you have to do a test on the argument list. Getting is much more frequent than setting. Why impose a cost on something you do 99% of the time for something you do 1% of the time? What’s more, a combined method can confuse intention, for example, when you don’t need a setter.
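Here is a rough sketch of separate accessors on an inside-out class. The class Demo::Counter and the names get_count and set_count are invented for this example, and refaddr stands in for ident.

```perl
use strict;
use warnings;

{
    package Demo::Counter;    # hypothetical class, for illustration only
    use Scalar::Util qw(refaddr);    # assumption: stand-in for ident()
    use Carp;

    my %count_of;

    sub new {
        my ($class) = @_;
        my $new_object = bless \do { my $anon_scalar }, $class;
        $count_of{ refaddr $new_object } = 0;
        return $new_object;
    }

    # Getter: no argument checks on the common path
    sub get_count {
        my ($self) = @_;
        return $count_of{ refaddr $self };
    }

    # Setter: validation lives only here
    sub set_count {
        my ($self, $new_count) = @_;
        croak 'set_count() needs a defined value' if !defined $new_count;
        $count_of{ refaddr $self } = $new_count;
        return;
    }

    sub DESTROY {
        my ($self) = @_;
        delete $count_of{ refaddr $self };
        return;
    }
}

my $counter = Demo::Counter->new();
$counter->set_count(42);
print $counter->get_count(), "\n";
```

A read-only attribute is then just a class that leaves out the setter, so the interface itself documents the intention.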
Don’t use lvalue accessors - lvalue subroutines return the actual thing instead of a copy. (This is how substr works, for example.) The obvious problem here, from an OO perspective, is that it breaks encapsulation.
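The encapsulation problem is easy to reproduce. In this sketch (Demo::Account and its method names are hypothetical, and refaddr stands in for ident) the setter refuses negative balances, but the lvalue accessor hands out the storage itself, so the check never runs.

```perl
use strict;
use warnings;

{
    package Demo::Account;    # hypothetical class, for illustration only
    use Scalar::Util qw(refaddr);    # assumption: stand-in for ident()
    use Carp;

    my %balance_of;

    sub new {
        my ($class) = @_;
        my $new_object = bless \do { my $anon_scalar }, $class;
        $balance_of{ refaddr $new_object } = 0;
        return $new_object;
    }

    # Discouraged: returns the stored value itself, so callers can assign to it
    sub balance : lvalue {
        my ($self) = @_;
        $balance_of{ refaddr $self };
    }

    sub get_balance {
        my ($self) = @_;
        return $balance_of{ refaddr $self };
    }

    # Preferred: the class gets to validate every change
    sub set_balance {
        my ($self, $amount) = @_;
        croak 'Balance cannot be negative' if $amount < 0;
        $balance_of{ refaddr $self } = $amount;
        return;
    }

    sub DESTROY {
        my ($self) = @_;
        delete $balance_of{ refaddr $self };
        return;
    }
}

my $account = Demo::Account->new();

$account->balance = -100;               # nothing stops this
print $account->get_balance(), "\n";    # -100

my $accepted = eval { $account->set_balance(-100); 1 };
my $verdict  = $accepted ? 'accepted' : 'rejected';
print "set_balance(-100) was $verdict\n";
```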
Don’t use indirect object syntax - Indirect object syntax is when you put the object name after the method. You can run into trouble with built-ins, etc., leading to ambiguity and difficult-to-find bugs.
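For anyone who hasn't seen it, this is what the two call styles look like side by side. Demo::Widget is a made-up class and refaddr stands in for ident. Both calls work here; the point is only that the first one depends on Perl guessing correctly.

```perl
use strict;
use warnings;

{
    package Demo::Widget;    # hypothetical class, for illustration only
    use Scalar::Util qw(refaddr);    # assumption: stand-in for ident()

    my %label_of;

    sub new {
        my ($class, $label) = @_;
        my $new_object = bless \do { my $anon_scalar }, $class;
        $label_of{ refaddr $new_object } = $label;
        return $new_object;
    }

    sub get_label {
        my ($self) = @_;
        return $label_of{ refaddr $self };
    }

    sub DESTROY {
        my ($self) = @_;
        delete $label_of{ refaddr $self };
        return;
    }
}

# Indirect object syntax: Perl has to guess that "new" is a method name.
# If a subroutine called new() were in scope, this may parse differently.
my $indirect = new Demo::Widget('knob');

# Arrow syntax: always a method call on the named class.
my $direct = Demo::Widget->new('dial');

print $indirect->get_label(), "\n";
print $direct->get_label(), "\n";
```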
Provide an optimal interface, rather than a minimal one - A minimal interface reduces maintainability since it forces each programmer to invent subroutines to do common tasks. External subroutines are also less efficient since they don’t have access to internal data. Provide commonly used and needed, rather than just essential, functions.
Only overload the isomorphic operators of algebraic classes - missed this. :-( I think it means to ensure that overloaded names have expected behaviors or something like that.
Always consider overloading boolean, numeric, and string coercions. Objects used as booleans are always good. Objects used as numbers are always bad. Objects used as strings are always ugly. You can use the overload module to overload q(0+), q(bool), and q("") to make these behave nicely. One thing you can do is just make them croak to kill programs that are using objects in funny ways.
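A sketch of the croak-on-misuse idea using the core overload module. Demo::Handle and its is_open flag are hypothetical; giving the boolean coercion a meaning while rejecting the other two is my own choice for the example, not something from the talk. refaddr stands in for ident and is not affected by the overloads.

```perl
use strict;
use warnings;

{
    package Demo::Handle;    # hypothetical class, for illustration only
    use Scalar::Util qw(refaddr);    # assumption: stand-in for ident()
    use Carp;

    my %is_open_for;

    use overload (
        # Boolean context gets a sensible meaning (an assumption for this demo)
        q{bool} => sub { my ($self) = @_; return $is_open_for{ refaddr $self }; },

        # Numeric and string use are treated as programming errors
        q{0+}   => sub { croak q{Can't use a Demo::Handle as a number}; },
        q{""}   => sub { croak q{Can't use a Demo::Handle as a string}; },

        fallback => 0,    # no autogenerated operators
    );

    sub new {
        my ($class, $is_open) = @_;
        my $new_object = bless \do { my $anon_scalar }, $class;
        $is_open_for{ refaddr $new_object } = $is_open ? 1 : 0;
        return $new_object;
    }

    sub DESTROY {
        my ($self) = @_;
        delete $is_open_for{ refaddr $self };
        return;
    }
}

my $handle = Demo::Handle->new(1);

my $state = $handle ? 'open' : 'closed';
print "Handle is $state\n";

my $stringified = eval { my $text = "$handle"; 1 };
my $outcome     = $stringified ? 'allowed' : 'rejected';
print "String use was $outcome\n";
```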
Don’t directly manipulate the list of base classes - Don’t assign directly to @ISA; rather, write use base …. This ensures the relationships are set up as early as possible.
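A minimal sketch with two made-up classes, Demo::Animal and Demo::Dog, defined in the same file. The base pragma sets up the inheritance at compile time, whereas a plain assignment to @ISA would not take effect until that statement runs.

```perl
use strict;
use warnings;

{
    package Demo::Animal;    # hypothetical base class, for illustration only

    sub new {
        my ($class) = @_;
        return bless \do { my $anon_scalar }, $class;
    }

    sub speak {
        return 'generic noise';
    }
}

{
    package Demo::Dog;    # hypothetical derived class
    use base qw(Demo::Animal);    # instead of: our @ISA = ('Demo::Animal');

    sub speak {
        return 'Woof';
    }
}

my $pet = Demo::Dog->new();    # constructor inherited from Demo::Animal

print $pet->speak(), "\n";                                  # Woof
print $pet->isa('Demo::Animal') ? "is an animal\n" : "is not an animal\n";
```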
Use distributed encapsulated objects - When you create inside-out objects, there’s no reason that the lexically-scoped hashes that store attributes need to be in the same lexical scope, as long as derived classes have access to them.
Never use the one-argument form of bless - Derived classes will bless their objects into the base class if you do. Bless’s default behavior is static: it blesses its argument into the package where the bless call was compiled, not the class the constructor was called on.
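The difference shows up as soon as a derived class inherits the constructor. Demo::Base, Demo::Derived, and the constructor name new_static are hypothetical names for this sketch.

```perl
use strict;
use warnings;

{
    package Demo::Base;    # hypothetical base class, for illustration only

    # One-argument bless: always blesses into Demo::Base
    sub new_static {
        my ($class) = @_;
        return bless \do { my $anon_scalar };
    }

    # Two-argument bless: blesses into whatever class was asked for
    sub new {
        my ($class) = @_;
        return bless \do { my $anon_scalar }, $class;
    }
}

{
    package Demo::Derived;    # hypothetical derived class
    use base qw(Demo::Base);
}

print ref( Demo::Derived->new_static() ), "\n";    # Demo::Base
print ref( Demo::Derived->new() ),        "\n";    # Demo::Derived
```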
Pass constructor arguments as labeled values in a hash - Positional arguments don’t work well for constructors. With positional arguments you have to slice and dice as you pass some arguments to the base class constructor. With labeled values, you just pass the whole hash and the constructors up the hierarchy just pull out whatever they need.
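Here is a sketch of how that plays out with a base and a derived class. The classes Demo::Client and Demo::Client::Corporate and the keys name and corporation are invented for the example, and refaddr stands in for ident. Each class keeps its own attribute hash in its own scope and pulls only the keys it cares about.

```perl
use strict;
use warnings;

{
    package Demo::Client;    # hypothetical base class, for illustration only
    use Scalar::Util qw(refaddr);    # assumption: stand-in for ident()

    my %name_of;

    sub new {
        my ($class, $arg_ref) = @_;
        my $new_object = bless \do { my $anon_scalar }, $class;
        $name_of{ refaddr $new_object } = $arg_ref->{name};
        return $new_object;
    }

    sub get_name {
        my ($self) = @_;
        return $name_of{ refaddr $self };
    }

    sub DESTROY {
        my ($self) = @_;
        delete $name_of{ refaddr $self };
        return;
    }
}

{
    package Demo::Client::Corporate;    # hypothetical derived class
    use base qw(Demo::Client);
    use Scalar::Util qw(refaddr);

    my %corporation_of;

    sub new {
        my ($class, $arg_ref) = @_;

        # Hand the whole hash up; the base class takes what it needs
        my $new_object = $class->SUPER::new($arg_ref);
        $corporation_of{ refaddr $new_object } = $arg_ref->{corporation};
        return $new_object;
    }

    sub get_corporation {
        my ($self) = @_;
        return $corporation_of{ refaddr $self };
    }

    sub DESTROY {
        my ($self) = @_;
        delete $corporation_of{ refaddr $self };
        $self->SUPER::DESTROY();    # let the base class clean up too
        return;
    }
}

my $client = Demo::Client::Corporate->new({
    name        => 'Pat',
    corporation => 'Example Corp',
});

print $client->get_name(), ' of ', $client->get_corporation(), "\n";
```

Adding another argument to the derived class later means adding one more key, with no change to the order of anything the base class expects.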
Separate your construction, initialization, and destruction processes - In multiple inheritance, you’ll end up allocating memory in each constructor if construction and initialization are combined. The same goes for destructors in multiple inheritance.
Don’t use AUTOLOAD() - it generally bespeaks bad design. The most common mistake is to forget to also provide an explicit DESTROY() method. Whenever you want to rely on AUTOLOAD(), it’s almost always better to create a generic method that takes the names you would have autoloaded as an extra argument rather than having methods created on the fly.
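Something like the following is what I take the generic-method alternative to mean. Demo::Settings, its get method, and the host and port fields are all made up for this sketch, and refaddr stands in for ident.

```perl
use strict;
use warnings;

{
    package Demo::Settings;    # hypothetical class, for illustration only
    use Scalar::Util qw(refaddr);    # assumption: stand-in for ident()
    use Carp;

    # The names that AUTOLOAD would otherwise have turned into methods
    my %is_valid_field = map { $_ => 1 } qw(host port);

    my %fields_of;

    sub new {
        my ($class, $arg_ref) = @_;
        my $new_object = bless \do { my $anon_scalar }, $class;
        $fields_of{ refaddr $new_object } = {
            host => $arg_ref->{host},
            port => $arg_ref->{port},
        };
        return $new_object;
    }

    # One explicit method; the field name is an ordinary argument
    sub get {
        my ($self, $field) = @_;
        croak "No such field: $field" if !$is_valid_field{$field};
        return $fields_of{ refaddr $self }{$field};
    }

    sub DESTROY {
        my ($self) = @_;
        delete $fields_of{ refaddr $self };
        return;
    }
}

my $settings = Demo::Settings->new({ host => 'localhost', port => 8080 });

print $settings->get('host'), ':', $settings->get('port'), "\n";
```

A misspelled field name fails with a clear error from get, and there is no catch-all lurking to intercept DESTROY or any other unexpected method call.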
Posted by windley on August 2, 2005 5:47 PM


