Programming

April 17, 2006

Fuzzy Boundaries

The January 2005 issue of ACM Queue contains an article by Roger Sessions called Fuzzy Boundaries that does a good job of discussing the differences between objects, components, and services and when to use each. This is a difference that’s hard for students to grasp at first, and I suspect many a veteran programmer would have a tough time explaining it, even though they understand it intuitively.

To start with, we have to acknowledge that each of these has the same abstract purpose: each sticks some code behind a well-defined API and is designed to respond to requests from a client.

Sessions differentiates them by referring to process and environment. For objects, the object and its client live in the same process and the same environment. For components, the client and target exist in separate processes, but the same environment. Sessions defines an environment as a component framework like EJB or CORBA. Services don’t share either processes or environments.

Sessions used the following table to clarify the differences. Builder relationship and quantity are interesting. Builder relationship specifies the relationship between the builders of the target and the client. Quantity is a relative measure of how prevalent these various forms will be in any given system. Because services are meant for top-level APIs between organizations, they will be relatively few.

                      Objects        Components          Services
Locality              Same Process   Different Process   Different Organization
Environment           Same           Same                Different
Communication Speed   Fast           Slow                Very Slow
Builder Relationship  Same Person    Same Group          Different Organization
Quantity              Tons           > 1 per process     Few

The important point is that objects, components, and services aren’t mutually exclusive. Rather, each has its place and a complex system will probably have some of each in the proper proportions.

Sessions finishes the article by describing the “software fortress” model of architecture that he put together for his book by the same name. This model is all about defining boundaries and creating a framework for proper analysis. The book’s out of print, but available used from Amazon. I’ve ordered a copy.

02:49 PM | Comments (1)

March 20, 2006

Effective Scheming

I received an email from a former student who’s caught the Scheme bug. He says:

I took 330 from you last year and I really enjoy coding in Scheme. I do any class project I can in Scheme — even my Python code is riddled with lambda statements.

I have two questions I was hoping you could help me with:
  1. What are the prospects for kids who like coding in Scheme/Lisp, and how does one locate/maximize those prospects?
  2. What are some key things I could do to become a really great Scheme/Lisp coder? That is, what are some concepts and capabilities I could focus on to become a truly outstanding Schemer?

These are questions I don’t necessarily have good answers for. Here are my best ideas at the moment. If you have others, feel free to leave a comment below.

On prospects, if you’re interested in graduate school, there are some places like Utah and Northwestern that have people actively doing research in Scheme. As for jobs, I’m not aware of any. I teach Scheme for pedagogical rather than pragmatic reasons, so I don’t spend a lot of time worrying about whether it’s practical or not. 98% of the other courses we teach use Java, so I figure students have gotten a healthy dose of practical.

On what concepts and capabilities to focus on, I’d say that macros are at the top of the list. The best book I’ve seen on this is Practical Common Lisp by Peter Seibel. By chapter 3, Peter has shown what macros are, how they differ from what are termed “macros” in other languages, and why they’re important. The book is sprinkled with “practical” chapters that take the concepts just learned and apply them. It’s one of the most effective programming language texts I’ve read and certainly the best one for Lisp.

In addition to learning about macros, learning Lisp is a good thing because there are features and ideas in Lisp that aren’t present in Scheme and will expand your horizons. Obviously, Seibel is a good vehicle for doing that. As far as implementations, I’ve had good luck with OpenMCL. SLIME in Emacs is a great IDE.

03:23 PM | Comments (10)

March 06, 2006

Rails and Ajax for Page Application Development (ETech 2006 Tutorial)

I’m in David Heinemeier Hansson’s tutorial on Beneath-the-Page Application Development with Rails. His Rails tutorial from last summer remains one of my most viewed blog entries.

He starts out noting that AJAX is the most important innovation for the Web in years.

But JavaScripting the DOM still sucks…a lot. JavaScripting the DOM is incompatible with how regular programmers think about programming.

Part of the problem is the sorry state of browsers. One line of change can lead to hours of regressions because of browser incompatibilities. Then there’s the browser underworld (all the old, out of date browsers that are still out there). That’s the bad.

Then there’s the ugly. Nodes are not for people. The idea of “createNode” as an API call has a nice academic feel, but it doesn’t add up to something that’s pleasant for developers to use. But the main problem with nodes is that they cause you to repeat yourself. You have to create the first version of the page using HTML. Your entire UI is mapped out in template files. To make a dynamic change, you have to say it again another way. This creates two versions of the interface—you have to take great care to ensure that they don’t drift apart.

Now, for the good: innerHTML is the hero. A triumph of pragmatism. No spec, but it’s supported by all. innerHTML’s companion, the saint, is eval. You can construct JavaScript outside the browser, that is you can generate it, and then evaluate the result. Compare this with what Simon Willison said this morning for a contrasting point of view.

So, how do we deal with the bad, not return calls to the ugly, and champion the good?

Rails uses the Prototype JavaScript library to try to put a layer on top of the browser incompatibilities. You write off the underworld as being too expensive and concentrate on IE, Firefox, and Safari. On top of Prototype, Rails uses script.aculo.us. All of these elements (rails, prototype, and script.aculo.us) are written with explicit knowledge of each other.

David is building a demo application from scratch using Rails (v1.1) to show his approach to page development and AJAX. The first time I saw this it was amazing. I did one for Kelly Flanagan (BYU CIO) a few months ago, so if I can do it…maybe it’s not so amazing! :-) You can get the functionality he’s demoing today using beta Gems.

The first part of the demo follows the basic “use Rails to create an MVC system” script pretty closely. He’s building a simple ecommerce site. Once he’s got a way to add products, display them, and add things to a shopping cart, he starts adding AJAX.

Transforming the rails link_to to link_to_remote creates an AJAX request to the same URL. The default action is to eval the links coming back. The other change is to remove the default rendering and replace it with a call to an rjs template. The rjs template returns JavaScript that’s eval’d in the browser. In rjs, there’s a proxy for the DOM in Ruby called page. Here’s what’s in the rjs file:

page[:cart].replace_html :partial => "cart"

From within the page, you can get a reference to specific parts of the page and replace them with something else. This makes use of “partials”: rhtml pages whose names start with an underscore.

So, David’s pulled out the shopping cart HTML and made it a partial. That’s referenced in the template and in the Ruby code. With that, he can update the shopping cart in an AJAX way without refreshing the page.

The cart has been in the session as an array. David creates an object for the cart (not persistent) and adds totals. By just changing the partial, the total is added to the shopping cart without modifying nodes in one place and HTML in another.

Adding a “discard” function to the cart demonstrates the use of inline rjs code. This is useful when the function is pretty simple. Adding this doesn’t work, so he shows how you can add an “alert(exception)” to the app and get debugging info in the browser.

Adding AJAX to Web apps changes the nature of how you think about them. To demo this, he adds a style to the cart to hide it if there’s nothing in it. But when you add something to it, this doesn’t go away, because the page isn’t rerendered. You can add code to the rjs to manage this instead of scripting the JavaScript by hand.

Next David adds a “remove from cart” link next to the product (pedagogically placed there instead of in the cart itself). He adds logic to test whether or not the product is in the cart to determine whether or not the link is shown. The problem is that you don’t reload, so you need to update products whenever something is added to the cart.

This calls for another abstraction. Clipping out the product display HTML as another partial does the trick. Actually, there are two partials, one for products and one for product inside the products partial. The next problem is that the “add” method doesn’t have all the products, it has the new one. That can be fixed by adding some code. David mentions that this points out the problem of state maintenance that AJAX code introduces. I wonder if continuation-based Web applications could solve this?

We’ve fixed one problem, but there’s a combinatorial explosion of dependencies in the UI. For example, now, when you discard the cart, the product listing needs to be updated. These things are hard enough to do when you’re programming in Ruby. When you’re hacking out the raw JavaScript, you often just ignore them.

How do you fall back when the browser doesn’t support JavaScript? Graceful degradation is important. The nice thing about having everything in partials is that you don’t have to recreate yet another view for non-compliant browsers. Adding this code to the end of the method to explicitly control the rendering is a key idiom:

# Ajax requests get the rjs template; everything else is sent back to the listing.
if request.xhr?
  render
else
  redirect_to :action => "index"
end

The other key piece is to ensure that good hrefs are built in addition to the “onclick” actions. This creates some ugly code, so he defines a helper function to make it happen.

RJS isn’t an attempt to replace JavaScript, but a way to generate the most common things you want to do. You can drop out and use any JavaScript you’d like when you need to.

There are two main points to what we’ve been talking about today. Rule 1: less JavaScript and more tolerable JavaScript. By capturing common patterns and replacing them with abstractions like replace_html we make programming in JavaScript easier. He shows how each RJS line is translated into the equivalent JavaScript.

Rule 2: Less data, more interface. This is the opposite of what AJAX has been described as. Rails isn’t passing XML back and forth, rather it’s passing XHTML (admittedly a form of XML). Returning a fully described partial is more bandwidth than just returning the hash, but it’s worth the tradeoff. The amount of data usually isn’t the problem. It’s programmer time with JavaScript.

A new rule: make it snazzy. AJAX applications are distinctively different from regular Web apps. Effects are not just fluff; they’re there to give the user confirmation that something has happened. David shows the video from Fluxiom, an asset management application, that looks like a desktop application but is all in Rails. This is the future of Web applications.

Next David shows off a new feature of Rails that allows you to drive the application just like a browser would from the console. It’s very similar to a command line browser that remembers sessions, etc. and gives them to the user as Ruby objects that can be manipulated. This is great for writing unit tests for a Web application.

David’s been following microformats. Microformats annotate XHTML in such a way that machines can easily digest it. It’s a way to render data in a human readable way so that there’s only one view. He returns to the shopping cart example to show how classes can be added to embed semantics. By adding “product”, “name”, and “price” class attributes to the template, he gets a microformat.

To make this useful for machines, he changes the controller to check the content type of the request. If it is “application/xml” he assumes that the client wants just the products and not the HTML wrapper. This allows the same controller to function for both browsers and programs.

He starts writing code to process what gets returned and gets an error. Lesson: you have to generate valid XML if you’re going to parse it as XML. Of course, you’ve got all the XHTML tags as nodes and so navigating it can be difficult (you want to navigate the class attributes). David’s working on an API that would allow class attribute-based navigation.

An alternate solution is to use XSLT transforms from within Ruby to transform the XHTML to XML. This isn’t actually working yet, apparently, but would be pretty easy to set up.

02:43 PM

Introduction to JavaScript (ETech 2006 Tutorial)

This morning I’m in the A (Re-)Introduction to JavaScript tutorial taught by Simon Willison.

Simon recommends JavaScript: The Definitive Guide by David Flanagan as one of the few JavaScript references that’s worthwhile. He hasn’t found a good reference on the Web.

Brendan Eich invented JavaScript in 1995. The ECMA standard went through only 3 revisions with the last one in 1999. Good news: the language is stable. Bad news: there are lots of codified warts.

Javascript avoids I/O, so it’s not quite a general purpose language: it has to be embedded in a host environment that supplies this. In the case of the browser, this is the DOM.

Some advice:

  • Use the semicolon even though it’s not required.
  • Declare variables religiously.
  • Avoid global variables.

There’s no such thing as an integer in JS. Only floating point (double-precision 64-bit IEEE 754 values). There’s a ton of Math functions. The parseInt function will guess the base. You can specify it in an optional second argument, and you should, to avoid guessing.
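
Here’s a quick sketch of the number and parseInt behavior; the sample values are my own, not from Simon’s slides.

```javascript
// All numbers are IEEE 754 doubles, so decimal fractions are only approximated.
var sum = 0.1 + 0.2;
var sumIsExact = (sum === 0.3);     // false

// Without a radix, parseInt guesses the base from the string's prefix.
var guessedHex = parseInt("0x1F");  // the "0x" prefix is read as hexadecimal

// With an explicit radix there is nothing left to guess.
var decimal = parseInt("010", 10);
var binary = parseInt("11", 2);

console.log(sumIsExact, guessedHex, decimal, binary);
```

Passing the radix every time is cheap insurance when the string comes from user input.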

Strings are sequences of 16-bit Unicode characters. Characters are just strings of length 1. Again, there are a ton of string functions that can be used to manipulate strings, including regexp functions. String concatenation overloads the + operator. Type coercion is automatic, so if you add a string to a number, you get a string. If you add a number to a string, you also get a string: 3 + "4" and "3" + 4 are both "34".
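
A small sketch of how + behaves once strings and numbers are mixed (again, the values are mine):

```javascript
// + is evaluated left to right; as soon as one operand is a string it concatenates.
var stringFirst = "3" + 4 + 5;   // "34" and then "345"
var numberFirst = 3 + 4 + "5";   // 7 first, then "75"

// There is no separate character type: a "character" is a one-element string.
var first = "hello".charAt(0);
var firstType = typeof first;

console.log(stringFirst, numberFirst, first, firstType);
```

The result depends only on evaluation order, which is why mixed expressions are worth parenthesizing.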

null means no value and undefined means no value yet. Booleans are true and false, but there are truthy and falsy values as well. false, 0, the empty string, NaN, null, and undefined are falsy. Everything else is truthy.

Variables are declared with var. If you don’t use var, the language will automatically define a global variable. This can cause all sorts of hard to debug situations. So, never do this!

== and != do type coercion, so 1 == true. To avoid this, use === and !==.
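
To see truthiness and the two kinds of equality side by side, here’s a minimal sketch. The list of candidate values is my own choice.

```javascript
// Sort some values into truthy and falsy by using each one as a condition.
var candidates = [false, 0, "", NaN, null, undefined, "0", [], {}];
var falsy = [];
var truthy = [];
for (var i = 0; i < candidates.length; i++) {
  if (candidates[i]) {
    truthy.push(candidates[i]);
  } else {
    falsy.push(candidates[i]);
  }
}

// == coerces its operands before comparing; === compares type and value.
var looseEqual = (1 == true);
var strictEqual = (1 === true);

console.log(falsy.length, truthy.length, looseEqual, strictEqual);
```

Note that the string "0", the empty array, and the empty object all land in the truthy pile.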

typeof will give you the type of an object. Arrays say they’re objects, which they are, but that doesn’t give you the specific answer.
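
A short sketch of the typeof quirk, plus two common ways to get the more specific answer. These workarounds are my addition, not something Simon showed.

```javascript
var list = [1, 2, 3];

var kind = typeof list;                           // typeof only says "object"
var isArray = list instanceof Array;              // checks the prototype chain
var tag = Object.prototype.toString.call(list);   // the internal class name

// typeof has a well-known wart for null as well.
var nullKind = typeof null;

console.log(kind, isArray, tag, nullKind);
```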

The logical operators (&& and ||) short circuit, and an undefined object is “false”, so o && o.getName() will test for existence before applying the method.
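
Here’s that guard idiom as a runnable sketch. The nameOf helper and the person object are hypothetical names I made up for illustration.

```javascript
// Hypothetical helper: returns the name, or the falsy argument itself.
function nameOf(o) {
  // If o is null or undefined, && stops here and getName is never called.
  return o && o.getName();
}

var person = { getName: function () { return "Simon"; } };

var present = nameOf(person);
var missing = nameOf(undefined);   // no exception is thrown

// || works the other way around and is handy for defaults.
var label = missing || "anonymous";

console.log(present, missing, label);
```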

Exceptions were borrowed directly from Java, so they have the try…catch…finally… syntax.

Objects in JavaScript are very different from objects in other languages. Objects are collections of name-value pairs. The name part is a string, the value can be anything. Objects are created with the new operator, but you can also use a literal representation:

var obj = {
  name: "Carrot",
  "for": "Max",
  details: {
    color: "orange",
    size: 23
  }
};
obj.name = "Simon";     // dot notation
obj["name"] = "Simon";  // bracket notation; the key is an ordinary string

In the final line, we could have dynamically created the string “name”, so this is more flexible. An alternative form of the for-loop lets you access the attributes in the object.
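
A minimal sketch of both ideas: a key built at run time, and the for...in form of the loop. The hasOwnProperty guard is my addition, to skip anything inherited from a prototype.

```javascript
var obj = { name: "Carrot", "for": "Max" };

// The key is just a string, so it can be computed.
var key = "na" + "me";
obj[key] = "Simon";

// for...in visits each property name in turn.
var seen = [];
for (var prop in obj) {
  if (obj.hasOwnProperty(prop)) {
    seen.push(prop + "=" + obj[prop]);
  }
}

console.log(seen.join(", "));
```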

Arrays are the other primary data aggregator. The keys have to be numbers, but they use the [] syntax, just like objects. You can use literal notation to create and initialize an array in one line. Arrays will add items past the end. To append to the end, use a[a.length] = 3 (a.push(3) also works).
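
Here’s a small sketch of those array behaviors, using made-up values:

```javascript
var a = ["dog", "cat"];          // literal notation

a[a.length] = "hen";             // append by assigning at index length
a.push("fox");                   // same effect
var lengthAfterAppends = a.length;

a[10] = "owl";                   // assigning past the end grows the array
var lengthAfterGap = a.length;   // one more than the highest index
var hole = a[5];                 // the skipped slots read as undefined

console.log(lengthAfterAppends, lengthAfterGap, hole);
```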

If you do a for-loop with arrays, it’s sensible to cache the array length in the initialization to avoid calculating the length each time (looking up a property in an object).

for (var i = 0, j = a.length; i < j; i++) {
  ...
}

Functions are pretty simple. If nothing is explicitly returned, then the return value is undefined. Parameters are more like guidelines. Missing parameters are treated as undefined. You can also pass in more arguments than the function is expecting. The extra arguments are ignored. There is a special array called arguments that contains all of the arguments. So, you can create variable arity functions. The arguments array is almost an array. You can't use all the array operators on it.
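
A rough sketch of these parameter rules. The function names (avg, greet, isRealArray, toArray) are my own, and borrowing Array.prototype.slice is one common way to turn arguments into a real array.

```javascript
// Variable arity: ignore the parameter list and walk arguments instead.
function avg() {
  var sum = 0;
  for (var i = 0, n = arguments.length; i < n; i++) {
    sum += arguments[i];
  }
  return n === 0 ? 0 : sum / n;
}

// Missing parameters arrive as undefined; extra ones are simply ignored.
function greet(name, greeting) {
  if (greeting === undefined) {
    greeting = "Hello";
  }
  return greeting + ", " + name;
}

// arguments looks like an array but is not one.
function isRealArray() {
  return arguments instanceof Array;
}

// Borrowing slice produces a genuine array from it.
function toArray() {
  return Array.prototype.slice.call(arguments);
}

console.log(avg(2, 3, 4), greet("Max"), greet("Max", "Hi", "ignored"), isRealArray(1, 2), toArray(1, 2).join("-"));
```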

Functions are objects with methods. For example, apply is defined on functions.

You can create anonymous functions using the function operator. var avg = function() {...} is the same as function avg () {...}. This gives you the usual benefits: local scoping, closures, etc. Simon spends a bit of time showing how you can use anonymous functions to simulate the behavior of a let (see Lisp). Of course, it's pretty ugly since there are no macros.
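
Here's a minimal sketch of apply and of the let trick. The avg body and the variable names are my own reconstruction, not Simon's code.

```javascript
// A function stored in a variable, just like any other value.
var avg = function () {
  var sum = 0;
  for (var i = 0; i < arguments.length; i++) {
    sum += arguments[i];
  }
  return arguments.length ? sum / arguments.length : 0;
};

// apply calls the function with an array standing in for the argument list.
var viaApply = avg.apply(null, [2, 3, 4]);

// Simulating let: an anonymous function, called immediately, gives a and b
// a private scope without touching the outer a.
var a = 1;
var inner = (function () {
  var a = 10;
  var b = 20;
  return a + b;
})();

console.log(viaApply, inner, a);
```

The outer a is untouched, which is the whole point of the idiom.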

The arguments array has a property called callee that points to the function that got called (similar to self, but for functions) so that you can create anonymous recursive functions. The callee property allows saving state between invocations, so you can, for example, define a function that remembers how many times it's been called and puts that in arguments.callee.count. Cool.
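
A caveat and a sketch: arguments.callee throws in strict mode code, so I've swapped in a different technique here, a named function expression, which gives the function a private name for itself. The names counted and fact are mine.

```javascript
// The name "counted" is only visible inside the function body.
var counter = function counted() {
  // State is stored as a property on the function object itself.
  counted.count = (counted.count || 0) + 1;
  return counted.count;
};

counter();
counter();
var calls = counter.count;

// The same trick gives recursion without a global name.
var factorial = function fact(n) {
  return n <= 1 ? 1 : n * fact(n - 1);
};
var result = factorial(5);

console.log(calls, result);
```

In non-strict code, replacing counted with arguments.callee inside the body gives the version Simon described.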

You can use the property representation of objects to create literal objects that include methods. You can do this using the this keyword to refer to the current object. If you use the method without using the dot notation of method invocation, then this refers to the global object.

The problem with literals created in this way is that each object contains the function code. Using the new operator avoids this. new creates a new empty object and calls the specified function with this referencing the current object. Functions can be defined on the object using the prototype property.

function Person(first, last) {...}
Person.prototype.fullname = function () { ... }
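
Filling in the elided bodies gives something like the sketch below. The first and last field names are my assumption, since the bodies weren't shown.

```javascript
function Person(first, last) {
  // new has already created an empty object and bound it to this.
  this.first = first;
  this.last = last;
}

// Defined once on the prototype and shared by every instance.
Person.prototype.fullname = function () {
  return this.first + " " + this.last;
};

var s = new Person("Simon", "Willison");
var t = new Person("Max", "Carrot");

var name = s.fullname();
var shared = (s.fullname === t.fullname);     // one function object, not two
var own = s.hasOwnProperty("fullname");       // it lives on the prototype

console.log(name, shared, own);
```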

This is a potentially confusing point because JavaScript is a prototype-based object-oriented language, not a class-based language. Inheritance in prototype-based languages happens by creating copies of prototypes.

A nice side effect of prototypes is that you can add new methods to objects at runtime. Existing instances of the object will look at the prototype to find methods and find the new methods. This applies to the core objects of the language as well, so you can add a reversed method to the built-in string object. This method would be seen even by string literals. Be careful, you can redefine built-ins doing this. Simon tells the story of using this feature to add a specific functionality to JavaScript on Safari that wasn't properly implemented prior to version 2.0.
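
A minimal sketch of the reversed example. The implementation is my guess at one way to write it.

```javascript
// This string exists before the method does.
var before = "abc";

// Add a method to the built-in String prototype at runtime.
String.prototype.reversed = function () {
  return this.split("").reverse().join("");
};

// Both the earlier value and a fresh literal now see the new method.
var fromExisting = before.reversed();
var fromLiteral = "Simon".reversed();

console.log(fromExisting, fromLiteral);
```

The same mechanism lets you overwrite a built-in by accident, so it pays to pick names that are unlikely to collide.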

Prototypes can be nested, creating chains. Deep nesting combined with dynamic prototype creation can make chains hard to debug. All chains terminate at Object.prototype. Object includes a toString method, for example. So, overriding it gives good error messages, etc. because you inherit properties (like read-only or don't-enumerate) defined in Object.
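
Here's a sketch that walks a chain and overrides toString. The Person constructor and its fields are illustrative assumptions, and Object.getPrototypeOf is a later addition to the language than this tutorial.

```javascript
function Person(first, last) {
  this.first = first;
  this.last = last;
}
var p = new Person("Simon", "Willison");

// Before overriding, string conversion falls through to Object.prototype.
var defaultText = String(p);

// Overriding toString on the nearer prototype shadows the inherited one.
Person.prototype.toString = function () {
  return "<Person: " + this.first + " " + this.last + ">";
};
var customText = String(p);

// Walking the chain: instance, then Person.prototype, then Object.prototype, then null.
var link1 = Object.getPrototypeOf(p) === Person.prototype;
var link2 = Object.getPrototypeOf(Person.prototype) === Object.prototype;
var end = Object.getPrototypeOf(Object.prototype);

console.log(defaultText, customText, link1, link2, end);
```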

JavaScript is a prototype-based language that pretends to be class-based, so it doesn't do either very well. You can use this idiom to reuse an object:

function Geek() {
  Person.apply(this, arguments);
  this.geekLevel = 5;
}
Geek.prototype = new Person();
Geek.prototype.setLevel = function(lvl) {
  this.geekLevel = lvl;
};
Geek.prototype.getLevel = function() {
  return this.geekLevel;
};

> s = new Geek("Simon", "Willison")

The prototype chain, in this instance, is Geek -> Person -> Object. This is a little weird because we’re creating an instance of Person to serve as the prototype (rather than a prototype). This makes constructors lame because you can’t do anything useful in the constructor. If you google “javascript inheritance” on the Web, you’ll see there are dozens of proposed solutions (workarounds) to this problem.

Functions in JavaScript are first class objects. A function is just another object. You can store it in a variable, pass it around, return it from a function, and so on. Simon starts with an example of defining arrayMap, a function that maps a function over an array, creating a new array. Next he shows how to define a function, salesTaxFactory that takes a tax rate and returns a function that calculates tax for that tax rate (i.e. he defines a closure).
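
I didn’t capture Simon’s code, so the following is my own sketch of what arrayMap and salesTaxFactory might look like. The bodies and sample values are assumptions.

```javascript
// Apply func to every element and collect the results in a new array.
function arrayMap(array, func) {
  var result = [];
  for (var i = 0, n = array.length; i < n; i++) {
    result[i] = func(array[i]);
  }
  return result;
}

// Returns a new function that remembers the rate it was created with.
function salesTaxFactory(rate) {
  return function (amount) {
    return amount * rate;
  };
}

var doubled = arrayMap([1, 2, 3], function (x) { return x * 2; });

var quarterTax = salesTaxFactory(0.25);
var halfTax = salesTaxFactory(0.5);

console.log(doubled.join(","), quarterTax(200), halfTax(200));
```

Each call to the factory produces an independent closure, so the two tax functions never interfere with each other.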

Simon shows off the shell bookmarklet (that only works in Firefox). The bookmarklet opens a JavaScript shell that you can use to test JavaScript.

A closure is a function that has captured the scope in which it was created. Functions in JavaScript have a scope chain that references the scopes the function is defined in. This is similar to other block structured languages that allow anonymous functions. This can cause problems when you refer to a loop variable, for example, since every closure created in the loop will refer to the same iteration variable, which has the final value, not the value it had when it was created.
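
The loop-variable problem is easier to see in code. This is a minimal sketch with names of my own choosing; the fix shown (an extra function call per iteration) is one common workaround.

```javascript
// Every closure here captures the same variable i, not its value at the time.
var shared = [];
for (var i = 0; i < 3; i++) {
  shared.push(function () { return i; });
}
var sharedResults = [shared[0](), shared[1](), shared[2]()];

// Passing the value through a function call gives each closure its own copy.
var fixed = [];
for (var j = 0; j < 3; j++) {
  (function (captured) {
    fixed.push(function () { return captured; });
  })(j);
}
var fixedResults = [fixed[0](), fixed[1](), fixed[2]()];

console.log(sharedResults.join(","), fixedResults.join(","));
```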

Simon introduces the singleton pattern. This avoids namespace collisions. JavaScript has no sense of modules or packages. Any JavaScript running in the browser can access any other names. The less code that affects the global namespace, the better.

The singleton pattern uses closures to hide information. Here’s an example:

var simon = (function () {
  var myVar = 5;
  function init(x) {
    // ... can access myVar and doPrivate
  }
  function doPrivate(x) {
    // ... invisible to the outside world
  }
  function doSomething(x, y) {
    // ... can access myVar and doPrivate
  }
  return {
    'init': init, 'doSomething': doSomething
  };
})();
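
To show the pattern doing something, here’s a tiny concrete version. The tally object and its members are hypothetical, invented only for this sketch.

```javascript
var tally = (function () {
  var count = 0;                 // private: reachable only through the closures below

  function clamp(n) {            // private helper, never exported
    return n < 0 ? 0 : n;
  }

  function add(n) {
    count = clamp(count + n);
    return count;
  }

  function current() {
    return count;
  }

  // Only these two names escape into the outside world.
  return { add: add, current: current };
})();

tally.add(5);
tally.add(-10);                  // clamped at zero by the private helper
tally.add(2);

console.log(tally.current(), typeof tally.count, typeof tally.clamp);
```

Only one name, tally, lands in the global namespace, which is the point of the exercise.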

This is exactly what you’d do in Scheme or Lisp to create a self-defined object. Ironic for JavaScript since you’re essentially using closures to create objects in an object-oriented language. Again, with macros, the ugliness could be hidden, but alas…

One of the thorniest problems in JavaScript is memory leaks. JavaScript is a garbage collected language. If you’re using large amounts of memory, you need to make sure you cancel references and let the garbage collector do its job. Internet Explorer handles garbage collection differently for JavaScript and the DOM and can’t handle circular references, creating ugly memory leaks. Call this function enough times in IE and it will crash:

function leak() {
  var div = document.getElementById('d');
  div.obj = {
    'leak': div
  }
}

This pattern can show up when you create a slider widget, for example. Apparently, this same bug showed up in the 1.5 alpha release of Firefox. It’s been fixed now???

There are more sneaky examples.

function sneakyLeak() {
  var div = document.getElementById('d');
  div.onclick = function() {
    alert("hi!");
  };
}

This is a common idiom. This can be hard to detect. The problem is that closures maintain access to the environment they were created in, regardless of whether they access anything in that environment or not. One way to avoid the problem is to assign null to div at the end of the function. This throws the reference away and lets the garbage collector pick it up.

Simon mentions that these are tricky problems and he’d programmed JavaScript for three years before he really understood them. Interestingly, anyone who’s had CS330 would be equipped to understand these problems quickly. Another reason 330 is important.

Simon mentions that most popular JavaScript libraries have systems for attaching and detaching events. Many of these can automatically free event references when the page is unloaded. These can be used to solve circular reference problems. Using libraries can solve a big chunk of the problem, but be aware.

Every time you do complex lookups (using dot notation), dereferencing them can increase performance. This is especially important inside loops. Here’s an example of dereferencing (actually it seems more like referencing to me):

var s = document.getElementById('d').style;
s.width = '100%';
// and so on

You see this in drag and drop operations, for example.

Simon recommends some JavaScript libraries:

  • dojotoolkit.org
  • developer.yahoo.net/yui - Yahoo! User Interface libraries. Similar to others. Good drag and drop, animation, etc. facilities. Designed to play well with each other and give good performance on high traffic Web sites.
  • mochikit.com - borrows a ton of ideas from Python and functional programming
  • prototype.conio.net - most famous library (included in Rails). Extends the language and has its own style. Need to understand the style to use it. Comes with a set of tools for DOM manipulation and AJAX stuff.
  • script.aculo.us - Extension of prototype and adds lots of visual effects.

Things that every JavaScript programmer should know:

  • Everything in JavaScript is an object, even functions.
  • Every object is always mutable.
  • The dot operator is equivalent to de-referencing by hash.
  • The new keyword creates an object that class constructors run inside of, thereby imprinting them.
  • Functions are always closures.
  • The this keyword is relative to the execution context, not the declaration context (dynamically scoped).
  • The prototype property is mutable. This is the basis of inheritance in JavaScript.

Someone asks a question about macros and Simon starts talking about eval. His basic advice: don’t use it. Slow, dangerous, etc. Much of what he says is right for JavaScript, but not for languages with real macros like Lisp and Scheme. It’s interesting that people who don’t know about a particular language feature just can’t see how it could be used. This isn’t a dig on Simon; he could be completely right with respect to JavaScript. Yet, I wonder…

Simon did a good job with this tutorial. He introduced JavaScript in a way that didn’t lose people and at the same time brought out the gotchas that more experienced programmers would want to know. It’s not easy keeping a language tutorial interesting, but Simon did alright.

09:51 AM | Comments (1)

ETech Tutorials

I’m at ETech, just waiting for the first tutorial to begin. I’m signed up for two today. This morning I’m going to A (Re-)Introduction to JavaScript taught by Simon Willison. This afternoon, I’m going to Beneath-the-Page Application Development with Rails with David Heinemeier Hansson. His Rails tutorial from last summer remains one of my most viewed blog entries. I’ll post notes, so follow along.

09:18 AM

February 14, 2006

ThinkCAP JX

Does anyone have any experience with ThinkCAP JX? It’s a development framework for J2EE. Any comments you have would be appreciated.

10:10 AM | Comments (1)

February 13, 2006

BYU RUG Report

I wasn’t able to go to the BYU Ruby User’s Group meeting last week, but Lee Jensen went and filed this report:

I went to the BYU RUG Meeting last night in Provo. The guest speaker was Eric Hodel, part of the Robot Coop, makers of the 43(things,people,places) social sites. He explained some of the interesting projects that he’s been working on and has done in Ruby.

He’s currently working on a project called Ruby2c or MetaRuby which seeks to make a parser which will implement a subset of Ruby that can be output to the C language and then compiled. What they seek to achieve by doing this is to write the Ruby language in Ruby itself, opening up future development of language internals to anyone that knows their subset of Ruby and not just Japanese C hackers. It was an interesting project but the presentation was over just about everyone’s head.

He next went on to talk about Drb which for those that don’t know is distributed Ruby. Essentially it’s a simple networking library in Ruby that allows a Ruby process to use remote objects as if they were local. In addition to this he gave some examples of his usage of Drb. One example was using Drb with Rinda (a Ruby Linda distributed computing implementation) to monitor live application servers.

He also talked about extreme programming concepts. He emphasized three philosophies.

  1. Ya ain’t gonna need it. This means don’t try and plan for everything and implement it all at once. You can work faster if you don’t have all that crap in your head.
  2. DRY: Don’t repeat yourself. Refactor early, refactor often. If code starts looking bad, fix it. It’ll only be harder later.
  3. Test everything. He emphasized test first principles and showed some examples from code he was working on.

One of the sample rails sites he showed us had 816 lines of code and over 2000 lines of test code. It was like a 1:2.6 code to test ratio. Not bad.

01:58 PM

February 10, 2006

LISP Ecosystems

I criticized Allegro yesterday at Between the Lines for a business model that sells programming language development environments like they were enterprise software. Programming languages and their development environments are free in the 21st century, or at least that’s how most people think about them. I can’t imagine approaching a VC, for example, with a business plan that has as its basis selling programming language tools.

The problem is that programming languages depend on complex ecosystems of libraries, IDEs, testing tools, Web components, and so on. A reader at BTL said it in this way:

Where’s the ecosystem?

LISP was born in 1958… but where’s the ecosystem of tools and libraries (FOSS and commercial) that surrounds C++, Java, Perl, Python, PHP and even the .NET platform? Ruby documentation and libraries are so much easier to come by and that language was born in the mid-90s.

Looks like LISP’s fans have a lot of work to do if they want to bridge the support gap with competing languages and platforms. Until that happens, LISP will be like Latin—historically significant but otherwise dead.

There actually is a community of people who use LISP, but it’s not as big as it could be. Several factors contribute to this:

Common Lisp isn’t all that common. That is, the CL spec covers the language, but not much else. Transferring programs is hard. This contrasts sharply with new languages like PHP, Python, and Ruby where the core language and libraries are free so there’s only one implementation. Consequently, developers create large bodies of code that can be easily used by anyone else.

07:41 AM | Comments (2)

February 08, 2006

JavaSchools, Scheme, and Sin

Joel Spolsky has a great essay on the perils of JavaSchools, those CS programs that adopt Java (or .Net, to be fair) because it is easy for students to learn. In it, he sings the praises of learning Scheme and being exposed to functional programming.

Without understanding functional programming, you can’t invent MapReduce, the algorithm that makes Google so massively scalable. The terms Map and Reduce come from Lisp and functional programming. MapReduce is, in retrospect, obvious to anyone who remembers from their 6.001-equivalent programming class that purely functional programs have no side effects and are thus trivially parallelizable. The very fact that Google invented MapReduce, and Microsoft didn’t, says something about why Microsoft is still playing catch up trying to get basic search features to work, while Google has moved on to the next problem: building Skynet^H^H^H^H^H^H the world’s largest massively parallel supercomputer. I don’t think Microsoft completely understands just how far behind they are on that wave.
From The Perils of JavaSchools - Joel on Software
Referenced Wed Feb 08 2006 10:53:20 GMT-0700 (MST)
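
To make the map and reduce idea in Joel’s quote concrete, here is a minimal Python sketch. It is not Google’s MapReduce, just the functional pattern it is named after, and the function names and sample documents are made up for illustration. Because count_words has no side effects, the map step can be handed to a pool of workers without changing the result.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def count_words(text):
    """Map step: a pure function from one document to its word counts."""
    return Counter(text.lower().split())


def merge_counts(left, right):
    """Reduce step: combine two partial results into a new Counter."""
    return left + right


def word_count(documents, parallel=False):
    """Count words across documents, optionally mapping in parallel."""
    if parallel:
        # Safe only because count_words touches no shared state.
        with ThreadPoolExecutor() as pool:
            partials = list(pool.map(count_words, documents))
    else:
        partials = list(map(count_words, documents))
    return reduce(merge_counts, partials, Counter())


# Hypothetical sample input, for illustration only.
docs = ["lisp is sin", "scheme is lisp", "java is not lisp"]
totals = word_count(docs, parallel=True)
```

The sequential and parallel paths give the same totals, which is the whole point: nothing in the map step depends on the order in which documents are processed.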

In another article, Sriram Krishnan says that “Lisp is sin”:

I was on vacation a couple of weeks ago at my parents’ house in Chennai. My dad and I share a love for James Bond movies so my dad had bought a set of DVDs containing all the Bond movies in existence. I can’t help but strike a politically incorrect analogy - Lisp is like the villainesses present in the Bond movies. It seduces you with its sheer beauty and its allure is irresistible. A fleeting encounter plays on your mind for a long,long time. However, it may not be the best choice if you’re looking for a long term commitment. But in the short term, it sure is fun! In that way, Lisp is…sin.
From Sriram Krishnan : Lisp is sin
Referenced Wed Feb 08 2006 10:55:40 GMT-0700 (MST)

Sriram is tempted by LISP, but put off by some of its raw “hacker power.”

At BYU, we teach Scheme in the Concepts of Programming Languages class. There’s a continual struggle to maintain Scheme in that class. I never apologize to the students for teaching Scheme; I think anyone who calls themselves a Computer Scientist ought to have done some programming in it. Beyond that, however, it’s the best vehicle I know for allowing me to challenge students’ notions about what programming languages are and how they should work.

10:49 AM | Comments (1)

Eric Hodel at BYU RUG

The BYU Ruby User’s Group is meeting tonight at 7pm in 120 TMCB. The guest speaker is Eric Hodel from Seattle, Washington.

06:27 AM

February 06, 2006

Time to Learn LISP

I just posted an article at Between the Lines called Time to learn LISP, a riff on Peter Coffee’s recent piece on LISP and other “exotic” languages and techniques going mainstream.

02:33 PM | Comments (1)

February 02, 2006

Cadena: Analyzing Component-Based Software Architectures

John Hatcliff spoke at this morning’s BYU Computer Science colloquium. John is a professor of Computer Science at Kansas State University. He spoke on model-driven development, analysis, and optimization in a system called Cadena.

The project is based on using middleware to form abstractions of distributed computing components. The talk is focused on a real-time CORBA event service. The “model-driven” portion of the talk discusses formalisms for building high-assurance distributed systems. The framework supports plugging in various light-weight specification, analysis, and verification systems.

The work was done in the context of an avionics mission control system project sponsored by DARPA through Boeing. In a typical situation, a component, A, computes some data that is read by other components (B1, B2, etc.). A publishes a dataAvailable event and then Bk calls a getData method on A to fetch the data. Bk may or may not fetch the data from A depending on its state.
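
The publish-then-pull interaction described above can be sketched in a few lines of Python. This is only an illustration of the pattern, not Cadena or CORBA code; the class names, the enabled flag standing in for a component’s state, and the sample value are all assumptions.

```python
class Producer:
    """Component A: computes data and announces that it is available."""

    def __init__(self):
        self._data = None
        self._subscribers = []

    def subscribe(self, consumer):
        self._subscribers.append(consumer)

    def get_data(self):
        return self._data

    def compute(self, value):
        self._data = value
        # Publish the event; consumers decide whether to pull the data.
        for consumer in self._subscribers:
            consumer.on_data_available(self)


class Consumer:
    """Component Bk: fetches the data only when its state says to."""

    def __init__(self, name, enabled=True):
        self.name = name
        self.enabled = enabled  # hypothetical stand-in for the component's mode
        self.received = []

    def on_data_available(self, producer):
        if self.enabled:
            self.received.append(producer.get_data())


a = Producer()
b1 = Consumer("B1")
b2 = Consumer("B2", enabled=False)
a.subscribe(b1)
a.subscribe(b2)
a.compute(42)
```

Here B1 ends up with the value and B2 ignores the event, which is the kind of state-dependent behavior that makes these systems hard to reason about by inspection.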

The Boeing challenge problems included things like calculating forward and backward data dependence and tracing the parts of the system influenced by data produced by a particular component (architectural slicing). Boeing had heuristics about the priorities at which various tasks had to run, depending on the dependencies and couplings. Other issues included dependency interactions.

Here’s a specific problem: If component 1 is in mode A when component 2 produces an event E, when will component 3 consume data F? This is a classic model checking problem.

Cadena is an Eclipse plugin that provides support for analysis of middleware architectures, specifically, the CORBA component model (CCM). This analysis can solve specific problems like the one above and others that Boeing was interested in.
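
Forward and backward data dependence can be viewed as reachability over a directed graph of component connections. The Python sketch below assumes a hypothetical dependency map (the component names are invented, not taken from the Boeing system) and uses plain breadth-first search rather than whatever analysis Cadena actually implements.

```python
from collections import deque


def reachable(graph, start):
    """Return every node reachable from start, excluding start itself."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in seen and neighbor != start:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


def reverse(graph):
    """Flip every edge so consumers point back at their producers."""
    flipped = {}
    for source, targets in graph.items():
        for target in targets:
            flipped.setdefault(target, []).append(source)
    return flipped


def forward_slice(graph, component):
    """Components influenced by data that `component` produces."""
    return reachable(graph, component)


def backward_slice(graph, component):
    """Components whose data `component` depends on."""
    return reachable(reverse(graph), component)


# Hypothetical data flow: an edge A -> B means B reads data produced by A.
data_flow = {
    "GPS": ["Navigator"],
    "Airframe": ["Navigator"],
    "Navigator": ["Display", "Autopilot"],
    "Autopilot": ["Display"],
}
```

Calling forward_slice(data_flow, "GPS") gives everything downstream of the GPS component, and backward_slice(data_flow, "Display") gives everything the display ultimately depends on.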

Components can have synchronous and asynchronous connections to each other. Cadena uses CORBA IDL to describe these interfaces. Code templates are automatically generated that can be filled in with the business logic. This is completely integrated in Eclipse and uses the Eclipse Java facilities.

Cadena provides a graphical, spreadsheet, and textual view of how components are connected together. These all operate from the same model, so that changing any one updates all three. Cadena provides tools within the graphical view that do graphical analysis of the overall system including forward and backward slicing, cycle checking, and so on. The spreadsheet view allows components to be filtered and sorted for managing large numbers of components. Pulldown menus include only type-safe choices for component connections.

Configuration and deployment information (in XML) is automatically generated so that components can be assembled on specific network nodes.

Cadena requires engineers to annotate the CORBA IDL with some semantic information so that model checking can analyze the design. This information includes, in order of increasing effort and strength of verification, port action dependencies, state-based dependencies, and component transition semantics. This is a nice iterative approach to getting semantic specification from engineers since they can see real results from easy actions and understand that adding more semantic information will increase the level of analysis they can do. The tool also can check that the semantic specification tracks through to the Java code.

One of the goals was to simplify the view of the system topology by hiding components and connections based on the mode the system is in. The tool can automatically generate these views. In the graphical view, changing the mode of a particular component simplifies the view to show just the components and connections that will be active in that mode.

This talk was an interesting intersection of two things that have held great interest for me in my professional life: formal methods and distributed component-based architectures. It’s surprising to me that there aren’t more tools for analyzing component-based architectures (and service-based architectures) given the modularity of those designs. Cadena shows that this type of analysis is both possible and beneficial.

03:53 PM

January 21, 2006

Ruby on Rails and OS X

Devlin Daley gave a presentation in our 601R class on Rails so that we could discuss frameworks and the choices Rails had made. While he was talking, I poked around a little since I wanted to get Rails going on my Powerbook and found this great little tutorial on getting Rails working on Mac OS X (Tiger).

The tutorial walks you through setting Rails up with SQLite and creating a simple application. I only ran into two problems with the tutorial as written. First, when you load the Ruby Gem for SQLite, it says to type:

sudo gem install sqlite3

In fact, this returns an error. You need to type:

sudo gem install sqlite3-ruby

Second, after you create the SQLite database, you need to set its permissions so that the Web server can get to it:

chmod 666 addressbook.db

I like that this uses SQLite since it’s already installed on Tiger and easy to get going. SQLite is great for just exploring. I also like that it uses Apache since I’m very familiar with it.

12:25 PM | Comments (2)

December 20, 2005

Programming Head Shakers

If you’re not reading The Daily WTF and you program, you really ought to give it a spot in your attention stream. Today’s entry is a classic: using a temporary file instead of sprintf. The comments are pretty good as well, dissecting the code and pointing out all kinds of style problems. A humorous way to learn from bad examples.

09:03 AM | Comments (4)

November 16, 2005

Writable Web: Annotating Manual Pages

I’m doing a project in Perl where I needed to tidy the HTML in some pages. There’s a nifty little Perl package called, appropriately enough, HTML::Tidy. I used CPAN to grab it and got it working, but I’ll be darned if I could figure out how to actually get the tidied text back from the package. If you read the man page, I’ll bet you can’t figure it out either. No useful examples with the code and searching the Web turned up very little. Turns out the clean method returns the cleaned text, although the man page doesn’t say that—it says that it returns “true if all went OK, or false if there was some problem calling Tidy, or parsing Tidy’s output.”

In the process of searching for how to actually use this little gem, I came across the annotated CPAN documentation project. Here’s the annotated page for HTML::Tidy. Notice that someone has annotated the page to correct the oversight of telling you where the tidied output will be, along with some other useful information. Very handy idea. One more instance of the writable Web at work.

01:36 PM

November 09, 2005

Monads in Ruby: Yum!

Just ran across an introduction to using monads in Ruby. If you’re more of a Schemer, you might enjoy this introduction more.

10:34 AM

October 12, 2005

Ways of Thinking, Ways of Doing

In a recent column, Jon Udell says “much of what seems to be modern innovation is, in fact, rediscovery of … Lisp and Smalltalk.” He goes on later to say:

If existing tools can do more than we realize, we could spare ourselves a bit of grief. But probably not a lot. Translating ways of thinking into ways of doing always takes longer than we predict.
From The spiral staircase of SOA | InfoWorld | Column | 2005-09-28 | By Jon Udell
Referenced Wed Oct 12 2005 09:55:00 GMT-0600 (MDT)

This is an interesting point and one that’s under-appreciated, particularly by academics. For example, I’ve frequently maintained that anyone with a CS degree can understand XML and cut through the hype in a few sentences:

  • XML is a way of describing context-free grammars.
  • An XML schema is a BNF for a particular grammar (it can contain more, but this is a good start).
  • XML parsers are interpreted versions of LEX and YACC.
  • A DOM is a standardized parse tree.
  • XSL is an interpreted pretty-printer.

This pretty much says it all except for Jon’s point. Because there’s nothing new in the principles behind XML, good programmers have been using the principles of XML for years, but by creating the “way of doing” we call XML and encapsulating those principles in standards and tools, Tim Bray and others gave those techniques to the masses.
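
The “DOM is a standardized parse tree” point is easy to see with the parser in Python’s standard library. In this sketch the book document is made up for illustration; xml.dom.minidom.parseString plays the role of the generated parser, and walking the result is ordinary tree traversal.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

# Hypothetical document, for illustration only.
SOURCE = "<book><title>SICP</title><author>Abelson</author><author>Sussman</author></book>"


def outline(node, depth=0):
    """Walk the DOM like any parse tree, yielding (depth, tag) for elements."""
    if node.nodeType == Node.ELEMENT_NODE:
        yield depth, node.tagName
        depth += 1
    for child in node.childNodes:
        yield from outline(child, depth)


document = parseString(SOURCE)
tree = list(outline(document))
authors = [
    element.firstChild.data
    for element in document.getElementsByTagName("author")
]
```

Anyone who has written a recursive walk over a YACC-built syntax tree will recognize outline immediately; the only new part is that the node interface is standardized.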

09:49 AM | Comments (3)

October 03, 2005

Persistence Configuration in EJB 3.0

In EJB 3.0, persistence is done using plain old Java objects (POJOs). As far as I know, JBoss is the only J2EE-capable application server supporting EJB 3.0 at this point. In the JBoss implementation, the Hibernate roots of persistent POJOs are still very much visible. That’s good news, since it means that much of the Hibernate documentation can be used to understand EJB 3.0.

In JBoss, the default persistence properties are stored in

$JBOSS_HOME/server/all/deploy/ejb3.deployer/META-INF/persistence.properties

The meaning of most of the configuration parameters you see there can be found in the Hibernate configuration documentation.

By default, the persistence configuration on JBoss’s EJB3.0 says to create new tables each time the application is deployed and to drop tables when it’s undeployed:

hibernate.hbm2ddl.auto=create-drop

This means that your data will be lost each time you redeploy your application. That’s probably OK when you’re developing your entities, but not what you want for a production database. To change this, you can change the line to read

hibernate.hbm2ddl.auto=update

Theoretically, you can override the default in your persistence.xml deployment descriptor:

<entity-manager>
  <name>Animal</name>
  <jta-data-source>java:/DefaultDS</jta-data-source>
  <properties>
    <property name="hibernate.hbm2ddl.auto" value="update"/>
  </properties>
</entity-manager>

I found, however, in a few simple tests, that this didn’t work.
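
Since create-drop silently discards data on redeploy, it can be worth checking the setting before pointing a server at a real database. The following Python sketch is an assumption-laden illustration: it parses simple key=value lines the way the persistence.properties excerpts above are written, and the helper names are hypothetical. It does not handle the full Java properties syntax (escapes, line continuations, colon separators).

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # or ! comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")) or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def is_destructive(settings):
    """True when the schema is dropped or recreated on (un)deployment."""
    return settings.get("hibernate.hbm2ddl.auto") in {"create", "create-drop"}


# Sample contents mirroring the default and the suggested change.
default_config = parse_properties("# JBoss default\nhibernate.hbm2ddl.auto=create-drop\n")
safer_config = parse_properties("hibernate.hbm2ddl.auto=update\n")
```

In practice you would read the text from the properties file itself and refuse to deploy when is_destructive returns True.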

The default data source in JBoss is the Hypersonic database. This is an easy way to get going. JBoss stores the data in

$JBOSS_HOME/server/all/data/hypersonic

with the name localDB. This name, along with other configuration parameters for the Hypersonic DB, which is given the JNDI name DefaultDS, can be found in this file:

$JBOSS_HOME/server/all/deploy/hsqldb-ds.xml   

08:05 AM | Comments (3)

September 29, 2005

First Impressions on EJB3.0

I’ve been getting some EJB3.0 stuff together for my class and posted some of my thoughts over at BTL:

First and foremost, the beans in EJB3.0 are significantly less complex. Entity beans are just plain old Java objects (POJOs) and the container manages the mapping of these objects to a relational database and the persistence of POJOs to the database. For example, the interfaces for EJBs do not have to implement EJBObject or EJBLocalObject. In addition, lifecycle methods like ejbPassivate, ejbActivate, ejbLoad, ejbStore, etc. are no longer required.

The metadata annotation feature of Java 1.5 is put to use in annotating entity and session beans to give hints as to the behaviors that you want from the bean. In the past I’ve used XDoclet to simplify bean building, but EJB3.0 doesn’t require it because of the annotations.

While I’ve not done any performance tests, they seem faster. Now in addition to moving to EJB3.0, I also moved from JBoss 3 to JBoss 4 and that could be the difference.

The change to POJOs for entity beans has been advertised as Hibernate in EJB, but you don’t feel like you’re using Hibernate; the connection is more conceptual than anything else. I’ve only used the mapping from entity objects to the relational table, not the other way around.

Documentation is sparse. There’s been lots of questions that I’ve had trouble finding answers to. I’m sure that will change. As long as your application follows the few examples fairly closely, you’ll be able to gather what to do from them, but that only goes so far.
From » First reactions to EJB 3.0 | Between the Lines | ZDNet.com
Referenced Thu Sep 29 2005 13:35:47 GMT-0600 (MDT)

I’m still exploring, so more to come…

01:34 PM

August 10, 2005

Overloading: Syntactic Heroin

ACM Queue has an article entitled Syntactic Heroin which says that user-defined overloading (ad hoc polymorphism) is a drug.

User-defined overloading is a drug. At first, it gives you a quick, feel-good fix. No sense in cluttering up code with verbose and ugly function names such as IntAbs, FloatAbs, DoubleAbs, or ComplexAbs; just name them all Abs. Even better, use algebraic notation such as A+B, instead of ComplexSum(A,B). It certainly makes coding more compact. But a dangerous addiction soon sets in. Languages and programs that were already complex enough to stretch everyone’s ability suddenly get much more complicated.
From ACM Queue - Syntactic Heroin
Referenced Fri Aug 05 2005 09:30:29 GMT-0700 (PDT)

This echoes comments that Damian Conway made last week at OSCON regarding Best Perl Practices. Students seem to be especially taken with overloading when they learn about it. It’s a novelty to be able to define syntax that looks like it’s built in. This article points out the dangers.
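
For readers who have not felt the pull, here is what the quick fix looks like in Python, which allows user-defined operator overloading through methods such as __add__ and __abs__. The Complex class is a toy written for this sketch (Python already has a built-in complex type), shown next to the explicit function style that the quote describes.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Complex:
    """Toy complex number used to contrast overloaded and explicit styles."""

    re: float
    im: float

    def __add__(self, other):
        # User-defined overloading: A + B now works for Complex values.
        return Complex(self.re + other.re, self.im + other.im)

    def __abs__(self):
        # The built-in abs() dispatches here, so one name covers many types.
        return math.hypot(self.re, self.im)


def complex_sum(a, b):
    """Explicit alternative: verbose, but the call site says what runs."""
    return Complex(a.re + b.re, a.im + b.im)


a = Complex(3.0, 4.0)
b = Complex(1.0, -4.0)
overloaded = a + b
explicit = complex_sum(a, b)
```

Both styles compute the same value. The difference is that a + b gives the reader no hint about which code runs, and that is where the article says the trouble starts.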

10:28 AM | Comments (2)