Mar 30, 2010

The Problem of Incomplete Javadocs

Good and comprehensive documentation is crucial for the success of open source software. But creating such documentation takes time and energy, is boring and has almost no immediate rewards. Consequently, documentation of open source frameworks is (too) often incomplete or outdated.

However, whenever there are users of a framework there is example code that uses the framework's API. And if there is example code, the question arises whether information about how to use the framework's API can be extracted directly from example code.

We think so, and thus started to study how documentation could be completed by automatically mined documentation. So far we have concentrated on mining documentation required for developers who plan to extend a given base class, and created what we call "subclassing directives" from code.

In a nutshell, subclassing directives are generalizations of frequently made observations in code, such as "Subclasses of Wizard always override its method addPages()" or "Reimplementors of Dialog.createContents() may call its super implementation." Our findings are summarized in our paper "Mining Subclassing Directives", published at the 7th Working Conference on Mining Software Repositories 2010, which takes place in May 2010. The Extended Javadoc View presented here is a result of this research work.
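
To make the idea more concrete, here is a minimal sketch of how an override directive could be derived from observations. It is not the mining algorithm from the paper: the class OverrideDirectiveSketch, its methods, the wording thresholds, and the input data are all hypothetical and chosen only for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical sketch, not the mining algorithm described in the paper.
class OverrideDirectiveSketch {

    // For every method name, computes the fraction of observed subclasses that override it.
    static Map<String, Double> overrideFrequencies(List<Set<String>> overridesPerSubclass) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Set<String> overrides : overridesPerSubclass) {
            for (String method : overrides) {
                counts.merge(method, 1, Integer::sum);
            }
        }
        Map<String, Double> frequencies = new TreeMap<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            frequencies.put(entry.getKey(),
                    entry.getValue() / (double) overridesPerSubclass.size());
        }
        return frequencies;
    }

    // Turns a frequency into directive wording; the thresholds are arbitrary assumptions.
    static String directive(String baseClass, String method, double frequency) {
        String modal = frequency >= 1.0 ? "always" : frequency >= 0.5 ? "often" : "sometimes";
        return "Subclasses of " + baseClass + " " + modal + " override " + method + "()";
    }

    public static void main(String[] args) {
        // Made-up observations: methods overridden by four imaginary Wizard subclasses.
        List<Set<String>> observed = List.of(
                Set.of("addPages", "performFinish"),
                Set.of("addPages", "performFinish", "canFinish"),
                Set.of("addPages", "performFinish"),
                Set.of("addPages"));
        overrideFrequencies(observed).forEach((method, frequency) ->
                System.out.println(directive("Wizard", method, frequency)));
    }
}
```

A real miner would of course extract the observations from compiled or parsed example code instead of hard-coding them, and would need a more careful way to decide when an observation is frequent enough to become a directive.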

This post describes the basic concepts behind the Extended Javadoc View, gives some examples of how mined documentation could be integrated non-intrusively into Eclipse, and explains how others may extend the view with their own documentation providers. Please note that this project is still work in progress: much more work is ongoing (see the Sketchbook Page about the proposal), and we appreciate your feedback.

The Extended Javadoc View

The Extended Javadoc View is essentially an aggregator of different information sources for a single code element such as a class, method, field or parameter. It is designed as a replacement for the existing Eclipse Javadoc View and provides basically the same functionality. Let's walk through the existing documentation providers.
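
The aggregator idea can be sketched roughly as follows. Please treat this purely as an illustration: the DocumentationProvider interface, the tabsFor() and fixed() methods, and the sample texts are hypothetical names invented for this sketch and are not the actual extension API of the Extended Javadoc View.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch; the real view's provider API may look quite different.
class AggregatorSketch {

    // Assumed provider contract: one tab per provider, content is optional.
    interface DocumentationProvider {
        String tabTitle();
        Optional<String> describe(String codeElement);
    }

    // Asks every provider about the element and keeps one entry per tab,
    // skipping providers that have nothing to say about it.
    static Map<String, String> tabsFor(List<DocumentationProvider> providers, String codeElement) {
        Map<String, String> tabs = new LinkedHashMap<>();
        for (DocumentationProvider provider : providers) {
            provider.describe(codeElement)
                    .ifPresent(text -> tabs.put(provider.tabTitle(), text));
        }
        return tabs;
    }

    // Helper that builds a provider from a fixed lookup table (made-up data).
    static DocumentationProvider fixed(String title, Map<String, String> docs) {
        return new DocumentationProvider() {
            public String tabTitle() { return title; }
            public Optional<String> describe(String codeElement) {
                return Optional.ofNullable(docs.get(codeElement));
            }
        };
    }

    public static void main(String[] args) {
        List<DocumentationProvider> providers = List.of(
                fixed("Javadoc", Map.of("Wizard", "Standard Javadoc text would go here.")),
                fixed("Subclassing Directives", Map.of("Wizard", "Mined directives would go here.")),
                fixed("Subclassing Patterns", Map.of("ViewerComparator", "Mined patterns would go here.")));
        tabsFor(providers, "Wizard").forEach((tab, text) ->
                System.out.println("[" + tab + "] " + text));
    }
}
```

The point of such a design is that the view itself stays ignorant of where the documentation comes from, so a new source only has to implement the provider contract.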

Javadoc Tab

The screenshot below shows the view displaying the standard Javadoc information of the JFace Dialog class.

But replacing one view with another is not a big deal. The interesting part comes with the other tabs in the view: Subclassing Directives and Subclassing Patterns. These tabs contain mined information about how developers typically extended the selected code element. Let’s look at the Subclassing Directives tab in more detail now.

Subclassing Directives Tab

As said above, subclassing directives are generalizations of frequently made observations in example code, such as "Subclasses of Wizard always override its method addPages()" or "Reimplementors of Dialog.createContents() may call its super implementation". The screenshots below give two examples of how these mined directives are presented to a user.
The first screenshot gives a quick summary of which methods are typically overridden by subclasses of JFace Wizard. The second screenshot takes a detailed look at Wizard's addPages() method and tells a developer which methods are frequently called within the control flow of addPages(), namely Wizard.addPage() and Wizard.addPages(). For both methods, the view shows how frequently they have actually been called, so that developers can decide whether these methods are relevant for their task at hand.

Such subclassing directives are currently mined for almost all Eclipse 3.5 classes for which extensions could be found in our example code base.
However, displaying which methods to override and to call is just one thing you can do with an extended documentation provider. Let's look at the Subclassing Patterns tab in more detail.

Subclassing Patterns Tab

Subclassing patterns try to group observed extensions of a base class into typical extension patterns, i.e., they cluster subclasses by similarity to find patterns in the data. For an illustration of the results, look at the screenshots below. The first picture shows the frequent subclassing patterns found for the JFace ViewerComparator class. It states that typically either the method ViewerComparator.compare() or ViewerComparator.category() is overridden, but typically not both at the same time (even though that is possible). It also states that extenders typically stick with the first pattern (~82%) and follow pattern two in only 19% of cases.
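
As a rough illustration of how such patterns and their percentages could come about, the following sketch groups subclasses by the exact set of methods they override and reports each group's share. Note that this exact-match grouping is a deliberate simplification and not the similarity-based clustering used for the view; the class SubclassingPatternSketch, the method patternShares(), and the input data are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch: exact-match grouping instead of real similarity clustering.
class SubclassingPatternSketch {

    // Groups subclasses by the exact set of methods they override and returns
    // each pattern's share of all observed subclasses.
    static Map<String, Double> patternShares(List<Set<String>> overridesPerSubclass) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Set<String> overrides : overridesPerSubclass) {
            // Sort the method names so that equal sets always produce the same key.
            String pattern = String.join(" + ", new TreeSet<>(overrides));
            counts.merge(pattern, 1, Integer::sum);
        }
        Map<String, Double> shares = new TreeMap<>();
        counts.forEach((pattern, count) ->
                shares.put(pattern, count / (double) overridesPerSubclass.size()));
        return shares;
    }

    public static void main(String[] args) {
        // Made-up observations for five imaginary ViewerComparator subclasses.
        List<Set<String>> observed = List.of(
                Set.of("compare"), Set.of("compare"), Set.of("compare"),
                Set.of("compare"), Set.of("category"));
        patternShares(observed).forEach((pattern, share) ->
                System.out.printf("%s: %.0f%%%n", pattern, share * 100));
    }
}
```

Exact-match grouping breaks down as soon as subclasses differ in one rarely overridden method, which is exactly why a similarity-based clustering is the more useful choice on real data.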

Some patterns can also be found for the JFace Dialog class. Here developers typically override the createDialogArea() method and often the methods okPressed() and configureShell(). However, other patterns also exist that respond directly to buttonPressed events.

For JFace Wizard two patterns can be found: the standard pattern (overriding performFinish() and addPages()) and a mixture of several other ways of extending Wizard.


Venturing a Look at the Future

In my opinion, the current Extended Javadoc View is an interesting approach that shows what can be found in client code. But much more can be found in code that might be used to enrich existing documentation, and to make this come true many more aspects need to be considered. We are currently working on a draft for a task-oriented, crowd-sourced API documentation that is grounded (at least partially) in mined documentation. How do you feel about that? Does this sound interesting for Eclipse? We appreciate your comments!