May 15, 2011

A Subword Matching Completion Engine for Eclipse?

A few hours ago I stumbled across a post titled "Eclipse: Fulltext autocompletion for Java?" in which a programmer essentially asks whether Eclipse's code completion supports partial proposal matching.

I copied the question here to make this post a bit more self-contained:


Hi, can I have fulltext autocompletion for Java @ Eclipse? Let's demonstrate:
Final piece of code:

getVariants().add(new Variant(MediaType.TEXT_XML));
How do I code now:
getv[ctrl+space].a[Enter]new V[ctrl+space, down arrow, Enter]M[Ctrl+Space, Enter].text_x
Basically, Eclipse completes word "TEXT_XML" when I provide letters "TEXT_X".
How would I like to code:
getv[ctrl+space].a[Enter]new V[ctrl+space, down arrow, Enter]M[Ctrl+Space, Enter].xml
and Eclipse should realise I meant "TEXT_XML" (fulltext autocompletion).


Well, thanks to JDT's quite comprehensive APIs this is not too complex to implement, so I wrote a simple version of such a completion engine (about 10 lines of code).

This is what it looks like in action now:

But now I wonder:

  1. How often do you just remember a few letters but don't know the complete name of a method or member?
  2. Would you like to see such a completion engine in Eclipse?
  3. What else would be missing?

I would be glad if you could send your thoughts to the Code Recommenders forum.


Stay in touch. Follow me on Twitter.

Update #1: Neat research on this topic.

I just found a paper (almost) exactly on this topic. Could you check out these screencasts?
Maybe it's even more than you/we would like to see :-) But I think that's pretty cool stuff, and having a similar completion engine for Eclipse could be very, very cool. Internally it's based on Hidden Markov Models which learn how people abbreviated certain identifiers (like stx --> 'setText') in the past. The nice thing about it is that we don't have to come up with our own matching algorithm but can just use what programmers used before... I like Han's work :)

Update #2: Prototype available.

After a quick and dirty Monday afternoon hack, a first prototype is ready. In a nutshell, it takes the prefix you entered and creates a regular expression from it. For example, when you enter button.stx|<^Space>, the engine presents only those proposals that match the regular expression .*s.*t.*x.*, which may finally match "setText(text)". This is roughly what Maxime proposed in his comment below.

In addition, it now respects upper case letters as suggested by Deepak. For example, button.sT<^Space> now evaluates to '.*s.*T.*' whereas button.st<^Space> would evaluate to '.*s.*[tT].*'.
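The matching rule described above can be sketched in a few lines of plain Java. This is not the prototype's actual code: the class SubwordMatcher and its methods toPattern and filter are hypothetical names, and the sketch works on plain strings rather than on JDT completion proposals. It also assumes that every lower case letter (including the first one) may match either case, so that "xml" can find "TEXT_XML" as asked for in the original question, while an upper case letter only matches itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of JDT or the prototype.
final class SubwordMatcher {

    // Turns the typed token into a regular expression such as ".*[sS].*T.*".
    static Pattern toPattern(String token) {
        StringBuilder regex = new StringBuilder(".*");
        for (char c : token.toCharArray()) {
            if (Character.isLowerCase(c)) {
                // Assumption: a lower case letter matches both cases.
                regex.append('[').append(c).append(Character.toUpperCase(c)).append(']');
            } else if (Character.isLetterOrDigit(c)) {
                // Upper case letters and digits must match exactly.
                regex.append(c);
            } else {
                // Quote anything that could be a regex metacharacter, e.g. '$'.
                regex.append(Pattern.quote(String.valueOf(c)));
            }
            regex.append(".*");
        }
        return Pattern.compile(regex.toString());
    }

    // Keeps only those proposals whose text matches the token's pattern.
    static List<String> filter(String token, List<String> proposals) {
        Pattern pattern = toPattern(token);
        List<String> matches = new ArrayList<>();
        for (String proposal : proposals) {
            if (pattern.matcher(proposal).matches()) {
                matches.add(proposal);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String> proposals = Arrays.asList(
                "setText(text)", "getText()", "setToolTipText(text)", "TEXT_XML");
        System.out.println(filter("stx", proposals));
        System.out.println(filter("xml", proposals));
    }
}
```

In a real engine the same filter would have to be applied to the proposals JDT computes for an empty prefix, since JDT itself only returns proposals that start with the typed token.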

Currently, it neither reduces the proposal score, as recommended by another comment, nor does it give precedence to those words that match the prefix - but we may do that if you find this more useful. Just drop a comment in the forum mentioned below.

Please find the installation instructions here:

Please send your comments to the forum here:
