
Evolutionary vs Extensive: Refactorbot explained

Wednesday, August 1st, 2007

Yesterday, I was discussing the Refactorbot with a friend of mine. Although he has been a developer for longer than I have, he couldn’t grasp why the Refactorbot could not possibly test every solution and had to fall back on evolutionary design.

A bit of maths

If you want to test every possible solution for a given design, you have to test every possible number of packages to hold classes and every possible combination of those classes in the packages.

If you have studied some probability, statistics or enumerative combinatorics, you will certainly see where I am going with this…

I have never been very good at mathematics myself, but my guess is that the number of possible solutions to our design problem can be represented with the following formula.

This is the sum of all arrangements (ordered selections without repetition) of n classes in p packages, for p ranging from 1 (all classes in 1 package) to n (each class in its own package):

$$S(n) = \sum_{p=1}^{n} A(n, p)$$

with

$$A(n, p) = \frac{n!}{(n - p)!}$$

(for example, for 30 classes and 5 packages, this gives us 17100720 possibilities for this case alone)
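
The figure quoted for 30 classes and 5 packages equals 30!/(30-5)! = 30 × 29 × 28 × 27 × 26. The following is a minimal Java sketch of that count, assuming n!/(n-p)! is indeed the per-package-count formula; the class and method names are made up for illustration and are not part of the Refactorbot.

```java
import java.math.BigInteger;

/** Illustrative only: counts candidate designs, assuming n!/(n-p)! layouts for p packages. */
final class DesignCount {

    /** n! / (n - p)!, computed as n * (n-1) * ... * (n-p+1) to avoid huge factorials. */
    static BigInteger arrangements(int n, int p) {
        if (p < 0 || p > n) {
            throw new IllegalArgumentException("need 0 <= p <= n");
        }
        BigInteger result = BigInteger.ONE;
        for (int i = 0; i < p; i++) {
            result = result.multiply(BigInteger.valueOf(n - i));
        }
        return result;
    }

    /** Sum of arrangements(n, p) for p = 1..n. */
    static BigInteger totalSolutions(int n) {
        BigInteger total = BigInteger.ZERO;
        for (int p = 1; p <= n; p++) {
            total = total.add(arrangements(n, p));
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(arrangements(30, 5));   // prints 17100720
        System.out.println(totalSolutions(30));    // the grand total for 30 classes
    }
}
```

For 30 classes the grand total is a 33-digit number (roughly 7 × 10^32), which gives a feel for why exhaustive testing does not scale.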

It quickly becomes clear that a program cannot test each and every possible design for a sizeable system: there are simply too many possibilities to test in a lifetime!

Refactorbot v0.0.0.3

If you want to have fun staring at an empty console while your CPU is hitting 100%, I have created an implementation of such a refactoring method.

You can download the Refactorbot v0.0.0.3 which includes this method. Good luck!

The case for small systems

As much as my mathematical demonstration convinced me, myself and I, I had to tell you about my experiments on the design challenge (included in the test cases for the Refactorbot)…

It seems that for a small number of classes, say 8, the method is not only quick but very effective: I found the best ever solution to the design challenge!


ref model: 3 packages/8 classes (D=0.4444444444444445, R=0.8333333333333334)
-> p1 (3 classes): B A C
-> p2 (3 classes): D E F
-> p3 (2 classes): G H

new model: 3 packages/8 classes (D=0.08333333333333333, R=1.0)
-> p0 (4 classes): G C A B
-> p1 (3 classes): E H D
-> p2 (1 classes): F

However, don’t be fooled by this stunning result! It would become increasingly tedious to try and find a solution for bigger and bigger systems.

The deal with evolutionary refactorings

I don’t think that the purpose of the Refactorbot is to come up with the best possible design (I will soon write a piece about a definition of the BEST design), but rather with a better design.

If we consider that the design of a piece of software is its DNA, then you can probably try small improvements and keep them or discard them in an evolutionary fashion. Better design would emerge by trial and error; the Refactorbot should do just that.

What is perfect design anyway?


Refactorbot v0.0.0.2: just use it!

Friday, July 27th, 2007

Following the term coined by Jason Gorman, here is a new version of the Refactorbot that you can use to test your models with:

download Zip file
(binaries + libraries + source + Eclipse workspace = 4Mb).

XMI loading capability

In this version, you can feed it an XMI file (I create mine with ArgoUML) and it will attempt to create a better model for it.

The XMI loading, however, is very - very - slow for big models… I coded it using XPath (Xalan-j precisely) and that proved so sluggish that I implemented a cache system: after the first creation of the in-memory model of your XMI file (be patient), it will create a .ser version of your XML file in the same directory and reuse it on subsequent runs.

Because of the nature of the algorithm (using random refactorings), you may want to execute the program many times for the same model, and I can guarantee you that this cache will come in quite handy!
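
The caching idea can be sketched with standard Java serialization. This is only a guess at the general shape, not the Refactorbot's code: the class name, the `source path + ".ser"` naming and the timestamp check that ignores a stale cache are all assumptions made for illustration.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.function.Function;

/** Illustrative only: caches a slowly built model next to its source file. */
final class SerializedModelCache {

    /**
     * Returns the cached model if a usable .ser file exists, otherwise runs
     * the slow loader (e.g. the XPath parsing) and writes the cache.
     */
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T loadOrBuild(File source, Function<File, T> slowLoader) {
        // Assumed naming scheme: model.xmi -> model.xmi.ser, in the same directory.
        File cache = new File(source.getPath() + ".ser");
        try {
            // Assumption: a cache older than its source is treated as stale.
            if (cache.isFile() && cache.lastModified() >= source.lastModified()) {
                try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(cache))) {
                    return (T) in.readObject();
                }
            }
            T model = slowLoader.apply(source);
            try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(cache))) {
                out.writeObject(model);
            }
            return model;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("Cached model class not found", e);
        }
    }
}
```

The in-memory model class has to implement `Serializable` for this to work, and the cache has to be deleted by hand whenever that class changes shape.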

New refactoring algorithm

I have implemented a new algorithm that changes only one class at each iteration: it will randomly select a class, randomly decide to create a new package and/or randomly choose the package to put the class in. It will then run the metrics again and keep or discard the new model based on the new metrics.
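
The loop described above can be sketched as follows. This is a minimal illustration written for a current JDK, not the Refactorbot's source: the class name, the map-based model and the `score` function (a stand-in for the real metrics, where higher means better) are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.TreeSet;
import java.util.function.ToDoubleFunction;

/** Illustrative only: move one randomly chosen class per iteration. */
final class SingleMoveSearch {

    /**
     * @param start class name -> package name (the reference model)
     * @param score stand-in for the metrics; higher is better (an assumption)
     */
    static Map<String, String> improve(Map<String, String> start,
                                       ToDoubleFunction<Map<String, String>> score,
                                       int iterations, Random random) {
        Map<String, String> best = new HashMap<>(start);
        double bestScore = score.applyAsDouble(best);
        List<String> classes = new ArrayList<String>(new TreeSet<String>(best.keySet()));
        int created = 0;
        for (int i = 0; i < iterations && !classes.isEmpty(); i++) {
            // Randomly select a class.
            String cls = classes.get(random.nextInt(classes.size()));
            // Randomly decide between a brand new package and an existing one.
            String target;
            if (random.nextBoolean()) {
                do {
                    target = "new" + created++;
                } while (best.containsValue(target));
            } else {
                List<String> packages = new ArrayList<String>(new TreeSet<String>(best.values()));
                target = packages.get(random.nextInt(packages.size()));
            }
            // Re-run the metrics and keep the move only if it is an improvement.
            Map<String, String> candidate = new HashMap<>(best);
            candidate.put(cls, target);
            double candidateScore = score.applyAsDouble(candidate);
            if (candidateScore > bestScore) {
                best = candidate;
                bestScore = candidateScore;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        String[][] arcs = {{"B", "A"}, {"A", "C"}, {"C", "G"}, {"B", "D"},
                           {"E", "D"}, {"E", "F"}, {"H", "D"}, {"H", "F"}};
        Map<String, String> reference = new HashMap<>();
        for (String c : new String[] {"A", "B", "C"}) reference.put(c, "p1");
        for (String c : new String[] {"D", "E", "F"}) reference.put(c, "p2");
        for (String c : new String[] {"G", "H"}) reference.put(c, "p3");
        // Toy score, NOT the D and R metrics: number of arcs kept inside one package.
        ToDoubleFunction<Map<String, String>> toyScore = m -> {
            int internal = 0;
            for (String[] arc : arcs) {
                if (m.get(arc[0]).equals(m.get(arc[1]))) internal++;
            }
            return internal;
        };
        System.out.println(improve(reference, toyScore, 1000, new Random()));
    }
}
```

The toy score in `main` trivially favours one big package; a real run would plug in metrics that balance against that, which is exactly the part this sketch leaves out.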

Don’t worry, the original algorithm that attempted to build a completely new model is still here. It is just that I thought it would be interesting to have different algorithms to play with.

Furthermore, I think that this second algorithm is closer to Jason’s initial view that the Refactorbot would do 1 random refactoring and then run all tests to prove that the system has been improved…

Using it

For you lucky Windows users with Java SE 1.5+ already installed, there’s a batch file in the archive that lets you launch the application directly; just run:

refactorbot.bat myxmifile.xmi 1000 p

The others will have to install a version of Java greater than or equal to 1.5 and launch the refactorbot.improvemetrics.ImproveMetrics class. The required libraries are provided in the lib folder.

The output is still very crude because it will only tell you the list of packages it has generated and the classes they contain. I should produce an XMI output very soon, but that’ll wait until I learn a bit more about XMI!

Your impressions

My own results are quite encouraging: I have tried the Refactorbot with a sizeable model (177 classes in 25 packages), and although the first loading of the CSV file is slow (it has 625 associations in a 20Mb file, and that’s what takes most of the time), the improvement of the model is quite fast! Granted, it is quite easy to improve on that model (that I reverse-engineered from a project’s codebase with ArgoUML), but the insight I got was still invaluable!

However, this is probably the first usable version of the Refactorbot, so I would like to hear about your own experience with the automatic refactoring of your own models! Send me an email at contact@<my domain here>.com, that will help me improve the program…

Oh and by the way, I care about software!


Automated Design Improvement

Friday, July 20th, 2007

Jason Gorman, again, inspired me to write a new blog post. Some time ago, he offered an OO design challenge in which a design’s maintainability had to be improved upon without any knowledge of the underlying business case.
I quickly gathered that you could rely solely on metrics to determine the quality level of a design and did a few trials myself (see below) to improve on the metrics[1].
Jason posted his own solution yesterday, and I suddenly remembered one of his earlier posts that suggested we should be able to automate this process. I detail such a solution later in this article and I give one possible (better?) solution to the design challenge.

Trying to find a methodology

My first attempt at making a better model was to try and see the patterns “through” the model. I moved a few classes around in ArgoUML, generated the Java classes and ran the Metrics plugin for Eclipse… alas, although the normalized distance from main sequence[1] was quite good (D=0.22), the relational cohesion (R=0.55) was worse than the reference model’s (R=0.83)!

[Figure: First attempt at design challenge]

In order to improve a model’s metrics consistently, I had to devise a methodology.
I ordered the classes by their distance in the dependency graph: the more dependable, the better for maintainability. The dependency arcs are as follows:

B -> A -> C -> G
B -> A -> C
B -> A
B -> D
E -> D
E -> F
H -> D
H -> F

This prompted me to put the classes in four different packages like this:

[Figure: Second attempt at design challenge]

Not very different from the model I created without any methodology, but it has a major flaw: there is a cycle between p2 and p3! (And the metrics are awful too: D=0.40 and R=0.66.)
Moving class A back to package p2 does break the cycle and improve the normalized distance, though only slightly (D=0.375).
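
A package cycle like this one can be detected mechanically by lifting the class-level dependency arcs to package level and looking for a directed cycle. The Java sketch below does that with a depth-first search. The class and method names are hypothetical, and the layouts in `main` are the challenge's reference model (p1: B A C, p2: D E F, p3: G H) plus a made-up cyclic variant; neither is the four-package layout pictured above.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Illustrative only: detects dependency cycles between packages. */
final class PackageCycles {

    /** Lifts class-level arcs {from, to} to package-level dependencies. */
    static Map<String, Set<String>> packageGraph(String[][] classArcs,
                                                 Map<String, String> packageOf) {
        Map<String, Set<String>> graph = new HashMap<>();
        for (String pkg : packageOf.values()) {
            graph.computeIfAbsent(pkg, k -> new HashSet<>());
        }
        for (String[] arc : classArcs) {
            String from = packageOf.get(arc[0]);
            String to = packageOf.get(arc[1]);
            if (!from.equals(to)) {
                graph.get(from).add(to);   // arcs inside one package are ignored
            }
        }
        return graph;
    }

    /** True if the package graph contains a directed cycle (depth-first search). */
    static boolean hasCycle(Map<String, Set<String>> graph) {
        Set<String> done = new HashSet<>();
        Set<String> onPath = new HashSet<>();
        for (String start : graph.keySet()) {
            if (visit(start, graph, done, onPath)) return true;
        }
        return false;
    }

    private static boolean visit(String node, Map<String, Set<String>> graph,
                                 Set<String> done, Set<String> onPath) {
        if (onPath.contains(node)) return true;   // back edge: cycle found
        if (done.contains(node)) return false;    // already fully explored
        onPath.add(node);
        for (String next : graph.get(node)) {
            if (visit(next, graph, done, onPath)) return true;
        }
        onPath.remove(node);
        done.add(node);
        return false;
    }

    public static void main(String[] args) {
        // The dependency arcs listed in the post, one pair per arc.
        String[][] arcs = {{"B", "A"}, {"A", "C"}, {"C", "G"}, {"B", "D"},
                           {"E", "D"}, {"E", "F"}, {"H", "D"}, {"H", "F"}};
        Map<String, String> reference = new HashMap<>();
        for (String c : new String[] {"A", "B", "C"}) reference.put(c, "p1");
        for (String c : new String[] {"D", "E", "F"}) reference.put(c, "p2");
        for (String c : new String[] {"G", "H"}) reference.put(c, "p3");
        System.out.println(hasCycle(packageGraph(arcs, reference)));   // false

        // Hypothetical variant: moving D into p1 makes p1 and p3 depend on each other.
        Map<String, String> variant = new HashMap<>(reference);
        variant.put("D", "p1");
        System.out.println(hasCycle(packageGraph(arcs, variant)));     // true
    }
}
```

In the variant, C -> G gives p1 -> p3 while H -> D gives p3 -> p1, hence the cycle.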

Automating the search for a solution

At that point, I went to bed and left it running as a background thought until Jason posted his own solution to the challenge… The way he proceeded with the refactorings reminded me of one of his earlier posts (which I can’t seem to find any more) suggesting that we might be able to build a robot to perform random refactorings, which would be kept if they improved the overall design of a system… If I couldn’t devise a method for solving this problem, I had better leave it to chance completely!

So I built a simple version of such a lucky robot with a very simple design, that would just pick classes from the reference model and, for each of them, decide randomly if it should create a new package or choose, still randomly, a package to put it in…
Once the model is fully built, it runs a few metrics and compares them to those of the reference model; if they show an improvement, it keeps the generated model as the new reference model (otherwise it discards it), and then does another cycle.
And here is what it produced, after a few thousand cycles:

[Figure: Third attempt at design challenge]

It is definitely much more complex than any other model I could have come up with by hand, but it translates into nice metrics: D=0.1875 and R=1.0!
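
The generate-and-compare cycle of this lucky robot can be sketched in a few lines of Java. This is only an illustration of the idea, not the downloadable source: the class name, the integer package ids and the `score` function (a stand-in for the D and R metrics, higher meaning better) are assumptions.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.function.ToDoubleFunction;

/** Illustrative only: rebuild a whole random model each cycle, keep it if it scores better. */
final class WholeModelSearch {

    /** For each class, randomly create a new package or pick an existing one. */
    static Map<String, Integer> randomModel(List<String> classes, Random random) {
        Map<String, Integer> model = new LinkedHashMap<>();
        int packageCount = 0;
        for (String cls : classes) {
            // The very first class always needs a new package.
            boolean createNew = packageCount == 0 || random.nextBoolean();
            int pkg = createNew ? packageCount++ : random.nextInt(packageCount);
            model.put(cls, pkg);
        }
        return model;
    }

    /**
     * @param reference class name -> package id of the starting model
     * @param score     stand-in for the metrics; higher is better (an assumption)
     */
    static Map<String, Integer> search(Map<String, Integer> reference,
                                       ToDoubleFunction<Map<String, Integer>> score,
                                       int cycles, Random random) {
        List<String> classes = new ArrayList<>(reference.keySet());
        Map<String, Integer> best = reference;
        double bestScore = score.applyAsDouble(best);
        for (int i = 0; i < cycles; i++) {
            Map<String, Integer> candidate = randomModel(classes, random);
            double candidateScore = score.applyAsDouble(candidate);
            if (candidateScore > bestScore) {
                // The better model becomes the new reference for later cycles.
                best = candidate;
                bestScore = candidateScore;
            }
        }
        return best;
    }
}
```

Because every cycle starts from scratch, the candidates are independent of each other; only the comparison against the current reference carries information from one cycle to the next.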

This leads me to believe that with a bit of work and time we could come up with a more advanced solution that would help designers get the best out of their design without the risk of breaking everything…

You can download the rather crude source code if you wish to have a look at the application.

[1] see http://aopmetrics.tigris.org/metrics.html for a good explanation of a few software metrics
