I'm looking forward to my talk on automated self-testing and health checks of live Adobe CQ instances (now part of Adobe Experience Manager) at CQCON 2013 in Basel. A number of Apache Sling and CQ5 developers have been working on various tools to check the health of a live CQ or Sling instance, and we seem to be converging on a set of simple tools that can be very helpful for system administrators and developers.
The focus of my talk is the SLING-2822 prototype that I've been working on recently, which lets you create health checks for Sling systems in an extensible way. Out of the box, you can configure rules to check MBean attribute values, OSGi bundle states, whether the Sling default accounts have been disabled, or anything else that can be checked by executing a script in a language that Sling supports. Extension points allow for registering additional rule types as OSGi services.
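To give a rough idea of what an MBean attribute rule boils down to, here is a minimal sketch that uses only the standard JDK JMX API. It is not the SLING-2822 code or its configuration format: the class name `MBeanRuleSketch`, the method `checkAttribute` and the thread-count threshold are hypothetical names and values made up for illustration.

```java
import java.lang.management.ManagementFactory;
import java.util.function.Predicate;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical illustration only; this is not the SLING-2822 API.
public class MBeanRuleSketch {

    /**
     * Reads one attribute from the platform MBean server and applies a rule to it.
     * Returns false if the rule rejects the value, or if the attribute cannot be read
     * (malformed object name, unknown MBean, unknown attribute).
     */
    public static boolean checkAttribute(String objectName, String attribute,
                                         Predicate<Object> rule) {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        try {
            Object value = server.getAttribute(new ObjectName(objectName), attribute);
            return rule.test(value);
        } catch (JMException e) {
            // An unreadable attribute is treated as a failed check in this sketch.
            return false;
        }
    }

    public static void main(String[] args) {
        // Example rule: the JVM thread count stays below an arbitrary, assumed limit.
        boolean threadsOk = checkAttribute("java.lang:type=Threading", "ThreadCount",
                v -> ((Integer) v) < 10_000);
        System.out.println("ThreadCount rule passed: " + threadsOk);
    }
}
```

Treating an unreadable attribute as a failure is a choice made for this sketch; a real tool might prefer to report that case separately from a rule that was evaluated and failed.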
This prototype is not a finished tool. The goal is to start a concrete discussion about what's actually needed in terms of health checks. This can be seen from various angles: checking that the CQ security checklist has been applied, checking that the system is still in good shape after upgrading some OSGi bundles, checking that performance counters are within acceptable limits, etc.
With this in mind, I hope that you will attend CQCON 2013 and join the open discussion to provide feedback on which direction such tools should take. I'll make sure to mention the idea of using server-side JUnit scripts for auto-testing production instances, as reactions to that idea have ranged from sheer enthusiasm to cries of horror. I'm only slightly exaggerating, but it does look like opinions vary on which tools to use where.
I'll also present related tools like Joerg Hoh's cq5-healthcheck tool and Davide Gianella's sanity check tool. We haven't yet decided whether, or to what extent, we should merge all those ideas into a common tool, so getting feedback from you and the rest of the CQCON audience on that will be invaluable.