In the introduction I covered the concept of blogging a new project from beginning to end. The reason for the vagueness of ‘project’ and ‘experiment’ is that I have not even decided on a name for the project yet. That is part of the entire ‘from concept to end result’ thing. If I had a name already, it would not really be from the beginning, now would it?

Now it is time to give the concrete context in which this project is being done. This could be considered the pre-requirements. It is important to know how your project is going to be used and the environment it is going to be used in. If you want your project to succeed, this information is crucial. This is where you find out what is really needed over what is requested, especially when the person declaring what is wanted is yourself.

This is going to be a very long post, but I will be touching on almost every sentence later on, as this all needs to be taken into consideration. It is crucial to look at these big-picture issues when designing a tool or application which will be used in that context. To properly explain the why behind decisions I later make, I would rather dump it all here and refer back later. I hope it will be interesting to some at least.

As the background for this project is going to delve into the processes and development procedures at my day job, it is time for a,

Disclaimer:

The postings on this site are my own and don’t necessarily represent my employer’s positions, strategies or opinions. Opinions and ideas expressed here are mine and mine alone.

Who we are

My title is ‘Principal Research Engineer, MREC’. “MREC” is the modular speech recognizer project. We are a development group within research, in charge of productizing the algorithms and features needed by our extensive research teams and delivering a core speech recognizer. We are a small team of 8 developers. There are many internal clients with technology built on top, and we are not the only engine in the company. Everyone on the team wears many different hats. We are all developers, researchers, operations engineers, QA engineers, release engineers, technologists, and much more. Everyone works on everything, down to the build, release, deployment, and source code management tools. No one is irreplaceable.

How we Work

There has been a wave of ‘agile’ going through the company which I personally find both invigorating and frustrating. At issue here is that we are really talking about ‘agile-but’ at best. What is frustrating is that our team has been agile since before agile existed. We release often, and not all releases are used by all our clients. Some are only really used by our team. Most often releases are only used by our researchers, and never see a product. A release is made either because we said we would release on a date (regardless of what is in the release) or because a feature or set of features was just completed. It has always been that way.

We do mainline development with continuous integration. That is, we have a single branch on which all the work is done. When a major release occurs we fork the project, and the newly created fork is a maintenance fork which only receives bug fixes. Any bug fixes are first implemented in the mainline and then back-ported to the forks which we determine need them. We try to keep only one or two forks live, which sometimes means forcing clients to upgrade to a newer version even when they think they only care about a single bug affecting them.

Notice I have not mentioned branches or branch development. We don’t do that. Check in, and check in often. Update with the mainline multiple times a day as needed. All tests must pass before checking in. Every bug is really two bugs: the bug and the missing test for it. Both must be fixed to consider the issue resolved. This may sound like it slows down development, but just the opposite is true, and I would never work any other way now. We do have the ability to branch and work in isolation, but we prefer to keep that type of thing very short, and to be honest I can’t think of a time it has been needed. The point is not that you need to integrate with other people’s changes, but that you need to get your changes into the mainline immediately so that everyone else can integrate with you.

There are code reviews. We only implement what is needed to solve the underlying problem or request. We do not implement what is requested, but what the requester needs, and rarely go beyond that. We do not implement features that are not used. We love to remove code and unused features. Over the past sixteen years we have removed three times more code than the current size of the project. We have competitions to see who removed the most code. That is the only code metric we look at with any seriousness. Our client-facing API is reviewed by the entire team and all clients, and is approved before being checked in. The documentation for the API is in the API headers. We bikeshed on the API endlessly; or at least it sometimes feels that way.

None of this should look surprising, revolutionary, or out of the norm. All the major open source projects I know of work this way; more or less.

Communication is the Problem

As I said, there has been this wave of ‘agile’ in the company. Recently someone said to me, “I saw this month’s sprint document for MREC. It is great to see that MREC is going agile too!”; and I bit my tongue. I wanted to say, “You keep using those words. I do not think they mean what you think they mean.” When a term like ‘sprint’ is used, it does not have the meaning I expect it to have. Let me be clear here: the problem is me, not everyone else. The problem is not inherent in the systems, but in the communication. My communication and the communication between our group and the rest of the company.

Our group has been operating somewhat like a black box. Different clients would file requests and our manager would work with the clients and put together a general schedule for getting features in, and re-prioritize as needed. This is a group effort to figure out what needs to get done and in roughly what order. Our clients do not talk much to each other about prioritizing our work, as we do the prioritizing. The end result is that, while we have advertised what we are working on and what goes into a release, these things have been very hard for our clients to track. They know when a feature they need is in a release, but then they have to accept all the other features for all the other clients, which they have not been tracking. One could argue that this is not our group’s fault, as we do make all this information available in one form or another; but that is not really fair, and as we will see, ‘one form or another’ hides a multitude of sins.

Communication is the Solution

Much of this is old news and we have changed much in what it appears we are doing. In truth we have not changed the way we develop at all. We instead are exposing what we are doing and allowing our clients more control over our priorities. The end result is that while there is no real change in our priorities, our clients understand them much better and feel like they can predict things better using agreed-upon terms like ‘sprint’. In truth it is just a change in perception; instead of not reading internal web pages and e-mails with text attachments, they are not attending meetings, not reading different internal web pages, and not reading different e-mails with Excel attachments. But these are meetings, pages, and e-mails they asked for and we designed together, versus things our group dictated; which makes all the difference in the world. Yes, I am being hyperbolic.

Even with this added communication, which is real and effective, there are still problems. They center around our tools and environment. We are in the late parts of migrating to new systems, but all this has done is highlight the flaws and weaknesses in the systems we have yet to migrate.

The Environment

The different teams have different development styles which best fit their needs. The same goes for the tools used. The source code management we use is an in-house wrapper on Perforce called P4M. This is open sourced as part of the DevTools project, which is woefully out of date; we will be releasing a new version soon. We have multiple projects hosted on a single server. Our team has four projects including MREC, and other teams have projects as well. P4M enforces a directory layout including forking and branching. We chose Perforce over SVN or a DVCS for many reasons. We need to deal with huge files (>2 GB) and a basic install takes up >32 GB. The MREC tree alone has 300 GB of history. We need to be very careful with permissions. Our engine is used in many medical transcription applications, so we must be very careful. We try never to check in unwashed data which might contain any PII. On the off chance that some does get through, we need strict centrally controlled permissions. At any time we need to know who has what on which machines. For patent and litigation reasons we need everything centrally managed with strict backup rules. Perforce really is the only affordable system out there which can handle this. The entire 16-year history of the project is maintained, with proper dates and attribution.

Our project planning is managed external to the source code management, and to the bug tracking and issue management. This at first might sound odd, but with so many internal clients, the priorities are often changing. In our issue management system, we have 3 severity levels (low, med, high), and a fourth level called ‘request’, for all features and improvements. Requests are automatically lower than any bugs. This has caused some friction with clients, especially research. The problem is, the person filing the request really has no clue what the real priority is for a request. That is, unless they are a project manager, have consulted all our other clients’ product managers, and want to take on the task of updating the priority over time. As such, the real priorities are managed in another system which links back. Known bugs, on the other hand, are always more important to fix than a new feature or improvement. This is a royal pain to try to communicate, and still results in heated discussions; we will get to this in a later post.
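
To make the ordering rule concrete, here is a minimal sketch in Python of how a tracker might sort issues so that every bug outranks every request. The Issue class, its field names, and the numeric ranks are all hypothetical illustrations, not what our actual issue management system does internally.

```python
from dataclasses import dataclass

# Hypothetical ranks: a lower number sorts first, and 'request' always sorts last.
SEVERITY_RANK = {"high": 0, "med": 1, "low": 2, "request": 3}


@dataclass(frozen=True)
class Issue:
    ident: int
    severity: str  # one of "low", "med", "high", "request"
    title: str


def triage_order(issues):
    """Return issues with bugs first (high, med, low) and requests last.

    sorted() is stable, so issues of equal severity keep their filing order.
    """
    return sorted(issues, key=lambda issue: SEVERITY_RANK[issue.severity])


if __name__ == "__main__":
    backlog = [
        Issue(1, "request", "Expose a new tuning parameter"),
        Issue(2, "low", "Typo in a log message"),
        Issue(3, "high", "Crash on empty input"),
    ]
    for issue in triage_order(backlog):
        print(issue.severity, issue.ident, issue.title)
```

This only captures the default ordering inside the tracker. The real priority of a request still has to come from the separate planning system, which is exactly why the two are linked rather than merged.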

Releases are made to network directories which are accessible from the same path on all our grids. There is a ‘current’ symlink for the latest supported version, the version is always increasing, and the directories are named with the version number. Each checkin to a codebase gets its own unique increasing version number (managed by P4M). The release includes built static HTML documentation. This documentation is served up by an Apache instance, or can just be accessed directly. We are migrating away from a hodge-podge of Doxygen, pydoc, and other in-house hacks over to a standardized sphinxdoc system. This includes taking the help from our API headers and generating sphinxdoc which is cross-referenced with our Python interface. The overarching project planning and release information is in a Plone instance which is used by all of research and development. This is all magically cross-linked and searchable; or will be once the sphinx stuff is done, some of which will be tackled by this here project. Our sphinx extensions will be released as part of DevTools, and some of the release exposure stuff will be dealt with by this new project.
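
As a rough sketch of how a script could work with this release layout, the Python below picks the newest release directory and repoints a ‘current’ symlink at it. It assumes the directory names are dotted integers such as 4.2.1731 and that the link is literally named current; both are illustrative assumptions, and the function names are made up for this example rather than taken from P4M.

```python
import os


def parse_version(name):
    """Turn '4.2.1731' into (4, 2, 1731); return None for non-version names."""
    parts = name.split(".")
    if not all(part.isdecimal() for part in parts):
        return None
    return tuple(int(part) for part in parts)


def latest_release(release_root):
    """Return the name of the highest-versioned release directory, or None."""
    candidates = []
    for name in os.listdir(release_root):
        version = parse_version(name)
        path = os.path.join(release_root, name)
        # Skip symlinks such as 'current' and anything that is not a directory.
        if version is not None and os.path.isdir(path) and not os.path.islink(path):
            candidates.append((version, name))
    return max(candidates)[1] if candidates else None


def point_current_at(release_root, name, link_name="current"):
    """Repoint the symlink by building a temporary link and renaming it into place."""
    link_path = os.path.join(release_root, link_name)
    tmp_path = link_path + ".tmp"
    if os.path.lexists(tmp_path):
        os.remove(tmp_path)
    os.symlink(name, tmp_path)       # relative target, so the mount point does not matter
    os.replace(tmp_path, link_path)  # the rename swaps the link in a single step on POSIX
```

Comparing versions as tuples of integers rather than as strings matters here: as text, 1.9 sorts after 1.10, which is the wrong answer for an always-increasing version number.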

Lotus Notes

There are some pain points which have been hitting me over and over again now that the other parts are so much better and fully integrated. Specifically, the way we review changes to our API, discuss architecture and format changes, and track issues. All three are managed in a 16-year-old Lotus Notes system which has not changed much in 10 years. Very few groups in the company still use Notes, and most cannot be bothered to install the client. Most have other tools for these things which have web interfaces or which, similarly, are causing communication problems. Requiring a tool like Lotus Notes does not sound like it would be much of an issue at first, especially as Perforce is required for code access. But then you realize that this includes a new e-mail address which is not integrated with your ‘real’ Exchange address, that you need help to set it up to make sense, and that we have no IT staff who know or understand Notes, nor a support contract. Heck, for all the nice things we have in it, I don’t want it. There is zero chance of getting a web interface as well. In short, it is an isolated island which plays well with itself, and can be linked to externally with some contortions (on Windows only, with specific OLE/ActiveX plugins), but does not play well with others; and playing well with others is the point.

I do want to give a shout out to Eric Ochieng, who set up the Notes infrastructure years ago; it has been working flawlessly since with no real support or maintenance.

What this project really is

This project really is not a bug tracker. What it really is, is something to replace all the functionality we currently have in Lotus Notes with other things which integrate into all the other things we do. Part of that is indeed issue tracking, but focusing on just that would be a failure from the outset. We could go with something like ClearCase+ClearQuest+Replication+Send-All-Our-Revenue-To-IBM, and spend months trying to get it all working, but that is just silly. There are a number of open source and paid solutions which almost fill our needs. It is that ‘almost’ which is the problem. There is nothing out there which does the things we need, let alone covers the things we want. This project is to dive down into what the real needs are versus the wants, and to find the solutions out there which come closest while integrating with each other and with what we already have. The pieces which are still missing, I will create from whole cloth of some type. Python is not a requirement, but I will be leaning towards it, as I am the only person on our team who knows Ruby or Java well enough to support those languages; we are allergic to .NET for reasons I will go into later.

Up next: The Big Picture – a dive into needs and wants and whys.