So, someone who was hoping for a particular feature to be shipped by my group was disappointed; the representation of their info in the websearch index wasn’t in a form that was usable by them. This was a pure miscommunication: the guy who dealt with the requesting group didn’t know that they wanted this particular format; the requestor assumed that because he had dealt with a peer of mine on a similar issue that I would be familiar with that history, etc.
How seriously one should take this kind of disconnect obviously depends (just like with software bugs) on the fault-tolerance of the application domain. Nuclear power plants and Mars-exploration probes should have more review applied to their software than consumer websites. But given a certain level of care required, how should we treat communication failures like this?
I think that communication failures should be treated just like software bugs, by which I mean: seriously, but without freaking out, and with some process attached. There are two dangers: 1) freaking out and issuing stupid dicta (“There must be no bugs! People who create bugs must be sacked!”), and relaxing standards (“Hey man, like, bugs happen, so like bugs are OK.”)
Here’s a facile equivalence: bugs are to testing as communication failures are to specs. That is, if we had infinite testing periods, all bugs would be caught, and if we had infinite request-for-comment bandwidth, all specification bugs like the above would also be caught, and we’d all be on the same page before anything launched. But everyone is moving faster than that, and we’re all trying to find the right balance between speed and process.
Process and communication failures like this are like tires beginning to squeal (or maybe like getting pulled over by the Highway Patrol) — they’re telling us that we’re moving just slightly too fast. But the response can’t be to start driving 15 miles/hr in all conditions — the right response is to do a little diagnosis and then back off on the gas (but only a little). Because what you’re looking for is the sweet spot: the maximum speed (i.e. lack of process) that lets you get there as fast as possible without spinning out or getting ticketed…
In this case, it just meant a workaround for this release, and the right thing in the next, and everyone was cool with that.
(And what was the feature? Well, one of the minor dissatisfactions of what is otherwise a great job is that I can never answer this question. We don’t talk about how websearch relevance algorithms work, and we don’t talk about upcoming product releases, and a lot of what I do gets covered by one or the other. I work behind the back end — sometimes I envy the people at Y! who work on the pieces closest to the users (or developers): front ends, web services APIs, etc. Because what they do ultimately _can’t_ be secret, and talking about it is eventually part of the job. But anyway, the feature will be way cool, really, and it will surface soon – maybe via a web services API near you.)