Thursday, May 1, 2014

Ramblings on the research process...

It's coming up on 1 year since I started my PhD studies, and over 2 years since I first became involved in CS research work. This post is just a random collection of ramblings about various aspects of the research process and academia in general.


On the Prior Literature
Do not underestimate the prior work...
One of the first things I discovered was the extent to which the prior work had already covered not only a good portion of the thoughts/ideas I'd been considering investigating when first starting out on HCI research, but also nearly every follow-up idea that those would lead to (see this for some examples/ideas). The worst part was that all of this was stuff that had been done, and largely abandoned + forgotten by academia (save for the fact that these articles are still findable via the various databases), over 20 years ago! There is still some ongoing work in these areas, but "new" approaches are generally few and far between (the rest of the work just refines algorithms for use within one or two frameworks *cough* treemaps *cough*).

The "Damn... someone already wrote my paper... <x> decades ago" moment
Supposedly, there comes a point in every PhD student's career when they come across a paper that looks to have already scooped their work. Adding insult to injury is the fact that it was done ages ago.

A few weeks ago, this happened; not just once, but twice.

I stumbled across a paper from 2003 that (according to its abstract, intro, and conclusion) seemed, to all intents and purposes, to have already looked into the problem I was investigating, and to have come up with the very taxonomy/design framework I'd been building up. (Note, I'm making progress here... this time only 10 years ago! After 2 years of involvement in the research community, my ideas of interesting problems to investigate are catching up at last ;)     However, upon closer reading of the rest of the paper to see exactly what they did and how they justified it, it soon became apparent that their claims couldn't be further from the truth. It turns out they had examined only 3 techniques, many of which aren't even used or relevant today, and none of which led to any generalisations that could apply to the entire class of approaches.

The second time was when I started checking the references cited by that 2003 paper. In particular, there was a PhD thesis from 2001 that contained a lot of material useful to my work (and, in fact, touched upon one aspect I'd been having a bit of trouble figuring out how to parametrise). Sure, it's pretty comprehensive, but even then, there is still room for more.

The lack of trickle-down to industry (and even across fields)
One of the more depressing sides to slowly coming across all this material is the sickening realisation that there's a stinking large amount of stuff that the world at large should know but doesn't, and not in the way we'd normally assume either. The problem is not an absence of knowledge about certain things; rather, the people who would benefit most from these discoveries know absolutely nothing about them. That is, there are many findings, tools, etc. that people in industry should really be made aware of so that they can apply them to their work. Instead, we see people in these fields skittering to and fro, blindly bashing at things like stone-age cavemen (and cavewomen), wondering why their stuff doesn't work 99% of the time, and marvelling when the 1% accident (which isn't even the best fit for the situation at hand) appears to be relatively successful.

And it's not just that findings are not trickling down to industry, where they are likely to be useful. Research tends to get quite "siloed" into disciplines, such that useful results from one area are hardly known in other fields where they might be the exact breakthrough needed (or might at least avoid duplication of effort). You need look no further than something as benign and textbook as the "Travelling Salesman Problem", which is studied in no fewer than 3 different disciplines (though these at least seem to have realised it, and have started sharing findings): fundamental (algorithms-research) computer science, operations research, and abstract/discrete mathematics.
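(As an aside, for anyone who hasn't bumped into the TSP in one of those three disciplines: a minimal sketch of the textbook greedy "nearest neighbour" heuristic - each field re-derives some variant of this - might look like the following. The code is purely illustrative, not from any particular paper.)

```python
import math

def nearest_neighbour_tour(points):
    """Greedy TSP heuristic: from the current city, always visit the
    closest unvisited city next. Fast, but not optimal in general."""
    unvisited = list(range(1, len(points)))
    tour = [0]  # arbitrarily start the tour at city 0
    while unvisited:
        last = points[tour[-1]]
        # pick the unvisited city nearest to the last city on the tour
        nxt = min(unvisited, key=lambda i: math.dist(last, points[i]))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour

cities = [(0, 0), (0, 1), (5, 5), (1, 0)]
print(nearest_neighbour_tour(cities))  # -> [0, 1, 3, 2]
```

Each discipline then layers its own machinery on top - approximation bounds, integer-programming formulations, combinatorial structure - often without noticing the others doing the same.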

The fickle nature of finding good leads
One of the problems compounding the trickle-down issue is just how inherently difficult it is to find half of the material out there. Using the wrong keyword(s), simply from not knowing the exact set of right terms, is often enough to frustrate even the most dedicated of searchers. Sifting through the noise to find promising sources is even harder (with many duplicates and false positives cluttering the path). And then there is the problem of extracting value from the information once found.


On Pursuing Project Ideas
It's been done before...
Sorry, I just have to bring this one up again, since it is just so bloody common. Every second step you take, this will eventually happen. Over and over and over again, ad infinitum. Let's just say it is quite demoralising when this happens (versus in industry, where stumbling across such a find - as long as it isn't a well-established competitor in the same field while you're trying to start out - could very well be a godsend... assuming, of course, that you manage to make sense/use of the prior work and that it does as promised). It's bad the first time. It's bad the second time. And it's certainly still bad the ith time it happens, though somewhere in between you sometimes learn to just shrug it off and go bash at something else when you hit a dead end. But...

There's still some merit to this idea... 

Sometimes though, it takes a while to beat the notion out of your mind that some idea is a dead end. Or out of your supervisor's ;)

The former is a bit easier to deal with: you simply try reframing the problem to skirt narrowly past the prior work (or whatever failures were dooming the idea in the first place), and attack it from this new vantage point. Your personal conviction in its worth will see you through, once you've managed to suppress the cognitive dissonance long enough to trick yourself into attacking the problem again. However, if after several rounds of this you're still not making much headway, and all the evidence seems to point to the inevitable, then it's at least satisfying to decide that you've exhaustively investigated it and found it to be a crap idea.

The same can't be said for the other scenario. Like a bad stench that won't go away, each failure just adds more albatrosses and rotting half-formed corpses that make it harder to figure out what exactly the point of what you're working on is anymore. I'm currently staring at the 3rd (or probably more like 4th or 5th, if you count some of the intermediate half-versions that were never really complete) iteration of one such idea; I'm not too keen on tackling it again (or at least, not so soon), and certainly find it hard to muster much motivation for another attempt. Then again, that's also because I've taken the lessons from previous attempts and used them to manually optimise my own systems, to the point where none of the problems that initially motivated much of this work are relevant anymore. Sigh... yet this gets us back to the original point: there is still a small inkling of promise in this idea somewhere, and it does seem like no one else has yet published anything exactly along these lines. However, the tortured incubation period has so thoroughly warped the work that you're left mulling the dreaded question...

What exactly are my contributions from/with/in this work???
There is oftentimes a bit of an expectation to tackle "big untapped problems". I'm sure you've heard of at least one or two of these big blue-skies scenarios where someone comes along, scoops the field, and appears to waltz easily to glory, clearing the first dirt path through the bush. Certainly, it's what many people think of when imagining what researchers do, and what most aspiring new researchers themselves hope to achieve.

However, if you spend enough time tackling a problem and/or broadening your understanding while doing due diligence to avoid reinventing the wheel, the gulf between the difficulty/importance of the problem you're tackling and the inadequacies of the existing solutions starts to shrink. Sometimes uncomfortably so... For instance, I'm slowly starting to suspect I may have exaggerated the extent of the problem in my current intro (although without doing so, it's harder to justify spending time working on it, I guess), while the more I learn about how the existing solutions fit around various human limitations, the smaller the remaining scope for improvement becomes (in short, they've practically got most of our needs covered... we just need to use them better)... All of a sudden, a rich juicy band of innovation space starts looking like a paper-thin sliver of twisted semantics (and a whole lot of willing + cursing that it will all work out).

Crumbs. For all this effort, that's all we have to show for ourselves. Blegh!

Is the grass greener in the paddock next door?
There's the saying that the grass looks greener on the other side. In research, this comes in the form of tangential topics suddenly looking like really appealing exit strategies - the untapped goldmine we should have focussed on in the first place. In the face of doom, gloom, and diminishing contributions, it is certainly an attractive proposition. But entry to another field is always fraught with risk - namely, that of running straight into another rapidly-closing vice, whose nature is often not noticeable at first, thanks to the same plain old ignorance that afflicts everyone else who hasn't yet tried to tap into this stuff. Gah!
