In the summer of 2017 I had the pleasure of spending several months as a research intern at UCL's Surgical Robot Vision (SRV) lab. I was primarily working with HoloLens, developing a system for tracking different surgical tools and displaying relevant information as holograms.
Almost a year has passed since I joined SRV for a two-month internship. It's true that I've since forgotten some minor things, but there are several important lessons I learned back then that are still with me now. Truth be told, I was planning to publish this article shortly after my internship ended (pic below) but, for some reason, I felt like it would be incomplete - now I finally understand why.
What I've been waiting for...
As I said above, something just didn't feel right before. A couple of days ago I finished a project involving autonomous drone exploration using ROS, and it dawned on me - if only I'd used ROS more during my time at SRV, my development speed would have been much higher.
A year ago, back when I joined SRV, I had very little robotics experience. During my first year at UCL I took a class called COMP105P Robotics Programming. This class didn't even mention ROS, which was definitely a huge oversight since ROS is a crucial part of robotics these days. In UCL's defense, the timeline for that class had to be rushed because of the exam period, but I still wish they had included ROS in some shape or form.
Caltech, on the other hand, has a very strong robotics track with amazing classes like ME/CS 133 Robotics. This year they offered a new class called CS/EE/ME 134 Autonomy. As the name implies, it focused on autonomous systems, mainly autonomous exploration and SLAM using different robots. The practical part of the class relied heavily on ROS, with a fair bit of handholding from the TAs during labs.
This soft introduction to ROS helped me understand just how powerful (and, in a sense, simple) ROS is. Thinking about the final product of my SRV project, I can see exactly where ROS would have made my life easier - most of the time it's about standardising communication between, say, the HoloLens and the ultrasound machine.
Considering SRV literally has the word "Robot" in its name, it should be obvious that people there use ROS all the time. Back when I was just starting my project, my colleagues suggested I use ROS. I quickly brushed off the idea because the learning curve was a bit too steep. Additionally, I had to use Windows for HoloLens development and ROS doesn't play well with Windows.
This wasn't a bad decision overall: I still ended up building a decent proof-of-concept system even without ROS. That said, if I were to redo the whole project, I would definitely use a different approach. I only needed Windows on my development machine to compile my HoloLens application, so it wasn't a runtime dependency. This meant that, potentially, I could have done the HoloLens development on Windows and then switched to Linux to get access to all ROS features. This is the main change I would have made if I'd had back then the experience I have now.
Mistakes made and lessons learned
Note: Don't get the wrong idea - although I start the article by reflecting on my mistakes, my experience with SRV was predominantly positive. In fact, I had a blast working with them! Check out the end of the article for my thoughts on this.
First of all, I must say that the proof-of-concept system I built ended up working as expected. This video shows one of the final prototypes in action. You may notice that the hologram for the ultrasound probe (green mesh) doesn't perfectly align with the actual probe - I initially used the wrong transform in the absolute orientation algorithm. I fixed it later, but I didn't shoot any videos after the fix was made, which leads us to my first mistake.
Mistake #1: Not enough videos
As the saying goes, a picture is worth a thousand words. Extrapolating to videos, your average PAL stream shows you 25 thousand words per second. In my case, shooting more videos of the final setup would not only have made it easier to publicize the project, but also helped my colleagues reproduce my experimental setup.
This might sound silly, but making video instructions actually makes a huge difference. In my case, the setup used a Python server, OptiTrack Motive software, infrared cameras, an ultrasound machine and its software, Unity3d, HoloLens-specific Visual Studio plugins and some other minor tools. During development, I became familiar enough with these tools to set up the whole system with ease - but it is unreasonable to expect other people to learn the whole pipeline just to test a small component of my system (say, IR tracking).
Producing more videos showing the whole setup in action and explaining how to set up each separate component (using some voiceover) would definitely have made things clearer. Unfortunately, I didn't have the time to do that, which leads us to my second mistake.
Mistake #2: Hard to reproduce
While it's true that I was learning some new things along the way, I wouldn't say I wrote bad code. The code I wrote was reasonably easy to follow and I tried to clarify any ambiguous points in the README. The issue was that there was a lot of code - not only that, but it was also split between a Python (backend) codebase and a C# (HoloLens) codebase.
Near the end of my internship, I decided to use my last week primarily to fix bugs and improve the quality of the codebase in general. While this wasn't a bad idea, I had my priorities wrong - I should've spent more time writing documentation instead. I learnt the hard way that mediocre code with good, detailed documentation can be better than perfect, poorly-documented code.
Despite the little time I dedicated to documentation, I still ended up writing a very long README file describing everything in detail. Sadly, quantity doesn't imply quality and there were still things in the README that were ambiguous (mostly related to the experimental setup). I believe that, if I had spent more time bouncing the documentation off my colleagues, I would have been able to make things clearer.
I still don't have a universal solution to writing detailed documentation for complex systems. That said, in my particular case I could have avoided some of the complexity by not reinventing the wheel and using ROS, as I mentioned at the beginning of this article.
Mistake #3: Not enough experimental data
With the help of my supervisor I ended up producing a short paper for the SPIE 2018 Medical Imaging conference. That paper wasn't particularly good because it didn't display much experimental data - because I didn't record much data in the first place.
I was so focused on improving the prototype that I completely forgot to measure quantitative things like hologram drift, lag in ultrasound imagery, error in hologram placement and so on. I didn't really have any research-related experience before SRV, so I didn't realise how important data is until I actually began writing the paper.
Another important lesson was that just recording data isn't enough - you must also make sure to store it in a safe place. I used a MacBook provided by SRV for my work and stored all of the data on it. After I left, the laptop was repurposed for other projects, and the need for my data didn't come up until several months later. At that point, tracking down the data proved to be a bit of a challenge, which could easily have been avoided if I'd just stored the data online.
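To make one of the measurements mentioned above a bit more concrete, here is a minimal sketch of how error in hologram placement could be summarised from paired position samples. This is an illustration only, not code from the SRV project: the function names, the sample coordinates and the use of metres as the unit are all assumptions made for this example.

```python
import math
from statistics import mean


def placement_errors(hologram_points, tracked_points):
    """Return the Euclidean distance between each pair of 3D points."""
    if len(hologram_points) != len(tracked_points):
        raise ValueError("point lists must have the same length")
    return [math.dist(h, t) for h, t in zip(hologram_points, tracked_points)]


def summarize(errors):
    """Return the mean, RMS and maximum of a non-empty list of errors."""
    rms = math.sqrt(mean(e * e for e in errors))
    return {"mean": mean(errors), "rms": rms, "max": max(errors)}


# Hypothetical samples (in metres): where the hologram was rendered
# versus where the tracking system reported the real object to be.
hologram = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.2, 0.0)]
tracked = [(0.003, 0.0, 0.004), (0.1, 0.0, 0.0), (0.0, 0.2, 0.012)]

stats = summarize(placement_errors(hologram, tracked))
print(stats)
```

Saving a summary like this next to each recording, and keeping a copy somewhere other than the lab laptop, would leave something concrete to report when it's time to write the paper.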
Conclusion on my mistakes
Some of these mistakes might seem silly - they certainly do to me when I re-read the draft of this article. Nevertheless, those are issues that I have actually experienced and I think it's important to reflect on them. After all, this is exactly what internships are for.
Working at SRV
The list of mistakes above is the result of me reflecting on my SRV experience for a year or so. During the actual internship I had an amazing time - not only did I get to play with cool, expensive hardware (see pic below) but I was also surrounded by friendly, smart and helpful people.
As I said earlier, when I joined SRV I hardly had any research experience. I was also not too familiar with robotics and computer vision. Much to my surprise, that wasn't an issue most of the time - I got so much support from SRV members that I was never stuck on some particular problem. Apart from helping me out with my project, they were happy to answer my questions about their PhD and postdoc experience, as well as give me academic advice.
Now I'd like to take some time to thank Dan Stoyanov, my supervisor and the head of SRV, for inviting me to work with them and guiding me in my work. I'd also like to thank Mirek for introducing me to Dan and answering the numerous questions I had during my internship. Thanks to Francisco for explaining computer vision and ultrasound concepts to me, to Francois for helping out with hardware and video conversion, to Evangelos for giving me a Quaternions 101 lecture and to George for giving me a hand with 3D printing. Huge thanks to Neil for helping out with the paper - it would have been impossible without him!
I'll wrap up this article with a picture of a $2 million da Vinci robot that I had a chance to play with. I had a very good time interning at SRV and I hope I'll get a chance to work with them again in the future.