Effective tools for computer systems research

Open review version. Please leave feedback!

Give me six hours to chop down a tree and I will spend the first four sharpening the axe.

Abraham Lincoln

Introduction

Research is messy. Our body of knowledge is scattered across countless journals, presentations, blog articles, and tweets, making it difficult to get up to speed in a field. Data collection often requires numerous iterations and is frequently insufficiently documented. The subsequent analysis of data is sometimes powered by fragile code and obscure plotting systems. The resulting research paper is originally called paper.doc, then paper-new.doc, paper-new-2.doc, paper-final.doc, and eventually paper-final-FINAL.doc.

It doesn’t have to be like that. A well-structured research project is not only possible but well within your research – as long as you know how. That’s what this book is about. The following chapters introduce effective tools to do research; in particular related to organising, versioning, reading, writing, programming, visualising, automating, and communicating. These tools are software, processes, or of cognitive nature.

In addition to discussing these tools, I will explain why they are effective. You will realise that logging your time will give you freedom, good writing is the opposite of academic writing, and having a Twitter account isn’t all about posting cat pictures.

Building an effective tool set is a significant return on investment in terms of time, sanity, and research quality. You will save time by automating parts of your research pipeline and putting them under version control; you will keep your sanity by having the peace of mind that your pipeline is robust; and your research will increase in quality because you’re now less likely to make unnecessary mistakes.

Who is this book for?

I wrote this book primarily for new graduate students in computer systems.The field of computer science consists of “theory” and “systems.” I myself am working in systems, and you will find this book the most useful if you are a systems researcher – for example in a field like security, programming languages, or networking.

If this includes you, the entire book should be of interest. The secondary audience is people who program, write in LaTeX, or do research – basically anyone in a STEM field. This crowd will find a subset of this book useful.

Why am I writing this book?

I had little research experience when I started my Ph.D. I faced a steep learning curve in the first couple years and research frequently felt overwhelming. In addition to reading hundreds of papers and trying to carve out your own area of research, you have to teach, take courses, and adopt best practices in research. I’ve always been curious about how other people work and cope, so I wrote the book that I wanted to read as a young student.

Perverse incentives in research place too much value on the number of papers and citations. As a result, people cut corners to maximise their research output – frequently while sacrificing rigor. Poorly documented workflows and sloppy code can lead to mistakes that jeopardise the correctness of a project. By adopting strict and effective workflows, we can minimise these mistakes and save time.

An unfortunate amount of knowledge in a scientific field is implicit, meaning that it’s rarely spelled out. Pinker calls this phenomenon the “curse of knowledge” (Pinker 2015, chap. 3): Years in a field make it difficult to put yourself in the shoes of a newcomer. Examples of (mostly) implicit knowledge are the reputation of conferences, collaboration etiquette in research projects, and how to organising your time. People do write and talk about these topics, but you may have a hard time finding a course or book that teaches how to go about these topics. In this book, I spell out aspects of research that don’t see much elaboration elsewhere – even if it means that parts of this book may seem obvious to you.

How should you read this book?

All chapters are self-contained, so dive into whatever chapter appeals to you. Of particular importance is the chapter on versioning, which is referenced several times in subsequent chapters. Finally, this is a hands-on book and I strongly encourage you to read it while in front of your laptop – ideally with an open terminal. The retention rate is significantly higher if you put into practice what you read in this book.

Organising

A combination of stubbornness and luck got me through my Ph.D. and postdoctoral training without a todo list or schedule. Each day, I would simply continue work where I stopped the day before. My lack of organisation didn’t get me in trouble because for the most part, I was involved in only one or two projects at a time, which was manageable. This changed after my postdoc. I suddenly found myself juggling software projects, monthly reports, research papers, and blog posts; all on tight deadlines. I had no choice but to become more organised.

You may – like me – get away with poor organisational skills during your Ph.D. but why not improve these skills before it’s absolutely necessary? Why not become more efficient in the process, and also prepare for your post-Ph.D. life, which will most likely be more demanding and require juggling several projects in parallel? This chapter discusses organisation tools and behavioural hacks that help you stay on track and make steady progress. After all, research is a marathon and not a sprint.

Incorporating new skills

The incorporation of new organisational tools requires forming new habits. When adopting new habits – be it organising, exercising, or eating healthily – you may find that you can keep up the new habit for a few days but then drift back into old habits. In his book Atomic Habits (Clear 2018), James Clear provides insight into why that is: key to forming good habits (and eliminating bad ones) is to i) create an obvious cue to trigger an action, ii) make the action’s outcome attractive, iii) make the action easy to perform, and iv) create a satisfying outcome.

Curate a todo list

As a Ph.D. student (and even as postdoc), I spent most of my days working on very few tasks – so few, in fact, that I was able to keep my todo list in my head. Most days, I worked on moving my research project forward, only interrupted by the occasional paper review, presentation, or work with students. I had few tasks and deadlines to keep in mind at any given time. Once I transitioned from my postdoc into the “real world”, I quickly realised that this had to change. I suddenly found myself having to push forward several small projects: research papers, bug fixes in complex code bases, organising workshops, applying for grants, and analysing data sets. It was no longer possible to keep my todo list in my head, so I started experimenting with tools. Console tools proved a bit too cumbersome while web tools proved too inconvenient and clunky, so I eventually ended up maintaining a simple text file with my todo list.

In essence, a todo list helps you keep track of tasks that you need to get done. As I mentioned before, the trick is to incorporate it into your workflow seamlessly. If adding a new item to your todo list involves spending thirty seconds finding todo.docx on your hard drive, waiting five seconds for Microsoft Office to open, and another three seconds to scroll to the end of the file to add a new task, you will soon give up.

If you feel the need to curate a todo list but find yourself unable to do so, work on minimising friction. First, make it so that you can open your todo list as quickly as possible. For example, configure a keyboard shortcut that opens your todo file. This way, if a new task is assigned to you during a meeting, you should be able to add it to your list within five seconds. Once a new task is in your list, you can forget about it, freeing you of the cognitive load of having to remember the task.

If you work on multiple devices and need your todo list on all these devices, you will have to sync it somehow. In this case, browser tools may be your best bet because they can take care of the syncing part for you. Manual syncing is out of the question because it’s not sustainable and you will wind up with out-of-sync todo lists.

The todo list format I eventually adopted is a markdown-formatted text file which is in the same file as my work log (see the section on work logs. Some tasks are more pressing than others. To reflect this, I use three sections, “today”, “this week”, and “eventually”. Here is an excerpt of what my todo list currently looks like:

When I finish a task, I move it from my todo list to my work log, which is in the same file, so it’s a matter of literal seconds. This minimises friction and renders the curation of my todo list easy enough that I actually stick with it.

Again, it does not matter what I do. The best todo list is one that works for you. I’m giving you an idea of my workflow in the hope that it helps you discover what works for you. My workflow is heavily centered around text files and command line tools. You may be more of a browser person, in which case a pinned browser tab may work significantly better. Experiment with different workflows until you found one that you can sustain.

Plan your day

A habit that I picked up relatively recently, after reading Cal Newport’s excellent book Deep Work is to plan my day (Newport 2016). Each morning, right after I turn on my laptop, I take a look at my todo list and derive a plan for the day, based on 30 minute blocks. I draw these blocks on my tablet but pen and paper work just as well.

This is an example of my daily schedule. I reserve my mornings for design and development, and other tasks that require intense concentration.

Note

A day without a schedule runs the risk of resulting in yak shaving: imagine you want to fix that annoying bug in your code that has been messing with your experiments. While thinking about how to best fix the bug, you notice that the function that contains your bug has poor documentation. So you spend a moment updating the documentation. While doing that, you realise that your functions follow an inconsistent documentation style, which really bugs the perfectionist in you. So you harmonise the way functions are documented in your code and learn in the process that the documentation tool you’re using has released a new version that makes it more convenient to browse your software’s documentation. However, you operating system doesn’t have the newest packages yet, so you set out to compile it manually. Three hours later, you find yourself hunched over your laptop, covered in sweat, after finally having compiled your documentation tool manually. That bug that you originally intended to fix? It’s still there.

I used to come to the office each day without having a clear idea of what needed to get done. I would typically continue work where I left the day before, or turn my attention to whatever seemed the most urgent. This approach may suffice if most of your time goes into a single project whose details you can keep in your head but it falls apart in the face of more complex responsibilities.

You may think that a detailed plan for each day impedes your creativity. In fact, it’s the opposite. By spending a few minutes planning your day, you reduce your cognitive load throughout the day. You won’t have to think about what to work on next, or when it’s time to switch tasks. You already took care of that in the morning, leaving the rest of the day for deep thinking with minimal context switches.

Perhaps the biggest benefit of a planned day is that it helps you stay on track. I have the annoying habit of polishing finished work more than is necessary or even useful. Having a daily plan in front of my nose serves as a reminder that the perfect is the enemy of the good, and that many other tasks are still waiting to get done. This helps me move on to the next task quickly, and be more productive throughout the day. Higher productivity means more happiness. At the end of the day, I feel that I have accomplished what I wanted, making it easy to get out of “work mode.” Whenever I feel unaccomplished, I find it difficult to leave work mode because I keep thinking about unfinished tasks. Needless to say, this work is far from productive and only prevents me from relaxing and recharging my batteries. A detailed plan for the day is a good antidote and can help you draw a clear line between your work and personal life.

Track your time

Do you know the feeling of taking a break from work, only to catch yourself an hour later watching obscure YouTube videos? Or the feeling of having spent a full day of work but feeling like you have little or nothing to show for? It’s as if the day passed and you accomplished nothing? The solution to these problems it to establish a tight grip on your most prices possession: your time.

I use Time Tracker on Debian Linux. This lightweight tool lives in my system tray, allowing me to quickly open it and take note when I’m switching from one task to another. Each time I switch between tasks,Examples for tasks are “answering email,” “changing database API,” “reading research paper,” and so on. Tasks like “writing” or “programming” are likely too general while tasks like “adding second paragraph to introduction of research paper” are too specific.

I open the Time Tracker tool and jot down what I’m going to work on next. On a typical day, I end up with five to ten tasks. At the end of the day, I know exactly what I did, and can contrast it with what I was supposed to do.

A typical day consists of a handful of tasks I worked on. The numbers next to the tasks correspond to bug tracker IDs.

Note

Tracking your time at such granularity may feel oppressive and stressful. After all, it’s yet another thing to remember and worry about. That’s exactly what I thought until I started doing it but I learned that policing yourself helps you stay focused. Tracking my time helps me stay on track. It’s easy to feel busy all day long, without really getting anything done. One can spend several hours going over meeting notes, mulling over the next email, or stressing about all the things that need to get done. Despite feeling busy all the time, your output may be little. Keeping track of what exactly I’m doing throughout the day helps me notice when I’m actually productive. I actually am productive when I finish a handful of well-defined tasks throughout the day. If you are working on a big project, try to split it into tasks, so you can accomplish a handful of them each day.

Part of my day job consists of working on development tickets for sponsors. These tickets consist of software bugs, feature requests, or small projects. My employer, The Tor Project, is mostly funded through grants, and we need to have a good understanding of how much time a specific development task takes. How long does it take to set up a testbed to evaluate a new pluggable transport protocol? One hour? A day? A week? It’s important to have both experience and data for time estimation because our intuition isn’t always the most reliable predictor. I use my time tracker to record for each bug tracking ticket how many hours it took for me to work on it, which I then compare to the hours we projected to be working on it. Over time, my estimates became closer and closer to how much time I really needed–minus the occasional outlier, obviously.

Log your progress

I am a big advocate of keeping a log of what I have accomplished throughout the day. Did you finally manage to finish the introduction of your latest research paper? Mention that in your log. Did you finish refactoring the data processing pipeline in your research prototype? This should go straight to your log. Right after I complete a task worth writing down, I spend approximately five seconds adding it to my log and then move on.Occasionally, I forget to add a completed task to my log right away. I then add it later, or sometimes even the day after.

I don’t bother getting punctuation or even grammar right.

On a typical day, I jot down somewhere between five and ten tasks in my log. I don’t log every single email I write but I do sometimes log emails if they’re both lengthy and important. Here’s what my Oct 14, 2019 work log looked like:

  • 2019-10-14
    • filed #32064 for improved search results for “download tor” and to incorporate gettor links in our website descriptions
    • replied to [redacted] and asked him if he’s willing to run default bridges for orbot
    • created wiki page to formalise our “support ngos with private bridges” process
    • thought about design for system that can scan the reachability of PT bridges (#31874)
    • created summary of obfs4 work for quarterly race report and updated our obfs4 ticket (#30716) with current project status
    • reviewed #31384 (snowflake.tp.o language switcher)
    • reviewed #31253 (webext packaging target)
    • responded to email with points worth communicating at otf summit
    • started working on #17548 (deal with expired pgp keys)

You can see that my phrasing is rough around the edges but that’s okay: You are the primary consumer of this log and you will likely remember what you meant after the fact. You will interact with your work log several times a day, so minimise the friction of adding tasks to it. My work log is always open on a virtual desktop, so I don’t spend any time opening it. I also added a shortcut to my editor, vim, to quickly add today’s date to the bottom of the file. The format of my work log is markdown, which facilitates conversion into other document formats such as HTML or PDF.As always, do what works best for you. I’m a terminal person and enjoy fast, lightweight, and robust console tools. You may be more of a browser person, in which case it’s worth looking at web services that help you log your progress.

Using pandoc, converting a markdown-formatted file to PDF is as simple as running:

pandoc log.md -o log.pdf

As a student, I would take my work log spanning the past month and send it to my advisor at the end of each month. He appreciated seeing what I’ve been up to – at a level of granularity that was neither too coarse nor too detailed. Sharing your work log with your advisor also serves as insurance: your advisor won’t be able to ever complain that he or she was not kept in the loop.

Actually writing down and seeing what you have accomplished throughout the day can be surprising – in both a good and bad way, depending on how much you have accomplished. The importance of a progress log is that it allows us to police ourselves. If I see one or two low-progress days in a row, I realise it’s time to make a conscious effort to improve my productivity. Progress logs make it less likely to drift into a slump and perform poorly for many days or even weeks without noticing – something that happened several times throughout my Ph.D: I wasted many weeks going down rabbit holes, losing sight of the big picture. I mulled over attractive research ideas that were ultimately infeasible but I was too stubborn to give up on the idea. I was no longer on track and I didn’t realise that I had to take a step back to re-evaluate my research direction. Taking a look at your work log makes it easier to realise when you’re off track and by sending your work log to your advisor, your advisor should also be there to help you.

Note that time tracking and a work log are very similar but fulfill different goals. Time tracking allows you to police yourself on a micro scale while a work log polices you on a macro scale. It’s possible to be on the right track but spend a significant part of your day watching YouTube videos (a time tracker would reveal this bad habit) or be efficient in your day-to-day business but not spend time doing the right things (ideally, a work log would help you see this).

Take notes of meetings

I had the privilege of collaborating with 25 people throughout my research career. Several of the projects I was involved in were not led by me, so I took a back seat. In some of these projects I was surprised by the lack of notes during meetings. Collaborators would meet and discuss projects as I was used to, but nobody took notes (at least not for the entire group) and the implicit expectation was that everyone would remember what was said and what was left to do. Needless to say, this expectation didn’t always work out. Throughout the next one to two weeks, people forgot what was discussed, only to end up discussing some of the very same topics at the next meeting. Besides, misunderstandings along the lines of “wait, I thought you were supposed to do this?” happened.

You can avoid these issues by consistently taking notes. Before each meeting begins, there must be a designated note taker. In fact, multiple people can take notes simultaneously in a Google Document or a Riseup Pad. The note taker(s) jot down key points during the meeting and todo items for each person. At the end of the meeting, these notes are then sent to all participants. If anybody disagrees with any of the notes that were taken, they should speak up. This way, all collaborators will be on the same page, there is a written record of what was discussed, and it’s also easy to go back in time to find out in what meeting a certain task was discussed. I can guarantee you that your collaborators will love you for taking notes.

Note taking isn’t just for high-stakes meetings with important collaborators. I take notes almost every time I am interacting with somebody – including with my advisor, back in my Ph.D. days. I created a simple shell function that facilitates the creation of a new file for each meeting. I simply type meet alice into a terminal, and the command automatically creates a new file, 2019-12-21-alice.md, and opens it in my text editor. Here’s the script, which you can add to your ~/.bashrc:

I take notes in the simple, yet expressive markdown format, which is both expressive and very simple. All my meeting notes are in the same directory, in ~/doc/meetings/. Sometimes, I’m looking for something that was said in a past meeting but I don’t remember in which one, exactly. I then grepThis is as simple as running grep keyword *.

all my meeting notes for a keyword that I remember to be present in the log. Again, this is meant to minimise friction because otherwise I would be too lazy to take meeting notes.

Persevere

Graduate work takes its toll. Long work hours bring with them loneliness and isolation; paper rejections chew on your self worth; witnessing colleagues excel fuels imposter syndrome; and seeing childhood friends buy homes and get children makes you wonder if graduate school really was the right choice. It comes as no surprise that poor mental health is problematically common yet rarely talked about subject.

As a Ph.D. student, I was – and still am, to some extent – struggling with insecurities. I read outstanding papers in my field and immediately got disillusioned by the depth of the work and how it combined concepts I had not even heard of before. How would I ever be able to compete with that?

It is important to understand that your seemingly perfect colleagues are anything but, and often struggle with the very same issues. I was both surprised and relieved to learn that a friend – whose work I greatly respect – mentioned that he, too, struggles with imposter syndrome. I realised that if somebody as knowledgeable and capable as him is plagued by these feelings, it’s significantly more widespread than I thought it was.

Mental health takes place in your mind but a healthy mind lives in a healthy body. There are a few things you can do that greatly affect your mental well being.

  • Maximise your intake of whole foods (broccoli, potatoes, beans, coffee, apples) and avoid processed foods (cereal, pasta, candy, soda). (Pollan 2009)

  • Exercise regularly by making exercise a fixed part of your daily schedule. Don’t try squeeze exercise in between other responsibilities; squeeze other responsibilities around your mandatory exercise. Convince a friend to do the same and hold each other accountable.

  • Go to sleep and wake up at a regular time. Get at least seven hours of sleep. It’s fashionable to brag about how little sleep one gets, and in times of pressure it is easy to believe that sleep is a “waste of time” but the opposite is the case. What may take a tired brain three hours to accomplish can be a matter of 30 minutes for a well-rested brain. (Walker 2018)

  • Get a bit of sun every day. I like to go for a run around noon, or sometimes go for a brief walk while listening to a podcast. When I get back, I’m full of energy, calm, and ready to get back to work.

  • Practice meditation. It occupies only ten minutes of my day and I like to meditate right after waking up, with a hot cup of black tea in my hands. I enjoy Sam Harris’s Waking Up app. Using an app to facilitate meditation may sound ironic but there is merit in guided meditation – even if it’s just a few spoken sentences every other minute.

  • Procrastinate productively. Deep thinking requires creativity. you cannot force that.

Summary

  • Minimise friction by making organisational tasks as quick and pleasant as possible.
  • Experiment with different workflows until you find one that you can sustain.
  • Log your time to become more efficient.
  • Log your progress to become more effective.
  • Take notes during meetings to create a “paper trail” and clearly jot down todo items.

Reading

Graduate work requires an awful lot of reading. Course material, blog posts, textbooks, emails, and most importantly: research papers. This chapter (i) shows that there’s more to reading a research paper than working your way from one page to the next; (ii) presents strategies to organise your reading; (iii) shows how you can learn about relevant new papers; and (iv) explains how you can access papers that are locked away behind paywalls.

How to read a research paper

I spent too much time reading research papers like a novel: cover-to-cover, in the mistaken belief that that’s how I would get the most out of it. I eventually learned that my time is not always best spent trying to understand the minute details of a paper’s method section – especially if I never intend to apply this method myself. Nobody awards you a medal if you fight your way through a paper that lost you on page two already. Sometimes, there is no need to read a paper’s method section at all – you may only be interested in the conclusion, or the section on data collection. Before diving into a paper, know what you want to get out of it. Needless to say, there’s nothing wrong with reading a paper cover-to-cover. I still do it all the time, but understand that this is not universally the best approach to reading a paper.

Throughout your research career, you will take a look at hundreds of research papers. You may not necessarily read them all cover to cover but you will at least skim them. When is a paper worth reading cover to cover? And when is it best to only skim a paper? And what exactly does skimming mean? There are heuristics that help answer these questions and save you time. Sometimes your time is better spent understanding a specific key aspect of a paper – for example its methodIn computer science, researchers curiously started referring to their method as methodology, which is the study of methods. If you are writing a paper that compares and contrasts methods in your field, you should refer to it as methodology but otherwise as method.

– and ignoring the rest.

When opening a paper for the first time, you will read its title. Not all titles are descriptive or provide a good idea of what a paper is about but as long as it sounds vaguely interesting, you want to read the abstract too. Abstracts are generally short, and can be read in one or two minutes. Ideally, an abstract reveals the problem that the paper tries to solve, explains how it solves the problem, and what the results are. A well-written abstract provides enough information for you to decide if the paper is worth diving into.

Next up is typically the introduction in which authors provide context on what research problem they are working (the problem statement) and on why it matters (the paper’s motivation). I often skip the introduction of papers that are in my field because I understand the context and I have already bought into the paper’s motivation. For papers far outside my field of expertise, the introduction can be the most interesting part: I occasionally read the introduction (and nothing else) of cryptography research papers because I cannot be bothered to dig into their proofs and mathematical models. The introductions however help me understand why a topic matters and how it relates to a broader field.

A handful of other sections can separate a paper’s introduction from its “meat.” Many papers have a dedicated background section in which they explain technical concepts that the reader may not be familiar with. A good rule of thumb is: if one is unlikely to encounter a concept in a standard computer science curriculum, it may warrant a few paragraphs in the background section. A section on related work is also common, in which the authors put their own work into the context of existing work in the field. Well-written related work sections are as helpful as they are rare. Many papers approach their related work as thoughtless lists of references without any context. “Person A did this; person B did that; person C did this.” That completely misses the point. A related work section is meant to answer questions like:

  • How does this work compare to similar work that was published in the field?

  • What advantages and disadvantages does this paper have over others?

  • How do papers overlap and complement each other?

What your readers really want to read is: “Person A did X but we decided to do Y. While X comes with stellar performance benefits, we believe that Y provides security benefits that are needed in our threat model. Future work should study a hybrid approach that incorporates both X and Y.”

After several sections on introduction and background, the actual research begins, typically introduced by a “method” section. A method can make or a break a research project. It’s what reviewers pay the most attention to because this is where flaws and biases have the biggest effect. If a given paper is very similar to your own research, you probably want to read its method. In particular, you will want to know if the paper’s method addresses issues that you failed to consider (or the other way around),

Another crucial aspects of a method are its assumptions. Each research project makes assumptions on the format of data, the number of users in a system, the way users interact with a system, the performance of underlying hardware, and so on. Ideally, these assumptions are spelled out explicitly, but sometimes they are implicit and therefore more difficult to understand. Assumptions are often flawed. For example, is the paper assuming a normal distribution of its data? But is the data more likely to follow a power law? If so, it may not be appropriate to use whatever statistical analysis tool the paper uses to study the data. When reading a paper, you want to get a good idea of what its assumptions are. Finding an issue in assumptions can quickly invalidate a project’s results.

The above is a very brief overview of what to pay attention to when reading a paper. A 2007 article by Keshav goes into more (Keshav 2007).

Adopt a reviewer mindset

Your mindset is a key aspect in how you approach a paper. It’s tempting to read papers the way we read textbooks; by assuming that each paragraph is an untouchable source of truth that is not to be questioned.Arguably, this is not true. Textbooks contain mistakes and have errata but we are still conditioned from an early age to believe what’s printed in books.

A significant part of graduate training consists of unlearning this mindset and replace it with perpetual questioning. We wouldn’t be where we are today if Galileo didn’t question the geocentric world view. Adopt the mindset that you are reviewing rather than reading a paper. You aren’t passively absorbing but rather actively verifying information. Your null hypothesis should be to distrust a paper’s results and only through sound reasoning and rigorous methods can a paper change your mind.

Peer review is meant to weed out critical flaws in papers but that does not mean that peer-reviewed papers are free of flaws. Peer review is not perfect and has false positives (flawed papers pass the filter) and false negatives (decent papers get rejected – often because of reviewer antics). Be vigilant and question everything, all the time. Throughout my career, the smartest and most capable people in the room all shared one trait: they had a well-calibrated bullshit filter and would not fall for “smooth talkers.” We all need to strive for that.

Don’t just pay attention to what a paper says; pay attention to what it doesn’t say. Do you believe that a dataset requires an important preprocessing step that a paper never mentions? Can you think of an evaluation scenario that’s not discussed in the paper? Is a paper conveniently ignoring a performance analysis of the proposed database? These are examples of problems that arise through omission. Keep in mind, however, that papers are subject to page limits and therefore must omit content at some point. Bad reviewers often forget that and criticise papers for missing their favourite kind of analysis. The trick is to include what matters and omit the rest, which is easier said than done.

I highly recommend taking notes when reading a paper – be it with pen and paper, on a tablet, or in a text file on your laptop. Here are a number of questions that can get your started with thinking critically about a research paper:

  • Briefly summarise the paper’s method.
    • Is the method section detailed enough to facilitate reproduction?
  • What are the paper’s assumptions?
    • How robust and realistic are these assumptions?
    • Can you think of counter-examples to these assumptions?
  • What are the paper’s key results?
    • Do these results contradict or confirm existing work?
  • How could the paper (its method, presentation, or results) be improved?

  • What follow-up research questions come to mind?

  • What’s the paper’s conclusion?
    • Do the results support the conclusions?Competition at high-ranked journals and conferences is fierce, which creates the incentive for researchers to overstate the importance of their own work. Be wary of that.

Read and engage in actual peer review

There’s only so much you can do to train your mind to be more vigilant. Luckily, there are ways to draw on the knowledge of more experienced researchers. Some conferences publish peer-reviewed papers together with their reviews. As a young Ph.D. student without any experience in peer review, I found it fascinating to get a glimpse of how peer review works in practice. The IMC conference used to publish paper reviewsThe last iteration that still published paper reviews was the Internet Measurement Conference 2013.

but eventually stopped doing so because it constituted a significant burden for its reviewers and the extra work was not worth the use people got out of it.

Other conferences implement a “shadow program committee” that gives graduate students the opportunity to participate in an “alternate universe” reviewing process, which ultimately leads to a “shadow conference program.”Several conferences are or were running a shadow program committee; for example the IEEE Symposium on Security & Privacy, the Internet Measurement Conference, EuroSys, and USENIX NSDI.

This is a great resource and I highly recommend participating at least once.

Once you got your feet wet with reviewing papers, you may want to join an actual technical program committee (TPC) – the set of people who review a conference’s paper submissions. People usually wind up on TPCs after being invited by the conference chairs. To get invited, the chairs need to know you or your research. That’s generally not the case for junior researchers. To work around that, ask your advisor if she or he can get you onto a TPC. Alternatively, your advisor can hand you one or two papers to review for the conference she is reviewing for.

Join or organise a reading group

We get the most out of reading a paper by discussing it with colleagues. Many university departments hold regular reading groups for this purpose. Similar to a book club, reading group meetings revolve around a paper. All members are supposed to read the paper in advance, and then discuss it during the meeting. Your department may already have a reading group but if not, why not organise one? It’s as simple as asking your colleagues “let’s meet next Tuesday for an hour to discuss the attached research paper.”Ask your advisor or your department chair if they are willing to “sponsor” the reading group by ordering pizza. Try to keep the culinary incentive small—if the food is too good, people will show up for the food without having read the paper.

N+1 brains are strictly better than N brains because everyone brings a unique perspective to the table. Each reading group I ever attended left me walking away with a significantly better understanding of the paper and its context than I could have acquired myself. Besides, you will refine your sense of what to pay attention to when reading a paper. It’s a great way to flex the “reviewer mindset” muscle in you.

Here are guidelines for successful reading groups:

  1. Nominate a “session lead” who gets to pick a paper, maybe gives a brief summary of the paper before the discussion, and then moderates the subsequent discussion. Session leads can change with every reading group iteration. It can be a humbling experience to explain a paper to someone else. We often don’t know if we truly understanding something before trying to explain it to someone else.

  2. Go into the reading group with a set of questions to discuss:
    • What did you like about the paper?
    • What did you dislike about the paper?
    • What are follow-up research questions?

    Reading groups always drift off topic. Without any guidance, you will find yourself discussing the advantages of vim over emacs after five minutes. The session lead should steer the conversation back into productive territory, which is easier when keeping a few questions in mind.

  3. In my experience, the most effective reading group has fewer than a dozen attendees. As the number of participants increases, discussions become disorganised and the reading groups less productive.

  4. Seek to organise reading groups around specific research areas and not entire fields – for example, “privacy and security” rather than just “computer science”.

Organise your reading

You are likely to read hundreds of research papers throughout your Ph.D. studies. If you’re anything like me, you won’t remember them all. Early on in my Ph.D. career, I was struggling with the sheer amount of papers that were waiting to be read. How should I archive these papers? How can I make the list of papers that I have already read easy to search? How can I show my Ph.D. advisor the progress I have made? Eventually, I started building a small Web page that listed all papers I have read; each paper contained its title, authors, proceedings, year, publisher, BibTeX record, and a pdf copy.In the hope that it would be useful to somebody else, I decided to publish this page shortly after I built an initial version. It must have been some time in 2012. Eight years after its original creation, I am still curating it because it ended up being a useful resource to me and others.

My bibliography on Internet censorship-related research papers.

Note

To facilitate the curation of this Web page, I wrote bibliograpy, a Python tool that takes as input a .bib file and turns it into an HTML bibliography. There’s a good chance that you will write your research papers in LaTeX, so a BibTeX file is a convenient way to keep track of your reading. Whenever you write a new paper, you can include your BibTeX file and your citations are readily available. The BibTeX format supports custom fields that are typically not used when generating a bibliography. You can use these fields to take notes of papers. Here’s an example:

Once you have your own, growing BibTeX file, I would encourage you to build a bibliography – and ideally publish it, so your colleagues get to benefit from it too. Once you finished an initial version of your bibliography, maintenance requires little effort. I regularly take a look at the proceedings of relevant conferencesIn my field, these are USENIX Security, ACM CCS, IEEE Security & Privacy, the Internet Society’s NDSS, and a few others.

and add new papers to my bibliography’s .bib file. I then run a script, deploy_website.sh, which builds the Web page and uploads it to my Web server. Automation is key here. I would never bother to curate my bibliography if it took more than five minutes to add new papers. Occasionally, I receive patches from colleagues who stumbled upon papers that I wasn’t aware of.

Remember, the best system is one that works for you. Some people like to print papers and take notes with a pen. I used to read papers on my laptop, using the tool MendeleyI no longer use or recommend Mendeley. It is now owned by the company Elsevier, a long-standing opponent to the open access movement. Elsevier does not deserve our support.

. At some point I got a tablet, which I now routinely use to read papers. I like its high-resolution screen and the tablet is less distracting to my laptop, allowing me to better focus on the paper. Experiment with a few ways of organising your reading and stick with whatever you like.

Learn about new papers

Every conference cycle brings with it a new set of papers with the potential to affect your research. It’s important to stay up to date on these new papers because (i) you must know if a research group has been working on the same problem as you, and published before you (this is called getting “scooped”); (ii) you can learn about new research directions that you may want to pursue; and (iii) you may learn about ways to improve your own work, e.g., by building on better methods or datasets. My favourite tools for finding out about new papers is to regularly skim conference proceedings, use arXiv subscriptions, and Google Scholar.

Skim conference proceedings

The primary source of new papers in your field should be your top conferences and journals. Each scientific discipline has a few venues that are considered “top-tier” and your advisor will likely encourage you to publish in these venues. It’s not easy for newcomers in a field to understand what venues are of the highest quality, so ask your advisor or colleagues. In my field of computer security, these are the Network and Distributed System Security Symposium (NDSS), the Conference on Computer and Communications Security (CCS), the USENIX Security Symposium, and the IEEE Symposium on Security & Privacy. Each of these conferences takes place once a year – one in every quarter. Once one of these conferences publishes its research papers, I spend thirty minutes skimming the list of papers, to see what’s worth reading.I first skim the titles and then take a look at the abstracts of the papers whose titles seem promising. Depending on the abstract, I may then decide to skim or read the paper.

There are always a few papers that I find really interesting. I cannot over-emphasise the importance of following your top venues. It is the source of new research in your field.

Sign up for Google Scholar

In addition to conferences, there’s another handy way to learn about new papers. Google operates a service, Google Scholar, that provides several useful features for academics, one of which is a notification system that sends you emails about new research papers in your field. I have been using Google Scholar for many years to get emails about new papers from a given set of authors or by keyword. You can provide Google Scholar with a handful of names and it will then send you an email notification each time one of these people published new work. After a few months of reading papers, you will be able to name a handful of researchers whose work you find particularly relevant or insightful. Consider adding their names to Google Scholar and get notified whenever they publish a new paper. You can also get email alerts for keywords that Google Scholar extracts from papers. The more specific the keywords, the more useful the alerts will be. I currently have an alert for the keywords “censorship”, “system”, “anonymity”, and “tor.”

Every other day, I get an email with about a dozen new papers. Most are uninteresting and some are amusingly unrelated, but occasionally Google Scholar sends me highly relevant papers that I would not have found otherwise, which makes the service absolutely worth it to me: The value I get from the occasional true positive far outweighs the annoyance of the regular false positives. Google Scholar once helped me learn about an important research paper that nobody in my field knew about because it was published at a somewhat atypical conference.

In addition to research papers, Google Scholar tracks the citation count of each paper. At least among computer scientists, Google Scholar has turned into the source of truth to learn somebody’s h-index.The h-index is the largest n for which one can say “I published n papers that were all cited at least n times”. Academic hiring committees frequently use the h-index to quantify the “productivity” and “impact” of a researchers. I’m putting these two words under quotation marks because like all metrics, the h-index is a poor approximation of one’s productivity and people have learned to abuse it by forming citation rings and engaging in “salami publishing.”

It’s not healthy to obsess over one’s h-index but it can sometimes come in handy: I once had to provide my h-index as part of my green card application for the United States. I applied for a green card using the “national interest waiver” track, which requires the applicant to prove that they have an established research record. Providing one’s h-index is part of this proof.

Use arXiv subscriptions

The arXiv is a database of preprintsA preprint is a paper that has not yet been peer reviewed.

that was originally conceived by the physics community. It has since gained popularity in many other fields including computer science, and some disciplines built their own version of the arXiv; like bioRxiv in the biological sciences. Many researchers post their papers to the arXiv before (or in parallel to) submitting them to a journal or conference. Note that most journals do not accept work that has been previously published – this rule typically does not include preprints but in case of doubt, you should check with the editors.

The arXiv manages an email subscription service that allows you to subscribe to research topics of your interest. Similar to Google Scholar, this email service will alert your of new papers.

Circumvent paywalls

Most research papers can be found in the databases of academic publishers like IEEE, ACM, Springer, Elsevier, Sage, or Wiley. It is the very job of these publishers to make available research to researchers. Unfortunately, access to these databases is not universally free. One can pay to access single articles but universities generally sign up for subscriptions, which allows university employees to access (a subset of) all of a publisher’s articles. The cost of these subscriptions is exorbitant and has prompted numerous universities to cancel them. In 2019, after months of unfruitful negotiations, the University of California system decided to cancel its Elsevier subscription.

A paywall at the ACM Digital Library. ACM non-members have to pay $15 to purchase this article. A simple Google search for the paper title typically uncovers a freely accessible version of the same paper.

Note

Researchers without a university affiliation, or an employer that is not wealthy enough to pay for subscriptions, are stuck behind paywalls. I mentioned earlier that a typical Ph.D. student will skim hundreds of research papers throughout her education. Even if it’s just 200 research papers; at an average price of $15 per paper, this would amount to $3,000 dollars – unaffordable to many.

Thankfully, there are ways around paywalls. The most popular service these days is Sci-Hub.Sci-Hub is the brainchild of the scientist Alexandra Elbakyan who is still curating the service.

The site’s minimalistic interface has a search bar that allows you to search for a paper’s URL or DOI. For the above paper, Sci-Hub promptly opened the pdf for the URL https://dl.acm.org/purchase.cfm?id=2517856. Note that publishers deem Sci-Hub a threat and keep trying to have its domains taken down. If sci-hub.se ever stops working, the website whereisscihub.now.sh can point you to its latest domain.

Sci-Hub’s website, available at sci-hub.se as of February 2020. Add a paper’s URL or DOI in the search box and get instant access.

Note

If Sci-Hub was unable to find a paper for you, then take a look at Library Genesis (often abbreviated as Libgen). The project has similar goals, providing a web frontend for a database full of scholarly articles and books. Another option is social media. On Twitter, people have started using the hash tag #icanhazpdf to ask for papers that other Twitter users will make available. The /r/scholar subreddit is used to request scholarly literature.

The #icanhazpdf Twitter hash tag in action.

Note

Finally, you can often find a paper (or at least its preprint) by simply searching the web for the paper’s title. Many authors (including myself) make available their research papers on their personal websites.All of my research papers are available on my personal website and I encourage you to do the same. People occasionally hesitate to put their papers online because it may be a violation of the copyright agreement they have with their respective publishers. That may be true but I’m not aware of anyone ever getting into trouble for that.

If all else fails, email the authors of the paper you need, and ask for a copy of their work. Don’t worry about crossing a line: If anything, the authors will feel flattered that somebody is showing interest in their work. All authors I’ve personally asked for a copy of their work (most of whom I have never met) have generously sent me a copy. Talent is everywhere but opportunity is not. Many of our colleagues cannot afford to keep up with the literature because science is being held hostage by greedy publishers that failed to adapt to the internet. It is our duty to support our colleagues by making our work freely available.

Summary

  • Don’t feel obliged to read a research paper cover to cover. Know what you want to get out of it and focus on what matters to you.

  • Train yourself to question everything, all the time. Join reading groups and engage in (shadow) program committees to train your questioning skills.

  • Organise your reading somehow. A single BibTeX file can work surprisingly well.

Writing

It is very common for researchers to experience writing as the most frustrating part of doing research. After all, people start careers in research because of the actual research and not the act of writing up results. I believe that part of the frustration with writing comes from poor strategy and organisation: writing is seen as a necessary evil that happens last minute; the idea of creating twelve pages worth of content triggers anxiety; and the obscure nature of LaTeX makes everything worse.

This chapter discusses ways to make the process more structured, effective, and – perhaps most importantly – less painful.

Start writing

What is the point of writing tips, you may ask, if you cannot get yourself to write in the first place? We have all been there, many times. Writing is a deeply creative process and therefore difficult to invoke on demand. There are however some hacks that help with getting you to write.

Write consistently

Have you ever accomplished something hard and meaningful, only to have people tell you that they wouldn’t have the “motivation” to do the same? That’s nonsense. Motivation doesn’t help you achieve goals because it comes and goes, and nobody is always motivated. You accomplished your goal because you’re disciplined. When motivation is nowhere to be seen, discipline is what keeps you going. Don’t conflate these two concepts.

And as you may expect, discipline and consistency is also the key to getting writing done. Motivation alone won’t get you far. I’m writing these very words while feeling unmotivated to work on this book. It’s a sunny Saturday morning here in San Diego and I would rather be outside. However, each day of my todo list has “Write on book,” and that’s what I’m doing. If I always waited for motivation to strike, I would never get anything done. Sometimes I only edit two sentences and sometimes I spend an hour adding a new chunk of writing. There are good and bad days but you have to keep showing up and do the work – even if it’s just a little. I don’t have much writing to show for on any given day but small and steady improvements keep adding up. You would be surprised by the amount of progress you can make by simply being consistent.

Consistent writing does not only help your progress. It also helps your writing! I strongly believe that it is impossible to churn out good writing in a short amount of time. Instead, good writing ages, like an expensive bottle of wine. I wrote my best research papers over multiple months, in small increments, and edited my writing frequently. Each iteration made the writing a tiny bit better. On some days, all I did was to add two or three sentences. Or rephase the caption of an image. On others days, I felt more creative, and worked my way through several paragraphs, and maybe even sections. Don’t let anyone or anything fool you into thinking that you’re not a good writer. Just keep going at it, one day at a time, and you will eventually have great writing to show. Trust me on that.

Capitalise on creativity

Writing is a deeply creative process but unfortunately one cannot enforce creativity. When it finally does show up, it is important to make the best of it. In the words of the insightful Naval Ravikant: “Inspiration is perishable – act on it immediately.” Open your text editor, brew yourself a delicious cup of coffee,The coffee is not optional. To engange in a task you don’t particularly enjoy, you need to make it more attractive. Applied to writing, this could mean getting your favourite (non-alcoholic) beverage, listen to relaxing music, or writing in the park, under the sun.

and get to work! Ignore your surroundings as much as you can until you feel your creativity wane. I argued above that you cannot rely on motivation alone but if you do feel motivated, make the best of it.

I produce 80% of my work in 20% of my time (the pareto principle all over again). That is not because I’m lazy and waste 80% of my time but rather, these 20% of my time is when I’m particularly focused and creative. I am then able to produce work at a quality and quantity that is outside my reach on most days. Learn to identify when you visit these periods of peak creativity and don’t let them go to waste!

Living documents

I have met many researchers whose idea of writing is to wait until three days before a submission deadline and then engage in a manic, caffeine-fuelled sprint towards churning out twelve pages. Needless to say, the output is always underwhelming. In addition, the prospect of having to write a full paper in a short amount of time is daunting and agonising, which leads to even more procrastination – a negative feedback loop.

I encourage you to instead space out your writing process over time, and let your papers evolve. Try to write a little, whenever you feel like you have something to say. Capitalise on your creativity. I treat my research papers like a living document. One of the first things I do when starting a new research project is the creation of a paper.tex file. I use it to jot down bullet points about research ideas, how these ideas connect, and can ultimately presented. These bullet points eventually turn into sentences, paragraphs, and then a finished paper. Once I start writing code or collecting measurement results, I try to distill key results and start thinking about a narrative.

I find it helpful to approach a research paper by distilling its key messages. What are your 1–3 core insights? What evidence do you have that supports these insights? The rest of the paper is then spun around these key messages. Not only will your writing quality improve drastically, you will also look at the writing process in a more favorable way, which leads to more writing – a positive feedback loop.

Modular decomposition of a research paper

A finished research paper can evoke emotions of satisfaction or anxiety – depending on where you are in the writing process. The finish product may seem overwhelming but remind yourself that you won’t be tackling it all at once, just like you don’t climb a tall mountain all in one go but you take breaks, rest, and make steady, little, consistent process. It’s much easier to make steady progress by breaking a research paper into separate modules, which become easier to tackle. We use the concept of “modular decomposition” in programming to break down programs into separate modules that interact with each other, and I’m telling you that you can do the same with research papers. A research paper consists of several sections: typically an introduction, related work, presentation of your method, experimental results, and so on. Each of these sections consists of several subsections. For example, your experimental results may consists of a subsection on experimental setup, data pruning, and visualisation. Drafting a subsection on data pruning is significantly less daunting then writing the entire section on experimental results.

Once your broke your research paper into pieces, tackle these pieces in isolation. Ask your advisor or collaborators to help you with breaking your paper into pieces.

Note that the above sections apply to any creative work and not just writing. I put them in the writing section because writing happens to be the activity that most people struggle with.

Write better

Many books have been written on writing in general and academic writing in particular.

  • Delete unnecessary words. Many people often make their academic writing very fluffy, which makes it difficult and tedious for a reader to read. Notice how awful that sentence was? Let’s delete unnecessary words: Many people often make their academic writing very fluffy, which makes it difficult and tedious for a reader to read.

    Go over your writing sentence by sentence and delete every word that isn’t necessary. Removing the fluff from your writing makes it substantially easier to read. Most college student (including yours truly) eventually pick up the annoying habit of adding unnecessary words to sound sophisticated and pad their writing assignments. It is now time to unlearn this habit.

  • Use active instead of passive voice. Instead of “the data was analysed,” write “we analysed the data.” Instead of “it has been shown by Turing,” write “Turing showed that.” Excessive use of the passive voice is yet another habit that’s particularly widespread in academia. Don’t imitate the lifeless and boring writing of your peers.

  • Engage your audience. Never start your paper with something entirely obvious and uncreative like “The Internet has become the world’s biggest communication network.” You are more creative than that. Nobody expects pearls of wisdom at the very beginning of a paper but make an active effort to make your paper as engaging to your reader as possible. Take a look at the introduction of a seminal paper in the field of cryptography (Diffie and Hellman 1976). Its first sentence is a classic.

    Note

Finally, keep in mind that good writing is often subjective, and not everyone will agree with the advice above. I once got the following feedback on one of my paper submissions:

First, authors abuse of questions in the paper. Writing a scientific paper is not writing a “thriller”. No suspense is required (this makes the reading pretty much annoying). We only have to write about facts.

Reviewer A

Reviewer A is the kind of person who starts their paper by pointing out how important the Internet has become. Don’t be like reviewer A. Bring some colour to your writing.

Ask for feedback

Good writing rarely happens in isolation. Even professional novelists have editors and actively seek out feedback. In academia, advisors often assume the role of an editor (if you’re lucky) while friends colleagues can provide you with feedback. As a Ph.D. student, my friends and I would often share drafts with each other even though we worked in slightly different subfields. I would review my friend’s applied cryptography paper and while I didn’t understand every single aspect about it, I would still learn a lot and provide an important perspective: the one of a potential reviewer.

The feedback of somebody who’s not intimately familiar with your research is so important because it’s not affected by the “curse of knowledge” – a cognitive bias that emerges when a professional has a hard time putting herself in the shoes of a newcomer.

When asking your peers for feedback, keep in mind that somebody can read your work for the first time only once. The first reading is particularly important because somebody approaches your writing with a fresh mind and an unbiased frame of reference. Don’t waste these opportunities. Instead of sending your very first draft to all the people you know, send it to only one or maybe two. Pay close attention to their feedback, address it, and then send the revised draft to the next person. After incorporating this “chained” feedback, your paper will end up much stronger than it would have, had you gotten feedback on a single revision of your paper.

Some of your colleagues may be inexperienced and not know what exactly they should pay attention to when reading your work. In that case, help them out by letting them know what you want feedback on – be it the narrative in the introduction, or the intelligibility of your method section, or the informative value of your diagrams in the results.

Write effectively

Writing effectively means writing in LaTeX. While LaTeX tries hard to produce good-looking output, most computer science papers are still poorly typeset. This comes as no surprise because unlike other fields, we typeset our own papers and hardly anybody is ever trained in typesetting. But again, by embracing a handful of rules, you can greatly improve the visual clarity of your writing.

Use LaTeX comments

LaTeX interprets lines that begin with a % character as comments. I recommend using comments in the following scenarios:

Balancing references

The package balance balances your list of references, making them visually more pleasing. Include the package by adding \usepackage{balance} to your list of packages and use it by adding \balance before you include your references. Compare the end of a bibliography without balance:

And here is the same bibliography after using the balance package:

Better tables

What table looks better? The one on the left or the one on the right? We both know it’s the one on the right! Notice how horizontal and vertical lines are almost absent, removing visual clutter. Both the top and the bottom of the table exhibit more spacing, making the table less cramped, and the quantities in the two rightmost columns are right-aligned instead of centered, rendering the numbers easier to compare.

The booktabs package helps you build create more pleasing table and, equally important, read its documentation because it does a great job at explaining how to properly typeset tables. For a quick reference, here’s the LaTeX code of the above table on the left.

Backreferences

It is possible to add backreferences to your references. At the cost of using up a little extra space, backreferences list the pages that cite a given reference. Backreferences are not have a must-have but a nice feature regardless because they make it easier to find a given reference. People also often look for their own work in a paper’s references because they are curious in what context their work is cited. Backreferences make this easier.

An example of backreferences for three papers. Note the “Cited on” right after the URLs. The hyperref package makes these pages clickable, rendering it easy to jump directly to the respective references. The acronym “p.” is short for “page” and “pp.” is short for “pages.”

Note

There are several ways to implement backreferences. One option is to use the popular hyperref package’s pagebackref option as follows:

Self-contained diagrams

The TikZ package makes it possible to typeset diagrams directly in LaTeX. TikZ diagrams have the advantage that they are of very small size, they use LaTeX fonts (and therefore look less clunky), and can be created with just a text editor. On my personal website I maintain several TikZ examples examples that help you get started. Granted, it does take time to learn TikZ but I recommend it for those among you who value high quality type setting.

Note how this TikZ digram integrates well by sharing a font with the surrounding text. It is typeset directly in LaTeX, making it lightweight and visually appealing.

Note

Ggplot2 is a plotting library for the R programming language. It has a feature that exports plots as TikZ code rather than, say, pdf. You can then embed this TikZ code in your research paper. You can make use of this feature by adding the following line to your R script:

library(tikzDevice)

Common LaTeX mistakes

While reviewing research papers, the same kind of LaTeX mistakes stick out to me over and over again. Fortunately, they are easy to fix.

  • Use decimal separators to make large numbers easier to read.
    Bad: 1000000
    Good: 1,000,000 (or 1.000.000, depending on your language)

  • Use a ~ to prevent dangling references.
    Bad: Newton et al. [1]
    Good: Newton et al.~[1]

  • Citations are not nouns.
    Bad: as discussed in~[1]
    Good: as discussed by Newton~[1]

  • Use proper LaTeX quotation marks.
    Bad: "Foo"
    Good: ``Foo'' (note that quotation marks are language-dependent.)

  • Reference more specific parts of a paper if possible.
    Bad: See Newton et al.~\cite{Newton}
    Good: See Newton et al.~\cite[\S~5]{Newton}

Don’t miss the comprehensive typesetting guides of Eddie Kohler, Markus Kuhn, and D. J. Bernstein’s to learn more about effective and beautiful typesetting in LaTeX.

Also take a look at How to write in plain English.

Building LaTeX papers

LaTeX papers with references are cumbersome to compile, remembered by most people as “a few runs of pdflatex plus a few runs of bibtex.” Wouldn’t it be much simpler if all you need to do is type make? Here’s how: I recommend using a compilation tool such as rubber plus a Makefile. Below is an example of a Makefile that I typically use for all of my research papers. The Makefile assumes that your root document is called paper.tex.

The environment variable GS_OPTIONS ensures that all fonts that the paper uses are embedded, so the pdf looks the same on each machine, no matter what fonts are installed. This is a requirement of many conferences and generally best practice.

When using this Makefile, make sure that the indented lines containing the two rubber commands must be prefixed by a tab character and not by spaces. Take a look at chapter TBA to learn more about creating Makefiles.

Makefiles are powerful and excel at tasks that involve repeated processing of files. I use a Makefile to compile this book from the markdown format to HTML, epub, and pdf; and also to automatically publish new drafts of the book. The Makefile’s target is index.html–the HTML file I want to create. The prerequisites are book.md, pandoc.css, and references.bib–the source files that are necessary to produce the HTML file. Finally, the recipe is an invocation of the tool pandoc, which converts my markdown file to an HTML file.

Whenever I added some more content, I type make, which compiles the given source files into an HTML file, which I have open in my browser. If I type make and nothing has changed since the last build, I see:

A Makefile can also contain rules that are not about compiling input into output files. To share drafts of my book, I upload it to my personal web server. This involves copying the relevant files into a directory that contains my websites, and then invoking the script that syncs web content from my laptop to my web server. All of this happens by simply running make publish. If the book’s output formats currently don’t exist, make will first compile them (hence the prerequisite on $(book_output)). Then, an invocation to cp copies the book’s HTML files to another directory on my laptop and, finally, I invoke the script that uses rsync to sync all files to my web server.

If you are no fan of command line tools, you can still benefit from LaTeX by using one of its online development systems. The tool Overleaf has often been popular among my collaborators.

A LaTeX template

Below, you can find a LaTeX template that I use for research papers. When submitting a paper to a conference, you typically have to use the conference style – which you can simply add to the template – but you may also have to change or remove parts of the template, depending on how restrictive the conference style is.

Note that \input{introduction} is replaced with the content of introduction.tex. I find it convenient to outsource sections to separate file because it makes your paper easier to manage and it also helps with version control if multiple people are working on a paper.

Pre-submission paper checks

Conferences and journals almost always have specific requirements that paper submissions need to satisfy. It’s frustrating to have your paper rejected for unnecessary reasons like formatting violations, so it’s a good idea to spend five minutes checking the conference’s requirements before pressing the “submit” button.

  • Make sure that your paper is within the page limit. The page limit sometimes includes and sometimes excludes references or appendices, so read carefully.

  • LaTex shows broken references as question marks. Do a Ctrl + F for the string [?] to find broken references.

  • Make sure that all fonts were properly embedded in your pdf. On Linux, I use the tool pdffonts which is part of the Debian package poppler-utils. I run it as pdffonts file.pdf and it displays a column called “emb,” which shows if a given font is embedded or not. While using pdffonts to write this paragraph, I realised to my dismay that one of my old papers did not embed all of its fonts:

    $ pdffonts Winter2012a.pdf
    name                                 type              encoding         emb sub uni object ID
    ------------------------------------ ----------------- ---------------- --- --- --- ---------
    GJYVBN+NimbusRomNo9L-Medi            Type 1            Custom           yes yes no     100  0
    NLMFQI+NimbusRomNo9L-Regu            Type 1            Custom           yes yes no     101  0
    XNJNRQ+NimbusRomNo9L-ReguItal        Type 1            Custom           yes yes no     102  0
    ZZEWFV+CMSY10                        Type 1            Builtin          yes yes no     103  0
    UIPGCJ+CMTT8                         Type 1            Builtin          yes yes no     127  0
    Helvetica                            Type 1            Custom           no  no  no     174  0
    Helvetica                            Type 1            Custom           no  no  no     180  0
    HNYWOO+StandardSymL-Slant_167        Type 1            Builtin          yes yes no     203  0
    JHYTSG+CMR10                         Type 1            Builtin          yes yes no     204  0
    CUJHND+CMMI10                        Type 1            Builtin          yes yes no     205  0
    ZapfDingbats                         Type 1            ZapfDingbats     no  no  no     211  0
    Helvetica                            Type 1            Custom           no  no  no     212  0
    Helvetica                            Type 1            Custom           no  no  no     218  0
    XEQPPW+CMTT10                        Type 1            Builtin          yes yes no     242  0

Git integration

LaTeX files are all text files, which makes them prime candidates for version control. I recommend putting all your LaTeX source files into a git repository.It doesn’t matter if you prefer subversion, CVS, or mercurial over git. What matters is that you use some sort of version control. I like git because it has emerged as the most popular system and with that comes great documentation, tooling, and most people you collaborate with will have at least some understanding of git.

Having your paper under version control has several advantages:

  • No writing is ever lost. Whatever you remove during editing is part of git’s history and can always be recovered.

  • You can easily determine the difference between two versions of your paper, making it easy to produce a pdf that highlights differences.

  • You can tell who changed what.

Use tags for milestones

A specific git commit can be assigned a “tag,” which is effectively an arbitrary label. Git tags are often used for version numbers. Whenever you publish a new version of your software, you assign the latest commit a tag like “0.2.4.” It doesn’t have to be version numbers though. I like to tag important milestones of my writing, for example whenever I submit a paper to a conference, or to the arXiv, or when I publish the final camera-ready version of a paper. You can even assign a tag to remember when you sent your paper to your advisor for feedback.

* 5de077a - (tag: ndss17-camera-ready) added cs to my email (3 years, 7 months ago) <laurar>
...
* 2cd29b1 - (tag: arXiv-resubmission-1) fixed last paragraph of internet scale section based on corrected plots (3 years, 9 months ago) <laurar>
...
* fabf1e3 - (tag: arXiv-submission) Turn passive into active voice. (3 years, 10 months ago) <Philipp Winter>
...
* 2187ef7 - (tag: NDSS-submission) Minor style harmonization and spelling fixes. (3 years, 11 months ago) <Philipp Winter>

Learn who changed what

With multiple people working on the same project, you will occasionally notice mistakes in your writing. Some of these mistakes may require discussion and instead of asking all your collaborators who’s responsible for a given piece of writing, you can find out yourself, by using the git’s “blame” functionality. It’s as simple as running git blame FILE. The output is the text file and for each line you can see when it was last changed, by whom, and as part of what commit.

Help git do its job

Remember to make one change per commit. Here are a few examples in the context of research papers:

  • Fixing one or more typos. If somebody is proof-reading an entire paper, it’s fine to have a single commit that fixes many (or all) typos in the paper.

  • Add a reference. Many claims need to be supported by references. Such a commit may add a new reference to the BibTeX file and then reference it in the corresponding LaTeX file.

  • Rephrase a paragraph or section. You may not like the way a paragraph (or entire section) is phrased. The action of rephrasing this paragraph or section should go in one commit. If you want to rephrase several pages worth of writing, consider using multiple commits.

  • Add more writing. Adding a coherent argument, paragraph, or section should go into a single commit. Adding two two independent paragraphs two separate sections should go into two commits.

  • Delete text to meet a page limit. Papers must sometimes be trimmed to meet a page limit. Unless it severely cripples the paper, it’s fine to do this in a single commit.

Note that making small changes is not always possible or reasonable. As you are rewriting a paragraph, you may realise that the rewrite only makes sense if you also rewrite the paragraphs before and after. This is fine. The above recommendations are just that: recommendations.

I personally find it helpful if paragraphs of text are broken into several lines spanning a maximum of 80 characters, instead of a single line of text. This makes it easier to inspect commit messages and understand what change was made. Consider the following example:

Only a single character changed in this paragraph, which is formatted as a one line. It’s difficult to see what changed because the line is so long.

Here, the same paragraph (and the same change) is formatted as separate lines. It’s easier to see what character was changed in this commit.

Communicating

Regardless of what research you do, a substantial part of your job will be communication; mostly with your peers, but ideally also with the general public. We communicate constantly, by writing papers, sending emails, talking to advisors, presenting our work, and by complaining on Twitter. Being an outstanding researcher goes a long way but to truly excel, we have to also master communication.

Effective communication creates numerous opportunities by 1) exposing your research to people who would otherwise not see it, 2) saving time, 3) “selling” your work, and by 4) earning the respect of your collaborators.

In this chapter, I will encourage you to create project pages, publish pre-prints, present effectively, engage in popular science writing, and use social media to your advantage. Regarding the more “intimate” communication with your peers, this chapter also goes into socialising, manage your collaborators, proper email etiquette, picking the right communication mode, and the reasons for communicating openly.

…with the world

Project pages

Data and source code of research papers are often only available upon request.

Note

Have you ever stumbled upon a promising research paper that mentions that its source code is available upon emailing the authors? Only to find out that the authors’ email addresses are no longer available? Or they don’t respond to your email? Or they did get back to you but cannot find their source code anymore? The main output of a research project is the resulting scientific paper and once it’s published, there is little incentive for authors to do more.

Early on in my Ph.D. life, I made it a habit to create project pages for almost every research project I have ever been involved in, including:

  1. An network traffic obfuscation protocol
  2. a scanner for Tor exit relays
  3. a Sybil detection tool
  4. a DNS measurement study
  5. a usability study on onion services
  6. an analysis of weak RSA keys in Tor relays
  7. A study on China’s Great Firewall (and another one)
  8. a measurement study using the RIPE Atlas network

The workload in research can be overwhelming and having to take care of yet another part of a project may sound daunting but the creation of a project page doesn’t take much time – maybe one afternoon, if you take your time. Once you have a template, you can re-use it for your next project, minimising the marginal cost of each new project page.

I recommend that project pages have at least the following sections:

  • Project summary: Start with a paragraph that summarises your project. Similar to an abstract, it should convey (i) what problem your project solves, (ii) how it solves the problem, and (iii) what the results are. Try to write the project summary for a broad audience; write it the way you would explain your research to somebody in another department, or to somebody in the grocery store. In other words: use simple language and avoid jargon.

  • Datasets: Does your research come with a dataset? If so, your project page should link to the data. You don’t need to host datasets yourself as this can be difficult for large datasets. Consider using the Internet Archive to archive your dataset and have your project page link to your Internet Archive page.

  • Code: Your code matters because it allows others to reproduce your work. We therefore have an obligation to publish our code. Don’t ever be embarrassed of your code’s quality because code is never perfect. Nobody reasonable person would every judge you by your code’s quality. As with datasets, there is no need to host code yourself: feel free to link to a GitHub or GitLab repository.

  • Papers: Being the main outcome of a research project, we should all make our research papers and other write-ups available on our project page. Be sure to make your paper openly accessible instead of linking to a paywalled portal. Research papers behind a paywall are an injustice and prevent less wealthy scientists from engaging in the scientific discourse. If you are worried about legal consequences of publishing a paper that’s not meant to be published: don’t be. I have yet to hear of a single case of a scientist getting into trouble for making available their own work.

  • Contact information: Consider providing contact information to make it easy for fellow researchers to reach out to you. Try to use email addresses that will still work five years from now – even if this means using your GMail address instead of your university’s email address.

I recommend keeping your project pages under your control, so you can edit them whenever you need to. It’s difficult to update the page if it is hosted at university.edu/project/ and you are no longer employed by your former university. At some point I decided to host all my project pages on my own Web server, nymity.ch, which gives me full control but this control comes at a price: responsibility. It is now your responsibility to keep your Web server alive, refresh your domain names and HTTPS certificates. If you want the same control with less responsibility, I recommend hosting your pages on services like GitHub Pages.

It is increasingly common to buy fancy domains for project pages, often ending in the desirable “.io” top level domain. There is nothing wrong with that but try to not let these domains expire and your project page disappear. Are you still going to pay the yearly $15 fee for myproject.io ten years from now? If not, then don’t go down that route.

To get you started with project pages, feel free to use the following template that gets you a simple, fast, and decent-looking project page in little time.

One can think of project pages as documentation of a finished piece of work but I prefer to think of them as living documents that evolve as a research project progresses. The earlier you can share information about your work, the better. Research papers are often preceded by workshop papers, posters, abstracts, or presentations. All of these are worth making available early on, on a project page. In fact, a project page can serve as documentation for yourself, to keep track of your project’s output. I am not suggesting to create project pages simply for altruistic reasons; you get something out of it too:

  • You get an idea of your audience by taking a look at your Web server logs. I used to regularly check the visitor log of my project pages. It was interesting to see what universities and departments would look at my work. In fact, it was gratifying to realise that anyone at all was interested in reading my work.

  • You expose your research to a broader audience. Research papers follow a style of writing and presentation that can be alienating to a general audience. Project pages mitigate this problem. Somebody who would not read your paper may read your project page – and perhaps then decide to take a look at the paper too.

  • It signals to potential employers that you go the extra mile and care about the presentation of your work even if you don’t have to.

Publish preprints

For fear of getting scooped, research projects typically remain confidential until publication of a peer-reviewed paper. Getting a paper through peer review can take many months if not years because it is common for a paper to be submitted multiple times for review. Throughout all this time, your work could have been useful to others.

In a short-lived field like computer science, this antiquated publication model causes frustrating and unnecessary delays. It does not have to be this way. While we don’t get around publishing peer-reviewed papers – it is academia’s currency, after all – we can publish a technical report before the final, peer-reviewed version of a paper is out. If you are still not convinced: Correa et al. (Correa et al. 2020) provide (not yet peer-reviewed) evidence that openly accessible papers are cited more than closed access papers.

Originally created for the publication of physics pre-prints, the arXiv turned into computer science’s most popular pre-print publication platform too. You “publish” your work on the arXiv by uploading your research paper’s LaTeX code (be sure to first remove all cusswords in the comments) and, after a moderator reviewed your submission, your article will appear on the arXiv – typically after one or two days.

Conveniently, the arXiv provides a notification system that informs subscribers about new reports in their area of interest. This means that a non-trivial number of people who subscribe to the field “computer networks” will get a notification after the publication of your new report in computer networks.

A frequent concern about the arXiv is that many conferences don’t allow paper submissions that have previously been published in a peer-reviewed venue. Fortunately, the arXiv is not peer-reviewed, so a report on arXiv typically does not count as “published.” In my field of computer security, all top-tier conferences accept papers that were previously “published” on arXiv. Regardless, in case of doubt, ask a conference’s program chairs to clarify their policy regarding previously published (but not yet peer-reviewed) technical reports.

“But Philipp,” you may ask, "why go through the extra trouble of uploading your report to the arXiv? It’s all about exposure. Once your report is published, many of your peers will come across it – be it over Google Scholar, which crawls the Internet for research papers; the arXiv’s in-house notification system; or other aggregators. Early exposure can result in citations, potential collaboration, or at least people having heard of your work.

Presenting

A good conference presentation opens doors. Science journalists may approach you to write a popular science article about your work,Or, in the time-tested academic tradition of unpaid labour, they may ask you to do it for them.

people from industry may wonder how one would deploy your research, and other academics may suggest projects to collaborate on. A great presentation can elevate your research from obscure insignificance to something that people talk about. Even if your research is not spectacular, a great presentation sets you apart from other presenters. Take presentations seriously.

Most conference talks I have attended are a missed opportunity. The average academic talk is difficult to follow, poorly structured, and dispassionate. Entire books have been written on effective presenting and I am not going to compete with these books. Instead, I am going to distill my advice into a few key points:

  • Rehearse your talks. Some people believe the myth that great presenters are born instead of made. This is wrong. My best talks were the result of numerous (up to a dozen) rehearsals. That’s why they were my best talks. With more rehearsing comes confidence. You will know what to say, resulting in fewer “ehms,” poor transitions, and awkward pauses because you won’t have to try make sense of your own slides. Consider recording yourself to use your voice more effectively, improve your body language, and be mindful of and eliminate fillers like “ehm,” “you know,” and “like.”

  • Capture your audience’s attention. Don’t dive right into the research. Try to start with a lighthearted joke, an interesting anecdote, or anything that gets people engaged. I once presented a paper on Sybil attacks and curiously, my name was listed twice on the conference’s list of accepted papers. I used this fact to start my presentation with a joke that got a few laughs.

  • Focus on what matters. It is very common for presenters to ramble on about irrelevant details. Keep in mind that your audience is very limited in what it can take away from your presentation. Ask yourself: what are the two or three most important points that I want my audience to remember? Spin your presentation around these points.

  • Have a narrative. Every sentence you say should be directly connected to the previous sentence. If you jump from one topic to another without proper transition, you will gradually lose your audience. Even with a proper narrative it can be difficult to follow a talk. Recapitulate occasionally, e.g., by saying “now that we looked at X and Y, it’s time to talk about Z.”

If you would like to learn more, take a look at Patrick Winston’s excellent lecture on “How To Speak”.

A good presentation uses slides sparingly but effectively. Here are my suggestions for optimal slide use:

  • Minimise the number of words on slides and avoid clutter. Your audience is going to read what’s on your slides and while they are reading, they cannot pay attention to you. Your slides are supporting material and are not supposed to keep the audience busy reading.

  • Use slide numbers, allowing your audience later in the Q&A to reference specific slides.

  • Make sure that the font (even in diagrams) is big enough that even people in the last rows can clearly read it. Most presenters get this one wrong.

  • When presenting charts, guide the audience. Explain the axes, discuss how one should read your chart, and highlight important insights.

  • Optimise your slides for the 16:9 widescreen format, which is now supported by all modern projectors. To be safe, consider exporting a second slide set for the (outdated and increasingly rare) 4:3 aspect ratio.

  • Use a sans-serif font (e.g. Arial) and avoid serif fonts (e.g. Times New Roman) because they are optimised for reading large amounts of text. Needless to say, this is not critical advice but I find it useful nonetheless.

Science Twitter

Twitter has a (sometimes deserved) reputation for being a time sink fueled by conflict and outrage but in my experience, the platform is all about who you follow. If you follow the right people, you will learn a lot. Over the years, I compiled a set of Twitter accounts that share sharp insights and thoughtful commentary. I find that one can find high-quality discourse on Twitter that’s similar to dinner conversations at conferences. The best thing about Twitter is that you don’t have to pay $800 worth of conference registration fees to participate in these conversations.

In deciding who to follow, I use the heuristic of following somebody for a few days or weeks and if I get the feeling that I don’t learn much from this person, I unfollow them again. Twitter is as much a marketing platform as it is a discussion platform and some people push the marketing aspect a little bit too far for my taste.

While you’re at it: use the opportunity to follow people outside your field. And by that, I don’t mean somebody in programming language design if you’re in computer vision. Follow people in psychology, economics, or biology. You can learn a lot by observing what problems scientists in other fields struggle with, and how culture and methods differ.

Twitter can be a great option to stay in the loop on various topics:

  • Conferences and workshopsFor the non-computer scientist: original research in computer science is typically submitted to conferences instead of journals.

    carry with them a reputation. When I was new to research, I did not know that. Eventually, I got a feeling of this (sometimes informal) “ranking” and what conferences carry the most prestige. Listening to researchers talk about conferences will help you get a sense of where you should submit your work to.

  • Professors occasionally talk about faculty hiring processes, what they look for in Ph.D. or grant applications, and their opinions on the peer review process. While this knowledge does not necessarily generalise to all of academia, it is still helpful.

  • Some researchers openly talk about their paper rejections, which serves as an important reminder that even the people you admire deal with rejection just as much (if not more) than you do. This helps calibrate your perspective.

  • By engaging in discussions, you will eventually build a following, allowing you to promote your own work more effectively.

By no means do you need to use Twitter to be successful in your field but controlled use can be an advantage. However, avoid Twitter fights as they make you look like a combative fool to bystanders. Also, make an effort to post interesting and insightful content and don’t just advertise your latest paper.

Teach the public

Communicating your work does not have to end with your peers. We have a responsibility to make our work accessible to the broader public. To that end, I have published two articles in The ConversationI was not paid to write these articles and have no financial interest in this site. I merely mention The Conversation because I have some experience with it.

. I was originally contacted by an editor who encouraged me to explain my research to the public by publishing an article in The Conversation. The site does not pay its authors but I still experienced it as an interesting endeavour because I have never worked with an editor before. For both articles, I originally came up with a first draft and my editor left plenty of suggestions and advice. After three or four more iterations, the article was at a point where it was ready to be published.

There are many other outlets that encourage scientists to explain their research to the public. Your research group’s or department’s blog platform is another great opportunity to practice these skills.

…with collaborators

Socialise

Academic conferences are where one forms new connections and collaborations. Conferences can be intimidating and uncomfortable, particularly for sufferers of the impostor syndrome. You find yourself surrounded by accomplished and smart people, and believe that your research pales in comparison to theirs. I know the feeling.

If you find that approaching somebody at a conference is scary, I recommend finding somebody who can introduce you. That could be your advisor or a common friend. Also, keep in mind that there is nothing wrong with approaching someone during a coffee break. Introduce yourself and start with a question or compliment someone’s research. Most people are flattered to hear that somebody enjoyed (or even just read) their research. If you feel very nervous or anxious, you may spend too much time in your head, focused on your stressful feelings. Try instead to make a conscious effort to focus on the other person. Pay close attention to what they are saying and ask follow-up questions.

While most of the networking at a conference happens in the “hallway track” during breaks, there are more networking opportunities in the evening. People often head out for dinner and drinks, which creates a less formal environment that makes it easier to strike up conversations and meet new collaborators. Consider tagging along with a group to not miss out on this opportunity.

Manage collaborators

Eventually, you are going to lead a research project. This involves coordinating collaborators, organising meetings, and keeping everyone in the loop on the project’s progress. Things inevitably get messy when people with different personalities, cultures, and communication styles work together. The following tips help make the process more smooth:

  • Your advisor is a collaborator too and needs “management.” Advisors differ significantly in their style and range from entirely hands-off to micromanaging their students. To learn more, take a look at Nick Feamster’s excellent blog post on the matter.

  • If you want a collaborator to work on something, ask specific questions and provide clear instructions. Don’t expect them to realise how busy you are and offer help – they are likely too busy to notice.

  • Keep people up-to-date on the project’s progress. Some people like to use email for this; others schedule regular calls to discuss progress. Consider using email for short updates and have the occasional call when there’s more to discuss.

  • Don’t be afraid to express frustration but do so respectfully and with the intent to improve the collaboration rather than assign blame. For example, if half of your team always misses meetings, strike up a conversation on how to collaborate in ways that work for everyone.

  • Even more important than expressing frustration is the expression of gratitude. Let your collaborators know when they did a good job! We all love to feel appreciated.

  • Conflict among collaborators is a common occurrence. If you find it difficult to resolve a conflict yourself, consider involving your advisor as mediator.

  • Whenever there is something to discuss, involve all collaborators by default, unless you have a good reason not to. Your collaborators will feel respected for being kept in the loop. (More on that below.)

  • As if all of the above were not difficult enough, the average team consists of researchers from several cultures that have different customs regarding communication. Give people the benefit of the doubt and try to be clear and respectful in your communication.

Email etiquette

Pick descriptive email subjects that make it clear what your email is about. I occasionally prefix the email subject with “FYI:” or “Action needed:” to let the recipient know that an email can be ignored or does require action.

  • Bad: “Project update”
    Good: “FYI: Paper got accepted”

  • Bad: “Repository”
    Good: “Action needed: Commit missing code to repository”

  • Bad: “Need help with code”
    Good: “Please commit missing code to repository”

Try to avoid top-posting when dealing with long and complicated emails because it makes it difficult to follow an email discussion:

I have strong opinions about your email.  You are wrong about X, Y, and Z.

On Tue, Jan 05, 2021 at 07:35:54PM +0000, John Doe wrote:
> Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor
> incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis
> nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.
> Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu
> fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in
> culpa qui officia deserunt mollit anim id est laborum.

Instead, try to quote and respond to specific parts of the original email:

On Tue, Jan 05, 2021 at 07:35:54PM +0000, John Doe wrote:
> Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor
> incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis
> nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.

This I agree with.

> Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu
> fugiat nulla pariatur.

I believe we should do X instead because of Y.

> Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia
> deserunt mollit anim id est laborum.

Well said.

Use To: and Cc: wisely. Put everyone whose attention you require into the To: field and the remaining collaborators into the Cc: field. Create an email alias to make it easy to reach all of your collaborators. For more email writing tips, take a look at Philip Guo’s excellent article.

Pick the right communication mode

Find the right balance between the least invasive and the most convenient communication method. It may be convenient for you to call your collaborator each time you need something but she may experience this as distracting and invasive.

To discuss complicated research designs, you typically need a synchronous meeting; either a phone call or an in-person meeting. For topics that require less back-and-forth, asynchronous communications methods like email are a better fit. If you need something right now, an instant message or phone call may be the most appropriate.

Also, keep in mind that everyone’s communication preferences differ. Some people enjoy video calls while others prefer texting. Collaboration often requires compromising, i.e. to find communication methods that work for everyone.

Regarding concrete communication tools, Slack (or its free software alternative Mattermost) are useful because they allow collaborators to self-select what communication they want to participate in.

Communicate openly

Imagine a small research projects consisting of three collaborators; Alice, Bob, and Eve. There are four possible communication channels – assuming nobody talks to themselves:

  1. Alice ↔ Bob
  2. Alice ↔ Eve
  3. Bob ↔ Eve
  4. Alice ↔ Bob ↔ Eve

Four collaborators have eleven possible communication channels while five collaborators have a whopping twenty-six possible communication channels!The binomial coefficient (a.k.a. choose n out of k) reveals the number of communication channels among a group of collaborators.

Five collaborators are by no means unusual – in fact, the top four academic security conferences now average five authors per paper (Balzarotti 2020).

The good news is: you don’t have to ponder which one of the twenty-six communication channels you opt for before writing an email. Unless you have a good reason not to, err on the side of inclusion when communicating. That is, include everybody in your email CC list by default. If any one of your collaborators feels overwhelmed by the communication, they can request to be omitted from future correspondence, or simply ignore your emails. Typically, it should be your collaborator’s decision, what to participate in – not yours. In my experience, collaborators appreciate being kept in the loop – even if they rarely respond to email threads.

As a young Ph.D. student, I mistakenly believed that I would do my collaborators a favour by not including them, unless I really needed their help. After all, isn’t everyone busy and has better things to do? This is a fallacy. Collaborators exist to help each other out and generally like to know what’s going on. Give them the opportunity to! Besides, a lack of communication can quickly lead to a culture of distrust when people are left out. Especially junior collaborators will eventually wonder if there are ulterior motives for them being left out.

However, not everything needs to be discussed with all of your collaborators. Do you need your advisor’s signature on a document? Your collaborators won’t care. The same is true if one of your collaborators is unable to log into a machine that you use for experiments. When it comes to the actual research, however, you need to have a good reason to not include somebody.

I know first-hand that it’s often tempting to initiate one-on-one communication; for example when you feel insecure about an idea, and would like to run it by someone before you share it further. Try to avoid this. The more you communicate in the open, the better for you and the project, and your collaborators will respect you for it.

Acknowledgements

Contact and Support

Send email to phw@nymity.ch.

Balzarotti, Davide. 2020. “System Security Circus 2019.” January 2020. https://s3.eurecom.fr/~balzarot/notes/top4_2019/.

Clear, James. 2018. Atomic Habits: An Easy & Proven Way to Build Good Habits & Break Bad Ones. Avery.

Correa, Juan C., Henry Laverde-Rojas, Fernando Marmolejo-Ramos, and Julian Tejada. 2020. “The Sci-Hub Effect: Sci-Hub Downloads Lead to More Article Citations.” 2020. https://arxiv.org/pdf/2006.14979.pdf.

Diffie, Whitfield, and Martin E. Hellman. 1976. “New Directions in Cryptography.” Transactions on Information Theory 22 (6). IEEE. https://ee.stanford.edu/~hellman/publications/24.pdf.

Keshav, Srinivasan. 2007. “How to Read a Paper.” SIGCOMM Computer Communication Review 37 (3). ACM. http://ccr.sigcomm.org/online/files/p83-keshavA.pdf.

Newport, Cal. 2016. Deep Work: Rules for Focused Success in a Distracted World. Grand Central Publishing.

Pinker, Steven. 2015. The Sense of Style: The Thinking Person’s Guide to Writing in the 21st Century. Penguin Books.

Pollan, Michael. 2009. In Defense of Food: An Eater’s Manifesto. Penguin Books.

Walker, Matthew. 2018. Why We Sleep. Scribner.