Saturday, December 28, 2013

Are women better than men at multitasking? A myth that is hard to dispel....

The movie Kramer vs. Kramer offers an interesting perspective on gender differences in multitasking
Almost every time I give a talk about multitasking, people ask me about gender differences. Even in my ERC grant interview, in which it was clear that gender differences were not the focus of study, I was asked the question.
Interestingly enough, this is not a question that is easy to answer. There are many references in popular culture and the popular press, but hardly any in scientific papers. This is somewhat mysterious: why haven't scientists ever investigated this?
The most likely answer is that they have, but did not find anything. However, a null result is not easy to interpret (it certainly isn't proof), and even harder to publish. Sometimes one does get published, for example by Thomas Buser from the University of Amsterdam, but this is hardly ever picked up.

Final evidence for gender differences?

Sometimes, however, claims appear that evidence for gender differences has finally been found. Keith Laws from the University of Hertfordshire was already mentioned in the popular press in 2010 (BBC News - Is multi-tasking a myth?), but only recently was a paper published with the actual evidence: "Are women better than men at multi-tasking?" This caused quite a stir (again) in the popular press (BBC News - Women 'better at multitasking' than men, study finds; Women Really Are Better At Multitasking...In Some Cases - Big Think; Women better at multitasking, study says | National News - KSBW ..., etc.). But what is the actual evidence?
Close scrutiny of the article reveals several flaws in both methodology and statistics. Let me go over these in a bit more detail. Experiment 1 uses a so-called task-switching paradigm. In task switching, subjects have to perform different tasks on the same stimuli. In particular, the stimuli consisted of a shape (diamond or square) with a filling (two or three dots). Subjects had to do either a single-task block, in which they always responded to either the shape or the filling (so AAAAAAA or BBBBBBB), or a switching block, in which they had to do one of the tasks twice, followed by the other twice, etc. (AABBAABBAABB). There is a huge body of literature on this phenomenon, consistently showing that switching is harder than single tasking, and that a switch trial (going from A to B) is slower than a repeat trial (going from A to A or from B to B in a switching block). What the paper claims to have found is a significant gender difference in the so-called mixing costs, that is, the difference in average reaction times between single-task blocks and switching blocks.

Problems in methodology and statistics

There are two problems with this experiment. The first is whether it tests actual multitasking ability. Task switching is widely studied in experimental psychology, but although it tests a skill that is in the general family of executive control abilities that multitasking is also considered to be part of, it is not generally considered to require multitasking skills as such. Why not pick a "typical" multitasking task?
The second problem is statistical. The experiment explores many different contrasts: repeat vs. switch trials, congruent vs. incongruent trials, reaction times and error rates. No gender differences show up in any of them but one: the mixing costs. And even that comparison is dubious, because the actual outcome of the interaction test (the so-called ANOVA) is not reported. Instead, the difference is recalculated and fed into a t-test with a significance of 3%. This means that among all the possible gender differences, only one was found to be significant, after a dubious transformation, at a measly 3%, even though the experiment used 240 (!) subjects. Strictly speaking, the 3% should have been corrected for multiple comparisons by multiplying it by the number of comparisons, rendering it non-significant.
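To make the correction concrete, here is a minimal Python sketch of the multiply-by-the-number-of-comparisons rule (usually called the Bonferroni correction). The exact number of contrasts tested in the paper is not given here, so the count of 4 used below is purely an assumed, illustrative value, and the function names are my own.

```python
def bonferroni_adjust(p_value: float, n_comparisons: int) -> float:
    """Multiply a p-value by the number of comparisons, capped at 1.0."""
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return min(1.0, p_value * n_comparisons)


def is_significant(p_value: float, n_comparisons: int, alpha: float = 0.05) -> bool:
    """Check the corrected p-value against the conventional 5% threshold."""
    return bonferroni_adjust(p_value, n_comparisons) < alpha


# The reported 3% passes when treated as the only test ...
print(is_significant(0.03, n_comparisons=1))
# ... but not once it is treated as one of several tests.
# n_comparisons=4 is an assumed count for illustration, not taken from the paper.
print(is_significant(0.03, n_comparisons=4))
```

With a 5% threshold, a 3% result already stops being significant when it is corrected for just two comparisons, since 2 x 3% = 6%.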

A better follow-up experiment?

Sometimes a somewhat sketchy first experiment can be saved by a follow-up experiment that fixes mistakes in the first. However, in this article the second experiment is far worse than the first. It uses a fairly loose paradigm to study multitasking: subjects are given three tasks to perform within a limited timeframe, during which they are interrupted by a phone call that they can choose to answer or ignore. If they answer the call, they can choose whether or not to do this concurrently with whatever task they were performing. On all possible measures of multitasking that this experiment allows (differences in the three tasks, whether they answered the phone at all, whether they multitasked during the phone call, whether this affected their performance), the only gender difference that showed up was on one of the three tasks.
This experiment has the same problem as the first: many contrasts can be tested for gender differences, but only one of them shows a significant (2% this time) result. Moreover, even if this were "properly" significant, it would still only show a significant difference between men and women on that task. The authors cite a personal communication with the designer of that test that there are no known gender differences in a single task setting. This is not enough: instead an interaction between gender x single vs multitasking should be demonstrated, which is a tougher test.

Sloppy science?

The discussion above may appear to be awfully technical, but the quality of psychological research has been under scrutiny lately precisely where such details are concerned (e.g., Nobel laureate challenges psychologists to clean up their act - Nature). The authors include various disclaimers in their discussion. However, the suggestion that is picked up by the media is clear: there are gender differences, and we only need someone to do a replication to prove it. Instead, my conclusion would be that the experiments demonstrate nothing at all. I have written a letter to the journal, outlining my concerns, but I have not received an answer yet. I will post an update if I do.

So, are there gender differences or what?

As far as I can tell, there is no evidence one way or the other. But the fact that no reliable study has been published yet makes it much more probable, given the way scientists work, that there are no differences. From a more general perspective of philosophy of science, it is also better to assume there are no differences until they are clearly supported by evidence.
The only reason to suppose that there are gender differences is anecdotal. But anecdotal evidence can easily confuse being skilled in the tasks themselves with the ability to multitask. People are just better at multitasking when they have fully mastered the tasks involved, and those effects are extremely strong.

Monday, August 19, 2013

What is the opposite of multitasking? Reading? Maybe not....

In many popular articles, multitasking is contrasted with reading. "People multitask all the time, they never spend time reading a book." So, reading is not multitasking? Maybe it is. When you read a book, you almost never finish it in one sitting. This means that at some point you put the book down, proceed with other activities, and then later pick it up again. The fact that we can just continue reading, even after days of interruption, is quite a remarkable feat of memory, considering that in scientific studies, interruptions of only tens of seconds already produce strong interference results.
But even if we disregard the interruption factor in novels, reading a novel is often a multitasking challenge in itself because authors like to play games with the reader. Telling a story in strictly linear order can be boring, so authors sometimes reveal information out of temporal order, challenging the reader to keep all the facts straight. Or the story is told from the viewpoint of several characters, and it is up to the reader to keep the different story threads apart.
Cloud Atlas: Hexatasking
An extreme example of a multitasking novel is "Cloud Atlas" by David Mitchell. The novel consists of six stories that have common elements and themes, but each of which has a completely separate plot (and writing style). The book starts with the first half of the first story, then proceeds with the first half of the second story, etcetera, until the sixth story, which is told in full, and is then followed by the second half of the fifth story, the second half of the fourth story, etcetera, until finally the second half of the first story.
In other words, halfway through, the reader has to maintain a representation of six partially finished stories. The challenge is to get into the second half of each of the remaining stories as they unfold, up until the second part of the first story. In my own experience this became increasingly hard, because more time and more interfering text had come in between. Nevertheless, Cloud Atlas is not a heavy read, and in fact quite an enjoyable book I would recommend to anyone. Reading: quite a feat of human multitasking.
Interestingly enough, the screenwriters of the movie version of Cloud Atlas decided not to follow the structure of the novel. Instead, they interleaved all six stories, ending with the end of the sixth story. And this was probably a good idea: if the movie had been cut in the same way as the book is told, it would probably have become way too confusing (although it would be an interesting experiment to try out: can someone edit the movie in the book order and email it to me? I'd gladly run the experiment). Somehow the movie format requires refreshing the memory of the viewer every once in a while.
Why is there this difference between books and movies? Is it just the amount of time we spend on them? This cannot be the whole story, because the movie is approximately three hours, so if the movie had been cut consistently with the structure of the book, the interruption in the first story would have been around 2.5 hours. When I read the book, the interruption was over a month. Even though reading the book may take longer than watching the movie (a factor of 5, perhaps?), the interruptions are longer by a far greater margin (a factor of 300, in my case).
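For readers who want to check the arithmetic, here is a small Python sketch. It assumes that all six stories get equal screen time and that each is split into two equal halves, and it rounds "over a month" down to 30 days; both are my simplifications, not facts about the film.

```python
MOVIE_HOURS = 3.0        # approximate running time, from the text
N_STORIES = 6
DAYS_PER_MONTH = 30      # assumption: "over a month" rounded down to 30 days


def movie_gap_hours(total_hours: float, n_stories: int) -> float:
    """Gap between the two halves of story 1 if the film followed the book.

    Assumes every story gets equal screen time, split into two equal halves.
    Everything except the two halves of story 1 falls inside the gap.
    """
    half_story = total_hours / (2 * n_stories)
    return total_hours - 2 * half_story


movie_gap = movie_gap_hours(MOVIE_HOURS, N_STORIES)
book_gap = DAYS_PER_MONTH * 24.0  # reading interruption in hours

print(f"movie gap: {movie_gap} h")
print(f"book gap / movie gap: {book_gap / movie_gap:.0f}")
```

Under these assumptions the movie gap comes out at 2.5 hours and the ratio at 288, which is in line with the rough factor of 300 (the real interruption was longer than 30 days, so the true ratio is somewhat higher).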
Perhaps the difference is that we have different expectations, and therefore bring other memory strategies to bear. Maybe, when we read we are more proactive in building up a story representation than when watching a movie, in particular when most movies are designed to not require too much thinking anyway.

Tuesday, August 6, 2013

Why are people sometimes such poor decision makers when it comes to multitasking?

In earlier posts on this blog, I argued that people are not as bad at multitasking as the popular press often claims. It is, however, undeniable that people are often very inefficient in dividing their time between tasks.

One of the reasons for poor multitasking is poor decision making. People decide to multitask, or to switch tasks, when it is not the best moment to do so. The mail/chat experiment I described earlier is an example of an experimental setup where people switch from one task to another at moments when they really should have stuck to the main task they were doing.

When people have to make decisions in ordinary tasks, they often base choices on utility. This means that they compare the potential payoffs for each of the choices, and pick the one with the maximum payoff. So when I want to go to the grocery store, and have to decide whether to walk, drive, or take the bicycle, the utility of these choices is based on how much time it takes, what my capacity to carry groceries will be, and how much money it costs. Taking all these factors into account is quite complex, so many theories assume we base our utility estimations on past experiences instead of trying to calculate the consequences of future action.
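As a toy illustration of this kind of utility-based choice, the Python sketch below scores the three ways of getting to the grocery store and picks the best one. All numbers, weights, and names are invented for the example; the point is only the compare-and-pick-the-maximum structure, not a claim about how people actually weigh these factors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    name: str
    minutes: float  # travel time (hypothetical)
    bags: int       # grocery bags that can be carried (hypothetical)
    cost: float     # money spent on the trip (hypothetical)


def utility(option: Option) -> float:
    """Higher is better: capacity counts in favour, time and money against.

    The weights are arbitrary illustrative assumptions.
    """
    return 2.0 * option.bags - 0.5 * option.minutes - 1.0 * option.cost


options = [
    Option("walk", minutes=20, bags=2, cost=0.0),
    Option("bicycle", minutes=8, bags=3, cost=0.0),
    Option("drive", minutes=6, bags=8, cost=3.0),
]

# Compare the payoff of every choice and pick the maximum.
best = max(options, key=utility)
print(best.name)
```

A theory that relies on past experience would replace the hand-written `utility` function with an estimate learned from earlier trips, but the final step of picking the maximum stays the same.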

The problem in multitasking is that we cannot use the same strategy as in normal decision making. In multitasking, each of the individual tasks has its own utility, and from the perspective of that calculation it is always a bad idea to multitask. If we were perfect utility maximizers, we would never multitask, which is obviously not true.

So, what is the basis for our decisions? A possible explanation for which we have collected some evidence (including the mail/chat experiment) is what I call resource availability. If there is another tempting task, and we have mental resources available to do that task, we go and do that task, either by adding it to what we are already doing, or by switching to it. By mental resources I mean memory, visual attention, working memory, motor control, etc. Let me give some examples.

A wedding in China: obviously the groom was too boring....
In the mail/chat task, people are waiting for a web browser to give information that will allow them to answer emails. But while they are waiting, almost all of their mental resources have nothing to do, and will be on the lookout. And indeed, there is a chat window in which there is something to do! So they switch to the chat task and in the meantime forget all of the context of the mail task they were doing.

Many real-life examples occur in traffic, whether in a car, walking, or cycling. I like to listen to music or podcasts while running to keep all my mental resources happy. Dutch students seem to have an incessant need to call their friends when they are on a bicycle. Whenever I am stuck writing the next sentence of this blog, email and Facebook are tempting me.

In the end, our poor decisions in multitasking may be a consequence of our continuous attempts to escape boredom. Boredom forces us to jump from one attractive electronic carrot to another, but this never completely satisfies us.

Friday, November 9, 2012

The perfect grading strategy, or: How to benefit from your own research

Grading exams is one of the burdens of being a teacher, especially when it is a large course. My biggest course has approximately 120 students, which means that right now 120 exams consisting of 6 essay questions each are sitting on my desk.
What is the right grading strategy? There seem to be two strategies.
One is to do one exam at a time, and grade all six questions. There are several problems with this approach. The main problem is the risk that your standards start drifting: if students are bad on average you may become more forgiving, or more stringent if they do well. The second problem is the change of mindset necessary after each question, which is mentally taxing and time-consuming.
The second strategy is the one I think most people use, which is to grade all question 1's, then shuffle the whole stack of exams, grade all question 2's, etc. This seems to be the perfect solution that solves the problems of strategy 1: your standards are less likely to drift, and if they do the effect is more randomly distributed. You also don't have to shift mindsets, because you grade the same question over and over again.
So isn't strategy 2 the perfect method? Unfortunately not. When reading and grading answers that are all similar but not the same, it becomes increasingly difficult to separate them in your mind. Each new version of the answer needs more mental focus to keep it apart from the previous answers from other exams. That is why I often use this as an example of a case where interruption is beneficial instead of harmful: an interruption helps clear the mind of previous answers, reducing interference. But regular interruptions do not, of course, speed up the process of grading, and as we all know, interruptions often lead to more distraction.
The solution is so simple that I wonder why I haven't thought of it before. Isn't it obvious: grade TWO questions per exam, so first all questions 1 and 2, then all questions 3 and 4, etc. Alternating two questions delivers just the amount of memory interference needed to disentangle the current answer from previous answers. Keeping two answer standards in mind is still doable, and the standard drift is also reasonably under control.
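For the procedurally minded, the grading order can be sketched in a few lines of Python. The function name and parameters are made up for this illustration, and reshuffling the stack before every pass is an assumption carried over from strategy 2 rather than something the two-question strategy strictly requires.

```python
import random
from typing import Iterator, List, Optional, Tuple


def grading_order(
    exam_ids: List[str],
    n_questions: int,
    per_pass: int = 2,
    seed: Optional[int] = None,
) -> Iterator[Tuple[str, int]]:
    """Yield (exam, question) pairs in the order they would be graded.

    Each pass covers `per_pass` consecutive questions; the stack of exams
    is reshuffled before every pass.
    """
    rng = random.Random(seed)
    for first in range(1, n_questions + 1, per_pass):
        questions = range(first, min(first + per_pass, n_questions + 1))
        stack = list(exam_ids)
        rng.shuffle(stack)
        for exam in stack:
            for question in questions:
                yield exam, question


# Three hypothetical exams with six questions each.
for exam, question in grading_order(["A", "B", "C"], n_questions=6, seed=1):
    print(exam, question)
```

Setting `per_pass=1` gives the question-by-question strategy, and `per_pass=6` gives the exam-by-exam strategy, so the two-question approach sits in between the two.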
I am now halfway through grading, and although there is still a lot of work to do, I can report that it works very well, and I can recommend it to everyone in the same situation.

Sunday, January 29, 2012

Seven plus or minus two: hard to eradicate

The website of "de Volkskrant", a Dutch newspaper, had an interesting bit of news: ToDo lists don't work. The article referred to a blog of the Harvard Business Review by Daniel Markovitz. One of the main claims of that article is that our brain cannot handle more than seven choices, and will therefore be overwhelmed when a ToDo list is longer than seven items.
Seven plus or minus two has been a staple of Cognitive Psychology since Miller published a paper on the limitations of short-term memory in the fifties, which led to the naive theory (not endorsed by Miller himself!) that human short-term memory consists of seven (plus or minus two) slots in which items can be stored. But even though cognitive psychologists abandoned this theory decades ago, it still floats around in the applied domain as a serious limitation of human cognition.
To get back to the original topic, I always find ToDo lists quite useful. From a multitasking perspective they perform a very useful service, helping you keep track of your uncompleted goals. So now this blog claims that these lists do not work, because your brain can only handle seven uncompleted goals.
The cited research for this claim is a set of studies that show that people want choices, but not too many of them. Being able to choose between 3 brands of detergent is nice, but choosing between 20 brands is ridiculous. So, what's the threshold? Seven. Really seven? In many situations I'd rather not choose at all; that is why I like house brands in supermarkets. And in a restaurant I may prefer more than just seven choices on the menu.
But even if this were all true, the generalization from choosing between similar candidates for a purchase to choosing which item to do next on a ToDo list is pretty far-fetched. And the idea that your brain will overheat if the length of your ToDo list exceeds the threshold is pretty ludicrous.
To conclude: I think ToDo lists are pretty useful as a memory aid. But maybe the lesson is to be really, really suspicious about any type of psychological advice that involves the number Seven. Plus or Minus Two.

Tuesday, December 27, 2011

Rationality in Multitasking

There is typically a big gap between the tasks studied in psychological experiments and real-life cognition. This is because psychologists want to study the individual psychological functions in great detail, and therefore develop tasks that try to isolate these functions. Although this offers a wealth of information, the danger is that the interaction between those functions is not well understood. For example, in the case of the attentional blink that I discussed earlier, researchers still do not agree on whether the phenomenon is due to attention, memory or control.

When Dario Salvucci and I were working on our book on multitasking, we also found an example of this gap. Observational studies had shown that the cost of interruption can be very large, in terms of tens of minutes. However, interruption costs in experiments are typically in the order of only a second. To study interruption behavior at a more realistic scale, Dario and his graduate student Peter Bogunovich designed an experiment that was in between real life and a basic experiment. Subjects had to answer emails for which they had to look up information using a web browser. Occasionally though, a chat window blinked in the background. If subjects clicked on this window, a question about movie preferences was asked which they then had to answer.
The important aspect of the task was that subjects had complete freedom in when they wanted to switch between email and chat, something that is typically not allowed in experiments. They could switch to the chat right when it came up with a new question, or they could wait until they were done with the current email. The mail task was structured in such a way that information had to be remembered. For instance, in the example in the picture, the price of the mp3-player Killor U-32 had to be looked up, which takes three clicks in the web browser. In the real experiment (as opposed to the picture) the windows were always on top of each other. So, when you have just clicked on "mp3-player" in the browser, it is not very clever to switch to the chat, because you will probably have forgotten "Killor U-32" by the time you get back.
It turned out that the subjects in Dario's experiment were indeed smart about this, and continued on the mail task until they reached a smart switch point, a point during which they did not need to remember any information.

Together with Jelmer Borst and a group of project students (Joost Timmermans, Anita Drenthen and Tom Janssen), we redesigned the experiment in order to try to tempt subjects into switching to the chat window at moments when information from the email had to be remembered. We did this by introducing delays in the web browser (after clicking a link it took a few seconds before the subsequent page appeared) and the email program (it took a second for an email to load). As a result, a substantial number of subjects now switched from mail to chat during a delay in the browser, at a moment when information had to be kept active. Moreover, the average time needed to answer an email with delays was longer than without delays, even if all the delays were subtracted from this time first. In the strongest condition the extra cost was more than 6 seconds (ok, still not minutes, but better than just a second). In other words, subjects would have been better off if they had just waited during delays instead of making a switch. Trying to use the waiting time for something else turned out to decrease efficiency instead of increasing it.

What these experiments show is that people are typically smart about their choices in switching from one task to another. But delays can surely thwart this rationality. Apparently, we'd rather act than wait, and that is probably why I prefer taking the bicycle over waiting for the bus, even though the latter option might get me there faster.

Saturday, December 3, 2011

Become a better multitasker through... sport!

Last week I gave a talk about multitasking in a symposium "The Science and Art of Brain Maintenance". Two of the other speakers in the symposium, Chris Visser and Erik Scherder, talked about the benefits of physical exercise. We all know that exercise is good for your physical health in a great number of ways, but Chris and Erik reported on research that shows that exercise can improve your mental prowess. Chris' research demonstrates that children who are good at sports on average also perform better in school on mathematics and language. Erik showed that exercise during your lifetime is correlated with a lower incidence of diseases like Alzheimer's and other forms of cognitive degeneration.

At first this may sound surprising, but if you think about it, sports is not just a matter of physical practice but also of mental practice. Sports requires motor coordination, discipline and skill learning. According to Chris and Erik, the main cognitive function of importance that connects physical and mental prowess is executive control. Executive control, located in the frontal areas of the brain, concerns our goals and the handling of our goals. It keeps us focussed on what we do, helps us in determining our actions in the absence of outside information, and plays a role in juggling all the items that we need to maintain in working memory. Sounds familiar? Those are exactly the functions that are needed for multitasking.

In my own talk, I showed a picture of the brain areas associated with sequential multitasking: an area in the frontal cortex and an area in the parietal cortex (made by Jelmer Borst). To my surprise, both Chris and Erik showed pictures of roughly the same areas.

If we connect the two together: physical exercise improves executive control, and executive control is the main factor of success in multitasking, then we can conclude that physical exercise can make us better multitaskers.
The pitfall of this line of reasoning is that most evidence is correlational. We cannot be sure whether physical exercise actually causes better executive function. Apart from that, it is an interesting thought that this is another reason why body and mind are maybe not as separate as we tend to think.