Towards safe and productive human multitasking: 2013

Saturday, December 28, 2013

Are women better than men at multitasking? A myth that is hard to dispel....

The movie Kramer vs. Kramer offers an interesting
perspective on gender differences in multitasking

Almost every time I give a talk about multitasking, people ask me about gender differences. Even in my ERC grant interview, in which it was clear that gender differences were not the focus of study, I was asked the question.
Interestingly enough, this is not a question that is easy to answer. There are many references in popular culture and the popular press, but hardly any in scientific papers. This is somewhat mysterious: why haven't scientists ever investigated this?
The most likely answer is that they have, but did not find anything. However, a null result is not easy to interpret (it certainly isn't proof), and even harder to publish. Occasionally one does get published, for example by Thomas Buser from the University of Amsterdam (http://link.springer.com/article/10.1007/s10683-012-9318-8), but such a result is hardly ever picked up.

Final evidence for gender differences?

Sometimes, however, claims appear that evidence for gender differences has finally been found. Keith Laws from the University of Hertfordshire was already mentioned in the popular press in 2010 (BBC News - Is multi-tasking a myth?), but only recently was a paper published with the actual evidence: "Are women better than men at multi-tasking?" (http://www.biomedcentral.com/2050-7283/1/18). This caused quite a stir (again) in the popular press (BBC News - Women 'better at multitasking' than men, study finds, Women Really Are Better At Multitasking...In Some Cases - Big Think, Women better at multitasking, study says | National News - KSBW ..., etc.). But what is the actual evidence?
Close scrutiny of the article reveals several flaws in both methodology and statistics. Let me go over these in a bit more detail. Experiment 1 uses a so-called task-switching paradigm. In task switching, subjects have to perform different tasks on the same stimuli. In this case, the stimuli consisted of a shape (diamond or square) with a filling (two or three dots). Subjects had to do either a single-task block, in which they always responded to either the shape or the filling (so AAAAAAA or BBBBBBB), or a switching block, in which they had to do one of the tasks twice, followed by the other twice, etc. (AABBAABBAABB). There is a huge body of literature on this phenomenon, consistently showing that switching is harder than single tasking, and that a switch trial (going from A to B) is slower than a repeat trial (going from A to A or from B to B in a switching block). What the paper claims to have found is a significant gender difference in the so-called mixing costs, that is, the difference in average reaction times between single-task blocks and switching blocks.
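To make the terminology concrete, here is a minimal sketch of how mixing costs and switch costs could be computed from reaction times, using the definitions given above. The trial format, the function name and the reaction times are all made up for illustration; this is not the paper's data or its analysis code.

```python
from statistics import mean

def switching_costs(trials):
    """Compute (mixing_cost, switch_cost) in ms from a list of trials.

    Each trial is a (block_type, trial_type, rt_ms) tuple, where
    block_type is "single" or "mixed" and trial_type is "repeat" or
    "switch". This layout is a hypothetical format for illustration.
    """
    single = [rt for block, _, rt in trials if block == "single"]
    mixed = [rt for block, _, rt in trials if block == "mixed"]
    repeat = [rt for block, kind, rt in trials
              if block == "mixed" and kind == "repeat"]
    switch = [rt for block, kind, rt in trials
              if block == "mixed" and kind == "switch"]
    # Mixing cost: switching blocks versus single-task blocks.
    mixing_cost = mean(mixed) - mean(single)
    # Switch cost: switch trials versus repeat trials within switching blocks.
    switch_cost = mean(switch) - mean(repeat)
    return mixing_cost, switch_cost

# Made-up reaction times in milliseconds.
example_trials = [
    ("single", "repeat", 500), ("single", "repeat", 520),
    ("mixed", "repeat", 600), ("mixed", "switch", 700),
    ("mixed", "repeat", 620), ("mixed", "switch", 720),
]
mixing, switch = switching_costs(example_trials)
print(mixing, switch)
```

A gender difference in mixing costs would then mean that this first number differs reliably between the male and female subjects.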

Problems in methodology and statistics

There are two problems with this experiment. The first is whether it tests actual multitasking ability. Task switching is widely studied in experimental psychology, but although it tests a skill that is in the general family of executive control abilities that multitasking is also considered to be part of, it is not generally considered to require multitasking skills as such. Why not pick a "typical" multitasking task?
The second problem is statistical. The experiment explores many different contrasts: repeat vs. switch trials, congruent vs. incongruent trials, reaction times and error rates. There are no gender differences in any of them but one: the mixing costs. And even that comparison is dubious, because the actual outcome of the interaction test (the ANOVA) is not reported. Instead, the difference is recalculated and fed into a t-test with a significance of 3%. This means that among all the possible gender differences, only one was found to be significant after a dubious transformation at a measly 3%, even though the experiment used 240 (!) subjects. Officially, the 3% should have been corrected for multiple comparisons by multiplying it by the number of comparisons (a Bonferroni correction), which would render it non-significant.
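The correction itself is simple arithmetic. The sketch below assumes the conventional 5% threshold and a hypothetical count of four comparisons; the paper does not state how many contrasts were tested, so that number is only an illustration.

```python
def bonferroni_adjust(p_value, n_comparisons):
    """Bonferroni correction: multiply p by the number of tests, capped at 1."""
    return min(1.0, p_value * n_comparisons)

ALPHA = 0.05          # conventional threshold, assumed here
REPORTED_P = 0.03     # the 3% reported for the mixing costs
N_COMPARISONS = 4     # hypothetical count, not taken from the paper

adjusted = bonferroni_adjust(REPORTED_P, N_COMPARISONS)
print(adjusted, adjusted < ALPHA)

# Even with only two comparisons the result would no longer pass 5%.
adjusted_two = bonferroni_adjust(REPORTED_P, 2)
print(adjusted_two, adjusted_two < ALPHA)
```

Under these assumptions, any number of comparisons greater than one pushes the corrected value above 5%.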

A better follow-up experiment?

Sometimes a somewhat sketchy first experiment can be saved by a follow-up experiment that fixes mistakes in the first. However, in this article the second experiment is far worse than the first. It uses a fairly loose paradigm to study multitasking: subjects are given three tasks to perform within a limited timeframe, during which they are interrupted by a phone call that they can choose to answer or ignore. If they answer the call, they can choose whether or not to do this concurrently with whatever task they were performing. On all possible measures of multitasking that this experiment allows (differences in the three tasks, whether they answered the phone at all, whether they multitasked during the phone call, whether this affected their performance), the only gender difference that showed up was on one of the three tasks.
This experiment has the same problem as the first: many contrasts can be tested for gender differences, but only one of them shows a significant (2% this time) result. Moreover, even if this were "properly" significant, it would still only show a difference between men and women on that task. The authors cite a personal communication with the designer of that test stating that there are no known gender differences in a single-task setting. This is not enough: instead, an interaction between gender x single vs. multitasking should be demonstrated, which is a tougher test.

Sloppy science?

The discussion above may appear to be awfully technical, but the quality of psychological research has been under scrutiny lately exactly when it concerns these details (e.g., Nobel laureate challenges psychologists to clean up their act - Nature). The authors include various disclaimers in their discussion. However, the suggestion that is picked up by the media is clear: there are gender differences, and we only need someone to do a replication to prove it. Instead, my conclusion would be that the experiments demonstrate nothing at all. I have written a letter to the journal, outlining my concerns, but I have not received an answer yet. I will post an update if I do.

So, are there gender differences or what?

As far as I can tell, there is no evidence one way or the other. But the fact that no reliable study has been published yet makes it much more probable, given the way scientists work, that there are no differences. From a more general perspective of philosophy of science, it is also better to assume there are no differences until they are clearly supported by evidence.

The only reason to suppose that there are gender differences is anecdotal. But anecdotal evidence can easily confuse being skilled in the tasks themselves with the ability to multitask. People are just better at multitasking when they have fully mastered the tasks involved, and those effects are extremely strong.

Monday, August 19, 2013

What is the opposite of multitasking? Reading? Maybe not....

In many popular articles, multitasking is contrasted with reading. "People multitask all the time, they never spend time reading a book." So, reading is not multitasking? Maybe it is. When you read a book, you almost never finish it in one sitting. This means that at some point you put the book down, proceed with other activities, and then later pick it up again. The fact that we can just continue reading, even after days of interruption, is quite a remarkable feat of memory, considering that in scientific studies, interruptions of only tens of seconds already produce strong interference results.
But even if we disregard the interruption factor in novels, reading a novel is often a multitasking challenge in itself because authors like to play games with the reader. Telling a story in strictly linear order can be boring, so authors sometimes reveal information out of temporal order, challenging the reader to keep all the facts straight. Or the story is told from the viewpoint of several characters, and it is up to the reader to keep the different story threads apart.

Cloud Atlas: Hexatasking

An extreme example of a multitasking novel is "Cloud Atlas" by David Mitchell. The novel consists of six stories that have common elements and themes, but each of which has a completely separate plot (and writing style). The book starts with the first half of the first story, then proceeds with the first half of the second story, etcetera, until the sixth story, which is told in full, and is then followed by the second half of the fifth story, the second half of the fourth story, etcetera, until finally the second half of the first story.
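For readers who like to see the nesting spelled out, here is a small sketch that generates this reading order. The function name and the labels are my own invention, purely for illustration.

```python
def nested_order(n_stories):
    """Return the reading order for n nested stories as (story, part) pairs."""
    # First halves of every story except the last, in ascending order.
    outward = [(i, "first half") for i in range(1, n_stories)]
    # The last story is told in full, without interruption.
    middle = [(n_stories, "complete")]
    # Second halves in descending order, ending with the first story.
    inward = [(i, "second half") for i in range(n_stories - 1, 0, -1)]
    return outward + middle + inward

order = nested_order(6)
for story, part in order:
    print(f"Story {story}: {part}")

# Number of other sections read between the two halves of the first story.
gap = order.index((1, "second half")) - order.index((1, "first half")) - 1
print(gap)
```

The first story is the one that has to survive the longest interruption: every other section of the book sits between its two halves.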
In other words, halfway through the book the reader has to maintain a representation of six partially finished stories. The challenge is to get back into the second half of each of the remaining stories as they unfold, up until the second part of the first story. In my own experience this became increasingly hard, because more time and interfering text had passed in between. Nevertheless, Cloud Atlas is not a heavy read, and in fact quite an enjoyable book I would recommend to anyone. Reading: quite a feat of human multitasking.
Interestingly enough, the screenwriters of the movie version of Cloud Atlas decided not to follow the structure of the novel. Instead, they interleaved all six stories, ending with the end of the sixth story. And this was probably a good idea: if the movie had been cut in the same way as the book is told, it would probably have become way too confusing (although it would be an interesting experiment to try out: can someone edit the movie in the book order and email it to me? I'd gladly run the experiment). Somehow the movie format requires refreshing the memory of the viewer every once in a while.
Why is there this difference between books and movies? Is it just the amount of time we spend on them? This cannot be the whole story, because the movie is approximately three hours long, so if the movie had been cut consistently with the structure of the book, the interruption in the first story would have been around 2.5 hours. When I read the book, the interruption was over a month. Even though reading the book may take longer than watching the movie (a factor of 5, perhaps?), the interruptions are even longer (a factor of 300, in my case).
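The factor of 300 is a back-of-the-envelope estimate, and it can be checked in a few lines. The sketch assumes that "over a month" means roughly 30 days.

```python
MOVIE_GAP_HOURS = 2.5          # interruption of the first story in a book-order cut
BOOK_GAP_HOURS = 30 * 24       # assumption: one month taken as 30 days

gap_factor = BOOK_GAP_HOURS / MOVIE_GAP_HOURS
print(gap_factor)
```

With these numbers the ratio comes out just under 300, far larger than the estimated factor of 5 in total time spent.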
Perhaps the difference is that we have different expectations, and therefore bring other memory strategies to bear. Maybe when we read, we are more proactive in building up a story representation than when watching a movie, particularly because most movies are designed not to require too much thinking anyway.

Tuesday, August 6, 2013

Why are people sometimes such poor decision makers when it comes to multitasking?

In earlier posts on this blog, I argued that people are not as bad at multitasking as the popular press often claims. It is, however, undeniable that people are often very inefficient in dividing their time between tasks.

One of the reasons for poor multitasking is poor decision making. People decide to multitask, or to switch tasks, when it is not the best moment to do so. The mail/chat experiment I described earlier is an example of an experimental setup where people switch from one task to another at moments when they really should have stuck to the main task they were doing.

When people have to make decisions in ordinary tasks, they often base choices on utility. This means that they compare the potential payoffs for each of the choices, and pick the one with the maximum payoff. So when I want to go to the grocery store, and have to decide whether to walk, drive, or take the bicycle, the utility of these choices is based on how much time it takes, what my capacity to carry groceries will be, and how much money it costs. Taking all these factors into account is quite complex, so many theories assume we base our utility estimations on past experiences instead of trying to calculate the consequences of future action.

The problem in multitasking is that we cannot use the same strategy as in normal decision making. In multitasking, each of the individual tasks has its own utility, and from the perspective of that calculation it is always a bad idea to multitask. If we were perfect utility maximizers, we would never multitask, which is obviously not true.

So, what is the basis for our decisions? A possible explanation for which we have collected some evidence (including the mail/chat experiment) is what I call resource availability. If there is another tempting task, and we have mental resources available to do that task, we go and do that task, either by adding it to what we are already doing, or by switching to it. By mental resources I mean memory, visual attention, working memory, motor control, etc. Let me give some examples.

A wedding in China: obviously the groom was too boring....

In the mail/chat task, people are waiting for a web browser to give information that will allow them to answer emails. But while they are waiting, almost all of their mental resources have nothing to do, and will be on the lookout. And indeed, there is a chat window in which there is something to do! So they switch to the chat task and in the meantime forget all of the context of the mail task they were doing.

Many real-life examples occur in traffic, whether in a car, or walking, or cycling. I like to listen to music or podcasts while running to keep all my mental resources happy. Dutch students seem to have an incessant need to call their friends while they are cycling. Whenever I am stuck writing the next sentence of this blog, email and Facebook are tempting me.

In the end, our poor decisions in multitasking may be a consequence of our continuous attempts to escape boredom. Boredom forces us to jump from one attractive electronic carrot to another, but this never completely satisfies us.