
Wednesday, 12 September 2012

Another look at the "magical" benefit of frequent family meals

"The statistics are clear," Nancy Gibbs wrote in her article for Time magazine in 2006 entitled the Magic of the Family Meal: "Kids who dine with the folks are healthier, happier and better students". She's right, there is lots of evidence showing these positive associations, and there are plausible explanations for the benefits, such as a chance for children and parents to talk, and the sense of structure that the ritual provides.

But as Daniel Miller and his colleagues point out in their new study, the supposed benefit of frequent family meals is based on research with limitations. Many studies have been cross-sectional snapshots in time - so it's possible that frequent family meals are merely a proxy for other relevant factors, such as warmer family relations or parental wealth and education. And the causal direction could run backwards. Maybe parents are more inclined to dine with children who are happier and better behaved.

Miller's team have conducted a comprehensive, longitudinal study using data that was collected from 1998 - when 21,400 participating US children were aged 5 years - to 2007, by which time the average age of the remaining 9,700 participants was 13.6. At five time points during that period, the children's parents were surveyed about how often they ate as a family at breakfast and dinner; the children's reading and maths abilities were assessed; and teachers were surveyed about the children's behaviour.

The results were clear - there was little or no evidence (depending on the precise analysis used) of any association between more family meals at earlier time points and better outcomes later, in terms of the children's academic abilities or good behaviour. "Our results suggest that the findings of previous work regarding frequency of family meals and adolescent outcomes should be viewed with some caution," the researchers said.

But we shouldn't be too hasty about dismissing the value of family meals. This study comes with its own caveats. Chief among these is that the children were younger than in most other studies on this issue. Relevant here is that past research has linked frequent family meals with outcomes such as less substance abuse among older teenagers - a potential benefit that was not addressed in this study given the younger sample. Another problem, acknowledged by the researchers, was the reliance on parental reports about the frequency of family meal times. A suspiciously high number of parents reported having family meals every day of the week. If they were lying, this could have undermined the trustworthiness of the results, although the researchers think this is unlikely based on checks they made of their data.

Taken altogether, Miller and his colleagues said their study should be seen as "an extension rather than a repudiation of previous work". Their cautious conclusion is that "the magnitude of the effect of family meal frequency may be less than suggested by previous work."

_________________________________

Daniel Miller, Jane Waldfogel, and Wen-Jui Han (2012). Family meals and child academic and behavioural outcomes. Child Development. DOI: 10.1111/j.1467-8624.2012.01825.x

--Further reading-- You are what you eat? Meal type, socio-economic status and cognitive ability in childhood.

Post written by Christian Jarrett for the BPS Research Digest.

Thursday, 9 August 2012

Made it! An uncanny number of psychology findings manage to scrape into statistical significance

Like a tired boxer at the Olympic Games, the reputation of psychological science has just taken another punch to the gut. After a series of fraud scandals in social psychology and a US survey that revealed the widespread use of questionable research practices, a paper published this month finds that an unusually large number of psychology findings are reported as "just significant" in statistical terms.

The pattern of results could be indicative of dubious research practices, in which researchers nudge their results towards significance, for example by excluding troublesome outliers or adding new participants. Or it could reflect a selective publication bias in the discipline - an obsession with reporting results that have the magic stamp of statistical significance. Most likely it reflects a combination of both these influences. On a positive note, psychology, perhaps more than any other branch of science, is showing an admirable desire and ability to police itself and to raise its own standards.

E. J. Masicampo at Wake Forest University, USA, and David Lalande at Université du Québec à Chicoutimi, analysed 12 months of issues, July 2007 - August 2008, from three highly regarded psychology journals - the Journal of Experimental Psychology: General; Journal of Personality and Social Psychology; and Psychological Science.

In psychology, a common practice is to calculate the probability (p) of obtaining results at least as extreme as those observed in a study if the null hypothesis were true (the null hypothesis usually being that the treatment or intervention has no effect). The convention is to treat a probability of less than five per cent (p < .05) as an indication that the treatment or intervention really did have an influence, and as grounds for rejecting the null hypothesis (this procedure is known as null hypothesis significance testing).
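
For readers unfamiliar with the procedure, here is a minimal illustration in Python (my own example, not anything from the paper) of a two-sample t-test and the conventional p < .05 decision rule - the group sizes, means and use of the scipy library are all incidental choices:

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
control = rng.normal(loc=100, scale=15, size=30)    # no-effect baseline
treatment = rng.normal(loc=108, scale=15, size=30)  # built-in true effect of 8 points

t, p = stats.ttest_ind(treatment, control)
print(f"t = {t:.2f}, p = {p:.4f}")
if p < .05:
    print("p < .05: reject the null hypothesis of no effect")
else:
    print("p >= .05: fail to reject the null hypothesis")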

From the 36 journal issues Masicampo and Lalande identified 3,627 reported p values between .01 and .10, and their method was to see how evenly the p values were spread across that range (only studies that reported a precise figure were included). To avoid bias in their approach, they counted the number of p values falling into "buckets" of different sizes - .01, .005, .0025 or .00125 - across the range.

The spread of p values between .01 and .10 followed an exponential curve - the number of p values climbed steadily as p decreased from .10 towards .01. But here's the key finding - there was a glaring bump in the distribution between .045 and .050. The number of p values falling in this range was "much greater" than you'd expect based on the frequency of p values falling elsewhere in the distribution. In other words, an uncanny abundance of reported results just sneaked into the region of statistical significance.
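
To make the binning logic concrete, here is a rough Python sketch of that kind of check, run on simulated p values rather than the authors' data - the smooth decay, the bucket width and the size of the injected "bump" are all invented for illustration:

import numpy as np

rng = np.random.default_rng(42)
# A smoothly decaying "honest" distribution, densest near .01 ...
p_values = rng.exponential(scale=.03, size=5000) + .01
p_values = p_values[p_values <= .10]
# ... plus a suspicious excess crammed just under .05.
p_values = np.concatenate([p_values, rng.uniform(.045, .050, size=80)])

edges = np.linspace(.01, .10, 19)  # 18 buckets of width .005
counts, _ = np.histogram(p_values, bins=edges)
for lo, n in zip(edges[:-1], counts):
    flag = "  <-- just-significant bump" if .0449 < lo < .05 else ""
    print(f"{lo:.3f}-{lo + .005:.3f}: {n:4d}{flag}")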

"Biases linked to achieving statistical significance appear to have a measurable impact on the research publication process," the researchers said.

The same general pattern was found regardless of whether Masicampo and Lalande analysed results from just one journal or all of them together, and mostly regardless of the size of the distribution buckets they looked at. Of course, there's a chance the intent behind their investigations could have biased their analyses in some way. To check this, a research assistant completely blind to the study aims analysed p values from one of the journals - the same result was found.

Masicampo and Lalande said their findings pointed to the need to educate researchers about the proper interpretation of null hypothesis significance testing and the value of alternative approaches, such as reporting effect sizes and confidence intervals. " ... [T]he field may benefit from practices aimed at counteracting the single-minded drive toward achieving statistical significance," they said.

_________________________________

Masicampo, E.J., and Lalande, D.R. (2012). A peculiar prevalence of p values just below .05. Quarterly Journal of Experimental Psychology. PMID: 22853650

Post written by Christian Jarrett for the BPS Research Digest.

Tuesday, 7 August 2012

Is a taste for extreme answers distorting cross-cultural comparisons of personality?

Data on how personality varies around the world is puzzling. Take the dimension of conscientiousness. Among individuals within a particular country, those with higher conscientiousness tend to earn more money and live longer. This makes sense given the behavioural sequelae of conscientiousness, including diligence and attention-to-detail. Compare across countries, however, and what you find is that richer countries with longer life expectancy tend to have lower average conscientiousness. Now a new study has tested a possible explanation for this paradox - perhaps there's a systematic bias between countries in people's tendency to tick more extreme scores on questionnaires.

How do you tell if a population's higher scores are a reliable reflection of their underlying traits, or if they're caused by a proclivity for more extreme answers? One way is to ask them to rate not just their own personality, but also the personality of a number of fictional characters described in vignettes. Exaggerated scores for the fictional characters would be a sign of a skewed response style.

A small army of researchers around the world led by René Mõttus at the University of Tartu in Estonia has taken on this challenge, recruiting 2,965 people across 20 countries (including European, African, American and Asian nations) and asking them to rate their own personalities and the personalities described in vignettes.

Mõttus and his colleagues uncovered systematic differences between nations in people's proclivity for extreme responding. One pattern to emerge was that richer East Asian countries tended to avoid extreme scores, whereas poorer countries in Africa and South-East Asia tended to give more extreme ratings. Once the researchers adjusted for these cross-cultural response styles, the puzzling negative correlation between countries' average conscientiousness scores and their longevity and wealth disappeared.

The researchers acknowledged that they haven't shown conclusively that extreme response tendencies cause higher conscientiousness ratings. Theoretically the causal direction could run backwards, although common sense suggests this is unlikely. You'd expect higher scorers on conscientiousness to avoid extreme scores, not embrace them. Another possibility is that another unknown factor is at play, inflating conscientiousness scores and encouraging extreme responding. However, it's difficult to imagine what such a factor might be. Taken altogether, the researchers think the most likely explanation is that a proclivity in some countries for extreme responding has had the effect of inflating their conscientiousness scores.

All this raises a further intriguing question ... why should people in some countries be more prone to giving extreme answers? The answer remains beyond the current study, but the researchers suggested one factor could be "dialectical thinking ... 'an emphasis on change, a recognition of contradiction and of the need for multiple perspectives, and a search for the "Middle Way" between opposing propositions'". Countries where dialectical thinking is more common would be expected to avoid extreme scores. Consistent with this, there's some evidence that dialectical thinking is higher in East Asian countries that were found in this study to refrain from giving extreme scores.
_________________________________

Mõttus R, Allik J, Realo A, Rossier J, Zecca G, Ah-Kion J, Amoussou-Yéyé D, Bäckström M, Barkauskiene R, Barry O, Bhowon U, Björklund F, Bochaver A, Bochaver K, de Bruin G, Cabrera HF, Chen SX, Church AT, Cissé DD, Dahourou D, Feng X, Guan Y, Hwang HS, Idris F, Katigbak MS, Kuppens P, Kwiatkowska A, Laurinavicius A, Mastor KA, Matsumoto D, Riemann R, Schug J, Simpson B, Tseung-Wong CN, and Johnson W (2012). The Effect of Response Style on Self-Reported Conscientiousness Across 20 Countries. Personality and Social Psychology Bulletin. PMID: 22745332

Post written by Christian Jarrett for the BPS Research Digest.

Tuesday, 3 July 2012

Fake data or scientific mistake?

Social psychology is reeling from its second research scandal in less than a year, after Erasmus University Rotterdam announced the withdrawal of two articles by one of its senior social psychologists. The problematic papers were identified by a ‘Committee for Inquiry into Scientific Integrity’ (chaired by Rolf Zwaan, a psychologist in the University’s Brain and Cognition lab), which was set up to investigate concerns raised about the work of Dirk Smeesters. Among the Inquiry’s recommendations was a call for greater regulation of the fields of marketing and ‘to a lesser extent’ social psychology.

Smeesters, who was Professor of Consumer and Society in the Rotterdam School of Management, was found guilty by the Inquiry of ‘data selection’ and failing to keep suitable data records. Smeesters resigned his post after admitting to using a ‘blue dot technique’ whereby, after achieving a null result, he omitted participants who failed to read the instructions properly (7 to 10 per study, he claims), thus lifting the findings into statistical significance – a procedure he failed to detail in his affected papers. However, Smeesters blamed the unavailability of his raw data on nothing more heinous than a computer crash and a lab move. The Inquiry said it ‘doubted the credibility’ of these reasons.

The affected papers pertained to social priming and past selves and were published in the Journal of Personality and Social Psychology, published by the APA, and the Journal of Experimental Social Psychology, published by Elsevier. A third affected paper had only reached the submission stage of publication. The Inquiry found no evidence of wrong-doing by Smeesters’ co-authors although there’s no doubt they are suffering from the fall-out (at least one of them has posted his feelings online).

These latest revelations come in the wake of the case of Diederik Stapel, a senior social psychologist at Tilburg University, who last year admitted to fabricating the results behind several dozen published studies (see December news, 2011). Smeesters has kept a low profile since the scandal broke, but he surfaced late in June to tell the Dutch newspaper Algemeen Dagblad that he was ‘no Stapel’ – his data was not fabricated; he had made a scientific mistake. Stapel and Smeesters reportedly never worked together.

Concerns were first raised about Smeesters’ work by Uri Simonsohn, a social psychologist at The Wharton School, University of Pennsylvania. Simonsohn has developed a statistical technique for detecting massaged data, details of which are contained in an as yet unpublished paper with the working title ‘Finding Fake data: Four True Stories, Some Stats, and a Call for Journals to Post All Data’ (criticisms of the technique have surfaced online). Simonsohn contacted Smeesters requesting his raw data, and then he reported his findings to Smeesters’ head of school, which led ultimately to the Inquiry.

According to the Inquiry’s report (pdf), Simonsohn’s technique identifies dubious data by looking at the amount of variation in the group means derived from the same population. With the aid of two statistical experts, the Erasmus University Inquiry applied Simonsohn’s algorithm to 22 of 29 of Smeesters’ papers published or submitted since 2007, for which the necessary data were available, which led to the identification of the three suspect papers (the technique was also applied to a random selection of four comparable control papers by others in the field and no anomalies were found).
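
The report describes the technique only loosely, but the underlying logic can be sketched. The Python toy below (an illustration of the general idea, not Simonsohn's actual algorithm) simulates how often honest sampling from a single population would produce group means as tightly clustered as a set of suspiciously similar reported means - only the spread of the means matters here, not their location:

import numpy as np

rng = np.random.default_rng(0)
n_per_group, k, sd = 20, 6, 1.0

# Reference distribution: the spread (SD) of k honest group means, simulated many times.
sims = 10_000
spreads = np.array([
    rng.normal(0, sd, size=(k, n_per_group)).mean(axis=1).std(ddof=1)
    for _ in range(sims)
])

reported_means = np.array([0.51, 0.50, 0.52, 0.49, 0.51, 0.50])  # suspiciously similar
observed_spread = reported_means.std(ddof=1)

# How often would honest sampling produce means this similar, or more so?
p_too_similar = (spreads <= observed_spread).mean()
print(f"observed spread = {observed_spread:.4f}, "
      f"P(spread this small | same population) = {p_too_similar:.4f}")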

Concerns were also raised about data anomalies in a fourth paper published by Smeesters and co-authors in the Journal of Consumer Research. In relation to this paper, the Inquiry stated that it had found a file on Smeesters' network drive that shouldn't have been there based on his description of how the data were collected. The Inquiry states it 'cannot rule out that Smeesters used the ... file to manipulate the raw data before sending these' to his data-analyst.

This isn’t the first time the whistleblower Simonsohn has taken an interest in research integrity. Last year he co-authored a paper ‘False-positive psychology’ in Psychological Science, in which he and his colleagues demonstrated the ease with which false-positive results can be obtained by indulging in research practices that occupy a grey area of acceptability, such as adding more participants to a subject pool in search of a significant finding. A paper published in May this year in Psychological Science (but detailed on the Research Digest blog last December) surveyed 6,000 US psychologists about practices in this ‘grey zone’ and found that 58 per cent admitted excluding data post-hoc and 35 per cent had doubts about the integrity of their own research. Smeesters told the Inquiry that he doesn’t feel guilty because many authors in his field knowingly omit data to achieve significance.

Early in July, Simonsohn gave an interview to Nature in which he claimed to have identified a third case of scientific misconduct that's yet to be made official, and a fourth that's not been acted upon. He said he was motivated to act in these cases by the fact that 'it is wrong to look the other way', but he stressed he hadn't taken justice into his own hands - he was careful to pass things over to the appropriate authorities. 'If it becomes clear that fabrication is not an unusual event,' he said, 'it will be easier for journals to require authors to publish all their raw data. It’s extremely hard for fabrication to go undetected if people can look at your data.'

--Link to Erasmus University Rotterdam press statement.
--Link to English translation (pdf) of the Inquiry report.
_________________________________

Post written by Christian Jarrett for the BPS Research Digest.

Thursday, 21 June 2012

The Alien awakened by a rubber hand

What happens if you administer a tactile illusion to a brain-damaged patient whose hand is out of their control? A team of researchers has done just that, figuring that illusions could offer new insights into complex neuropsychological disorders.

The patient in question was a 69-year-old lady whose left-sided stroke had left her with alien hand syndrome*. Most of the time her right hand was held in a clenched position that she couldn't open. Occasionally, accompanied by a mild electric sensation, it moved involuntarily, jerking, or even slapping her in the face.

Michael Schaefer and his colleagues at Otto-von-Guericke University Magdeburg tested the lady on two sensorimotor illusions - the traditional rubber hand illusion and the lesser-known somatic rubber hand illusion. The first involved the patient placing one of her arms on the table-top, with the other underneath. A rubber arm was placed alongside her real arm on the table. The researcher then stroked the patient's hidden arm and the rubber arm in synchrony. When the illusion works it creates the sensation of feeling in the rubber arm, as if it's a part of the person's body. In fact the patient experienced no feeling in the rubber arm at all, regardless of whether it was her healthy arm or alien arm that was being stroked under the table. The rubber hand illusion doesn't work for everyone so this null finding is not particularly surprising.

Things got more interesting when the researchers tested their patient with the somatic rubber hand illusion. This procedure involved the rubber arm being placed between the patient's two real arms on a table-top. This time, the patient was blindfolded and the researcher (wearing plastic surgical gloves) picked up one of the patient's hands and used it to tap the rubber hand. At the same time, and in synchrony, the researcher tapped the patient's other hand. This procedure creates the strong illusion for the participant that they are touching their own hand rather than the rubber hand - a feeling that the patient said she experienced.

But something surprising also happened when the researchers tried out this illusion. Within moments, the patient's alien hand leapt up off the table and was grabbed by her healthy hand. She said she felt an electric sensation in her alien hand prior to it rousing. The illusory experience seemed to have awakened her alien hand. This effect occurred every time the procedure was repeated. But crucially it only happened when it was the patient's healthy hand that was used to tap the rubber hand, whilst the patient's alien hand was simultaneously tapped by the researcher (and not when the illusion was done the other way around). The awakening effect also disappeared when the procedure was repeated with the patient's blindfold removed, which is known to destroy the illusion.

All this suggests that it wasn't touching the alien hand per se that roused it, but rather it was the experience of the body illusion. Schaefer and his colleagues think that their patient has a disconnect between the anterior supplementary motor area (SMA) at the front of her brain (involved in inhibitory control) and other brain regions involved in movement. They reckon this impaired motor integration somehow interacted with the illusory feelings of body ownership triggered by the rubber hand trick. Perhaps, they said, the illusion further weakened the SMA's already compromised control of the alien hand.

"Although our results should be confirmed by further studies, we believe that the examination of experimental-induced illusions in patients with disorders of self-embodiment is promising and might help us to develop treatments for these diseases in the future."

_________________________________


Michael Schaefer, Hans-Jochen Heinze, and Imke Galazky (2012). Waking up the alien hand: rubber hand illusion interacts with alien hand syndrome. Neurocase: The Neural Basis of Cognition DOI: 10.1080/13554794.2012.667132

Further reading: Sergio Della Sala on the bizarre ‘Dr Strangelove syndrome’ and what it tells us about free will (Psychologist magazine article).
Simulating anarchic hand syndrome in the lab (earlier Digest report).

*Some experts prefer the term anarchic hand syndrome for this patient's condition, reserving the term alien hand syndrome for a distinct but related condition in which the patient no longer believes the hand is theirs. For consistency I decided to use the terminology adopted by the authors of this paper.

Post written by Christian Jarrett for the BPS Research Digest.

Monday, 23 April 2012

Do psychology findings replicate outside the lab?

Most psychology research takes place under laboratory conditions allowing tight control over the exact interventions and procedures participants are exposed to. That makes for neater science but leaves the discipline vulnerable to claims that the results aren't relevant to real life where things are far messier. Now Gregory Mitchell at the University of Virginia has tested this very issue by poring over the literature looking for previously published meta-analyses that compared findings in the lab to the same issue addressed in a field experiment. His searches, which built on a similar 1999 study (pdf), led him to 82 meta-analyses from the last three decades, comprising 217 lab vs. field study comparisons.

Overall, Mitchell found that lab findings usually replicate in the real world (r = .71, where 1 would be a perfect match), but the devil is in the detail: some sub-disciplines in psychology fared much better than others; the size of the effects often differed greatly between lab and real world; and in a worrying number of cases, the real world results were actually in the opposite direction to the lab findings.

"Many small effects from the laboratory will turn out to be unreliable," Mitchell concluded, "and a surprising number of laboratory findings may turn out to be affirmatively misleading about the nature of relations among variables outside the laboratory."

Breaking the results down by sub-discipline, findings replicated from the lab most often in Industrial-Organisational Psychology (based on 72 comparisons) and least often in Developmental Psychology, where the three comparisons showed the average field result was actually in the opposite direction to the lab findings. The massive discrepancy in the number of comparisons between these sub-disciplines makes it difficult and unfair to draw any definitive conclusions from this particular contrast. However, Social Psychology had a similar number of comparisons (80) to Industrial-Organisational Psychology, yet produced a far lower replication rate (r = .53 vs. r = .89). Mitchell said further research is needed to find out why this might be.

There were also important differences in replication rates (from lab to field study) within different psychology sub-disciplines. For example, Industrial-Organisational Psychology studies of performance evaluations translated less well from the lab compared with other topics of study in that discipline. Across subfields, lab studies of gender differences were particularly unlikely to translate to the real world. "We should recognise those domains of research that produce externally valid research," Mitchell said, "and we should learn from those domains to improve the generalisability of laboratory research in other domains."

_________________________________


Mitchell, G. (2012). Revisiting Truth or Triviality: The External Validity of Research in the Psychological Laboratory. Perspectives on Psychological Science, 7 (2), 109-117 DOI: 10.1177/1745691611432343

Further reading: Gregory Mitchell contributed to The Psychologist's current opinion special on replication in psychology (free access).

Post written by Christian Jarrett for the BPS Research Digest.

Wednesday, 18 April 2012

Toddlers don't take the risk of entrapment seriously

Infants can't tell us what they can and can't perceive in the world so psychologists make assumptions about this based on their behaviour. A new study by John Franchak and Karen Adolph at New York University exposes the limits of this approach, demonstrating that how babies choose to behave isn't based only on their perceptual abilities but also on their assessment of risk.

Thirty-two 17-month-old infants were allocated to one of two conditions - they either had to judge whether they could fit through a narrow gap (of varying widths) between two surfaces, or they had to judge whether they could fit through a narrow gap (of varying widths) between the edge of a table and a wall. Both conditions took place atop a table but the risk in the first case was getting stuck, whereas the risk in the second case was falling off the edge.

The toddlers in the first, "entrapment" condition frequently misjudged the situation and found themselves stuck on over 80 per cent of trials (this error rate showed no signs of diminishing over time). By contrast, toddlers in the "falling" condition were shrewder judges and fell off on just 21 per cent of trials (don't worry, no babies were hurt in this research). This was the case even though one might imagine that the gap between two wall-like surfaces was easier to judge, from a perceptual point of view, than a gap between a wall and a drop, and despite the fact that infants in both conditions exhibited similar approach behaviours - lining their bodies up in advance and feeling the gaps with their hands.

Franchak and Adolph point out that if developmental psychologists relied on the "entrapment" condition, they would wrongly conclude that infants of this age have yet to develop the sensory and motor sophistication to judge gap size in relation to their own body size. In fact the results from the "falling" condition show that toddlers are capable of judging the relative size of a gap versus their own body. The discrepancy in performance between the two conditions is presumably because babies aren't that bothered about the risk of getting stuck - so they're fairly reckless about trying to squeeze through a too-small gap - but they are bothered about the risk of falling, so they take their size estimations along precipices far more seriously.

As an aside, the infants' histories of getting stuck or not in real life (for example, the researchers noted that one boy had previously managed to get his head stuck in a training potty) bore no relation to their performance in the task.

The researchers said their findings had theoretical implications - challenging previous assumptions made by other psychologists that the tendency for infants to get stuck in gaps meant they had poor body knowledge. The new results also have practical implications. "Falling and entrapment are two of the leading causes of accidental injury in infants," the researchers said. "The results suggest that even though experienced walking infants can perceive risks of falling and entrapment accurately, they may discount the potential danger of entrapment. Their willingness to squeeze themselves into possibly small openings may contribute to the prevalence of entrapment injuries."

_________________________________


Franchak, J., and Adolph, K. (2012). What Infants Know and What They Do: Perceiving Possibilities for Walking Through Openings. Developmental Psychology DOI: 10.1037/a0027530

Post written by Christian Jarrett for the BPS Research Digest.

Monday, 16 January 2012

Is it time to resurrect post-trauma psychological debriefing for emergency responders and aid workers?

You've probably seen on the news, after a disaster, the announcement that trained counsellors will be on hand as a matter of routine. Or you used to. In fact, the practice of offering routine post-trauma psychological debriefing (Critical Incident Stress Debriefing - CISD - to give it its original, formal title) is all but dead and buried. It's hard to say who exactly executed the fatal blow.

NICE - the trusted, independent UK body that provides health advice - is a chief culprit. Based on seven randomised controlled trials (RCTs) comparing psychological debriefing against control groups, NICE recommended in 2005 that brief, single-session interventions not be routinely offered to individuals who have experienced a traumatic event. In 2006, another likely culprit, the Cochrane Collaboration (widely respected for its meta-analyses of published studies), identified 15 relevant RCTs and made a similar recommendation.

Psychiatrist Simon Wessely, based at the Institute of Psychiatry in London, went further and must also be a chief suspect. In a debate held at the Royal Institution in 2006, he proposed psychological debriefing after trauma as the "worst ever idea on the mind", on the grounds that it's ineffectual and possibly harmful. "It's a bad idea and a bad intervention," he said.

I must confess that I too may have played a part, however minor, in the demise of post-trauma counselling. In my Psychologist magazine article When Therapy Causes Harm, I highlighted Critical Incident Stress Debriefing as among the therapies identified by Emory University psychology professor Scott Lilienfeld as potentially harmful and that should be avoided. In my book The Rough Guide to Psychology, I used the possible harm caused by post-trauma psychological debriefing as an example of a counter-intuitive finding in psychology.

Now a team of therapists and trauma consultants, Debbie Hawker, John Durkin and David Hawker, who've worked extensively with NGOs, aid workers and emergency responders, have called for post-trauma debriefing to be resurrected for these specific client groups. In a scholarly plea, they've argued that the damning conclusions formed by NICE, Cochrane, Wessely and others were premature and too narrowly interpreted (NICE acknowledges that their guidance may not apply to debriefing of emergency workers or group debriefing). Hawker and co claim that there are many who would welcome the return of post-trauma debriefing: "As mental health professionals active in the military, emergency service and humanitarian fields, we are aware that the personnel we work with often request debriefing, and speak of its benefit for them". Yet the debriefing is usually not available: "Professionals ... are afraid of being accused of professional misconduct if they offer psychological debriefing ...".

Hawker and co point out that of the 15 RCTs identified by the NICE and Cochrane reviews, three found a positive effect of debriefing, nine found no effect and only two found a harmful effect. These two studies, they explain, were seriously flawed. The patients who received debriefing were more severely injured than the controls; they received debriefing too soon, before they were ready; the debriefing was too brief (it averaged 44 minutes, whereas experts say it should last at least two hours, with at least one follow-up); and the debriefers were inadequately trained (a research assistant delivered the debriefing in one study; the other negative outcome study said the debriefers had received half a day's training).

In effect, Hawker et al say, these trials were more like "inefficacy trials" - exploring what happens when an intervention is delivered badly to the wrong people. As it was originally conceived, they explain, post-trauma psychological debriefing was meant to be part of:
"a package for emergency workers who'd experienced critical incident stress as part of their work. It was specifically designed for selected psychologically resilient personnel who are trained to cope with expected pressure during their routine work in stressful situations. These are teams of people who have trained together and been briefed together before working together."
Post-incident debriefing was also meant to be delivered by a mental health worker and a peer debriefer, both of whom should have experience of the emergency services they're working with, thus lending the debriefers all-important credibility.

Debriefing is popular with emergency workers and aid workers, Hawker and co say, because many of them see it as their only chance to talk about their experiences. It allows them to do so as a matter of routine, without the stigma of therapy, which they sometimes fear could be detrimental to their careers. Given this need, perhaps it's no surprise that post-trauma psychological debriefing is surfacing under new names like "powerful event group support" and "trauma risk management".

"We have been told that the case against debriefing is proven and the debate is closed," Hawker, Durkin and Hawker conclude. "We disagree ... We predict that appropriate psychological debriefing will be shown to have benefits for secondary victims of trauma who have been briefed together and who have worked together through traumatic events. Research into these uses of debriefing should be encouraged and supported."

_________________________________


Hawker, D., Durkin, J., and Hawker, D. (2011). To debrief or not to debrief our heroes: that is the question. Clinical Psychology and Psychotherapy, 18 (6), 453-463 DOI: 10.1002/cpp.730



Post written by Christian Jarrett for the BPS Research Digest.

Thursday, 1 December 2011

Questionable research practices are rife in psychology, survey suggests

Questionable research practices, including testing increasing numbers of participants until a result is found, are the "steroids of scientific competition, artificially enhancing performance". That's according to Leslie John and her colleagues who've found evidence that such practices are worryingly widespread among US psychologists. The results are currently in press at the journal Psychological Science and they arrive at a time when the psychological community is still reeling from the fraud of a leading social psychologist in the Netherlands. Psychology is not alone. Previous studies have raised similar concerns about the integrity of medical research.

John's team quizzed 6,000 academic psychologists in the USA via an anonymous electronic survey about their use of 10 questionable research practices including: failing to report all dependent measures; collecting more data after checking if the results are significant; selectively reporting studies that "worked"; and falsifying data.

As well as declaring their own use of questionable research practices and their defensibility, the participants were also asked to estimate the proportion of other psychologists engaged in those practices, and the proportion of those psychologists who would likely admit to this in a survey.

For the first time in this context, the survey also incorporated an incentive for truth-telling. Some survey respondents were told, truthfully, that a larger charity donation would be made by the researchers if they answered honestly (based on a comparison of a participant's self-confessed research practices, the average rate of confession, and averaged estimates of such practices by others). Just over two thousand psychologists completed the survey. Psychologists who received the truth incentive admitted to more questionable practices than those who didn't, suggesting the incentive worked.
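
The paper's exact estimation procedure isn't spelled out here, but the logic of combining those quantities can be illustrated with some back-of-the-envelope arithmetic. In this Python snippet (all numbers invented for illustration), a raw admission rate is scaled up by respondents' estimate of how many offenders would actually admit:

def implied_prevalence(admission_rate, estimated_admission_prob):
    """Implied prevalence = admission rate / estimated probability of admitting."""
    return min(admission_rate / estimated_admission_prob, 1.0)

admission_rate = 0.40            # hypothetical: 40% of respondents admit to a practice
estimated_admission_prob = 0.60  # hypothetical: peers guess 60% of offenders would admit
print(f"Implied prevalence: {implied_prevalence(admission_rate, estimated_admission_prob):.0%}")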

Averaging across the psychologists' reports of their own and others' behaviour, the alarming results suggest that one in ten psychologists has falsified research data, while the majority has: selectively reported studies that "worked" (67 per cent), not reported all dependent measures (74 per cent), continued collecting data to reach a significant result (71 per cent), reported unexpected findings as expected (54 per cent), and excluded data post-hoc (58 per cent). Participants who admitted to more questionable practices tended to claim that they were more defensible. Thirty-five per cent of respondents said they had doubts about the integrity of their own research. Breaking the results down by sub-discipline, relatively higher rates of questionable practice were found among cognitive, neuroscience and social psychologists, with fewer transgressions among clinical psychologists.

John and her colleagues said that many of the iffy methods they'd investigated were in a "grey-zone" of acceptable practice. "The inherent ambiguity in the defensibility of research practices may lead researchers to, however inadvertently, use this ambiguity to delude themselves that their own dubious research practices are 'defensible'." It's revealing that a follow-up survey that asked psychologists about the defensibility of the questionable practices, but without asking about their own engagement in those practices, led to far lower defensibility ratings.

John's team think the findings of their survey could help explain the "decline effect" in psychology and other sciences - that is, the tendency for effect sizes to decline with replications of previous results. Perhaps this is because the original, large effect size was obtained via questionable practices.

The current study also complements a recent paper published in Psychological Science by Joseph Simmons and colleagues that used simulations and a real experiment to show how toying with dependent variables, sample sizes and other factors (the kind of practices explored in the current study) can massively increase the risk of a false-positive finding - that is, claiming a positive effect where there is none.
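
In that spirit, here is a minimal simulation (my own sketch, not the Simmons et al. code) of one such practice - adding participants and re-testing until significance - showing how it inflates the false-positive rate well beyond the nominal five per cent:

import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

def optional_stopping_significant(start_n=20, step=10, max_n=100):
    # Both groups are drawn from the SAME population: any "effect" is a false positive.
    a = list(rng.normal(size=start_n))
    b = list(rng.normal(size=start_n))
    while True:
        p = stats.ttest_ind(a, b).pvalue
        if p < .05:
            return True   # "significant" despite no real effect
        if len(a) >= max_n:
            return False
        a.extend(rng.normal(size=step))  # peek, then add more participants
        b.extend(rng.normal(size=step))

runs = 2000
rate = sum(optional_stopping_significant() for _ in range(runs)) / runs
print(f"False-positive rate with optional stopping: {rate:.1%} (nominal 5%)")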

"[Questionable research practices] ... threaten research integrity and produce unrealistically elegant results that may be difficult to match without engaging in such practices oneself," John and her colleagues concluded. "This can lead to a 'race to the bottom', with questionable research begetting even more questionable research."
_________________________________

Leslie John, George Loewenstein, and Drazen Prelec (In Press). Measuring the prevalence of questionable research practices with incentives for truth-telling. Psychological Science.


Pulled from the comments: Psychfiledrawer is a repository for non-replications of published results.


Post written by Christian Jarrett for the BPS Research Digest.

Wednesday, 12 October 2011

Steve Jobs' gift to cognitive science

The ubiquity of iPhones, iPads and other miniature computers promises to revolutionise research in cognitive science, helping to overcome the discipline's over-dependence on testing Western, educated participants in lab settings.

That's according to an international team of psychologists who say the devices allow for experimentation on an unprecedented scale. "The use of smartphones allows us to dramatically increase the amount of data collected without sacrificing precision," say Stephane Dufau and his colleagues, "and thus has the potential to uncover laws of mind that have previously been hidden in the noise of small-scale experiments." In contrast, they argue that conducting cognitive psychology experiments over the internet has not been a great success because of problems obtaining the necessary precision of timing.

To illustrate their point, the researchers developed an iPhone/iPad App that replicates the classic "lexical decision task" used by psychologists to study the sub-second mental processes involved in reading. Participants are presented with a series of letter strings and simply have to indicate as quickly as possible whether each one is a real word or not. The App was launched as a seven-language international effort in December 2010 and after just four months data had been collected from over four thousand participants. By way of comparison, it took more than three years to collect a similar amount of data via conventional means. It will be easy to add further languages to the App, including languages written in non-Roman scripts, such as Chinese.

The free Science XL App presents the task to users as a test of word power and offers a choice of task lengths from two to six minutes. Once enrolled, participants use Yes/No buttons on the touch-screen display to indicate whether the letter strings that appear are real words or not. Each participant's performance stats are presented at the end and they are given the option of forwarding their results to the researchers via email. Extreme negative outliers were excluded from further analysis. There is the obvious issue of participants choosing to only send in favourable performance data. However, this doesn't spoil the ability to examine the effect of different factors on performance. For example, the data collected via the App matched many known features of lexical decision time data: reaction times were quicker for more common words and mean reaction times correlated with data collected in psychology labs.
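
As a flavour of that kind of sanity check, the following Python sketch (with invented words, frequencies and reaction times) computes the expected negative correlation between word frequency and lexical decision time:

import numpy as np

# (word, corpus frequency per million, mean RT in ms) -- hypothetical values
data = [("the", 60000, 480), ("house", 550, 540),
        ("pencil", 30, 610), ("gauze", 4, 680)]

log_freq = np.log10([f for _, f, _ in data])
rts = np.array([rt for _, _, rt in data])

# Common words should be recognised faster, so expect a strongly negative r.
r = np.corrcoef(log_freq, rts)[0, 1]
print(f"Correlation between log frequency and RT: r = {r:.2f}")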

Using smartphones "has wide multidisciplinary applications in areas as diverse as economics, social and affective neuroscience, linguistics, and experimental philosophy," say Dufau and his collaborators. "Finally it becomes possible to reliably collect culturally diverse data on a vast scale, permitting direct tests of the universality of cognitive theories."

This isn't the first time that psychology researchers have aired their excitement about the potential of mobile technologies to revolutionise their methods. A 2009 study used mobile phones to monitor participants' social movements and phone calls.
_________________________________

Dufau, S., Duñabeitia, J., Moret-Tatay, C., McGonigal, A., Peeters, D., Alario, F., Balota, D., Brysbaert, M., Carreiras, M., Ferrand, L., Ktori, M., Perea, M., Rastle, K., Sasburg, O., Yap, M., Ziegler, J., and Grainger, J. (2011). Smart Phone, Smart Science: How the Use of Smartphones Can Revolutionize Research in Cognitive Science. PLoS ONE, 6 (9). DOI: 10.1371/journal.pone.0024974

-Thanks to Marc Brysbaert for the tip-off.

Post written by Christian Jarrett for the BPS Research Digest.

Wednesday, 14 September 2011

How not to spot personality test fakers

Personality tests are an effective recruitment tool: higher scorers on conscientiousness and lower scorers on neuroticism tend to perform better in the job. But a major weakness of such tests is people's tendency to answer dishonestly. A study now shows that a popular approach to spotting cheaters is likely to be ineffective.

This approach, which has gained momentum in the research literature, is to focus on applicants' response times. Honest test-takers show an inverted U-shaped response profile, being fast when they strongly agree or disagree with test items (these come in the form of statements about the self, such as "I pay attention to details"), and slower when they answer more equivocally. This is thought to reflect a process whereby test takers refer to their self-schema and find it easier to answer when statements clearly conform or contradict this schema.

At least two theories predict that fakers won't show this inverted U-shape, and that response times therefore offer a way to expose those who are cheating. One theory has it that fakers refer to their self-schema and then exaggerate the truth on key statements. This has the effect of extending answer times for unequivocal answers, flattening out the inverted U-shape response time profile shown by honest answerers. Another theory says that fakers don't refer to a self-schema at all - they simply assess the social desirability of each item and exaggerate answers where necessary. This is a cognitively simpler task than referral to a self-schema, and again the inverted U-shaped response profile is predicted to flatten.
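
The competing predictions can be made concrete with a toy model. In this Python sketch (arbitrary constants, not the study's analysis), response time falls with the extremity of the answer for honest responders, and a "flattening" parameter captures what both faking theories predicted would happen:

def predicted_rt(answer, flattening=0.0):
    """Toy response-time model: distance from the scale midpoint speeds answers.

    flattening=0 gives the honest inverted-U profile; 1 gives a flat profile.
    """
    midpoint, base_rt, slope = 3, 1200, 150  # ms; arbitrary illustrative constants
    extremity = abs(answer - midpoint)       # 0 (midpoint) to 2 (extremes)
    return base_rt - (1 - flattening) * slope * extremity

for label, flat in [("honest", 0.0), ("faker (predicted)", 1.0)]:
    profile = [round(predicted_rt(a, flat)) for a in range(1, 6)]
    print(f"{label:18s} RTs across answers 1-5: {profile}")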

To test these predictions, Mindy Shoss and Michael Strube had 60 undergrads (38 women) complete a personality test (the Revised NEO Personality Inventory) three times: once honestly, once to create a general good impression, and lastly, either to create a good impression specifically for a public relations role, or specifically for an accountant role.

The key finding is that participants showed the inverted U-shaped response time profile regardless of whether they were answering honestly or not. Response times were faster overall for the fakery conditions, and the inverted U-shape was actually accentuated in the specific public relations fakery condition. Shoss and Strube said these results are consistent with the idea that fakers form, and refer to, an idealised personality schema in their mind when completing a personality test, and so their answers show a similar response time profile to an honest test-taker. The accentuated inverted U-shape for the PR-role condition comes from the fact that the schema for such a role is like a caricature, making unequivocal answers for certain items even easier to provide than usual.

Digging deeper, the researchers found that when striving to make a good impression, participants scored higher on extraversion, agreeableness, openness and conscientiousness and lower on neuroticism.  The inverted U-shape in response times was greater for agreeableness and conscientiousness in the fake conditions than when answering honestly.

"This study casts doubt on the validity of response times for detecting faking in general," the researchers said. "... it seems that researchers and practitioners interested in detecting and reducing faking would do well to focus on other strategies."

An alternative approach to reducing test fakery is to force applicants to choose between pairs of equally appealing statements about themselves, as reported previously on the Digest. Other recent research has shown that many recruitment measures might actually be testing applicants' ability to discern what's required of them, rather than anything more specific, as reported recently by the BPS Occupational Digest.
_________________________________

Shoss, M., and Strube, M. (2011). How do you fake a personality test? An investigation of cognitive models of impression-managed responding. Organizational Behavior and Human Decision Processes, 116 (1), 163-171. DOI: 10.1016/j.obhdp.2011.05.003

Post written by Christian Jarrett for the BPS Research Digest.

Tuesday, 26 July 2011

Brain scans could influence jurors more than other forms of evidence

It's surely just a matter of time until functional MRI brain scans are admitted in US and UK courts. Companies like No Lie MRI have appeared, and there have been at least two recent attempts by lawyers in the USA to submit fMRI-based brain imaging scans as trial evidence.

Functional MRI gauges fluctuating activity levels across the brain, with experts divided on the merits of using the technology as a high-tech lie detection measure (see earlier). The late David McCabe, who died earlier this year, and his colleagues put that debate to one side. They asked: if fMRI evidence were to be allowed in courts, would it have a particularly influential effect on jurors' decisions? There's good reason to think it might. For example, a 2008 study by Deena Weisberg found that lay people and neuroscience students (but not neuroscience experts) were more satisfied by bad scientific explanations when they contained gratuitous mentions of neuroscience.

For the new study, 330 undergrads at Colorado State University read a vignette about a criminal trial in which a defendant was accused of killing his estranged wife and lover. Various points of evidence were mentioned and summaries of testimony and cross-examination were provided (the vignette amounted to two pages).

Crucially, a sub-set of the participants read a version in which fMRI evidence was cited: "... there was increased activation of frontal brain areas when Givens [the defendant] denied killing his wife and neighbour, as compared to when he truthfully answered questions." For comparison, other participants read a version that either included incriminating evidence from polygraph, from thermal imaging technology (which measures changes in facial skin temperature), or that contained no lie-detection technology.

The key finding was that participants who read the brain-imaging version were far more likely (76 per cent) to say they considered the defendant guilty, compared with participants who read the other versions (47 to 53 per cent). Moreover, the lie-detection evidence was more likely to be cited by participants in the fMRI condition as key to their decision, as compared with participants who read versions that didn't mention fMRI.

The participants were not entirely seduced by fMRI. Some of them were given a slightly different version of the fMRI vignette, in which the expert witness warned about the technology's unreliability. These participants came to a similar proportion of guilty verdicts as the participants who read the vignette versions that lacked fMRI evidence. So it seems the persuasive influence of fMRI evidence can be tempered easily enough if people are reminded of its limitations.

The researchers acknowledged the obvious weaknesses of their study: the use of students as mock jurors, the use of vignettes rather than a real trial, and so on. These caveats aside, they said their data show that fMRI evidence could be more influential than other types of evidence. "... [T]hough determining whether that indicates the evidence would lead to unfair prejudice, confusion of the issues, misleading the jury, or needless presentation of cumulative evidence is a complex issue," they said. "At the very least, it appears that juries should be informed of the limitations of fMRI evidence."
_________________________________

McCabe, D., Castel, A., and Rhodes, M. (2011). The Influence of fMRI Lie Detection Evidence on Juror Decision-Making. Behavioral Sciences and the Law. DOI: 10.1002/bsl.993

Further reading: The brain on the stand, by Jeffrey Rosen, New York Times magazine.

This post was written by Christian Jarrett for the BPS Research Digest.

Monday, 6 June 2011

Beware the "super well" - why the controls in psychology research are often too healthy

Many studies in clinical psychology and psychiatry are making the mistake of using healthy controls who are too healthy. That's according to a thought-provoking opinion piece by Sharon Schwartz and Ezra Susser - experts in the epidemiology of mental health.

Schwartz and Susser invite readers to consider a hypothetical study that samples participants from a wider group made up of people exposed to a virus prenatally and people not exposed to that virus. Imagine that a psychiatric registry is used to identify all the participants from this wider group who are diagnosed with schizophrenia, and they are compared with a slice of healthy participants recruited from the same source. The aim is to see what proportion of the participants with schizophrenia were exposed to the virus and what proportion of the healthy controls were exposed to the virus. If the history of exposure is higher among the schizophrenia participants, then this would suggest there may be an association between the virus and the later development of schizophrenia. In Schwartz and Susser's hypothetical scenario, there is no difference between patients and controls in rates of virus exposure and so the virus seems unassociated with schizophrenia. So far, so good - this is a classic case-control study.

The problem identified by Schwartz and Susser is that many such studies apply an exclusion criterion or criteria to the healthy controls that they don't also apply to the patient group. For example, they might rule out healthy controls with an alcohol problem, or depression, or even a physical disorder. The motivation for this is often the fear that these other disorders will obscure the potential link between the cause of interest and the condition of interest (virus exposure and schizophrenia in our ongoing example).

But to apply such exclusion criteria in a one-sided fashion (to the controls but not the patients), creates a serious confound. In our example, imagine that depressed "healthy" controls are excluded and imagine too that there is an underlying association between the virus exposure and depression. Excluding healthy controls with depression in this scenario would distort the results such that the virus appeared wrongly to be associated with schizophrenia (check out the full paper for the data behind this).
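
The logic is easy to demonstrate by simulation. In the following Python sketch (all prevalences invented), the virus raises the risk of depression but is genuinely unrelated to schizophrenia; excluding depressed people from the controls alone makes exposure look artificially rare among controls, and hence spuriously elevated among the cases:

import numpy as np

rng = np.random.default_rng(3)
n = 200_000
exposed = rng.random(n) < 0.30                              # 30% prenatal virus exposure
schizophrenia = rng.random(n) < 0.01                        # unrelated to exposure
depression = rng.random(n) < np.where(exposed, 0.30, 0.10)  # exposure raises risk

cases = schizophrenia
controls_fair = ~schizophrenia                        # no one-sided exclusion
controls_biased = ~schizophrenia & ~depression        # "super well" controls

print(f"Exposure in cases:           {exposed[cases].mean():.3f}")
print(f"Exposure in fair controls:   {exposed[controls_fair].mean():.3f}")
print(f"Exposure in 'well' controls: {exposed[controls_biased].mean():.3f}  <-- artificially low")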

"With all the potential sources of bias in a biologic case-control study, why do we focus on the use of well controls?" the researchers asked. "We do so because the use of well controls is a common, and often recommended, method to select controls. Yet it is time-consuming and expensive, can cause considerable bias and does not improve study results."

If researchers include patient participants with other co-morbid diagnoses in their case-control studies, Schwartz and Susser went on to explain, then they must also include "healthy" controls who happen to have these other conditions. On the other hand, if researchers want to exclude other conditions, so as to clean up their investigation, then they must exclude both patient participants and controls with these other diagnoses.
_________________________________

Schwartz, S., and Susser, E. (2011). The use of well controls: an unhealthy practice in psychiatric research. Psychological Medicine, 41 (6), 1127-1131. DOI: 10.1017/S0033291710001595

This post was written by Christian Jarrett for the BPS Research Digest.

Monday, 18 April 2011

Psychologists like to cite themselves

In a striking case of the experts falling foul of a phenomenon studied by themselves and their colleagues - the self-serving bias - it turns out that psychologists have a tendency to over-cite their own research papers.

Marc Brysbaert and Sinead Smyth analysed one recent issue each of Psychological Science and the Journal of Experimental Psychology: Learning, Memory, and Cognition, and two recent issues each of the Quarterly Journal of Experimental Psychology and the European Journal of Cognitive Psychology.

For each of the articles in these journals, Brysbaert and Smyth used the 'find related records' function on the ISI Web of Science to find the article out there in the wider literature with the greatest overlap in the references it cited, but which was written by a different set of authors. This way the researchers ended up with a list of original target articles, each one paired with a second comparison paper by a different research group, presumably on the same or a highly similar topic (hence the overlap in the reference lists).

To check for a self-citation bias, Brysbaert and Smyth simply looked to see how many times the authors of a target article cited themselves compared with how many times they cited the authors of the comparison paper (and vice versa). For target articles, the average number of self-citations was 4.1 (11 per cent) compared with 2.3 citations of the comparison paper's authors. For the comparison papers, the average number of self-citations was 9 (10 per cent), compared with 1.8 citations of the authors of the target article.
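
The counting step itself is simple. Here's a toy Python version (with hypothetical names and reference lists) of tallying self-citations against citations of the comparison paper's authors:

def count_citations(article_authors, comparison_authors, references):
    """Return (self-citations, citations of the comparison paper's authors)."""
    self_cites = sum(bool(ref & article_authors) for ref in references)
    other_cites = sum(bool(ref & comparison_authors) for ref in references)
    return self_cites, other_cites

# Hypothetical example: each reference is the set of its authors' surnames.
article_authors = {"Brown", "Patel"}
comparison_authors = {"Garcia", "Lee"}
references = [{"Brown", "Patel"}, {"Smith"}, {"Garcia"}, {"Brown"}, {"Jones", "Lee"}]

self_c, other_c = count_citations(article_authors, comparison_authors, references)
print(f"self-citations: {self_c}, comparison-author citations: {other_c}")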

The researchers summed up: 'A typical psychology article contains 3 to 9 self-citations, depending on the length of the reference list ... In contrast, cited colleagues in general receive 1 to 3 citations. This is what we call the self-citation bias: the preference researchers have to refer to their own work when they [supposedly] guide readers to the relevant literature.' The finding adds to past research that's shown academics are biased towards citing other researchers from their own country, and towards citing the work of the editor of the journal their research is published in.

Brysbaert and Smyth believe that psychology researchers indulge in biased self-citation practices not because their own past papers are always necessarily useful to the reader, but because it's 'good for the researchers' esteem, by means of self-enhancement and self-promotion.'

If that's the case, does it work? The evidence for this is mixed. A 2006 study in the field of economics found that papers with more self-citations were no more likely to end up being cited by other research groups. However, another study published in 2007 (pdf), which involved the analysis of over 64,000 Norwegian journal articles, found that authors who self-cited more also tended to receive more citations from others. 'So, although self-citations may not increase the likelihood that a particular article is cited, they do increase the chances that a particular author is cited,' Brysbaert and Smyth explained.

So, what to do about this self-citation bias? One option proposed by Brysbaert and Smyth is for journal editors to impose a cap on self-citations, particularly for journals, like Psychological Science, that have a cap on the total number of references allowed per paper - articles in this journal tended to have the highest proportion of self-citations. What do you think?
_________________________________

Marc Brysbaert and Sinead Smyth (2011). Self-enhancement in scientific research: The self-citation bias. Psychologica Belgica. In Press. [pdf via author website]

Post written by Christian Jarrett (@psych_writer) for the BPS Research Digest.