EBRC In Translation
24. Preprinting Biological Research w/ John Inglis and Richard Sever
In this episode, we interview Dr. John Inglis and Dr. Richard Sever, the executive director and assistant director respectively of the Cold Spring Harbor Laboratory Press. John and Richard also co-founded both bioRxiv and medRxiv, the primary preprint servers for biological and medical research. We discuss what makes the Cold Spring Harbor Laboratory Press unique, how they founded bioRxiv and medRxiv, how preprinting allows us to do experiments on the publication system itself, and much more!
In the episode, John recommends a book on career options for biomedical scientists.
Note from the editor: Unfortunately, there were some minor issues with Richard's source audio which resulted in some occasional skipping. Our apologies for this!
For more information about EBRC, visit our website at ebrc.org. If you are interested in getting involved with the EBRC Student and Postdoc Association, fill out a membership application for graduate students and postdocs or for undergraduates and join today!
Episode transcripts are the unedited output from Whisper and likely contain errors.
Hello, and welcome back to EBRC in Translation. We're a group of graduate students and postdocs working to bring you conversations with members of the engineering biology community. I'm Andrew Hunt, an incoming postdoc in David Baker's lab at the University of Washington. And I'm Ross Jones, a postdoc in Peter Zandstra's group at the University of British Columbia. Today we're joined by John Inglis and Richard Sever. John is the founder and director of Cold Spring Harbor Laboratory Press. Richard is the co-director of CSHL Press and the executive director for CSH Perspectives and CSH Protocols. Together, they also co-founded bioRxiv and medRxiv, the primary preprint servers for biological and medical research. Thanks so much for joining us today.
A pleasure to be here. Nice to meet you.
Okay, so to get us started, could you each introduce yourselves and talk a little bit about how you got from your PhD to your current positions at CSHL?
Sure. Happy to do that. If it makes it easier, just call it Cold Spring Harbor Press. That's largely what we do. So, hello, everyone. I'm John Inglis. I'm the executive director and publisher of Cold Spring Harbor Laboratory Press. I grew up in the north of Scotland and I did a zoology degree, first of all. And in the course of that, I discovered the wonders of immunology. I then went on to do a PhD in immunology at Edinburgh Medical School. And after that, through sheer chance, I was given the opportunity to become an assistant editor of The Lancet, the medical journal in London. And then through another sheer chance, I got the opportunity to found a journal which I called Immunology Today. It has since been rechristened Trends in Immunology, and it was part of that suite of monthly review journals that have become so very well known over the years. So I did that for a number of years. I was the editor and I also initiated some of the other Trends titles.
And then Jim Watson appeared out of the blue and invited me to come to New York to found a publishing house at Cold Spring Harbor Laboratory, which had been in the publishing business for quite a long time and had published largely books, but had recently initiated a new journal. And Jim's vision, which rapidly became mine, was to sort of build on that foundation and build up as diverse a program in publishing as I could manage. It was supposed to take a couple of years, but unfortunately, it's taken 35 so far. What about yourself, Richard?
Yeah, so I did a degree in chemistry at Oxford, and then went to the LMB in Cambridge to do my PhD. And I think I realized pretty early on that I wanted to do some journalistic or editorial work. And so after I finished my PhD, I did a little bit of work for the BBC, but then really realized that it was too much simplifying of the science, and I wanted to be talking at a level that was more like the level of my PhD. So I went to become an in-house editor at Current Opinion in Cell Biology, and then assistant editor at Trends in Biochemical Sciences. So the same place where John was, but some years later, so our paths never actually crossed there. And then I was executive editor of the Journal of Cell Science for a number of years, working with Fiona Watt. And then John recruited me to Cold Spring Harbor about 15 years ago, essentially to kind of do more of the same and new things. And one of those things was bioRxiv, and later medRxiv.
Very cool. Well, before we delve into bioRxiv and medRxiv, I'm hoping you all could give us a little bit more of an overview about what Cold Spring Harbor Press is, and maybe how it's a little bit similar to and different from other publishers.
I'll make a stab at that and Richard can expand on it.
The principal difference, I mean, so the press is a publishing house, and there are many, many different publishing houses. We publish books and a variety of journals, like many publishing operations do. The principal distinction, I think, between what we do and what other publishers do is that, first of all, we are not commercial. We are a division of an academic institution, which means that the decision making about what we do and how we do it is based on different principles. It's not based primarily on financial concerns, although those are part of our concerns. But our mission really is to create outputs that help scientists do their jobs, ideally better than they are able to do them already. And that thinking guides an enormous amount of what we do, in terms of what we develop and what we publish. And then also, we are mindful of the fact that Cold Spring Harbor has been around for 120 years, and has been a place where scientists came to communicate their work for really most of that time. And publishing in the way that we do it is very much an extension of that sort of community-based getting together, sharing science, improving the process by exchanging ideas and data. And so that's another part of what we do that is so important. And we are a division; we're very, very much integrated into the institution. I mean, there are many university presses, and they are often very separate from the institution they're connected to. But we are right there, embedded in the community. I'm a faculty member, I mentor graduate students, Richard helps organize seminar programs, and so on. So our scientific staff are very much part of the life of this institution. So I think those are just some of the ways in which we sort of think of ourselves. And I think that does make us quite different to many, many other organizations. What have I left out, Richard?
No, I don't think you've left anything out, John. I think, in many respects, it really does make the place unique.
Because I mean, I think everybody's familiar with the big five corporate publishers, which are essentially commercial firms. And then you have this nonprofit sector, which is university presses and various publishers that are associated with academic societies. But, as John says, they're nonprofit, and they have sort of stronger or looser associations with academic environments. Cold Spring Harbor Laboratory Press is really pretty unique in that it's completely embedded within an academic research center, which it is a key part of. As John says, you know, he's on faculty, we do things, we interact with the scientists here. So it's very, very close. And that has been really important for a lot of the things that we've done, not least of which is bioRxiv, which, you know, in many respects stemmed from early conversations with faculty members, the sort of people that we can chat with over lunch. And I mean, I found it very appealing and continue to find it very appealing, because you really do feel like you're within the scientific community. Rather than being outside and trying to kind of guess what's happening, you have people all around you who can tell you. And in the case of bioRxiv, one of the things they told us was: this thing is needed.
That's very cool, being much closer to the researchers and the science and being able to have tighter feedback loops between what you do and what they need. I really like that.
Including telling us when ideas are really crap.
Yes. Yeah. So you already started talking about it, but we were wondering if you could dive in a little bit on the origin story of bioRxiv and later medRxiv. How did it go from those conversations to actually starting it up? And what about the Cold Spring Harbor Press made this a possibility?
Well, let me start and Richard will say more, but it's important, I think, to make the point that the idea of preprints is not a new idea.
And it wasn't a new idea 10 years ago when bioRxiv started, because as I'm sure you guys in engineering know, arXiv, for physics and math and computational biology, has been around now for more than 30 years and is very, very much a central part of the communication within those communities. And that fact did not go unnoticed in biomedicine. Over the years, after arXiv got going in the early 90s, there were serious efforts to create something similar in biomedicine, several of them, maybe three or four, and they did not work. And so that is an important part of the background. In fact, we have a lot of meetings here at Cold Spring Harbor. And in 1998, I met Paul Ginsparg, who is the founder of arXiv. He came to a meeting here, and we had a long conversation then about what arXiv was, and why, in my view, it absolutely wouldn't work in biomedicine. But that was in 1998, and things changed, and several things changed more or less all at once. So Richard, you can pick up the story from here.
Yeah, yeah, I think several things changed. And you know, I mean, that was what I was conscious of, because, as John says, it's an incredibly unoriginal idea that far smarter people than we had had, you know, sort of two decades before in the physics community, namely Paul Ginsparg. But I think what was interesting to me was, you know, I was sort of looking around, and I started to see a group of population geneticists talking about this idea. And there were actually a few papers in this sort of slightly odd section of arXiv called quantitative biology, which reflects a certain strand of biology, but not really mainstream biology. And, you know, I remember talking with John about this, and John's reaction was, you know, what's different now? As I say, this is an idea that's come up time and time again, so what's different about this now?
And I think, you know, the more we spoke about it, we really did think that, actually, the difference was the timing. And that, you know, there's a time and a place, and I felt very strongly that this was inevitable, that someone was going to do it. And it should be us, because arXiv was embedded in Cornell, having been launched at Los Alamos. It was embedded in Cornell, it was very much a part of the scientific community, and Cold Spring Harbor is, of course. And so having this start in Cold Spring Harbor would be the right thing to do. So it was of the academic community, not of the corporate sector. And I think we both also thought that, you know, it being in Cold Spring Harbor, which has this very, very long-standing reputation for scientific communication that in fact, as John said, goes back even longer than its research reputation. I mean, people were coming out to learn about science more than 100 years ago. So I think that there was this feeling that the time was right, and if it could be done in the right place, it could work. And then we talked to a lot of people. I mean, we talked to some of those geneticists, we talked to faculty members at Cold Spring Harbor who were former mathematicians and physicists who'd used arXiv in the past. And I remember one of them saying to me that, you know, he was really surprised when he moved from physics into biology. He was used to waking up every morning and checking arXiv for the latest physics papers, and was astounded that a similar thing didn't exist in biology. So we saw it was a real opportunity to kind of, you know, have our ear to the ground and take the pulse of what the biology community was thinking. And so there were clearly a whole bunch of early adopters who were just ready to leap if this was the right place. And we were lucky enough to, you know, find them.
I think another factor, which is very related to what Richard is saying, is the fact that the Human Genome Project had brought together a very diverse group of scientists, including mathematicians and physicists and software engineers and so on. And they had all recognized that their ability to openly share their work was critical to the progress of the Human Genome Project. And every year for, gosh, I think maybe 30 years now, over 25 anyway, we have had an annual gathering of the genome research community here on the campus. So that was a group of people we knew very well. And we were very struck by their ability to think ahead, and I was particularly struck by their conviction that sharing their work was a critical thing to do. In fact, that 1998 meeting that I mentioned, where I met Paul Ginsparg on the grounds here, was organized by two geneticists who were already thinking about the future. And they had invited Paul to talk about what communication of research might look like in their field in future. So our feeling was, you know, if this bioRxiv thing only works for genomics and genetics, it will still be useful. But we obviously had hopes that it would be a much broader effort across all of biomedicine, or at least all of biology, when we began bioRxiv. But you also asked about medRxiv. And we did actually think about the possibility of extending the scope of this thing that we were trying to dream up into clinical medicine. And we decided not to. And everything that has happened since has shown us what a wise decision that was, even though we didn't know it at the time. But what prompted medRxiv in particular was an op-ed that appeared in the New York Times two years into the life of bioRxiv, written by two prominent physicians saying medicine needs this kind of open sharing and pointing to bioRxiv as a model for how to do it.
And we got to know one of the authors of that piece, Harlan Krumholz, a professor at Yale. He and his colleague Joe Ross were already running projects completely committed to openness of data, of clinical trial results and so on. And so we had a long series of conversations with Joe and Harlan, and also with Theo Bloom at the British Medical Journal, because one of the failed efforts in the past had been by the BMJ to start a clinical preprint server in the 90s. So we had the great benefit of the input of these academic professors and of someone so deeply immersed in medical publishing, which is different to scientific publishing. And that's where the genesis of medRxiv came from.
I mean, I think there's one other factor, again something John and I have talked a lot about in the past, which I think is also important to think about, perhaps, for the biology community: the changes in the culture of the community that were happening online around that time. One of the strongest arguments, I think, for preprints is that it's the natural way scientific communication should work in the era of the web. And obviously, arXiv started in 1991, before web servers, but rapidly after 1992 became a web server. But I think what was interesting for the biology community was that bioRxiv coincided with a point when a lot of biologists were beginning to explore essentially scientist-to-scientist communication online through things like Twitter. And so, you know, the traditional journals, et cetera, depend on search engines, I guess, tables of contents, et cetera. But what was very interesting about the timing of bioRxiv was that it happened just when scientists were going to be doing this more themselves. So even though, for example, bioRxiv was not in Medline or PubMed, they could tell each other about papers.
And very quickly it got to the point where, I mean, I remember talking to very senior, serious and not really early-adopter type biologists who would say this was how they were discovering information. Something would be on bioRxiv and somebody would put on something like Twitter or another network that this was a really good paper and people should read it. So I always think that was very critical to the success of bioRxiv, because I think that it happened just around a time when a lot of scientists were doing this, not just the early adopters.
Yeah, that makes a lot of sense. I want to circle back to something you had said a little earlier, John, which is that you are very glad you didn't start medRxiv at the same time as bioRxiv. And I'm wondering if you could expand on that a little bit and maybe compare and contrast the way the two work.
Yeah, sure. Well, I mean, obviously the great concern that we shared, and certainly lots of other people did too, was about the unfiltered distribution of clinically or medically relevant information. And we had great concerns about that. And so the main point of the lengthy discussions that we had with our co-founders was about how to mitigate risk. And we've done that in a variety of ways, mostly manifest in the more stringent requirements that are placed upon submissions to medRxiv. And particularly the sense of what we often call in-house, you know, do no harm. These servers, both bioRxiv and medRxiv, are open. The content is not peer reviewed. It's not a matter of passing a quality bar or anything of that sort. But in medRxiv in particular, we reserve the right to say no to things that we think might cause public concern and alarm. For that kind of material, our phrase is "better after peer review." In other words, you can't just take the author's word for it. Now, you know, it's not a very large proportion of the submissions that we receive that fall into that category.
But we do have that category, and I for one am very glad that we do. But we use that process sparingly, because really the point is to have open communication, but not at the risk of causing an enormous amount of alarm or distress. And I think that's really the principal difference between medRxiv and bioRxiv.
Yeah. I mean, one thing that people often don't realize or forget is that every paper that goes on bioRxiv or medRxiv is looked at by a scientist. And sometimes people say, oh, you just put everything up. But we have a screening process where things are looked at quite carefully. But as John says, the point is not looking at them to see whether they're right, or putting them through any form of formal peer review evaluation. It's basically just deciding, you know, is this science? Is this medicine? And could it be dangerous in some way? And there's a bunch of other criteria that you have to fulfill for medRxiv. You know, if your paper's on an experimental treatment, then you need to have a clinical trial ID. So there's a series of declarations and information that authors are expected to provide. So as John says, that bar is slightly higher on medRxiv, but it's not about the quality of the article. It's sort of the hoops you have to jump through, and that better-after-peer-review aspect. The wisdom of a lot of this was really made clear during the pandemic, when suddenly you had, you know, things like vaccine mandates, and you don't want to give people reasons to go against accepted public health advice.
Yeah. Yeah. All right. So this is some great discussion, and it's really timely that medRxiv came out just before the COVID-19 pandemic and ended up being a huge contributor towards clinicians and scientists being able to communicate on that.
So we were wondering if we could sort of zoom in on the pandemic: how you saw the role of medRxiv in this, how that role changed over time, anything you needed to do to change your screening processes during the pandemic, and where you see it now versus where you saw it at the beginning of the pandemic.
Well, the principal benefit of posting preprints is the speed with which information can circulate and, you know, the few barriers that are placed on that circulation. So in January of 2020, medRxiv had something like 200 manuscripts. It was six months old at that point, and it posted 200 manuscripts. In May of 2020, it posted 2,000 manuscripts, and most of those were about the pandemic in some form. BioRxiv had 30 papers on SARS-CoV-2 in the second half of January 2020, and then, you know, the number of COVID-related papers on bioRxiv began to grow as well. And those papers became the topic of conversation around the world as public health authorities everywhere tried to figure out what was going on with this brand new disease and this newly identified organism. So everybody was sharing, and everybody was making their own interpretations and impressions of what they were reading. There simply wasn't time for the peer review process to work. I think people were aware that they were looking at unfiltered information, but they were glad to have it so that they could make their own evaluation. And that situation continued throughout 2020. Fast forwarding to now, we are still getting a reasonable number of COVID-related manuscripts, but relative to the overall volume of preprints, the proportion has dropped precipitously.
And so medRxiv, we always thought, was going to make a slow start. We thought that, given the conservatism of medicine, its adoption would be maybe a little bit faster than it would have been if bioRxiv hadn't existed, but we thought it was going to be slow. And then the pandemic completely changed that. And so now we're dealing with a situation where people often think, well, the only rationale for having medical preprints is a public health crisis. But of course, the rationale for having preprints in medicine extends over all of medicine. And I think people are beginning to get to grips with that. But there's been this sort of blip in the middle of the life of medRxiv, which altered perceptions quite a bit. And we're sort of trying to reset that now. Richard, would you add anything to that?
Yeah. Well, I think what's kind of interesting was, you know, I think both of us, probably in a slide deck prior to the pandemic, had a slide about that. I can't remember the name of the guy, but there was a paper in PLoS Biology talking about the potential for preprints many years ago. And they pointed out that when the SARS outbreak happened, something like 95% of the papers on the virus appeared after the outbreak had ended. And so you could imagine scenarios in which really rapid transmission of information would be important. And I think what was interesting about the pandemic was that all the things that we thought about preprints were kind of writ large. You know, suddenly people really wanted information really quickly, and there was no possible way you could expect it from journals. I mean, you had this kind of bizarre phenomenon where journals would publish an article on the Alpha variant of the virus when it no longer existed, and everybody was getting infected with Delta. So it made it really clear that, you know, the speed was incredibly important.
And also the speed of the basic research informing the translational. I mean, I was talking to somebody fairly recently who said that, you know, they got drugs going into clinical trials to treat COVID, and they said it's inconceivable to imagine how that could have happened without the rapid transmission of information through bioRxiv and medRxiv. So you see that. And as John said, that's kind of what you predicted, what you would hope for: some kind of geometric acceleration of science. And the odd thing, what was kind of interesting, was then, you know, medRxiv being there, and then looking at the numbers and thinking, my God, 10 million people this month, and these were not scientists. And then, you know, there was a set of occasions in, I think it was late 2020, early 2021, where you'd go to the New York Times front page, and the article would be about an article on medRxiv, because, you know, the British variant has arrived in New York, or, you know, how are the vaccines working against this? So that was kind of interesting. And I think the public attention that was paid to medRxiv papers, again, underscored that it was wise to have those kind of high guard rails, because there were a number of claims, you know, about scientific misinformation coming out about COVID. And people had worried that preprints would fuel this. But actually, the reality is that most of those things weren't preprints at all. You know, I mean, the really bad screw-ups, in many respects, were in journals.
And in journals where there was no capacity for modification. You know, one of the most notorious incidents on bioRxiv was literally the first weekend in February of 2020, when a paper appeared on bioRxiv claiming that this new virus had sequences in common with HIV-1. And that was posted late on a Friday afternoon. And by Sunday afternoon, the authors had withdrawn it, because the community had weighed in so vigorously to point out the errors in their interpretation and their methodology. And so we had science, you know, almost operating in real time through the lens of the preprint server. Now, that was an exception, for obvious reasons. But in a much more low-key way, that is part of the benefit of having a culture that revolves around preprints, rather than one that revolves around so-called peer-reviewed published papers, where the capacity for changing that content is very, very much more limited.
There's actually quite a nice example of it on the more clinical side. So there's a very small risk of myocarditis following vaccines, which is much, much lower than the risk of getting myocarditis following COVID. But there was a preprint that came out where they overestimated the myocarditis rate. And what was interesting was that they spotted really quickly that they'd made an error and withdrew the paper, which you can do on medRxiv. And so the corrective mechanism is there. But it was interesting talking to a few cardiologists about that, because what was very interesting was that, in all likelihood, this would not have been picked up in peer review. The reason that you could tell that the myocarditis rates were overestimated was that the denominator, the number of cases in the people in Canada, was wrong. So you had to know that to know the calculation was wrong. And the likelihood of anybody from that region of Canada being the peer reviewer was very low.
So almost certainly, if this paper had not gone to medRxiv, it would have passed peer review, and then would have had to go through the whole protracted retraction process. But because it went on medRxiv, within a matter of days the error was spotted by a broader community, the wisdom of the crowd, who did know what the numbers ought to look like in that region of Canada. So it was interesting seeing some of these aspects that you'd always thought about: rapid correction is possible, just like rapid publication.
Yeah, this rapid outsourcing of peer review, it's really cool. It's not quite in biology or medicine, but we kind of recently had this LK-99 superconductor saga, which is sort of still ongoing a little bit, although maybe tapering off. And I'm wondering if you all had any thoughts about that. I personally enjoyed seeing it play out on Twitter in real time, seeing all these folks trying to replicate it.
Absolutely. I think what's great about that is that, you know, people talk about that, and they go, look, this is a failure, because this thing was on arXiv and now we think it doesn't work. And it's like, actually, no, that's success. Because what happened was it was on arXiv. It looked kind of plausible. So people tried to replicate it, and they couldn't. You know, it's kind of interesting: every now and then something appears on bioRxiv, and people have some issue with it. And a number of people go, oh, why was it posted? What should happen? And then other people go, this is what should happen. This is exactly what it's for. So I think LK-99 is a great example of that.
Were there any negative outcomes, do you think, from the way that information was shared during the pandemic, or for this more recent superconductor thing, where, you know, you think that there could be some improvements to the preprinting system to help streamline things or to help eliminate misinformation?
Or do you think that you've been getting it to where you want it to be?
I think, and you know, I hope this doesn't sound complacent, but I think that not just bioRxiv and medRxiv, but arXiv too, work in the way they should. They work slightly differently in our cases, but they work for the benefit of the scientific community. But what happened in the pandemic was that the mass media started paying attention to what was on preprint servers. And this had not really happened with arXiv, except in very, very rare cases of, you know, mathematical proofs of some challenging conundrum that had been around for a while. Occasionally, that sort of thing would be picked up. But with medRxiv, as Richard said, every single day, it seemed, throughout 2020, there was something in the media about something that had appeared on medRxiv. Now, a lot of the coverage was excellent. But a lot of it really wasn't. And, I mean, there were perfectly good reasons for that. You know, newspapers were scrambling. Most newspapers do not have specialists on their staff who are experienced in reading the scientific literature, and people who were sort of general beat reporters were being asked to cover the pandemic. So there was a lot of rather poor reporting on what scientists were doing. But I don't think that means that the preprint servers were not functioning as they are supposed to function. And in fact, the scientific community was functioning as it was supposed to function as well. But the translation of that work into the minds and hearts of the public was definitely a problem. And I think it's part of a much broader problem of how the general public understand how science is done, and how scientists work, and their intrinsic scepticism about their own work and that of others. And that's translated into some sort of statement that you can't believe anything a scientist tells you. And I think that was unfortunate.
Yeah, I think, I mean, I think that because we err on the side of caution, then actually, there weren't too many worries about things like that. Because we've always had a policy, particularly with medRxiv, when John and I and our co-founders, Joe, Harlan, Claire, and Theo talk, it's like if anybody worries about the danger of the paper, then we, you know, we, and sometimes we disagree, but really, we always will err on the side of caution. So if one or two other people are really concerned that this might be dangerous, then we won't post it. But, you know, I mean, ultimately, a lot of these things get published anyway. And I think John's right. I think the real problem was not so much content, but the extent to which stuff is spun. You know, you can excerpt one sentence from a preprint or a journal article, which totally does not represent the whole paper. I mean, this is what happens with anti-vax people all the time. They'll take something out of context, absolutely amplify it. And it's very difficult, you know, that's going to affect journal articles. There's also a bit of a misunderstanding about how science works among the general public and the way in which it's an accumulation of observations to build some kind of emerging consensus. And there can be turns and then people can get things wrong. And I think there's been a lot more conversation about getting the public to understand that. And it's very difficult to get the public to understand that at the same time as having a whole bunch of bad actors trying to manipulate the message. But I think that was the thing I took more than anything. And I think, you know, we always thought that there may be some, you know, big error that would happen at bioRxiv or medRxiv where we let something through like that. You know, we were very cautious, but we thought it might happen. 
But I really find it very hard to kind of go back and point to a paper where I think we have terrible regrets about letting it through. I mean, there's things that we turn away because they'd be better after peer review. And, you know, maybe we were too cautious. But I really don't think that things slip through. And John mentioned the very early paper showing the alleged homology to HIV, which, you know, I mean, we still debate whether or not that is something that should have gone through or not. I mean, it was a crap paper. But, you know, was it just a crap paper that really people should have forgotten about? It was somewhat over-interpreted. It's not clear to me that a huge amount of damage was done to anybody other than fueling a few conspiracy theory circles for a weekend. Yeah, I was just thinking about there's so many questions that we could sort of delve into from here. I was just thinking about which one to ask next, because anyway, it's such an interesting discussion. I'll just ask a quick related question to some of these possible criticisms of preprints before we start getting into some of the benefits and how we think science is transforming. So beyond these sorts of things about unverified findings or misinformation, other criticisms include, for example, researchers just sort of planting a flag, putting out sort of very minimal studies to kind of say we did something first and other things like that. And we were kind of curious, what other criticisms of preprints do you see that we haven't talked about? What ones do you think have merit and sort of what might lead you to make some further future changes to how bioRxiv and medRxiv work? Other criticisms? I guess it's a little bit. 
I would never call it flag planting, because I honestly think that in any community of people who are closely engaged in a certain topic, those who matter, in other words, those who can read a paper with adequate understanding of its significance, if it is full of holes, if it lacks information, that's going to be obvious. So you don't get the credit for planting your flag if you've done it in an inadequate way. And I think that would be what I would say in response to that particular kind of criticism. Slightly outside that inner circle, we sometimes hear from those who want to repeat something that, oh, well, this preprint didn't have enough information to allow a full-blown replication. One can say that in the journal literature, that is true as well. As someone who has sat on 15 PhD committees now, I've heard the sad tale of the first-year graduate student who's trying to reproduce something and can't, because it turns out there were just little bits in the methods that were not actually included. So I'm sort of talking while thinking here, but I'm not sure I can think of any other kind of criticism that we have heard about preprints. Richard, would you? Yeah, I mean, I think there's a few things. I mean, one point I do want to make on this issue of the flag planting is that I think, as John says, reputation is key among scientists. So if you're the dude who's constantly putting out crappy stuff and claiming that you've made, I mean, there are people like that already. They existed before bioRxiv. Anybody who works in any field knows that there's a few people, and you meet them in the conference bar, or you kind of go to a talk and you say, oh, wow, that was interesting, if you're somebody like me who doesn't know the field. And then you talk to somebody who knows and they just say, yeah, everybody knows that guy, always claiming things like that, don't believe a word he says. 
That happens; you can do it now with, like, rapid publication in journals with negligible peer review. So I think there's an important reputational feedback mechanism in science, which kind of is something that is a barrier to that type of thing happening. I also think we should be very careful about this idea of priority in science, because I worry about this with younger researchers getting the impression that everybody's always in these kind of winner-takes-all races. And there are a very small number of cases where that happens, where there's like some massive finding that somebody discovers. But 99.9% of scientists are never in a race like this. Most of them, there's an awful lot of people working in an area, making incremental advances, building things forward at different places, building on each other's work. And then if you look back five years later, nobody kind of sort of gets out the kind of tape measure and says, well, you know, that person was first, you know, you say, oh, well, you know, 2019, there were a couple of papers, one was in Nucleic Acids Research, one was in Nature Communications, I can't really remember who was first, they both did the same sort of thing. That's more kind of what happens. And I do worry a little bit, particularly for junior scientists, that one of the big concerns that people sort of talk about with preprints, of course, is scooping. And, you know, and it's always sort of cast in this idea that, you know, if you put your paper out on bioRxiv, you might get scooped. And of course, then there's these conflicting narratives, one says, you can't be scooped, because you put it on bioRxiv, everybody knows that you've used the anti-scooping device. And then other people say, oh, well, you know, but more importantly, I won't get my paper in Nature. So I've been scooped, doesn't matter if I think I'm first, because my paper's on bioRxiv, this person's paper's in Nature, I've been scooped. 
But you know, I mean, some of this is pretty fanciful, because it's pretty rare that any of this sort of thing happens outside the kind of like, sort of boogeyman tales that PIs tell their graduate students, frankly. I mean, I can say as an early career researcher that I feel like it's had a positive impact on me, definitely. I mean, the ability to like, not have to wait several months, if you're lucky, up to like 12 to 24 months to get your paper published is actually huge, to have your work out there and have other people see it. So yeah, I'm curious. In that vein, we've already talked about speed, I'm wondering if there's any other ways that you all see sort of preprint servers changing the way research is shared and consumed? Like, what are the biggest impacts that you've seen? Well, I mean, one of the tremendous boosts to preprint behavior in biomedicine has been the fact that so many funders and agencies have weighed in to say that they encourage the use of preprints and permit the inclusion of preprints as sort of recognizable outputs when the scientists that they fund are being sort of weighed up. And that happened really quite early on in the life of bioRxiv, going back to, what, I don't know, 2016, maybe 2017, when these pronouncements came out from places like NIH and Howard Hughes Medical Institute, and so on. So that has been, I think, a very significant factor. And that has definitely helped encourage people to believe that this is the right thing to do and beneficial to them. And of course, it means, I mean, you know, the scenario that you were describing earlier, there are absolutely people who have jobs much, much sooner than they would have otherwise. 
You know, if you have a great result, and you can put it on bioRxiv, then you don't have a situation where somebody is treading water as a postdoc for another two years to try and get the additional experiments to get the paper in. And so I mean, John and I have both met numerous people who have said, you know, I got my job because I put this preprint on bioRxiv, and I didn't have to wait another two years. And actually, I heard an interesting thing from somebody who's involved in recruitment. And they said, actually, what a really savvy institution does is look at the bioRxiv preprints and try and recruit people from them. Because if they wait until the paper's in Science or Nature, then they'll have to compete with MIT, Harvard, Stanford, and everybody else. So it's kind of interesting when you see all these ways that it impacts careers. I mean, I think your question was about what changes can happen. I think the really interesting question, which we're just at the beginning of right now, is how it changes what we think about and how we do peer review. I mean, this is something that again, John and I thought from the outset that, you know, bioRxiv would really speed up science, but it also, because it decoupled the dissemination from the evaluation, gives us a real opportunity to ask this question about peer review: whether we should do it for all papers, how we should do it. And I think, you know, I mean, I think people are just beginning to think about this. The most obvious and somewhat controversial example is eLife, who now don't accept or reject papers; they put the peer reviews on bioRxiv, and then, you know, and then they appear on eLife later. But that's all kind of enabled by the fact that the work is public. So I think, you know, it'd be really interesting to see, you know, in the coming years, how people change the way they think about peer review. 
I mean, I often say, if I was going to start a new journal tomorrow, then I probably wouldn't build a website like a journal now, because why would you bother if the article's already there on bioRxiv, then you know, you can do things in a different way. I think related to that, and as Richard said, we are in very early days, but there are already quite a number of experiments in new ways of thinking about peer review and new ways of doing it. And one of the great benefits is that it has brought a broader and more diverse group of people into the process of evaluating science. And many of these groups have made a very conscious decision to embrace more widely in terms of, you know, age or geographical distribution, and bring a larger proportion of the working research community worldwide into the process of evaluating progress in science. And that has to be a good thing. Yeah, yeah, I really love that. I want to circle back a little bit to what we're talking about with the pandemic. The pandemic provided an interesting opportunity to sort of measure the success of preprint servers. There was a paper that got published, and it was actually published as a preprint and then as a peer-reviewed article, which was great. And they compared a large number of studies published during the pandemic, both pre and post peer review. And they found pretty much that little changed between the preprint and the published version, and that whatever did change didn't really qualitatively change the interpretation of the results. So to me, this seems like a huge win for preprinting. And I'm just curious in this vein, what metrics do you both use to sort of quantify how successful preprinting has been and how successful bioRxiv and medRxiv have been? I mean, the obvious answers, of course, are the rate of submission and the rate of usage, you know, based on downloads or page views or whatever. 
I mean, those are sort of broad measures. But there are a lot of others, you know, going back to my point about diversity, we have manuscripts on both servers from over 190 countries, and very few journals can sort of point to that as the origin of their content. So that's a win. And we'd certainly regard that as a measure of success. Obviously, proportionately the dominant volumes are coming from the United States and the UK and Western Europe, but there is lots of potential for growth in other parts of the world. So, you know, seeing changes in those proportions is one of the metrics that we would look at over the course of the next few years. Certainly that's amongst them. I mean, I think the other thing that's kind of interesting from that point you made about the lack of difference, for the most part, between the preprint and the published version, is it gets back to that kind of the existence of something like bioRxiv, allowing you to look at peer review and say, what is this for? What does it do? If the difference between those two things is negligible, then you say, well, what's the peer review doing? And of course, one of the things that peer review is doing in that case, is it's deciding where a paper ends up. So, you know, if you peer review a paper, it doesn't change very much between the preprint and the published version. But what peer review may do is say, this isn't good enough for my journal, it's good enough for this journal. And so is peer review just sorting out a pecking order, or an impact of articles. And, you know, I think it definitely is doing that. And then we want to ask the question, is this the best way to do that? You know, do we want to spend sort of, you know, three months submitting and rejecting a paper at one place, six months at another, and then, you know, and ultimately 18 months later, find it ends up in a journal that was two rungs lower than you wanted. 
And maybe it would be better to just, you know, sort of put the paper out and then decide that afterwards. And of course, then the scientific community can argue for the next 50 years about the best way to do that ranking and whether or not we should do it at all. But the very existence of preprints really puts that in the kind of headlights and says, what are we doing with this? And my argument has always been that I do see a lot of value in peer review. But the problem is the way we have it set up now is it conflates a whole bunch of different things to do with quality, soundness, impact and interest, which are all fundamentally four different things. Yeah. Another point worth making, I think here is that one of the metrics that people ask us about all the time is, well, what proportion of these preprints got published in a journal? And the answer, which we know to a reasonably good approximation, is somewhere between 70 and 75 percent in both cases. But that really avoids several more nuanced sort of observations. One is that, and we saw this in particular in the pandemic, where people posted to medRxiv in particular, not because they wanted a communication that was going to get published in a journal. Many did, of course, but often they were, I always call them dispatches from the front. You know, these were intended as communications to help other people who were struggling with the same problem. And that was the primary purpose of the communication, not publication. And that's a reminder to me that that's why scientists communicate: in order to advance knowledge, to advance the enterprise and help each other, not to add something necessarily to their resumes. And I think we forget that far, far too often. But another of the changes that we've seen, and we don't have any way of quantifying this, is that more people tell us anecdotally, you know, putting it on bioRxiv, that's enough for me. 
You know, we did the work. We've come to our conclusions. We've analyzed our results. We've put it out there for the benefit of the community. And now we are moving on. We're just not prepared to go through the two-year nonsense that you refer to, Andrew, in order to get it into a journal that may not actually be a very satisfying result at the end of the process anyway. And we don't know how many times that is happening, because we don't ask authors that, obviously. But it wouldn't surprise me if it's happening more than we realize. Yeah, one way that I've already sort of experienced these changes is that journals now will invite submissions based on what they see on bioRxiv. And I think that's one cool way that the preprint system has started to change the regular publishing system. I was wondering if you guys just had some quick thoughts before we go to our next main question about how you see research articles changing as preprints become more prevalent, as the digital age kind of pervades through more and more to normal journal articles too. Richard will answer that, I know. But just one point that I realized I meant to make earlier on, which was that one of the things that really helped bioRxiv to get established was the willingness of journals to make public a policy or even change their policy about accepting manuscripts for editorial review that had already been posted to a preprint server. If that hadn't happened, then preprints in biomedicine would have had a very much harder row to hoe. And we will always be grateful to particularly a group of society publishers who got together and said, OK, we are cool with this. We haven't had to think about this before, but our policy from now on is going to be openness to preprinted manuscripts. 
And funnily enough, the Natures and Sciences of the world had had this openness for many years because they had got to grips with arXiv a long time before, and that was something that wasn't adequately appreciated amongst biomedical scientists. But journals were very important to the acceptance of preprints. But Richard, you'll have other thoughts about Ross's question. No, I was just going to say on that aspect, what was interesting was how quickly that happened. bioRxiv launched in November of 2013. And within a couple of months, I was already hearing from people saying, oh, they were getting emails from interested journals saying, would you like to submit to us? I really hate that phrase, if you build it, they will come. But it's like we built it. The editors came to look to try and find papers. And then without any advertising at all, suddenly tons of Chinese scientists decided that that was where they were going to put epidemiological analyses of COVID. So it was kind of, you know, one of the rare occasions where that phrase is actually applicable. Yeah, very cool. Something else that I wanted to ask you all about, that's been sort of bouncing around my brain a little bit, is with all of the sort of rise of AI chatbots, I've seen sort of the claim that AI chatbots have reduced the cost of creating bullshit to zero. And I'm curious what you think, if you think this will have an impact on preprinting and sort of if we will need additional precautions in this era of AI. So this is obviously something that's on our minds, too. But picking up on something Richard said earlier, which is that, you know, reputation is what keeps the scientific enterprise going. It's not because people necessarily get rich, but they get rich in terms of respect and appreciation for how they behave. So there are always going to be bad actors. Fortunately, in our experience, very, very, very few, but they do exist. 
So I don't think that we're going to have a sort of vast tsunami of completely made up papers, because I don't think there's much in it for the architects of that bullshit that costs zero to produce. We are discussing what our guidance should be in relation to the authors of preprints. And it's not going to be that much different to the emerging policies around journal articles. I think it's fruitless. This is my opinion anyway. I think it's fruitless to say these things must be ignored at all cost. We have, as I said, manuscripts from 190 countries where most of the time the author is writing in a language that is not her own. How dare we prevent the use of a chat bot for improving the communication of somebody who really, really wants to find the broadest audience possible. So I don't think that's the right way to do it. I think what at the moment we are considering is the idea that we should advise authors to acknowledge when they use any kind of tool, including language tools, just so that it's there in the manuscript and nobody is trying to fool anybody else. It's transparent. It seems to me the right thing to do. But beyond that, I don't think we are not going to legislate against the use of these tools, at least not now. I mean, we're still very, very, very early stage in all of this stuff. And we're all learning at the same time. But that's where we've sort of come out at the moment. We'd like to know, but we're not going to check and we're certainly not going to prohibit. Yeah, I think that makes a lot of sense. I mean, I think there's a bigger picture. I mean, I think the bigger thing that we need to think about is how are things like data verified? And I have a thing about identity verification as well, which I think is important because reputation works if you can be held accountable. So I think that's important. But I think ultimately the scientific community has to think about it. 
I mean, you look at the appearance of Photoshop and then suddenly everybody started seeing images being manipulated, you know, but just from somebody who remembers papers before people used Photoshop, people would try and manipulate those papers as well. So there's a fair bit you can do with film and there were authors who did that. But obviously it makes it easier. But I think that's an issue because, you know, I mean, mostly just the text in the paper is, you know, a narrative, right. But the thing that you're really concerned about is the data. And, you know, you can fake data by using an antibody different from the one you report. So but I mean, I think that's what we as a community need to think about. As I said, as John said, I think that, you know, the tools that people use to help them make their words better to describe what they've done are obviously things that we should allow. And we need to be clear that, you know, you should be clear about all the methodology that you've used. Yeah, totally agreed. So in 2022, kind of jumping into another thread here, the Biden administration announced a new policy that mandates all research funded by the US government be immediately open access on publication by 2025. You've both argued that this actually doesn't go far enough. We're wondering if you could talk about what you think the policy should be and why you think posting preprint should actually be mandated. Well, I guess it's interesting you frame it that we say this doesn't go far enough. My view would be that this is how we do it. That, you know, my view has been, you know, having sort of participated and John even longer than I in this in the space for many years, people have been talking about how to make research findings free to everybody for, you know, more than 25 years. And, you know, I call me a cynic, but I continue to think that nobody really has any idea how to do it. Right, you know, or certainly not any idea that's broadly accepted. 
You know, I mean, you don't have to kind of discuss things with scientists very long before they start complaining about open access fees. But, you know, of course, the open access fees are there to make the paper open access, which is what people want. And then, you know, all the numerous alternatives are equally problematic. So my view is that, you know, the Biden administration and Europeans and various people have come out over the years and said that publicly funded research should be made available to the public to read. I mean, I would argue that it's not so much about the funding and who funded it. It's more that philosophically, it's the right thing to do. We should try and enable this. And the simple thing for me is to just say, well, when you've written up a paper, make it public by putting it on a preprint server. And again, going back to that decoupling, then separate out that much, much, much more problematic conversation about how you pay for the peer review or, in the case of somewhere like Nature or Science, how you load onto one paper the costs of the tenfold more that you reject to get that. So I think that's complicated. So I think, you know, the simple thing is to solve that problem. You mandate the data and you mandate the narrative and you can achieve that with data repositories and preprint servers. And it sounds very simple to me. And I, you know, maybe I'm politically naive, but I still don't understand why nobody's done this other than a few small, small, small philanthropic funding organizations. But John, sorry. No, no, I think you've really said it. You've really said it all. I mean, it fundamentally, and we are wrestling with this in the Cold Spring Harbor journal program. We are committed to a pathway of transforming them into fully open access journals by 2025. And we're doing that in response to these funder mandates. 
But it's problematic and we are more fortunate than most small not-for-profit publishers in being able to derive a plan that we hope gives us the possibility of continuing to publish these journals. But small societies, small academic societies with highly respected journals are having a serious problem with continued survival. And the research community may well wake up one day in not so many years from now and discover that a decent proportion of the journals that they liked most have gone away. And the ones that have remained are the ones they like to complain about most. And this is because of the fact that the commercial publishers have many, many more resources with which to navigate this transition to an open access world. And while we are, as Richard has said, we are very committed to the idea that scientific information should be publicly available, but doing it through the medium of journals has already proved to be complicated and costly. And using preprints and data repositories would be so much simpler. Yeah. It'll be really interesting to see how all this transition goes. I agree. I'd love to see preprinting used as an easy way to meet that goal. So I want to ask a little bit just career wise, what are some of your favorite and least favorite things about working in science publishing? What gets you out of bed in the morning? What makes you excited to do your job? Well, I can start with that. With two things, really. I mean, one, or maybe it's just one. I think that young scientists are a constantly amazing group. And I see that through the lens of the Cold Spring Harbor having our own graduate program with these incredibly talented young people who go through it. And also seeing, I mean, we've got 350 people who are obsessed about RNA biology arriving tomorrow. And, you know, they come to the bar and they hang out at lunchtime and so on. The demographic of that group skews very, very young. 
And these are energetic and creative and dedicated young people. And I should say they are far more interested in the process of communicating science than their elders ever were. You know, they've grown up at a time when they use sharing means without even thinking about it. And that includes sharing their science with people who are not scientists. So just in the last two or three weeks, two 30-year-old scientists who I know have signed contracts with major publishing houses to write books for the general public about the kind of science that they do. I mean, they're in their 30s. Whether that's a good thing from their career perspective is a whole other matter. But the fact is they are really passionate about doing that kind of communication. And I think that's a tremendous asset to the scientific enterprise. And I hope that, you know, the academic community can embrace that sort of ambition and make communication one of the opportunities that young scientists are both given and encouraged to pursue if they turn out to be passionate and good at it. So that's just one thing off the top of my head. I mean, I think the thing that excites me too is one of the things that kind of led me into it in the first place. I mean, I vividly remember when I was doing my PhD in Cambridge, I was at the LMB and I was sort of very focused on signal transduction and transcription factor activation. But when I was in the lab, that was all I was doing for sort of three and a half years. And it was very, very focused. But I was lucky because it was at Cambridge, I was also in the college system. So I was able to do tutorials and supervisions for the undergraduates in molecular cell biology. And so I was teaching two or three hour-long sessions a week and that, you know, one week I would be doing bioenergetics, the next week I'd be doing oncogenes, the next week I would be doing, you know, sort of Michaelis-Menten and protein kinase action. 
And I was worried that I was becoming so focused on such a narrow thing and that actually what I found more enjoyable was the whole spectrum. And so, you know, when I went to work at TIBS, for example, it was great that, you know, one week it would be structural biology, the next week it would be kind of homologous recombination, the next week it would be, you know, plant hormone signaling. And that was the thing that kind of interested me and continues to interest me, you know, and bioRxiv is obviously incredibly broad, and with medRxiv we've moved into the clinical space. So I think that's one of the things that I think always inspired me about the career that I have and that I got into. I think, you know, the flip side is that, you know, and if you talk to any editor they will mention this, and it's much, much better now, but historically there has been a narrative that people who are editors are only editors because they failed at science. And, you know, I mean, it's not just editors, but people in various other professions that require a PhD do get this. And nothing could be further from the truth, really. I mean, you know, I've met patent lawyers and editors who are incredibly smart people. Often, and, you know, it pains me to say that, but many of them are smarter than a lot of PIs I've met, you know, not exclusively, but it does happen. I think that's a shame, but I think what is really good recently is that there's been increasing recognition that this isn't the case. There's been an increasing understanding that only a very small percentage of people go on to become a PI. Some people don't have the opportunity and a lot of people make active choices to go into other professions that have their own challenges and certainly aren't an easy way out at all. Yeah, that makes a lot of sense. 
Relatedly, do you guys have any advice that you'd like to provide for early career researchers navigating the publication space and trying to get their work out as they try to graduate and move on to, you know, the next stage in their career? Well, the most obvious thing, since you're speaking to us, is that they should go to their PI and insist that their manuscript is put on bioRxiv. You know, we're very, you know, grateful for the progress that bioRxiv has made, but it's not a universal platform by any means and the proportion of papers that start their life as preprints is still pretty small. It's hard to estimate, but somewhere around, say, 15% of the literature begins as a preprint. And there are still young scientists, postdocs who tell us anecdotally that they would love to put their manuscript on bioRxiv, but their PI says no for a variety of reasons. So that's something I think is worth having a conversation about, ideally early in your life as a graduate student or a postdoc, having a conversation maybe even before you join the lab, ask the PI what her attitude is towards preprints, because let's face it, if you're an established PI, then preprints matter much less to you than they do to the person who's doing the work. So I would recommend, first thing is to have that conversation in advance and just see where your PI comes out. Nobody may have had the conversation with them before, so it might lead to interesting changes. Yeah, I would echo that. And I think again more broadly, just talking to other scientists, talking to people, not just your own PI, about the landscape, about what you should do. Because I do think, as a graduate student in particular, you can become very focused and all your work is on this particular area or in a lab, and sometimes there can be a temptation to get very blinkered and not go to seek broader opinions. And if you go and talk to other scientists, they may say, hey, there are other options, you should think about this. 
You should post a preprint. It's not the end of the world if you don't get your paper into Cell, for example. One thing that I occasionally criticize scientists for, and I think scientists are bad for, is extrapolating from N equals one, themselves, to everyone's experience. And this can be how to write a paper, whether you want papers double-spaced, figures at the end, inline figures. And everybody kind of says, I like it this way, so everybody else must be like me. And if you get that narrative from your PI, sometimes it can be really advantageous to go and talk to somebody else who says, oh no, you can do things in different ways. It doesn't have to be like that. And so I think especially in the context of things like scooping, worrying about preprints, where you should submit a paper, and all these things that you feel like it's expected and this is the path, when there are other paths, talking to a broader group of scientists will enable you to find out that you may be getting a fairly restricted description of the challenge you're facing. Well, thank you so much for joining us. Before we wrap up, is there anything you'd like to promote before we finish up? Well, since we're on the subject of career options, because that was one of the questions that Ross mentioned, as I said at the beginning, the mission of the press is to create things that help scientists. And one of those things is a book called Career Options for Biomedical Scientists, which Richard was one of the editors of. And these are commissioned chapters from people who we knew who had all got PhDs in some form of biomedicine, and they went and did something completely different. And a whole range of interesting possibilities written about in an instructive and informative way. So that's a book that many of your, much of your audience might find very helpful. 
Yeah, and I think that gets back to the point that I made before, is one of the, it's very important with career advice in particular to seek opinions and knowledge from people who know about it. And actually the one person who's least likely to know about this is your PI, because they've only ever done one job normally. Oh, that's really wonderful advice. I am going to go buy that book myself. So it's been a delight having you both on. Thank you so much. I really appreciate your time. Thank you for the invitation. It's been fun. Yeah, it's been a pleasure. This has been another episode of EBRC in Translation, a production of the Engineering Biology Research Consortium's Student and Postdoc Association. For more information about EBRC, visit our website at ebrc.org. If you're a student or postdoc and are interested in getting involved with the EBRC Student and Postdoc Association, you can find our membership application linked in the episode description. A big thank you to the entire EBRC SPA podcast team, Andrew Hunt, Ross Jones, David Mai, Heidi Klumpa, and Rana Said. Thanks again to EBRC for their support. And of course to you, our listeners for tuning in. We look forward to sharing our next episode with you soon.