Writefull Archives - Digital Science https://www.digital-science.com/tags/writefull/ Advancing the Research Ecosystem Thu, 07 Mar 2024 17:23:12 +0000
Writefull Case Study https://www.digital-science.com/resource/writefull-case-study/ Thu, 08 Feb 2024 11:02:28 +0000 https://www.digital-science.com/?post_type=story&p=69236 Case Study from Writefull demonstrates how their tools support the entire publishing workflow

The post Writefull Case Study appeared first on Digital Science.


Writefull Case Study

One of the world’s largest scientific societies and the premier home of chemistry professionals publishes over 80 journals spanning all disciplines of the chemical sciences.

The society’s Production Operations team is responsible for tasks like technical editing and proof review. As the organization grew, it needed more capacity, and it now leverages a global supply chain – including Writefull.

The team is also involved in developing and implementing new technologies, with a particular focus on how to make the best use of Artificial Intelligence. They looked into the concept of language modification done programmatically but were “really striking out with commercially available broad-sweeping tools. They did not perform well on our content.” What they needed was something that could suggest improvements to academic language usage, such as sentence structures in scientific writing or discipline-specific vocabulary. 

Supporting the entire publishing workflow

Then the organization was introduced to Writefull. They were immediately intrigued by the specificity it offers, as Writefull’s models can be tailored to a particular niche of scientific content. After a competitive RFP process, they decided to work with Writefull, and together they began developing some of the services they have since operationalized. 

Writefull’s full publisher solution supports each stage of the publishing pipeline, from submission to copy editing to quality assurance.

Over the years, the society has incorporated four unique tools from Writefull. A member of the Production Operations team explains: “We have a very rigorous method for evaluating the quality of each step of the production process. For example, one of the things we need to do when we evaluate quality involves scoring the language after copyediting. Traditionally, it would take a long time for one of our team members to complete a review like this, limiting the amount of content that could undergo quality review. But when we tailored Writefull’s Manuscript Categorization API to our requirements and benchmarked it against our own quality analysts, it reached over 95% alignment in categorising texts.” With such a high level of alignment, the automation of this task can significantly reduce the time spent on manuscript evaluation and allow more content to be reviewed for quality purposes.
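The benchmarking described above amounts to comparing the category the API assigns to each manuscript with the category a human analyst assigns. The following is a minimal sketch of how such an alignment figure could be computed. The function name, the "needs-edit" label, and the sample data are invented for illustration; they are not the society's data or Writefull's API.

```python
def alignment_rate(model_labels, analyst_labels):
    """Return the fraction of manuscripts where model and analyst agree."""
    if len(model_labels) != len(analyst_labels):
        raise ValueError("label lists must be the same length")
    if not model_labels:
        raise ValueError("need at least one manuscript")
    matches = sum(m == a for m, a in zip(model_labels, analyst_labels))
    return matches / len(model_labels)


# Hypothetical categories for five manuscripts (illustrative only).
model = ["well-written", "needs-edit", "well-written", "needs-edit", "well-written"]
analyst = ["well-written", "needs-edit", "well-written", "well-written", "well-written"]

rate = alignment_rate(model, analyst)
print(f"alignment: {rate:.0%}")
```

In this toy sample the two sets of labels agree on four of five manuscripts. A production benchmark would likely use far more manuscripts and might also report per-category agreement.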

The organization’s quality evaluation program also includes assessment of metrics such as whether author names and affiliations are present and correct. They use Writefull’s Metadata API to facilitate and partly automate this task, by extracting manuscript elements for post-editing quality assessment. This reduces the number of items in their quality evaluation that need manual review and the time spent reviewing them.

Another API they have adopted is Language Edit Assist (LEA), which auto-applies a highly customized subset of Writefull’s language edits to manuscripts prior to copyediting. The manuscript is then at a better starting point when it reaches the copyeditor, increasing efficiency in the editing process.

A transformative step

Writefull’s services have delivered multiple benefits to the society – including demonstrable time and cost savings. 

A prime example comes from its recent analysis of LEA results. 715 papers were randomly split into two workflows, with manuscripts of similar language quality levels present in each workflow. Prior to technical editing, one group had automatic edits applied by Writefull’s LEA service and the other group did not. Without LEA applied, 58% of the papers were classified as well-written (the highest quality level). But with LEA applied, 81% of the papers were classified as well-written – enabling the organization’s copyeditors to focus their time on high-value-added edits, instead of on basic grammatical fixes, such as articles, punctuation, and conjunctions.

Additionally, editors at the society evaluated 35,000 individual edits made by LEA. After refinement using internal data, 99.04% of edits were applied without error and 0.96% introduced errors – showing a high level of accuracy.
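To make the figures in this section more concrete, the short calculation below converts the quoted percentages into counts and a percentage-point difference. It uses only numbers quoted above, and it assumes the percentages apply exactly to the 35,000 evaluated edits, which the case study does not state explicitly.

```python
# Figures quoted in the case study.
total_edits = 35_000
error_rate = 0.0096          # edits introducing errors (0.96%)
well_written_without = 0.58  # share classified well-written without LEA
well_written_with = 0.81     # share classified well-written with LEA

# Derived values (assumes the percentages apply exactly to the totals).
edits_with_errors = round(total_edits * error_rate)
edits_without_errors = total_edits - edits_with_errors
uplift_points = round((well_written_with - well_written_without) * 100)

print(edits_with_errors)
print(edits_without_errors)
print(uplift_points)
```

Under that assumption, roughly 336 of the 35,000 edits introduced an error, and the share of papers classified as well-written rose by 23 percentage points.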

“It has been transformative,” says a member of the Production Operations team. “The data we’ve obtained has really been useful because it’s highly customized to our work. Having quality information at the right time is so important, and there’s a limit to what human manual efforts can do and when they can do it. Being able to do it programmatically with a high degree of accuracy gives you a wealth of data – and that means you get new insights into how you’re operating the business and the strategic directions you should consider.”

From strength to strength

There has been a high degree of adoption, and the society and Writefull are working together to refine the processes further. 

They are now looking ahead at what this adoption and their new knowledge means for operational performance: “Is there some opportunity there for improving throughput or quality? What are the key metrics that this is going to impact? The next part is to manage that forward. We’re seeing the refinements boost the tools’ overall performance and increase staff buy-in further.”

Part of the project’s success is attributed to how well the organization and Writefull work together. Juan Castro, CEO and Co-founder of Writefull, reflects: “Many of the services we offer to publishers today are the fruits of our collaboration with this society. It is together that we have identified challenges, thought of solutions, and developed and vetted new technologies. I personally think LEA is the most exciting of all – it is such a powerful solution, which offers direct assistance. We look forward to many more future projects together.”

The society has found Juan and the wider Writefull team to be “very creative and collaborative, and they’ve really been engaged in coming up with innovative ways to overcome the challenges involved and achieve our goals. They’re a fantastic group to work with.”

If you’d like to explore how Writefull’s services could support your organization, request a demo

AI at Digital Science https://www.digital-science.com/resource/ai-digital-science/ Fri, 18 Aug 2023 14:42:22 +0000 https://www.digital-science.com/?post_type=story&p=65335 How do Digital Science and its portfolio of companies use AI?

The post AI at Digital Science appeared first on Digital Science.


How does Digital Science use AI? We ask ChatGPT

The year 2023 will forever be remembered as the year that AI finally broke into the mainstream. But in the research ecosystem, what do we mean by AI and how is it already being used? Simon Linacre does what everyone has been doing and asks AI itself for some answers.

Anyone would be forgiven for thinking that artificial intelligence (AI) had only just been invented given the huge interest that has suddenly been generated in it in 2023. However, AI actually has a long history, being first conceived of in the 1950s and gradually becoming a reality in the decades after. Nevertheless, a real step forward has been made with Generative AI, which can create seemingly flawless text or images in just seconds. But is it really that new?

Writing these words in a Google Doc, I have already accepted one or two suggestions from the predictive text function, itself a form of AI. When you realize that the words you have chosen have not been 100% your own, you can see how lines have started to blur with the capabilities of more advanced Generative AI. However, caution must still be used as AI is more than capable of getting things wrong – just read Digital Science’s post on ‘The Lone Banana Problem’. Moreover, a lack of context or key detail can lead to misrepresentation or redundancy in the content being created.

Digital Science and AI

Digital Science has been using AI and related technologies such as Machine Learning (ML) and Natural Language Processing (NLP) for several years now – indeed, Dimensions has had https://www.dimensions.ai as its URL since its launch in 2018 – and it has become a trusted partner for hundreds of organizations that depend on its AI expertise. These are the ways Digital Science and its portfolio companies use AI-related technologies in 2023:

Dimensions

  • Document classification; Researcher Disambiguation; Research Integrity Trust Markers

OntoChem

  • Image classification to decide whether an image shows a chemical compound; Entity recognition to decide whether a term describes a chemical compound

ReadCube

  • Sentiment Analysis for Mentions; Article Recommendation Engine

metaphacts

  • Knowledge Graph and graph-based reasoning

Writefull

  • Core writing assistance offering is AI-based; AI detection

Ask ChatGPT

The information above is correct because, well, I work at Digital Science and I know it to be true. It also breaks the information down by constituent parts in terms of each Digital Science company and how it uses the relevant technology. So what happens if we ask a Generative AI model how Digital Science uses AI? Well, we get the text below courtesy of OpenAI’s new ChatGPT iPhone app:

“Digital Science, as a company that provides scientific software and research solutions, utilizes AI in various ways. Here are a few examples:

  • Text and data mining: AI algorithms are employed to analyze large volumes of scientific literature, extracting valuable insights, identifying trends, and accelerating the discovery process.
  • Natural Language Processing (NLP): Digital Science uses NLP techniques to enhance the understanding and processing of scientific texts, enabling advanced search capabilities, automatic categorization, and semantic analysis.
  • Predictive analytics: AI models are trained on scientific data to predict outcomes, such as the success or impact of research projects, funding opportunities, or potential collaborations, helping researchers make informed decisions.
  • Recommendation systems: By employing machine learning algorithms, Digital Science can provide personalized recommendations to researchers, suggesting relevant articles, journals, or research resources based on their interests and previous activities.
  • Data visualization and exploration: AI-powered tools enable the creation of interactive visualizations and data exploration platforms, allowing scientists to gain deeper insights from complex datasets and facilitate data-driven decision-making.

These are just a few examples of how Digital Science incorporates AI into its offerings. The field of AI is rapidly evolving, and new applications are continually emerging to support scientific research and discovery.”
Source: ChatGPT

While all of the information presented by the AI is correct, it is broken down by technical functionality, only presenting ‘examples’ of how technology is used and not what it is used on. As such, it is technically very accomplished, but of little interest to most audiences who would miss the context of how the technology is actually being applied.

AI vs Humans

In 2023, while all the hoo-ha around Generative AI was at its peak, Digital Science launched a campaign called ‘Mind the Trust Gap’, which sought to underline the importance of trust in research among both those who carry out research and the rest of us who depend on its integrity and quality. Developments in AI technology are often compared to humans as if it were some sort of competition or even battle, but it is perhaps more sensible to view it as a relationship entering a new phase, and trust is key to that relationship. As long as we can use AI with transparency and openness like the examples above, we can build a better understanding of the world around us.

Can our AI-powered solutions help you?

Provocative Paper Titles https://www.digital-science.com/blog/2021/08/provocative-paper-titles/ Tue, 24 Aug 2021 15:00:03 +0000 https://www.digital-science.com/?p=55333 Does a disconnect between a paper’s abstract and its title indicate a potential need to inspect the article for possible trust issues?

The post Provocative Paper Titles appeared first on Digital Science.

By Dr Leslie McIntosh, Founder and CEO of Ripeta, and Dr Hilde van Zeeland, Applied Linguist at Writefull.

At Ripeta, we develop tools to automatically scan manuscripts for key scientific quality indicators and provide feedback on ways to improve research reporting. We assess, design, and disseminate practices and measures to improve the reproducibility of, and trust in, science with minimal burden on scientists.

In what can often feel like a sea of dry scientific writing, provocative titles in scientific research papers stand out. Occasionally, legitimate scientists conducting good research will attempt more humorous titles. Sometimes, they even land! 

To highlight the joy of a jaunty paper title, our friends at Writefull, providers of AI-based research proofreading services, have developed a fun app to generate scientific paper titles based on article abstracts. When pondering paper titles, I wondered whether a disconnect between a paper’s abstract and its title could indicate a potential need to inspect the article for possible trust issues, and what better way to investigate it than to use their app!

And so, without further ado, or indeed statistical significance, I present three articles: 

Article 1: A (very not) scintillating title 

Human Created Title

An analysis of form and function of a research article between and within publishers and journals

Writefull’s Computer-Generated Title Possibilities
  • Research Article Heading Organization and Forms for Machine Learning and Natural Language Processing: A Case Study from a Single Institution
  • A Heading Form and Function Analysis for Machine Learning
  • Research Article Heading Form and Function Analysis Using Rhetorical Functions

Picking an enticing article we just published at Ripeta about research article headings and subheadings, I wanted to see how closely our paper’s title compared with those generated by a computer. Based on the alternative possibilities, the similarities of the titles reassured me that we framed our paper as dryly as possible, keeping in line with scientific naming conventions. It is quite an interesting article if you are training machine learning algorithms to parse and categorize articles. However, it is definitely not click-bait.

An image of a paper entitled "An analysis of form and function of a research article between and within publishers and journals" on an iPad screen

Article 2: A title from an author trying to be clever (apologies Dr. Luke)

Human Created Title

Where there’s smoke there’s money: Tobacco industry campaign contributions and U.S. Congressional voting

Writefull’s Computer-Generated Title Possibilities
  • Voting Behaviors of Representatives from the Tobacco Industry Political Action Committees in the United States: A Cross-Sectional Analysis
  • The Effectiveness of Campaign Contributions for Tobacco-Related Legislators in the United States: A Cross-Sectional, Multilevel Model
  • Voting Behavior of Tobacco Industry Political Action Committees

A search in Dimensions shows over 160 articles alluding to the proverb ‘Where there’s smoke’ in the title. Not that uncommon. Maybe even overused? From personal experience, Dr. Doug Luke enjoys using more flavourful titles for his papers and talks to make statistics sound as interesting as it really is. The generated titles compare favourably to the original segment after the academic colon.

An image of a Dimensions screen showing a paper entitled "Where there's smoke there's money: Tobacco industry campaign contributions and U.S. Congressional voting" on an iPad screen

Article 3: A provocative title (from a retracted article)

Human Created Title

The Safety of COVID-19 Vaccinations—We Should Rethink the Policy

Writefull’s Computer-Generated Title Possibilities
  • Vaccine Safety and Risk Assessment for mRNA Vaccine COVID-19
  • Vaccination of COVID-19: A Review of the Safety of Vaccines
  • Safety Evaluation of COVID-19 Vaccines: The mRNA Vaccination versus the Number Needed for Vaccination

The problem with this title is that the authors put a recommendation into the title, which plays on the boundaries of scientific cultural norms. In fact, the term ‘rethink the policy’ appears in only a handful of article titles. More troublesome is that the recommendation in the title does not logically follow from the paper, as also reflected by the auto-generated titles given by Writefull. Before even considering the fraught methods of the paper, we know the title and substance of the paper don’t agree with each other.

Provocative paper titles remind us, first, that scientists are able to laugh at themselves a little, and second, that the title itself could have a bearing on the readership and thus the exposure of the science within. Could there be a relationship between paper titles and trust? We’d love to hear your thoughts. Tweet us @ripetaReview.

An image of a paper entitled "The Safety of COVID-19 Vaccinations—We Should Rethink the Policy" on an iPad screen

Want to try your hand at the title generation app? Go to the Writefull Title Generator and let us know what you found @Writefullapp and @ripetaReview.

At Ripeta we will keep exploring and automating checks to make better science easier. To learn more, head to the Ripeta website or contact us at info@ripeta.com.


Dr. Leslie McIntosh
CEO and Founder, Ripeta

Leslie is the founder and CEO of Ripeta and a researcher passionate about mentoring the next generation of data scientists. She is active in the Research Data Alliance, grew the St. Louis Machine Learning and Data Science Meetup to over 1500 participants, and was a fellow with a San Francisco-based VC firm. She recently concluded her role as Director of the Center for Biomedical Informatics (CBMI) at Washington University in St. Louis, where she led a dynamic team of 25 individuals facilitating biomedical informatics services. Dr. McIntosh focuses on assessing and improving the full research cycle and making the research process reproducible.

The next generation of language feedback https://www.digital-science.com/blog/2021/06/launching-writefulls-full-edit-the-next-generation-of-language-feedback/ Thu, 24 Jun 2021 11:54:14 +0000 https://www.digital-science.com/?p=53790 After months of developing, testing, and fine-tuning, we are thrilled to launch Writefull’s new language models.

The post The next generation of language feedback appeared first on Digital Science.

Guest Post by Hilde van Zeeland, Chief Applied Linguist at Writefull.

After months of developing, testing, and fine-tuning, we are thrilled to launch Writefull’s new language models. Our new models take a big leap in language feedback – leaving other tools far behind in coverage and accuracy.

Writefull’s new models go beyond the fixing of individual words: they revise entire sentences. In addition to correcting grammar, spelling, vocabulary, punctuation, and style, they offer rewrites at the phrase and sentence level. This means fixes in word order, word use, sentence structure, clarity, and more. And they are still trained on scientific papers, so the language feedback is tailored to academic writing.

This advanced language check is available to all users of Writefull for Word on Mac, Windows 2016 and newer*, and Word Online. Simply select ‘Full Edit mode’ to see it in action. To get Writefull’s traditional language check of grammar, spelling, vocabulary, punctuation, and style, go for ‘Proofread’ instead – this mode is enabled by default. Where Proofread may suffice for a quick check, Full Edit gives you a more thorough copy editing-like service.

For both modes, freemium users get to see a subset of the suggestions, while premium members get unlimited use.

* Follow Writefull on Twitter to get notified when Full Edit is launched for Writefull Revise, Writefull for Overleaf, and Windows 2013 and older.

Writefull https://www.digital-science.com/product/writefull/ Wed, 04 Nov 2020 12:04:41 +0000 https://www.digital-science.com/?post_type=product&p=37555 Writefull helps researchers improve their written English and enables a new way of writing with confidence.

The post Writefull appeared first on Digital Science.

Writefull
Academic writing made easy
Get Writefull now

Writefull helps you write, paraphrase, copyedit, and more.

Writefull helps researchers improve their written English and enables a new way of writing with confidence. So much more than simply a grammar and spelling checker, it suggests improvements to academic language usage, such as sentence structures in scientific writing, discipline-specific vocabulary, and appropriate word choice.

Suggestions are based on real-world, context-specific usage rather than on a fixed set of grammatical rules. Using Artificial Intelligence, Writefull has learned from billions of sentences taken from scientific literature, making it very different from the standard rule-based grammar checks.

“As a non-native English speaker, I should edit my writing carefully.
I use many apps to do so. Writefull spots mistakes others do not. Incredible.”

Andrea Beltran Lizarazo

Read more about Writefull

NLP Series: Speeding up academic writing https://www.digital-science.com/blog/2020/04/nlp-series-nlp-in-academic-writing/ Tue, 21 Apr 2020 12:07:24 +0000 https://www.digital-science.com/?p=33642 Writing is key to science. Whether it is journal articles, book chapters, reports or conference proceedings, most research is communicated through written texts.

The post NLP Series: Speeding up academic writing appeared first on Digital Science.

In this week’s edition of our blog series on Natural Language Processing, we hear from two members of the team at Writefull, the academic writing support tool. Dr Hilde van Zeeland is Chief Applied Linguist at Writefull. After completing an MSc and PhD in Applied Linguistics at the University of Nottingham, UK, she worked for several years as a language testing consultant and a scientific information specialist before joining Writefull. Dr Juan Castro is one of the founders of Writefull. He completed his PhD in Artificial Intelligence at the University of Nottingham, UK, and then held a few postdoctoral positions at the same university before founding Writefull.


Introducing Writefull

Writing is key to science. Whether it is journal articles, book chapters, reports or conference proceedings, most research is communicated through written texts. For most researchers, however, writing takes up more time and effort than they would like. Fortunately, we now have Writefull: a tool that uses the latest Natural Language Processing (NLP) techniques to speed up the writing process.

Data, data, data, and models

NLP is a strand of Artificial Intelligence that refers to the automatic understanding and generation of human language. It can be applied to many purposes, such as predictive text, automatic translation, and text categorisation. Whatever the application of NLP, its techniques often rely on the training of models on vast amounts of data. While these models process batches of data, they acquire the knowledge needed for the task at hand. For predictive text, for example, they need to learn which linguistic strings recur.
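As a rough illustration of how a model can pick up recurrent strings for predictive text, here is a minimal sketch of a bigram counter. It is a deliberately simple stand-in rather than the kind of model Writefull uses; the function names and the three-sentence corpus are invented for the example.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count, for every word, which words follow it in the training text."""
    followers = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            followers[current][nxt] += 1
    return followers


def predict_next(followers, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    counts = followers.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Tiny invented corpus; a real model would train on far more text.
corpus = [
    "the results are shown in table 1",
    "the results are consistent with previous studies",
    "the results suggest a strong effect",
]
model = train_bigrams(corpus)
print(predict_next(model, "results"))
```

Here "are" follows "results" twice and "suggest" once, so "are" is the prediction. The more text such a model sees, the more reliable its counts become, which is why the amount of training data matters.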

NLP models and Writefull

To help with academic writing, we need models to do three things:
1) to learn the recurrent patterns of academic texts;
2) to recognise when an author’s language does not follow these patterns; and
3) to change such language so that it follows the expected patterns. 

Writefull can suggest changes to academic writing based on the likelihood of a word or sentence being correct.

At Writefull we have spent the last few years developing and training models that do just that. We offer an editor in which researchers can write their text. They then get automatic feedback on their writing, and can accept or reject Writefull’s suggestions. The models that Writefull uses to give feedback have been trained on millions of journal articles. Thanks to this, they can spot when the author’s writing deviates from the norm – that is, from the expected language patterns as acquired from our dataset. In many cases, such deviations will be grammatical errors, but they can also include things like awkward wording or unnecessary commas.

Why AI beats grammar rules

Traditional language checking software uses grammar rules to check for fixed elements in a sentence. For example, it might ensure that the right prepositions precede certain nouns by coding rules such as: correct ‘at progress’ into ‘in progress’.
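A rule of this kind can be sketched in a few lines. The snippet below is an illustrative assumption of what a rule-based checker might look like, not the code of any real product; the rule table and function name are hypothetical.

```python
import re

# Hypothetical rule table: wrong phrase -> correction.
RULES = {
    "at progress": "in progress",
}


def apply_rules(text):
    """Apply every fixed rule to the text, ignoring context entirely."""
    for wrong, right in RULES.items():
        pattern = r"\b" + re.escape(wrong) + r"\b"
        text = re.sub(pattern, right, text)
    return text


print(apply_rules("The experiment is still at progress."))
print(apply_rules("We looked at progress over time."))
```

The first sentence is fixed as intended. The second was already correct, yet the rule still fires and turns it into "We looked in progress over time.", because a fixed rule has no notion of context.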

Programming rules is definitely easier than training models. However, once models work well, they are much more powerful. Rules are limited; even thousands of rules wouldn’t cover all of the mistakes that authors can make, whereas models can cope with any input: their knowledge is generalisable to any sentence. To give you an example, Writefull recently corrected ‘time of the day and day of the week’ into ‘time of day and day of the week’. Writefull knew that, in this context, ‘the’ precedes ‘week’, but not ‘day’. There are many of these usage-based norms, and it is impossible to cover all of these in a rule set, but a model, if trained sufficiently, will eventually learn them.

Another downside of rules is their black-or-white nature. If an author’s sentence triggers a rule, it will then be corrected regardless of the context. This may lead to false corrections. Models, on the other hand, look at the context to judge what suggestions are needed and, based on this, can give nuanced feedback. When Writefull spots that something is off in a text, it often gives the author the probability of their phrase and compares this to alternatives. For example, when an author writes “He is sitting on the sun” in the Writefull editor, Writefull shows that “He is sitting in the sun” is a more probable alternative, with 82% likelihood of the latter versus 18% for the former. In cases like this, Writefull does not give a harsh correction, but an insight into the likelihood of the author’s wording versus alternatives. Language correctness is, after all, not always black-or-white. Messiness and ambiguity, both inherent to language, are two key challenges in the field of NLP.
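One simple way to arrive at figures like the 82% versus 18% above is to normalise a score for each alternative so that the scores sum to one. The sketch below shows that idea only. How Writefull actually computes its likelihoods is not described here, and the scores are invented to reproduce the split in the example.

```python
def relative_likelihoods(scores):
    """Normalise non-negative scores so they sum to 1."""
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("scores must sum to a positive number")
    return {phrase: score / total for phrase, score in scores.items()}


# Invented scores chosen to reproduce the 82% / 18% split in the text;
# they are not real corpus counts or Writefull output.
scores = {
    "sitting in the sun": 82,
    "sitting on the sun": 18,
}
shares = relative_likelihoods(scores)
for phrase, share in shares.items():
    print(f"{phrase}: {share:.0%}")
```

Presenting both shares, rather than silently replacing the author's wording, is what allows the feedback to be nuanced instead of black-or-white.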

The challenge of messy language

A challenge to Writefull – and to any NLP application – is noisy input. If an author writes sentences that are very different from the language that Writefull’s models know from training (i.e., from the journal articles), Writefull may fail to give accurate feedback. Think of an author messing up word order or making several serious grammar mistakes in one sentence. The challenge is therefore to identify those cases where it is best to not suggest anything, for a suggestion might turn out to be incorrect.
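The idea of staying silent when a suggestion might be wrong can be sketched as a confidence threshold. This is an assumed, simplified mechanism for illustration; the function name, the confidence values, and the 0.75 cut-off are all hypothetical and say nothing about how Writefull decides.

```python
def choose_suggestion(candidates, min_confidence=0.75):
    """Return the best candidate, or None when no candidate is confident enough.

    `candidates` maps each suggested rewrite to a confidence between 0 and 1.
    The 0.75 threshold is an arbitrary illustrative value.
    """
    if not candidates:
        return None
    best, confidence = max(candidates.items(), key=lambda item: item[1])
    return best if confidence >= min_confidence else None


# Clear winner: the suggestion is shown.
clear = choose_suggestion({"in the sun": 0.82, "under the sun": 0.10})
# Noisy input: no candidate is trusted, so nothing is suggested.
noisy = choose_suggestion({"option a": 0.40, "option b": 0.35})
print(clear)
print(noisy)
```

Raising the threshold trades coverage for accuracy: fewer suggestions are shown, but those that are shown are less likely to be wrong.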

The possibilities are endless

At Writefull, we’re continuously exploring avenues to make our feedback even more accurate and complete. While Writefull currently gives feedback on many language features, including the use of punctuation, prepositions, and subject-verb agreement, there are still plenty of science-specific features to cover. Academic writing might use virtually the same grammar as other genres, but it is highly specific in other respects, such as word use. We now have the technology in-house to expand – and in doing so, we’re keeping a close eye on developments in the NLP field.

NLP Series: AI in Science; the Promise, the Challenge, and the Risk https://www.digital-science.com/blog/2020/04/nlp-series-ai-in-science-promise-challenge-risk/ Tue, 07 Apr 2020 18:46:22 +0000 https://www.digital-science.com/?p=33578 Dr Joris van Rossum focuses on AI in science and looks at the potential to make research better, but also the pitfalls.

The post NLP Series: AI in Science; the Promise, the Challenge, and the Risk appeared first on Digital Science.


Continuing our blog series on Natural Language Processing, Dr Joris van Rossum focuses on AI in science; the potential to make research better, but also the pitfalls that we must be wary of when creating and applying these new technologies. Joris has over 20 years of experience driving change in the publishing industry through new technologies and business models. His former roles include Director of Publishing Innovation at Elsevier and Director of Special Projects at Digital Science, a role in which he authored the Blockchain for Research report. He co-founded Peerwith in 2015, and currently serves as Research Data Director at STM, where he drives the adoption of sharing, linking and citing data in research publications.

Understanding the risks

According to Professor Thomas Malone, Director of the MIT Center for Collective Intelligence, AI should essentially be about connecting people and computers so that they collectively act more intelligently than any individual person, group or computer has ever done before. This connectivity is at the core of science and research. Science is a collective activity par excellence, connecting millions of minds in space as well as time. For hundreds of years, scientists have been collaborating and discussing their ideas and results in academic journals. Computers are increasingly important for researchers: in conducting experiments, collecting and analyzing data and, of course, in scholarly communication. Reflecting on this, it is perhaps surprising that AI does not play a bigger role in science today. Although computers are indispensable for modern scientists, the application of artificial intelligence lags behind other industries, such as social media and online search. Despite its huge potential, uptake of AI has been relatively slow. This is in part due to the nascent state of AI, but also to do with cultural and technological features of the scientific ecosystem. We must be aware of these in order to assess the risks associated with unreflectively applying artificial intelligence in science and research.

AI and NLP in healthcare

A logical source of data for intelligent machines is the corpus of scientific information that has been written down in millions of articles and books. This is the realm of Natural Language Processing (NLP). By processing and analyzing this information, computers could come to insights and conclusions that no human could ever reach individually. Relationships between fields of research could be identified, proposed theories corroborated or rejected based on an analysis of a broad corpus of information, and new answers to problems given.

This is what IBM’s Watson has attempted in the field of healthcare. Initiated in 2011, it aims to build a question-and-answer machine based on data derived from a wealth of written sources, helping physicians in clinical decisions. IBM has initiated several efforts to develop AI-powered medical technology, but many have struggled, and some have even failed spectacularly. What this lack of success shows is that it is still very hard for AI to make sense of complex medical texts. This will therefore most certainly also apply to other types of scientific and academic information. So far, no NLP technology has been able to match human beings in comprehension and insight.

Barriers to information

Another reason for the slow uptake of NLP in science is that scientific literature is still hard to access. The dominant subscription and copyright models make it impossible for machines to access the entire corpus of scientific information published in journals and books. One of the positive side effects of the move towards Open Access would be that AI engines gain access to this information, although a large challenge still lies in the immaturity of NLP in dealing with complex information.

More data give greater context

Despite the wealth of information captured in text, it is important to realize that the observational and experimental scientific data that stands at the basis of articles and books is potentially much more powerful for machines. In most branches of science the amount of information collected has increased with dazzling speed. Think about the vast amount of data collected in fields like astronomy, physics and biology. This data would allow AI engines to fundamentally do much more than what is done today. In fact, the success that born-digital companies like Amazon and Google have had in applying AI is to a large extent due to the vast amount of data at their disposal. AI engines could create hypotheses on the genetic origin of diseases, or the causes of global warming, test these hypotheses by plowing through the vast amount of data that is produced on a daily basis, and so arrive at better and more detailed explanations of the world.

Shifting the culture around data sharing to create better AI

A challenge here is that sharing data is not yet part of the narrative-based scholarly culture. Traditionally, information is shared and credit earned in the form of published articles and books, not in the underlying observational and experimental data.

Important reasons for data not being made available are the fear of being scooped and the lack of incentives, as the latest State of Open Data report showed. Thankfully, in recent years efforts have been made to stimulate or even mandate the sharing of research data. Although these efforts are primarily driven by the need to make science more transparent and reproducible, enhancing the opportunity for AI engines to access this data is a promising and welcome side effect.

Like the necessary advancement of NLP techniques, making research data structurally accessible and AI-ready will take years to come to fruition. In the meantime, AI is being applied in science and research in narrower domains, assisting scientists and publishers in specific steps in their workflows. AI can build better language editing tools, such as in the case of Writefull, who we will hear from in the next article in this series. Publishers can apply AI to perform technical checks, such as in Unsilo, scan submitted methods sections for assessing the reproducibility of research, the way Ripeta and SciScore do, and analyze citations, like Scite. Tools are being developed to scan images of submitted manuscripts to detect manipulation and duplication, and of course scientists benefit from generic AI applications such as search engines and speech and image recognition tools. Experiments have also been done with tools that help editors in making decisions to accept or reject papers. The chance of publishing a highly cited paper is predicted based on factors including the subject area, authorship and affiliation, and the use of language. This last application exposes an essential characteristic of machine learning that should make us cautious.

Breaking barriers, not reinforcing them

Roughly speaking, in machine learning, computers learn by means of identifying patterns in existing data. A program goes through vast numbers of texts to determine the predominant context in which words occur, and uses that knowledge to determine what words are likely to follow. In the case of the tools that support editors in their decision to accept or reject papers, it identifies factors that characterize successful papers, and makes predictions based on the occurrence of these factors in submitted papers. This logically implies that these patterns will be strengthened. If a word is frequently used in combination with another word, the engine subsequently suggesting this word to a user will lead to that word being used even more frequently. If an author was successful, or a particular theory or topic influential, AI will make these even more so. And if women or people from developing countries have historically published less than their male counterparts from Western countries, AI can keep them underperforming.
In other words, AI risks consolidating contemporary structures and paradigms. But as the philosopher of science Thomas Kuhn showed, real breakthroughs are characterized by breaking patterns and replacing paradigms with new ones. Think of the heliocentric worldview of Kepler, Copernicus and Galileo, Darwin’s theory of natural selection, and Einstein’s theory of relativity. Real progress in science takes place by means of the novel, the unexpected, and sometimes even the unwelcome. Humans are conservative and biased enough. We have to make sure that machines don’t make us even more so.
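The reinforcement effect described above can be illustrated with a minimal sketch. It assumes a very simple word-pair model and invented starting counts; the function names and numbers are hypothetical and are not taken from any tool mentioned in this series. The engine always suggests the most frequent continuation, and each accepted suggestion is fed back into the data it learns from.

```python
from collections import Counter

def most_likely_next(counts, word):
    """Return the word that most often follows `word` in the counts."""
    followers = {nxt: n for (prev, nxt), n in counts.items() if prev == word}
    return max(followers, key=followers.get)

def share(counts, word, nxt):
    """Fraction of occurrences of `word` that are followed by `nxt`."""
    total = sum(n for (prev, _), n in counts.items() if prev == word)
    return counts[(word, nxt)] / total

# Invented starting counts (assumption): "significant" is followed by
# "difference" slightly more often than by "effect".
counts = Counter({
    ("significant", "difference"): 6,
    ("significant", "effect"): 4,
})

initial_share = share(counts, "significant", "difference")

# Simulate ten authors who all accept the engine's top suggestion.
# Each accepted text becomes part of the data the engine learns from.
for _ in range(10):
    suggestion = most_likely_next(counts, "significant")
    counts[("significant", suggestion)] += 1

final_share = share(counts, "significant", "difference")
print(f"share before: {initial_share:.2f}, after: {final_share:.2f}")
```

In this toy setup the already dominant pairing grows from 60% to 80% of all uses while the alternative never gains a single new occurrence, which is the consolidation the paragraph warns about.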

DOI: https://doi.org/10.6084/m9.figshare.12092403.v1


The post NLP Series: AI in Science; the Promise, the Challenge, and the Risk appeared first on Digital Science.

]]>
Writefull pilot by Hindawi: authors revise their English at submission https://www.digital-science.com/blog/2020/01/writefull-pilot-by-hindawi-authors-revise-their-english-at-submission/ Mon, 20 Jan 2020 16:08:13 +0000 https://www.digital-science.com/?p=32800 Hindawi was the first publisher to integrate Writefull into their submission process.

The post Writefull pilot by Hindawi: authors revise their English at submission appeared first on Digital Science.

]]>
Quick Read
  • Two-thirds of the authors accepted the invitation to use Writefull.
  • Writefull offered an average of nearly 100 suggestions per manuscript.
  • Most authors accepted more than 87% of the suggestions.

Open Access publisher Hindawi recently started a pilot with Writefull, one of Digital Science’s portfolio companies. Writefull lets authors revise their manuscript right before they submit it to one of Hindawi’s journals.

Writefull in brief

Writefull offers tools to help researchers with their academic writing. Using a technique called Deep Learning (a strand of Artificial Intelligence), Writefull gives researchers automatic feedback on their text. Writefull uses language models that have been trained on millions of published scientific papers. Thanks to this, its language suggestions – on grammar, vocabulary, punctuation, and more – follow the norms and patterns of academic writing, tailored to researchers’ needs.

How this helps publishers 

The idea of automatic language feedback is appealing not only to researchers, but also to publishers. Editors often have to reject manuscripts because of poor English – sometimes after a time-consuming peer-review process. Letting authors revise the language of their manuscripts at pre-submission can save time and increase acceptance rates.

Pilot Hindawi – Writefull

Hindawi is the first publisher to integrate Writefull into their submission page. Authors who are about to submit a manuscript see a pop-up inviting them to use Writefull to revise their text. If authors accept and upload their document, Writefull shows them their text with language suggestions in-line, and an overview at the top. The report is interactive: authors can click suggestions to accept them. After this, they can download their revised document, and continue to the journal to submit.

Early pilot results

Hindawi and Writefull started the pilot in early December last year, with authors submitting from six countries and to a select number of journals. Over a month into the pilot, both Hindawi and Writefull see positive results.

“We knew Writefull gave solid feedback when we started the pilot. But whether authors would accept the invitation to use Writefull, and then also take the time to revise their text, still had to be seen.”  Juan Castro, CEO of Writefull

A pilot is meant to trial a product, and a few technical hiccups were found early on. For example, researchers uploaded file types that weren’t supported by Writefull, and conversion from one file format to the other was error-prone. Writefull and Hindawi combined efforts in fixing these issues, and also adjusted the user interface where they saw authors got lost.

Mathias Astell (Director of Marketing at Hindawi):

“We are interested in facilitating as seamless and worthwhile an experience as possible for all our authors and so we are excited about the prospect of a tool like Writefull helping to level the playing field for researchers from non-English speaking countries by ensuring valid research doesn’t fall at the language hurdle. The initial insights from this pilot seem to show that researchers are finding Writefull useful in checking and correcting the language used in their manuscripts.”

Check out Writefull’s website for publishers or contact the team directly at publishers@writefull.com.

The post Writefull pilot by Hindawi: authors revise their English at submission appeared first on Digital Science.

]]>
Digital Science Invests in Deep Learning Language Platform Writefull https://www.digital-science.com/blog/2019/04/digital-science-invests-in-deep-learning-language-platform-writefull/ Tue, 30 Apr 2019 10:18:37 +0000 https://www.digital-science.com/?p=31638 Writing on a computer today, most of us now expect our writing support tools to offer grammar and spell checkers. These features use hard-coded rules to assess if a sentence is correct following the rules of English. The problem with this rule-based approach is that in many cases the rules are not clearly defined. For […]

The post Digital Science Invests in Deep Learning Language Platform Writefull appeared first on Digital Science.

]]>
Writing on a computer today, most of us now expect our writing support tools to offer grammar and spell checkers. These features use hard-coded rules to assess if a sentence is correct following the rules of English. The problem with this rule-based approach is that in many cases the rules are not clearly defined. For example, in the use of prepositions, we sit ‘on’ a dining chair, and yet we sit ‘in’ a rocking chair. In other cases, rules do exist, but they fail to address how language is really used.

 

Today, we are excited to announce our investment in Writefull – a deep learning language platform applied to discipline-specific scientific texts to help improve the clarity of written English. We believe Writefull will help authors to express their work more clearly before they submit articles for publication, especially those who have English as their second language (and quite a few of us native speakers too for that matter). In addition, publishers will have a service to help relieve the administrative burden on editors, maintain house style and support quality control.

“When we first met the founders they demonstrated Writefull highlighting a sentence that read “…the tall mountains and high trees”. Although grammatically correct, a native speaker (subconsciously) wouldn’t describe mountains as “tall” and trees as “high” but instead as “…high mountains and tall trees”… Writefull points users to change this. We were impressed.”

Writefull helps authors improve the clarity of their work. It suggests improvements to grammar and spelling and to academic language usage such as sentence structures in scientific writing, discipline-specific vocabulary and appropriate word choice. These suggestions are based on real-world, context-specific usage rather than on a fixed set of grammatical rules.

When we first met the founders of Writefull, Juan and Alberto, they demonstrated the AI highlighting a sentence that read “…the tall mountains and high trees”. Although the sentence is grammatically correct, to a native speaker something does not sound quite right about it. The nuance of English is such that we (subconsciously) wouldn’t describe mountains as “tall” and trees as “high”. The recommendation produced by Writefull’s Deep Learning and N-gram approach pointed the user to change this to “…high mountains and tall trees”. Language is full of such aspects of usage, and these don’t have fixed rules that can be hard-coded in advance.
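To make the N-gram idea concrete, here is a minimal sketch of how usage frequencies alone can favor “high mountains” over “tall mountains”. It is an illustration only: the five-sentence corpus is invented, the function names are hypothetical, and it does not represent Writefull’s actual models, which are described as being trained on far larger collections of text.

```python
from collections import Counter

# Toy corpus, invented purely for illustration (assumption); a real
# system would rely on counts from a very large body of published text.
corpus = [
    "the high mountains rose above the tall trees",
    "tall trees lined the road to the high mountains",
    "we climbed the high mountains at dawn",
    "a tall tree stood beside the high wall",
    "the tall mountains were visible from the coast",
]

def bigram_counts(sentences):
    """Count adjacent word pairs (bigrams) across all sentences."""
    counts = Counter()
    for sentence in sentences:
        words = sentence.lower().split()
        counts.update(zip(words, words[1:]))
    return counts

def preferred_modifier(counts, candidates, noun):
    """Return the candidate that most often directly precedes `noun`."""
    return max(candidates, key=lambda word: counts[(word, noun)])

counts = bigram_counts(corpus)
suggestion_mountains = preferred_modifier(counts, ["tall", "high"], "mountains")
suggestion_trees = preferred_modifier(counts, ["tall", "high"], "trees")
print(suggestion_mountains, "mountains;", suggestion_trees, "trees")
```

No rule about “tall” or “high” is written anywhere in this sketch; the preference comes entirely from how often each pairing occurs in the text it has seen.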

We were impressed.

In making our investment in Writefull, we now have a solution that steps beyond rules-based approaches to cover style and usage, and applies that technology specifically to scientific writing, training the AI to understand the vocabulary and style within that domain.

The founders, with their machine learning and AI backgrounds, have a feature-packed product roadmap and we here at Digital Science look forward to working with them to help the rest of us express our ideas more clearly.

Read our full press release.

 

The post Digital Science Invests in Deep Learning Language Platform Writefull appeared first on Digital Science.

]]>
Catalyst Grant Awardee, Writefull, Launches New Chrome Extension https://www.digital-science.com/blog/2017/01/catalyst-grant-awardee-writefull-launches-new-chrome-extension/ Tue, 17 Jan 2017 13:34:34 +0000 https://www.digital-science.com/?p=23171 One of our latest Catalyst Grant awardees, Writefull, today launched a new Chrome extension. Founded by Juan Castro (PhD in Artificial Intelligence) and Alberto Villar (MSc in Machine Learning), Writefull helps users write with more confidence by allowing them to check words and phrases against databases of correct language, such as Google Scholar and Google […]

The post Catalyst Grant Awardee, Writefull, Launches New Chrome Extension appeared first on Digital Science.

]]>
One of our latest Catalyst Grant awardees, Writefull, today launched a new Chrome extension.

Founded by Juan Castro (PhD in Artificial Intelligence) and Alberto Villar (MSc in Machine Learning), Writefull helps users write with more confidence by allowing them to check words and phrases against databases of correct language, such as Google Scholar and Google Books. Using Natural Language Processing to extract patterns in written language, they offer accurate grammar corrections and sentence replacements specific to academic writing.

Writing strong academic texts is challenging, particularly for non-native speakers of English, who make up the majority of researchers worldwide. A great deal of time can often be lost on editing and checking writing. Writefull speeds up this process, allowing researchers to work more efficiently.

Writefull can be used in a researcher’s own working environment, as it operates in any writing tool such as Overleaf, Microsoft Word and Gmail. Their new Chrome extension provides better support for countries like China, and for institutions (such as universities) that need to set a proxy to connect to the Internet.

Together with this Chrome extension they are publishing a new desktop version. In both the extension and the desktop version they have redesigned the UI with a new theme that makes it easier to digest frequency results. They have also added specialized input widgets that simplify composing queries for some commands like /compare and /translate.


Juan Castro says of the news:

“We are so pleased to launch a Chrome extension for Writefull today. Thanks to the Catalyst Grant, Writefull is now one click away from millions of devices running Google Chrome.”

The post Catalyst Grant Awardee, Writefull, Launches New Chrome Extension appeared first on Digital Science.

]]>