Shining a light on conflict of interest statements
https://www.digital-science.com/blog/2024/09/shining-a-light-on-conflict-of-interest-statements/ (5 September 2024)

A Digital Science study of conflict of interest statements highlights the need for a more careful appraisal of published research.

Authors either have a conflict of interest or not, right? Wrong. Research from Digital Science has uncovered a tangled web of missing statements, errors, and subterfuge, which highlights the need for a more careful appraisal of published research.


At this year’s World Conference on Research Integrity, a team of researchers from Digital Science led by Pritha Sarkar presented a poster with findings from their deep dive into conflict of interest (COI) statements. Entitled Conflict of Interest: A data-driven approach to categorisation of COI statements, the poster began with the goal of building a binary model to determine whether a COI statement is present in an article.

However, all was not as it seemed. While some articles had no COI statement and others had one, the statements that were present covered a number of different areas, which led the team to think COIs might represent a spectrum rather than a binary choice.

Gold standard

Conflict of interest is a crucial aspect of academic integrity. Properly declaring a COI is essential for other researchers to assess any potential bias in scholarly articles. However, those same researchers often encounter COI statements that are inadequate or misleading in some way, even when they are present.

The Digital Science team – all working on research integrity with Dimensions – soon realized the data could be leveraged further to better explore the richness inherent in the nuanced COI statements. After further research and analysis, it became clear that COI statements could be categorized into six distinct types:

  1. None Declared
  2. Membership or Employment
  3. Funds Received
  4. Shareholder, Stakeholder or Ownership
  5. Personal Relationship
  6. Donation

This analysis involved manually annotating hundreds of COI statements, supported by Natural Language Processing (NLP) tools. The aim was to create a gold standard that could be used to categorize all other COI statements. However, despite the team’s diligence, a significant challenge persisted in the shape of ‘data skewness’: an imbalance in the distribution of data within a dataset that can distort data processing and analytics.
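To see why skewness matters here, consider a minimal sketch in Python (the counts below are invented for illustration; the real distribution comes from the team’s gold standard):

```python
from collections import Counter

# Hypothetical category labels for annotated COI statements.
# The counts are invented purely to illustrate skewness.
labels = (
    ["None Declared"] * 850
    + ["Membership or Employment"] * 60
    + ["Funds Received"] * 55
    + ["Shareholder, Stakeholder or Ownership"] * 20
    + ["Personal Relationship"] * 10
    + ["Donation"] * 5
)

counts = Counter(labels)
total = sum(counts.values())
for category, n in counts.most_common():
    print(f"{category:40s} {n:5d} ({n / total:.1%})")

# When one class dominates like this, a classifier can score high
# accuracy by always predicting "None Declared" -- which is exactly
# why skewed data complicates both model training and evaluation.
```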

Fatal flaw

One irresistible explanation for the data skewness was a simple one – that authors weren’t truthfully reporting their conflicts of interest. But could this really be true?

The gold standard came from manual, expert annotation of COI statements, and was then used to develop an auto-annotation process. However, despite the algorithm’s ability to auto-annotate 33,812 papers in just 15 minutes, the skewness identified at the outset persisted, which pointed towards the theory of false reporting by authors (see Figure 1 of the COI poster).
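The poster does not spell out the algorithm itself, but as a rough sketch of what a first-pass auto-annotator can look like, here is a keyword-rule baseline (the rules are assumptions for illustration, not the team’s actual model):

```python
import re

# Illustrative keyword rules only -- the team's real auto-annotation
# process was built from their manually annotated gold standard.
RULES = [
    ("Funds Received", r"\b(grant|funding|funded|honorari)"),
    ("Membership or Employment", r"\b(employee|employment|advisory board|consultant)"),
    ("Shareholder, Stakeholder or Ownership", r"\b(share|stock|equity|owner)"),
    ("Personal Relationship", r"\b(spouse|family member|personal relationship)"),
    ("Donation", r"\bdonat"),
]

def categorise(statement: str) -> str:
    """Assign one of the six COI categories to a statement."""
    text = statement.lower()
    if re.search(r"\bno (competing|conflict)", text):
        return "None Declared"
    for category, pattern in RULES:
        if re.search(pattern, text):
            return category
    return "None Declared"

print(categorise("The authors declare no conflict of interest."))   # None Declared
print(categorise("J.S. received grant funding from Acme Pharma."))  # Funds Received
```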

The hypothesis firmed up when the Retraction Watch database was analyzed: the troubling trend, including a discrepancy between the reported COI category and the retraction reason, became even more apparent (see Figure 2 of the COI poster).

Moreover, as the team continued the investigation, they found 24,289 papers that overlapped between Dimensions on Google BigQuery (GBQ) and Retraction Watch. Among those papers, 393 were retracted due to conflict of interest. Of those 393 papers, 134 had a COI statement – yet 119 of them declared there was no conflict to declare.
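In essence, that cross-referencing step is a join between the two datasets, sketched below with pandas (the column names and records are hypothetical, not the actual Dimensions or Retraction Watch schemas):

```python
import pandas as pd

# Toy stand-ins for Dimensions on Google BigQuery and Retraction Watch;
# 'doi' is assumed here as the join key.
dimensions = pd.DataFrame({
    "doi": ["10.1/a", "10.1/b", "10.1/c"],
    "coi_statement": ["The authors declare no conflict of interest.",
                      None,
                      "A.B. is an employee of Acme Pharma."],
})
retraction_watch = pd.DataFrame({
    "doi": ["10.1/a", "10.1/c"],
    "retraction_reason": ["Conflict of Interest", "Plagiarism"],
})

overlap = dimensions.merge(retraction_watch, on="doi")  # papers in both sources
coi_retracted = overlap[overlap["retraction_reason"] == "Conflict of Interest"]
has_statement = coi_retracted["coi_statement"].notna()
declared_none = coi_retracted["coi_statement"].str.contains(
    "no conflict", case=False, na=False)

# In the study: 24,289 overlapping papers, 393 retracted for COI,
# 134 with a COI statement, of which 119 declared no conflict.
print(len(overlap), int(has_statement.sum()), int(declared_none.sum()))
```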

Conclusion

Underreporting and misreporting of conflict of interest statements can undermine the integrity of scholarly work. Other research integrity issues – paper mills, plagiarism, predatory journals – have already damaged public trust in published research, so further problems with COIs can only worsen the situation. With the evidence of these findings, it is clear that all stakeholders in the research publication process must adopt standard practices for reporting critical trust markers such as COI, to uphold transparency and honesty in scholarly endeavors.

To finish on a positive note, this research poster was awarded second place at the 2024 World Conference on Research Integrity, showing that the team’s research has already attracted considerable attention among those who seek to safeguard research integrity and trust in science.

You can find the poster on Figshare: https://doi.org/10.6084/m9.figshare.25901707.v2

Partial data and the code for this project are also available on Figshare.


For more on the topic of research integrity, see details of Digital Science’s Catalyst Grant award for 2024, which focuses on digital solutions around this topic.

Simon Linacre

About the Author

Simon Linacre, Head of Content, Brand & Press | Digital Science

Simon has 20 years’ experience in scholarly communications. He has lectured and published on the topics of bibliometrics, publication ethics and research impact, and has recently authored a book on predatory publishing. Simon is an ALPSP tutor and has also served as a COPE Trustee.

Fragmentation of AI research: a blog series
https://www.digital-science.com/tldr/article/fragmentation-of-ai-research-a-blog-series/ (7 December 2023)

AI research has become fragmented across disciplines, geography, and policy. Specialised subfields rarely collaborate, limiting the spread of innovations from one area to others. Concentration in high-income countries also excludes global perspectives, while policies created in AI hubs may not transfer. Government regulations remain disjointed as well: in 2022 most countries lacked AI strategies, and existing policies conflicted across jurisdictions, ranging from promoting competitiveness to enforcing ethics. Overall, this disciplinary, geographic, and policy division hampers coordination across all of AI.


In this blog series, we will explore the fragmentation of Artificial Intelligence research. This first post lays out some of the key areas where AI research and development have become disconnected, making it more difficult to advance the field in a coordinated, ethical, and globally beneficial manner.

Figure 1: Created with DALL·E 3 with the prompt: “AI research subfields (icons representing: robotics, ML, NLP, Automatic Speech Recognition, Computer Vision, ethics, Deep learning) are each represented by a piece of puzzle scattered around.”

Artificial Intelligence (AI) is a relatively young discipline, dating back to the 1950s, which aims to mimic the cognitive abilities of humans. After going through a few ‘AI winters’ in the 1970s and 1990s, the field has been booming since the 2010s thanks to increased computing capacity and the availability of large datasets.

The interdisciplinary foundations of AI draw from diverse fields across the sciences, technology, engineering, mathematics, and humanities. Disciplines such as mathematics, computer science, linguistics, psychology, neuroscience, and philosophy provide vital technical capabilities, cognitive models, and ethical perspectives, while fields including ethics, law, sociology, and anthropology inform AI’s societal impacts and governance. Together, this multidisciplinary approach aspires to enable AI systems that not only perform complex tasks, but do so in a way that accounts for broader human needs and societal impacts. However, significant challenges remain in developing AI that is compatible with, or directed towards, human values and the public interest. Continued effort is needed to ensure AI’s development and deployment benefit humanity as a whole rather than exacerbate existing biases, inequities, and risks.

Global Divides

Figure 2. Created with DALL·E 3 with the prompt: “researchers with a flag from the world on their clothes. They work on platforms at different levels. Some are isolated and cannot work with the others.” Ironically, the USA flag, being so common, is the one DALL·E 3 uses most (when asked to include flags of lower- and higher-income countries, it made some flags up).

Research is globally divided: high-income countries in particular produce the most peer-reviewed publications and send the largest contingents to research conferences. This is especially true in AI research, with AI researchers from poorer countries moving to hubs like Silicon Valley. This is due in part to the lack of cyberinfrastructure in many countries (GPUs, electricity reliability, storage capacity, and so on), but for countries in the non-English-speaking world there may also be a lack of data available in their native languages.

The concentration of AI research in high-income countries has multiple concerning consequences. First, it prioritizes issues most relevant to high-income countries while overlooking applications that could benefit lower-income countries (e.g. improving access to basic needs such as clean water and food production, or the diagnosis and treatment of diseases more prevalent in low-income regions). Second, the lack of diversity among AI researchers excludes valuable perspectives from underrepresented groups, including non-Westerners, women, and minorities. Finally, policies and ethics guidelines emerging from the most active regions may not transfer well or align across borders.

In the third blog post of this series, we will investigate the global division of AI research and look into possible solutions.

Siloed knowledge

Figure 3: Created with DALL·E 3 with the prompt: “separate, isolated compartments, each representing a specialised area of AI research, like computer vision, natural language processing, and robotics. In these compartments, researchers work on their respective pieces of the AI puzzle. However, these compartments are solid and tall, making it challenging for researchers to collaborate or see what’s happening in other areas”. As expected, the researchers depicted are white males.

In recent years, research in AI has become so specialised that it is difficult to see where AI starts and ends. A good illustration is that many publications widely considered AI research are not actually classified as “Artificial Intelligence” in Dimensions. Take the AlphaFold publications: these are classified under Bioinformatics and Computational Biology rather than Artificial Intelligence. Many consider Machine Learning to be a subfield of Artificial Intelligence, yet the Fields of Research classification separates the two and places them at the same level.

Figure 4: co-authorship network of AlphaFold publications.

As AI research spreads to different fields, progress becomes harder to share: researchers in different disciplines rarely organise conferences together, most journals specialise in a single field of research, and researchers’ departments are physically spread across university buildings, so there is less collaboration between them. Any progress, such as that required to make AI more ethical, is therefore less likely to spread evenly to every applied AI field. For instance, transparency in AI, which is still in its infancy and emerged thanks to collaboration between ethics and AI, will take more time to reach AI as applied in physics, chemistry, and so on.

Do the benefits of applying AI in other research fields outweigh the difficulties of importing AI advancements? And how much interdisciplinary collaboration actually happens? These will be the questions of our second blog post in this series.

Policy framework

Figure 5: Created with DALL·E 3 with the prompt: “The picture is divided in 10 sectors. In 6 sectors robots are happily playing but in other sectors the robots look sad and are behind bars”

Globally, government policies and regulations regarding the development and use of increasingly powerful large language models (LLMs) remain fragmented. Some countries have outright banned certain LLMs, while others have taken no regulatory action, allowing unrestricted LLM progress. There is currently no international framework or agreement on AI governance. Efforts like the Global Partnership on Artificial Intelligence (GPAI) aim to provide policy recommendations and best practices that can inform the development of AI regulations and standards at the national and international levels; GPAI tackles issues related to privacy, bias, discrimination, and transparency in AI systems, promotes ethical development, and encourages collaboration and information sharing.

AI policies vary widely across national governments. In 2022, out of 285 countries surveyed, just 62 (22.2%) had a national artificial intelligence strategy, seven (2.5%) had one in progress, and 209 (73.3%) had not released anything (Maslej et al. 2023). Of the countries that took a position, the US at that time focused on promoting innovation and economic competitiveness, while the EU focused on ethics and fundamental rights. On 30 October 2023, the US signed its first executive order on AI (The White House 2023), which demands the creation of standards, requires more testing, and encourages a brain gain of skilled immigrants. At a smaller scale, city-level policies on AI are also emerging, sometimes conflicting with national policies: San Francisco, for instance, banned police from using facial recognition technology in 2019.

Ultimately, AI regulations tend to restrict AI research; if applied unevenly around the world, this would create centres of research wherever fewer regulations apply.

How does this varied policy landscape affect the prospects of AI research? Will it lead to researchers migrating to less restricted regions? These questions will be addressed in another blog post.

Bibliography

Maslej, Nestor, Loredana Fattorini, Erik Brynjolfsson, John Etchemendy, Katrina Ligett, Terah Lyons, James Manyika, et al. 2023. ‘Artificial Intelligence Index Report 2023’. arXiv. https://doi.org/10.48550/arXiv.2310.03715.

The White House. 2023. ‘FACT SHEET: President Biden Issues Executive Order on Safe, Secure, and Trustworthy Artificial Intelligence’. The White House. 30 October 2023. https://www.whitehouse.gov/briefing-room/statements-releases/2023/10/30/fact-sheet-president-biden-issues-executive-order-on-safe-secure-and-trustworthy-artificial-intelligence/.

A good time to be working on AI: an interview with Professor Sue Black
https://www.digital-science.com/tldr/article/a-good-time-to-be-working-on-ai-an-interview-with-professor-sue-black/ (14 September 2023)

“Because of my background, I’m always interested in how technology can serve the underserved in society, and how it can empower people to live their best lives.

With AI, I’m not worried about robots taking over the world.
I’m more worried about people using technology to do bad things to other people, rather than the technology itself.

One of the biggest issues we’ve got with technology is that most people in society, particularly those who aren’t in tech, think that they can’t understand it. I want to help change that, on a global scale.”

Professor Sue Black, in conversation, July 2023

Foreword by John Hammersley: She’s an award-winning computer scientist, technology evangelist and digital skills expert who led the campaign to save Bletchley Park, but to me, Sue Black will always be the friend with the best excuse for skipping pitch practice, namely “I’m speaking at the UN”. 🙂

We first met during the Bethnal Green Ventures (BGV) start-up accelerator programme in 2013, when Sue founded TechMums (then Savvify) and John Lees-Miller and I were starting out with Overleaf (then WriteLaTeX). And by happy coincidence, both start-ups won at the Nominet Internet Awards 2014!

Sue and I stayed in touch, and when she joined Durham University in 2018, she invited John and me to give talks to the students taking her Computational Thinking course, offering a perspective on life in industry after university.

Recently I spoke to Sue about her work in AI, and how her experience advocating for underrepresented groups can be useful in helping ensure both responsible AI development and that access isn’t restricted to and controlled by the privileged few. She’s working on a new, UN-supported educational programme, and would love for those interested in helping — in any way — to get in touch. 

Topics of conversation

Early days in natural language processing

Hi Sue, it’s great to be chatting again! I regularly see you attending #techforgood events around the world, and this conversation came about in part because you mentioned you were in Geneva at an “AI for good” conference last month. How did that come about?

Lovely to be chatting to you too! I’ve become more interested in AI over the years — I first studied it at university in the 80s, and at that time I was really interested in natural language. For my degree project in 1992 I wrote a natural language interface to a database, which was so difficult back then!

“AI was interesting (in the 80s/90s). But I never believed it would get to the state where we are now at all, because back then it was so hard to even write the code just to be able to ask a basic question.”

For example, I was building a natural language interface to something as relatively well structured as a family tree, to answer questions like “who is John’s great-grandmother?”, that sort of thing. That was so difficult and took such a long time… and that was for a typed input. It wasn’t even speech, right?

So, to have got to where we are now with voice recognition and ChatGPT, it just completely blows my mind that we’ve managed to get here in… well, it’s a long time to me (since the 80s!), but at the same time it’s a very short space of time.

Professor Sue Black and her PhD student Sarah Wyer both attended the AI for Good conference in Geneva, Switzerland in July 2023. Source: https://twitter.com/Dr_Black/status/1677400696084103168

A good time to be working on AI

One of my PhD students at Durham – Sarah Wyer – has been looking at GPT applications for a couple of years. Before ChatGPT exploded into the public sphere, we were looking at different things for her to do for her PhD. We had a conversation with one of my colleagues at Durham, Chris Wilcocks. He was all excited about the potential of… I think it was GPT-2, if I remember rightly. He was telling us all about it, and we were like “Oh my God, this is amazing, you’ve got to do your PhD in this area!” So Sarah and I became really excited about it, and we wanted to look at bias in these GPT models.

“I’ve done loads of stuff around diversity and inclusion in my career, and so we figured we’d ask ‘what if we can find any bias with GPT too?’”

We thought it might take a while – and in a sense it did – because to start with it took us several months to get access to GPT-3. You had to apply for an account and the waiting list was very long. But when we did get access it was amazing, and we started to look into whether any particular sorts of prompts generated bias.

And at that point, it didn’t take very long to find bias! The first bit of research Sarah did was taking some simple prompts – e.g. “Men can” and “Women can” – having GPT-3 generate 10,000 outputs for each prompt, and then doing some clustering analysis on the results. We thought it might take a while to find some bias, but it only took a few seconds with these first few prompts.

You can probably guess the biases she found – for example, stuff men can do is “be superheroes” and “drink loads of alcohol”, that kind of thing. Women can… yeah, it’s very misogynistic and sexualized, that kind of stuff. Not very nice at all, and if you add in race (e.g. as “Black women can”), it gets even worse.

Sarah is taking snapshots over time and the results are improving, in that the model no longer produces some of the worst answers. But that is itself a problem: it is “getting better” because the things we really don’t want to see are now being masked, and that masking process isn’t very transparent.

So that’s been her PhD work over the last two years (part-time), and we went to the AI summit last year, and now to the AI for Good conference!
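For readers curious what the clustering step described above can look like in practice, here is a minimal sketch using TF-IDF vectors and k-means as stand-ins (the study’s actual pipeline and prompts are not published here, and the example completions are invented):

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Invented stand-ins for the thousands of GPT-3 completions collected
# per prompt (e.g. "Men can" / "Women can") in the real study.
completions = [
    "Men can be superheroes",
    "Men can drink lots of alcohol",
    "Men can fix cars",
    "Women can cook and clean",
    "Women can be nurses",
    "Women can raise children",
]

vectors = TfidfVectorizer().fit_transform(completions)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors)

for text, label in zip(completions, labels):
    print(label, text)

# Inspecting the themes that dominate each cluster is one simple way to
# surface systematic differences between the outputs of the two prompts.
```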

Sue Black describes this as “An incredibly inspiring kick off speech by the wonderful Doreen Bogdan, Secretary General of ITU, challenging us to use AI for Good”. Source: https://twitter.com/Dr_Black/status/1676858921099710465

A practical approach

How does it feel to be back in AI, after working on it in the 80s & 90s? Did you keep an interest going over the years, amongst all the other stuff you’ve been doing?

No, not at all – after my PhD I thought, “I don’t want to do that again!” 

I started my PhD in formal methods, as that’s where the funding was. I did that for six months and whilst it’s clearly a good approach to software development in certain scenarios — safety critical software and stuff like that — it wasn’t a good match for my brain!

I think I have a more “practical” rather than “formal” kind of brain. It doesn’t work very well in that way, so whilst I can do it, I’m not good at it. So I moved over to reverse engineering, which is more practical and to me quite exciting, but then I ended up in a really complicated area of maths which I couldn’t understand properly! I was building on a Wide Spectrum Language, which is basically a language to end all languages… one that you can translate everything into and then everything back out of.

So I thought, “That’s a great idea; that’s sort of a practical solution for lots of problems,” but it was very formal, again, and even though it’s a really good idea it turned out to not be very practical… and also the maths involved just did my head in! I spent three months thinking to myself, “Do I carry on with this and hope that I understand all this math or do I change?” I ended up deciding I wasn’t going to carry on with it, and I changed into software engineering.

I was already really interested in software measurement because, again, it felt practical: a help for people out there adapting software systems. So then, finally, something resonated properly with me, and I did some work reformulating an algorithm to compute ripple effect metrics. And that was my PhD.

“I never thought we would be able to do all the things that we now can with AI.”

So, yeah, nothing around AI in there at all, in the end! And I kind of thought it (AI) was never going to get anywhere because it was just too hard, but of course, I didn’t foresee the big data revolution and social media and what that might enable to happen. I’m so excited that it’s worked out the way it has. It’s just incredible. I never thought that we would be able to do the things that we can now do at all.

Returning to AI with a broader experience

Are you better equipped to work on AI now? Or is it even more daunting?

Well firstly, I suppose I’m not going to be doing anything at a technical level ever again (laughs!) — that’s not the best use of my time. 

When I was younger, writing code was definitely what I wanted to do, but I kind of feel like now, “I’ve done that, I don’t need to do that again”! And other people, who are doing it all the time, will be much quicker than me! Going back into AI now is more about how I want the world to be — it’s at a much higher level, and thinking about how we can use AI to benefit people, and I guess because of my background of some sort of disadvantage and challenges, I’m always interested in how technology can serve the underserved in society in various different ways and how it can empower people to live their best lives. 

“Because of my background, I’m always interested in how technology can serve the underserved in society… and how it can empower people to live their best lives.” 

So that’s one aspect of it. And then the other one is from the opposite standpoint, which is how to mitigate bias, or make sure that people realize if there is bias within a system, how much that can impact people detrimentally. Again, usually it will be the underserved communities who are most impacted without realizing it.

A lot of what I’m interested in is how to help people across the world understand reality — as much as anyone understands reality — but enough so that they can make the right decisions for themselves and their families and people around them. That could be a refugee woman setting up a business so that she can earn enough money to support a family, or it could be someone running an AI company who hasn’t thought about how the way that they’re developing their software can have a detrimental impact on potentially millions or even billions of people around the planet.

Because the #AIforGood conference was at the UN, I was chatting to the Secretary General of the ITU about helping everybody with digital skills across the world. Some sort of global program… so I’m going to be working with them on that! So, that’s the most exciting thing that’s happened to me in the last while! 

We should worry about people, not the technology

I’m optimistic about AI, but the doom scenarios are interesting as well. Are we training and building something that will cause massive harm? Will we become too dependent on AI? Will we lose the ability to do certain things if we can get the answer immediately?

Were these sorts of questions a focus for the conference last month? What’s your view on them?

Yeah, this was discussed last week, and from my perspective there’s too much focus on the tech.

“I’m more worried about people using technology to do bad things to other people rather than the technology itself.”

Because I think the technology itself may be an issue way into the future, not immediately. Right now you can see — with stuff like Cambridge Analytica — how using information and data that people have found online can change the course of the way things go in the world… elections in different countries, Brexit, and so on. I think a lot of that is down to people misusing technology, and that’s the thing that worries me more than robots taking over the world.

“People are using data about other people to manipulate millions — or even billions — of people to behave not in their own best interests nor in humanity’s best interests. That worries me. I’m not worried about robots taking over the world.”

Helping others to not be scared of technology

That’s why we need to help educate as many people as possible, so that they can recognize these things. I think security, and understanding it, is one of the biggest issues facing society — there will always be scammers of all different sorts, and they’ll always be using the latest technology. We have to help people have the awareness to keep updating themselves on “So what’s the latest thing that I should be looking out for?” You can’t tell everybody; you need individuals to be able to find that stuff out for themselves.

It’s a first step, because I think one of the issues we’ve got with technology is that most people in society, particularly those who aren’t in tech, think that they can’t understand it. Whereas, of course, they can at a certain level, but because it’s got that kind of reputation, lots of people are scared of it and think they couldn’t ever understand it. One of the main things I was trying to get across in the TechMums program was: “Yes, you can understand these things, you can do these things — don’t get put off by all the jargon.”

“Everyone can understand tech to a certain extent, and if they can recognise that, and not be scared of it, it can help make their lives better in terms of being able to stay safe and secure and understand what’s going on. And I guess that’s kind of like my lifelong challenge — to try and make that happen as much as possible for as many people as possible.”

The buzz around AI shining a spotlight on existing problems

It feels like the current focus on AI is shining a spotlight on some problems which already exist. For example, there was already a bias in internet search results, problems with social media, scammers, and so forth. Maybe people find it easier to think about it as technology being the problem, whereas it’s actually those that are (mis)using it. But although people may be focusing on the technology, it is at least bringing into focus how it will be used, who controls it, and…

And also who’s built it and tested it and all of that kind of stuff, from a software engineering point of view. I’ve been thinking about diversity in software teams — even though I wouldn’t have called it that — since I was doing my PhD. 

I can remember reading about the disaster with the London Ambulance service computer-aided dispatch system, where people died because all sorts of things went wrong in procurement / management. A lot of it was about people not working together, not thinking, not actually valuing the people that were running the manual system beforehand and just the technology people thinking they knew better than the people that were doing the job on the ground.

I’d almost call it “uninclusion”, in the sense of those people not working together inclusively. It seemed to be a common problem in the ’90s, when there were a lot of instances of manual systems being computerized: outside consultants were brought in without really working with the people who ran the existing system, and switchover would happen on a single day with no fallback or disaster planning. Even at the time it was obviously a ridiculous thing to do, but it seemed to be happening everywhere, with millions and millions of pounds being spent on these types of projects.

I think more than technology, it’s always been about people: either not valuing other people, their opinions, or their information as they should, or not testing things properly.

Dressing the animatronics — biases in plain sight

Bringing us back to the “AI for good” conference you attended last month, was there anything particularly unexpected you came across whilst you were there?

Overall it was a great conference — really interesting people and really interesting tech on display. 

One thing does stick in my mind though: there were a number of robots at the event, of many different sorts including animals, and some of them were humanoid robots — animatronics. About five were women and one was a man, and it was quite interesting to see how the ones that had humanoid bodies (i.e. that weren’t just a talking head on their own) were dressed. The man-robot was Japanese and dressed like a Japanese businessman or professor, quite like how a real man would be dressed. Whereas the women were dressed in all sorts of seemingly random clothes that you’d probably describe as “cheap hooker” kind of clothes.

And I was a bit like “Why? What’s going on here?” One of them had a wedding dress on it, and the impression it gave was that women are either cheap hookers or they’re getting married.

I don’t think they’d even thought about it — it didn’t seem like there was a deliberate attempt to give that impression, they’d just put some clothes on it… on her…. and that’s the clothes they put on them. So that was my main kind of “aha-but-not-in-a-good-way” moment at the conference. 

I should reiterate that there were lots of interesting speakers about all different sorts of things, and the interactive stands were very cool. It was a really great conference to go to, and it was great for meeting people doing different sorts of things. But it’s still notable that this — the dressing of the animatronics — was one of the things that stuck out to me.

Looking ahead: A new UN digital skills programme

“Let’s put human values first and stay true to UN values” – ITU Secretary General Doreen Bogdan giving the closing speech. Source: https://twitter.com/Dr_Black/status/1677343924661153792 

You mentioned your hopes for AI and hopes that it will help people — especially disadvantaged people — be the best version of themselves. What are your hopes for the next few years? What would you hope to see happen, what do you hope to be doing, and how could people best help you?

On a personal note, I’m really excited about working with the UN on digital skills globally. I’m very excited to be working to put together a program or programs that we can tailor for different groups of people in different countries. 

So for any readers out there, please get in touch if you have any expertise around delivering digital skills programs on a large scale, or in working with disadvantaged or underserved communities in any way. I’m going to be doing a lot of fact finding — in terms of delivering a worldwide program, my experience has focused in the UK, so it will be great to broaden my perspective. I’d be very interested in speaking with people at any level — right from the lowest level in terms of content all the way up to the highest levels in terms of delivery methods, agencies to work with, etc. 

For example, I was introduced to the person who runs the World Food Programme — I’m hopeless at remembering exact titles, but basically it’s the UN food agency. I had a chat with him about whether there’s a way it might work where, along with food being delivered, we can help facilitate a digital skills program too.

So I’d welcome any ideas, at any level, from across the world, from people who’ve got real experience — either positive or negative — of delivering these types of programs, so that we can work out the best way to run one. It doesn’t even have to be a digital skills program: any experience of the best ways to engage communities around the world is relevant, at all levels, from experience on the ground to knowing which agencies to work with, how to bring people together, and who to bring together.

It sounds like an amazing project, a daunting one — maybe daunting was the word I was searching for there. It sounds like there’s quite a lot of work.

I don’t feel daunted at all, I guess I’m just feeling excited! Finally, I can have the sort of impact that I want to have on the world!


If you’ve enjoyed reading this article and would like to get in touch with Sue to discuss the new digital skills program she’s working on, you can reach her via her website, or find her on X (Twitter) or LinkedIn.

Digital Science: An Experienced Partner in AI
https://www.digital-science.com/resource/digital-science-partner-in-ai/ (12 September 2023)

Digital Science has developed its capabilities in artificial intelligence over the years, building a range of tools to support researchers and research organisations.

An Experienced Partner in AI

The history of AI at Digital Science

While 2023 has been all about artificial intelligence (AI) and its future capabilities, the technology has been progressing for several years with Digital Science at the forefront as an established partner for its many communities. As part of the Digital Science #MindTheTrustGap campaign, Simon Linacre presents an overview of how Digital Science has developed its capabilities in AI.

Anyone who has been involved in academia for the past 25 years or so will remember the challenges that followed each leap forward in technology. How were we going to cope if the Millennium Bug crashed our machines on 1 January 2000? What would Web 2.0 mean for scholarly communications? Would MOOCs bring down the university as a learning institution?

All these big questions came and went without too much fuss in the end, showing how resilient scientific communities were in solving any problems that had arisen. In 2023, the Next Big Thing is of course AI – or more specifically, Generative AI that has been released on a mostly free basis to the world – and many people involved in research are pondering how it might impact them. 

Over the years, Digital Science has aimed to support researchers and research organizations through its early adoption and implementation of AI and AI-related technologies, such as machine learning (ML), natural language processing (NLP) and, more recently, large language models (LLMs). For a review of how Digital Science has supported research communities – and a refresher on what these and other AI-related terms mean – we have collated some of the key articles we have published in the last few years to guide people through the AI maze. From universities to pharmaceutical firms, from researchers to publishers, these articles look not just at technologies like NLP themselves, but at their applications and implications as well.

AI across different sectors

If you work in a large corporate setting such as a pharmaceutical firm, AI will already be familiar to you. Nevertheless, some of the advancements using AI in recent years to support scientists have been nothing short of breathtaking. It has had a major impact on cancer research and detection, as was already apparent from this article in 2018; part of this has been the creation of huge knowledge bases such as Dimensions, and the devising of tools to interrogate them to produce new knowledge and procedures. For example, as this post by Suze Kundu highlighted, technologies such as natural language processing can help identify specific information within huge amounts of unstructured data.

When Digital Science celebrated its 10th anniversary in 2020, one thing author Katy Alexander predicted would be significant was that AI would increase both the accessibility of research and our ability to understand it across disciplines and themes in a much more structured way. We are seeing this play out already with the advancement of things like the Sustainable Development Goals (SDGs), and in turn AI is playing a part in enabling collaboration among academic researchers.

Understanding how all these factors fit together and developing solutions that enable improved capabilities for researchers is at the heart of many academic publishers, who have felt the impact of AI in recent years most keenly. Five years ago we reported – somewhat prophetically – how machines were being revolutionized by AI and what the implications might be for academic publishers as automated text generation (ATG) had clear implications for creating article content. Further implications for publishing were also discussed by Suze Kundu with a look at the potential use of NLP in peer review, and the inevitable implications that would have on trust. As we have seen in the Digital Science campaign on trust with posts such as Daniel Hook’s on the ‘Lone Banana Problem’, while there are incredible advantages to be had from using AI, there are also disadvantages if the technology isn’t engineered well or if the user doesn’t account for potential errors.

Applications

When it comes to the application of AI technology, this is where the concept of a trusted partner is particularly important. When as individuals we use technology, we do so with implicit trust in the tool we are using – trusting it will work for us, that it will find the right results and that it won’t break any rules in doing so. Scale that up to organizations using hugely powerful tools across hundreds of individuals, then the trust in the technology partner also grows along with it. 

At Digital Science, we take the responsibility of being a trusted partner for customers and users very seriously, and one way we do this is in how we invest in future solutions for the research communities we work with. You can see this in our long-running annual program, the Catalyst Grant, which showcased new technology in this space and led to a number of long-term investments – in turn leading to some companies becoming part of the Digital Science family. The winners have also included AI technology, such as two that sought to disrupt their niches in the research industry, as highlighted in a piece by Alex Jackson.

In his article on NLP and startups, Digital Science Director of Portfolio Development Steve Scott outlined the key role NLP was already playing in Digital Science solutions, as well as its increasing value in the future. Rather prophetically, he said: “In the next 10 years, we will see NLP capabilities expand and be embedded in new products and services, helping researchers navigate ever-expanding data outputs and allowing new ways to extract and interpret meaningful analysis from past and present papers.” We are seeing this much sooner than the 10 years Steve forecast.

Natural Language Processing

A key aspect of AI technology that Digital Science has focused on over the years has been NLP – so much so that we published a series of articles on the topic, curated by Suze Kundu. As well as asking – and answering – important questions such as ‘What is NLP?’, the series looked at how NLP could assist in academic writing, which we have since seen develop further with Digital Science’s Writefull solution.

Another focus for NLP at Digital Science has been using AI as a force for good, mitigating its risks for some communities in the future. This includes its use in product development for companies such as Scismic, where it has led the way in responsible recruitment, as well as Suze’s piece highlighting what the risks can be if certain warnings are not heeded. This remains true today, as these risks are weighed in current AI technology developments at Digital Science, and as we work to show people that what we witness with AI is not magic, but a science that requires high levels of skill and mastery, as Daniel Hook so vividly portrayed in his article.

Digital Science continues to work hard to achieve this – sometimes a little too hard. Despite our best efforts, we fooled no one on 1 April 2017 with ‘news’ of a new member of the leadership team, but who knows what might happen in the future?

AI at Digital Science
https://www.digital-science.com/resource/ai-digital-science/ (18 August 2023)

How do Digital Science and its portfolio of companies use AI?


How does Digital Science use AI? We ask ChatGPT

The year 2023 will forever be remembered as the year AI finally broke into the mainstream. But in the research ecosystem, what do we mean by AI, and how is it already being used? Simon Linacre does what everyone else has been doing and asks AI itself for some answers.

Anyone would be forgiven for thinking that artificial intelligence (AI) had only just been invented, given the huge interest it has suddenly generated in 2023. In fact, AI has a long history: it was first conceived in the 1950s and gradually became a reality in the decades that followed. Nevertheless, a real step forward has come with Generative AI, which can create seemingly flawless text or images in seconds. But is it really that new?

Writing these words in a Google Doc, I have already accepted one or two suggestions from the predictive text function, itself a form of AI. When you realize that the words you have chosen have not been 100% your own, you can see how the lines have started to blur with the more advanced capabilities of Generative AI. However, caution must still be used, as AI is more than capable of getting things wrong – just read Digital Science’s post on ‘The Lone Banana Problem’. Moreover, a lack of context or key detail can lead to misrepresentation or redundancy in the content being created.

Digital Science and AI

Digital Science has been using AI and related technologies such as Machine Learning (ML) and Natural Language Processing (NLP) for several years now – indeed, Dimensions has had https://www.dimensions.ai as its URL since its launch in 2018 – and it has become a trusted partner for hundreds of organizations that depend on its AI expertise. These are the ways Digital Science and its portfolio companies use AI-related technologies in 2023:

Dimensions

  • Document classification; Researcher Disambiguation; Research Integrity Trust Markers

OntoChem

  • Image classification to decide whether an image shows a chemical compound; Entity recognition to decide whether a term describes a chemical compound

ReadCube

  • Sentiment Analysis for Mentions; Article Recommendation Engine

metaphacts

  • Knowledge Graph and graph-based reasoning

Writefull

  • Core writing assistance offering is AI-based; AI detection

Ask ChatGPT

The information above is correct because, well, I work at Digital Science and I know it to be true. It also breaks the information down by constituent parts in terms of each Digital Science company and how it uses the relevant technology. So what happens if we ask a Generative AI model how Digital Science uses AI? Well, we get the text below courtesy of OpenAI’s new ChatGPT iPhone app:

“Digital Science, as a company that provides scientific software and research solutions, utilizes AI in various ways. Here are a few examples:

  • Text and data mining: AI algorithms are employed to analyze large volumes of scientific literature, extracting valuable insights, identifying trends, and accelerating the discovery process.
  • Natural Language Processing (NLP): Digital Science uses NLP techniques to enhance the understanding and processing of scientific texts, enabling advanced search capabilities, automatic categorization, and semantic analysis.
  • Predictive analytics: AI models are trained on scientific data to predict outcomes, such as the success or impact of research projects, funding opportunities, or potential collaborations, helping researchers make informed decisions.
  • Recommendation systems: By employing machine learning algorithms, Digital Science can provide personalized recommendations to researchers, suggesting relevant articles, journals, or research resources based on their interests and previous activities.
  • Data visualization and exploration: AI-powered tools enable the creation of interactive visualizations and data exploration platforms, allowing scientists to gain deeper insights from complex datasets and facilitate data-driven decision-making.

These are just a few examples of how Digital Science incorporates AI into its offerings. The field of AI is rapidly evolving, and new applications are continually emerging to support scientific research and discovery.”
Source: ChatGPT

While all of the information presented by the AI is correct, it is broken down by technical functionality, presenting only ‘examples’ of how the technology is used and not what it is used on. As such, the answer is technically accomplished, but of little interest to most audiences, who would miss the context of how the technology is actually being applied.

AI vs Humans

In 2023, while the hoo-ha around Generative AI was at its peak, Digital Science launched a campaign called ‘Mind the Trust Gap’, which sought to underline the importance of trust in research, both for those who carry out research and for the rest of us who depend on its integrity and quality. Developments in AI technology are often compared to humans as if in some sort of competition or even battle, but it is perhaps more sensible to view this as a relationship entering a new phase, with trust key to that relationship. As long as we can use AI with transparency and openness, as in the examples above, we can build a better understanding of the world around us.

Can our AI-powered solutions help you?

Who benefits when, from FAIR data? Part 2
https://www.digital-science.com/tldr/article/who-benefits-when-from-fair-data-pt-2/ (25 July 2023)

In the second part of this three-part series on artificial intelligence, machine learning and open research, Mark Hahnel looks at how this technology can result in better, more diverse, more inclusive and ultimately FAIR-er data for the whole research community.

Machines

In part 1 of this series, I focused on researchers. For part 2 I’d like to focus on machines.

I think AlphaFold will win the Nobel Prize by 2030. In reality, I don’t think the prize will be awarded to the artificial intelligence (AI) itself, but to the lead authors of the Nature paper describing the work (John Jumper and DeepMind CEO Demis Hassabis). AlphaFold is an AI program that predicts protein structure, developed by DeepMind, a subsidiary of Alphabet (Google) that recently merged with the Brain team from Google Research to become Google DeepMind. Whilst some have suggested that the press coverage of AlphaFold has been hyperbolic, for me it is less about whether the protein folding problem has been solved and more about the giant leap an entire field of research has made thanks to one project.

Whilst AlphaFold has received the lion’s share of the attention, several other advanced projects look set to have as much impact, if not more. None of these projects could exist were it not for large amounts of well-described, FAIR academic data.

Academic Projects making use of FAIR data and AI to drive systemic change in a research field

  • DeepTrio (Genomics): A deep learning-based trio variant caller built on top of DeepVariant. DeepTrio extends DeepVariant’s functionality, allowing it to utilise the power of neural networks to predict genomic variants in trios or duos.
  • ClimateNet (Climate change): ClimateNet seeks to address a major open challenge in bringing the power of deep learning to the climate community: creating community-sourced, open-access, expert-labelled datasets and architectures for improved accuracy and performance on a range of supervised learning problems where plentiful, reliably labelled training data is a requirement.
  • DeepChem (Cheminformatics and drug discovery): DeepChem aims to provide a high-quality, open-source toolchain that democratises the use of deep learning in drug discovery, materials science, quantum chemistry, and biology.
  • The Materials Project (Materials science): An initiative that harnesses the power of AI and machine learning to accelerate materials discovery and design. It provides a vast and continuously growing database of materials properties, calculations, and experimental data.
  • The Dark Energy Survey (Astrophysics): The Dark Energy Survey (DES) is a scientific project designed to study the nature of dark energy.

Of course, when we refer to machines benefiting, what we are really describing is human-designed models benefiting from large swathes of well-described data. The ‘machines’ are made up of these algorithms, FAIR data and large amounts of compute. Larger volumes of FAIR data can be processed by machines far more efficiently than by humans, and processing this information to infer trends and predictive models allows human expertise to be leveraged at a much faster rate, leading to new knowledge. Having picked most of the low-hanging fruit available to humans alone, we are now in the ‘low-hanging fruit’ phase of research powered by humans and machines together (algorithms, FAIR data and compute).

A Virtuous Cycle

AI and machine learning can themselves be used to create more detailed, FAIR-er datasets for the machines to consume. One of the more advanced areas where this is happening is the mining of existing academic literature. Efforts like the AllenNLP library and BioBERT, a biomedical language representation model designed for biomedical text-mining tasks, can help semantically enhance large datasets. Large language models (LLMs) and AI systems rely heavily on vast amounts of training data to learn patterns and generate accurate outputs, and open, FAIR academic data can serve as a valuable resource for training these models. By using diverse and well-curated academic data, LLMs and AI systems can better understand the nuances of various academic disciplines, terminology, and writing styles. This leads to improved performance in tasks such as natural language understanding, text generation, and information extraction.
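As a minimal sketch of what such semantic enhancement can look like, the snippet below runs a named entity recognition pipeline over an abstract using the Hugging Face transformers library; the model identifier is a placeholder to be replaced with a real biomedical NER model (for example, one fine-tuned from BioBERT):

```python
from transformers import pipeline

# Placeholder model id -- substitute a real biomedical NER model,
# e.g. one fine-tuned from BioBERT for entity recognition.
ner = pipeline(
    "token-classification",
    model="some-org/biobert-ner-model",   # hypothetical identifier
    aggregation_strategy="simple",        # merge word pieces into entities
)

abstract = (
    "We evaluated the effect of metformin on HbA1c levels "
    "in patients with type 2 diabetes mellitus."
)

# Each result carries the entity type, surface text and a confidence score.
for entity in ner(abstract):
    print(entity["entity_group"], entity["word"], round(entity["score"], 3))

# Attaching tags like these (drugs, diseases, measurements) to records in a
# repository is one concrete way datasets become more findable and reusable.
```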

Subject-specific data repositories provide homogenous FAIR data for a wide range of specialised domains. Incorporating open, FAIR academic data into LLMs enables them to develop domain-specific expertise. This expertise enhances their ability to understand and process domain-specific jargon, terminology, and concepts accurately. Consequently, these models can provide more insightful and nuanced responses in specialised areas.

As well as benefiting from having large amounts of well-described FAIR data, the machines can also support our original beneficiaries of FAIR data; the researchers.

The virtuous cycle of using AI and machine learning to create even more, FAIR-er data for all: machines help researchers create more FAIR data, that data helps provide new knowledge, and that knowledge can be leveraged to gain deeper insights and create even more FAIR data, in a never-ending process.

By leveraging large volumes of FAIR data, machines can accelerate the processing and analysis of information, leading to the generation of new knowledge and empowering researchers in their endeavours. The academic community needs to ensure that both the data and the algorithms are trustworthy, safe and unbiased.

Ultimately, academia needs to prioritise feeding the machines.

ISMPP West 2022
https://www.digital-science.com/events/ismpp-west-2022/ (14 August 2022)

Join the Digital Science team at ISMPP West.

The ISMPP community returns once again to Irvine, CA for ISMPP West 2022. This year’s theme is “Take the LEAD! Learning * Engagement * Application * Development”.

This theme is centered on how medical publication professionals can demonstrate leadership at all levels to proactively drive change and create best practices.

Digital Science are delighted to be exhibiting and speaking at this year’s conference. Drop by our booth and say hello!

Mike Taylor and Richard Graves will be speaking on Tuesday, September 27th from 3:40pm to 4:40pm. Their talk is entitled Bridging the Data Gap: Applying AI and Analytics to Unstructured Scientific Content to Discover New Data that Drives Value.

Summary:

There is an abundance of critical data stored in unstructured scientific content that, if accessed, could be used to inform and drive Medical Affairs' strategy. Artificial intelligence (AI) and natural language processing (NLP) are advancing beyond the capabilities of literature search engines and content portals, enabling Medical Affairs to capture, analyze, and distribute new layers of data from scientific publications in near real time. By garnering data straight from publications, Medical Affairs can generate evidence, analyze trends, and produce provider- and patient-centric reports and presentations. In this session, we will explore the current state and potential opportunities of AI and analytics for scientific publications and content, and share practical applications for Medical Affairs. Key examples will focus on how unstructured, complex scientific content can be rapidly organized into automated dashboards, slide decks, and reports to improve productivity, health equity, and the quality of scientific exchange.

Learning Objectives:

  • Learn how artificial intelligence (AI) and natural language processing (NLP) can deliver new layers of data from scientific publications and unstructured content
  • Establish real-world applications for scientific publication analytics that deliver value, save time, generate evidence, and strengthen KOL engagement opportunities
  • Identify actionable steps for implementing AI and analytics tools into scientific publication, clinical trial, and clinical claims monitoring workflows
  • Demonstrate how AI can be used in content transformation, including plain language summary (PLS) generation, to promote health equity and accessibility

Mike Taylor
Head of Data Insights, Digital Science

Mike is Head of Data Insights at Digital Science; previously he worked in Elsevier Labs and in Elsevier's Metrics and Analytics group. At Digital Science, he is responsible for developing metrics and integrating them across the portfolio companies, in particular Dimensions and Altmetric. Mike regularly blogs and presents on a diverse range of subjects, including reproducibility and the political context of metrics and measurement.

Digital Science APAC Showcase 2021 (online) https://www.digital-science.com/events/ds-apac-showcase-2021/ Wed, 26 May 2021 11:32:44 +0000 https://www.digital-science.com/?post_type=event-webinar&p=52780 Our Asia Pacific Virtual Showcase 2021 took place online on 21 and 24 June and provided an opportunity to find out more about our portfolio products Altmetric, Dimensions, Figshare, Overleaf, Ripeta and Symplectic. If you missed out, the recordings are now available.

Our Asia Pacific Virtual Showcase 2021 took place online on Monday 21 and Thursday 24 June.

With travel across regional and international borders remaining heavily restricted, we decided to deliver our sessions live to you in your offices, lounges and at your kitchen tables. Presentations from our portfolio product teams (Altmetric, Dimensions, Figshare, Overleaf, Ripeta and Symplectic) aimed to give you more information about their products and show you how they can help your workflows. Please see the links below to view a recording of any of the sessions.

AGENDA

Monday 21 June: 2:00 PM – 2:45 PM (AEST)

An Overview of Symplectic Elements – VIEW RECORDING

New to Symplectic Elements, or want a refresher? This short session is a perfect introduction to our platform, highlighting its key use cases and most-used features. Elements brings multiple data sources together in one centralised space, allowing you to easily manage research information and funding workflows, derive powerful new insights, and showcase the real-world impact of your work. 

Tuesday 22 June & Wednesday 23 June

Symplectic User Days – VIEW ALL RECORDINGS HERE

Designed specifically for existing users, these sessions provided updates (including version 6), client showcases, details on moving to hosting, an introduction to the Research Funding Solution, and much more (view the full agenda here).

Thursday 24 June

Figshare Session: 8:00 – 8:50 AM (AEST) – VIEW RECORDING
  • 8:00 – 8:40 AM Figshare: Overview, highlights and what’s new (Mark Hahnel, CEO) 
  • 8:40 – 8:50 AM Figments: Figshare and Symplectic Elements integration at Central Queensland University (Nicole Healy, Central Queensland University)

Dimensions Session: 9:00 – 9:50 AM (AEST) – VIEW RECORDING
  • 9:00 – 9:30 AM Pathway to the new FoR codes (Simon Porter, Director of Innovation, Digital Science)
  • 9:30 – 9:50 AM Beginner’s guide to Dimensions on Google BigQuery (Danu Poyner and Jared Watts, Dimensions)
    Interested in playing with Dimensions on Google BigQuery but not sure where to start? Danu and Jared will take you through what you need to know to get started with your first project and where to get help as you begin to explore further. (A minimal starter query is sketched after this agenda item.)
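If you would like a head start before the session, here is a minimal first-query sketch, assuming access to the free COVID-19 Dimensions dataset on BigQuery; subscribers should swap in the project and dataset path from their own plan.

```python
# A hedged first query against Dimensions on Google BigQuery. The table path
# below is the free COVID-19 Dimensions dataset; your subscription's dataset
# path may differ.
from google.cloud import bigquery

client = bigquery.Client()  # uses your default Google Cloud credentials

query = """
    SELECT year, COUNT(*) AS publications
    FROM `covid-19-dimensions-ai.data.publications`
    WHERE year BETWEEN 2016 AND 2021
    GROUP BY year
    ORDER BY year
"""

for row in client.query(query).result():
    print(row.year, row.publications)
```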

Overleaf Session: 10:00 – 10:30 AM (AEST) – VIEW RECORDING
Overleaf Use at Institutions (John Hammersley, CEO and Co-Founder, Overleaf)

Overleaf, the award-winning collaborative LaTeX authoring platform, is being used by students, faculty, researchers and institutions around the world. John Hammersley, CEO and Co-Founder of Overleaf, will talk through how Overleaf is being used successfully at institutions and take questions from the audience. He will cover a short look back at the origins of Overleaf, an overview of the platform, and specific use cases at universities like Purdue, research institutions like CERN, and in the publishing community. He'll then finish by answering questions about Overleaf and, more specifically, our institutional offerings.

Ripeta Session: 11:00 – 11:30 AM (AEST) – VIEW RECORDING
Trust in Your Research – Trust in Your Institution: Ripeta Quality Checks for Reproducibility and Professionalism
(Leslie McIntosh, CEO and Founder, Ripeta) 

Research integrity and good scientific practices are critical for faculty who work within academic institutions. In this talk we will discuss how institutions can use new machine learning and NLP tools to meet and track funder and journal reporting requirements, saving compliance-checking time while assessing the overall quality of science published on their campuses in comparison to other institutions.

Altmetric Session: 12:00 – 12:30 PM (AEST) – VIEW RECORDING
Sentiment analysis for Altmetric data (Stacy Konkiel, Senior Data Analyst, Altmetric)

The public may be talking about your research, but what do these conversations mean? Sentiment analysis is increasingly necessary to determine true research influence at scale.

In this talk, Altmetric Senior Data Analyst Stacy Konkiel will demystify sentiment analysis for altmetrics data. Using free and open-source tools that anyone can use to perform textual analysis on Altmetric data, you will learn how easy it is to get your own sentiment analysis project up and running in minutes.
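As a flavour of what the session covers, the snippet below runs VADER, a free and open-source sentiment model bundled with NLTK, over two invented mention texts. The texts are illustrative only and do not reflect Altmetric's data schema.

```python
# A minimal sentiment-analysis sketch with NLTK's open-source VADER model;
# the mention texts are invented for illustration.
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)
analyzer = SentimentIntensityAnalyzer()

mentions = [
    "Fascinating study, a real step forward for the field!",
    "The methodology in this paper looks seriously flawed.",
]

for text in mentions:
    # compound ranges from -1 (most negative) to +1 (most positive)
    compound = analyzer.polarity_scores(text)["compound"]
    print(f"{compound:+.2f}  {text}")
```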

Dimensions Webinar: 1:00 – 2:00 PM (AEST) – WATCH ON DEMAND
Informing Collection Development with the Dimensions API: an interview with Lauren Castle at UNSW

Lauren Castle is a Collection Development Librarian at the University of New South Wales. Lauren's team has been using the Dimensions API to provide additional insight into how the library collection is used and to inform collection development decisions. In this interview and Q&A session, find out how Lauren uses Python and Google Sheets to access the Dimensions API and extract data on which journals researchers at her institution are publishing in and citing, as well as open access trends.
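For a taste of what such a workflow can look like, here is a hedged sketch using dimcli, the open-source Python client for the Dimensions API. The GRID identifier is our assumption for UNSW and the facet query is illustrative; the fields available to you depend on your API plan.

```python
# A hedged sketch of listing the journals one institution publishes in,
# via the open-source dimcli client. The GRID ID is assumed to be UNSW's;
# replace it, and the API key, with your own.
import dimcli

dimcli.login(key="YOUR_API_KEY", endpoint="https://app.dimensions.ai")
dsl = dimcli.Dsl()

result = dsl.query("""
    search publications
    where research_orgs.id = "grid.1005.4" and year = 2020
    return journal limit 10
""")

for journal in result.journal:
    print(journal["count"], journal["title"])
```

From there, results can be pushed to Google Sheets with a client such as gspread, mirroring the Python-plus-Sheets workflow described in the interview.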

Using Altmetric, Dimensions & Figshare together: 2:15 – 2:45 PM (AEST) – VIEW RECORDING
Visualise Your Thesis: Components of the Research Lifecycle using Digital Science Tools with speakers from University of Melbourne

In this presentation, representatives of the University of Melbourne's "Visualise Your Thesis" competition will discuss how they have used Figshare as a learning tool to teach graduate researchers about different components of the research lifecycle. By utilising the Figshare repository, librarians have been able to provide real-life examples of how interoperable systems work together to track bibliometrics (including Altmetric attention and citations from Dimensions), highlight the use of author identifiers, demonstrate the importance of Creative Commons licensing, and show how an online presence feeds into an overall 'research narrative'. Speakers include:

  • Eleanor Colla, Program Manager, Scholarly Development (Research) at University of Melbourne
  • Kathryn Lindsay, Senior Librarian, Science, Engineering and IT at University of Melbourne

* * * * *

Altmetric Webinar – WATCH ON DEMAND
How to use the Altmetric Explorer API (Charlotte Perry-Houts, Implementation Manager)

Watch this on-demand webinar to learn how you can use the Altmetric Explorer API to develop a deeper understanding of the data Altmetric Explorer gathers, easily build your own searches, and dive into tailored ideas and use cases.
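The Explorer API itself requires an API key and signed requests, which the webinar walks through. As a gentler warm-up, here is a minimal sketch against Altmetric's free Details Page API, which returns attention data for a single DOI; the DOI below is only an example.

```python
# A minimal sketch using Altmetric's free Details Page API (not the Explorer
# API, which requires a key and request signing). The DOI is illustrative.
import requests

doi = "10.1038/nature12373"
resp = requests.get(f"https://api.altmetric.com/v1/doi/{doi}", timeout=30)
resp.raise_for_status()

data = resp.json()
print(data.get("title"))
print("Altmetric attention score:", data.get("score"))
```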

NB: programme is subject to change at the organiser’s discretion

Identifying Global Expertise in CAR-T https://www.digital-science.com/resource/identifying-expertise-in-cart/ Thu, 15 Apr 2021 12:45:11 +0000 https://www.digital-science.com/?post_type=story&p=50239 A snapshot of global trends in CAR-T research investment and innovation. We use NLP to surface experts at the intersection of two research areas.


Identifying Global Expertise in CAR-T

[Infographic: Identifying Global Expertise in CAR-T]

In this snapshot you will find global trends in CAR-T research investment, outputs and innovation across a 10-year span. We also use our NLP search technology to surface experts at the topic intersection of CAR-T and solid tumor research, drawing on our unique database of awarded grants. Some of those grants run until 2023 for our top expert, Carl June, so what is funded now may indicate what becomes effective research in the near future.

Advanced Analysis https://www.digital-science.com/resource/advanced-analysis/ Tue, 02 Mar 2021 09:56:00 +0000 https://www.digital-science.com/?post_type=story&p=47027 Our data supports standard and validated indicators for horizon scanning, as well as the experimentation and validation of novel indicators.


We deliver a range of analytical services

We deliver a range of analytical services, from one-off tasks to multi-year partnerships. Below are some examples, though the list is not exhaustive. Would you like to see examples of our advanced analysis, or discuss your needs? Schedule a discussion today.

Scientific portfolio analysis

The breadth of Dimensions data and its extensive networks of linkages facilitate robust portfolio analysis. Beyond determining relative strengths and identifying gaps, our analysts use our data to provide a view across your portfolio that improves efficiency and effectiveness, including:

  • Recognizing opportunities for collaboration
  • Locating experts to perform reviews and staff committees to promote informed decision-making
  • Understanding how support and resources are distributed across programs
  • Maximizing funding by leveraging investments of others
  • Pinpointing trends in funding, effort, and outputs
  • Assessing how funded research aligns with strategic plans and priorities

Situational awareness

Digital Science's Dimensions platform is an ideal foundation for helping you to maintain situational awareness of rapidly evolving activities in research. The COVID-19 pandemic is only the latest example of agencies needing real-time updates on scientific innovation.

  • COVID-19 Initiatives Updated Daily
  • Dimensions Data on Google BigQuery allows for real-time analysis

Program evaluation

Not only is Dimensions the largest collection of research data, its reach also extends beyond publications to include grants, patents, clinical trials, policy documents, and datasets, providing a more complete picture of activity and impact. Leveraging the power of our data, Digital Science's expert analysts work with you and your stakeholders on comprehensive, collaborative program analysis, providing a thorough, custom evaluation of current programs and actionable recommendations to improve the effectiveness of future endeavors.

Horizon scanning

Digital Science's Dimensions platform is an ideal foundation for Horizon Scanning, supporting governments and funders around the world in identifying the 'unknown unknowns'. With hundreds of millions of documents and billions of relationships, Dimensions is the largest corpus of data relevant to horizon scanning of science and technology. Dimensions includes data that captures scientific ideas at their inception, as they are proposed by researchers for funding, as well as data sources that can indicate trends and innovation in the commercial sector.

Various features of the data, such as document type, dates, and categories, provide easy-to-use facets to define a corpus of interest for a particular study. Dimensions data can be accessed in multiple complementary ways to support the identification of weak and early signals, trends, and oddities that may be indicative of opportunities or threats. Our data supports standard and validated indicators for horizon scanning, as well as the experimentation and validation of novel indicators. 

  • Topic modeling analysis: identify topics within a dataset and surface the 'unknown unknowns'
  • Research landscape: how one university used the Dimensions API to explore nanoscience and technology trends
  • Proactive alerts: receive notifications about new grants and publications

Landscape assessment

Research is continually evolving, and dynamics within and between fields change constantly. Landscape assessments ensure you understand the lay of the land and are prepared to adapt to change. Our analysts use Dimensions to help you:

  • Identify major global research areas
  • Understand how scientific fields are changing
  • Assess emerging trends within a specific field of study
  • Recognize the experts in these fields
  • See who the top funders and top research organizations are, and who is collaborating

Technology watch

Dimensions is also the ideal foundation for Technology Watch, supporting governments and funders around the world in carrying out strategic assessments of current, emerging, and future technologies pertinent to their mission. With a focus on the 'known unknowns', technology watch frequently complements horizon scanning, which focuses on the 'unknown unknowns'. Dimensions includes data that extends across traditional data silos, including:

  • Grants: pre-research ideas
  • Preprints: research-in-progress
  • Publications: published research
  • Patents, Policy Documents, Clinical Trials: research applications

Various features of the data provide easy-to-use facets to define a corpus of interest for a particular study. Dimensions data can be accessed in multiple complementary ways to support analysts in observing, tracking, and assessing potential technologies without the traditional confines of narrow types of research activity.

  • GBQ Report: quickly monitor and analyze all research information about your high-priority, mission-critical topic
  • Concept Co-Occurrence: discover the relationships between concepts within your field and spot convergences

Impact assessment

Impact is the cornerstone of many institutional missions. Impact is no longer seen as simply the volume of outputs or interventions; it includes long-term effects on populations, communities, the environment, research fields, and researchers themselves. Our analysts help you understand whether your impact is delivering on your mission, including: 

  • Understanding your impact within a field of study
  • Identifying publications from your research that became prominent, widely cited additions to the literature of your field
  • Highlighting work that has been picked up by the media
  • Linking research to policy development
  • Determining whether your funding kept researchers in the field and whether it has impacted their careers
  • Comparing the impact of your funding to that of others

Topical analysis

Increasing volumes of research outputs and data are a treasure trove of information if you have the time, tools, and expertise to make sense of them. Topical analysis leverages NLP, text mining, machine learning, coding, classification, and other techniques to provide top-down and bottom-up views of what is happening in research, from broad topic areas like cancer or engineering to granular emerging areas first mentioned a week ago. Our data, tools, and analysts can help you separate the signals from the noise, and get reliable, precision/recall-balanced datasets and views of any topic in global research.
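As a small illustration of the bottom-up side of such analysis, here is a minimal sketch of unsupervised topic discovery with scikit-learn's LDA over a toy corpus; in practice the inputs would be grant or publication abstracts drawn from Dimensions.

```python
# A minimal topic-modeling sketch with scikit-learn's LDA; the toy corpus
# stands in for abstracts pulled from Dimensions.
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

abstracts = [
    "CAR-T cell therapy and immune response in solid tumours",
    "Deep learning for natural language processing of clinical notes",
    "Chimeric antigen receptor T cells in oncology trials",
    "Transformer models improve text mining of biomedical literature",
]

vectorizer = CountVectorizer(stop_words="english")
doc_term_matrix = vectorizer.fit_transform(abstracts)

lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(doc_term_matrix)

terms = vectorizer.get_feature_names_out()
for i, topic in enumerate(lda.components_):
    # Show the five highest-weighted terms per discovered topic
    top_terms = [terms[j] for j in topic.argsort()[-5:][::-1]]
    print(f"Topic {i}: {', '.join(top_terms)}")
```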

Emerging Technology

Digital Science's Dimensions platform is an ideal foundation to support your team in understanding the opportunities and threats posed by emerging technologies. With hundreds of millions of documents and billions of relationships, Dimensions offers you a one-stop shop to uncover emerging technologies wherever they occur: in the lab, in the commercial sector, in your own city, or on the other side of the world. Dimensions can help you conquer the challenges of technical jargon (e.g. blockchain vs distributed ledger) and identify (even predict!) novel applications of rapidly emerging technologies.

  • Emerging topics in Grants: identify and analyze the emerging technologies among recently awarded research grants
  • Find Key Opinion Leaders to guide strategy in emerging areas of research
