Marco Malgarini is Senior Manager for Research Evaluation at ANVUR. His current activities include assessment of research evaluation methods, comparison of research activity patterns at the international level, assessment of information content of bibliometric indicators, and development of new indicators.
Sergio Benedetto is an Emeritus Professor at the Polytechnic University of Turin and a member of the board of directors of ANVUR.
Italy’s Research Evaluation Exercise
By Sergio Benedetto and Marco Malgarini, ANVUR
The Italian National Agency for the Evaluation of the University and Research Systems (ANVUR) is starting a new project aimed at evaluating research outcomes published by Italian professors and researchers between 2011 and 2014. Overall, we expect over 130,000 publications to be evaluated.
The goal of the exercise is to evaluate the quality of the research conducted in Italian universities, and to rank these institutions and their departments in each of the 16 research areas that comprise all research activities in Italy. ANVUR has designated 400 assessors as the Group of Evaluation Experts (GEV), whose evaluations will significantly inform the distribution of public funds.
All Italian university professors, as well as researchers affiliated with a research center, will participate in the exercise. Each professor must submit two research outcomes, which can include books, chapters, articles or other products (such as patents, databases, works of art, or architectural projects). Researchers from research centers will each submit three outcomes. To be considered for evaluation, professors must possess an ORCID identifier, although no evaluations concerning individual researchers will be released.
GEVs will base their evaluations on originality, methodological rigor, and actual or potential scientific impact. GEVs will also be responsible for defining additional, specific evaluation criteria for their assigned area. The analysis of each research outcome will result in a synthetic appraisal of its scientific merit, placing it in one of five levels relative to all publications in the field worldwide: Excellent (top 10%), High (10-30%), Fair (30-50%), Acceptable (50-80%), and Limited (bottom 20%). A score will be assigned to each level of merit.
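The five levels of merit can be read as a simple lookup on a publication's position in the worldwide distribution. The following is a minimal sketch, not ANVUR's procedure: the function name `merit_class`, the 0-to-100 "top percent" scale (0 is best), and the choice to treat each upper band edge as inclusive are all assumptions made for illustration. The scores attached to each level are not given in the text, so they are left out.

```python
# Upper edge of each merit band, as "top X% of world publications in the field".
# Band edges come from the text; treating each edge as inclusive is an assumption.
MERIT_BANDS = [
    (10.0, "Excellent"),
    (30.0, "High"),
    (50.0, "Fair"),
    (80.0, "Acceptable"),
    (100.0, "Limited"),
]


def merit_class(top_percent: float) -> str:
    """Map a position in the distribution (0 = best, 100 = worst) to a merit level."""
    if not 0.0 <= top_percent <= 100.0:
        raise ValueError("top_percent must be between 0 and 100")
    for upper_edge, label in MERIT_BANDS:
        if top_percent <= upper_edge:
            return label
    raise AssertionError("unreachable: the last band ends at 100")
```

For example, `merit_class(25)` falls in the 10-30% band and returns `"High"`.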
Figure 1: Italy's natural sciences publications, 2010-2014
This dashboard from the SciVal Overview module shows the relative distribution of Italy’s research output in the Natural Sciences from 2010 to 2014. SciVal supports multiple subject classification systems, including OECD’s Fields of Science (FOS).
Sources: SciVal. Scopus data snapshot from October 5, 2015.
The GEVs will use informed peer review; that is, wherever possible, peer evaluation will be supported by bibliometric information from the two leading citation databases (Web of Science and Scopus). More precisely, evaluators will use information about the scientific impact of both the articles (as expressed by number of citations) and the journals in which they are published (as expressed by the Impact Factor and similar indicators).
Bibliometric indicators will be used in mathematical, natural, engineering and life sciences, as well as, albeit with a slightly different approach, in economics and statistics. GEVs in humanities and social sciences, on the other hand, will rely only on peer evaluation.
Because citation practices are widely understood to differ significantly across types of publication, scientific fields and years of publication, bibliometric indicators will be normalized. No automatic evaluations will take place, however: the GEV will always be responsible for the final assessment of a publication’s scientific merit. In any case, more than 50 percent of publications will be evaluated purely with peer review methods.
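The text does not say how the normalization is carried out. One common way to make citation counts comparable is to rank each publication only against others of the same field, year and type. The sketch below illustrates that idea under stated assumptions: the record keys (`id`, `field`, `year`, `type`, `citations`) and the function name `top_percent_by_group` are hypothetical, and the reference set is simply the list passed in rather than all publications worldwide.

```python
from collections import defaultdict


def top_percent_by_group(records):
    """Return {id: share (in %) of same-group publications with strictly more citations}.

    A group is a (field, year, type) combination, so a publication is only
    compared with others that share its citation practices. 0.0 means no
    publication in the group is cited more often.
    """
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["field"], rec["year"], rec["type"])].append(rec["citations"])

    result = {}
    for rec in records:
        peers = groups[(rec["field"], rec["year"], rec["type"])]
        cited_more = sum(1 for c in peers if c > rec["citations"])
        result[rec["id"]] = 100.0 * cited_more / len(peers)
    return result
```

With this kind of ranking, a mathematics article with 10 citations can sit at the top of its group while a biology article with 50 citations sits in the middle of its own, which is the effect normalization is meant to capture.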
Wherever article- and journal-level metrics provide converging results (i.e., when both indicators assign an article to the same class of merit), bibliometric evaluation will be considered conclusive, pending final approval by the GEV. If the two metrics point to different classes of merit, the GEV will assign a specific weight to each indicator to decide on a final class. The weights assigned to article- and journal-level metrics will usually depend on specific characteristics of the field and on the year of publication.
In areas characterized by slow accumulation of citations, for example, journal-level metrics will carry the predominant weight, while article-level metrics will carry it in areas where citations accumulate more quickly. Similarly, the evaluation of more recent articles, whose citation counts may still be inconclusive, will rely more heavily on journal-level metrics, while the opposite will be true for articles published at the beginning of the evaluation period. If article- and journal-level metrics are too far apart (usually when they differ by more than two classes of merit), the GEV will peer review the article, either internally or by assigning it to an external reviewer.
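The decision rule described above has three branches: agreement, moderate disagreement resolved by weighting, and strong disagreement sent to peer review. A minimal sketch of that logic follows. The function name `combine`, the `article_weight` parameter and the way a weighted class is derived (blending the class positions and rounding to the nearest one) are assumptions for illustration; the text leaves the weighting to each GEV, which in every case keeps the final say.

```python
import math

# Merit classes ordered from best (index 0) to worst (index 4).
CLASSES = ["Excellent", "High", "Fair", "Acceptable", "Limited"]


def combine(article_class, journal_class, article_weight=0.5, max_gap=2):
    """Return (route, proposed_class) for one article.

    route is "bibliometric" when both metrics agree, "peer_review" when they
    differ by more than max_gap classes, and "weighted" otherwise.
    """
    a = CLASSES.index(article_class)
    j = CLASSES.index(journal_class)
    if a == j:
        return ("bibliometric", article_class)
    if abs(a - j) > max_gap:
        return ("peer_review", None)
    # Assumption: blend the two class positions and round to the nearest class,
    # with exact ties going to the lower (worse) class.
    blended = article_weight * a + (1.0 - article_weight) * j
    return ("weighted", CLASSES[math.floor(blended + 0.5)])
```

A high `article_weight` would suit older articles or fields where citations accumulate quickly, and a low one would suit recent articles or slow-citing fields. For instance, `combine("Excellent", "Fair", article_weight=0.8)` proposes `"Excellent"`, while the same pair with `article_weight=0.2` proposes `"Fair"`.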
External reviewers will be selected on the basis of an external list of experts in the field. This list will be carefully crafted after consideration of ex-ante scientific accreditation and possible conflicts of interest. Peer evaluation will be based on a predetermined form containing specific questions reflecting the evaluation criteria described above. The reviewer will also provide a final brief comment explaining his or her choice concerning the class of merit assigned to the article.
The Research Evaluation Exercise will assess departments and universities within homogeneous scientific areas, based on the number of publications they submit and their assigned scores. In other words, evaluation results for each university will reflect both quantitative and qualitative factors.
Results of the evaluation of research products will account for 75 percent of the overall evaluation of universities and departments. Another 20 percent of the total score will be based on the quality of publications by researchers hired or promoted between 2011 and 2014. In this way, the evaluation will also consider how well Italian universities selected researchers with higher scientific impact and quality. The exercise will also evaluate so-called Third Mission activities, related to scientific transfer and building social capital. A final 5 percent of the total score will come from information concerning competitive funding granted to universities and their doctoral schools.
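The three stated shares (75, 20 and 5 percent) amount to a weighted sum. The sketch below shows the arithmetic under assumptions: the component names and the function `overall_score` are hypothetical, and each component is assumed to be already expressed on a common scale (here 0 to 1). Third Mission activities are omitted because the text gives them no weight in the total.

```python
# Shares of the total score as stated in the text; the key names are illustrative.
WEIGHTS = {
    "research_products": 0.75,    # evaluation of submitted research products
    "recruitment": 0.20,          # publications by researchers hired or promoted
    "competitive_funding": 0.05,  # competitive funding and doctoral schools
}


def overall_score(components):
    """Weighted sum of component scores that share a common scale."""
    missing = set(WEIGHTS) - set(components)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)
```

With component scores of 0.8, 0.6 and 0.4, the total is 0.75 × 0.8 + 0.20 × 0.6 + 0.05 × 0.4 = 0.74.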