Network science researchers interested in analyzing large bibliographic data.
Network science approaches, and the data to power them, have enabled the study of science as a complex system (Zeng et al., 2017), resulting in the emergence of a new field, the “science of science” (Fortunato et al., 2018). Studies in this nascent field rely on open and proprietary big bibliometric datasets such as Web of Science (WoS) and Microsoft Academic Graph (MAG). Yet the cost and expertise needed to host and service large open and proprietary datasets are significant access barriers for many. Moreover, data use agreements often prohibit sharing data and algorithms, hampering collaboration and reproducibility.
CADRE is a new, cloud-based science gateway that overcomes these barriers. CADRE hosts large proprietary and open datasets in native and graph formats and offers a suite of analytic tools (Mabry et al., 2020). The tasks of updating and maintaining data and version control are centralized, and user-generated digital objects, including code and datasets, are written in personal Jupyter notebooks and encapsulated using cloud-native container technologies. This makes research reproducible and research assets easy to share, cite and reuse, while increasing efficiency and reducing cost. Users without programming skills can execute queries through a graphical interface. By pooling resources to build a single shared instance, member institutions obtain a superior solution at a fraction of the cost they would pay to develop their own. CADRE’s open datasets and basic tools are free for public use.
Drawing on examples from her own research and her experience with CADRE, Staša Milojević will kick off the workshop with reflections on how network science has influenced the emergence of the science of science and discuss the role that CADRE can play in realizing its future potential. Next, a series of 4-5 talks from CADRE users will showcase research projects conducted on CADRE. Presenters will explain why they used CADRE and how it benefited the project. They will also comment on limitations or challenges encountered and offer recommendations for CADRE enhancements. The portfolio of presentations will intentionally reflect a range of topics while also illustrating as many of CADRE’s capabilities as possible.
Yong Yeol Ahn will present his “science genome” project as an example of research conducted on CADRE. A hands-on tutorial will walk attendees through CADRE’s registration process and a series of exercises in which they will: 1) create a working network embedding pipeline similar to that developed and presented by Dr. Ahn and save it to their CADRE workspace; 2) conduct their own CADRE queries and reproduce complex analyses on these queries; and 3) share complex computational workflows on CADRE. Exercises will be conducted on CADRE’s free tier using the open Microsoft Academic Graph dataset, either through a programming interface or, for those without programming skills, through an intuitive graphical user interface. Attendees will provide feedback via questionnaires. Real-time technical support for CADRE will be available during the tutorial and throughout the conference.
BREAK – 5 min 2:25 pm-2:30 pm Eastern U.S.