The 1st Workshop on AI-Ready Data for Science Discovery

Date: November 12, 2025 | Location: Washington, DC, USA

About the Workshop

Scientific discovery is entering a new era, powered by artificial intelligence (AI) and machine learning (ML). These technologies are enabling breakthroughs across disciplines such as biology, physics, chemistry, and materials science. However, one major bottleneck remains: the lack of high-quality, domain-specific datasets that are truly AI-ready.

The 1st Workshop on AI-Ready Data for Science Discovery (ADSD 2025) aims to address this critical challenge by building a vibrant, interdisciplinary community focused on the creation, curation, and benchmarking of scientific datasets. Hosted at ICDM 2025, ADSD will serve as a platform for researchers, practitioners, and data professionals to collaborate on shaping the future of scientific data mining.

Call for Papers

We welcome a wide range of submissions on AI-ready datasets for scientific discovery, covering theories, algorithms, applications, systems, and tools. Topics include, but are not limited to:

Data Acquisition and Integration

  • Automated methods for constructing AI-ready data from experiments, simulations, and publications.
  • Methods for integrating multimodal datasets, including text, images, tables, and numerical data.
  • Retrieval-augmented generation (RAG) for extracting knowledge from scientific literature.

Data Curation, Quality Control, and Enrichment

  • Methods for dataset collection and annotation.
  • Methods for metadata and synthetic data generation.
  • Methods for data consistency and completeness.
  • Automated data refinement to improve reliability.
  • Automated outlier detection and correction.
  • Systems/tools for continuous dataset integrity.

Benchmarking and Evaluation Frameworks

  • Standardized benchmarks applicable across domains such as biomedicine, materials science, and environmental modeling.
  • Methods for developing standardized metrics (e.g., accuracy, robustness, scalability, and interpretability) tailored to domain-specific data characteristics.
  • Methods to evaluate data quality and AI-readiness.
  • Methods to evaluate data interpretability, robustness, and trustworthiness.
  • Open platforms for standardized benchmarking.

Applications in Scientific Research

  • Tools for publication summarization and trend analysis.
  • AI for scientific challenges such as drug discovery, climate modeling, and materials prediction.

Submission Details

We invite submissions of regular research papers (6-10 pages, including the bibliography and any appendices). Submissions must be in PDF format and follow the IEEE Conference Template. Submitted papers will be assessed on their novelty, technical quality, potential impact, insightfulness, depth, clarity, and reproducibility. All papers must be submitted via the ADSD Submission site. Following ICDM tradition, all accepted workshop papers will be published in the dedicated ICDMW proceedings by the IEEE Computer Society Press. For questions about the workshop or submissions, please email pfwang@cnic.cn.

Important Dates

* All deadlines are at 11:59 pm Anywhere on Earth (AoE).

Organizing Committee

Steering Co-Chairs

Hui Xiong

The Hong Kong University of Science and Technology (Guangzhou)

Xiansheng Hua

Tongji University

Program Co-Chairs

Pengfei Wang

Chinese Academy of Sciences

Yanjie Fu

Arizona State University

Pengyang Wang

University of Macau

Kunpeng Liu

Portland State University

Jiaxu Cui

Jilin University

Poster Co-Chairs

Ran Zhang

University of Chinese Academy of Sciences

Zhiyuan Ning

University of Chinese Academy of Sciences

Web Co-Chairs

Pengjiang Li

University of Chinese Academy of Sciences

Ping Xu

University of Chinese Academy of Sciences

Zaitian Wang

University of Chinese Academy of Sciences

Keynote Presentations (TBD)

Program Committee (TBD)

Contact

For inquiries, please contact us at pfwang@cnic.cn.

Data Sharing Supporters