About the Workshop
Scientific discovery is entering a new era powered by artificial intelligence (AI) and machine learning (ML). These technologies are enabling breakthroughs across disciplines such as biology, physics, chemistry, and materials science. However, one major bottleneck remains: the lack of high-quality, domain-specific datasets that are truly AI-ready.
The 1st Workshop on AI-ready Data for Science Discovery (ADSD 2025) aims to address this critical challenge by building a vibrant, interdisciplinary community focused on the creation, curation, and benchmarking of scientific datasets. Hosted at ICDM 2025, ADSD will serve as a platform for researchers, practitioners, and data professionals to collaborate on shaping the future of scientific data mining.
Call for Papers
We welcome a wide array of submissions on AI-ready datasets for scientific discovery, spanning theories, algorithms, applications, systems, and tools. Topics include, but are not limited to:
Data Acquisition and Integration
- Automated methods for constructing AI-ready data from experiments, simulations, and publications.
- Methods for integrating multimodal datasets, including text, images, tables, and numerical data.
- Retrieval-augmented generation (RAG) for extracting knowledge from scientific literature.
Data Curation, Quality Control, and Enrichment
- Methods for dataset collection and annotation.
- Methods for metadata and synthetic data generation.
- Methods for data consistency and completeness.
- Automated data refinement to improve reliability.
- Automated outlier detection and correction.
- Systems/tools for continuous dataset integrity.
Benchmarking and Evaluation Frameworks
- Standardized benchmarks applicable across domains such as biomedicine, materials science, and environmental modeling.
- Methods for developing standardized metrics (e.g., accuracy, robustness, scalability, and interpretability) tailored to domain-specific data characteristics.
- Methods to evaluate data quality and AI-readiness.
- Methods to evaluate data interpretability, robustness, and trustworthiness.
- Open platforms for standardized benchmarking.
Applications in Scientific Research
- Tools for publication summarization and trend analysis.
- AI for scientific challenges such as drug discovery, climate modeling, and material prediction.
Submission Details
We invite submissions of regular research papers of 6-10 pages, including the bibliography and any appendices. Submissions must be in PDF format and formatted according to the IEEE Conference Template. Submitted papers will be assessed on their novelty, technical quality, potential impact, insightfulness, depth, clarity, and reproducibility. All papers must be submitted via the ADSD Submission. Following ICDM tradition, all accepted workshop papers will be published in the dedicated ICDMW proceedings by the IEEE Computer Society Press. For questions about the workshop or submissions, please email pfwang@cnic.cn.
Important Dates
* All deadlines are at 11:59 pm Anywhere on Earth (AoE).
- Paper Submission Deadline: August 28, 2025
- Acceptance Notification: September 10, 2025
- Camera-ready Submission: September 17, 2025
- Workshop Date: November 12, 2025
Organizing Committee
Steering Co-Chairs

Hui Xiong
The Hong Kong University of Science and Technology (Guangzhou)

Xiansheng Hua
Tongji University
Program Co-Chairs

Pengfei Wang
Chinese Academy of Sciences

Yanjie Fu
Arizona State University

Pengyang Wang
University of Macau

Kunpeng Liu
Portland State University

Jiaxu Cui
Jilin University
Poster Co-Chairs

Ran Zhang
University of Chinese Academy of Sciences

Zhiyuan Ning
University of Chinese Academy of Sciences
Web Co-Chairs

Pengjiang Li
University of Chinese Academy of Sciences

Ping Xu
University of Chinese Academy of Sciences

Zaitian Wang
University of Chinese Academy of Sciences
Keynote Presentations (TBD)
Program Committee (TBD)
Contact
For inquiries, please contact us at pfwang@cnic.cn.