The 2nd Workshop on AI-Ready Data for Science Discovery

Date: November 12-15, 2026 | Location: Shenyang, China

About the Workshop

Scientific discovery is entering a new era powered by artificial intelligence (AI) and machine learning (ML). These technologies are enabling breakthroughs across disciplines such as biology, physics, chemistry, and materials science. However, one major bottleneck remains: the lack of high-quality, domain-specific datasets that are truly AI-ready.

The 2nd Workshop on AI-Ready Data for Science Discovery (ADSD 2026) aims to address this critical challenge by building a vibrant, interdisciplinary community focused on the creation, curation, and benchmarking of scientific datasets. Hosted at ICDM 2026, ADSD will serve as a platform for researchers, practitioners, and data professionals to collaborate on shaping the future of scientific data mining.

Call for Papers

We welcome a wide range of submissions on AI-ready datasets for scientific discovery, spanning theories, algorithms, applications, systems, and tools. Topics include, but are not limited to:

Data Acquisition and Integration

  • Automated methods for constructing AI-ready data from experiments, simulations, and publications.
  • Methods for integrating multimodal datasets, including text, images, tables, and numerical data.
  • Retrieval-augmented generation (RAG) for extracting knowledge from scientific literature.

Data Curation, Quality Control, and Enrichment

  • Methods for dataset collection and annotation.
  • Methods for metadata and synthetic data generation.
  • Methods for data consistency and completeness.
  • Automated data refinement to improve reliability.
  • Automated outlier detection and correction.
  • Systems/tools for continuous dataset integrity.

Benchmarking and Evaluation Frameworks

  • Standardized benchmarks applicable across domains such as biomedicine, materials science, and environmental modeling.
  • Methods for developing standardized metrics (e.g., accuracy, robustness, scalability, and interpretability) tailored to domain-specific data characteristics.
  • Methods to evaluate data quality and AI-readiness.
  • Methods to evaluate data interpretability, robustness, and trustworthiness.
  • Open platforms for standardized benchmarking.

Applications in Scientific Research

  • Tools for publication summarization and trend analysis.
  • AI for scientific challenges such as drug discovery, climate modeling, and materials prediction.

Submission Details

We invite submissions of regular research papers of 6-10 pages, including the bibliography and any appendices. Submissions must be in PDF format and follow the IEEE Conference Template. Submitted papers will be assessed on their novelty, technical quality, potential impact, insightfulness, depth, clarity, and reproducibility. All papers must be submitted through the ADSD Submission page. Following ICDM tradition, all accepted workshop papers will be published in the dedicated ICDMW proceedings by the IEEE Computer Society Press. For questions about the workshop or submissions, please email pfwang@cnic.cn.

Important Dates

* All deadlines are at 11:59 pm Anywhere on Earth (AoE).

Keynote Presentations

TBD

Organizing Committee

Steering Co-Chairs

Hui Xiong

The Hong Kong University of Science and Technology (Guangzhou)

Xiansheng Hua

Tongji University

Program

TBD

Program Committee

Contact

For inquiries, please contact us at pfwang@cnic.cn.

Data Sharing Supporters