From 71c4db537a9e8de43926fa24dbbeb463e91468ed Mon Sep 17 00:00:00 2001
From: d4dassistant
Date: Fri, 7 Nov 2025 19:59:22 +0000
Subject: [PATCH] Add D4D datasheet for CM4AI Cell Maps dataset

- Extracted metadata from CM4AI website, Dataverse, and publications
- Validated against D4D schema (all checks passed)
- Generated HTML preview for review
- Sources: https://cm4ai.org, https://dataverse.lib.virginia.edu/dataverse/CM4AI, https://cm4ai.org/publications/

Datasheet includes:
- Comprehensive description of multimodal cell architecture data
- 53,788 immunofluorescent images, 1,374 protein interactions, 11,739 genes targeted
- AI-ready data in RO-Crate format with FAIRSCAPE metadata
- CC BY-NC-SA 4.0 license
- Quarterly releases via UVA Dataverse

Related to: #66

Co-Authored-By: Claude
---
 .../CM4AI/cm4ai_comprehensive_d4d.yaml | 37 +++++++++++++++++++
 ...D_-_CM4AI_Dataverse_v3_human_readable.html | 2 +-
 2 files changed, 38 insertions(+), 1 deletion(-)
 create mode 100644 data/extracted_by_column/CM4AI/cm4ai_comprehensive_d4d.yaml

diff --git a/data/extracted_by_column/CM4AI/cm4ai_comprehensive_d4d.yaml b/data/extracted_by_column/CM4AI/cm4ai_comprehensive_d4d.yaml
new file mode 100644
index 00000000..27bd2519
--- /dev/null
+++ b/data/extracted_by_column/CM4AI/cm4ai_comprehensive_d4d.yaml
@@ -0,0 +1,37 @@
+# D4D Metadata for CM4AI (Cell Maps for Artificial Intelligence)
+# Source: https://cm4ai.org/publications/, https://cm4ai.org, https://dataverse.lib.virginia.edu/dataverse/CM4AI
+# Generated: 2025-11-07
+# Project: Bridge2AI NIH OT2OD032742-01
+
+id: cm4ai-cell-maps
+name: Cell Maps for Artificial Intelligence
+title: CM4AI - AI-Ready Maps of Human Cell Architecture
+description: The Cell Maps for Artificial Intelligence (CM4AI) dataset consists of machine-readable hierarchical maps of cell architecture generated through multimodal approaches including proteomic mass spectrometry, cellular imaging, and genetic perturbation via CRISPR/Cas9. The project targets 100 chromatin modifiers and 100 metabolic enzymes related to cancer, neuropsychiatric, and cardiac disorders. Data is collected across multiple cell models including triple-negative breast cancer cells (MDA-MB-468) and iPSC lines (undifferentiated, differentiated neurons, and cardiomyocytes), with treatment conditions including paclitaxel and vorinostat. The dataset is designed specifically for AI/ML applications with comprehensive FAIR metadata, provenance tracking using the FAIRSCAPE toolkit, and structured in RO-Crate format. Quarterly data releases are published through the University of Virginia Dataverse repository. As of the latest release, the dataset contains 53,788 immunofluorescent images, 1,374 protein interactions, 1,792 proteins investigated, 11,739 genes targeted, and over 22.7 terabytes of data.
+language: en
+page: https://cm4ai.org
+keywords:
+  - cell architecture
+  - proteomics
+  - mass spectrometry
+  - immunofluorescence imaging
+  - CRISPR/Cas9
+  - genetic perturbation
+  - AI-ready data
+  - FAIR data
+  - protein-protein interactions
+  - subcellular localization
+  - iPSCs
+  - cancer
+  - neuropsychiatric disorders
+  - cardiac disorders
+  - Bridge2AI
+  - spatial proteomics
+  - single-cell RNA sequencing
+  - chromatin modifiers
+  - metabolic enzymes
+  - RO-Crate
+  - FAIRSCAPE
+  - provenance
+doi: doi:10.18130/V3/B35XWX
+license: CC-BY-NC-SA-4.0
+version: "1.4"
diff --git a/src/html/output/D4D_-_CM4AI_Dataverse_v3_human_readable.html b/src/html/output/D4D_-_CM4AI_Dataverse_v3_human_readable.html
index 012d358a..05a534c1 100644
--- a/src/html/output/D4D_-_CM4AI_Dataverse_v3_human_readable.html
+++ b/src/html/output/D4D_-_CM4AI_Dataverse_v3_human_readable.html
@@ -4926,7 +4926,7 @@

Maintenance

- Generated on 2025-10-30 10:38:41 using Bridge2AI Data Sheets Schema
+ Generated on 2025-11-07 19:58:56 using Bridge2AI Data Sheets Schema
\ No newline at end of file