Add more documentation #22

MichelDucartier · 2025-11-03T00:17:38Z

This pull request introduces several important improvements to the project documentation and repository structure, focusing on enhancing usability, clarity, and compliance. The most significant updates include a major overhaul of the README.md for better onboarding, the addition of a license file, and new or improved documentation for configuration and extensibility.

Documentation and usability improvements:

Revamped README.md with a clearer project introduction, feature highlights, setup instructions (including Docker and uv), an updated inference example, and simplified guidance for adding new modalities. The new format also includes project badges and improved visuals.
Added a new section in the documentation (docs/source/guides/configuration.rst) providing a detailed YAML configuration reference for model training and usage.
Added an anchor for the "add modality" guide to improve navigation in the developer documentation.

Repository and compliance updates:

Added an Apache 2.0 license file to the repository, ensuring clear open-source licensing and compliance.

Copilot

Pull Request Overview

This PR updates documentation and adds key supporting files for the MultiMeditron project. The changes enhance user-facing documentation with comprehensive guides, improve branding with new logos, and add the Apache 2.0 license file.

Restructured documentation with enhanced branding (dual-theme logos, centered banner) and improved navigation
Added comprehensive training guide, dataset format documentation, and configuration reference
Updated README with cleaner structure, feature highlights, and complete setup/inference examples

Reviewed Changes

Copilot reviewed 11 out of 15 changed files in this pull request and generated 5 comments.

File	Description
docs/source/index.rst	Enhanced documentation landing page with dual-theme logos, centered banner, and improved table of contents structure
docs/source/guides/training.rst	Added comprehensive training guide with YAML configuration examples, DeepSpeed setup, and multi-node deployment instructions
docs/source/guides/quickstart.rst	Corrected inline code formatting for placeholders using the :code: role
docs/source/guides/known_issues.rst	Simplified section title by removing redundant "when mounting volumes" text
docs/source/guides/guide.rst	Added includehidden directive and new configuration page to table of contents
docs/source/guides/dataset_format.rst	Added detailed dataset format documentation covering Arrow and JSONL formats for both pretraining and instruction-tuning
docs/source/guides/configuration.rst	Created new configuration reference with comprehensive YAML parameter documentation
docs/source/guides/add_modality.rst	Fixed plural agreement ("steps" → "step") in modality processing pipeline description
docs/source/conf.py	Added blank line for formatting consistency
assets/architecture.png	Added architecture diagram PNG for documentation
README.md	Complete rewrite with improved structure, feature highlights, installation instructions, and corrected code examples
LICENSE	Added Apache 2.0 license file

fabnemEPFL · 2025-11-03T10:37:09Z

docs/source/guides/dataset_format.rst

+
+    [{"type": "modality_type", "value" : some_modality}]
+
+For instance, for image type, :code:`some_modality` must contains the bytes of the image


some_modality -> value, no?
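
A minimal sketch of the entry shape quoted above, assuming (as the quoted doc line states) that an image entry carries the raw bytes of the image in its value field. The helper names `make_modality` and `build_modalities` are hypothetical and not part of MultiMeditron; the byte string is a stand-in, not a real image.

```python
from typing import Any, Dict, List


def make_modality(modality_type: str, value: Any) -> Dict[str, Any]:
    """Build one entry of the form {"type": ..., "value": ...}."""
    return {"type": modality_type, "value": value}


def build_modalities(image_bytes_list: List[bytes]) -> List[Dict[str, Any]]:
    """Wrap each raw image payload in an "image" modality entry."""
    return [make_modality("image", payload) for payload in image_bytes_list]


# Stand-in payload: the PNG signature followed by padding (not a valid image).
fake_png = b"\x89PNG\r\n\x1a\n" + b"\x00" * 8
modalities = build_modalities([fake_png])
print(modalities[0]["type"], len(modalities[0]["value"]))
```

Naming the field `value` in both the schema and the prose, as the comment above suggests, keeps the description aligned with what a reader actually has to write.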

fabnemEPFL · 2025-11-03T10:38:38Z

docs/source/guides/dataset_format.rst

+
+.. warning::
+
+   Please note that JSONL format is not recommended! We provide scripts to convert JSONL-formatted dataset into Arrow dataset. If your dataset is  in a JSONL format, you need to convert it first to Arrow before training.


"to convert a JSONL-formatted dataset into an Arrow dataset"
There is an extra space in "If your dataset is in".
It would be nice to specify the path to the scripts.

fabnemEPFL · 2025-11-03T10:38:55Z

docs/source/guides/dataset_format.rst

+
+    {
+      "text": "Let's compare the first image: <|reserved_special_token_0|>, and the second 3D image: <|reserved_special_token_0|>",
+      "modalities": [{"type" : "image", "value" : "path/to/png"}, {"type" : "image_3d", "value" : "path/to/npy"}]


path -> absolute or relative to what?
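
The thread does not say whether these paths are absolute or relative. As an illustration only, the sketch below reads one JSONL record of the shape quoted above and resolves relative paths against a hypothetical `base_dir`. It also assumes one `<|reserved_special_token_0|>` placeholder per modality entry, which matches the quoted example but is not stated as a rule in the source. `parse_record` is a hypothetical helper, not part of MultiMeditron.

```python
import json
from pathlib import Path

# Placeholder token taken from the quoted example.
PLACEHOLDER = "<|reserved_special_token_0|>"


def parse_record(line: str, base_dir: str) -> dict:
    """Parse one JSONL line and resolve relative modality paths.

    Assumptions (not confirmed by the source): one placeholder per
    modality, and relative paths are interpreted relative to base_dir.
    """
    record = json.loads(line)
    n_placeholders = record["text"].count(PLACEHOLDER)
    n_modalities = len(record["modalities"])
    if n_placeholders != n_modalities:
        raise ValueError(
            f"{n_placeholders} placeholders but {n_modalities} modalities"
        )

    resolved = []
    for modality in record["modalities"]:
        path = Path(modality["value"])
        if not path.is_absolute():
            path = Path(base_dir) / path
        resolved.append({"type": modality["type"], "value": str(path)})
    return {"text": record["text"], "modalities": resolved}


line = json.dumps({
    "text": f"Let's compare the first image: {PLACEHOLDER}, "
            f"and the second 3D image: {PLACEHOLDER}",
    "modalities": [
        {"type": "image", "value": "path/to/png"},
        {"type": "image_3d", "value": "path/to/npy"},
    ],
})
record = parse_record(line, base_dir="/data/my_dataset")
print(record["modalities"])
```

Whichever convention the project actually uses, stating it next to the example in the docs would answer the question above.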

fabnemEPFL · 2025-11-03T10:45:20Z

docs/source/guides/training.rst

+Launch the training
+-------------------
+
+Once the training configuration are done, we are ready to launch a training. We support both single node and multi node training.


configurations

MichelDucartier and others added 25 commits November 2, 2025 16:32

Add more docs

8298c75

Add architecture to README

872b09e

Update to png

fe9fa05

maybe

294a325

Commit svg

252f825

Some day I will have color

85931ee

Commit png

ac1bb34

maybe again

e8e15b8

et la lumiere fut (and there was light)

f931b26

Add LICENSE and change README

d75ba41

Change redirect link

3924f5c

Fix

95cb1ba

Maybe?

561a2e4

Delete LICENSE

1568d02

Add Apache License 2.0 to the project

aef8f9c

Add logo

349e589

Merge branch 'more-docs' of github.com:EPFLiGHT/MultiMeditron into mo…

043717d

…re-docs

Center + resize

42bda59

Resize

fd598d3

I'm bad at HTML

a036b3d

Maybe

473f6b2

Crop

661d0d0

Add logo in doc

05e8c26

Remove emojis

1c154a7

Typo

2786210

MichelDucartier marked this pull request as ready for review November 3, 2025 10:22

Clean

62ddb01

fabnemEPFL requested review from Copilot and fabnemEPFL November 3, 2025 10:26

Copilot AI reviewed Nov 3, 2025

docs/source/guides/training.rst: 2 comments (outdated, resolved)

README.md: 3 comments (outdated, resolved)

MichelDucartier and others added 2 commits November 3, 2025 11:29

Update docs/source/guides/training.rst

7e70f29

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Apply suggestions from code review

6d66ffc

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

fabnemEPFL requested changes Nov 3, 2025

Apply suggestions from code review

207e6a2

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

MichelDucartier merged commit 4b43371 into master Nov 17, 2025
1 check failed

MichelDucartier deleted the more-docs branch November 17, 2025 16:22

3 participants