forked from FedML-AI/FedML
Marcos/benchmarks #1
Draft
mdvillagra
wants to merge 171 commits into master from marcos/benchmarks
Conversation
…odelLLMAggregator
…nclude timeout for launch_fedllm_custom.py execution
…FullModelLLMTrainer and FullModelLLMAggregator
…gging in FedMLServerManager
…LLMAggregator by removing commented-out code and ensuring consistent execution of checkpointing after aggregation.
…ess and format compliance scoring. Introduce DataFormatting and Evaluation classes for data handling and numerical extraction. Update FullModelLLMTrainer to utilize the new reward function.
…nd adjust communication rounds in grpo_gsm8k_test_config.yaml from 1 to 2 for testing.
…tion class to streamline input parameters and improve clarity. Update FullModelLLMTrainer to utilize the revised combined_reward function.
…amline reward calculation, focusing solely on correctness scoring in the combined_reward method.
…mproved model output capacity. Update grpo_gsm8k_test_config.yaml to change client setup from 1 to 2 clients for enhanced testing scenarios.
…LLMTrainer to enhance experiment tracking.
…rd_fn instead of combined_reward, enhancing clarity and consistency in reward calculation.
… clarity and maintainability. Update reward_fn to utilize class attributes for reward values, improving consistency in reward calculations.
…_config.yaml for improved testing. Adjust timeout duration in run_fedml_client_custom.sh and run_fedml_server_custom.sh scripts to accommodate longer execution times.
…r faster testing iterations.
…ng and improve logging of missing/unexpected keys. Adjust grpo_gsm8k_test_config.yaml for testing parameters, reducing communication rounds and training steps for quicker iterations.
…s base model state. Enhance logging to provide clearer output of missing and unexpected keys during model loading for improved debugging.
…gFace's save_pretrained method for improved compatibility. Implement fallback mechanism for models lacking this method, ensuring robust checkpointing during training.
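The commit above describes checkpointing through HuggingFace's `save_pretrained` with a fallback for models that lack it. A minimal sketch of that pattern (the function name `save_checkpoint` and the fallback filename are hypothetical; only the save-with-fallback idea comes from the commit message):

```python
import os

def save_checkpoint(model, out_dir):
    """Checkpoint via a HuggingFace-style save_pretrained when the model
    provides it; otherwise fall back to a raw state_dict dump."""
    os.makedirs(out_dir, exist_ok=True)
    if hasattr(model, "save_pretrained"):
        # HF-compatible directory layout (config + weights)
        model.save_pretrained(out_dir)
    else:
        import torch  # only needed on the fallback path
        torch.save(model.state_dict(),
                   os.path.join(out_dir, "pytorch_model.bin"))
```

The `hasattr` probe keeps the trainer agnostic to whether the wrapped model is a HuggingFace `PreTrainedModel` or a plain `torch.nn.Module`.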
…30 for extended testing iterations.
…r improved troubleshooting during answer validation.
…ingle client setup for testing, adjusting client_num_in_total and client_num_per_round to 1.
… print statements to display completions and answers for better troubleshooting during answer validation.
… FullModelLLMTrainer. Refactor reward function to improve answer validation by incorporating numeric equivalence checks for better accuracy.
… for clarity in numeric equivalence checks, enhancing accuracy in answer validation.
… training parameters for improved testing. Increase client_num_in_total and client_num_per_round to 2, extend comm_round to 300, and raise grpo_max_steps to 150 for more comprehensive evaluation.
…from 256 to 512 for enhanced response handling.
…ForCausalLM for improved flexibility and evaluation. Removed deprecated torch_dtype handling and ensured dropout is disabled for the reference model.
…iner to avoid CPU↔GPU mismatch. Retain commented line for CPU off-loading during debugging.
… improved performance and compatibility.
…Q-Int8 for enhanced performance and compatibility.
…om 512 to 256 for optimized response handling.
…from 256 to 512 for enhanced response handling.
…iner.py and ensure reference model is moved to CPU for consistency in TimedGRPOTrainer.
…ine imports and improve code clarity.
… improved performance and compatibility.
…to model's device in _get_per_token_logps_and_entropies method to prevent CPU↔GPU mismatch errors.
…le tensor inputs in _get_per_token_logps_and_entropies method, ensuring compatibility with both tensor and mapping types.
…ssary tensor conversion logic, simplifying the process of moving inputs to the model's device.
…ties and entropies are moved to the appropriate device after computation in _get_per_token_logps_and_entropies method, improving device compatibility.
…or tensor types before moving log probabilities and entropies to the appropriate device, enhancing robustness and preventing potential errors.
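The series of commits above converges on a helper that moves tensor inputs (and mappings of tensors) onto the model's device while passing other values through, to prevent the CPU↔GPU mismatch errors mentioned earlier. A duck-typed sketch of that pattern (the name `to_model_device` is hypothetical):

```python
def to_model_device(inputs, device):
    """Move a tensor, or a mapping of tensors, onto the given device.
    Values without a .to(...) method (e.g. ints, strings) pass through
    unchanged, matching the type-check the commits describe."""
    if hasattr(inputs, "to"):
        return inputs.to(device)
    if isinstance(inputs, dict):
        return {k: (v.to(device) if hasattr(v, "to") else v)
                for k, v in inputs.items()}
    return inputs
```

Checking for a `.to` method rather than `torch.is_tensor` keeps the helper robust when batches mix tensors with plain Python metadata.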
…strategy to move policy outputs to CPU when the reference model is on CPU, ensuring efficient memory usage and preventing GPU memory spikes during rollouts.
…og probabilities and entropies to float16 and ensuring they are moved to the appropriate device, enhancing performance and memory efficiency during training.
… 256 for improved performance and resource management during training.
…mentation and clarity, and update max completion length and new tokens in FullModelLLMTrainer to 512 for enhanced training performance.
…additional parameters for enhanced performance and compatibility, and clean up commented code for better readability.
…ning performance and stability.
…h size in GRPO test config to 2 for improved testing efficiency.
…zed testing configuration.
…timized testing efficiency.
…or enhanced testing scalability.
…"Qwen/Qwen3-0.6" for improved compatibility and performance.
…Qwen/Qwen3-0.6B" for enhanced performance and compatibility.
…cstrings, and update GRPO test configuration by increasing max steps from 20 to 50 and batch size from 1 to 2 for enhanced testing efficiency.
…2 in FullModelLLMAggregator for improved checkpoint management.
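If the commit above caps the number of retained checkpoints at 2 in FullModelLLMAggregator, a pruning helper could look like this (everything here, including the name `prune_checkpoints` and the lexicographic ordering assumption, is a hypothetical illustration):

```python
import os
import shutil

def prune_checkpoints(ckpt_root: str, keep_last: int = 2) -> None:
    """Delete all but the newest `keep_last` checkpoint directories,
    assuming directory names sort oldest-to-newest."""
    ckpts = sorted(d for d in os.listdir(ckpt_root)
                   if os.path.isdir(os.path.join(ckpt_root, d)))
    for stale in ckpts[:-keep_last]:
        shutil.rmtree(os.path.join(ckpt_root, stale))
```

Bounding retained checkpoints keeps disk usage flat across the 300-round runs configured elsewhere in this PR.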
No description provided.