Conversation
… to be optimized.
Feature/chat cli
Added ppl and few-shot evaluation scripts.
Feature/evaluation
…eenBitAI/green-bit-llm into feature/langchain_integration
Feature/langchain integration
I have some questions about the code conventions, such as API names and some parameters, and about which feature points still need improvement. Can you give me some suggestions?
We probably need to rethink some parts of the installation process as a whole, but in the meantime I think we should add some brief documentation about the installation process if VLLM is to be used (install the corresponding requirements.txt files after installing green-bit-llm?). Maybe add a README to third_party/vllm, or add a section to the main README about this.
Please also always add a newline to the end of the file.
```
python -m green_bit_llm.evaluation.evaluate --model GreenBitAI/Qwen-1.5-4B-layer-mix-bpw-3.0 --trust-remote-code --eval-ppl --ppl-tasks wikitext2,c4,ptb
```
or
```
python -m green_bit_llm.evaluation.evaluate --model GreenBitAI/Qwen-1.5-4B-layer-mix-bpw-3.0 --backend greenbit-engine --trust-remote-code --eval-ppl --ppl-tasks wikitext2,c4,ptb
```
I think some explanation is needed. We could add some information here on the significance of this choice, e.g. what VLLM is, why/when to use it, or a link to its documentation, or all of those.
```python
from lm_eval import evaluator
```
```python
from vllm.model_executor.layers.logits_processor import _apply_logits_processors
```
I assume there is no "safe" alternative to using this internal method, right? If so, I think we can use it.
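If we do keep it, a small defensive import might be worth it. Just a sketch, and the error text is only a suggestion:

```python
# Sketch: guard the import of the vLLM-internal helper so that an upstream
# rename or removal surfaces as a clear error instead of a late failure.
try:
    from vllm.model_executor.layers.logits_processor import _apply_logits_processors
except ImportError as err:
    raise ImportError(
        "The installed vLLM version does not expose the internal helper "
        "'_apply_logits_processors' that green-bit-llm relies on; "
        "please install a compatible vLLM release."
    ) from err
```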
```python
lm_head, hidden_states, sampling_metadata, *embedding_bias = input
embedding_bias = embedding_bias[0] if embedding_bias else None
logits = module._get_logits(hidden_states, lm_head, embedding_bias)
if logits is not None:
```
I think we should use a guard clause here.
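Roughly like this (a sketch only; what the early exit should return depends on the surrounding function, so the return value is a placeholder):

```python
logits = module._get_logits(hidden_states, lm_head, embedding_bias)
# Guard clause: bail out early when there are no logits, so the main path
# below does not need an extra level of nesting.
if logits is None:
    return None  # placeholder: return whatever the non-logits path expects
# ... main path continues here, un-indented
```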
```python
shift_labels = testenc[:, (i * seqlen): ((i + 1) * seqlen)][
    :, 1:
].to(device)
```
Please adjust the styling (see https://peps.python.org/pep-0008/#indentation or follow black style).
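For reference, black formats the expression roughly like this, and it still fits within black's default 88-character line length:

```python
# Same expression, reflowed onto one line in black's style.
shift_labels = testenc[:, (i * seqlen) : ((i + 1) * seqlen)][:, 1:].to(device)
```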
```python
DEFAULT_SEQLEN = 2048
DEFAULT_RANDOM_SEED = 0
DTYPE = torch.half
DEFAULT_MODEL_BCKEND = ["vllm", "greenbit-engine"]
```
The name suggests it should be AVAILABLE_MODEL_BACKENDS. One could expect the first option to be the default, or we could declare the default model backend separately.
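For example (just a sketch of the naming I have in mind):

```python
# Suggested naming: list the backends that are available and make the
# default explicit instead of implying it through list order.
AVAILABLE_MODEL_BACKENDS = ["vllm", "greenbit-engine"]
DEFAULT_MODEL_BACKEND = AVAILABLE_MODEL_BACKENDS[0]  # currently "vllm"
```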
```python
class GBLLMInferenceBackend(BaseInferenceBackend):
    def __init__(self, model_path, **kwargs):
        # Building configs
        tokenizer_config = {"trust_remote_code": True if kwargs.get("trust_remote_code") else None}
```
I think we should use kwargs.get("trust_remote_code", None) to set a default. Similarly below for the same and other arguments.
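A sketch of what I mean, reusing the names from the snippet above:

```python
# Read the flag once with an explicit default instead of the conditional
# expression, then reuse it for the tokenizer config.
trust_remote_code = kwargs.get("trust_remote_code", None)
tokenizer_config = {"trust_remote_code": trust_remote_code}
```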
Regarding merging the vLLM evaluation, I need some help.