Update README.md
README.md CHANGED
@@ -16,7 +16,7 @@ pinned: false
 
 
 <h4> |<a href="https://arxiv.org/abs/2401.10491"> 📑 FuseLLM Paper @ICLR2024 </a> |
-<a href="https://arxiv.org/abs/
+<a href="https://arxiv.org/abs/2408.07990"> 📑 FuseChat Tech Report </a> |
 <a href="https://huggingface.co/FuseAI"> 🤗 HuggingFace Repo </a> |
 <a href="https://github.com/fanqiwan/FuseLLM"> 🐱 GitHub Repo </a> |
 </h4>
@@ -39,6 +39,8 @@ Welcome to join us!
 
 ### FuseChat [SOTA 7B LLM on MT-Bench]
 
+- **Aug 16, 2024:** 🔥🔥🔥🔥 We update the [FuseChat tech report](https://arxiv.org/abs/2408.07990) and release [FuseChat-7B-v2.0](https://huggingface.co/FuseAI/FuseChat-7B-v2.0), the fusion of six prominent chat LLMs with diverse architectures and scales, namely [OpenChat-3.5-7B](https://huggingface.co/openchat/openchat_3.5), [Starling-LM-7B-alpha](https://huggingface.co/berkeley-nest/Starling-LM-7B-alpha), [NH2-Solar-10.7B](https://huggingface.co/NousResearch/Nous-Hermes-2-SOLAR-10.7B), [InternLM2-Chat-20B](https://huggingface.co/internlm/internlm2-chat-20b), [Mixtral-8x7B-Instruct](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1), and [Qwen1.5-Chat-72B](https://huggingface.co/Qwen/Qwen1.5-72B-Chat). FuseChat-7B-v2.0 achieves an average score of **7.38** on MT-Bench (with GPT-4-0125-Preview as the judge), comparable to [Mixtral-8x7B-Instruct](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1) and approaching [GPT-3.5-Turbo-1106](https://platform.openai.com/docs/models/gpt-3-5-turbo).
+
 - **Mar 13, 2024:** 🔥🔥🔥 We release a HuggingFace Space for [FuseChat-7B](https://huggingface.co/spaces/FuseAI/FuseChat-7B); try it now!
 
 - **Feb 26, 2024:** 🔥🔥 We release [FuseChat-7B-VaRM](https://huggingface.co/FuseAI/FuseChat-7B-VaRM), the fusion of three prominent chat LLMs with diverse architectures and scales, namely [NH2-Mixtral-8x7B](https://huggingface.co/NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO), [NH2-Solar-10.7B](https://huggingface.co/NousResearch/Nous-Hermes-2-SOLAR-10.7B), and [OpenChat-3.5-7B](https://huggingface.co/openchat/openchat_3.5). FuseChat-7B-VaRM achieves an average score of **8.22** on MT-Bench, outperforming powerful chat LLMs such as [Starling-7B](https://huggingface.co/berkeley-nest/Starling-LM-7B-alpha), [Yi-34B-Chat](https://huggingface.co/01-ai/Yi-34B-Chat), and [Tulu-2-DPO-70B](https://huggingface.co/allenai/tulu-2-dpo-70b), surpassing [GPT-3.5 (March)](https://platform.openai.com/docs/models/gpt-3-5-turbo) and [Claude-2.1](https://www.anthropic.com/news/claude-2-1), and approaching [Mixtral-8x7B-Instruct](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1).
@@ -46,43 +48,25 @@ Welcome to join us!
 - **Feb 25, 2024:** 🔥 We release [FuseChat-Mixture](https://huggingface.co/datasets/FuseAI/FuseChat-Mixture), a comprehensive training dataset that covers different styles and capabilities, featuring both human-written and model-generated data, and spanning general instruction-following and specific skills.
 
 <p align="center">
-<img src="
+<img src="tab0.png" width="60%"> <br>
+</p>
+
+<p align="center">
+<img src="tab1.png" width="95%"> <br>
 </p>
 
-| Proprietary Models | #Params | MT-Bench | Open Source Models | #Params | MT-Bench |
-|--------------------|---------|----------|--------------------|---------|----------|
-| GPT-4-1106-preview | - | 9.32 | Qwen1.5-72B-Chat | 72B | 8.61 |
-| GPT-4-0613 | - | 9.18 | Nous-Hermes-2-Mixtral-8x7B-DPO | 8x7B | 8.33 |
-| GPT-4-0314 | - | 8.96 | Mixtral-8x7B-Instruct-v0.1 | 8x7B | 8.30 |
-| Mistral Medium | - | 8.61 | 🤗 [FuseChat-7B-VaRM](https://huggingface.co/FuseAI/FuseChat-7B-VaRM) | 7B | 8.22 |
-| GPT-3.5-Turbo-0613 | - | 8.39 | Starling-LM-7B-alpha | 7B | 8.09 |
-| GPT-3.5-Turbo-1106 | - | 8.32 | Tulu-2-DPO-70B | 70B | 7.89 |
-| 🤗 [FuseChat-7B-VaRM](https://huggingface.co/FuseAI/FuseChat-7B-VaRM) | 7B | 8.22 | OpenChat-3.5 | 7B | 7.81 |
-| Claude-2.1 | - | 8.18 | OpenChat-3.5-0106 | 7B | 7.80 |
-| Claude-2.0 | - | 8.06 | WizardLM-70B-v1.0 | 70B | 7.71 |
-| GPT-3.5-Turbo-0314 | - | 7.94 | Yi-34B-Chat | 34B | 7.67 |
-| Claude-1 | - | 7.90 | Nous-Hermes-2-SOLAR-10.7B | 10.7B | 7.66 |
 
 ### FuseLLM [Surpassing Llama-2-7B]
 
 - **Jan 22, 2024:** 🔥 We release [FuseLLM-7B](https://huggingface.co/Wanfq/FuseLLM-7B), the fusion of three open-source foundation LLMs with distinct architectures, namely [Llama-2-7B](https://huggingface.co/meta-llama/Llama-2-7b-hf), [OpenLLaMA-7B](https://huggingface.co/openlm-research/open_llama_7b_v2), and [MPT-7B](https://huggingface.co/mosaicml/mpt-7b).
 
-| MPT-7B | 33.38 | 70.12 | 42.15 | 74.74 | 76.25 | 42.40 |
-| Llama-2-7B | 39.70 | 74.58 | 46.33 | 77.71 | 76.00 | 44.20 |
-| Llama-2-CLM-7B | 40.44 | 74.54 | 46.50 | 76.88 | 76.57 | 44.80 |
-| 🤗 [FuseLLM-7B](https://huggingface.co/Wanfq/FuseLLM-7B) | 41.75 | 75.04 | 47.44 | 78.13 | 76.78 | 45.40 |
+<p align="center">
+<img src="fig0.png" width="95%"> <br>
+</p>
 
-| MPT-7B | 17.26 | 28.89 | 23.54 | 70.08 | 5.49 | 0.88 |
-| Llama-2-7B | 14.63 | 52.46 | 27.25 | 73.28 | 6.48 | 0.14 |
-| Llama-2-CLM-7B | 14.83 | 53.14 | 28.51 | 73.45 | 6.91 | 0.94 |
-| 🤗 [FuseLLM-7B](https://huggingface.co/Wanfq/FuseLLM-7B) | 15.56 | 54.49 | 28.97 | 73.72 | 6.75 | 1.65 |
+<p align="center">
+<img src="fig1.png" width="95%"> <br>
+</p>
 
 
 ## Citation
@@ -102,8 +86,8 @@ Please cite the following paper if you reference our model, code, data, or paper
 ```
 @article{wan2024fusechat,
   title={FuseChat: Knowledge Fusion of Chat Models},
-  author={Fanqi Wan and
-  journal={arXiv preprint arXiv:
+  author={Fanqi Wan and Longguang Zhong and Ziyi Yang and Ruijun Chen and Xiaojun Quan},
+  journal={arXiv preprint arXiv:2408.07990},
   year={2024}
 }
 ```
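The FuseChat entries above all point to chat checkpoints on the Hugging Face Hub (FuseChat-7B-VaRM and FuseChat-7B-v2.0). As a minimal sketch only, not code taken from the README being diffed, the snippet below shows one way such a checkpoint is typically queried through the standard `transformers` causal-LM API; it assumes the tokenizer ships a chat template and that a bfloat16-capable GPU plus the `accelerate` package are available for `device_map="auto"`.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: either released FuseChat checkpoint loads as a plain causal LM.
model_id = "FuseAI/FuseChat-7B-VaRM"  # or "FuseAI/FuseChat-7B-v2.0"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumes a bf16-capable GPU
    device_map="auto",           # assumes `accelerate` is installed
)

# Assumption: the tokenizer provides a chat template; otherwise follow the
# prompt format documented on the model card.
messages = [{"role": "user", "content": "Summarize knowledge fusion of chat models in two sentences."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

Generation settings here (greedy decoding, 256 new tokens) are illustrative; the model cards remain the authoritative reference for recommended usage.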
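FuseLLM-7B, by contrast, is announced as a fused foundation model rather than a chat model, so plain text continuation is the natural interface. The sketch below reflects that assumption and is likewise illustrative rather than taken from the repository; tokenizer options such as `use_fast` may differ from what the model card recommends.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Wanfq/FuseLLM-7B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumes a bf16-capable GPU
    device_map="auto",           # assumes `accelerate` is installed
)

# FuseLLM-7B is a base model, so we simply continue a prompt.
prompt = "Knowledge fusion of large language models aims to"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```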