Update README.md
README.md CHANGED
@@ -16,7 +16,7 @@ pinned: false
 
 
 <h4> |<a href="https://arxiv.org/abs/2401.10491"> 📑 FuseLLM Paper @ICLR2024 </a> |
-<a href="https://arxiv.org/abs/
+<a href="https://arxiv.org/abs/2408.07990"> 📑 FuseChat Tech Report </a> |
 <a href="https://huggingface.co/FuseAI"> 🤗 HuggingFace Repo </a> |
 <a href="https://github.com/fanqiwan/FuseLLM"> 🐱 GitHub Repo </a> |
 </h4>
@@ -39,6 +39,8 @@ Welcome to join us!
 
 ### FuseChat [SOTA 7B LLM on MT-Bench]
 
+- **Aug 16, 2024:** 🔥🔥🔥🔥 We update the [FuseChat tech report](https://arxiv.org/abs/2408.07990) and release [FuseChat-7B-v2.0](https://huggingface.co/FuseAI/FuseChat-7B-v2.0), the fusion of six prominent chat LLMs with diverse architectures and scales, namely [OpenChat-3.5-7B](https://huggingface.co/openchat/openchat_3.5), [Starling-LM-7B-alpha](https://huggingface.co/berkeley-nest/Starling-LM-7B-alpha), [NH2-Solar-10.7B](https://huggingface.co/NousResearch/Nous-Hermes-2-SOLAR-10.7B), [InternLM2-Chat-20B](https://huggingface.co/internlm/internlm2-chat-20b), [Mixtral-8x7B-Instruct](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1), and [Qwen1.5-Chat-72B](https://huggingface.co/Qwen/Qwen1.5-72B-Chat). FuseChat-7B-v2.0 achieves an average score of **7.38** on MT-Bench (with GPT-4-0125-Preview as the judge), comparable to [Mixtral-8x7B-Instruct](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1) and approaching [GPT-3.5-Turbo-1106](https://platform.openai.com/docs/models/gpt-3-5-turbo).
+
 - **Mar 13, 2024:** 🔥🔥🔥 We release a HuggingFace Space for [FuseChat-7B](https://huggingface.co/spaces/FuseAI/FuseChat-7B); try it now!
 
 - **Feb 26, 2024:** 🔥🔥 We release [FuseChat-7B-VaRM](https://huggingface.co/FuseAI/FuseChat-7B-VaRM), the fusion of three prominent chat LLMs with diverse architectures and scales, namely [NH2-Mixtral-8x7B](https://huggingface.co/NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO), [NH2-Solar-10.7B](https://huggingface.co/NousResearch/Nous-Hermes-2-SOLAR-10.7B), and [OpenChat-3.5-7B](https://huggingface.co/openchat/openchat_3.5). FuseChat-7B-VaRM achieves an average score of **8.22** on MT-Bench, outperforming powerful chat LLMs such as [Starling-7B](https://huggingface.co/berkeley-nest/Starling-LM-7B-alpha), [Yi-34B-Chat](https://huggingface.co/01-ai/Yi-34B-Chat), and [Tulu-2-DPO-70B](https://huggingface.co/allenai/tulu-2-dpo-70b), surpassing [GPT-3.5 (March)](https://platform.openai.com/docs/models/gpt-3-5-turbo) and [Claude-2.1](https://www.anthropic.com/news/claude-2-1), and approaching [Mixtral-8x7B-Instruct](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1).
@@ -46,43 +48,25 @@ Welcome to join us!
 - **Feb 25, 2024:** 🔥 We release [FuseChat-Mixture](https://huggingface.co/datasets/FuseAI/FuseChat-Mixture), a comprehensive training dataset that covers different styles and capabilities, featuring both human-written and model-generated data, and spanning general instruction-following and specific skills.
 
 <p align="center">
-<img src="
+<img src="tab0.png" width="60%"> <br>
+</p>
+
+<p align="center">
+<img src="tab1.png" width="95%"> <br>
 </p>
 
-| Proprietary Models | #Params | MT-Bench | Open Source Models | #Params | MT-Bench |
-|--------------------|---------|----------|--------------------|---------|----------|
-| GPT-4-1106-preview | - | 9.32 | Qwen1.5-72B-Chat | 72B | 8.61 |
-| GPT-4-0613 | - | 9.18 | Nous-Hermes-2-Mixtral-8x7B-DPO | 8x7B | 8.33 |
-| GPT-4-0314 | - | 8.96 | Mixtral-8x7B-Instruct-v0.1 | 8x7B | 8.30 |
-| Mistral Medium | - | 8.61 | 🤗 [FuseChat-7B-VaRM](https://huggingface.co/FuseAI/FuseChat-7B-VaRM) | 7B | 8.22 |
-| GPT-3.5-Turbo-0613 | - | 8.39 | Starling-LM-7B-alpha | 7B | 8.09 |
-| GPT-3.5-Turbo-1106 | - | 8.32 | Tulu-2-DPO-70B | 70B | 7.89 |
-| 🤗 [FuseChat-7B-VaRM](https://huggingface.co/FuseAI/FuseChat-7B-VaRM) | 7B | 8.22 | OpenChat-3.5 | 7B | 7.81 |
-| Claude-2.1 | - | 8.18 | OpenChat-3.5-0106 | 7B | 7.80 |
-| Claude-2.0 | - | 8.06 | WizardLM-70B-v1.0 | 70B | 7.71 |
-| GPT-3.5-Turbo-0314 | - | 7.94 | Yi-34B-Chat | 34B | 7.67 |
-| Claude-1 | - | 7.90 | Nous-Hermes-2-SOLAR-10.7B | 10.7B | 7.66 |
 
 ### FuseLLM [Surpassing Llama-2-7B]
 
 - **Jan 22, 2024:** 🔥 We release [FuseLLM-7B](https://huggingface.co/Wanfq/FuseLLM-7B), the fusion of three open-source foundation LLMs with distinct architectures, namely [Llama-2-7B](https://huggingface.co/meta-llama/Llama-2-7b-hf), [OpenLLaMA-7B](https://huggingface.co/openlm-research/open_llama_7b_v2), and [MPT-7B](https://huggingface.co/mosaicml/mpt-7b).
 
-| MPT-7B | 33.38 | 70.12 | 42.15 | 74.74 | 76.25 | 42.40 |
-| Llama-2-7B | 39.70 | 74.58 | 46.33 | 77.71 | 76.00 | 44.20 |
-| Llama-2-CLM-7B | 40.44 | 74.54 | 46.50 | 76.88 | 76.57 | 44.80 |
-| 🤗 [FuseLLM-7B](https://huggingface.co/Wanfq/FuseLLM-7B) | 41.75 | 75.04 | 47.44 | 78.13 | 76.78 | 45.40 |
+<p align="center">
+<img src="fig0.png" width="95%"> <br>
+</p>
 
-| MPT-7B | 17.26 | 28.89 | 23.54 | 70.08 | 5.49 | 0.88 |
-| Llama-2-7B | 14.63 | 52.46 | 27.25 | 73.28 | 6.48 | 0.14 |
-| Llama-2-CLM-7B | 14.83 | 53.14 | 28.51 | 73.45 | 6.91 | 0.94 |
-| 🤗 [FuseLLM-7B](https://huggingface.co/Wanfq/FuseLLM-7B) | 15.56 | 54.49 | 28.97 | 73.72 | 6.75 | 1.65 |
+<p align="center">
+<img src="fig1.png" width="95%"> <br>
+</p>
 
 
 ## Citation
@@ -102,8 +86,8 @@ Please cite the following paper if you reference our model, code, data, or paper
 ```
 @article{wan2024fusechat,
   title={FuseChat: Knowledge Fusion of Chat Models},
-  author={Fanqi Wan and
-  journal={arXiv preprint arXiv:
+  author={Fanqi Wan and Longguang Zhong and Ziyi Yang and Ruijun Chen and Xiaojun Quan},
+  journal={arXiv preprint arXiv:2408.07990},
   year={2024}
 }
 ```
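The FuseChat entries above all point to chat checkpoints on the Hugging Face Hub (FuseChat-7B-VaRM and FuseChat-7B-v2.0). As a minimal sketch only, not code taken from the README being diffed, the snippet below shows one way such a checkpoint is typically queried through the standard `transformers` causal-LM API; it assumes the tokenizer ships a chat template and that a bfloat16-capable GPU plus the `accelerate` package are available for `device_map="auto"`.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: either released FuseChat checkpoint loads as a plain causal LM.
model_id = "FuseAI/FuseChat-7B-VaRM"  # or "FuseAI/FuseChat-7B-v2.0"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumes a bf16-capable GPU
    device_map="auto",           # assumes `accelerate` is installed
)

# Assumption: the tokenizer provides a chat template; otherwise follow the
# prompt format documented on the model card.
messages = [{"role": "user", "content": "Summarize knowledge fusion of chat models in two sentences."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

Generation settings here (greedy decoding, 256 new tokens) are illustrative; the model cards remain the authoritative reference for recommended usage.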
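FuseLLM-7B, by contrast, is announced as a fused foundation model rather than a chat model, so plain text continuation is the natural interface. The sketch below reflects that assumption and is likewise illustrative rather than taken from the repository; tokenizer options such as `use_fast` may differ from what the model card recommends.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Wanfq/FuseLLM-7B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumes a bf16-capable GPU
    device_map="auto",           # assumes `accelerate` is installed
)

# FuseLLM-7B is a base model, so we simply continue a prompt.
prompt = "Knowledge fusion of large language models aims to"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```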