bitsandbytes documentation

Papers, related resources, and how to cite

The following academic works are listed in reverse chronological order.

SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression (June 2023)

Authors: Tim Dettmers, Ruslan Svirschevski, Vage Egiazarian, Denis Kuznedelev, Elias Frantar, Saleh Ashkboos, Alexander Borzunov, Torsten Hoefler, Dan Alistarh

@article{dettmers2023spqr,
  title={SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression},
  author={Dettmers, Tim and Svirschevski, Ruslan and Egiazarian, Vage and Kuznedelev, Denis and Frantar, Elias and Ashkboos, Saleh and Borzunov, Alexander and Hoefler, Torsten and Alistarh, Dan},
  journal={arXiv preprint arXiv:2306.03078},
  year={2023}
}

QLoRA: Efficient Finetuning of Quantized LLMs (May 2023)

Authors: Tim Dettmers, Artidoro Pagnoni, Ari Holtzman, Luke Zettlemoyer

@article{dettmers2023qlora,
  title={{QLoRA}: Efficient Finetuning of Quantized {LLMs}},
  author={Dettmers, Tim and Pagnoni, Artidoro and Holtzman, Ari and Zettlemoyer, Luke},
  journal={arXiv preprint arXiv:2305.14314},
  year={2023}
}
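
The QLoRA recipe (NF4 storage data type, double quantization, and a higher-precision compute data type) is what bitsandbytes implements for 4-bit loading. The following is a minimal sketch using the Transformers integration; the model id is a placeholder, and the exact keyword arguments assume a recent transformers and bitsandbytes install:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# QLoRA-style 4-bit quantization: NF4 storage, double quantization,
# and bfloat16 compute, as described in the paper.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-350m",  # placeholder model id
    quantization_config=bnb_config,
)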

The case for 4-bit precision: k-bit inference scaling laws (December 2022)

Authors: Tim Dettmers, Luke Zettlemoyer

@inproceedings{dettmers2023case,
  title={The case for 4-bit precision: k-bit inference scaling laws},
  author={Dettmers, Tim and Zettlemoyer, Luke},
  booktitle={International Conference on Machine Learning},
  pages={7750--7774},
  year={2023},
  organization={PMLR}
}

LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale (November 2022)

Authors: Tim Dettmers, Mike Lewis, Younes Belkada, Luke Zettlemoyer

@article{dettmers2022llm,
  title={{LLM.int8()}: 8-bit Matrix Multiplication for Transformers at Scale},
  author={Dettmers, Tim and Lewis, Mike and Belkada, Younes and Zettlemoyer, Luke},
  journal={arXiv preprint arXiv:2208.07339},
  year={2022}
}
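
LLM.int8() is the scheme behind bitsandbytes' 8-bit inference: weights are stored in int8, while outlier feature dimensions are handled in 16-bit. A minimal sketch of loading a model this way through the Transformers integration, again with a placeholder model id:

from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# LLM.int8(): int8 weight storage with mixed-precision decomposition
# for outlier features, as described in the paper.
bnb_config = BitsAndBytesConfig(load_in_8bit=True)

model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-350m",  # placeholder model id
    quantization_config=bnb_config,
)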

8-bit Optimizers via Block-wise Quantization (October 2021)

Authors: Tim Dettmers, Mike Lewis, Sam Shleifer, Luke Zettlemoyer

@article{DBLP:journals/corr/abs-2110-02861,
  author       = {Tim Dettmers and
                  Mike Lewis and
                  Sam Shleifer and
                  Luke Zettlemoyer},
  title        = {8-bit Optimizers via Block-wise Quantization},
  journal      = {CoRR},
  volume       = {abs/2110.02861},
  year         = {2021},
  url          = {https://arxiv.org/abs/2110.02861},
  eprinttype   = {arXiv},
  eprint       = {2110.02861},
  timestamp    = {Thu, 21 Oct 2021 16:20:08 +0200},
  biburl       = {https://dblp.org/rec/journals/corr/abs-2110-02861.bib},
  bibsource    = {dblp computer science bibliography, https://dblp.org}
}
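
Block-wise 8-bit optimizer states are exposed in bitsandbytes.optim as drop-in replacements for the corresponding torch.optim classes. A minimal sketch on a toy CUDA model; Adam8bit mirrors the torch.optim.Adam interface:

import torch
import bitsandbytes as bnb

# Toy model; 8-bit optimizer states pay off on large layers.
model = torch.nn.Linear(4096, 4096).cuda()

# Drop-in replacement for torch.optim.Adam: optimizer states are
# quantized block-wise to 8 bits instead of being kept in 32-bit.
optimizer = bnb.optim.Adam8bit(model.parameters(), lr=1e-3)

out = model(torch.randn(8, 4096, device="cuda"))
out.sum().backward()
optimizer.step()
optimizer.zero_grad()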