lyiase's Bookmarks / Hatena Bookmark

あとで読む (read later)

lyiase's bookmark, 2024/01/22 10:47

<blockquote class="hatena-bookmark-comment"><a class="comment-info" href="https://b.hatena.ne.jp/entry/4748154503094025391/comment/lyiase" data-user-id="lyiase" data-entry-url="https://b.hatena.ne.jp/entry/s/arxiv.org/abs/2401.10020" data-original-href="https://arxiv.org/abs/2401.10020" data-entry-favicon="https://cdn-ak2.favicon.st-hatena.com/64?url=https%3A%2F%2Farxiv.org%2Fabs%2F2401.10020" data-user-icon="/users/lyiase/profile.png">Self-Rewarding Language Models</a><ul class="comment-tag" style="list-style: none; margin: 0px;"><li style="float: left">[<a href="https://b.hatena.ne.jp/q/%E3%81%82%E3%81%A8%E3%81%A7%E8%AA%AD%E3%82%80">あとで読む</a>]</li></ul><br><p style="clear: left"></p><a class="datetime" href="https://b.hatena.ne.jp/lyiase/20240122#bookmark-4748154503094025391"><span class="datetime-body">2024/01/22 10:47</span></a></blockquote><script src="https://b.st-hatena.com/js/comment-widget.js" charset="utf-8" async></script>


Self-Rewarding Language Models

arxiv.org · 2024/01/21

We posit that to achieve superhuman agents, future models require superhuman feedback in order to provide an adequate training signal. Current approaches commonly train reward models from human pre...

17 users bookmarked · 1 comment
