Default Branch

8447d70b18 · Some gemma 3 improvements (#1000) · Updated 2026-04-06 10:05:05 +08:00

Branches

fd11713ed9 · exclude torchvission from nightly · Updated 2026-04-05 01:17:50 +08:00    git-market

2 behind · 2 ahead

9f7dbb2493 · Update docker file · Updated 2025-10-07 07:31:59 +08:00    git-market

76 behind · 1 ahead

b1f852c1ba · Update requirements.txt · Updated 2025-09-27 10:57:22 +08:00    git-market

84 behind · 2 ahead

862df48e38 · use apply_chat_template · Updated 2025-09-16 21:12:01 +08:00    git-market

91 behind · 9 ahead

8fd29ed079 · Gemma 3 270M from scratch · Updated 2025-08-17 08:49:38 +08:00    git-market

804 behind · 679 ahead

06aa6d470a · Fix eos token usage in Qwen3 tokenizer · Updated 2025-08-06 02:42:18 +08:00    git-market

128 behind · 1 ahead

4aa398c79d · Comment typo: head_dim -> head_dim // 2 · Updated 2025-07-23 21:16:30 +08:00    git-market

131 behind · 1 ahead

1552023bd4 · Fix issue 724: unused args (#726) · Updated 2025-07-11 01:58:32 +08:00    git-market

143 behind · 9 ahead

713a6e24c9 · add tests · Updated 2025-06-23 06:48:23 +08:00    git-market

162 behind · 2 ahead

4715dc3be5 · remove redundant context_length in GQA · Updated 2025-04-01 05:49:10 +08:00    git-market

206 behind · 3 ahead

ca0eee4cf9 · simplify and use pythorch 3.12 · Updated 2025-02-20 11:01:15 +08:00    git-market

238 behind · 7 ahead