Mirror of https://github.com/rasbt/LLMs-from-scratch.git (synced 2026-04-11 02:11:40 +08:00)
Chapter 2: Working with Text Data
Main Chapter Code
- 01_main-chapter-code contains the main chapter code and exercise solutions
Bonus Materials
- 02_bonus_bytepair-encoder contains optional code to benchmark different byte pair encoder implementations
- 03_bonus_embedding-vs-matmul contains optional (bonus) code to explain that embedding layers and fully connected layers applied to one-hot encoded vectors are equivalent
- 04_bonus_dataloader-intuition contains optional (bonus) code to explain the data loader more intuitively with simple numbers rather than text
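
The equivalence mentioned for 03_bonus_embedding-vs-matmul can be illustrated with a minimal sketch. This is not the code from that folder: it uses plain Python lists instead of PyTorch, and the weight values and the helper names (`embedding_lookup`, `one_hot`, `linear_on_one_hot`) are hypothetical, chosen only for illustration. The idea is that multiplying a one-hot vector by a weight matrix selects exactly one row, which is the same row an embedding lookup returns.

```python
# Hypothetical toy weights: 4 token IDs (rows) x 3 embedding dimensions (columns).
weights = [
    [0.1, 0.2, 0.3],
    [1.0, 1.1, 1.2],
    [2.0, 2.1, 2.2],
    [3.0, 3.1, 3.2],
]


def embedding_lookup(token_ids, weights):
    """Embedding layer view: return the weight row for each token ID."""
    return [list(weights[token_id]) for token_id in token_ids]


def one_hot(token_id, vocab_size):
    """Vector of zeros with a single 1.0 at position token_id."""
    return [1.0 if j == token_id else 0.0 for j in range(vocab_size)]


def linear_on_one_hot(token_ids, weights):
    """Fully connected view: one-hot vector times the weight matrix (no bias)."""
    vocab_size = len(weights)
    dim = len(weights[0])
    outputs = []
    for token_id in token_ids:
        x = one_hot(token_id, vocab_size)
        # Every term is multiplied by 0.0 except the one for row token_id.
        row = [sum(x[i] * weights[i][d] for i in range(vocab_size)) for d in range(dim)]
        outputs.append(row)
    return outputs


token_ids = [2, 0, 3]
looked_up = embedding_lookup(token_ids, weights)
multiplied = linear_on_one_hot(token_ids, weights)

print(looked_up)                # [[2.0, 2.1, 2.2], [0.1, 0.2, 0.3], [3.0, 3.1, 3.2]]
print(looked_up == multiplied)  # True
```

Both paths give the same vectors, but the lookup skips the multiplications by zero, which is why an embedding layer is the cheaper way to compute the same result. In PyTorch, `torch.nn.Linear` stores its weight with shape `(out_features, in_features)`, so a comparison there would use the transpose of the embedding weight matrix.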