
Chapter 5: Pretraining on Unlabeled Data

Main Chapter Code

  • ch05.ipynb contains all the code as it appears in the chapter
  • previous_chapters.py is a Python module that contains the MultiHeadAttention module and GPTModel class from the previous chapters, which we import in ch05.ipynb to pretrain the GPT model
  • gpt_download.py contains the utility functions for downloading the pretrained GPT model weights
  • exercise-solutions.ipynb contains the exercise solutions for this chapter
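
Before opening ch05.ipynb, it can help to confirm that previous_chapters.py is importable and exposes the two names mentioned in the list above. The following is a minimal sketch, not part of the repository: the helper name `missing_exports` is hypothetical, and it assumes the script is run from this directory so that `previous_chapters` is on the import path. Importing the module also requires its own dependencies (such as PyTorch) to be installed.

```python
import importlib


def missing_exports(module_name, expected_names):
    """Return the names from expected_names that the module does not define."""
    module = importlib.import_module(module_name)
    return [name for name in expected_names if not hasattr(module, name)]


if __name__ == "__main__":
    # Names taken from the description of previous_chapters.py above.
    expected = ["MultiHeadAttention", "GPTModel"]
    try:
        missing = missing_exports("previous_chapters", expected)
    except ImportError as err:
        # Raised when the module (or one of its dependencies) cannot be found.
        print(f"Could not import previous_chapters: {err}")
    else:
        print("Missing names:", missing or "none")
```

If the check reports no missing names, an import such as `from previous_chapters import GPTModel` should succeed in the notebook as well.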

Optional Code

  • gpt_train.py is a standalone Python script with the code that we implemented in ch05.ipynb to train the GPT model (you can think of it as a code file summarizing this chapter)
  • gpt_generate.py is a standalone Python script with the code that we implemented in ch05.ipynb to load and use the pretrained model weights from OpenAI
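
Because both optional files are standalone scripts, they can be launched directly with the Python interpreter. The sketch below is illustrative only: `script_command` and `run_script` are hypothetical helpers, and running the scripts without command-line arguments is an assumption that this README does not state.

```python
import subprocess
import sys
from pathlib import Path


def script_command(script_name, directory="."):
    """Build the command list that runs a script with the current interpreter."""
    return [sys.executable, str(Path(directory) / script_name)]


def run_script(script_name, directory="."):
    """Run the script if it exists; return its exit code, or None if it is missing."""
    path = Path(directory) / script_name
    if not path.is_file():
        return None
    # check=False: report the exit code instead of raising on failure.
    result = subprocess.run(script_command(script_name, directory), check=False)
    return result.returncode


if __name__ == "__main__":
    # Script names taken from the list above; no arguments are passed (assumption).
    for name in ("gpt_train.py", "gpt_generate.py"):
        code = run_script(name)
        print(name, "not found" if code is None else f"exited with code {code}")
```

Using `sys.executable` keeps the scripts in the same environment as the caller, which matters when the dependencies are installed in a virtual environment.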