
Let's build the GPT Tokenizer
Date: 2024-02-20
Description
In this lecture, Andrej Karpathy builds the Tokenizer used in OpenAI's GPT series from scratch. Along the way, he shows that many weird behaviors and problems of LLMs actually trace back to tokenization, discusses why tokenization is at fault, and explains why someone out there should ideally find a way to delete this stage entirely.
Watch and like on YouTube
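The tokenizer covered in the lecture is based on byte-pair encoding (BPE): start from raw UTF-8 bytes and repeatedly merge the most frequent adjacent pair of tokens into a new token. A minimal sketch of the idea (the toy training text and function names here are illustrative, not taken from the lecture):

```python
# Minimal sketch of byte-pair encoding (BPE), the algorithm behind the GPT
# tokenizer: repeatedly merge the most frequent adjacent pair of token ids.
from collections import Counter

def get_pair_counts(ids):
    """Count occurrences of each adjacent token pair."""
    return Counter(zip(ids, ids[1:]))

def merge(ids, pair, new_id):
    """Replace every occurrence of `pair` in `ids` with `new_id`."""
    out, i = [], 0
    while i < len(ids):
        if i < len(ids) - 1 and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out

def train_bpe(text, num_merges):
    """Learn `num_merges` merge rules from the raw UTF-8 bytes of `text`."""
    ids = list(text.encode("utf-8"))
    merges = {}
    for step in range(num_merges):
        counts = get_pair_counts(ids)
        if not counts:
            break
        pair = max(counts, key=counts.get)  # most frequent adjacent pair
        new_id = 256 + step  # new token ids start after the 256 byte values
        ids = merge(ids, pair, new_id)
        merges[pair] = new_id
    return ids, merges

ids, merges = train_bpe("aaabdaaabac", 2)
print(ids, merges)  # the compressed sequence and the learned merge rules
```

The learned `merges` dictionary is the whole vocabulary-building step; encoding new text replays these merges in order, and decoding inverts them back to bytes.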