Developer breaks 'The Simpsons' down by the numbers

The writers and animators behind 'The Simpsons' have built a vast fictional world that presents plenty of opportunities for analysis.
By Laura Vitto

In the last 27 years, the writers and animators behind The Simpsons have built a vast fictional world that presents plenty of opportunities for analysis.

Todd W. Schneider, a developer at Genius, took a deep dive into Simpsons data using scripts pulled from Simpsons World. From there, he wrote code that organizes dialogue by character and then ranks each character by number of words spoken.
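A rough sketch of that kind of tally might look like the following. This is not Schneider's code: the function name `words_by_character`, the input format (pairs of character name and spoken text), and the sample lines are all placeholders invented for illustration.

```python
from collections import Counter


def words_by_character(lines):
    """Total the words spoken per character and rank characters by that total.

    `lines` is an iterable of (character, spoken_text) pairs. This input
    format is an assumption, not the layout of the real script data.
    """
    totals = Counter()
    for character, text in lines:
        # Naive whitespace tokenization; a real analysis would likely
        # normalize punctuation and casing first.
        totals[character] += len(text.split())
    # most_common() returns (character, word_count) pairs, largest first.
    return totals.most_common()


# Placeholder dialogue, invented for this example only.
sample = [
    ("Homer", "Mmm donuts"),
    ("Marge", "Homer, please be careful"),
    ("Homer", "Why you little"),
]
ranking = words_by_character(sample)
```

Dividing each character's total by the sum of all totals would give the share-of-dialogue percentages the analysis reports.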

Schneider's findings, published in a post on his personal website, cover both main and supporting character dialogue and the locations within the show where these conversations take place. His piece also takes a larger look at the show's declining ratings, and how those numbers stack up against wider trends in TV viewership.


In terms of words spoken by main and supporting characters, the Simpson family itself unsurprisingly accounts for 47% of the show's dialogue. But one of Schneider's more interesting findings has to do with the number of words spoken by female characters.

According to his analysis, female characters overall account for 25% of the show's dialogue. But take Marge and Lisa out of the equation, and Schneider writes that the percentage drops to below 10%. The analysis highlights a disparity between the number of words spoken by male characters and the number spoken by female characters.

"A look at the show’s list of writers reveals that 9 of the top 10 writers are male," he writes. "I did not collect data on which writers wrote which episodes, but it would make for an interesting follow-up to see if the episodes written by women have a more equal distribution of dialogue between male and female characters."

Schneider also uses a statistical measure called term frequency-inverse document frequency (tf-idf) to pull specific keywords that relate to each episode. For each script, tf-idf gives high scores to words that appear often in that episode but rarely in the rest of the series, which surfaces the terms most distinctive to it. For example, the keyword assigned to Season 5 episode "Cape Feare" is "Sideshow Bob," which makes sense considering the character's heavy role in that episode's plotline.
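For readers curious how tf-idf works in practice, here is a minimal sketch using one common formulation (term frequency times the log of inverse document frequency). It is not Schneider's implementation, which may weight or tokenize differently. The function name `tfidf_keywords`, the episode labels, and the toy word lists are invented for illustration, and each episode is assumed to contain at least one word.

```python
import math
from collections import Counter


def tfidf_keywords(docs, top_n=1):
    """Return the top_n highest-scoring tf-idf words for each document.

    `docs` maps an episode label to a non-empty list of word tokens
    (a hypothetical input format chosen for this sketch).
    """
    n_docs = len(docs)
    # Document frequency: in how many episodes does each word appear?
    doc_freq = Counter()
    for tokens in docs.values():
        doc_freq.update(set(tokens))

    keywords = {}
    for name, tokens in docs.items():
        counts = Counter(tokens)
        scores = {
            # tf = share of this episode's words; idf = log(N / doc frequency).
            word: (count / len(tokens)) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        }
        # Sort by score descending, breaking ties alphabetically.
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        keywords[name] = ranked[:top_n]
    return keywords


# Toy "scripts", invented for this example only.
episodes = {
    "episode_a": "bob chases bart and bob sings".split(),
    "episode_b": "homer eats and bart laughs".split(),
    "episode_c": "lisa plays sax and homer naps".split(),
}
top = tfidf_keywords(episodes)
```

In this toy data, "and" appears in every episode, so its idf is log(1) = 0 and it can never be a keyword, while "bob" is both frequent in `episode_a` and absent elsewhere, so it scores highest there.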

Check out the keywords for each episode (through Season 26, at least), and read more about Schneider's fascinating analysis on his website.

Laura Vitto

Laura Vitto was Mashable's Deputy Culture Editor.
