Tumblr users, here's what to know about Tumblr selling your data to OpenAI and Midjourney

Parent company Automattic will reportedly sell Tumblr content to OpenAI and Midjourney for training data. Here's how you can opt out.
By Elizabeth de Luna

OpenAI and image generator Midjourney will soon pay to train their AI models using public Tumblr content, according to internal documents reviewed by the site 404 Media.

404 Media has reported that a deal is "imminent" between Tumblr parent company Automattic and the two AI giants but could not specify what types of data would be sold to each company. The deal also reportedly includes the sale of data from WordPress.com, another Automattic property.

Posts detailing how user content is used for AI training were published on Feb. 27 on the staff blogs of both Tumblr and WordPress.com. However, the posts did not tell users that Automattic was in talks to sell that data.


Here's what you need to know about how the sale may affect your Tumblr content.

Which content will Automattic reportedly sell?

404 Media has reported that the documents it reviewed did not specify the types of data that would be sold to each company. It is also unclear whether this deal will affect only future Tumblr posts or encompass past content as well. AI companies have been criticized for their rampant use of "publicly available" content to train their models, since much of what is publicly available online is still protected by copyright.

According to a support article on OpenAI's website, "ChatGPT and our other services are developed using information that is publicly available on the internet," among other sources. Presumably, OpenAI has already scraped and used any and all content once publicly available on Tumblr. Given that, the current deal could serve as a sort of mea culpa on the part of OpenAI and Midjourney as they offer to pay for the use of all future Tumblr content as well.

Automattic did not respond to requests for comment from 404 Media regarding the deal but posted a statement called "Protecting User Choice" in which the company wrote, "We currently block, by default, major AI platform crawlers—including ones from the biggest tech companies—and update our lists as new ones launch." It is unclear when the site began blocking the crawlers, which matters because OpenAI has been training its models on public content for years.

How do I opt out?

To opt out of sharing your public Tumblr content with third parties, you'll need to toggle on a new "Prevent third-party sharing" option in the settings of each individual blog you run. This needs to be done in a web browser, not through the Tumblr app. These updates have been added to Tumblr's support article about user privacy.

If you have previously elected to discourage searching of your blog, the new "Prevent third-party sharing" option will be toggled on by default.

But what if you decide to forgo toggling on the setting now, opting instead to do it in three months? 404 Media reported that, in a document it accessed from Feb. 23, a Tumblr staff member asked a question addressing this issue. "Do we have assurances," they wrote, "that if a user opts out of their data being shared with third parties that our existing data partners will be notified of such a change and remove their data?"

Automattic’s head of AI, Andrew Spittle, replied, "We will notify existing partners on a regular basis about anyone who's opted out... I want this to be an ongoing process where we regularly advocate for past content to be excluded based on current preferences. We will ask that content be deleted and removed from any future training runs. I believe partners will honor this based on our conversations with them to this point."

Is this normal?

It certainly seems to be, at the very least, the new normal. OpenAI is licensing news stories from the Associated Press and is reportedly in talks to do the same with CNN, Time, and Fox. Reddit is working with Google to monetize its database of content.

It was just a matter of time before Automattic started selling its own data, especially considering how much money it's losing on Tumblr. In its entire 17-year history, the site has never been profitable, and Automattic has failed to turn it around. In November, TechCrunch reported that resources had been diverted from the struggling site to support projects elsewhere within Automattic.

Elizabeth de Luna
Culture Reporter

Elizabeth is a digital culture reporter covering the internet's influence on self-expression, fashion, and fandom. Her work explores how technology shapes our identities, communities, and emotions. Before joining Mashable, Elizabeth spent six years in tech. Her reporting can be found in Rolling Stone, The Guardian, TIME, and Teen Vogue.
