Notion leaks email addresses of all editors of any public page (twitter.com)

by Tiberium 154 comments 401 points

[−] Tiberium 25d ago
Apparently this is officially documented at https://www.notion.com/help/public-pages-and-web-publishing#... buried in a note:

> When you publish a Notion page to the web, the webpage’s metadata may include the names, profile photos, and email addresses associated with any Notion users that have contributed to the page.

[−] EMM_386 25d ago
That's just ... absurd.

The flaw itself is absurd but then just accepting it as "by design" makes it even worse.

[−] chinathrow 25d ago
It's also trivially easy to fix. 1 min delete and deploy.
[−] varenc 25d ago
I'm guessing it's not trivial to fix without breaking other things? The weakness seems to be that anyone can turn UUIDs into details like email. But I assume this functionality is necessary for other flows, so they can't just turn off all UUID->email/profile lookups. And similarly, hiding author UUIDs on posts also isn't trivial.

Conceptually, I agree it should be easy, but I suspect they're stuck with legacy code and behaviors that rely on the current system. Not breaking anything else while fixing this is likely the time-consuming part.

[−] reactordev 25d ago
This is a rendering artifact, nothing more. If you can tokenize and protect PII on your platform, you can protect PII on your public pages.

    if (metadata.is_public)
Simple fix.
[−] varenc 25d ago
But a user's email isn't always forbidden. The API endpoint which turns UUIDs into a user email presumably also has use cases where you do want to expose the user email. For example, when seeing a list of people you've already invited via email to collaborate with, or listing users within your organization, etc. So a user's email isn't always forbidden PII, it depends on the context.

The trouble is the UUID->email endpoint has no idea what the context is and that endpoint alone can't decide if it should expose email or not. And then public Notion docs publicly expose author UUIDs.

Their mistake was architecting things this way. From day 1 they should have cleanly separated public identifiers from privileged ones. Or have more bespoke endpoints for looking up a UUID's email for each of the narrow contexts in which this is allowed. They didn't do this, and they certainly should have, but fixing this mess is likely a non-trivial amount of work. Though I bet it could be done immediately if they really cared and didn't mind other things breaking.

I'm absolutely not defending their choice to expose emails in this way. They should have addressed this years ago when it was first reported, and I want them shamed for failing to care. But just trying to say it's likely not a one line fix.
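
A minimal sketch of the "bespoke lookup per context" idea described above, in Python. Every name here (`LookupContext`, `User`, `UserView`, `resolve_user`) is hypothetical and chosen for illustration; nothing is taken from Notion's actual API. The point is only that the caller must state the context, and the email is returned in the narrow contexts that allow it.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class LookupContext(Enum):
    """Hypothetical contexts in which a user ID might be resolved."""
    PUBLIC_PAGE = auto()           # anonymous visitor viewing a published page
    SAME_WORKSPACE = auto()        # requester shares a workspace with the user
    INVITED_BY_REQUESTER = auto()  # requester already entered this email in an invite


@dataclass(frozen=True)
class User:
    user_id: str
    name: str
    email: str


@dataclass(frozen=True)
class UserView:
    user_id: str
    name: str
    email: Optional[str]  # None whenever the context does not permit it


# Assumption: only these contexts may see the email address.
EMAIL_ALLOWED = {LookupContext.SAME_WORKSPACE, LookupContext.INVITED_BY_REQUESTER}


def resolve_user(user: User, context: LookupContext) -> UserView:
    """Return a view of the user, withholding the email outside allowed contexts."""
    email = user.email if context in EMAIL_ALLOWED else None
    return UserView(user_id=user.user_id, name=user.name, email=email)
```

With this shape, a public page would call `resolve_user(user, LookupContext.PUBLIC_PAGE)` and get `email=None`. The hard part the comment points at remains: every existing caller of the old context-free lookup has to be found and given a context.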

[−] reactordev 24d ago
A user's email should always be forbidden…

It is not a public marker, it’s PII.

[−] chinathrow 25d ago
Of course they can fix it, come on.

They can easily withhold information they put out intentionally.

[−] csallen 25d ago
The whole point of that comment is that it's not that easy. There are potential side effects and consequences that are difficult to architect around.
[−] chinathrow 24d ago
The fix IS easy. The side effects need to be dealt with accordingly. Why do you defend shit like this?
[−] markdown 25d ago
Except it is.

If you can't easily architect around it, then don't do what you're trying to do.

"Oh I needed to disclose user data in order to make more money" isn't an acceptable excuse.

[−] csallen 25d ago
No one's talking about excuses.
[−] chinathrow 24d ago
Looks like everyone does talk about excuses though.
[−] sysguest 24d ago

> Oh I needed to disclose user data in order to make more money

hmm maybe they should've paywalled?

[−] UqWBcuFx6NV4r 25d ago
You literally don’t know that. Add this to the mammoth file titled “HN comments in which the author makes some completely unsubstantiated technical claim”
[−] account42 17d ago
It literally is easy to fix. For example they could shut down the servers. Which is what they should do immediately if there is no faster fix for a privacy leak like that.
[−] chinathrow 25d ago
This is, as a notion user with public pages, beyond stupid.
[−] ArchieScrivener 25d ago
Don't attribute to stupidity what can be explained by malice.
[−] mikae1 25d ago
Some CMSs do this in their RSS feeds as well. Can't recall which ones, but seen it.
[−] lioeters 25d ago
Recently I checked back on Notion after a year or so of not seeing it. I was going to recommend it to someone as an example of hypertext, but I see now it calls itself an "AI workplace that works for you" and "Your AI everything app". This company means nothing now, seriously what happened.
[−] mschoening 25d ago
Hi, this is Max from Notion.

First: This is documented and we also warn users when they publish a page. But, that’s not good enough!

Second: We don’t like this and are looking at ways to fix this either by removing the PII from the public endpoints or by replacing it with an email proxy similar to GitHub’s equivalent functionality for public commits.

P.S: Some folks here have speculated that this should be a 1 minute fix. Unfortunately that is not the case. :(
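
For readers wondering what an email proxy could look like, here is a rough sketch in Python. It is not Notion's design and not GitHub's implementation; the domain `users.noreply.example.com`, the function names, and the use of an HMAC over the user ID are all assumptions made for illustration. The idea is that public metadata carries a stable, opaque alias instead of the real address.

```python
import hashlib
import hmac

# Hypothetical proxy domain, used only for this example.
PROXY_DOMAIN = "users.noreply.example.com"


def proxy_email(user_id: str, secret: bytes) -> str:
    """Derive a stable alias from the user ID without revealing the real email.

    The server-side secret keeps outsiders from recomputing the alias
    for a known user ID.
    """
    digest = hmac.new(secret, user_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{digest[:16]}@{PROXY_DOMAIN}"


def public_author(record: dict, secret: bytes) -> dict:
    """Build the author entry for public page metadata.

    Only the display name and the proxy alias are copied; the real
    email in `record` is never included.
    """
    return {"name": record["name"], "email": proxy_email(record["id"], secret)}
```

The alias is deterministic, so the same editor maps to the same address across pages. Whether mail sent to it is forwarded or dropped is a separate decision this sketch does not cover.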

[−] RomanPushkin 25d ago
It has been an issue for at least 5 years. I remember one dude from HN deanonymized me around 5 years ago by looking at my notion page.
[−] linsomniac 25d ago
Very timely. I literally ran a Claude prompt "compare and contrast Notion vs Obsidian" and flipped over to HN while it was thinking, and this comes up. Thanks HN!
[−] DropDead 25d ago
Big companies need to start caring more about the security and privacy of their users and employees
[−] amazingamazing 25d ago
I've been toying around with an architecture that sets things up such that the data for each user is actually stored with each user and only materialized on demand, such that many data leaks would yield little since the server doesn't actually store most of the user data. I mention this since these sorts of leaks are inevitable as long as people are fallible. I feel the correct solution is to not store user data to begin with.

some problems I've identified:

1. suppose you have x users and y groups, each of which requires some subset of the x users. joining the data on demand can become expensive, O(x*y).

2. the main usefulness of such an architecture is if the data itself is stored with the user, but as group sizes y increase, a single user's data being offline makes aggregate use cases more difficult. this would lend itself to replicating the data server side, but that would defeat the purpose

3. assuming the previous two are solved, which is very difficult to say the least, how do you secure the data for the user such that someone who knows about this architecture can't just go to the clients and trivially scrape all of the data (per user)?

4. how do you allow for these features without allowing people to modify their data in ways you don't want to allow? encryption?

a concrete example of this would be if HN had it so that each user had a sqlite database that stored all of the posts made by that user. then, the HN server would actually go and fetch the data for each of the posters to then show the regular page. presumably here if a given user's data is inaccessible then it would be omitted.
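
a rough sketch of that HN example in Python, using in-memory SQLite databases as stand-ins for per-user stores. the names (`make_user_store`, `render_thread`) and the `posts` schema are made up for illustration, and a real system would reach each store over the network rather than hold a connection. a store that cannot be reached is modelled as `None` and its posts are simply left out.

```python
import sqlite3
from typing import Dict, List, Optional, Tuple


def make_user_store(posts: List[Tuple[int, str]]) -> sqlite3.Connection:
    """Create one user's private store holding (thread_id, body) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE posts (thread_id INTEGER, body TEXT)")
    conn.executemany("INSERT INTO posts VALUES (?, ?)", posts)
    return conn


def render_thread(
    thread_id: int, stores: Dict[str, Optional[sqlite3.Connection]]
) -> List[Tuple[str, str]]:
    """Materialize a thread on demand by querying every user's store.

    Users whose store is unreachable (None) are omitted from the page.
    """
    rendered: List[Tuple[str, str]] = []
    for user, conn in stores.items():
        if conn is None:
            continue  # store offline: skip this user's posts
        rows = conn.execute(
            "SELECT body FROM posts WHERE thread_id = ?", (thread_id,)
        ).fetchall()
        rendered.extend((user, body) for (body,) in rows)
    return rendered
```

the loop makes problem 1 visible: rendering a single page costs one query per user store, so the work grows with both the number of users and the number of pages.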

[−] jdgiese 25d ago
I love Notion and use it extremely heavily. I've also built a few integrations with Notion. I think it's a great app that uses AI very well, and they continue improving. Hopefully they fix this though! Also, their API has recently been upgraded quite a bit and now supports database views as a first class object. I have a few other small requests regarding their public API.
[−] VladVladikoff 25d ago
The tweet is only a few words, you really need an LLM to write that for you???
[−] georgespencer 25d ago
Notion’s macOS app is some of the worst software I’ve ever used. If there is a platform design idiom, they likely break it without a second thought.
[−] Akuehne 24d ago
Two things. One, surely I'm not the only one who knew this data was being stored? Two, calling it a "leak" feels like a stretch when the data was publicly accessible by design from the start.

Yes, some users probably didn't realize their edits to public pages were saved publicly, and that's a legitimate UX complaint. But some of the responsibility has to sit with the user. Otherwise we'd be running daily headlines about Meta "leaking" user data to every advertiser with a checkbook.

[−] O4epegb 25d ago
I reported this and several other issues with public pages almost six years ago. Some of them were fixed after many years - but they're very slow to handle it. I never received any bug bounty or anything.

Here's a Reddit post just as confirmation: https://www.reddit.com/r/Notion/comments/hqyxid/possible_sec.... I also reported it privately two months prior, of course.

[−] skissane 25d ago
I really dislike Notion. Its public API is full of bizarre arbitrary limitations, like a rich text database field can only contain max 100 “child blocks”, where each change in formatting consumes one child block, but its web UI doesn’t have this issue. Yes, I realise the undocumented private API that the web UI uses doesn’t have this issue either, but I shouldn’t have to use it, and I haven’t.

I don’t love Confluence, but at least it doesn’t do this to me.

[−] rvz 25d ago
Why people choose these services and have zero care about security is beyond me.

Tells me everything I need to know about this industry. No regard or seriousness to security at all.

[−] e-dant 25d ago
Are security vulnerabilities good marketing?
[−] hohithere 25d ago
Any self hosted solution?
[−] colesantiago 25d ago
Transparency is a good thing?
[−] staticassertion 25d ago
Isn't this very typical? Also, what is the proposal?