Email obfuscation: What works in 2026? (spencermortensen.com)

by jaden 106 comments 374 points

[−] ciroduran 44d ago
I stopped being concerned about email harvesting years ago, I just simply leave the email on my website. Spam handling is okay enough, I guess.

But I like this review of techniques, even the simplest ones are very effective, that surprised me.

[−] jrmg 43d ago
I’ve had my email address in a mailto: link in plaintext on my then-web-site, now-blog, since the early 2000s, and spam is no real problem. There are a few spam messages in my spam mailbox per day.

Perhaps my provider’s just great at filtering spam - but I kind of doubt it’s better than the major players (for years I’ve used Zoho for email - and it’s ‘okay’ enough that it’s not worth switching).

[−] GeoSys 43d ago
I agree that email addresses get leaked eventually.

However, LLMs are quite good at generating spam and I think soon will evade most filters.

[−] BorisMelnik 43d ago
you know what's funny is that llms are also good at detecting spam as they are generating it. I've got an automation that scores incoming emails and it's getting better and better each day (also more expensive haha)
[−] SV_BubbleTime 43d ago
I can’t explain it well, but I think there is an asymmetric issue here… that the ability for an LLM to write a plausible email, and the ability for an LLM to detect that it’s spam are mismatched.

If an LLM can make a plausible email, the best another LLM can do is to rank it as plausible. Blackbox creation and detection have to be on the same level.

Perhaps if you said the detection LLM had all your context and websearch. That it could know that a Penny Pollytree at Coco Co isn’t a real person, but… that just seems like burning a ton of coal to detect fraud where the creation LLM was able to easily come up with the fictitious spam cheaply.

The real story here is this will go beyond email verification. That every system we have is going to need to up its security. Paper birth certificates and social security cards and email addresses and all manner of identity is going to need new systems of auth. The challenge will be to prevent authoritarian centralization.

[−] Anamon 41d ago
But I think there's also an asymmetry strongly favouring the defense, namely that for a spam mail to be worthwhile, it needs some call to action, a way to lure in the victims.

A link to a shady website, an infected attachment, a weird freemail address in the body or Reply-To header that doesn't match the forged From header, etc. They're trying to get cleverer for sure -- I started getting phishing mails where the malicious link is only in a QR code in an embedded image -- but I think the need to somehow link to the trap is an inherent weakness against any defense. SpamAssassin rules give a good overview of stuff that helps detection no matter how the rest of the mail is generated.
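
One of the signals mentioned above, a Reply-To address that does not match the From header, can be sketched with Python's standard `email` package. This is only an illustration, not a SpamAssassin rule: the function name `reply_to_mismatch` and the sample message are made up here, and a real filter would weigh this alongside many other signals.

```python
from email import message_from_string
from email.utils import parseaddr


def reply_to_mismatch(raw_message: str) -> bool:
    """Return True when Reply-To points at a different domain than From."""
    msg = message_from_string(raw_message)
    _, from_addr = parseaddr(msg.get("From", ""))
    _, reply_addr = parseaddr(msg.get("Reply-To", ""))
    if not from_addr or not reply_addr:
        return False  # one of the headers is missing, nothing to compare
    from_domain = from_addr.rpartition("@")[2].lower()
    reply_domain = reply_addr.rpartition("@")[2].lower()
    return from_domain != reply_domain


# Hypothetical phishing-style message: From claims one domain,
# replies are routed to an unrelated freemail address.
SAMPLE = (
    "From: Support <support@bank.example>\n"
    "Reply-To: claims.dept@freemail.example\n"
    "Subject: Action required\n"
    "\n"
    "Please reply with your details.\n"
)

print(reply_to_mismatch(SAMPLE))
```

Comparing domains rather than full addresses is a design choice in this sketch; legitimate mail often uses a different mailbox on the same domain for replies.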

[−] Gigachad 43d ago
I doubt it. Most of the signals spam filters use these days are reputation based. You have to build up your domain and IP reputation for a long time first.
[−] embedding-shape 43d ago

> You have to build up your domain and IP reputation for a long time first.

Or buy/rent domains/IPs that have good reputations, as there are services that specialize in just bringing up the reputation for stuff so they can sell it once "good". Same exists for user accounts for various platforms like reddit and so on.

[−] Gigachad 43d ago
Sure, you'd burn that reputation extremely fast as Google detects your sending patterns change and the first few users start reporting as spam.
[−] embedding-shape 43d ago

> you'd burn that reputation extremely fast

Yes, that is indeed the point of those; "build up reputation -> sell/rent -> someone uses it to burn reputation -> rinse and repeat".

[−] Cthulhu_ 43d ago
And so the arms race continues.
[−] e40 43d ago
I’m up to more than 1,500 spam emails a month, with my email on the corp website.
[−] janderson215 41d ago
Is it mostly people trying to give you their mixtapes?
[−] unilynx 43d ago

> But I like this review of techniques, even the simplest ones are very effective, that surprised me.

because harvesters don't care until one technique gets massive use. if you come up with a unique but simple enough scheme for your sites and keep a few dozen email addresses out of their reach.. they've still gathered a million addresses. it's not really worth their effort to get the last 0.0001% of extra email addresses

so it's best to just not advertise your solution and make sure it doesn't get any outside traction - if it gets popular the harvesters will defeat it

[−] kevincox 43d ago
I've also been like this. But if as the article suggests trivial options like HTML entities or elements with display:none will keep my email out of >90% of harvesters I'm reconsidering as they seem to have no downside other than an extra couple of bytes on the wire.
[−] Yaggo 43d ago
Same here, the address will eventually leak some way anyway.

I never got SpamAssassin working very well, but since moving my email hosting to Apple (from my own server), spam has not been a problem.

[−] kqr 43d ago
I have a hypothesis email scrapers don't parse HTML at all. I suspect they search the raw bytestring for @ characters and take whatever's on either side of it. That probably gets them as many addresses as they can realistically use at a fraction of the cost, given how expensive HTML parsing can be.

(Similarly, I'm sure most links can be found by searching the bytestring for "href" and taking what's to the right of it.)

This would explain why HTML entities are so effective.

On the other hand, surely the TLS handshake is far more expensive than HTML parsing? Maybe it's to avoid parser failure modes that consume a lot of resources?
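
The hypothesis above can be illustrated with a small sketch: a scan of the raw markup for `@` finds nothing in an entity-encoded address, while the same scan after entity decoding finds it. The regex and the helper names (`entity_encode`, `harvest_raw`, `harvest_decoded`) are assumptions made for this example, not what any actual harvester is known to use.

```python
import html
import re

# Deliberately simple pattern; real harvesters may use something else.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def entity_encode(text: str) -> str:
    """Encode every character as a decimal HTML entity."""
    return "".join(f"&#{ord(ch)};" for ch in text)


def harvest_raw(page: str) -> list[str]:
    """Scan the raw markup without decoding anything."""
    return EMAIL_RE.findall(page)


def harvest_decoded(page: str) -> list[str]:
    """Decode HTML entities first, then scan."""
    return EMAIL_RE.findall(html.unescape(page))


address = "me@example.org"
encoded = entity_encode(address)
page = f'<a href="mailto:{encoded}">{encoded}</a>'

print(harvest_raw(page))      # the raw markup contains no literal "@"
print(harvest_decoded(page))  # decoding first exposes the address
```

If the raw-scan hypothesis holds, the entire protection rests on the harvester skipping one cheap `html.unescape`-style step.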

[−] Someone 43d ago

> This would explain why HTML entities are so effective.

Could also be that they learned that sending spam to obfuscated addresses doesn’t get much response. Such messages might get filtered out more and/or addressees might be less inclined to reply to them.

[−] BorisMelnik 43d ago
it really varies, you are correct most modern ones search the byte string for @ characters but there are probably hundreds of different methods out there in black hat marketing circles to scrape emails.
[−] nabbed 43d ago
It's odd. My email address is included un-obfuscated in ~90 commits to a popular open source repo on github. I also use this same email address for a mailing list associated with this OSS project. As far as I can tell, I've never received a single spam email in the 8 years I've had this email account.

When I view a commit on the github UI using view source, I can see the commit author's email address just as text with no special handling. It's bracketed by "<" and ">", so maybe that's enough to confuse harvesters.

I just looked at the spam folder of one of my personal accounts (where I sign up for services), and it has got tons of stuff, most recently 2 or 3 with the subject "YOU PERVERT! I RECORDED YOU!".

It seems spammers are doing less harvesting and more purchasing of email lists from service vendors.

[−] ProllyInfamous 43d ago
Really surprised this [very well-written] article didn't suggest the fantastic technique of owning an entire domain (although author's own examples obviously include unique handles@ for each tested practice).

Then you can hand each recipient an absolutely unique email which isn't just ole "name.morewords@" period trick — block those which receive SPAM.

----

OR: the even "easier" lifestyle of just not using email (like me). Obviously this is difficult for modern living, but that's what temp email is best for [i.e. circumventing ubiquitous REQUIRED email address fields].

[−] bit1993 44d ago
Good stuff, but I think the title should be Email address obfuscation. Thank you for sharing I guess, but spammers can now learn from this too (:
[−] Croak 44d ago
One trick is having a tarpit email address on your website. It is hidden using CSS so no real visitor sees it, but it is visible in the source. If your mail server receives mail for that address, you can just block the sending IP for 24h.
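
A minimal sketch of the tarpit bookkeeping described above, assuming the mail server can call a hook with the sender IP and recipient of each incoming message. The address `trap@example.org`, the class `TarpitBlocklist`, and its method names are hypothetical; wiring this into an actual mail server is not shown.

```python
import time

TARPIT_ADDRESS = "trap@example.org"  # hypothetical address hidden via CSS
BLOCK_SECONDS = 24 * 60 * 60         # block for 24h, as in the comment


class TarpitBlocklist:
    def __init__(self, clock=time.time):
        self._clock = clock  # injectable so the expiry logic can be tested
        self._blocked_until: dict[str, float] = {}

    def observe(self, sender_ip: str, recipient: str) -> None:
        """Record a delivery attempt; mail to the tarpit blocks the sender."""
        if recipient.lower() == TARPIT_ADDRESS:
            self._blocked_until[sender_ip] = self._clock() + BLOCK_SECONDS

    def is_blocked(self, sender_ip: str) -> bool:
        """Return True while the sender's 24h block is still active."""
        expiry = self._blocked_until.get(sender_ip)
        if expiry is None:
            return False
        if self._clock() >= expiry:
            del self._blocked_until[sender_ip]  # block has expired
            return False
        return True
```

Only a harvester that read the page source can know the tarpit address, so any mail sent to it is a strong signal on its own.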
[−] vlucas 43d ago
I recently noticed an uptick in cold emails and spam after publishing my new website. After a few weeks, I asked Claude/Cursor to obfuscate the email for spam protection in the mailto: link, and they both used JavaScript with data attributes.

Something like:

`` ${children} ``

And then some light vanilla JS to stitch it together. Works in the browser, and spam has dropped off a cliff since.

[−] badsectoracula 44d ago
Some time ago i was wondering if the common "me at foobar dot com" you still see a lot of people do actually helps at all, especially now with LLMs, so i searched for some common "obfuscation" techniques and found this site (not the 2026 update, but the previous - it was a few months ago). Then i wrote a simple LLM query with a bunch of examples from the site[0] (the tool is just a frontend for a commandline program that uses llama.cpp and Mistral Small 3.1 in Q4_K_M quantization since it loads relatively fast and is fine for simple prompts). AFAICT it could reveal anything that wasn't relying on CSS tricks or JavaScript.

Like others mentioned, though, personally i haven't been bothered by email harvesting for years now since spam filters seem to do a decent job. I have my email posted in plaintext here (which i bet is harvested very often) and in various other places and the occasional spam i get is eclipsed by "spam" from services i've actually signed up for (coughlinkedincough).

[0] https://i.imgur.com/ytYkyQW.png

[−] binaryturtle 44d ago
When I wrote my own brainf*ck interpreter (in C) at the start of the year I was really struggling to find a use for the language. Eventually I had the idea to obfuscate emails on my websites with the language.

Basically each email gets written as a brainf*ck program and stored in a "data-" attribute. The html only includes a more primitively obfuscated statement "Must enable Javascript to see e-mail." by default, which then gets replaced by another brainf*ck interpreter (in JS) with the output of the brainf*ck code. Since we only output ASCII we can reduce the size of the brainf*ck code by always adding 32 to each value it outputs. The Javascript is loaded from what looks like a 3rd party domain. There we filter based on heuristics and check if the "referer" matches before sending out the actual interpreter code.

Of course all this would not help if a scraper properly runs things through Javascript too.

Recently I read you soon will be able to run DOOM via CSS, so certainly it should be possible to have a brainf*ck interpreter in CSS? That would be the next step… just to get rid of the Javascript, but then I'm okay with all the downsides of using Javascript just for the e-mail obfuscation.

Anyway… I also regularly (at least once a year) rotate those public contact addresses.
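
The encoding idea in the comment above can be sketched in Python (the commenter's own interpreters are in C and JS and are not shown in the thread). This sketch assumes the "add 32" rule means the interpreter adds 32 to each cell value on output; the deliberately naive encoder, the names `encode` and `run`, and the omission of the `,` input command are choices made for this illustration.

```python
OFFSET = 32  # added to every output value, per the comment


def encode(text: str) -> str:
    """Naive encoder: one cell, cleared with [-] after each character."""
    parts = []
    for ch in text:
        if not 32 <= ord(ch) < 127:
            raise ValueError(f"cannot encode {ch!r}: printable ASCII only")
        parts.append("+" * (ord(ch) - OFFSET) + ".[-]")
    return "".join(parts)


def run(program: str) -> str:
    """Interpret a brainf*ck program (no ',' input) with offset output."""
    # Pre-compute matching bracket positions.
    jumps, stack = {}, []
    for pos, op in enumerate(program):
        if op == "[":
            stack.append(pos)
        elif op == "]":
            start = stack.pop()
            jumps[start], jumps[pos] = pos, start

    tape, ptr, pc, out = [0] * 30000, 0, 0, []
    while pc < len(program):
        op = program[pc]
        if op == "+":
            tape[ptr] = (tape[ptr] + 1) % 256
        elif op == "-":
            tape[ptr] = (tape[ptr] - 1) % 256
        elif op == ">":
            ptr += 1
        elif op == "<":
            ptr -= 1
        elif op == ".":
            out.append(chr(tape[ptr] + OFFSET))
        elif op == "[" and tape[ptr] == 0:
            pc = jumps[pc]  # skip the loop body
        elif op == "]" and tape[ptr] != 0:
            pc = jumps[pc]  # jump back to the loop start
        pc += 1
    return "".join(out)


program = encode("me@example.org")
print(run(program))
```

The encoded program consists only of `+`, `.`, `[`, `-` and `]`, so a plain-text scan of the page sees neither an `@` nor any letters of the address.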

[−] dandersch 44d ago
Very interesting. It seems for his own email the author has opted for a combination of the CSS display none technique and a XOR cipher:

  
[−] Bender 43d ago
They left off html cgi form. Generate the email on the web page and the server sends the email after performing some basic sanity checks and anti-spam on the form and web server itself such as solving some CSS puzzle or winning a game of DOOM.
[−] TZubiri 43d ago
What I do is I have a catch all, and based on the emails I get, I know which emails are made public, and I scout what the threat actors are doing.

For a similar reason I dislike ip2ban: my objective is not to block all attack attempts. I prefer receiving them, acknowledging them, and being immune to them.

The idea of ignoring attack attempts isn't very safe when you think about it. Your body doesn't do that; it creates antibodies upon subclinical exposures. Complete isolation means your immune system is weak and you are more vulnerable to the lightest of exposures.

[−] newscracker 44d ago

> HTML entities are often decoded automatically by server-side libraries, which means that even the most basic harvesters can get your email addresses without any special effort. This technique should be worthless—and, yet, it still stops most harvesters.

Anecdotal, but I’ve used HTML entities on a public static website for a long time using an href tag with mailto, and yet I’ve not seen any spam.

I guess any spammer who uses some level of GenAI to process and extract email addresses would have a lot more success against all the methods listed in this article.

[−] VladVladikoff 43d ago
This is a great list on how to make an email harvester even better.
[−] xiconfjs 44d ago
WTH, a 302 into a "mailto:" (search for "HTTP redirect" in the featured article) opens up my e-mail client without clicking a mailto link!? This seems wrong.
[−] tgv 43d ago
I use a very simple encryption plus some padding (fluff in the article), but the email address gets updated by JS. This requires JS plus evaluating the resulting DOM. If you don't evaluate JS, the address will be something like "please@activate.javascript". Or you could use "potus@whitehouse.gov", in which case clueless scrapers end up spamming the US government.
[−] wackiness 43d ago
Personally, I have seen an email crawler pick up “iDOLM@STER” (a Japanese game franchise) as an email address. Even Cloudflare’s automated email obfuscation system triggers on it. It was funny when I saw it. I have to manually disable the CF obfuscation when it happens.
[−] sureglymop 44d ago
What I often see is js that fetches the email from the server separately and inserts it.
[−] UltraSane 42d ago
I have a custom domain name and set up my email to forward anything@domain to my inbox. This lets me instantly know who leaked an address and also makes it easier to filter.
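
With a catch-all like the one described above, attributing a leak comes down to looking up which recipient address the spam arrived at. A rough sketch, assuming one local part is handed out per service; the `ISSUED` table, the domain `example.org`, and the function `leak_source` are all hypothetical.

```python
from typing import Optional

# Hypothetical record of which local part was given to which service.
ISSUED = {
    "shop-a": "Shop A",
    "forum-b": "Forum B",
}


def leak_source(recipient: str, domain: str = "example.org") -> Optional[str]:
    """Return the service an address was issued to, or None if unknown."""
    local, _, rcpt_domain = recipient.lower().rpartition("@")
    if rcpt_domain != domain:
        return None
    return ISSUED.get(local)


print(leak_source("shop-a@example.org"))
```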
[−] momo_dev 43d ago
interesting that most scrapers are still just regex-searching for @ in raw bytes. on the receiving side i've been dealing with a different angle of the same problem, blocking disposable/temp email signups. a domain blocklist catches 90% but the clever ones use random alias domains that all point their MX records to the same disposable mail infrastructure. checking where MX records actually resolve catches those too
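
The two-stage check described above (domain blocklist first, then where the MX records point) might look roughly like this. The DNS lookup is passed in as a function so the sketch stays standard-library only; `lookup_mx`, `DISPOSABLE_MX_SUFFIXES`, and all domain names are placeholders, and a real deployment would plug in an actual resolver.

```python
from typing import Callable, Iterable

# Hypothetical: MX hosts known to belong to disposable-mail infrastructure.
DISPOSABLE_MX_SUFFIXES = ("mx.disposable.example",)


def is_disposable(
    email: str,
    blocked_domains: Iterable[str],
    lookup_mx: Callable[[str], list[str]],
) -> bool:
    """Check the domain blocklist first, then the domain's MX hosts."""
    domain = email.rpartition("@")[2].lower().rstrip(".")
    if domain in set(blocked_domains):
        return True
    for host in lookup_mx(domain):
        host = host.lower().rstrip(".")
        if any(host == s or host.endswith("." + s)
               for s in DISPOSABLE_MX_SUFFIXES):
            return True  # alias domain backed by disposable infrastructure
    return False


# Stand-in for DNS: maps a domain to its MX hostnames.
FAKE_DNS = {
    "alias123.example": ["in.mx.disposable.example."],
    "corp.example": ["mail.corp.example."],
}

print(is_disposable("a@alias123.example", ["tempmail.example"],
                    lambda d: FAKE_DNS.get(d, [])))
```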
[−] simojo 43d ago
GitHub has a spot to display your email on your profile; is this obfuscated as well? Most of my current spam is from putting my email on there..
[−] siruwastaken 44d ago
I'm surprised that html entity substitution performs so well. I would have assumed that scrapers could at least speak proper html.
[−] aszlig 43d ago
Alternative solution: Just use a random regex as your local part, eg. "^[0-9]+$"@example.org
[−] djha-skin 43d ago
It's simple: draw your email in a paint program and export it as a png. Totally readable by humans.
[−] jonathanstrange 43d ago
I've never obfuscated my mail and do not use server-side spam filters, yet have never had a problem with spam. Yes, I get maybe two or three times as much spam as legitimate mail (if we include spam that was once (semi-)authorized when clicking the wrong option). However, it's all filtered reliably client-side.
[−] gfody 44d ago
I filter everything that does NOT include “+asdf” in the to:
[−] fmajid 44d ago
I use SVG where I created a text object in Affinity Designer and converted it to curves so the SVG doesn't have text any more, just vectors for the glyphs of it. Seems to work pretty well at keeping spammers at bay.
[−] _ache_ 44d ago
I'm sorry, but that is not how email addresses are spammed in bulk.

The data sources are the enormous data breaches that are more and more frequent. There is more incentive to collect more information on someone you already know something about than to spam an email address you don't even know is valid.

The spam can also be much more effective when it presents itself with personal information about the recipient.

[−] sumanep 43d ago
Use a form
[−] jwr 44d ago
This is such a waste of effort. Your E-mail address is not and can't be a secret. It will get into spammer databases eventually, no matter what you do. You will spend a lot of effort doing all these fancy tricks, and eventually you will get spam anyway.

Also, a note to those who make fancy "me+someservice@somedomain.com" addresses: make really sure you are in control and these work. Some services (including mine) will need to E-mail you one day, for example to tell you that your account will be deleted because of inactivity. If you don't receive that E-mail because of your fancy spam defenses, your account will be deleted. I've seen people hurt themselves like this and it makes me sad.

On a constructive note: what works very well is spam filtering using LLMs. We have AI to help us with this problem today. I wrote an LLM despammer tool which processes my inbox via IMAP using a local LLM (for privacy reasons). I see >97% accuracy in my benchmarks on my (very difficult) testing corpus. It's nearly perfect in real life usage. I've tested many local models in the 4-32B range and the top practical choice is gpt-oss:20b (GGUF, I run it from LM Studio, MLX quantizations are worse) — not only does it perform very well, but it's also really fast.