Ngl, I’m reading this article after having used AI to build a beautiful front end that is pixel perfect.
Yes, AI can’t see; it only understands numbers. So tell it to use ImageMagick to compare the screenshot to the actual mockup, tell it to aim for less than 5% difference, and don’t use more than 20% blur. Thank me later.
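Concretely, that check can be as small as this sketch (the file names, the 0x2 blur radius, and the IM7 `magick` binary are assumptions, not anyone’s exact setup):

    import { execFileSync } from "node:child_process";

    // Blur both images a little so font-rendering noise doesn't dominate,
    // then ask ImageMagick's `compare` for a normalized RMSE (0 = identical).
    function diffScore(shotPath: string, mockPath: string): number {
      execFileSync("magick", [shotPath, "-blur", "0x2", "/tmp/shot.png"]);
      execFileSync("magick", [mockPath, "-blur", "0x2", "/tmp/mock.png"]);
      try {
        execFileSync("compare", ["-metric", "RMSE", "/tmp/shot.png", "/tmp/mock.png", "null:"]);
        return 0; // exit code 0: the images match
      } catch (e: any) {
        // `compare` exits non-zero when images differ and prints
        // "absolute (normalized)" to stderr; grab the 0..1 value.
        const m = /\(([\d.]+)\)/.exec(String(e.stderr));
        return m ? parseFloat(m[1]) : 1;
      }
    }

A normalized score below 0.05 lines up with the “less than 5% difference” target.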
I built a whole website in like 2 days with this technique.
Everyone seems to have trouble telling ai how to check its work and that’s the real problem imho.
Truly, if you took the best dev in the world and had them write 1000 lines of code without stopping to check the result, they would also get it wrong. And the machine is only made in the likeness of our image.
PS. You think the Christian god was also pissed at how much we lie? :)
It's hard to interpret comments like this because we all have different standards and use cases. So it would really help if you could link to it. Even in a roundabout way if you want to avoid the impression of self-promotion.
I built a few websites; for most of them it wouldn’t be wise to post them here. But someone emailed me about this, so I’ll do my best to help. I did build https://hartwork.life for a friend with a design from OpenAI (pre Google Stitch, which is my current preferred tool).
Here is the line from my Claude Code setup to get something like this. Keep in mind I didn’t use MCP for Playwright with this particular implementation, but it is my preferred method currently.
CRITICAL - When implementing a feature based off of an image mockup, use Google Chrome from the Applications folder, set the browser dimensions to the width and height of the mockup, capture a screenshot, and compare that screenshot directly to the mockup with ImageMagick. If the images are less than 90% similar, go back and modify the code so that the website matches the mockup more closely. If a change you make causes the similarity to go down, undo it and try something else. Be mindful that the fonts will never be laid out exactly like the mockup; use blur at a max of 10% to see if the images are closer matching. If you spend more than 10 cycles screenshotting and comparing, stop and show the user how similar they are, mentioning any problems.
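Since Playwright is what they say they now prefer, the capture step would look roughly like this with the Playwright library (a sketch; the URL and the 1440x900 viewport are placeholders for the mockup’s real dimensions):

    import { chromium } from "playwright";

    // Render the page at the mockup's exact dimensions and capture a
    // screenshot to feed into the ImageMagick comparison above.
    const browser = await chromium.launch();
    const page = await browser.newPage({ viewport: { width: 1440, height: 900 } });
    await page.goto("http://localhost:3000");
    await page.screenshot({ path: "screenshot.png" });
    await browser.close();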
The more text there is, the harder it becomes, and that’s why we really need the blur: fonts are almost always rendered differently.
Thanks. I would say yeah, it's not too bad, but it is also a pretty simple site.
There are some interesting issues that probably relate to your workflow: the nav links are different sizes, and the icons too. And the resolution of some of the images/icons on a MacBook is poor. But I suspect that’s because a simple ImageMagick raster diff will fuzz over those kinds of differences.
I wonder if you can make some tweaks or find a better representation than pure raster screenshots to fix this. Can't really deal in vector images because AI sucks at outputting those, and you can't print a web page to SVG.
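One incremental step before giving up on raster entirely: a per-pixel comparator that is anti-aliasing-aware, since anti-aliasing is where most of the font-rendering noise lives. A sketch using the pixelmatch and pngjs npm packages (file names illustrative; both images must share dimensions):

    import fs from "node:fs";
    import { PNG } from "pngjs";
    import pixelmatch from "pixelmatch";

    // pixelmatch detects and skips anti-aliased pixels, so differences like
    // wrong icon sizes aren't drowned out by text fuzz.
    const shot = PNG.sync.read(fs.readFileSync("screenshot.png"));
    const mock = PNG.sync.read(fs.readFileSync("mockup.png"));
    const diff = new PNG({ width: shot.width, height: shot.height });

    const changed = pixelmatch(
      shot.data, mock.data, diff.data,
      shot.width, shot.height,
      { threshold: 0.1 } // per-pixel color tolerance
    );

    console.log(`${((changed / (shot.width * shot.height)) * 100).toFixed(2)}% of pixels differ`);
    fs.writeFileSync("diff.png", PNG.sync.write(diff)); // visual diff the agent can inspect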
There was a super niche website framework that only used SVG a while ago. Would be funny if that kind of thing takes off just so AI can do better.
I feel like 2 days to build this is a bit much given the simplicity. I think the point stands.
I will grant you that this is more tasteful than most of the AI sites I see. It’s a good looking little site but nothing here screams, “AI really accelerated this.”
Thank you. Yes, it took a bit, but still way faster than by hand. There are other store pages that are also implemented. This 1 page took me like an hour lol.
1. The main page asks for an email to be notified when the hoodie is available to buy, but I can add the hoodie to my shopping cart and proceed to checkout.
2. The product page mentions a 6’ model, but there is no model in the images.
3. The checkout page says “there are no payment options, please contact us”.
Please share what you created! I think people have very different views for what is a good interface, or a tolerable one. I think as a front-end developer and designer I notice a lot of problems most people don't care about.
I am a backend guy, so forgive my ignorance, but for web-based apps I am confused about what "pixel perfect" even means. I can build a site to look one way on my computer, but it will most likely not look the same on whatever device you use to access the site.
Feeding the model images from my local computer sounds, given my experience with the tools, like a recipe for having it over-optimize for the wrong end device.
I've also used AI to build frontends that I'm more than satisfied with, and I think it can "see" perfectly fine. The frontier models are multi-modal and pretty good at vision. You can hook up your coding harness to your browser which will take screenshots of your rendered frontend and modify the code accordingly.
After years of writing native code for mobile apps I’m using Flutter, and I’m finding that, if you do things step-wise and check in intermediate results so you can easily roll back failed experiments, agent-assisted coding can accelerate your front-end work substantially. You can deliver more polished results instead of obviously demo-grade visuals that need refinement, and that makes it easier to communicate with your non-coder colleagues.
My first instinct reading an article (especially one about LLMs) is to scroll down to see the structure…
Anyway.
Do people get the impression that LLMs are worse at frontend than at other things? I’d think it’s the same as with other LLM uses: you benefit from having a good understanding of what you’re trying to do, and it’s probably decent for making a prototype quickly.
Dunno. It’s really good with Preact + Tailwind. And I have to say that I think most problems can be solved this way and don’t require a special one-of-a-kind UI. In fact, the fewer special UIs I see, the better. I prefer standardized patterns unless they truly don’t fit a domain.
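For what it’s worth, the kind of standardized pattern that works well is stuff like this (an invented component, not from any real codebase):

    import { useState } from "preact/hooks";

    // A conventional collapsible card: exactly the kind of standardized
    // Preact + Tailwind pattern models reproduce reliably.
    export function DetailsCard(props: { title: string; body: string }) {
      const [open, setOpen] = useState(false);
      return (
        <div class="rounded-xl border p-4 shadow-sm">
          <button class="font-semibold" onClick={() => setOpen(!open)}>
            {props.title}
          </button>
          {open && <p class="mt-2 text-sm text-gray-600">{props.body}</p>}
        </div>
      );
    }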
I don’t 100% agree with the “AI can’t see” point, because in a Ralph-loop against screenshots it basically can (inefficiently). But more importantly, I do find it generally curious how bad even frontier models are at spatial thinking. Say “align these right to left unless it crosses the center” or “keep this box always visible and collapse X to make space” and all hell breaks loose. It might eventually work, but through an extremely slow, costly, and tedious process.
Good design is not always logical. Color theory, if followed blindly, results in pretty bad experiences. And interestingly, good design can’t always be explained in natural language.
Main thing is, it's very hard to get AI to have taste, because taste is not always statistically explainable.
The best I’ve gotten to is to have it use something like ShadCN (or another well-documented package that’s part of its training) and make sure it does two things: only runs the CLI commands to create components, and does not change any stock components or introduce ad-hoc Tailwind classes for colors and such. Also make sure it maintains the global CSS.
This doesn’t make the design look much better than what it is out of the box, but it doesn’t turn it into something terrible. If left unprompted on these things, it ends up mixing fonts it has absolutely no idea whether they look good or not, bringing serif fonts into body text, and mixing and matching colors that would have looked really, really good in 2005 but just don’t work any more.
Everything is nuanced and generalizations help no one. There are absolutely frontend apps where AI straight up crushes it. Sure, these might be less novel apps, but most of what people work on is a CRUD-esque interface.
Creating a CODING.md or FRONTEND.md with rules and expectations for your LLM helps tremendously. You’re right that AI is not great at frontend (yet), but it does lift a lot of the load. Like the top commenter says, there’s not a great harness for iterative frontend building. But it can get you 80% of the way there if you give it some rules, and it can do the annoying bits so you can concentrate on the 20% that is about design, effective communication, and pixel-perfection.
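For illustration, such a FRONTEND.md might contain rules like these (an invented example, echoing tips from elsewhere in this thread):

    # FRONTEND.md (invented example)
    - Use only stock shadcn/ui components; never edit files under components/ui/.
    - Take all colors and spacing from the tokens in globals.css; no ad-hoc Tailwind color classes.
    - After each visual change, screenshot the page and diff it against the mockup before moving on.
    - Prefer standardized patterns; flag anything that needs a one-of-a-kind UI instead of inventing one.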
I have found working with LLMs on frontend to be better than working with most of the developers I have encountered. A majority of devs I came across only had enough frontend knowledge to be dangerous and to consistently introduce frontend entropy.
I'm a backend dev and I'm always hearing about how LLMs are dramatically better at frontend because of much more available training data etc. Maybe my perspective isn't as skewed as I've been led to believe and LLMs need close supervision and rework of their output there too.
AI is great at front end. Scroll-based animations are the devil, and the “boring” designs it defaults to are (more often than not) super intuitive. Sure, some design quirks it’ll guess at are annoying, but have you seen the web?
Agreed on AI limitations in originality, but the industry sucked at UIs for so long that my expectations are low. I’m just hoping for widespread use of models that take the viewpoints of newbs for UI testing.
One thing that helps with #2 (“It cannot see”): try playwright-cli. Your agent can use it to inspect the DOM, see what styles are applied to elements, simulate clicks, etc.
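As a sketch of what the agent can do through that machinery (shown here with the Playwright library API rather than the CLI; the URL and selectors are made up):

    import { chromium } from "playwright";

    // Instead of judging pixels, the agent can read the DOM directly:
    // computed styles, element geometry, simulated clicks.
    const browser = await chromium.launch();
    const page = await browser.newPage();
    await page.goto("http://localhost:3000");

    // Why are the nav links different sizes? Ask for the computed style.
    const navLink = page.locator("nav a").first();
    console.log(await navLink.evaluate((el) => getComputedStyle(el).fontSize));
    console.log(await navLink.boundingBox()); // exact position and size

    await navLink.click(); // simulate a click
    await browser.close();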
This is something I talk about with some friends: how AI does frontend is completely different from how humans do it. Humans can select colors and themes based on their own criteria, while AI only generates what it has learned, like the machine it is. And that’s not bad, but the thing is that people who use AI to develop frontends adapt to what the AI generates, whereas a developer adapts to the client. Those are different approaches.
Who says it sucks at front end? Unlike Stack Overflow, AI does a great job of “center a div.” I tend to like working from reference documentation, which is great for Python and Java but challenging for CSS, where you have to navigate roughly 50 documents that relate to each other in complex ways to find answers.
I don’t give it 100% responsibility for front-end tasks, but working together with AI, I feel like I am really in control of CSS in a way I haven’t been before. If I am using something like MUI, it also tends to do really well at answering questions and making layouts.
Thing is, I don’t treat AI as an army of 20 slaves that will get “shit” done while I sleep, but rather as a coding buddy. I very much anthropomorphize it, with lots of “thank you” and “that’s great!” and “does this make sense?”, “do you have any questions for me?” and “how would you go about that?”. If it makes me a prototype of something, I will ask pointed questions about how it works, ask it to change things, change the code manually a bit to make it my own, and frequently open up a library like MUI in another IDE window and ask Junie “how do I?” and “how does it work when I set prop B?”
It doesn’t 10x my speed, and I think the main dividend from using it, for me, is quality, not a compressed schedule, because I will use the speed to do more experiments and get to the bottom of things. Another benefit is that it helps me manage my emotional energy: in the morning it might be hard for me to get started, and a few low-effort spikes are great to warm me up.
Share it. I used Claude earlier to test out its design capabilities and what I got as output was flat and tasteless.
> Ngl, I’m reading this article after having used AI to build a beautiful front end that is pixel perfect.
Was about to say the same thing
To quote the article:
1. "It trained on ancient garbage" which is the by product of massive churn and this attitude leads to even more churn
2. "It doesn't know WHY we do things" because we don't either... even the paradigms used in frontend dev have needlessly churned
My fix? I switched from React/Next to Vue/Nuxt. The React ecosystem is by far the worst offender.
> It's notoriously bad at math,
If you are going to criticize LLMs for being out of date, at least make sure your understanding isn't out of date.
but people writing shitty node.js code might beg to differ.
> Try asking it for some scroll-driven animations or custom micro-interactions
Unrelated, but as a long-time front-end dev, FUCK THOSE.