10 interesting stories served every morning and every evening.

Moving away from Tailwind, and learning to structure my CSS

jvns.ca

Hello! 8 years ago, I wrote ex­cit­edly about dis­cov­er­ing Tailwind.

At that time I re­ally had no idea how to struc­ture my CSS code and given the choice be­tween a pile of com­plete chaos and Tailwind, I was re­ally happy to choose Tailwind. It helped me make a lot of tiny sites!

I spent the last week or so mi­grat­ing a cou­ple of sites away from Tailwind and to­wards more se­man­tic HTML + vanilla CSS, and it was SO fun and SO in­ter­est­ing, so here are some things I learned!

As usual I’m not a full-time fron­tend de­vel­oper and so all of my CSS learn­ing has hap­pened in fits and starts over many years.

it turns out Tailwind taught me a lot

When I started think­ing about struc­tur­ing CSS, I was in­tim­i­dated at first: I’m not very good at struc­tur­ing my CSS! But then I started read­ing blog posts talk­ing about how to struc­ture CSS (like A whole cas­cade of lay­ers or How I write CSS in 2024) and I re­al­ized a cou­ple of things:

Every CSS code base has a bunch of dif­fer­ent things go­ing on (layouts! fonts! colours! com­mon com­po­nents!)

It’s ex­tremely use­ful to have sys­tems or guide­lines to man­age each of those things, oth­er­wise things de­scend into chaos

Tailwind has sys­tems for some of these, and I al­ready know those sys­tems! Maybe I can im­i­tate the sys­tems I like!

For ex­am­ple, Tailwind has:

a re­set stylesheet

a colour palette

a font scale

the sys­tems I’m go­ing to talk about

I’m going to talk about a few aspects of my CSS codebase and my thoughts so far on what kind of rules I want to impose on the codebase for each one. Some of them are copied from Tailwind and some aren’t.

re­set

com­po­nents

colours

font sizes

util­ity classes

the base

spac­ing

re­spon­sive de­sign

the build sys­tem

1. re­set

I just copied Tailwind’s “preflight styles” by going into tailwind.css and copying the first 200 lines or so.

I noticed that I’ve developed a relationship with Tailwind’s CSS reset over time. For example, Tailwind sets box-sizing: border-box on every element (which means that an element’s width includes its padding and border):

* { box-sizing: border-box; }

I think it would be a real ad­just­ment for me to switch to writ­ing CSS with­out these, and I’m sure there are lots of other things in the Tailwind re­set (like html {line-height: 1.5;}) that I’m sub­con­sciously used to and don’t even re­al­ize are there.
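Here's a tiny sketch of what that reset changes in practice. The .card class and its sizes are invented for illustration and don't come from Tailwind:

```css
/* Hypothetical example: .card and its sizes are made up for illustration. */
* { box-sizing: border-box; }

.card {
  width: 200px;
  padding: 20px;
  border: 2px solid black;
}
/* With border-box, the card is 200px wide in total:
   2px border + 20px padding + 156px content + 20px padding + 2px border.
   With the browser default (content-box) it would be
   200px + 2 * 20px + 2 * 2px = 244px wide. */
```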

2. com­po­nents

This next part is the bulk of the CSS!

The idea here is to organize CSS by “components”, in a way that’s spiritually related to Vue or React components (though there might not actually be any JavaScript at all in the site).

Basically the idea is that:

Each “component” has a unique class

The CSS for one com­po­nent never over­rides the CSS for any other com­po­nent

Each com­po­nent has its own CSS file

So edit­ing the CSS for one com­po­nent won’t mys­te­ri­ously break some­thing in an­other com­po­nent. And prob­a­bly like 80% of the CSS that I would ac­tu­ally want to change is in var­i­ous com­po­nent files, so if I’m edit­ing a 100-line com­po­nent, I just have to think about those 100 lines. It’s way eas­ier for me to think about.

For example, this HTML might be the .zine “component”.

<figure class="zine horizontal">
  <img src="whatever.jpg">
</figure>

And the CSS looks some­thing like this, us­ing nested se­lec­tors:

.zine {
  …
  &.horizontal { … }
  &.vertical { … }
  &:hover { … }
}

I haven’t done any­thing pro­gram­matic (like web com­po­nents or @scope) that en­sures that com­po­nents won’t in­ter­fere with each other, but just hav­ing a con­ven­tion and try­ing my best al­ready feels like a big im­prove­ment.
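For reference, this is roughly what an @scope version could look like. It's a sketch only: the declarations are placeholders rather than the real .zine styles, and @scope is newer than nesting, so browser support is worth checking before relying on it.

```css
/* Sketch only: the declarations are placeholders, not the real .zine styles. */
@scope (.zine) {
  /* :scope is the .zine element itself */
  :scope {
    display: flex;
  }

  /* only matches <img> elements inside a .zine */
  img {
    max-width: 100%;
  }
}
```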

Next: con­ven­tions to main­tain some con­sis­tency across the site and keep these com­po­nents in line with each other!

3. colours

colours.css has a bunch of vari­ables like this which I can use as nec­es­sary. Colour is re­ally hard and I did­n’t want to re­visit my use of colour in this refac­tor, so I left this alone.

The only guide­line I’m try­ing to en­force here is that all colours used in the site are listed in this file.

:root {
  --pink: #fea0c2;
  --pink-light: #F9B9B9;
  --red: #f91a55;
  --orange: rgb(222, 117, 31);
  …
}

4. font sizes

One thing I appreciated about Tailwind was that if I wanted to set a font size, I could just think “hm, I want the text to be big”, write text-lg, and be done with it! And maybe if it’s not big enough I’d use xl or 2xl instead. No trying to remember whether I’m using em or px or rem.

So I de­fined a bunch of vari­ables, taken from Tailwind, like this:

--size-xs: 0.75rem; --line-height-xs: 1rem;

--size-sm: 0.875rem; --line-height-sm: 1.25rem;

Then if I want to set a font size, I can do it like this. It’s a lit­tle more ver­bose than Tailwind but I’m happy with it for now.

h3 { font-size: var(--size-lg); line-height: var(--line-height-lg); }
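Since a size and its line height always travel together, one option is to bundle each pair into a small class. This is only a sketch: the .text-lg and .text-xl class names are an assumption (borrowed from Tailwind's naming), and the values are the ones Tailwind's default scale uses for lg and xl.

```css
:root {
  /* px equivalents assume the default 16px root font size */
  --size-lg: 1.125rem; --line-height-lg: 1.75rem; /* 18px / 28px */
  --size-xl: 1.25rem;  --line-height-xl: 1.75rem; /* 20px / 28px */
}

/* Hypothetical helper classes that set both properties at once */
.text-lg { font-size: var(--size-lg); line-height: var(--line-height-lg); }
.text-xl { font-size: var(--size-xl); line-height: var(--line-height-xl); }
```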

5. util­i­ties

There are some things like buttons that appear in many different components. I’m calling these “utilities”.

I copied some util­ity classes from Tailwind (like .sr-only for things that should only ap­pear for screen­reader users).

This sec­tion is pretty small and I try to be care­ful about mak­ing changes here.

6. the base

“base” styles are styles that apply across the whole site that I chose myself. I have to keep this section really small because I’m not confident enough to enforce a lot of styles across the whole site. These are the only two I feel okay about right now, and I might change the <section> one:

/* put a 950px column in the middle of each <section> */
section {
  --inner-width: 950px;
  padding: 3rem max(1rem, (100% - var(--inner-width)) / 2);
}

a { color: var(--orange); }
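The padding line is doing a bit of arithmetic, so here is the same rule annotated with two worked cases. The container widths are just examples, and this assumes the <section> is a normal block that fills its container and that 1rem is 16px.

```css
section {
  --inner-width: 950px;
  /* 1200px-wide container: (1200px - 950px) / 2 = 125px on each side,
     leaving 1200 - 2 * 125 = 950px for the content.
     600px-wide container: (600px - 950px) / 2 = -175px, so max() picks
     1rem (16px) instead, leaving 600 - 2 * 16 = 568px for the content. */
  padding: 3rem max(1rem, (100% - var(--inner-width)) / 2);
}
```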

I think for the base styles it’s go­ing to be eas­i­est for me to work kind of bot­tom up — first start with al­most noth­ing in the base styles, and then move some styles from the com­po­nents into base styles as I iden­tify com­mon things I want.

7. spac­ing

I haven’t com­pletely worked out an ap­proach to man­ag­ing padding and mar­gins yet. I’m def­i­nitely try­ing to be more prin­ci­pled than how I was do­ing it in Tailwind though, where I would just hap­haz­ardly put padding and mar­gins every­where un­til it looked the way I wanted.

Right now I’m work­ing to­wards mak­ing the outer lay­out com­po­nents in charge of spac­ing as much as pos­si­ble. For ex­am­ple if I have a <section> with a bunch of chil­dren that I want to have space be­tween them, I might use this to space the chil­dren evenly:

section > * + * { margin-top: 1rem; }

Some in­spi­ra­tion blog posts:

the owl se­lec­tor

“no outer margin”
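One common variation on the owl selector is to make the gap adjustable per section with a custom property. A sketch, where --flow-space and the .tight class are hypothetical names:

```css
section > * + * {
  /* fall back to 1rem when --flow-space isn't set */
  margin-top: var(--flow-space, 1rem);
}

/* Hypothetical modifier: children of a "tight" section sit closer together.
   Custom properties inherit, so the children pick up the new value. */
section.tight {
  --flow-space: 0.5rem;
}
```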

8. re­spon­sive de­sign: use more grid!

The way I was doing responsive design in Tailwind was to use a lot of media queries. Tailwind has this md:text-xl syntax that means “apply the text-xl style at sizes md or larger”.

I’m try­ing some­thing pretty dif­fer­ent now, which is to make more flex­i­ble CSS grid lay­outs that don’t need as many break­points. This is hard but it’s re­ally in­ter­est­ing to learn about what’s pos­si­ble with grid, and it’s a good ex­am­ple of some­thing that I don’t think is pos­si­ble with Tailwind.

For ex­am­ple, I’ve been learn­ing about how to use auto-fit to au­to­mat­i­cally use 2 columns on a big screen and 1 col­umn on a small screen like this:

display: grid;
grid-template-columns: repeat(auto-fit, minmax(min(100%, 400px), max-content));
justify-content: center;
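Here's how that declaration plays out at a few widths, as an annotated sketch. The .zines class name is made up, and the numbers assume there is no gap between columns.

```css
/* Hypothetical container; the widths in the comments assume no gap. */
.zines {
  display: grid;
  /* The browser fits as many min(100%, 400px)-wide columns as it can:
     - 1000px container: two 400px columns fit (800px <= 1000px), a third doesn't
     - 700px container: only one 400px column fits
     - 300px container: min(100%, 400px) is 300px, so the single column
       shrinks instead of overflowing
     auto-fit also collapses any columns that end up empty. */
  grid-template-columns: repeat(auto-fit, minmax(min(100%, 400px), max-content));
  justify-content: center;
}
```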

I also used grid-tem­plate-ar­eas a lot which is an amaz­ing fea­ture that I don’t think you can use with Tailwind.
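For anyone who hasn't seen grid-template-areas, here's a rough sketch of the idea. The .page class, the area names and the 700px breakpoint are all invented for this example:

```css
/* Hypothetical layout: name the areas once, then rearrange them by
   rewriting the "picture" of the grid. */
.page {
  display: grid;
  grid-template-columns: 1fr;
  grid-template-areas:
    "header"
    "main"
    "sidebar";
}

.page > header { grid-area: header; }
.page > main   { grid-area: main; }
.page > aside  { grid-area: sidebar; }

@media (min-width: 700px) {
  .page {
    grid-template-columns: 2fr 1fr;
    grid-template-areas:
      "header header"
      "main   sidebar";
  }
}
```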

Some in­spi­ra­tion:

A re­spon­sive grid lay­out with no me­dia queries from CSS Tricks

9. the build sys­tem: es­build

In development, I don’t need a build system: CSS now has both built-in import statements, like this:

@import "reset.css";
@import "typography.css";
@import "colors.css";

and built-in nested selectors, like this:

.page { h2 { …} }

If I want, I can use es­build to bun­dle the CSS file for pro­duc­tion. That looks some­thing like this.

esbuild style.css --bundle --loader:.svg=dataurl --loader:.woff2=file --outfile=/tmp/out.css

Even though I usu­ally avoid us­ing CSS and JS build sys­tems, I don’t mind us­ing es­build (which I wrote about in 2021 here) be­cause it’s based on web stan­dards and be­cause it’s a sta­tic Go bi­nary.

why mi­grate away from Tailwind?

A few peo­ple asked why I was mi­grat­ing away from Tailwind. A few fac­tors that con­tributed are:

Tailwind has become much more reliant on a build system since 2018. I think it’s impossible (?) to use newer versions of Tailwind without using a build system, so I’ve been using Tailwind v2 for years. (there’s also litewind apparently)

It’s al­ways been true that you’re sup­posed to use Tailwind with a build sys­tem, but I’ve never re­ally done that, so I have 2.8MB tail­wind.min.css files (270K gzipped) in a lot of my pro­jects and it feels a lit­tle silly.

I’m a lot bet­ter at CSS than I was when I started us­ing Tailwind

Ultimately Tailwind is lim­it­ing: if you want to do Weird Stuff in your CSS, it’s not al­ways pos­si­ble with Tailwind. Those lim­its can be ex­tremely use­ful (a lot of this post is about me reim­ple­ment­ing some of Tailwind’s lim­its!) but at this point I’d like to be able to pick and choose.

I ended up with sites that mixed both vanilla CSS and Tailwind in the same pro­ject and that was not fun to main­tain

I got cu­ri­ous about what writ­ing more se­man­tic HTML would feel like.

CSS fea­tures I’m cu­ri­ous about

While doing this I learned about a lot of CSS features that I didn’t use but am curious to learn about one day:

@layer (from A Whole Cascade of Layers)

@scope

con­tainer queries

sub­grid
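As a rough idea of how @layer could map onto the sections in this post, here's a sketch. The layer names and the components/zine.css path are assumptions for illustration, and this isn't something the sites described here actually do:

```css
/* Declare the order up front: later layers win over earlier ones,
   regardless of selector specificity. */
@layer reset, base, components, utilities;

/* Imports can be assigned to a layer (the second file path is hypothetical) */
@import "reset.css" layer(reset);
@import "components/zine.css" layer(components);

@layer base {
  a { color: var(--orange); }
}
```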

Efficient Minute-Scale World Modeling

nvlabs.github.io

You don’t know HTML Lists

blog.frankmtaylor.com


This sec­ond in­stall­ment in the You don’t know HTML se­ries is go­ing to be all about the ways that we put col­lec­tions of things to­gether. We’re skip­ping over the MDN and W3Schools in­tro­duc­tory pages and in­stead we’re go­ing into the kind of stuff you dis­cover af­ter ac­ci­den­tally tak­ing your cous­in’s Ritalin right be­fore you open up the W3C specs. Let’s dive deep into lists.

This is­n’t an in­tro­duc­tion!

I’m assuming you’ve got real-world experience writing HTML and this isn’t your first time searching “How to make a list.” What I’m going to cover are all of the ways you can put collections of content together. So I’m talking about these kinds of lists:

Ordered

Unordered

Description

Menu

Control

And if you did­n’t know there were five dif­fer­ent kinds of lists in HTML, per­fect. That must mean you don’t know HTML!

How Do we Decide Which to Use?

No need to ask AI for a sum­mary; I’ll just give you the end­ing up front. Here’s how you’ll de­cide which kind of list to use:

If the items in the list are for a sin­gle con­trol field where you’re get­ting data from a user, you ei­ther want a <select> + <option> mashup or an <input> + <datalist> combo

If chang­ing the or­der of the items would change the mean­ing of the list, then use an or­dered list (<ol>)

If the items are key-value pairs, or keys-to-value pairs, use a de­scrip­tion list (<dl>)

If the items are con­trols that will per­form ac­tions in the user in­ter­face, use a menu (<menu>)

Otherwise, use an unordered list (<ul>)

Control Lists with <select> and <option> or <input> and <datalist>

When we think of lists, we don’t usu­ally throw user con­trol fields into the mix. And that’s weird, be­cause we con­struct our nav­i­ga­tions us­ing lists, and those are lists of links that the user…uh…can con­trol. So we tend to have a bias with what we think lists are.

But I’m here to bring that to the fore­front of your mind: when we’re build­ing forms, some­times we’re build­ing lists that our users will in­ter­act with.

If it’s a fixed list, use <select> and <option>

When I say “fixed”, I mean that the user can only choose the items from that list. If that’s the case, let’s use select and option.

Suppose we want a list of lan­guages to talk in:

<select name="languages">
  <option value="">Select a Language</option>
  <option value="en">English</option>
  <option value="fr">French</option>
  <option value="es">Spanish</option>
  <option value="pt">Portuguese</option>
</select>

This gives the user ex­actly one choice to make.

But if the user were also mul­ti­lin­gual, maybe they’d like to choose more than one. Easy enough with the mul­ti­ple at­tribute! The list will dis­play dif­fer­ently. Now all the op­tions will be vis­i­ble so we can shift or cmd + click the ones we want:

<select name="languages" multiple>
  <option value="">Select a Language</option>
  <option value="en">English</option>
  <option value="fr">French</option>
  <option value="es">Spanish</option>
  <option value="pt">Portuguese</option>
  <option value="ga">Irish</option>
  <option value="cy">Welsh</option>
</select>

So long as you’re doing this with an actual select element and an option, you don’t have to use the aria-multiselectable attribute on a list element with the role="listbox" attribute. Native browser semantics bake that in for you.

Put re­lated op­tions to­gether with <optgroup>

What if we wanted to group lan­guages by lan­guage-fam­i­lies? We can do that with opt­group which lets us group a list of op­tions to­gether:

<select name="languages">
  <optgroup label="Germanic">
    <option value="en">English</option>
  </optgroup>
  <optgroup label="Romance">
    <option value="fr">French</option>
    <option value="es">Spanish</option>
    <option value="pt">Portuguese</option>
  </optgroup>
  <optgroup label="Celtic">
    <option value="ga">Irish</option>
    <option value="cy">Welsh</option>
  </optgroup>
</select>

What if there’s a bunch of op­tions, but for [reasons] we don’t want a user to be able to se­lect a sub­set of them? Let’s add the dis­abled at­tribute to an opt­group:

<select name="languages">
  <optgroup label="Germanic">
    <option value="en">English</option>
  </optgroup>
  <optgroup label="Romance">
    <option value="fr">French</option>
    <option value="es">Spanish</option>
    <option value="pt">Portuguese</option>
  </optgroup>
  <optgroup label="Celtic" disabled>
    <option value="ga">Irish</option>
    <option value="cy">Welsh</option>
  </optgroup>
</select>

Use na­tive HTML op­tions first for im­prov­ing the list

Sometimes we may want a visual break between our groups. If we don’t want to fiddle with CSS, we’re in luck! An <hr> is an approved item in a select. Not only does that make our select look a little sharper, we can also use the size attribute to control how many items will be displayed at once, which is useful for especially long lists.

We just gotta watch out with size if we’re also us­ing opt­group be­cause those group la­bels will take up some of that space we were prob­a­bly hop­ing for:

<select name="languages" size="4" multiple>
  <optgroup label="Germanic">
    <option value="en">English</option>
  </optgroup>
  <hr />
  <optgroup label="Romance">
    <option value="fr">French</option>
    <option value="es">Spanish</option>
    <option value="pt">Portuguese</option>
  </optgroup>
  <hr />
  <optgroup label="Celtic">
    <option value="ga">Irish</option>
    <option value="cy">Welsh</option>
  </optgroup>
  <hr />
  <optgroup label="Afroasiatic">
    <option value="he">Hebrew</option>
    <option value="ar">Arabic</option>
  </optgroup>
</select>

If it’s a sug­gested list, use <datalist>

Let’s suppose we have a control where we want to suggest a list of options to a user. This is where we get the datalist involved.

Using a datal­ist is a two-step process be­cause we have to tell the in­put to use a datal­ist.

Create a datal­ist and give it an id.

Put the value of that id in the list at­tribute of a cor­re­spond­ing in­put

<datalist id="languages">
  <option>English</option>
  <option>French</option>
  <option>Spanish</option>
  <option>Portuguese</option>
  <option>Irish</option>
  <option>Welsh</option>
  <option>Hebrew</option>
  <option>Arabic</option>
</datalist>

<input name="language" list="languages">


We need to watch out for us­ing a value at­tribute on the <option> of a <datalist>!

This is­n’t a datal­ist prob­lem but an op­tion prob­lem: The de­fault value for an op­tion is the text it wraps. A value at­tribute over­rides that and then the text acts like a la­bel. This is no big deal for a se­lect list be­cause the user only sees the text.

But if we put a value on an <option> in a datalist, the user will see the “label” in the list, but when they select it they’ll see the value in the input. It’s a confusing experience.

Start typing w in this input and then select “Welsh” to see what I mean:

<datalist id="languages">
  <option value="en">English</option>
  <option value="fr">French</option>
  <option value="es">Spanish</option>
  <option value="pt">Portuguese</option>
  <option value="ga">Irish</option>
  <option value="cy">Welsh</option>
  <option value="he">Hebrew</option>
  <option value="ar">Arabic</option>
</datalist>

<input name="language" list="languages">


So if we’re go­ing to use a datal­ist, we need to work with the un­der­stand­ing that the value is what gets in­serted — not the la­bel.

We can use a datal­ist for any kind of in­put

We tend to think of the datal­ist as be­ing use­ful for text op­tions. But that ain’t how it has to work.

Suppose we had a cal­en­dar wid­get and we wanted to gen­tly sug­gest a par­tic­u­lar range of weeks in the year. We could do that with a datal­ist:

<label for="camp-week">Choose a week</label>

<!-- week values need a two-digit week number, so W02 rather than W2 -->
<input type="week" name="week" id="camp-week" min="2026-W02" max="2026-W51" list="preferred-weeks" />

<datalist id="preferred-weeks">
  <option>2026-W22</option>
  <option>2026-W23</option>
  <option>2026-W24</option>
  <option>2026-W25</option>
</datalist>


<datalist> and <input type="range"> can work together

<datalist> is­n’t lim­ited to stringy val­ues; it works with num­bers. Which means we could pair it with a range in­put and cre­ate la­beled stops along the range.

The only thing we have to watch out for in this ap­proach is that not all browsers are guar­an­teed to work the same way. In Chrome and friends, we could dis­play these stops with very pro­gram­matic and sim­ple CSS. In Firefox…shenanigans are in­volved. But it starts with the big idea that you can dis­play a datal­ist:

<div class="rangeField">
  <label for="tips">Tip Percentage</label>

  <input type="range" name="tips" id="tips" min="0" max="50" step="1" list="recommended-tips" />

  <datalist id="recommended-tips">
    <option value="10" label="10%"></option>
    <option value="18" label="18%"></option>
    <option value="30" label="30%"></option>
    <option value="45" label="45%"></option>
  </datalist>

  <style>
    .rangeField {
      /* container for the two things
         ch is the width of the 0 in computed font. Very precise for numbers */
      width: 50ch;
    }

    /* same width for input and datalist */
    #recommended-tips, #tips { width: 100%; margin: 0; padding: 0; }

    #recommended-tips { position: relative; display: block; writing-mode: vertical-lr; }
  </style>
</div>


Our pro­gram­matic styles which will work in Chrome and friends will in­volve us­ing the attr() func­tion, cast­ing it to a per­cent, and some math.

@supports (x: attr(x type(percentage))) {
  /* For browsers that let you set a type on an attr()
     1. get value from the label with attr()
     2. use the type() function to declare the value as percent
     3. make it absolute
     4. max of input is 50, not 100. Set left to be closeish to left x 2,
        and subtract based on the character width */
  /* set datalist to display, and be a positioning root
     add a vertical writing mode */
  #recommended-tips option {
    --percent: attr(label type(<percentage>));
    position: absolute;
    left: calc((var(--percent) * 1.9) - .1ch);
  }
}

For this to work in Firefox, we have to go in a dif­fer­ent and more an­noy­ing di­rec­tion. We will need to man­u­ally set these as sep­a­rate rule­sets. And we will tar­get a pseudo-el­e­ment in­stead. And our math gets weirder. This is not guar­an­teed to dis­play well on your screen:

@supports not (x: attr(x type(percentage))) {
  /* In firefox, the values display as a ::before
     Also, explicitly set the height of the option, otherwise it will be too big
     so set the ::before to position absolute
     Also, don’t set length with percent as it’s wildly off
     instead, use the same unit set on the container (ch) */

  #recommended-tips option { height: 1ch; margin: 0; padding: 0; }
  #recommended-tips option::before { position: absolute; top: .5ex; }

  #recommended-tips option[value="10"]::before { left: calc(5ch + 2ex); }

  #recommended-tips option[value="18"]::before { left: calc(9ch + 2.5ex); }

  #recommended-tips option[value="30"]::before { left: calc(15ch + 4ex); }

  #recommended-tips option[value="45"]::before { left: calc(22.5ch + 6.5ex); }
}

Ordered Lists with <ol>

Any time we have a col­lec­tion of items that must be read in a par­tic­u­lar or­der, we should use an or­dered list. We should not let vi­sual pre­sen­ta­tion dic­tate this choice. It’s not about whether the items should have num­bers next to them. It’s about whether their se­quence mat­ters.

These are the kinds of col­lec­tions that should be an or­dered list:

An al­go­rithm

Series of events

Items that have an in­cre­men­tal con­tin­uum

A recipe (which is a se­ries of events and also an al­go­rithm)

An al­pha­betic list (which is let­ters arranged along their con­tin­uum)

And the rea­son these should be an or­dered list is be­cause chang­ing the or­der of the items would change the mean­ing of the list.

Our bread will bake dif­fer­ently if it’s not in an or­dered list!

<ol>
  <li>Pre-heat oven to 350 degrees and grease a 9x5 pan.</li>
  <li>Combine flour, baking soda, and salt in large bowl with beaten brown sugar, butter, eggs, and mashed bananas</li>
  <li>If oven is pre-heated, pour batter into pan</li>
  <li>Bake for 60 minutes or until a toothpick inserted into the center comes out clean.</li>
  <li>Let cool on a wire rack</li>
</ol>


And if we say some­thing is al­pha­bet­i­cal, it’d be weird to sug­gest it could be or­dered dif­fer­ently!

<h3>Ingredients (alphabetical)</h3>
<ol>
  <li>baking soda (1 teaspoon)</li>
  <li>bananas (2) (mashed)</li>
  <li>brown sugar (¾ cup)</li>
  <li>butter (½ cup)</li>
  <li>eggs (2)</li>
  <li>flour (2 cups)</li>
  <li>salt (¼ teaspoon)</li>
</ol>

Mozilla to UK regulators: VPNs are essential privacy and security tools and should not be undermined – Open Policy & Advocacy

blog.mozilla.org

In the context of concerns around young people’s interactions with digital technologies, the UK’s Department for Science, Innovation and Technology is consulting on additional measures to prepare young people for growing up in a digital world. Against the backdrop of users circumventing age assurance systems mandated under the UK’s Online Safety Act, the consultation considers age-gating virtual private networks (VPNs).

Mozilla’s mis­sion is grounded in the be­lief that the in­ter­net must re­main open and ac­ces­si­ble to all, and that pri­vacy and se­cu­rity on­line are fun­da­men­tal hu­man rights. We rec­og­nize that the pro­tec­tion of young peo­ple on­line is one of the most press­ing and chal­leng­ing ques­tions of our time, and we are com­mit­ted to sup­port­ing pol­icy pro­pos­als that ad­dress the root causes of on­line harms. We are con­cerned, how­ever, that blunt in­ter­ven­tions like manda­tory age as­sur­ance and re­strict­ing ac­cess to tools like VPNs are not ef­fec­tive in im­prov­ing the pro­tec­tion af­forded to young peo­ple on­line, while un­der­min­ing the fun­da­men­tal rights of all users.

VPNs serve as crit­i­cal pri­vacy and se­cu­rity tools for users across all ages. By hid­ing users’ IP ad­dresses, VPNs help pro­tect users’ lo­ca­tion, re­duce track­ing and avoid IP-based pro­fil­ing. People use VPNs for lots of dif­fer­ent rea­sons: to con­nect to their school’s or em­ploy­er’s net­work re­motely, to avoid cen­sor­ship or to sim­ply pro­tect their pri­vacy and se­cu­rity on­line. While be­ing able to ac­cess VPNs is es­pe­cially im­por­tant for vul­ner­a­ble groups like ac­tivists, dis­si­dents or jour­nal­ists, VPNs im­prove every­one’s base­line pro­tec­tion on­line.

Young peo­ple are par­tic­u­larly vul­ner­a­ble to on­line track­ing, tar­geted ad­ver­tis­ing, and the risks that flow from per­sonal data be­ing col­lected and processed for com­mer­cial pur­poses with­out ad­e­quate con­sent or trans­parency. In a world in which young peo­ple are in­ter­act­ing with dig­i­tal tech­nolo­gies as part of their re­al­i­ties from young ages on­ward, re­strict­ing young peo­ple’s ac­cess to pri­vacy-pro­tect­ing tech­nolo­gies is in ten­sion with the goal of equip­ping them to nav­i­gate the in­ter­net safely and com­pe­tently. In or­der to be able to de­velop agency and re­spon­si­ble habits in en­gag­ing with dig­i­tal tech­nolo­gies, it is cru­cial for young peo­ple to be in­tro­duced to best prac­tices and key safety and pri­vacy tools as they en­gage with the on­line world.

Rather than age-gating technologies like VPNs, we believe that regulators should address the root causes of online harm by holding platforms to account, encouraging the responsible use of parental controls and investing in digital skills and a whole-of-society approach to digital wellbeing.

Read our full sub­mis­sion to the Department for Science, Innovation and Technology.

DeepSeek-V4-Flash means LLM steering is interesting again

www.seangoedecke.com

Ever since Golden Gate Claude I’ve been fascinated with “steering”: the idea that you can guide LLM outputs by directly manipulating the activations of the model mid-flight.

DeepSeek V4 Flash

I was in­spired to write this post by an­ti­rez’s re­cent pro­ject DwarfStar 4, which is a ver­sion of llama.cpp that’s been stripped down to run only DeepSeek-V4-Flash. What’s so spe­cial about this model? It might be what many en­gi­neers have been wait­ing for: a lo­cal model good enough to com­pete with at least the low end of fron­tier model agen­tic cod­ing.

Since steering requires a local model, it’s now practical for many engineers to try it out for the first time. And indeed, antirez has baked steering into DwarfStar 4 as a first-class citizen. Right now it’s very rudimentary (basically just the toy “verbosity” example you can replicate via prompting), but the initial release was only eight days ago. I plan to follow this project closely.

How steer­ing works

The basic idea behind steering is extracting a concept (like “respond tersely”) from the model’s internal brain state, then reaching in during inference and boosting the numerical activations that form that concept.

One way you might do this is to feed your model the same set of a hundred prompts twice, once with the normal prompts and once with the words “respond tersely” appended. Then measure the difference in the model’s activations[1] for each prompt pair (by subtracting one activation matrix from the other). That’s your “steering vector”. In theory, you can go and add that to the same activation layer for any prompt and get the same effect (of the model responding tersely).
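Written out, the naive recipe looks something like the following. The notation is just for illustration and isn't taken from any particular implementation: $a_\ell(x)$ stands for the activations at some layer $\ell$ for input $x$ (assumed here to be a single vector per prompt, for example at the final token), $p_i$ are the $N = 100$ prompts, and $s$ is the appended "respond tersely" text. Averaging over the pairs and the scale factor $\alpha$ are assumptions about how the per-pair differences get combined and applied.

```latex
% Steering vector: mean difference between "with suffix" and "without suffix" activations
v_\ell = \frac{1}{N} \sum_{i=1}^{N} \bigl( a_\ell(p_i \oplus s) - a_\ell(p_i) \bigr)

% Steering at inference time: nudge the layer's activations along that direction
h_\ell' = h_\ell + \alpha \, v_\ell
```

Setting $\alpha = 0$ leaves the model untouched, and larger values push harder toward the concept.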

Another, more sophisticated way you might do this is to train a second model to extract “features” from your model’s activations: patterns of behavior that seem to show up together. Then you can try to map those features back to individual concepts, and boost them in the same way. This is more or less what Anthropic is doing with sparse autoencoders[2]. It’s the same principle as the naive approach, but it lets you capture deeper patterns (at the cost of being much more expensive in time, compute and expertise).

Why steer­ing is in­ter­est­ing

Steering sounds like a cheat code. Instead of painstakingly assembling a training set that tries to push the model towards the “smart” end of the distribution in its training data, why not simply go uncover the “smart” dial in the model’s brain and turn it all the way to the right?

It also seems like a more elegant way to adjust the way models talk. Instead of fiddling with the prompt (adding or removing qualifiers like you MUST), couldn’t we just have a control panel of sliders like “succinctness/verbosity” or “conscientiousness/speed” and move them around directly?

Finally, it’s just cool. Watching Golden Gate Claude un­will­ingly drag every sen­tence back to the Golden Gate Bridge is as fas­ci­nat­ing and un­set­tling as Oliver Sacks’ neu­ro­log­i­cal anec­dotes. What if your own mind was tweaked in a sim­i­lar way? Would it still be you?

Why steer­ing has­n’t been used

Why don’t we steer more, then? Why don’t ChatGPT and Claude Code already have a steering panel where you can adjust the model’s brain in real time? One reason is that steering is, unfortunately, kind of a “middle class” idea in AI research.

It’s be­neath the big AI labs, who can ma­nip­u­late their mod­els di­rectly with­out hav­ing to do awk­ward brain surgery mid-in­fer­ence. Anthropic is work­ing on this stuff, but largely from an in­ter­pretabil­ity and safety per­spec­tive (as far as I know). When they want a model to be­have in a cer­tain way, they don’t mess around with steer­ing, they just train the model.

Steering is also out of reach for regular AI users like you and me[3], who use LLMs via an API and thus don’t have access to the model weights or activations needed to steer the model. Only OpenAI can identify or expose steering vectors for GPT-5.5, for instance. We could do this for open-weights models, but until very recently (more on that later) there haven’t been any open models strong enough to be worth doing this for.

On top of that, most ba­sic ap­pli­ca­tions of steer­ing are out­com­peted by just prompt­ing the model. It sounds pretty im­pres­sive to be able to ma­nip­u­late the mod­el’s brain di­rectly. But you know what else ma­nip­u­lates the mod­el’s brain di­rectly? Prompt to­kens. You can ex­er­cise fairly fine-grained con­trol over ac­ti­va­tions with steer­ing, but you can al­ready ex­er­cise ex­tremely fine-grained con­trol by tweak­ing the lan­guage of your prompt. In other words, there’s not much point go­ing to the trou­ble to steer a model to be more ver­bose when you could sim­ply ask.

Steering the un­prompt­able

One way for steering to be really useful is if we could identify a concept that can’t be prompted for. What about “intelligence”? You used to be able to prompt for intelligence - this is why 4o-era prompting always began with “you are an expert” - but current-generation models have that baked into their personalities, so prompting for it does nothing. Maybe steering for it would still work?

Ultimately this is an empirical question, but I’m skeptical that we’ll be able to find an “intelligence” steering vector. Put another way, the steering vector that makes up a concept as difficult as “intelligence” might be almost coextensive with the entire set of weights of the model, and thus identifying it reduces to the problem of “training a smart model”.

A suf­fi­ciently so­phis­ti­cated steer­ing ap­proach ends up just re­plac­ing the ac­tual model. If I take GPT-2, and at each layer I swap out the ac­ti­va­tions with the ac­ti­va­tions from a much stronger model with the same ar­chi­tec­ture, I will get a much bet­ter re­sult. But at that point you’re not mak­ing GPT-2 more in­tel­li­gent, you’re just talk­ing to the stronger model in­stead. The in­tel­li­gence is in the steer­ing, not in the model. For much more on this, see my post AI in­ter­pretabil­ity has the same prob­lems as phi­los­o­phy of mind.

Steering as data com­pres­sion

Another way for steer­ing to be use­ful is if we could some­how steer for a con­cept that re­quires a ton of to­kens to ex­press. Steering would thus save us a big chunk of the mod­el’s con­text win­dow. Intuitively, we might think of this as a way to shift a con­cept from the mod­el’s work­ing mem­ory into its im­plicit mem­ory.

For instance, what if we could identify a “knowledge of my particular codebase” concept? When GPT-5.5 speed-reads my codebase, some of the knowledge it gains has to be buried in the activations, right? Maybe we could drag that out into a very large steering vector.

I would be surprised if this could work. I think we’ll run into the same problem as with extracting “intelligence”: the “knows my codebase” concept is probably sophisticated enough to require a full fine-tune of the model[4]. But it at least seems possible.

Conclusion

I’m fas­ci­nated with steer­ing, but I’m not par­tic­u­larly op­ti­mistic about it. I think most of the gains can be more ef­fi­ciently re­pro­duced with prompts, and that the truly am­bi­tious steer­ing goals can be more ef­fi­ciently re­pro­duced by train­ing or fine-tun­ing the model.

However, the open-source com­mu­nity has­n’t done a lot of work on steer­ing yet, and that might be just start­ing to change now. If I’m wrong and it does have prac­ti­cal ap­pli­ca­tions, we should find that out in the next six months.

It’ll be interesting to see if bespoke per-model tools like DwarfStar 4 end up including a “library” of boostable features. When a popular open-weights model is released, the community always rushes to release a suite of wrappers and quantized versions. Could we also see a rush to extract boostable features from the model?

edit: this post got some comments on Hacker News. Several commenters (including antirez himself) pointed out that steering can change some “trained in” behavior in ways that prompting can’t: most notably to remove refusal from the model. Another commenter says that this is how uncensoring/abliteration is already done for open models. I didn’t know that - I thought the uncensored models were typically LoRA fine-tunes. On this point, antirez noted that modifying the weights can damage model capabilities more than the more lightweight runtime-steering approach (which can be applied only when needed). Makes sense to me.

[1] Models have lots of different activations you might measure (after attention, between each layer, etc). You can basically pick any one you want, or try multiple and see what works best.

[2] I recently read a really good deep dive into doing this with an open LLaMA model (and I tried it myself a few months ago, with mixed results).

[3] Apologies to my readers from the big AI labs. Please email me if you have tried steering internally to boost capabilities and it hasn’t worked. I promise I won’t tell anyone.

[4] And even then, industry attempts to “fine tune a model on your codebase” have largely been unsuccessful.

wsl9x

codeberg.org

Windows 9x Subsystem for Linux.

WSL9x runs a mod­ern Linux ker­nel (6.19 at time of writ­ing) co­op­er­a­tively in­side the Windows 9x ker­nel, en­abling users to take ad­van­tage of the full suite of ca­pa­bil­i­ties of both op­er­at­ing sys­tems at the same time, in­clud­ing pag­ing, mem­ory pro­tec­tion, and pre-emp­tive sched­ul­ing. Run all your favourite ap­pli­ca­tions side by side - no re­boot­ing re­quired!

Proudly writ­ten with­out AI.

Technical de­tails

WSL9x is made up of three com­po­nents: a patched Linux ker­nel (see the win9x-um-6.19 branch), a VxD dri­ver, and a wsl.com client pro­gram.

The dri­ver is re­spon­si­ble for the ini­tial­i­sa­tion of WSL9x (see vxd/​wsl9x.asm for the dri­ver en­try point). It sets up the ini­tial map­pings for the ker­nel code and loads vm­linux.elf off disk us­ing DOS in­ter­rupts (see vxd/​loader.c and vxd/​fs.asm). The ker­nel is com­piled with a fixed base ad­dress of 0xd0000000.

The dri­ver then starts a new thread in the System VM, al­lo­cates a 16 KiB stack for en­ter­ing Linux on, and drops into an event loop which han­dles en­ter­ing the ker­nel, dis­patch­ing IRQs, re­turn­ing to user­space, and idling. See vxd/​en­try.c for this code.

The dri­ver is also re­spon­si­ble for han­dling user­space events which must be dis­patched to the ker­nel, cur­rently page faults and syscalls. Syscalls are han­dled via the gen­eral pro­tec­tion fault han­dler, as Win9x does not have an in­ter­rupt de­scrip­tor table long enough to in­stall a proper han­dler for int 0x80 - the Linux i386 syscall in­ter­rupt. Instead, the GPF han­dler in­spects the fault­ing in­struc­tion. If it’s int 0x80, the GPF han­dler ad­vances the in­struc­tion pointer as if the in­ter­rupt suc­ceeded and dis­patches as a syscall to Linux. See vxd/​fault.c for this code.
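The syscall trick described above can be sketched as a small simulation. This is not the code in vxd/fault.c; `handle_gpf`, the register dictionary, and the `dispatch_syscall` callback are hypothetical names for illustration. The only x86 facts relied on are that `int 0x80` encodes as the two bytes 0xCD 0x80 and that the Linux i386 ABI returns syscall results in eax.

```python
INT_OPCODE = 0xCD      # x86 opcode for "int imm8"
SYSCALL_VECTOR = 0x80  # Linux i386 syscall interrupt number


def handle_gpf(memory, regs, dispatch_syscall):
    """Return True if the fault was an `int 0x80` and was handled as a syscall."""
    eip = regs["eip"]
    if memory[eip:eip + 2] != bytes([INT_OPCODE, SYSCALL_VECTOR]):
        return False  # a genuine fault: leave it to the normal handler
    regs["eip"] = eip + 2  # skip the two-byte instruction, as if it had succeeded
    regs["eax"] = dispatch_syscall(regs)  # hypothetical hand-off to Linux
    return True


# Usage: a nop, then int 0x80, then another nop.
memory = bytes([0x90, 0xCD, 0x80, 0x90])
regs = {"eip": 1, "eax": 20}
handled = handle_gpf(memory, regs, lambda r: 1234)
print(handled, regs)
```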

The Linux kernel is based on user-mode Linux, but hacked to call Windows 9x kernel APIs instead of POSIX APIs, and running in ring 0 (supervisor/kernel mode) rather than ring 3 (user mode). Much of the actual Win9x kernel integration, including context switching, lives in the Linux kernel. See linux/arch/um/os-Win95 for the bulk of the Linux-side code. The entry point called by vxd/entry.c is _start in main.c. process.c and mmu.c are also important.

The last piece is the wsl.com client. This is a small 16 bit DOS pro­gram im­ple­mented in wsl/​wsl.asm which ex­ists to al­low WSL9x to use MS-DOS prompts as TTY win­dows rather than need­ing to im­ple­ment some­thing cus­tom.

When wsl.com starts, it makes an ini­tial call into wsl9x_v86_api in vxd/​con­sole.c to claim an un­used con­sole and no­tify WSL9x that out­put for that con­sole should be dis­patched to it. Then it drops into an event loop wait­ing for an IRQ and at­tempt­ing to read from the key­board when in­ter­rupted. The top of this event loop also serves as a syn­chro­ni­sa­tion point for the con­sole dri­ver - when out­put from Linux is ready, it sched­ules an event and ex­e­cutes int 0x29 in the con­text of the MS-DOS VM to out­put chars to the DOS win­dow. This in­ter­rupt is also where an ANSI dri­ver for DOS such as NNANSI is able to in­ter­cept the ter­mi­nal out­put to im­ple­ment ANSI es­cape codes.

Building and run­ning

You will need a cross tool­chain tar­get­ing i386-linux-musl on PATH. Use musl-cross-make to build one

You will need the Open Watcom v2 tool­chain for build­ing the Windows com­po­nents. Set the WATCOM env var to the pre­fix where you in­stalled it. On my ma­chine, that’s /opt/watcom.

Build the patched Linux ker­nel. This is a man­ual step be­cause build­ing the ker­nel takes quite a long time.

$ git submodule update --init   # make sure linux submodule is up to date
$ make build-linux -j $(nproc)

You will need a hard drive im­age hdd.base.img with Windows 9x pre-in­stalled

Run make - this will pro­duce a new hdd.img with WSL9x ready to go.

Run wsl at the MS-DOS prompt to open a pty. If you’d like to use ANSI colours, make sure you have an ap­pro­pri­ate dri­ver loaded be­fore run­ning wsl. nnansi.com is a good op­tion.

License

GPL-3

A nicer voltmeter clock

lcamtuf.substack.com

Back in 2019, I built a sim­ple volt­meter clock:

As the name im­plies, these clocks use ana­log panel volt­meters in­stead of tra­di­tional clock faces to dis­play time. I did­n’t come up with the idea, so I never re­ally blogged about the de­sign; I just built one and kept it on my of­fice desk.

The idea en­dures, but most of the de­signs I see on the in­ter­net are need­lessly com­pli­cated and not all that pretty, so when I de­cided to build a re­vised de­sign, I fig­ured it might be good to doc­u­ment it bet­ter. The process started with a rough mockup in a 3D de­sign pro­gram:

For this ver­sion of the me­ter clock, I opted to use three generic, 90° panel volt­meters from Amazon (link, about $9). I dis­as­sem­bled them, took care­ful mea­sure­ments of the faces, and then printed re­place­ment de­cals on ad­he­sive pa­per. Printable PDF tem­plates can be found here.

Note that the new hour gauge has 13 di­vi­sions, from 0 to 12, while the minute and sec­ond tem­plates have 61 di­vi­sions, from 00 to 60. This is be­cause I wanted to im­ple­ment con­tin­u­ous mo­tion for each hand; this meant that at 11:30, the hour dial could­n’t be just stuck at 11; it needed to be mov­ing to­ward the twelfth di­vi­sion, even if it was never to reach it.
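The continuous-motion idea can be written down as a few lines of arithmetic. This is a sketch rather than the clock's actual firmware: `gauge_fractions` is a hypothetical helper, and it assumes each needle position is a linear fraction of full scale, with the hour wrapping to 0 at twelve o'clock.

```python
def gauge_fractions(hours, minutes, seconds):
    """Return (hour, minute, second) needle positions as fractions of full scale."""
    second_frac = seconds / 60
    minute_frac = (minutes + second_frac) / 60
    hour_frac = ((hours % 12) + minute_frac) / 12
    return hour_frac, minute_frac, second_frac


# At 11:30:00 the hour needle sits halfway between the 11 and 12 divisions.
hour_frac, minute_frac, second_frac = gauge_fractions(11, 30, 0)
print(hour_frac * 12, minute_frac * 60, second_frac * 60)
```

Because every fraction stays strictly below 1, the needles approach the last division without ever resting on it, which is why the faces need the extra mark.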

In ad­di­tion to a host of other prob­lems, the cheap Baomain 65C5 me­ters I’m us­ing have a rather hideous plas­tic flange. I de­cided to hide this flange from view and use a re­cessed dec­o­ra­tive pat­tern to keep the front panel in­ter­est­ing. This made it more ex­pe­di­ent to cut the front and back on a CNC mill in­stead of build­ing the en­clo­sure by hand (as I did for ver­sion 1). The stock ma­te­r­ial is maple lum­ber re­sawn, squared, and planed in my work­shop:

The rounded side wall posed a different challenge. For a seamless appearance, I needed to bend a flat piece of wood using a shaped template. To pull this off without a steam bending jig, I had to cut a series of internal notches on the side wall. This allowed the wood to flex more easily:

The wood had to be moist­ened, clamped, and then al­lowed to dry. After a cou­ple of days, I glued the curved side wall to the front and back faces, re­ly­ing on an­other tem­plate cut out of scrap ply­wood to get a pre­cise fit with­out any more gym­nas­tics with clamps and ratchet straps:

Anyway — here’s the as­sem­bled piece af­ter sand­ing and a coat of ni­tro­cel­lu­lose lac­quer:

Not bad, right?

The circuit is far less interesting and took just an hour or so: I grabbed the venerable AVR128DB28 MCU, powered it off a wall wart, and interfaced it to an 8 MHz crystal (ECS-80-18-4X-CKM). A 32.768 kHz crystal would also do. The panels are connected to three digital output pins (PC0, PC1, PC2). Finally, two input pins (PD6, PD7) are interfaced to two small pushbuttons mounted on the back and used to set time.

Note that the circuit doesn’t require digital-to-analog converters or any other additional components to drive the meters; instead, I’m just using a relatively high-frequency, 1-bit digital pulse train. The inertia of the meter (and the inductance of the coil inside the meter) does the rest, settling in an intermediate position depending on the software-controlled signal duty cycle.

The code can be viewed here; it’s short and well-com­mented. The ba­sic idea is to ad­vance a 10 Hz counter us­ing a timer in­ter­rupt syn­chro­nized with the crys­tal. With this out of the way, the main event loop com­putes the ap­pro­pri­ate duty cy­cle and then man­u­ally tog­gles the out­put pins. Although the chip has a hard­ware PWM mod­ule, the ap­pli­ca­tion is sim­ple enough that us­ing the PWM cir­cuitry would­n’t re­ally buy us any­thing.
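To illustrate why a 1-bit pulse train is enough, here is a small simulation. It is not the linked firmware: `pulse_train` and `average_level` are hypothetical helpers, and the sketch assumes a 100% duty cycle corresponds to full-scale deflection, which a real meter would need calibrating for.

```python
def pulse_train(duty, steps=100):
    """Return one PWM period as a list of 0/1 samples with the given duty cycle."""
    high_steps = round(duty * steps)
    return [1 if i < high_steps else 0 for i in range(steps)]


def average_level(samples):
    """What a slow meter needle effectively sees: the mean of the pulse train."""
    return sum(samples) / len(samples)


# A 10 Hz tick counter, 15.0 seconds into the minute (hypothetical scaling).
ticks = 150
duty = (ticks / 10) / 60
print(average_level(pulse_train(duty)))  # prints 0.25
```

The needle cannot follow individual pulses, so it settles at the average, here a quarter of the way across the seconds gauge.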

Here’s the obligatory “rollover” video captured around 11:59:59:

Peace out.

If you liked the ar­ti­cle, you’ll en­joy The Secret Life of Circuits. It’s a richly il­lus­trated, lu­cid in­tro­duc­tion to elec­tron­ics — from the physics of con­duc­tion to em­bed­ded sys­tem pro­gram­ming. It fea­tures 290+ color di­a­grams, 420+ pages of orig­i­nal con­tent, and zero AI.

Hosting a website on an 8-bit microcontroller. (Maurycy's blog)

maurycyz.com

In today’s episode of “dumb things to do with an AVR microcontroller”:

MCU web­site demo (may go down if this gets posted to HN)

My vic­tim is the AVR64DD32 which is quite sim­i­lar to the Atmega328 of Arduino fame. Compared to the older Atmega, these AVR DD lines are cheaper for the same mem­ory, use a sin­gle pro­gram­ming pin and have nicer pe­riph­er­als:

So that’s the com­puter (a rather spa­cious one at that) but it’ll need an in­ter­net con­nec­tion to host a web­site.

The obvious choice is Ethernet, but even the slowest version (10BASE-T) still runs at 10 megabits/second. Worse, it uses Manchester encoding: a zero is sent as “10” and a one as “01”, so 10 megabits of data is actually 20 megabits at the wire.

This is simply too fast for the AVR to generate. While its processor can run at 24 MHz, all the peripherals and IO pins max out at a 12 MHz clock (although some other 8-bit chips should be able to do it).
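The doubling is easy to see in code. This is a minimal sketch of the bit mapping only; `manchester_encode` is a hypothetical helper, and it emits each byte most significant bit first for readability, whereas real Ethernet transmits each byte least significant bit first.

```python
def manchester_encode(data: bytes) -> str:
    """Encode bytes as a string of wire symbols, most significant bit first."""
    out = []
    for byte in data:
        for bit in range(7, -1, -1):
            # A zero becomes "10" and a one becomes "01", as described above.
            out.append("01" if (byte >> bit) & 1 else "10")
    return "".join(out)


encoded = manchester_encode(b"\xA0")
print(encoded, len(encoded))  # 8 data bits become 16 wire symbols
```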

The proper so­lu­tion is to buy a ded­i­cated eth­er­net chip from DigiKey, but then I’d be wait­ing weeks to fin­ish this pro­ject.

… and eth­er­net is far from the only op­tion:

Serial Line Internet Protocol (RFC 1055) is a very old and very sim­ple stan­dard for run­ning net­works over se­r­ial:

Before send­ing a packet, wrap it in 0xC0 bytes. If the packet con­tains any 0xC0 bytes, re­place them with 0xDB 0xDC. To avoid am­bi­gu­ity, any pre-ex­ist­ing 0xDB bytes are re­placed with 0xDB 0xDD.
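The framing rules above fit in a few lines. This is a sketch of RFC 1055 framing in Python rather than the microcontroller's own code; the function names are hypothetical, and the decoder assumes it is handed exactly one frame.

```python
END, ESC, ESC_END, ESC_ESC = 0xC0, 0xDB, 0xDC, 0xDD


def slip_encode(packet: bytes) -> bytes:
    """Wrap a packet in END bytes, escaping any END or ESC bytes inside it."""
    out = bytearray([END])
    for b in packet:
        if b == END:
            out += bytes([ESC, ESC_END])
        elif b == ESC:
            out += bytes([ESC, ESC_ESC])
        else:
            out.append(b)
    out.append(END)
    return bytes(out)


def slip_decode(frame: bytes) -> bytes:
    """Undo slip_encode for a single frame."""
    out = bytearray()
    escaped = False
    for b in frame:
        if escaped:
            if b == ESC_END:
                out.append(END)
            elif b == ESC_ESC:
                out.append(ESC)
            else:
                raise ValueError("invalid SLIP escape sequence")
            escaped = False
        elif b == ESC:
            escaped = True
        elif b != END:  # END bytes only mark frame boundaries
            out.append(b)
    return bytes(out)


packet = bytes([0x01, 0xC0, 0xDB, 0x02])
print(slip_encode(packet).hex(" "))
```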

This scheme was widely used for connecting to the internet in the olden days: A dial up modem creates a serial link over a phone line, and it’s up to the computer to do anything with it. (This also means that they are not limited to networking: those same modems could be connected to a terminal for remote access)

… which is why SLIP is still sup­ported by mod­ern Linux:

# Just a normal USB to Serial adapter
stty -F /dev/ttyUSB0 115200 raw cs8
slattach -m -F -L -p slip /dev/ttyUSB0
# … and now it’s a network interface

The hard­ware on the mi­cro­con­troller’s end is triv­ial:

It does work with no ex­ter­nal com­po­nents, but I wanted some blinken­lights, and an id­iot-proof­ing diode for when I in­evitably con­nect the power back­wards.

Because it only draws a few milliwatts, it can run the server off the serial adapter’s 5 volt rail: it’s really nice to only have one cable to deal with.

Now it has an in­ter­net con­nec­tion, but that’s hardly a server.

In or­der for my web page to get to your com­puter, it needs to pass through dozens of dif­fer­ent net­works. To do this, each packet has an IP header: 40 bytes that con­tain the ad­dress of the source and des­ti­na­tion com­put­ers, and some other stuff I don’t re­ally care about.

The pro­to­col used to be a lot more com­plex, with fea­tures like packet frag­men­ta­tion that re­quire a lot of mem­ory to han­dle cor­rectly, but I don’t have to: every mod­ern op­er­at­ing sys­tem dis­ables frag­men­ta­tion and IPv6 re­moved it en­tirely.

This makes implementing it very easy: Just swap around the source and destination of a received packet to generate the header for the response. (and reset the TTL counter)

The other protocol, TCP, is a lot harder: Implementing it requires the microcontroller to track connection states, periodically retransmit lost packets and handle a huge number of edge cases.
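Here is roughly what that swap looks like, as a sketch and not the author's firmware. It assumes plain IPv4 with no options, which gives a 20-byte header (the 40-byte fixed header belongs to IPv6). `reply_header` and the TTL of 64 are illustrative choices. The header checksum has to be recomputed because the TTL changed, and a real reply would also need to update the total-length field, which this sketch leaves alone.

```python
def checksum(header: bytes) -> int:
    """Internet checksum: one's complement of the one's-complement 16-bit sum."""
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def reply_header(received: bytes, ttl: int = 64) -> bytes:
    """Build a reply IPv4 header from the first 20 bytes of a received packet."""
    hdr = bytearray(received[:20])
    src, dst = hdr[12:16], hdr[16:20]   # slices are copies, so the swap is safe
    hdr[12:16], hdr[16:20] = dst, src
    hdr[8] = ttl                         # reset the TTL counter
    hdr[10:12] = b"\x00\x00"             # zero the checksum field before summing
    hdr[10:12] = checksum(bytes(hdr)).to_bytes(2, "big")
    return bytes(hdr)


# Usage: a made-up TCP packet header from 10.0.0.1 to 10.0.0.2.
request = bytes([0x45, 0x00, 0x00, 0x28, 0x12, 0x34, 0x40, 0x00,
                 0x32, 0x06, 0x00, 0x00, 10, 0, 0, 1, 10, 0, 0, 2])
print(reply_header(request).hex(" "))
```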

It took sev­eral days to get my cus­tom im­ple­men­ta­tion work­ing well enough, and it’s still got a few bugs.

As for implementing HTTP, I didn’t: The server always sends a hardcoded “response” back to the client. This works fine as long as there’s only a single URL on the site.
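A hardcoded response of this kind might look like the sketch below. The actual bytes the microcontroller sends are not shown in the post, so the body, the headers, and the `handle_request` name are all assumptions.

```python
# Hypothetical page body; the real one lives in the microcontroller's flash.
BODY = b"<h1>Hello from a microcontroller</h1>\n"

RESPONSE = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: " + str(len(BODY)).encode("ascii") + b"\r\n"
    b"Connection: close\r\n"
    b"\r\n"
) + BODY


def handle_request(request: bytes) -> bytes:
    """Ignore the request entirely and return the same bytes every time."""
    return RESPONSE


print(handle_request(b"GET /anything HTTP/1.1\r\n\r\n").decode("ascii"))
```

Since the request is never parsed, every path and method gets the same page, which is the limitation the text mentions.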

[Video of the page load­ing. See web or files di­rec­tory: load­ing.mp4]

Ok great, but what if I want to share it with friends? Unfortunately, for their requests to reach it, it needs a publicly routable IPv4 address. Not only are these expensive but it’s impossible to get a good internet connection at my place.

(no, Starlink is not good)

I do have a machine with a publicly routable address, but it’s at a datacenter near Helsinki: I’d need a very long serial cable…

Another cool thing Linux supports is WireGuard, which creates a virtual network link over the internet. This works even if one of the machines is behind (CG)NAT or other annoyances.

Problem solved:

have the Linux router box con­nect to the VPS to get a proper in­ter­net con­nec­tion?

… except the MCU still doesn’t have its own IP address: I could forward everything from my VPS’s address to it, but that would break my normal website.

Instead, I set up the VPS to proxy any requests under /mcu to the microcontroller using a local address block. This means that visitors aren’t directly connecting to the MCU’s TCP/IP stack… but hey, it’s the same setup that the Vape Server uses and no one complained.

(It also makes it slightly harder to break by send­ing SYN pack­ets, but it’s not ex­actly hard to DDoS a server con­nected over what’s ef­fec­tively dial-up)

Related:

/mcu: The page hosted from the mi­cro­con­troller.

http://​ewaste.fka.wtf/: The Vape Server, a web­site hosted off a 32-bit MCU pulled from the trash.

https://​lcamtuf.sub­stack.com/​p/​psa-if-youre-a-fan-of-at­mega-try: lcamtuf on the AVR Dx line.
