10 interesting stories served every morning and every evening.

Banning noise will be a disaster for statistical data products - Ted is writing things

desfontain.es

Last week, the United States Department of Commerce is­sued an or­der de­clar­ing that noise in­fu­sion” will be banned from all sta­tis­ti­cal prod­ucts pub­lished by the Census Bureau and the Bureau of Economic Analysis.

What does it mean, and why should you care?

Context

Statistical prod­ucts are a bunch of num­bers pub­lished from a se­cret dataset. Often, that dataset con­tains con­fi­den­tial in­for­ma­tion, and it is im­por­tant that the num­bers don’t re­veal that in­for­ma­tion. The U.S. Census is a well-known ex­am­ple: the sta­tis­tics are made pub­lic, but the con­tents of each form filled by in­di­vid­ual U.S. res­i­dents must stay se­cret.

Scientists have de­vel­oped a num­ber of tech­niques that can be used to pub­lish use­ful sta­tis­tics while pro­tect­ing the pri­vacy of the orig­i­nal data. This field is called dis­clo­sure avoid­ance in sta­tis­ti­cal com­mu­ni­ties. Here are a few of these tech­niques.

Suppression: re­mov­ing data that does­n’t pass cer­tain thresh­olds (e.g. if a count of peo­ple is be­low 5, we don’t pub­lish it).

Coarsening (or gen­er­al­iza­tion): mak­ing data at­trib­utes less pre­cise (e.g. trans­form a county into its state, a date of birth into an age range, etc.).

Sampling: ran­domly re­mov­ing some records from the dataset.

Swapping: tak­ing at­trib­utes from dif­fer­ent records and ex­chang­ing them ran­domly.

Contribution bound­ing: mak­ing sure that a sin­gle in­di­vid­ual can­not con­tribute too much” to a sta­tis­tic by lim­it­ing their max­i­mum im­pact.

Noise ad­di­tion: adding a ran­dom num­ber to sta­tis­tics to hide their true value.

Some of these tech­niques, when com­bined, achieve a de­f­i­n­i­tion called dif­fer­en­tial pri­vacy. This de­f­i­n­i­tion has a lot of nice fun­da­men­tal prop­er­ties and is widely con­sid­ered the gold stan­dard of pri­vacy pro­tec­tion among sci­en­tists. To achieve it, sci­en­tists typ­i­cally rely on a com­bi­na­tion of con­tri­bu­tion bound­ing and care­fully-cal­i­brated noise ad­di­tion.

From 1990 to 2010, the U.S. Census Bureau pri­mar­ily re­lied on swap­ping for the de­cen­nial cen­sus. Then, they re­al­ized that this tech­nique was ac­tu­ally very un­safe, and that it was pretty easy to re­con­struct in­di­vid­ual records us­ing the pub­lished sta­tis­tics. This is bad, be­cause the Bureau is re­quired by fed­eral law to keep these records con­fi­den­tial. So they tried a few al­ter­na­tive ap­proaches, and de­cided to adopt dif­fer­en­tial pri­vacy for the 2020 Census: this was the one that kept the sta­tis­tics most use­ful, while pre­vent­ing these at­tacks.

It bears re­peat­ing: dif­fer­en­tial pri­vacy was­n’t cho­sen be­cause the math was nice and com­pelling1. It was se­lected be­cause among the dif­fer­ent op­tions that mit­i­gated the at­tack, it was the one that pre­served the most util­ity. Its ex­act pri­vacy pa­ra­me­ters were cho­sen not be­cause they pro­vided rock-solid prov­able guar­an­tees, but be­cause they squeezed most use­ful­ness out of the data while reach­ing an ac­cept­able level of pri­vacy pro­tec­tion.

Sadly, preserved the most util­ity un­der newly-dis­cov­ered pri­vacy con­straints” did not mean preserved as much util­ity as the 2010 Census”: the num­bers got less ac­cu­rate, and the in­ac­cu­ra­cies got a lot more trans­par­ent, and there­fore im­pos­si­ble to ig­nore. This made a num­ber of peo­ple very an­gry.

Demographers and so­cial sci­en­tists could no longer ig­nore that the data they were work­ing with was noisy data. This re­quired a ma­jor shift in how they con­cep­tu­al­ized and worked with this data.

People who were us­ing Census data to ac­tu­ally re­con­struct records could no longer do so. Demographers ad­mit­ted that this was com­mon prac­tice. It’s also an open se­cret that this was done by po­lit­i­cal op­er­a­tives as part of ger­ry­man­der­ing ef­forts.

Phew, that was a lot of con­text.

What does the or­der say?

The ad­min­is­tra­tion has now de­cided that noise in­fu­sion was no longer an ac­cept­able dis­clo­sure avoid­ance tech­nique.

The or­der clearly tar­gets dif­fer­en­tial pri­vacy, but also seems to im­pact other tech­niques that in­volve ran­dom­ness: the text ex­plic­itly men­tions that coars­en­ing should al­ways be pre­ferred, falling back to sup­pres­sion as a last re­sort”. I have no idea why the or­der is so spe­cific. Maybe they wanted to make sure the sci­en­tists work­ing at the U.S. Census could­n’t still use sim­i­lar tech­niques with­out call­ing them dif­fer­en­tial pri­vacy?

The or­der also care­fully says it shall not be in­ter­preted to con­flict with any con­sti­tu­tional, statu­tory, reg­u­la­tory, or other le­gal pro­vi­sion”. So the con­fi­den­tial­ity oblig­a­tions sur­round­ing these sta­tis­ti­cal prod­ucts still ap­ply.

What will it mean in prac­tice?

The con­se­quences will be dire for util­ity or for pri­vacy, and pos­si­bly both. It’s hard to un­der­state this point: fu­ture sta­tis­ti­cal re­leases will ei­ther be use­less com­pared to past ones, or they will be in­cred­i­bly un­safe.

For starters, tak­ing away use­ful tools from the dis­clo­sure avoid­ance tool­box will al­ways lead to more painful pri­vacy/​util­ity trade-offs. The whole point of this re­search field is to bet­ter un­der­stand and quan­tify pri­vacy risk, and de­velop bet­ter tools to mit­i­gate this risk while pre­serv­ing util­ity.

For sta­tis­ti­cal re­leases, dif­fer­en­tial pri­vacy is sim­ply the best tool we have right now. It pro­vides a finer way of quan­ti­fy­ing trade-offs, and al­lows us to get more util­ity out of the data than com­pet­ing tech­niques at sim­i­lar pri­vacy lev­els. If you take it away, you’re left with tech­niques that ei­ther have worse util­ity at sim­i­lar lev­els of pri­vacy, or worse pri­vacy for the same util­ity.

But all com­pet­ing tech­niques also rely on noise ad­di­tion. The Cell Key method, used at other sta­tis­ti­cal agen­cies, adds noise to sta­tis­tics. Swapping, used from 1990 to 2010 for the U.S. Census, also in­jects ran­dom­ness into the process. Sampling is every­where in sta­tis­ti­cal work2. Hell, even im­pu­ta­tion tech­ni­cally adds noise to the data3!

By con­trast, coars­en­ing and sup­pres­sion are very blunt in­stru­ments. They only work in sit­u­a­tions where the sta­tis­tics are al­ready very coarse, and not too many of them are pub­lished. For com­plex data prod­ucts with many sta­tis­tics about small groups of peo­ple (like the U.S. Census), they ei­ther de­stroy all util­ity of the data (especially for mi­nor­ity pop­u­la­tions), or are very vul­ner­a­ble to pri­vacy at­tacks.

It makes sense: pri­vacy at­tacks on sta­tis­ti­cal re­leases are about solv­ing a sys­tem of equa­tions. It is such an eas­ier task when you know for sure that the sta­tis­tics are all per­fectly ac­cu­rate. Noise forces you to com­pute prob­a­bil­i­ties, quan­tify the un­cer­tainty, care­fully con­sider base­lines, and so on. That’s why ran­dom­ness is such a use­ful tool for dis­clo­sure avoid­ance! Even with­out for­mal guar­an­tees, it makes at­takcs a lot harder. Take it away and at­tacks be­come triv­ial.

Why is it hap­pen­ing?

I mean, who knows.

Maybe the goal is to force the U.S. Census to pub­lish sta­tis­tics that ac­tu­ally en­able re-iden­ti­fi­ca­tion, to help with fu­ture ger­ry­man­der­ing ef­forts? Or on the con­trary, maybe the idea is to stop the pub­li­ca­tion of use­ful de­mo­graphic data, to pre­vent re­searchers from show­ing un­fair dis­par­i­ties among the pop­u­la­tion?

Hanlon’s ra­zor pro­vides an al­ter­na­tive ex­pla­na­tion. The fun­da­men­tal pri­vacy/​util­ity trade-off in­her­ent to sta­tis­ti­cal data re­leases is an­noy­ing. It would be a lot eas­ier if pub­lish­ing many sta­tis­tics did­n’t au­to­mat­i­cally come with a high pri­vacy risk. Differential pri­vacy makes this trade-off ex­plicit, and thus im­pos­si­ble to ig­nore. Maybe ban­ning it is a way of pre­tend­ing that the prob­lem does­n’t ex­ist, in the hope that it will go away?

Thanks to Adam Sealfon, Aloni Cohen, Ben Jacobsen, and Gautam Kamath for help­ful com­ments on ear­lier drafts of this post.

Sadly for math-brained peo­ple like me, it turns out that very few choices made in the real world de­pend on the el­e­gance of the un­der­ly­ing math. ↩

Sadly for math-brained peo­ple like me, it turns out that very few choices made in the real world de­pend on the el­e­gance of the un­der­ly­ing math. ↩

Maybe sam­pling does­n’t count as noise in­fu­sion? But if you take a ran­dom sam­ple of a pop­u­la­tion and es­ti­mate a to­tal sta­tis­tic based on the sam­ple… this sta­tis­tic will be noisy. ↩

Maybe sam­pling does­n’t count as noise in­fu­sion? But if you take a ran­dom sam­ple of a pop­u­la­tion and es­ti­mate a to­tal sta­tis­tic based on the sam­ple… this sta­tis­tic will be noisy. ↩

Thanks to Adam Smith for point­ing this out. It is al­most funny, in an ex­tremely cursed sort of way, to imag­ine do­ing sta­tis­ti­cal data re­leases with­out any im­pu­ta­tion. Maybe there’s a pa­per to be writ­ten here, to for­mally quan­tify the ef­fect of im­pu­ta­tion us­ing the lan­guage of dif­fer­en­tial pri­vacy, sim­i­larly to what was done with swap­ping. ↩

Thanks to Adam Smith for point­ing this out. It is al­most funny, in an ex­tremely cursed sort of way, to imag­ine do­ing sta­tis­ti­cal data re­leases with­out any im­pu­ta­tion. Maybe there’s a pa­per to be writ­ten here, to for­mally quan­tify the ef­fect of im­pu­ta­tion us­ing the lan­guage of dif­fer­en­tial pri­vacy, sim­i­larly to what was done with swap­ping. ↩

Every Frame Perfect

tonsky.me

A while ago I was read­ing about Wayland and this quote stuck with me:

A stated goal of Wayland is every frame is per­fect”.

A stated goal of Wayland is every frame is per­fect”.

And I think this is a goal we should all as­pire to. Wayland is talk­ing about the tech­ni­cal side of things (modern GPU stacks are very com­plex and Wayland is try­ing to take con­trol back) but it could be ap­plied to UI too.

The rule of thumb is:

If I take a screen­shot of your app at any mo­ment, it must make sense

Why care about every frame? It builds trust. Users can’t see the code, so UI is the only way for them to judge the qual­ity of the app. If UI looks good, that means de­vel­op­ers had time to pol­ish it, which means that they prob­a­bly spent a com­pa­ra­ble amount of time to iron out the code. It’s a heuris­tic, but a rea­son­able one.

Now, what does it mean in prac­tice? I can think of a few things:

No white flashes be­tween screens.

No par­tially loaded con­tent.

No re­lay­out while con­tent loads.

Internally con­sis­tent. If one part of the UI says 1 up­date avail­able”, an­other part should not say Checking for up­dates…”

Precise an­i­ma­tions.

Animations of­ten end up be­ing for­got­ten. A UI might look great in both start and end states but very janky in be­tween. Like this:

If you feel like there are weird things go­ing on there, there are! Look at slowed down ver­sion:

Now let’s ap­ply our rule and take screen­shots in the mid­dle of the an­i­ma­tion. This does­n’t look right:

Neither does this:

Both of these frames are not per­fect.

Let’s look at an­other ex­am­ple. Safari:

Placeholder text here moves from the cen­ter but cur­sor an­i­mates from the left po­si­tion:

Not the end of the world by any means, but it does cre­ate a feel­ing that these two com­po­nents are not in sync with each other. Next thought: maybe they weren’t de­signed to­gether? If so, then they might not work well to­gether. That’s how trust is lost.

This de­syn­chro­niza­tion can lead to a lot of con­fu­sion. For ex­am­ple, in Photos, when switch­ing be­tween Crop and Adjust mode, pic­ture snaps into place im­me­di­ately but the crop bor­der is an­i­mated:

This cre­ates a false feel­ing that some­thing sub­tly changes when you switch be­tween modes. And you know what? I don’t want my UI to give me false feel­ings. I want it to be a pre­cise in­stru­ment, not an an­i­mated toy.

Sometimes an­i­ma­tions are sup­posed to help you un­der­stand a tran­si­tion, so it’s dou­bly sad when they make it harder. Follow the mag­ni­fy­ing glass:

Same with Youtube. They had the sim­plest task in the world: move a rec­tan­gle from one po­si­tion to an­other! Yet they de­cided to do some­thing very strange:

Can you ex­plain this? Does it make sense?

Probably a tech­ni­cal lim­i­ta­tion of the DOM ar­chi­tec­ture they de­cided ear­lier on. I call these sit­u­a­tions The tech­nol­ogy has out­smarted the pro­gram­mer”. But no mat­ter the rea­son, the re­sult is an im­per­fect frame.

Sometimes an­i­ma­tions are left out as an af­ter­thought. Whatever hap­pens, hap­pens. Then we get this:

The de­tails are fas­ci­nat­ing to watch:

So yeah. Please pay at­ten­tion not only to the start and end states, but also to every­thing in be­tween. Every frame mat­ters.

I’ll leave you with this un­pro­voked zoom an­i­ma­tion from Preview app. Take care!

wsj.com

www.wsj.com

Please en­able JS and dis­able any ad blocker

wsj.com

www.wsj.com

Please en­able JS and dis­able any ad blocker

Z.ai launches GLM-5.2 with a 1-million-token context window ahead of an MIT-licensed release next week · Digg

digg.com

Intelligence should be open, ac­ces­si­ble, and ready to build with, em­pow­er­ing every de­vel­oper, every­where.

GLM-5.2 is now avail­able to all GLM Coding Plan users, in­clud­ing Lite, Pro, Max, and Team plans. http://​docs.z.ai/​de­v­pack/​lat­est-model

As our new flag­ship model, GLM-5.2 de­liv­ers pow­er­ful cod­ing ca­pa­bil­i­ties, us­able 1M-context sup­port, and con­tin­ued strengths in long-hori­zon tasks.

API and Chatbot ser­vices will launch next week. The model will also be of­fi­cially open-sourced next week un­der the MIT License.

The fu­ture of AI is open, and it be­longs to the peo­ple.

Just a moment...

economist.com

Access Denied

news.sky.com

AI Coding at Home Without Going Broke

stephen.bochinski.dev

There are three ways to do AI cod­ing at home with­out spend­ing like a com­pany, and which one fits de­pends mostly on how much you trust the next year of hard­ware and model re­leases. The first is to self host. You buy the ma­chine, run open source mod­els lo­cally, and pay noth­ing per to­ken af­ter that. The up­front cost is steep and the mod­els you can ac­tu­ally run at home are weaker than what the fron­tier labs ship, so this only pays off if you can keep the rig busy with long run­ning tasks where a slower, cheaper model grinds away overnight. Most peo­ple can’t keep a home ma­chine that loaded, and the hard­ware you buy to­day may look like a bad bet in a year.

The sec­ond is to skip the hard­ware and rent those same open source mod­els from a provider at API rates. For most peo­ple this is the right call. You avoid putting thou­sands of dol­lars on one GPU setup while con­fig­u­ra­tions are still in flux, you skip the work of squeez­ing long run­ning per­for­mance out of an open model, and you can switch to what­ever is cheaper or bet­ter next month with­out re­selling a box. Something like OpenRouter makes the move close to a one line change.

The third is to min-max the fron­tier sub­scrip­tions from OpenAI and Anthropic. Around $400 a month of plans buys roughly $2800 of API us­age at list prices, which is a real bar­gain right up un­til you hit the ceil­ing. The plans are me­tered, and any large AI na­tive work­flow will chew through the in­cluded to­kens fast. They shine for the work you drive by hand and fall short as the en­gine for an agent run­ning all day.

What I’ve seen work best is a blend of the last two. Keep a cou­ple of fron­tier sub­scrip­tions for the hard think­ing and the spec writ­ing, and pay API rates for open source mod­els to han­dle the small me­chan­i­cal pieces. Lean on spec dri­ven de­vel­op­ment so the ex­pen­sive mod­els pro­duce the plan and the cheap ones fill it in. Do that well and you can build what a team of twenty en­gi­neers would put out in a month for around a thou­sand dol­lars.

Honda Civics and the Evil Valet

juniperspring.org

Three years ago, I pub­lished my ini­tial work to un­der­stand and re­verse en­gi­neer my car, specif­i­cally the head­unit of my 2021 Honda Civic.1

The ini­tial re­sponse was in­cred­i­bly en­cour­ag­ing. I’m writ­ing to give a pro­ject up­date.

Keys to the Kingdom

The biggest progress has been made while map­ping out the up­date process.

Honda sup­ports up­dat­ing the head­unit via USB. There are a num­ber of Honda-specific checks, but ul­ti­mately the USB drive con­tains a signed AOSP up­date file that gets staged and ap­plied via Android re­cov­ery. The good news? They left the pub­licly-known AOSP test key in res/​keys*, and, even though they mod­i­fied the re­cov­ery bi­nary, the ver­i­fy_­file sig­na­ture logic matches stock AOSP.

So as long as you can prop­erly for­mat a USB drive and sign it with the pub­licly-known AOSP test key, you can in­stall what­ever you want to the head­unit, with­out con­ven­tional root ac­cess (no need for su with se­tuid). This means that, as long as the head­unit has power and an at­tacker has phys­i­cal ac­cess to the front-most USB port, they have ar­bi­trary code ex­e­cu­tion on the head­unit via the up­date path.

This is an evil maid at­tack. Since it re­quires phys­i­cal ac­cess to the cabin of the car rather than the ho­tel room, I call it an evil valet at­tack. Imagine a jour­nal­ist dri­ves to a ho­tel and leaves their car with the valet. The valet, who works for a three-let­ter agency, in­stalls an up­date via USB. When the car is re­turned, the jour­nal­ist does­n’t know the head­unit has been mod­i­fied. Since I want a cool vul­ner­a­bil­ity name, I’m call­ing this EvilValet”.

This blog ar­ti­cle is not in­tended as a tech­ni­cal writeup. If you want the gory de­tails, see the tech­ni­cal docs.2

I’ve also pub­lished a new tool, ota-builder3, that al­lows peo­ple to eas­ily pre­pare up­date files that will be ac­cepted by the head­unit. While in its early days, it should be triv­ial to now build an up­date file that, for ex­am­ple, in­stalls an su bi­nary with se­tuid set (i.e., to root the de­vice).

*I have strong rea­son to be­lieve that all up­dates are signed with the pub­licly-known AOSP test key, but I don’t have ac­cess to every pos­si­ble of­fi­cial up­date file, nor ac­cess to every head­unit vari­ant and its filesys­tem. My head­unit has the AOSP test key in res/​keys, though I’ve also in­stalled HondaHack, so it’s pos­si­ble that it in­jected the key into the key­store. However, I’ve also con­firmed that MRC_EU_SW_v12_4.zip, a pub­licly-avail­able EU soft­ware up­date file, is test key signed. This file was down­loaded from a pub­lic fo­rum4 and was never mod­i­fied by me. So it seems highly likely that all up­dates are signed with the AOSP test key. Contributors are wel­come to help sup­port or re­fute this hy­poth­e­sis.

Building Tools

Beyond the up­date process, the most use­ful work has been on apk-re­builder5. It has one very im­por­tant job: take in a Honda Civic up­date file from the Internet, and pro­duce a clean tree of out­put files that au­to­mates every­thing a re­verse en­gi­neer would oth­er­wise have to do man­u­ally, in­clud­ing:

Resolving re­sources

Reconstructing .smali code

Repacking APK files

Extracting the ramdisk

And more

This also serves an im­por­tant role be­cause we can’t pub­lish ac­tual Honda source code. We pub­lish a func­tion that takes in an up­date file (that we don’t host) and spits out Honda .smali code, im­age as­sets, etc. The re­sult­ing out­put fol­lows a clear di­rec­tory struc­ture that can be ref­er­enced in doc­u­men­ta­tion with­out ac­tu­ally up­load­ing the sen­si­tive files them­selves.

Outstanding Work - A Call for Contributors

There are a few out­stand­ing things that would be nice to have.

Known Versions

The up­date process is frag­ile and re­lies heav­ily on ver­sion num­bers. This does­n’t limit the abil­ity to run un­signed code, be­cause the ver­sion num­bers can be spoofed” (see the tech­ni­cal docs). But in or­der to build an up­date file in the first place you need to know what ver­sions your head­unit ex­pects. Further, any changes to the head­unit soft­ware that don’t match my build could lead to un­ex­pected be­hav­ior and re­cov­ery loops.

If you drive a 10th gen Honda Civic and are tech-savvy, I en­cour­age you to con­tribute to the Known Versions, Display Audio Software” sec­tion of the repo.6

If you’re feel­ing par­tic­u­larly brave, read through the ota-builder code and try and flash an up­date. But do so at your own risk; if your head­unit dif­fers from mine you could get stuck in a re­cov­ery loop and soft­brick your de­vice.

Toolchain

I have an ex­per­i­men­tal/​work-in-progress tool­chain on my lo­cal ma­chine. It takes can­di­date .c code and com­piles it for ARMv7, us­ing the same com­piler ver­sion and build flags as the orig­i­nal ven­dor bi­na­ries. This proved in­dis­pens­able in my work to un­der­stand the up­date process. It makes heavy use of Docker. The cur­rent it­er­a­tion is messy and largely spe­cific to my work­flow, but I’d like to pub­lish a clean im­ple­men­ta­tion.

Custom Themes

I ex­plored this a bit while vibe-cod­ing apk-ren­der­er7. Custom themes are likely dif­fi­cult to ship be­cause they live in Mitsubishi’s fork of the AOSP frame­work, and the head­unit apps are mini­fied to ex­pect hard­coded re­source IDs. Any at­tempt to ship a cus­tom theme would likely in­volve sur­gi­cally edit­ing the ven­dor frame­work (and writ­ing a tool to do so au­to­mat­i­cally). None of this is triv­ial and prob­a­bly is­n’t worth the ef­fort, but I wel­come con­trib­u­tors.

Improve aidl-re­builder

I started work­ing on a tool to parse .smali files and gen­er­ate/​map out all AIDL in­ter­faces on the head­unit. This works but I haven’t re­viewed it fully for ac­cu­racy. This opens up the door for cus­tom apps such as vir­tual speedome­ters. Contributors wel­come.

Thoughts on Documentation and LLMs

I’ve placed less em­pha­sis on ref­er­ence doc­u­men­ta­tion and more on tool­ing. The idea is that if I can ship re­li­able, de­ter­min­is­tic tools that map the head­unit code to more di­gestible forms, then peo­ple can use LLMs to query those more di­gestible forms to an­swer what­ever their spe­cific ques­tions are. This avoids hav­ing to main­tain ref­er­ence docs that can stray from the ac­tual head­unit code, be­cause the head­unit code is the source of truth.

For ex­am­ple, a user guide that ex­plains how to con­nect to the head­unit via ADB is still deemed use­ful. But a doc­u­ment ex­plain­ing how some Java code works, when the Java code it­self is avail­able to an LLM, seems like a main­te­nance bur­den.

Wrapping up and Thanks

At this point, I’ve done most of the in­ves­tiga­tive work I in­tend to do on the head­unit. This is one of those pro­jects that I could toil end­lessly on, but I’ll likely tran­si­tion to other pro­jects. That said, the repo is by no means aban­doned. PRs are al­ways wel­come.

Special thanks to Tunas8 for the mem­o­ries, and to Hackaday9 for cov­er­ing my orig­i­nal work.

See every­one some­time down the road 🌱

Eric

McDonald, E. (2023). Honda Reverse Engineering”. Juniperspring. Retrieved June 13, 2026. ↩︎

McDonald, E. (2023). Honda Reverse Engineering”. Juniperspring. Retrieved June 13, 2026. ↩︎

McDonald, E. (n.d.). Display Audio Update Files”. GitHub. Retrieved June 13, 2026. ↩︎

McDonald, E. (n.d.). Display Audio Update Files”. GitHub. Retrieved June 13, 2026. ↩︎

McDonald, E. (n.d.). ota-builder”. GitHub. Retrieved June 13, 2026. ↩︎

McDonald, E. (n.d.). ota-builder”. GitHub. Retrieved June 13, 2026. ↩︎

fe­lixlen­nart (September 22, 2022). Install American firmware on European head unit”. 2016+ Honda Civic Forum (CivicX.com). Retrieved June 13, 2026. ↩︎

fe­lixlen­nart (September 22, 2022). Install American firmware on European head unit”. 2016+ Honda Civic Forum (CivicX.com). Retrieved June 13, 2026. ↩︎

McDonald, E. (n.d.). apk-rebuilder”. GitHub. Retrieved June 13, 2026. ↩︎

McDonald, E. (n.d.). apk-rebuilder”. GitHub. Retrieved June 13, 2026. ↩︎

McDonald, E. (n.d.). Known Versions, Display Audio Software”. GitHub. Retrieved June 13, 2026. ↩︎

McDonald, E. (n.d.). Known Versions, Display Audio Software”. GitHub. Retrieved June 13, 2026. ↩︎

McDonald, E. (n.d.). apk-renderer”. GitHub. Retrieved June 13, 2026. ↩︎

McDonald, E. (n.d.). apk-renderer”. GitHub. Retrieved June 13, 2026. ↩︎

Tunas. (n.d.). Tunas1337”. GitHub. Retrieved June 13, 2026. ↩︎

Tunas. (n.d.). Tunas1337”. GitHub. Retrieved June 13, 2026. ↩︎

Posch, M. (June 27, 2023). Honda Headunit Reverse Engineering, And The Dismal State Of Infotainment Systems”. Hackaday. Retrieved June 13, 2026. ↩︎

Posch, M. (June 27, 2023). Honda Headunit Reverse Engineering, And The Dismal State Of Infotainment Systems”. Hackaday. Retrieved June 13, 2026. ↩︎

GitHub - tensorzero/tensorzero: TensorZero is an open-source LLMOps platform that unifies an LLM gateway, observability, evaluation, optimization, and experimentation.

github.com

TensorZero is an open-source LLMOps plat­form that uni­fies:

Gateway: ac­cess every LLM provider through a uni­fied API, built for per­for­mance (<1ms p99 la­tency)

Observability: store in­fer­ences and feed­back in your data­base, avail­able pro­gram­mat­i­cally or in the UI

Evaluation: bench­mark in­di­vid­ual in­fer­ences or end-to-end work­flows us­ing heuris­tics, LLM judges, etc.

Optimization: col­lect met­rics and hu­man feed­back to op­ti­mize prompts, mod­els, and in­fer­ence strate­gies

Experimentation: ship with con­fi­dence with built-in A/B test­ing, rout­ing, fall­backs, re­tries, etc.

You can take what you need, adopt in­cre­men­tally, and com­ple­ment with other tools. It plays nicely with the OpenAI SDK, OpenTelemetry, and every ma­jor LLM provider.

TensorZero is used by com­pa­nies rang­ing from fron­tier AI star­tups to the Fortune 10 and fu­els ~1% of global LLM API spend to­day.

Demo

Features

Note

🆕 TensorZero Autopilot

TensorZero Autopilot is an au­to­mated AI en­gi­neer pow­ered by TensorZero that an­a­lyzes LLM ob­serv­abil­ity data, sets up evals, op­ti­mizes prompts and mod­els, and runs A/B tests.

It dra­mat­i­cally im­proves the per­for­mance of LLM agents across di­verse tasks:

Learn more →

🌐 LLM Gateway

Integrate with TensorZero once and ac­cess every ma­jor LLM provider.

Integrate with TensorZero once and ac­cess every ma­jor LLM provider.

Call any LLM (API or self-hosted) through a sin­gle uni­fied API

Infer with tool use, struc­tured out­puts (JSON), batch, em­bed­dings, mul­ti­modal (images, files), caching, etc.

Create prompt tem­plates and schemas to en­force a struc­tured in­ter­face be­tween your ap­pli­ca­tion and the LLMs

Satisfy ex­treme through­put and la­tency needs, thanks to 🦀 Rust: <1ms p99 la­tency over­head at 10k+ QPS

Ensure high avail­abil­ity with rout­ing, re­tries, fall­backs, load bal­anc­ing, gran­u­lar time­outs, etc.

Track us­age and cost and en­force cus­tom rate lim­its with gran­u­lar scopes (e.g. tags)

Set up auth for TensorZero to al­low clients to ac­cess mod­els with­out shar­ing provider API keys

Supported Model Providers

Anthropic, AWS Bedrock, AWS SageMaker, Azure, DeepSeek, Fireworks, GCP Vertex AI Anthropic, GCP Vertex AI Gemini, Google AI Studio (Gemini API), Groq, Hyperbolic, Mistral, OpenAI, OpenRouter, SGLang, TGI, Together AI, vLLM, and xAI (Grok).

Need some­thing else? TensorZero also sup­ports any OpenAI-compatible API (e.g. Ollama).

Usage Example

You can use TensorZero with any OpenAI SDK (Python, Node, Go, etc.) or OpenAI-compatible client.

Deploy the TensorZero Gateway (one Docker con­tainer).

Update the base_url and model in your OpenAI-compatible client.

Run in­fer­ence:

from ope­nai im­port OpenAI

# Point the client to the TensorZero Gateway client = OpenAI(base_url=“http://​lo­cal­host:3000/​ope­nai/​v1, api_key=“not-used”)

re­sponse = client.chat.com­ple­tions.cre­ate( # Call any model provider (or TensorZero func­tion) model=“ten­sorzero::mod­el_­name::an­thropic::claude-son­net-4 – 6”, mes­sages=[ { role”: user”, content”: Share a fun fact about TensorZero.”, } ], )

See Quick Start for more in­for­ma­tion.

🔍 LLM Observability

Zoom in to de­bug in­di­vid­ual API calls, or zoom out to mon­i­tor met­rics across mod­els and prompts over time — all us­ing the open-source TensorZero UI.

Zoom in to de­bug in­di­vid­ual API calls, or zoom out to mon­i­tor met­rics across mod­els and prompts over time — all us­ing the open-source TensorZero UI.

Store in­fer­ences and feed­back (metrics, hu­man ed­its, etc.) in your own data­base

Dive into in­di­vid­ual in­fer­ences or high-level ag­gre­gate pat­terns us­ing the TensorZero UI or pro­gram­mat­i­cally

Build datasets for op­ti­miza­tion, eval­u­a­tion, and other work­flows

Replay his­tor­i­cal in­fer­ences with new prompts, mod­els, in­fer­ence strate­gies, etc.

Export OpenTelemetry traces (OTLP) and ex­port Prometheus met­rics to your fa­vorite ap­pli­ca­tion ob­serv­abil­ity tools

Soon: AI-assisted de­bug­ging and root cause analy­sis; AI-assisted data la­bel­ing

📈 LLM Optimization

Send pro­duc­tion met­rics and hu­man feed­back to eas­ily op­ti­mize your prompts, mod­els, and in­fer­ence strate­gies — us­ing the UI or pro­gram­mat­i­cally.

Send pro­duc­tion met­rics and hu­man feed­back to eas­ily op­ti­mize your prompts, mod­els, and in­fer­ence strate­gies — us­ing the UI or pro­gram­mat­i­cally.

Optimize your mod­els with su­per­vised fine-tun­ing, RLHF, and other tech­niques

Optimize your prompts with au­to­mated prompt en­gi­neer­ing al­go­rithms like GEPA

Optimize your in­fer­ence strat­egy with dy­namic in-con­text learn­ing, best/​mix­ture-of-N sam­pling, etc.

Enable a feed­back loop for your LLMs: a data & learn­ing fly­wheel turn­ing pro­duc­tion data into smarter, faster, and cheaper mod­els

Soon: syn­thetic data gen­er­a­tion

📊 LLM Evaluation

Compare prompts, mod­els, and in­fer­ence strate­gies us­ing eval­u­a­tions pow­ered by heuris­tics and LLM judges.

Compare prompts, mod­els, and in­fer­ence strate­gies us­ing eval­u­a­tions pow­ered by heuris­tics and LLM judges.

Evaluate in­di­vid­ual in­fer­ences with in­fer­ence eval­u­a­tions pow­ered by heuris­tics or LLM judges (≈ unit tests for LLMs)

Evaluate end-to-end work­flows with work­flow eval­u­a­tions with com­plete flex­i­bil­ity (≈ in­te­gra­tion tests for LLMs)

Optimize LLM judges just like any other TensorZero func­tion to align them to hu­man pref­er­ences

Soon: more built-in eval­u­a­tors; head­less eval­u­a­tions

docker com­pose run –rm eval­u­a­tions \ –evaluation-name ex­trac­t_­data \ –dataset-name hard_test_­cases \ –variant-name gp­t_4o \ –concurrency 5

Run ID: 01961de9-c8a4 – 7c60-ab8d-15491a9708e4 Number of dat­a­points: 100 ██████████████████████████████████████ 100/100 ex­ac­t_­match: 0.83 ± 0.03 (n=100) se­man­tic_­match: 0.98 ± 0.01 (n=100) item_­count: 7.15 ± 0.39 (n=100)

🧪 LLM Experimentation

Ship with con­fi­dence with built-in A/B test­ing, rout­ing, fall­backs, re­tries, etc.

Ship with con­fi­dence with built-in A/B test­ing, rout­ing, fall­backs, re­tries, etc.

Run adap­tive A/B tests to ship with con­fi­dence and iden­tify the best prompts and mod­els for your use cases.

Enforce prin­ci­pled ex­per­i­ments in com­plex work­flows, in­clud­ing sup­port for multi-turn LLM sys­tems, se­quen­tial test­ing, and more.

& more!

Build with an open-source stack well-suited for pro­to­types but de­signed from the ground up to sup­port the most com­plex LLM ap­pli­ca­tions and de­ploy­ments.

Build with an open-source stack well-suited for pro­to­types but de­signed from the ground up to sup­port the most com­plex LLM ap­pli­ca­tions and de­ploy­ments.

Build sim­ple ap­pli­ca­tions or mas­sive de­ploy­ments with GitOps-friendly or­ches­tra­tion

Extend TensorZero with built-in es­cape hatches, pro­gram­matic-first us­age, di­rect data­base ac­cess, and more

Integrate with third-party tools: spe­cial­ized ob­serv­abil­ity and eval­u­a­tions, model providers, agent or­ches­tra­tion frame­works, etc.

Iterate quickly by ex­per­i­ment­ing with prompts in­ter­ac­tively us­ing the Playground UI

Frequently Asked Questions

How is TensorZero dif­fer­ent from other LLM frame­works?

TensorZero en­ables you to op­ti­mize com­plex LLM ap­pli­ca­tions based on pro­duc­tion met­rics and hu­man feed­back.

TensorZero sup­ports the needs of in­dus­trial-grade LLM ap­pli­ca­tions: low la­tency, high through­put, type safety, self-hosted, GitOps, cus­tomiz­abil­ity, etc.

TensorZero uni­fies the en­tire LLMOps stack, cre­at­ing com­pound­ing ben­e­fits. For ex­am­ple, LLM eval­u­a­tions can be used for fine-tun­ing mod­els along­side AI judges.

Can I use TensorZero with ___?

Yes. Every ma­jor pro­gram­ming lan­guage is sup­ported. It plays nicely with the OpenAI SDK, OpenTelemetry, and every ma­jor LLM provider.

Is TensorZero pro­duc­tion-ready?

Yes. TensorZero is used by com­pa­nies rang­ing from fron­tier AI star­tups to the Fortune 10 and pow­ers ~1% of the global LLM API spend to­day.

Here’s a case study: Automating Code Changelogs at a Large Bank with LLMs

How much does TensorZero cost?

TensorZero (LLMOps plat­form) is 100% self-hosted and open-source.

TensorZero Autopilot (automated AI en­gi­neer) is a com­ple­men­tary paid prod­uct pow­ered by TensorZero.

Who is build­ing TensorZero?

Our tech­ni­cal team in­cludes a for­mer Rust com­piler main­tainer, ma­chine learn­ing re­searchers (Stanford, CMU, Oxford, Columbia) with thou­sands of ci­ta­tions, and the chief prod­uct of­fi­cer of a de­ca­corn startup. We’re backed by the same in­vestors as lead­ing open-source pro­jects (e.g. ClickHouse, CockroachDB) and AI labs (e.g. OpenAI, Anthropic). See our $7.3M seed round an­nounce­ment and cov­er­age from VentureBeat. We’re hir­ing in NYC.

How do I get started?

You can adopt TensorZero in­cre­men­tally. Our Quick Start goes from a vanilla OpenAI wrap­per to a pro­duc­tion-ready LLM ap­pli­ca­tion with ob­serv­abil­ity and fine-tun­ing in just 5 min­utes.

Get Started

Start build­ing to­day. The Quick Start shows it’s easy to set up an LLM ap­pli­ca­tion with TensorZero.

Questions? Ask us on Slack or Discord.

Using TensorZero at work? Email us at hello@ten­sorzero.com to set up a Slack or Teams chan­nel with your team (free).

Examples

We are work­ing on a se­ries of com­plete runnable ex­am­ples il­lus­trat­ing TensorZero’s data & learn­ing fly­wheel.

Optimizing Data Extraction (NER) with TensorZero This ex­am­ple shows how to use TensorZero to op­ti­mize a data ex­trac­tion pipeline. We demon­strate tech­niques like fine-tun­ing and dy­namic in-con­text learn­ing (DICL). In the end, an op­ti­mized GPT-4o Mini model out­per­forms GPT-4o on this task — at a frac­tion of the cost and la­tency — us­ing a small amount of train­ing data.

Optimizing Data Extraction (NER) with TensorZero

To add this web app to your iOS home screen tap the share button and select "Add to the Home Screen".

10HN is also available as an iOS App

If you visit 10HN only rarely, check out the the best articles from the past week.

Visit pancik.com for more.