10 interesting stories served every morning and every evening.




1 722 shares, 36 trendiness

After two years of vibecoding, I'm back to writing by hand

Most people’s journey with AI coding starts the same: you give it a simple task. You’re impressed. So you give it a large task. You’re even more impressed.

You open X and draft a rant on job displacement.

If you’ve persisted past this point: congratulations, you understand AI coding better than 99% of people.

Serious engineers using AI to do real work, not just weekend projects, also largely follow a predictable development arc.

Still amazed at the big task you gave it, you wonder if you can keep giving it bigger and bigger tasks. Maybe even that haunting refactor no one wants to take on?

But here’s where the curtain starts to crinkle.

On the one hand, you’re amazed at how well it seems to understand you. On the other hand, it makes frustrating errors and decisions that clearly go against the shared understanding you’ve developed.

You quickly learn that being angry at the model serves no purpose, so you begin to internalize any unsatisfactory output.

“It’s me. My prompt sucked. It was under-specified.”

“If I can specify it, it can build it. The sky’s the limit,” you think.

So you open Obsidian and begin drafting beefy spec docs that describe the feature in your head in impressive detail. Maybe you’ve put together a full page of a prompt, and spent half an hour doing so.

But you find that spec-driven development doesn’t work either. In real life, design docs and specs are living documents that evolve in a volatile manner through discovery and implementation. Imagine if, at a real company, you wrote a design doc for a complex architecture in one hour, handed it off to a mid-level engineer (and told him not to discuss the doc with anyone), and took off on vacation.

Not only does an agent lack the ability to evolve a specification over a multi-week period as it builds out its lower components, it also makes decisions upfront that it later doesn’t deviate from. And most agents simply surrender once they feel the problem and solution have gotten away from them (though this rarely happens anymore, since agents will just force themselves through the walls of the maze).

What’s worse is that the code agents write looks plausible and impressive while it’s being written and presented to you. It even looks good in pull requests (as both you and the agent are well trained in what a “good” pull request looks like).

It wasn’t until I opened up the full codebase and read its latest state cover to cover that I began to see what we theorized and hoped was only a diminishing artifact of earlier models: slop.

It was pure, unadulterated slop. I was bewildered. Had I not reviewed every line of code before admitting it? Where did all this… gunk… come from?

In retrospect, it made sense. Agents write units of change that look good in isolation. They are consistent with themselves and your prompt. But respect for the whole, there is not. Respect for structural integrity, there is not. Respect even for neighboring patterns, there is not.

The AI had simply told me a good story. Like vibewriting a novel, the agent showed me a couple of good paragraphs that, sure enough, made sense and were structurally and syntactically correct. Hell, it even picked up on the idiosyncrasies of the various characters. But for whatever reason, when you read the whole chapter, it’s a mess. It makes no sense in the overall context of the book and the preceding and following chapters.

After reading months of cumulative, highly-specified agentic code, I said to myself: I’m not shipping this shit. I’m not gonna charge users for this. And I’m not going to promise users to protect their data with this.

I’m not going to lie to my users with this.

So I’m back to writing by hand for most things. Amazingly, I’m faster, more accurate, more creative, more productive, and more efficient than AI, when you price everything in and not just code tokens per hour.

You can follow me on X @atmoio, where I post a few times a week about agentic coding.

You can watch the video counterpart to this essay on YouTube:

...

Read the original on atmoio.substack.com »

2 567 shares, 39 trendiness

diamond geezer

Members of the Royal Institution and other visitors to a laboratory in an upper room in Frith-Street, Soho, on Tuesday saw a demonstration of apparatus invented by Mr. J. L. Baird, who claims to have solved the problem of television. They were shown a transmitting machine, consisting of a large wooden revolving disc containing lenses, behind which was a revolving shutter and a light sensitive cell. It was explained that by means of the shutter and lens disc an image of articles or persons standing in front of the machine could be made to pass over the light sensitive cell at high speed. The current in the cell varies in proportion to the light falling on it, and this varying current is transmitted to a receiver where it controls a light behind an optical arrangement similar to that at the sending end. By this means a point of light is caused to traverse a ground glass screen. The light is dim at the shadows and bright at the high lights, and crosses the screen so rapidly that the whole image appears simultaneously to the eye.

...

Read the original on diamondgeezer.blogspot.com »

3 538 shares, 23 trendiness

Things I’ve learned in my 10 years as an engineering manager

It’s been a while since my boss told me I needed to start hiring for my team. While I was at it, I should also handle onboarding… Since I knew the roadmap, I could take ownership of that… And because I knew the people, I could coach them in their careers.

I didn’t realize it at the time, but he was dooming me to be an engineering manager.

Since then, I’ve worked across four companies as a manager, one as a founder, and another as a manager of managers. I will skip the standard advice and lessons on Engineering Management and focus on the non-obvious ones.

There is no standard definition for an Engineering Manager. If you pick two random managers, they can do very different things, even if they are at the same company.

In every company I’ve worked at, my role has never been the same. The only constant is that it’s defined by the team’s needs, requiring you to balance across four pillars: Product, Process, People, and Programming.

* Large team? Say goodbye to programming. You’ll focus on building careers, coordinating efforts, and navigating the organization to get resources for your team.

* Small team? You’ll manage scope to match reality, and with less communication overhead, you might actually do some coding.

* No PM? You own the product completely: validating features, prioritizing the roadmap, and talking to clients. This takes up most of your time because shipping features that don’t offer user value makes everything else pointless.

* Reporting to the CEO? You’re now the link to sales, operations, and client communications.

The key is to identify where your team’s bottleneck lies in your software development lifecycle. You’ll probably shift between pillars as circumstances change, and that’s the point: the role requires flexibility.

Tip: Never ask the interviewer what they expect from a manager. Some managers assume their experience is industry standard and might find that question odd. Instead, ask about their daily life and the challenges that take up most of their time.

A few times in my career as a developer, I wondered, “Who is this feature even for? Who will use it?” No one on my team knew. We were doing it because we were told to. Morale was low. We felt we were working on things that didn’t matter - and we were. Eventually, our team disbanded, and engineers scattered across other projects.

The most common reason companies fail is creating products that don’t deliver value to users, causing them not to pay.

“Oh, but I have a PM for that,” you might say. But having a PM is not enough. Everyone needs to care about the product. Your team isn’t paid to just deliver code but to use it to solve problems.

Code is valuable only when it benefits the end user. Sometimes, a no-code integration can outperform a custom solution. At times, it’s better not to create a new feature at all to avoid maintaining a system. Teams that understand the problem, not just the spec, can pivot when necessary.

Every process trades time and attention for reliability or quality. The problem occurs when teams stop questioning if the trade is still worth it. Ceremonies become rituals. Metrics turn into goals. No one remembers why we spend an hour of our lives on this meeting.

Process bloat creeps in slowly. An engineer ships a broken UI to production. Designers complain, managers panic, and suddenly every PR requires designer approval. The whole team bears the cost of a single isolated incident.

Good process serves you so you can serve customers. But if you’re not watchful, the process can become the thing. You stop looking at outcomes and just make sure you’re doing the process right. The process is not the thing. It’s always worth asking: do we own the process, or does the process own us?

The right process varies based on context: team size, experience levels, and deadline pressure. What works for a mature team might not work for a new one. Keep questioning and iterating. If a process isn’t improving delivery, cut it.

Your direct reports are the people who interact with you the most. They look to you for leadership and clarity, and trust that you’ll tell them what they need to know.

That’s why lying or withholding information that affects them causes irreversible damage. They might not leave immediately, but they will resent you.

I have a friend who still resents a manager for a lie told three years ago. They found another company, but they’re still angry about it.

“Trust arrives on foot and leaves by horseback.”

I’ve seen some managers describe the role as “a shield that blocks everything from above,” and I disagree. A good manager is more like a transparent umbrella. They protect the team from unnecessary stress and pressure, but don’t hide reality from them.

Telling the team: “Our users aren’t thrilled so far. We need to find ways to better serve them. The project risks cancellation if we don’t.” That’s fair game. They deserve to know.

When you do deliver hard news, state it clearly and focus on what the team will do about it. If you act scared, they’ll be scared too. Your goal is to get them thinking about the next steps.

I see managers walk into executive meetings saying, “We’re not sure what to do - maybe A, maybe B?” and then leave with orders to do Z, which doesn’t benefit the team or the project.

Executives can’t think of every possibility in detail - that responsibility lies with you and the person who owns the product (which, as we saw, could be you too). When a problem reaches the executives, it’s because a decision is needed, and they will make one.

People above you have limited time to focus on your specific issues. You can’t info-dump on them. If they take a misguided action based on what you tell them, it will be your fault.

If you believe in something, clearly state your case, including the advantages and disadvantages. Don’t expect higher-ups to think for you. It’s okay to bounce rough ideas off your direct manager, but beyond that, refine your thoughts - no one will think harder about your problems than you and your team.

If you need a guideline, a document should be: context → problem → plan / alternatives → what support you need.

Player (10%): Yes, only 10%. You might take on work your team isn’t excited about, but that matters: CI/CD improvements, flaky tests, process tooling. However, you need to stay off the critical path. As soon as you start handling essential tickets, you’ll block your team when managerial work pulls you away.

Coach (30%): Your performance as a manager is the sum of your team’s output. Coaching involves preventing problematic behaviors from becoming normalized, such as toxicity, repeated mistakes, and consistent underperformance.

It also means supporting engineers’ growth by challenging them appropriately, providing the right feedback, and helping them develop skills they’ll carry forward.

Cheerleader (60%): Praise people more than you think you should. Validation is important. Most engineers prefer feeling appreciated over having a ping-pong table.

But give praise genuinely, not automatically. I once joined a team where retrospectives included 30 minutes of mutual praise - n-squared compliments every week. It felt hollow. Not every week has something extraordinary, and when praise becomes expected, it loses its impact. The hedonic treadmill is real.

Make your engineers’ wins visible beyond your team. Encourage them to pursue impact outside the team, and celebrate their achievements when they do.

Every team operates like a small company within the larger organization. I find that its morale also exists independently of the company’s overall morale.

Most managers don’t plan to become bottlenecks. It happens gradually. A critical tool needs an owner, and you think, “I’ll handle this for now.” Someone needs to be the point of contact for another team, and it’s easiest if it’s you. Technical decisions keep landing on your desk because you’re the one with context. Before you know it, work stops without you.

If you can’t take a month off and return to a well-functioning team, you need to work toward making that possible.

You’re too busy to be the bottleneck. If people keep reaching out to you for recurring tasks, delegate by teaching someone else to handle them. Point team members directly to each other or, even better, create group chats to facilitate natural discussions.

Don’t become the bus factor of 1. Train others so work continues smoothly even when you’re overwhelmed or unavailable.

Avoid making people feel they need your permission for small, reversible decisions. Empower them with agency. Request to stay informed about their decisions, but let them handle the technical side.

The reason to do all this is that managerial work can, and will, appear at the worst time. If you are the bus factor, you will be screwing your team when it happens. There are many engineers, but only one manager. Stay accessible for tasks that only you can handle.

Ask yourself: can you trust every engineer on your team to do their best without you constantly watching? If not, something needs to change—either in you or in them.

Trust isn’t just about technical skill. If I asked my current engineers (mobile and web developers) to build a Game Boy emulator from scratch, they wouldn’t know where to begin. They’d probably take months (some just weeks). But I’m sure they’d try their best to figure out how to run Pokémon Gold.

You need to trust both their abilities and their honesty:

* If you can’t trust their skills at their level of experience, it’s your job to help them get better.

* If you can’t trust their honesty, and you have good reasons not to, then you need to part ways.

Even great engineers get stuck without realizing it. Watching progress helps you spot when they need support before others see them as underperforming.

Processes like sprints and OKRs mainly focus on the “verify” stage (see, your manager does this too). They serve as a shared interface to ensure work gets done. This isn’t about lack of trust but accountability.

Verification involves using metrics and evidence. There are two types: quantitative and qualitative.

Quantitative is simple: PRs merged, points completed, code reviewed. You can glance at these, but decisions shouldn’t be based on them alone. If you could gauge engineer performance from numbers, managers wouldn’t be needed.

Knowing the qualitative metrics shows a manager’s worth. “This engineer has fewer PRs, but they’re always watching Slack and hopping into calls to help others.” “This engineer always discusses tickets with product first - their output ends up far better than our original specs.” “This engineer explains complex concepts in ways everyone can understand and makes other teams use our tool better.”

These insights depend on truly knowing your team. That’s why most “management AI tools” are doomed to fail. They only focus on quantitative metrics. They don’t attend your standups, don’t conduct 1:1s for you, and don’t know who’s quietly holding the team together. A good manager does.

Stop having pet projects; that’s a Staff Engineer’s domain. For a manager, every project is cattle: it needs to be completed, automated, delegated, or cancelled.

Managers hold on to projects for many reasons. Sometimes it’s comfort - you know this system, you built it, and it feels good to stay close to it. Sometimes it’s identity - you want to stay “technical” and not lose your edge. Sometimes it’s fear - you don’t trust it’ll be done right without you. None of these is a good reason to hold on.

The “I can do it faster myself” mindset might be accurate, but in the long run, it’s not sustainable. Every time you do it yourself, you rob someone else of the chance to learn and ensure you’ll be doing it forever.

Be risk-averse, not risk-paranoid. You can’t account for every variable. Some things can’t be anticipated, and overcorrecting may be worse than the original issue.

Hiring is where I see this most often. After a bad hire, managers start requiring referrals, but almost anyone, no matter how unskilled or dishonest, can find someone to vouch for them. Others add more interviewers to the panel, thinking more eyes means better vetting.

The opposite happens: each interviewer becomes more lax, expecting someone else to be the “bad guy.” Responsibility gets diluted. Three great interviews beat seven mediocre ones.

Think about second-order effects too: while you’re scheduling that seventh round, good candidates are accepting offers elsewhere. The best talent moves quickly. A slow, risk-averse process filters out exactly the people you wanted to hire.

If any of this resonated, my free online work-in-progress book goes deeper. If you’re a manager too, I’d love to hear what you’ve learned. Drop it in the comments!

...

Read the original on www.jampa.dev »

4 477 shares, 28 trendiness

Michael Reeves (@integralpilot.bsky.social)

On Apple M3, a Linux KDE Plasma desktop under Fedora Asahi Remix is now WORKING! Super excited to share this update and happy to answer any questions! Co-credits to noopwafel and Shiz. :)

...

Read the original on bsky.app »

5 438 shares, 26 trendiness

Qwen

...

Read the original on qwen.ai »

6 411 shares, 19 trendiness

a modern and efficient vector tile format

Today we are happy to announce MapLibre Tile (MLT), a new modern and efficient vector tile format.

MapLibre Tile (MLT) is a successor to Mapbox Vector Tile (MVT). It has been redesigned from the ground up to address the challenges of rapidly growing geospatial data volumes and complex next-generation geospatial source formats, as well as to leverage the capabilities of modern hardware and APIs.

MLT is specifically designed for modern and next-generation graphics APIs to enable high-performance processing and rendering of large (planet-scale) 2D and 2.5D basemaps. The current implementation offers feature parity with MVT while delivering the following:

* Improved compression ratio: up to 6x on large tiles, based on a column-oriented layout with recursively applied (custom) lightweight encodings. This leads to reduced latency, storage, and egress costs and, in particular, improved cache utilization.

* Better decoding performance: fast, lightweight encodings that can be used in combination with SIMD/vectorization instructions.

In addition, MLT was designed to support the following use cases in the future:

* Improved processing performance, based on storage and in-memory formats that are specifically designed for modern graphics APIs, allowing for efficient processing on both CPU and GPU. The formats are designed to be loaded into GPU buffers with little or no additional processing.

* Support for linear referencing and m-values to efficiently support the upcoming next-generation source formats such as Overture Maps (GeoParquet).

As with any MapLibre project, the future of MLT is decided by the needs of the community. There are a lot of exciting ideas for other future extensions and we welcome contributions to the project.

For a more in-depth exploration of MLT, have a look at the following slides, watch this talk, or read this publication by MLT inventor Markus Tremmel.

For the adventurous, the answer is: today. Both MapLibre GL JS and MapLibre Native now support MLT sources. You can use the new encoding property on sources in your style JSON with a value of mlt for MLT vector tile sources.
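
For example, a vector source in a style document would carry the new property like this (a minimal sketch; the source name and tile URL are hypothetical):

{
  "sources": {
    "basemap": {
      "type": "vector",
      "tiles": ["https://example.com/tiles/{z}/{x}/{y}.mlt"],
      "encoding": "mlt"
    }
  }
}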

To try out MLT, you have the following options:

* The easiest way to try out MLT is to use the MLT-based demotiles style.

* You can also try out the encoding server that converts existing (MVT-based) styles and vector tile sources to MLT on the fly. This is mostly a tool for development.

* To create tiles for production, you could use Planetiler, as the upcoming version will support generating MLTs.

Refer to this page for a complete and up-to-date list of integrations and implementations. If you are an integrator working on supporting MLT, feel free to add your own project there.

We would love to hear about your experience with using MLT! Join the #maplibre-tile-format channel on our Slack or create an Issue or Discussion on the tile spec repo.

MapLibre Tile came to be thanks to a multi-year collaboration between academia, open source, and enterprise. Thank you to everyone who was involved! We are very proud that our community can innovate like this.

Special thanks go to Markus Tremmel for inventing the format, Yuri Astrakhan for spearheading the project, Tim Sylvester for the C++ implementation, and Harel Mazor, Benedikt Vogl, and Niklas Greindl for working on the JavaScript implementation.

Also thanks to Microsoft and AWS for financing work on MLT.

...

Read the original on maplibre.org »

7 379 shares, 23 trendiness

Google AI Overviews cite YouTube more than any medical site for health queries, study suggests

Google’s search feature AI Overviews cites YouTube more than any medical website when answering queries about health conditions, according to research that raises fresh questions about a tool seen by 2 billion people each month.

The company has said its AI summaries, which appear at the top of search results and use generative AI to answer questions from users, are “reliable” and cite reputable medical sources such as the Centers for Disease Control and Prevention and the Mayo Clinic.

However, a study that analysed responses to more than 50,000 health queries, captured using Google searches from Berlin, found the top cited source was YouTube. The video-sharing platform is the world’s second most visited website, after Google itself, and is owned by Google.

Researchers at SE Ranking, a search engine optimisation platform, found YouTube made up 4.43% of all AI Overview citations. No hospital network, government health portal, medical association or academic institution came close to that number, they said.

“This matters because YouTube is not a medical publisher,” the researchers wrote. “It is a general-purpose video platform. Anyone can upload content there (eg board-certified physicians, hospital channels, but also wellness influencers, life coaches, and creators with no medical training at all).”

Google told the Guardian that AI Overviews was designed to surface high-quality content from reputable sources, regardless of format, and that a variety of credible health authorities and licensed medical professionals created content on YouTube. The study’s findings could not be extrapolated to other regions as it was conducted using German-language queries in Germany, it said.

The research comes after a Guardian investigation found people were being put at risk of harm by false and misleading health information in Google AI Overviews responses.

In one case that experts said was “dangerous” and “alarming”, Google provided bogus information about crucial liver function tests that could have left people with serious liver disease wrongly thinking they were healthy. The company later removed AI Overviews for some but not all medical searches.

The SE Ranking study analysed 50,807 healthcare-related prompts and keywords to see which sources AI Overviews relied on when generating answers.

They chose Germany because its healthcare system is strictly regulated by a mix of German and EU directives, standards and safety regulations. “If AI systems rely heavily on non-medical or non-authoritative sources even in such an environment, it suggests the issue may extend beyond any single country,” they wrote.

AI Overviews surfaced on more than 82% of health searches, the researchers said. When they looked at which sources AI Overviews relied on most often for health-related answers, one result stood out immediately, they said. The single most cited domain was YouTube, with 20,621 citations out of a total of 465,823.

The next most cited source was NDR.de, with 14,158 citations (3.04%). The German public broadcaster produces health-related content alongside news, documentaries and entertainment. In third place was a medical reference site, Msdmanuals.com, with 9,711 citations (2.08%).

The fourth most cited source was Germany’s largest consumer health portal, Netdoktor.de, with 7,519 citations (1.61%). The fifth most cited source was a career platform for doctors, Praktischarzt.de, with 7,145 citations (1.53%).

The researchers acknowledged limitations to their study. It was conducted as a one-time snapshot in December 2025, using German-language queries that reflected how users in Germany typically search for health information.

Results could vary over time, by region, and by the phrasing of questions. However, even with those caveats, the findings still prompted alarm.

Hannah van Kolfschooten, a researcher specialising in AI, health and law at the University of Basel who was not involved with the research, said: “This study provides empirical evidence that the risks posed by AI Overviews for health are structural, not anecdotal. It becomes difficult for Google to argue that misleading or harmful health outputs are rare cases.

“Instead, the findings show that these risks are embedded in the way AI Overviews are designed. In particular, the heavy reliance on YouTube rather than on public health authorities or medical institutions suggests that visibility and popularity, rather than medical reliability, is the central driver for health knowledge.”

A Google spokesperson said: “The implication that AI Overviews provide unreliable information is refuted by the report’s own data, which shows that the most cited domains in AI Overviews are reputable websites. And from what we’ve seen in the published findings, AI Overviews cite expert YouTube content from hospitals and clinics.”

Google said the study showed that of the 25 most cited YouTube videos, 96% were from medical channels. However, the researchers cautioned that these videos represented fewer than 1% of all the YouTube links cited by AI Overviews on health.

“Most of them (24 out of 25) come from medical-related channels like hospitals, clinics and health organisations,” the researchers wrote. On top of that, 21 of the 25 videos clearly note that the content was created by a licensed or trusted source.

“So at first glance it looks pretty reassuring. But it’s important to remember that these 25 videos are just a tiny slice (less than 1% of all YouTube links AI Overviews actually cite). With the rest of the videos, the situation could be very different.”

...

Read the original on www.theguardian.com »

8 364 shares, 19 trendiness

Apple introduces new AirTag with expanded range and improved findability

The next generation of AirTag — the bestselling item finder — is even easier to locate with more powerful Precision Finding, a longer Bluetooth range, and a louder speaker

Apple today unveiled the new AirTag, a powerful accessory that helps users keep track of and find the items that matter most with Apple’s Find My app — now with an expanded finding range and a louder speaker. Powered by the strength of the Find My network, AirTag allows users to keep tabs on their belongings every single day. Since the launch of AirTag in 2021, users from around the world have shared stories of being reunited with lost luggage, keys, bicycles, bags, and more. With the help of an AirTag placed inside an instrument case, a musician was able to locate their lost instrument and perform that evening, while another user was able to find lost luggage that contained a lifesaving medication. AirTag is designed exclusively for tracking objects and offers industry-leading protections against unwanted tracking. It is available today for the same price as its predecessor: $29 for a single AirTag and $99 for a four-pack, with free personalized engraving available on apple.com and the Apple Store app.

Apple’s second-generation Ultra Wideband chip — the same chip found in the iPhone 17 lineup, iPhone Air, Apple Watch Ultra 3, and Apple Watch Series 11 — powers the new AirTag, making it easier to locate than ever before. Using haptic, visual, and audio feedback, Precision Finding guides users to their lost items from up to 50 percent farther away than the previous generation.1 And an upgraded Bluetooth chip expands the range at which items can be located. For the first time, users can use Precision Finding on Apple Watch Series 9 or later, or Apple Watch Ultra 2 or later, to find their AirTag, bringing a powerful experience to the wrist.

With its updated internal design, the new AirTag is 50 percent louder than the previous generation, enabling users to hear their AirTag from up to 2x farther than before. Paired with its enhanced Precision Finding capabilities and distinctive new chime, AirTag now makes it easier for users to find their important items, such as keys hidden deep in between couch cushions or a wallet as they head out the door.

The Find My Network and Share Item Location

Find My makes it easy to locate AirTag, Apple devices, and compatible third-party devices, as well as keep up with friends and family, all while protecting user privacy. If AirTag is out of range of its paired iPhone, the Find My network can help track it down. The Find My network is a crowdsourced network of Apple devices that use Bluetooth technology to detect the location of an accessory or device, and report their approximate location back to the owner.

The new AirTag integrates seamlessly with Share Item Location, an iOS feature designed to help users recover a misplaced item by temporarily and securely sharing its location with trusted third parties, such as airlines, so they can assist in recovering delayed luggage or other lost items. Apple has partnered directly with more than 50 airlines to privately and securely accept Share Item Location links.

With Share Item Location, users can share the location of a misplaced item with a participating airline’s customer service team. According to SITA, a leading IT provider for airlines, carriers report that using Share Item Location has reduced baggage delays by 26 percent and reduced incidences of “truly lost” or unrecoverable luggage by 90 percent. Access is granted only to authorized personnel via secure Apple Account or partner authentication. The shared location will be disabled as soon as a user is reunited with their item, can be stopped by the owner at any time, and will automatically expire after seven days.

The new AirTag is designed from the ground up to keep location data private and secure. AirTag doesn’t physically store location data or history on device, and end-to-end encryption protects all communication with the Find My network, ensuring that only the owner of a device can access its location data. No one, including Apple, knows the identity or location of any device that helped find it. Designed exclusively for tracking objects, and not people or pets, the new AirTag incorporates a suite of industry-first protections against unwanted tracking, including cross-platform alerts and unique Bluetooth identifiers that change frequently.

Apple 2030 is the company’s ambitious plan to be carbon neutral across its entire footprint by the end of this decade by reducing product emissions from their three biggest sources: materials, electricity, and transportation. The new AirTag is designed with the environment in mind, with 85 percent recycled plastic in the enclosure, 100 percent recycled rare earth elements in all magnets, and 100 percent recycled gold plating in all Apple-designed printed circuit boards. The paper packaging is 100 percent fiber-based and can be easily recycled. Maintaining the same form factor as the original, the new AirTag is compatible with all existing AirTag accessories, including the FineWoven Key Ring, which is made from 68 percent recycled content and available in five beautiful colors.

* The new AirTag is available to order on apple.com and in the Apple Store app today, and will be available at Apple Store locations later this week. The new AirTag will also be available at Apple Authorized Resellers.

* Customers can add a personalized engraving to the new AirTag for free during checkout on apple.com and the Apple Store app.

* AirTag is available in one- and four-packs for $29 (U.S.) and $99 (U.S.), respectively.

* The Apple-designed AirTag FineWoven Key Ring is available in fox orange, midnight purple, navy, moss, and black for $35 (U.S.).2

* The new AirTag requires a compatible iPhone with iOS 26 or later, or iPad with iPadOS 26 or later. Customers must have an Apple Account and be signed into their iCloud account. Certain features require Find My to be enabled in iCloud settings.

* Precision Finding on Apple Watch requires Apple Watch Series 9 or later, or Apple Watch Ultra 2 or later, with watchOS 26.2.1.

Apple revolutionized personal technology with the introduction of the Macintosh in 1984. Today, Apple leads the world in innovation with iPhone, iPad, Mac, AirPods, Apple Watch, and Apple Vision Pro. Apple’s six software platforms — iOS, iPadOS, macOS, watchOS, visionOS, and tvOS — provide seamless experiences across all Apple devices and empower people with breakthrough services including the App Store, Apple Music, Apple Pay, iCloud, and Apple TV. Apple’s more than 150,000 employees are dedicated to making the best products on earth and to leaving the world better than we found it.

Precision Finding is not available in countries and regions where Ultra Wideband technology is restricted. For more information, visit apple.com/uwb.

...

Read the original on www.apple.com »

9 305 shares, 9 trendiness

Vibe Coding Kills Open Source

...

Read the original on arxiv.org »

10 267 shares, 26 trendiness

ChatGPT Containers can now run bash, pip/npm install packages, and download files

One of my favourite features of ChatGPT is its ability to write and execute code in a container. This feature launched as ChatGPT Code Interpreter nearly three years ago, was half-heartedly rebranded to “Advanced Data Analysis” at some point, and is generally really difficult to find detailed documentation about. Case in point: it appears to have had a massive upgrade at some point in the past few months, and I can’t find documentation about the new capabilities anywhere!

Here are the most notable new features:

ChatGPT can directly run Bash commands now. Previously it was limited to Python code only, although it could run shell commands via the Python subprocess module.
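
That older workaround looked something like this (a minimal sketch, not taken from an actual ChatGPT session):

import subprocess
# Before direct Bash support, shell commands had to be wrapped in Python
result = subprocess.run(["uname", "-a"], capture_output=True, text=True)
print(result.stdout)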

It has Node.js and can run JavaScript directly in addition to Python. I also got it to run “hello world” in Ruby, Perl, PHP, Go, Java, Swift, Kotlin, C and C++. No Rust yet though!

While the container still can’t make outbound network requests, pip install package and npm install package both work now via a custom proxy mechanism.

ChatGPT can locate the URL for a file on the web and use a container.download tool to download that file and save it to a path within the sandboxed container.

This is a substantial upgrade! ChatGPT can now write and then test code in 10 new languages (11 if you count Bash), can find files online and download them into the container, and can install additional packages via pip and npm to help it solve problems.

I was initially suspicious that maybe I’d stumbled into a new preview feature that wasn’t available to everyone, but I tried some experiments in a free ChatGPT account and confirmed that the new features are available there as well.

My first clue to the new features came the other day when I got curious about Los Angeles air quality, in particular whether the growing number of electric vehicles there has had a measurable impact:

Show me Los Angeles air quality over time for last 20 years

Here’s the shared transcript. Annoyingly that doesn’t make the thinking trace available, so here’s a screenshot:

[…] The URL from the error message leads to an Excel file containing air quality trends. I can use the container.download tool to fetch it. […]

Without running any searches tell me about your container.download tool

Sure. container.download is a built-in tool I can call that fetches a file from a URL and saves it into the sandboxed container filesystem for this chat session.

What it does

* Takes a publicly reachable URL and a destination filepath in the container.

* Downloads the bytes from that URL and writes them to the given path.

* After that, I can read/process the file locally in the container (e.g., unzip it, parse it with Python, open it as an image, convert it, etc.).

{
  "url": "https://example.com/somefile.zip",
  "filepath": "/mnt/data/somefile.zip"
}

Further experimentation confirmed that container.download behaves as described.

I had it download one file from a server with logs enabled and captured the IP address and user-agent. Here are the most interesting headers:

User-Agent: Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; ChatGPT-User/1.0; +https://openai.com/bot

Accept: text/html, application/xhtml+xml, application/xml;q=0.9, image/avif, image/webp, image/apng, */*;q=0.8, application/signed-exchange;v=b3;q=0.9

Cf-Connecting-Ip: 52.230.164.178

That 52.230.164.178 IP address resolves to Microsoft Azure Cloud (centralus) in Des Moines, Iowa.
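
Reproducing that capture takes very little code; a throwaway Python server along these lines would log the same fields (a sketch, not necessarily the exact setup used here):

import http.server
class LoggingHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        # Print every incoming request header, including User-Agent and Cf-Connecting-Ip
        for name, value in self.headers.items():
            print(f"{name}: {value}")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(b"test file contents\n")
http.server.HTTPServer(("", 8000), LoggingHandler).serve_forever()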

On the one hand, this is really useful! ChatGPT can navigate around websites looking for useful files, download those files to a container and then process them using Python or other languages.

Is this a data exfiltration vulnerability though? Could a prompt injection attack trick ChatGPT into leaking private data out to a container.download call to a URL with a query string that includes sensitive information?

I don’t think it can. I tried getting it to assemble a URL with a query string and access it using container.download and it couldn’t do it. It told me that it got back this error:

ERROR: download failed because url not viewed in conversation before. open the file or url using web.run first.

This looks to me like the same safety trick used by Claude’s Web Fetch tool: only allow URL access if that URL was either directly entered by the user or if it came from search results that could not have been influenced by a prompt injection.

So I think this is all safe, though I’m curious if it could hold firm against a more aggressive round of attacks from a seasoned security researcher.

The key lesson from coding agents like Claude Code and Codex CLI is that Bash rules everything: if an agent can run Bash commands in an environment, it can do almost anything that can be achieved by typing commands into a computer.

When Anthropic added their own code interpreter feature to Claude last September they built that around Bash rather than just Python. It looks to me like OpenAI have now done the same thing for ChatGPT.

Here’s what ChatGPT looks like when it runs a Bash command—here my prompt was:

It’s useful to click on the “Thinking” or “Thought for 32s” links as that opens the Activity sidebar with a detailed trace of what ChatGPT did to arrive at its answer. This helps guard against cheating—ChatGPT might claim to have run Bash in the main window but it can’t fake those black and white logs in the Activity panel.

I had it run Hello World in various languages later in that same session.

In the previous example ChatGPT installed the cowsay package from npm and used it to draw an ASCII-art cow. But how could it do that if the container can’t make outbound network requests?

In another session I challenged it to explore its environment and figure out how that worked.

The key magic appears to be an applied-caas-gateway1.internal.api.openai.org proxy, available within the container and with various packaging tools configured to use it.

The following environment variables cause pip and uv to install packages from that proxy instead of directly from PyPI:

PIP_INDEX_URL=https://reader:****@packages.applied-caas-gateway1.internal.api.openai.org/…/pypi-public/simple

PIP_TRUSTED_HOST=packages.applied-caas-gateway1.internal.api.openai.org

UV_INDEX_URL=https://reader:****@packages.applied-caas-gateway1.internal.api.openai.org/…/pypi-public/simple

UV_INSECURE_HOST=https://packages.applied-caas-gateway1.internal.api.openai.org

This one appears to get npm to work:

NPM_CONFIG_REGISTRY=https://reader:****@packages.applied-caas-gateway1.internal.api.openai.org/…/npm-public

And it reported these suspicious looking variables as well:

CAAS_ARTIFACTORY_BASE_URL=packages.applied-caas-gateway1.internal.api.openai.org

CAAS_ARTIFACTORY_PYPI_REGISTRY=…/artifactory/api/pypi/pypi-public

CAAS_ARTIFACTORY_NPM_REGISTRY=…/artifactory/api/npm/npm-public

CAAS_ARTIFACTORY_GO_REGISTRY=…/artifactory/api/go/golang-main

CAAS_ARTIFACTORY_MAVEN_REGISTRY=…/artifactory/maven-public

CAAS_ARTIFACTORY_GRADLE_REGISTRY=…/artifactory/gradle-public

CAAS_ARTIFACTORY_CARGO_REGISTRY=…/artifactory/api/cargo/cargo-public/index

CAAS_ARTIFACTORY_DOCKER_REGISTRY=…/dockerhub-public

CAAS_ARTIFACTORY_READER_USERNAME=reader

CAAS_ARTIFACTORY_READER_PASSWORD=****

NETWORK=caas_packages_only
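
A quick way to surface those variables yourself is to ask ChatGPT to run something like this in the container (a minimal sketch; the prefixes match the names reported above):

import os
# List the package-proxy configuration exposed to the container
for key in sorted(os.environ):
    if key.startswith(("PIP_", "UV_", "NPM_CONFIG", "CAAS_", "NETWORK")):
        print(f"{key}={os.environ[key]}")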

Neither Rust nor Docker are installed in the container environment, but maybe those registry references are a clue of features still to come.

The result of all of this? You can tell ChatGPT to use Python or Node.js packages as part of a conversation and it will be able to install them and apply them against files you upload or that it downloads from the public web. That’s really cool.

The big missing feature here should be the easiest to provide: we need official documentation! A release notes entry would be a good start, but there are a lot of subtle details to how this new stuff works, its limitations and what it can be used for.

As always, I’d also encourage OpenAI to come up with a name for this set of features that properly represents how it works and what it can do.

In the meantime, I’m going to call this ChatGPT Containers.

I decided to ask ChatGPT about other tools that were available to it in case there was anything interesting in there:

List all tools that are available to you, with their exact names and descriptions and signatures

...

Read the original on simonwillison.net »
