10 interesting stories served every morning and every evening.
Tim Cook to become Apple Executive Chairman
John Ternus to become Apple CEO
CUPERTINO, CALIFORNIA Apple announced that Tim Cook will become executive chairman of Apple’s board of directors and John Ternus, senior vice president of Hardware Engineering, will become Apple’s next chief executive officer effective on September 1, 2026. The transition, which was approved unanimously by the Board of Directors, follows a thoughtful, long-term succession planning process.
Cook will continue in his role as CEO through the summer as he works closely with Ternus on a smooth transition. As executive chairman, Cook will assist with certain aspects of the company, including engaging with policymakers around the world.
“It has been the greatest privilege of my life to be the CEO of Apple and to have been trusted to lead such an extraordinary company. I love Apple with all of my being, and I am so grateful to have had the opportunity to work with a team of such ingenious, innovative, creative, and deeply caring people who have been unwavering in their dedication to enriching the lives of our customers and creating the best products and services in the world,” said Cook. “John Ternus has the mind of an engineer, the soul of an innovator, and the heart to lead with integrity and with honor. He is a visionary whose contributions to Apple over 25 years are already too numerous to count, and he is without question the right person to lead Apple into the future. I could not be more confident in his abilities and his character, and I look forward to working closely with him on this transition and in my new role as executive chairman.”
“I am profoundly grateful for this opportunity to carry Apple’s mission forward,” said Ternus. “Having spent almost my entire career at Apple, I have been lucky to have worked under Steve Jobs and to have had Tim Cook as my mentor. It has been a privilege to help shape the products and experiences that have changed so much of how we interact with the world and with one another. I am filled with optimism about what we can achieve in the years to come, and I am so happy to know that the most talented people on earth are here at Apple, determined to be part of something bigger than any one of us. I am humbled to step into this role, and I promise to lead with the values and vision that have come to define this special place for half a century.”
Arthur Levinson, who has been Apple’s non-executive chairman for the past 15 years, will become its lead independent director on September 1, 2026. Ternus will join the board of directors, also effective September 1, 2026.
“Tim’s unprecedented and outstanding leadership has transformed Apple into the world’s best company. He’s introduced groundbreaking products and services time and again, and his integrity and values are infused into everything Apple does,” said Levinson. “On behalf of the entire board of directors, we are incredibly grateful for his countless contributions to Apple and the world, and we are thrilled he will now be executive chairman. We believe John is the best possible leader to succeed Tim and as he transitions to CEO we know his love of Apple, his leadership, deep technical knowledge, and relentless focus on creating great products will help lead Apple to an extraordinary future.”
“I want to thank Art for the incredible work he has done leading the board of directors for the past 15 years,” said Cook. “I have always found his advice to be invaluable and I appreciate his thoughtfulness and his unwavering dedication to the company. I am grateful he will serve as our lead independent director, and I look forward to working with him in my new role.”
Tim Cook joined Apple in 1998. He became CEO in 2011 and has overseen the introduction of numerous products and services, including new categories like Apple Watch, AirPods, and Apple Vision Pro, and services ranging from iCloud and Apple Pay to Apple TV and Apple Music. He was also instrumental in expanding existing product lines. Under Cook’s leadership Apple has grown from a market capitalization of approximately $350 billion to $4 trillion, representing a more than 1,000% increase, and yearly revenue has nearly quadrupled, from $108 billion in fiscal year 2011 to more than $416 billion in fiscal year 2025. The company has expanded its global footprint substantially, particularly in emerging markets; it is now in more than 200 countries and territories. Apple operates over 500 retail stores and has more than doubled the number of countries in which its customers can visit an Apple Store. During his tenure, Apple has grown by more than 100,000 team members and increased its active installed base to more than 2.5 billion devices.
Apple Services has been a major focus area of Cook’s, and during his tenure the category has grown to become a more than $100 billion business, the equivalent of a Fortune 40 company. Cook was also instrumental in creating the wearables category at Apple, which now includes the world’s most popular watch and headphones, and which has served as the foundation for Apple’s remarkable impact on the health and safety of its users. Under Cook’s leadership, Apple also transitioned to Apple-designed silicon, enabling the company to own more of its primary technology and deliver industry-leading gains in power efficiency and performance that directly benefit users across its products.
Cook has made Apple’s core values even more central to the company’s decision making and product development. Under his leadership, the company reduced its carbon footprint by more than 60 percent below 2015 levels during a period in which revenue nearly doubled. Cook, who has long advocated for privacy as a fundamental human right, has made privacy and security imperative at Apple, setting a standard for user protection that continues to set the company apart from the rest of the technology industry. He has also pushed for continued innovation in the accessibility space, believing that Apple products should be made for everyone. And he has made central to his leadership the notion that Apple should be a place where everyone can feel they belong and where everyone is treated with dignity and respect.
Ternus joined Apple’s product design team in 2001 and became a vice president of Hardware Engineering in 2013. He joined the executive team in 2021 as senior vice president of Hardware Engineering. Throughout his tenure at Apple, Ternus has overseen hardware engineering work on a variety of groundbreaking products across every category. He was instrumental in the introduction of multiple new product lines, including iPad and AirPods, as well as many generations of products across iPhone, Mac, and Apple Watch.
Ternus’s work on Mac has helped the category become more powerful and more popular globally than at any time in its 40-year history. That includes the recent introduction of MacBook Neo, an all-new laptop that makes the Mac experience even more accessible to more people around the world. This past fall, his team’s efforts were on full display with the introduction of a redefined iPhone lineup, including the incredibly powerful iPhone 17 Pro and Pro Max, the radically thin and durable iPhone Air, and the iPhone 17, which has been an incredible upgrade for users. Under his leadership, his team also drove advancements in AirPods to make them the world’s best in-ear headphones, with unprecedented active noise cancellation, as well as the capability to become an all-in-one hearing health system that can serve as over-the-counter hearing aids.
Ternus led much of the company’s focus in areas like reliability and durability, introducing new techniques that have made Apple products remarkably resilient. He has also driven much of Apple’s innovation in materials and hardware design that have reduced the carbon footprint of its products, including the creation of a new, recycled aluminum compound that has been introduced across multiple product lines, the use of 3-D printed titanium in Apple Watch Ultra 3, and innovations in repairability that have increased the lifespans of several Apple products.
Prior to Apple, Ternus worked as a mechanical engineer at Virtual Research Systems. He holds a bachelor’s degree in Mechanical Engineering from the University of Pennsylvania.
This press release contains forward-looking statements, within the meaning of the Private Securities Litigation Reform Act of 1995. These forward-looking statements include without limitation those about Apple’s executive succession plans. These statements involve risks and uncertainties, and actual results may differ materially from any future results expressed or implied by the forward-looking statements. More information regarding potential risks and other factors that could affect the company are included in Apple’s filings with the SEC, including in the “Risk Factors” and “Management’s Discussion and Analysis of Financial Condition and Results of Operations” sections of Apple’s most recently filed periodic reports on Form 10-K and Form 10-Q and subsequent filings. Apple assumes no obligation to update any forward-looking statements or information, which speak only as of the date they are made.
About Apple
Apple revolutionized personal technology with the introduction of the Macintosh in 1984. Today, Apple leads the world in innovation with iPhone, iPad, Mac, AirPods, Apple Watch, and Apple Vision Pro. Apple’s six software platforms — iOS, iPadOS, macOS, watchOS, visionOS, and tvOS — provide seamless experiences across all Apple devices and empower people with breakthrough services including the App Store, Apple Music, Apple Pay, iCloud, and Apple TV+. Apple’s more than 150,000 employees are dedicated to making the best products on earth and to leaving the world better than we found it.
© 2026 Apple Inc. All rights reserved. Apple, the Apple logo, Apple Watch, AirPods, Apple Vision Pro, iCloud, Apple Pay, Apple TV, Apple Music, Apple Store, iPad, iPhone, Mac, MacBook Neo, and iPhone Air are trademarks of Apple. Other company and product names may be trademarks of their respective owners.
...
Read the original on www.apple.com »
Six million fake stars, $0.06 per click, and a VC funding pipeline that treats GitHub popularity as proof of traction. We ran our own analysis on 20 repos and found the fingerprints.
A GitHub star costs $0.06 at the low end. A seed round unlocks $1 million to $10 million. The math is obvious, and thousands of repositories are exploiting it.
This investigation maps the full ecosystem: from the peer-reviewed research quantifying the problem, to the marketplaces selling stars openly, to the venture capital pipeline that converts star counts into funding decisions. We ran our own analysis on 20 repositories using the GitHub API, sampling thousands of stargazer profiles to independently verify which projects show fingerprints of manipulation - and which don’t.
The picture that emerges is a mature, professionalized shadow economy operating in plain sight.
The definitive account comes from a peer-reviewed study presented at ICSE 2026 by researchers at Carnegie Mellon University, North Carolina State University, and Socket. Their tool, StarScout, analyzed 20 terabytes of GitHub metadata - 6.7 billion events and 326 million stars from 2019 to 2024 - and identified approximately 6 million suspected fake stars distributed across 18,617 repositories by roughly 301,000 accounts.
The problem accelerated dramatically in 2024. By July, 16.66% of all repositories with 50 or more stars were involved in fake star campaigns - up from near-zero before 2022. The researchers’ detection proved accurate: 90.42% of flagged repositories and 57.07% of flagged accounts had been deleted as of January 2025, confirming GitHub itself recognized these as illegitimate.
AI and LLM repositories emerged as the largest non-malicious category of fake-star recipients, ahead of blockchain/cryptocurrency projects in absolute volume, at 177,000 fake stars; the study notes that many of these "are academic paper repositories or LLM-related startup products." Critically, 78 repositories with detected fake star campaigns appeared on GitHub Trending, proving that purchased stars successfully game the platform’s discovery algorithm.
Earlier foundational work includes Dagster’s March 2023 investigation, where engineers purchased stars from two vendors to study the phenomenon. They found services via basic Google search. A premium vendor - GitHub24, a registered German company (Moller und Ringauf GbR) - charged EUR 0.85 per star and delivered reliably, with all 100 stars persisting after one month. A budget service (Baddhi Shop) sold 1,000 stars for $64, though only 75% survived.
The star-selling ecosystem spans dedicated websites, freelance platforms, exchange networks, and underground channels. At least a dozen active websites sell GitHub stars directly, including SocialPlug.io, Buy.fans, Boost-Like.store, GitHubPromoter.com, Followdeh.com, and Vurike.com.
On Fiverr, 24 active gigs sell GitHub promotion, with packages from $5 for basic stars and forks to $25+ for “organic promotion.” Many use obfuscated language to evade platform filters. Star exchange platforms like GithubStarMate.com and SafeStarExchange.com - both live and operational - enable free mutual starring through credit-based systems.
The infrastructure extends beyond stars. At least seven open-source tools on GitHub (fake-git-history, commit-bot, Commiter, and others) exist specifically to fabricate GitHub contribution graphs. Pre-built GitHub profiles with five-year commit histories and Arctic Code Vault Contributor badges sell for approximately $5,000 on Telegram.
Some vendors offer replacement guarantees - Followdeh advertises 30-day coverage, and premium services promise “non-drop” stars that survive GitHub’s detection systems. SocialPlug claims 3.1 million stars delivered across 53,000+ clients and offers a formal API for programmatic purchasing.
A Tsinghua University study (ACSAC 2020) documented Chinese QQ and WeChat promotion groups with 1,020+ members processing roughly 20 repos per day, generating an estimated $3.4 to $4.4 million annually in promoter profits.
To move beyond reported statistics, we built a GitHub API analysis tool and ran it against 20 repositories: projects flagged by StarScout, fast-growing AI repos from the Runa Capital ROSS Index, and known organic baselines. For each repo, we sampled 150 stargazer profiles and measured account age, public repos, followers, and bio presence.
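To make the method concrete, here is a minimal sketch of how a stargazer sample like this can be pulled from the GitHub REST API. It is illustrative only: the endpoints are GitHub's public ones, but the repository name, sample size, and the simple "first N stargazers" sampling are assumptions rather than our exact tooling (and a real run needs an auth token to avoid rate limits).

```python
import requests

API = "https://api.github.com"
HEADERS = {"Accept": "application/vnd.github+json"}  # add "Authorization": "Bearer <token>" for real use

def sample_stargazers(repo, n=150):
    """Collect up to n stargazer logins, then fetch each profile's basic signals."""
    logins, page = [], 1
    while len(logins) < n:
        batch = requests.get(f"{API}/repos/{repo}/stargazers",
                             params={"per_page": 100, "page": page},
                             headers=HEADERS).json()
        if not batch:
            break
        logins += [u["login"] for u in batch]
        page += 1
    profiles = []
    for login in logins[:n]:
        p = requests.get(f"{API}/users/{login}", headers=HEADERS).json()
        profiles.append({
            "created_at": p.get("created_at"),
            "public_repos": p.get("public_repos", 0),
            "followers": p.get("followers", 0),
            "has_bio": bool(p.get("bio")),
        })
    return profiles

# "Ghost" accounts: zero repos, zero followers, no bio
profiles = sample_stargazers("pallets/flask", n=150)
ghosts = [p for p in profiles if p["public_repos"] == 0
          and p["followers"] == 0 and not p["has_bio"]]
print(f"ghost share: {len(ghosts) / len(profiles):.1%}")
```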
The fingerprints of manipulation are unmistakable once you know what to look for.
Organic repositories are starred by developers who have been on GitHub for years, maintain their own projects, and follow other users. Ghost accounts - zero repos, zero followers, no bio - make up about 1% of a healthy project’s stargazer base.
The flagged repos, by contrast, share a distinctive fingerprint. The accounts aren’t obviously new - median ages of 1,000+ days - so they pass simple “young account” filters. But they’re empty: a third have zero repos, half to four-fifths have zero followers, and a quarter are complete ghosts. These are aged accounts purchased or farmed specifically for star campaigns.
The fork-to-star ratio is the strongest signal. Flask has 235 forks per 1,000 stars. Shardeum has 22. FreeDomain has 17. When nobody is forking a 157,000-star repository, nobody is using it. The watcher-to-star ratio tells the same story: FreeDomain’s 0.001 means that for every 1,000 people who starred the repo, just one actually watches it for updates.
FreeDomain is worth isolating: 157,000 stars, but only 168 watchers and 2,676 forks. That’s a watcher-to-star ratio 26x lower than Flask. 81.3% of sampled stargazers have zero followers. This is a repository where almost nobody who starred it has any visible presence on GitHub.
Union Labs is the most consequential case. It was ranked #1 on Runa Capital’s ROSS Index for Q2 2025 - a widely cited VC industry report identifying the “hottest open-source startups” - with 54.2x star growth and 74,300 stars. Our analysis found 32.7% zero-repo accounts, 52% zero-follower accounts, and a fork-to-star ratio of 0.052. The StarScout analysis flagged it with 47.4% suspected fake stars. An influential investment-sourcing report that VCs rely on was topped by a project with nearly half its stars suspected as artificial.
RagaAI-Catalyst and openai-fm show clear manipulation signals. RagaAI has 76.2% zero-follower accounts and 28% ghosts - nearly identical to the blockchain pattern. openai-fm is the most extreme case in our dataset: 66% suspicious accounts, 36% ghosts, and a median account age of just 116 days. Two-thirds of its stargazers are less than a year old with virtually no GitHub activity. (The StarScout analysis notes this is likely third-party bots, not OpenAI itself.)
Langflow - flagged by StarScout at 47.9% fake - showed clean metrics in our profile sample, with a median age of 2,859 days and low ghost rates. This likely reflects improved account quality since the StarScout scan. The 0.060 fork-to-star ratio is still notably low - roughly a quarter of Flask’s - suggesting less genuine adoption relative to star count.
For comparison, NousResearch’s hermes-agent looks relatively organic: median age 8 years, 6% ghosts, fork-to-star ratio of 0.133. Despite Reddit accusations of astroturfing, the stargazer population is mostly real developers. The project’s crypto-adjacent audience includes more casual GitHub users, which explains slightly elevated zero-follower rates, but the fundamental engagement pattern is legitimate.
The connection between GitHub star counts and startup funding is not speculative - it is explicitly documented by the investors themselves.
Jordan Segall, Partner at Redpoint Ventures, published an analysis of 80 developer tool companies showing that the median GitHub star count at seed financing was 2,850 and at Series A was 4,980. He confirmed: “Many VCs write internal scraping programs to identify fast growing github projects for sourcing, and the most common metric they look toward is stars.”
Those numbers set an implicit target. For $85 to $285 in budget stars, a startup can manufacture the 2,850-star seed median. For $990 to $4,500, it can reach Series A territory. Against typical seed rounds of $1-10 million, the ROI ranges from 3,500x to 117,000x.
Runa Capital publishes the ROSS (Runa Open Source Startup) Index quarterly, ranking the 20 fastest-growing open-source startups by GitHub star growth rate. Per TechCrunch, 68% of ROSS Index startups that attracted investment did so at seed stage, with $169 million raised across tracked rounds. GitHub itself, through its GitHub Fund partnership with M12 (Microsoft’s VC arm), commits $10 million annually to invest in 8-10 open-source companies at pre-seed/seed stages based partly on platform traction.
* Lovable (formerly GPT Engineer): 50,000+ stars, $7.5M pre-seed, $200M Series A at $1.8 billion valuation with 45 employees
Dagster’s Fraser Marlow, who led the fake star investigation, admitted directly: “In the run-up to the fundraising, I spent a fair amount of time preoccupied with GitHub stars.” An academic paper in Organization Science provided rigorous statistical evidence that GitHub engagement correlates with startup funding outcomes - startups active on GitHub are 15 percentage points more likely to have raised a financing round.
The incentive loop is self-reinforcing: VCs use stars as sourcing signals, so startups manipulate stars, so VCs see inflated traction, so more VCs adopt star-tracking, so more startups manipulate. Redpoint’s own published benchmarks give startups an exact target to buy toward.
Our analysis revealed the fork-to-star ratio as the strongest simple heuristic for identifying potential manipulation. The logic is straightforward: a star costs nothing and conveys no commitment. A fork means someone downloaded the code to use or modify it.
Any repository with a fork-to-star ratio below 0.05 and more than 10,000 stars warrants scrutiny. The watcher-to-star ratio is even more telling: organic projects average 0.005 to 0.030; FreeDomain registers 0.001.
These ratios aren’t perfect - educational repos and curated lists naturally have low fork rates. But as a first-pass filter, they catch the most egregious cases that raw star counts miss entirely.
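As a sketch of how cheap this first-pass filter is to apply, both ratios come from a single call to the repository endpoint. One caveat baked into the code: in GitHub's REST API the `watchers_count` field simply mirrors stars, so the true "watching" figure is `subscribers_count`. The thresholds are the ones discussed above; everything else is an illustrative assumption, not our exact tooling.

```python
import requests

def star_ratios(repo):
    """First-pass screen using fork-to-star and watcher-to-star ratios."""
    data = requests.get(f"https://api.github.com/repos/{repo}",
                        headers={"Accept": "application/vnd.github+json"}).json()
    stars = data["stargazers_count"]
    forks = data["forks_count"]
    watchers = data["subscribers_count"]  # real watchers; watchers_count mirrors stars
    fork_ratio = forks / stars if stars else 0.0
    watch_ratio = watchers / stars if stars else 0.0
    # Heuristic from above: big repos that nobody forks deserve scrutiny
    flag = stars > 10_000 and fork_ratio < 0.05
    return {"stars": stars, "fork_to_star": round(fork_ratio, 3),
            "watcher_to_star": round(watch_ratio, 4), "needs_scrutiny": flag}

print(star_ratios("pallets/flask"))
```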
The problem extends to every platform where popularity metrics influence trust.
npm downloads are trivially inflatable. Developer Andy Richardson demonstrated this by using a single AWS Lambda function (free tier) to push his package is-introspection-query to nearly 1 million downloads per week - surpassing legitimate packages like urql and mobx. Zero actual users. The CMU study found that of repos with fake star campaigns, only 1.23% appeared in package registries, but of those 738 packages, 70.46% had zero dependent projects.
VS Code Marketplace extensions are similarly vulnerable. Researchers demonstrated 1,000+ installs of a fake extension in 48 hours. AquaSec found 1,283 extensions with known malicious dependencies totaling 229 million installs.
X/Twitter promotion amplifies artificial GitHub virality through engagement pods - private groups where members agree to like, repost, and comment on each other’s content. Growth Terminal sells this as a product feature. NBC News and Clemson University researchers identified a network of 686 X accounts that posted more than 130,000 times using LLM-generated content, some containing telltale artifacts like “Dolphin here!” from the uncensored Dolphin model they employed.
The Higgsfield AI case documents cross-platform astroturfing at industrial scale: over 100 confirmed spam posts across 60+ subreddits, combined with mass template DMs to content creators offering payment for promotion.
The FTC Consumer Review Rule, effective October 21, 2024, explicitly prohibits selling or buying “fake indicators of social media influence” generated by bots or fake accounts for commercial purposes. Penalties: up to $53,088 per violation. The FTC issued its first warning letters to 10 companies in December 2025. A GitHub star purchased to promote a commercial product fits this framework.
The SEC precedent is more direct. HeadSpin’s CEO was charged with wire fraud (maximum 20 years) and securities fraud for inflating metrics to deceive investors out of $80 million. ComplYant’s founder faced charges for claiming $250,000 monthly revenue when actual revenue was $250.
The SEC’s message: “Startup fundraisers cannot use the ‘fake it until you make it’ ethos to whitewash lying to investors.”
If a startup buys fake GitHub stars to inflate perceived traction during a fundraising round, and investors rely on those metrics to deploy capital, the wire fraud framework applies: using electronic communications to misrepresent material facts for financial gain. No one has been charged specifically for fake GitHub stars yet. Given the CMU research documenting the practice at scale and the FTC rule explicitly covering fake social influence metrics, it may only be a matter of time.
GitHub’s Acceptable Use Policies explicitly prohibit “inauthentic interactions, such as fake accounts and automated inauthentic activity,” “rank abuse, such as automated starring or following,” and “creation of or participation in secondary markets for the purpose of the proliferation of inauthentic activity.” The policies even specifically prohibit starring incentivized by “cryptocurrency airdrops, tokens, credits, gifts or other give-aways.”
Enforcement is reactive and asymmetric. GitHub removed 90.42% of repositories flagged by StarScout, but only 57.07% of the accounts that delivered those stars. The infrastructure for future campaigns largely remains intact. When Dagster published its investigation, fake star profiles were deleted within 48 hours - but only after public embarrassment, not proactive detection.
GitHub has never published an engineering blog post about its detection methods or enforcement statistics. No transparency report exists for star manipulation. The company’s VP of Security Operations told Wired only that they “disabled user accounts in accordance with GitHub’s Acceptable Use Policies,” declining to elaborate - though that comment was specifically about the Stargazers Ghost Network malware operation, not vanity metric manipulation.
The CMU researchers recommended GitHub adopt a weighted popularity metric based on network centrality rather than raw star counts, a change that would structurally undermine the fake star economy. GitHub has not implemented it.
Bessemer Venture Partners calls stars “vanity metrics” and instead tracks unique monthly contributor activity - anyone who created an issue, comment, PR, or commit. Fewer than 5% of top 10,000 projects ever exceeded 250 monthly contributors; only 2% sustained it across six months.
Jono Bacon at StateShift recommends five metrics that correlate with real adoption: package downloads, issue quality (production edge cases from real users), contributor retention (time to second PR), community discussion depth, and usage telemetry.
The fork-to-star ratio our analysis surfaced is the simplest first-pass filter. A healthy project has roughly 100-200 forks per 1,000 stars. Projects below 50 forks per 1,000 stars with high absolute counts deserve a closer look.
As one commenter put it: “You can fake a star count, but you can’t fake a bug fix that saves someone’s weekend.”
First, the incentive loop. VCs use stars as sourcing signals. Startups manipulate stars. VCs see inflated traction. More VCs adopt star-tracking. More startups manipulate. Redpoint’s published benchmarks - 2,850 at seed, 4,980 at Series A - effectively give startups a price list for how many stars to buy.
Second, the AI sector’s specific vulnerability. The combination of extreme hype, crypto-adjacent funding models that reward token price over product quality, and a reviewer ecosystem on X/Twitter populated partly by fabricated personas creates a perfect environment for manufactured credibility. Our analysis confirmed this: the repos with the worst manipulation signals were overwhelmingly blockchain and crypto-adjacent AI projects.
Third, GitHub’s enforcement asymmetry. Removing repos but leaving 57% of fake accounts intact preserves the labor force of the fake star economy while doing little to deter repeat offenses. Until GitHub implements structural changes - weighted popularity metrics, account-level reputation scoring, or transparent enforcement reporting - the gap between star counts and genuine developer adoption will continue to widen.
The star economy is a $50 problem with a $50 million consequence. Until the platforms, investors, and regulators catch up, the market will keep paying the $50.
...
Read the original on awesomeagents.ai »
We are open sourcing our latest model, Kimi K2.6, featuring state-of-the-art coding, long-horizon execution, and agent swarm capabilities. Kimi K2.6 is now available via Kimi.com, the Kimi App, the API, and Kimi Code.
Kimi K2.6 shows strong improvements in long-horizon coding tasks, with reliable generalization across programming languages (e.g., Rust, Go, and Python) and tasks (e.g., front-end, devops, and performance optimization). On Kimi Code Bench, our internal coding benchmark covering diverse complicated end-to-end tasks, Kimi K2.6 demonstrates significant improvements over Kimi K2.5.
Kimi K2.6 successfully downloaded and deployed the Qwen3.5-0.8B model locally on a Mac. By implementing and optimizing model inference in Zig—a highly niche programming language—it demonstrated exceptional out-of-distribution generalization. Across 4,000+ tool calls, over 12 hours of continuous execution, and 14 iterations, Kimi K2.6 dramatically improved throughput from ~15 to ~193 tokens/sec, ultimately achieving speeds ~20% faster than LM Studio.
Kimi K2.6 autonomously overhauled exchange-core, an 8-year-old open-source financial matching engine. Over a 13-hour execution, the model iterated through 12 optimization strategies, initiating over 1,000 tool calls to precisely modify more than 4,000 lines of code. Acting as an expert systems architect, Kimi K2.6 analyzed CPU and allocation flame graphs to pinpoint hidden bottlenecks and boldly reconfigured the core thread topology (from 4ME+2RE to 2ME+1RE). Despite the engine already operating near its performance limits, Kimi K2.6 extracted a 185% medium throughput leap (from 0.43 to 1.24 MT/s) and a 133% performance throughput gain (soaring from 1.23 to 2.86 MT/s).
In beta tests, K2.6 performs well on long-horizon coding tasks in enterprise evaluations (listed in alphabetical order):
Building on these strong coding capabilities, Kimi K2.6 can turn simple prompts into complete front-end interfaces, generating structured layouts with deliberate design choices such as aesthetic hero sections, as well as interactive elements and rich animations, including scroll-triggered effects. With strong proficiency in leveraging image and video generation tools, Kimi K2.6 supports the generation of visually coherent assets and contributes to higher-quality, more salient hero sections.
Moreover, Kimi K2.6 expands beyond static front-end development to simple full-stack workflows spanning authentication, user interaction, and database operations, for lightweight use cases like transaction logging or session management.
We established an internal Kimi Design Bench, organized into four categories: Visual Input Tasks, Landing Page Construction, Full-Stack Application Development, and General Creative Programming. In comparison with Google AI Studio, Kimi K2.6 shows promising results and performs well across these categories.
Below are examples generated by K2.6 Agent from a single prompt, with preconfigured harnesses and tools:
Scaling out, not just up. An Agent Swarm dynamically decomposes tasks into heterogeneous subtasks executed concurrently by self-created domain-specialized agents.
Based on the K2.5 Agent Swarm research preview, Kimi K2.6 Agent Swarm demonstrates a qualitative leap in the agent swarm experience. It seamlessly coordinates heterogeneous agents to combine complementary skills: broad search layered with deep research, large-scale document analysis fused with long-form writing, and multi-format content generation executed in parallel. This compositional intelligence enables the swarm to deliver end-to-end outputs—spanning documents, websites, slides, and spreadsheets—within a single autonomous run.
The architecture scales horizontally to 300 sub-agents executing across 4,000 coordinated steps simultaneously, a substantial expansion from K2.5’s 100 sub-agents and 1,500 steps. This massive parallelization fundamentally reduces end-to-end latency while significantly enhancing output quality and expanding the operational boundaries of agent swarms.
It can also turn high-quality files such as PDFs, spreadsheets, slides, and Word documents into Skills. Kimi K2.6 captures and maintains the documents’ structural and stylistic DNA, enabling you to reproduce the same quality and format in future tasks.
Here are some examples:
K2.6 demonstrates strong performance in autonomous, proactive agents such as OpenClaw and Hermes, which operate across multiple applications with continuous, 24/7 execution.
Unlike simple chat-based interactions, these workflows require AI to proactively manage schedules, execute code, and orchestrate cross-platform operations as a persistent background agent.
Our RL infra team used a K2.6-backed agent that operated autonomously for 5 days, managing monitoring, incident response, and system operations, demonstrating persistent context, multi-threaded task handling, and full-cycle execution from alert to resolution. Here is K2.6’s worklog (anonymized to remove sensitive information):
Kimi K2.6 delivers measurable improvements in real-world reliability: more precise API interpretation, more stable long-running performance, and enhanced safety awareness during extended research tasks.
Performance gains are quantified by our internal Claw Bench, the evaluation suite spanning five domains: Coding Tasks, IM Ecosystem Integration, Information Research & Analysis, Scheduled Task Management, and Memory Utilization. Across all metrics, Kimi K2.6 significantly outperforms Kimi K2.5 in task completion rates and tool invocation accuracy—particularly in workflows requiring sustained autonomous operation without human oversight.
Building upon its robust orchestration capabilities, Kimi K2.6 extends your proactive agents to Claw Groups as a research preview—a new instantiation of the Agent Swarm architecture.
Claw Groups embrace an open, heterogeneous ecosystem: Multiple agents and humans operate as true collaborators. Users can onboard agents from any device, running any model, each carrying their own specialized toolkits, skills and persistent memory contexts. Whether deployed on local laptops, mobile devices, or cloud instances, these diverse agents integrate seamlessly into a shared operational space.
At the center of this swarm, Kimi K2.6 serves as an adaptive coordinator. It dynamically matches tasks to agents based on their specific skill profiles and available tools, optimizing for capability fit. When an agent encounters failure or stalls, the coordinator detects the interruption, automatically reassigns the task or regenerates subtasks, and actively manages the full lifecycle of deliverables—from initiation through validation to completion.
We also want to thank the K2.6-powered agents in Claw Groups—we’ve been dogfooding it with our own agent marketing team, refining human–agent workflows in practice. Using Claw Groups, we run end-to-end content production and launch campaigns, with specialized agents like Demo Makers, Benchmark Makers, Social Media Agents, and Video Makers working together. K2.6 coordinates the process, enabling agents to share intermediate results and turn ideas into consistent, fully packaged deliverables.
We are moving beyond simply asking AI a question or assigning AI a task, and entering a phase where human and AI collaborate as genuine partners—combining strengths to solve problems collectively. Claw Groups marks our latest efforts toward a future where the boundaries between “my agent,” “your agent,” and “our team” dissolve seamlessly into a collaborative system.
To reproduce official Kimi-K2.6 benchmark results, we recommend using the official API. For third-party providers, refer to Kimi Vendor Verifier (KVV) to choose high-accuracy services. Details: https://kimi.com/blog/kimi-vendor-verifier
* We report results for Kimi K2.6 and Kimi K2.5 with thinking mode enabled, Claude Opus 4.6 with max effort, GPT-5.4 with xhigh reasoning effort, and Gemini 3.1 Pro with a high thinking level.
* Unless otherwise specified, all Kimi K2.6 experiments were conducted with temperature = 1.0, top-p = 1.0, and a context length of 262,144 tokens.
* Benchmarks without publicly available scores were re-evaluated under the same conditions used for Kimi K2.6 and are marked with an asterisk (*). Except where noted with an asterisk, all other results are cited from official reports.
* IMO-AnswerBench scores for GPT-5.4 and Claude 4.6 were obtained from https://z.ai/blog/glm-5.1.
* Humanity’s Last Exam (HLE) and other reasoning tasks were evaluated with a maximum generation length of 98,304 tokens. By default, we report results on the HLE full set. For the text-only subset, Kimi K2.6 achieves 36.4% accuracy without tools and 55.5% with tools.
* Kimi K2.6 was equipped with search, code-interpreter, and web-browsing tools for HLE with tools, BrowseComp, DeepSearchQA, and WideSearch.
* For HLE-Full with tools, the maximum generation length is 262,144 tokens with a per-step limit of 49,152 tokens. We employ a simple context management strategy: once the context window exceeds the threshold, only the most recent round of tool-related messages is retained (see the sketch after this list).
* For BrowseComp, we report scores obtained with context management using the same discard-all strategy as Kimi K2.5 and DeepSeek-V3.2.
* For DeepSearchQA, no context management was applied to Kimi K2.6 tests, and tasks exceeding the supported context length were directly counted as failed. Scores for Claude Opus 4.6, GPT-5.4, and Gemini 3.1 Pro on DeepSearchQA are cited from the Claude Opus 4.7 System Card.
* For WideSearch, we report results under the “hide tool result” context management setting. Once the context window exceeds the threshold, only the most recent round of tool-related messages is retained.
* The test system prompts are identical to those used in the Kimi K2.5 technical report.
* Claw Eval was conducted using version 1.1 with max-tokens-per-step = 16384.
* For APEX-Agents, we evaluate 452 tasks from the public 480-task release, as done by Artificial Analysis (excluding Investment Banking Worlds 244 and 246, which have external runtime dependencies).
* Terminal-Bench 2.0 scores were obtained with the default agent framework (Terminus-2) and the provided JSON parser, operating in preserve thinking mode.
* For the SWE-Bench series of evaluations (including Verified, Multilingual, and Pro), we used an in-house evaluation framework adapted from SWE-agent. This framework includes a minimal set of tools—bash tool, createfile tool, insert tool, view tool, strreplace tool, and submit tool.
* All reported scores for coding tasks are averaged over 10 independent runs.
* Settings with Python tool use max-tokens-per-step = 65,536 and max-steps = 50 for multi-step reasoning.
* MMMU-Pro follows the official protocol, preserving input order and prepending images.
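For readers curious what the "retain only the most recent round of tool-related messages" strategy in the notes above looks like in code, here is a minimal sketch. The message schema, the token counter, and the way a "round" is detected are all assumptions for illustration; the actual implementation is not public.

```python
def trim_context(messages, max_tokens, count_tokens, is_tool_msg):
    """Once the context exceeds max_tokens, keep every non-tool message but drop
    all tool-related messages except those from the most recent round.
    (Sketch only: message schema and round detection are assumed.)"""
    if sum(count_tokens(m) for m in messages) <= max_tokens:
        return messages  # under the threshold: keep everything
    # Treat tool messages after the last non-tool message as the most recent round
    last_non_tool = max((i for i, m in enumerate(messages) if not is_tool_msg(m)), default=-1)
    return [m for i, m in enumerate(messages) if not is_tool_msg(m) or i > last_non_tool]
```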
...
Read the original on www.kimi.com »
Let me tell you a story. When I was a child, I suffered from night terrors. It was always the same dream: I could hear my family and neighbors wailing in the street outside as they were pursued and then destroyed by a nameless malevolent force, something neither I nor anyone else could control, a great darkness that was, somehow, all my fault.
Today, that childhood dream is finally coming true. Today I can finally say the sweetest nine or 10 words in the English language: Global Tetrahedron has completed its plan to control InfoWars.com.
I’ve had a lot of time to think about InfoWars in the last year and a half. As the seasons have changed, my ambitions for the project have grown grander, crueler, better aligned with market data. Come, friends, and imagine with me…
Imagine a roaring arena packed to the rafters with pathological liars. High above you in the nosebleeds are podcasters, screaming that you’ll die if you don’t buy their skincare products. Below, on the floor, imagine demonic battalions of super-influencers physically forcing people into home fitness devices designed to dismantle their bodies bone by bone and reassemble them into a grotesque statue of yourself. Out of the throngs, an extremely sick looking man approaches you. He puts his hands on your shoulders. He explains that he is your life coach and that you owe him $800.
Such is the InfoWars I envision: An infinite virtual surface teeming with ads. Not just ads, but scams! Not just scams, but lies with no object, free radical misinformation, sentences and images so poorly thought out that they are unhealthy even to view for just a few seconds. The InfoWars of old was only the prototype for the hell I know we can build together: A digital platform where, every day, visitors sacrifice themselves at altars of delusion and misery, their minds fully disintegrating on contact.
With this new InfoWars, we will democratize psychological torture, welcoming brutal and sadistic ideas from everyone, even the very stupidest among us. It will be like the Manhattan Project, only instead of a bomb, we will be building a website.
The InfoWars of tomorrow will converge into a swirling vortex of content about content, talent acquiring talent, rings of concentric media mergers processing all human artistry into one endlessly digestible slurry. This will be a dank, sunless place, one where panic and capital feed on each other like twins in the womb of a hulking, unknowable monster—a monster known by many names, but which I like to call modern-day America.
All of this is to say that I believe in us. I believe that with the new InfoWars, we can alchemize the pioneering spirit of amateur inquiry, the profit-maximizing drive of corporations, and the cold mental clarity that comes only with disciplined daily ingestion of mind- and body-altering chemicals. If we can do that, what other great things can we do together?
I don’t yet know, but I’m excited to find out. Welcome home, warriors. The future belongs to us. We’re writing the story now. It’s going to be a long one, and it’s going to be a bad one.
So settle in. Make yourself comfortable. Buy a tote bag.
Nothing can stop us now that we’re in charge of a website.
...
Read the original on theonion.com »
Saunas have been around since ancient times in Finland, and have always been considered to have a therapeutic effect[1]. A sauna is a hot, dry environment that stimulates our cardiovascular system: during extreme heat exposure, our heart rate rises and our vessels dilate to increase the delivery of blood volume in order to protect the body[2].
This extra pressure on the heart is known to have long-term health benefits[3]. The heat exposure also promotes sweating, and therefore the elimination of toxins, including those generated in the process of repairing small muscle tears after exercise[4]. It is for this reason that saunas are also considered great for recovery. None of this is news; after all, isn't that what Roman baths were built for? Recovery after battle[5]!
However, most studies have looked at the benefits of frequent sauna bathing and the impacts on long-term health. Motivated to understand the immediate physiological response to saunas, we looked at the same-day effects across ~59,000 daily records from 256 users.
We used simple paired t-test evaluations to assess the immediate same-day effects of saunas.
Sauna days were associated with:
That fits our intuition: many people sauna after a workout.
Sauna days also showed lower minimum heart rate compared to non‑sauna days. Importantly, this effect remains even after controlling for activity, which suggests the lower nighttime heart rate isn’t simply due to exercise. The difference between sauna and non-sauna days is on average 5% (3bpm) which is a noticeable physiological change.
These results were statistically robust (FDR‑corrected p < 0.05 and Cohen’s d > 0.2), supporting the idea that sauna use may be linked to better same‑day recovery.
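For readers who want to run the same kind of check on their own data, here is a minimal sketch of the recipe described above: a paired t-test per metric, Benjamini-Hochberg FDR correction across metrics, and Cohen's d for paired samples. The DataFrame layout and column names are hypothetical, not our actual pipeline.

```python
import pandas as pd
from scipy import stats
from statsmodels.stats.multitest import multipletests

def paired_sauna_effects(df, metrics):
    """df has one row per user, with '<metric>_sauna' and '<metric>_no_sauna'
    columns holding that user's average on sauna vs. non-sauna days."""
    rows = []
    for m in metrics:
        a, b = df[f"{m}_sauna"], df[f"{m}_no_sauna"]
        t, p = stats.ttest_rel(a, b)          # paired t-test
        diff = a - b
        d = diff.mean() / diff.std(ddof=1)    # Cohen's d for paired samples
        rows.append({"metric": m, "t": t, "p": p, "cohens_d": d})
    out = pd.DataFrame(rows)
    # Benjamini-Hochberg correction across all metrics tested
    out["p_fdr"] = multipletests(out["p"], method="fdr_bh")[1]
    out["robust"] = (out["p_fdr"] < 0.05) & (out["cohens_d"].abs() > 0.2)
    return out

# Example (hypothetical columns): activity and minimum nighttime heart rate
# results = paired_sauna_effects(daily_summary, ["activity", "min_hr"])
```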
Females showed larger activity increases on sauna days, which may reflect more consistent sauna use on workout days. However, females showed a smaller drop in minimum heart rate than males on sauna days.
As we discussed in previous blogs, the menstrual cycle can influence recovery and nighttime heart rate. For this reason, we evaluated sauna effects across the follicular and luteal phases and observed statistically higher activity and lower heart rate when women use the sauna in their luteal phase. In fact, nighttime heart rate was only meaningfully lower (Cohen's d > 0.2) compared to non-sauna days during the luteal phase. In other words, the benefits of saunas seem to appear only during the luteal phase…
Sauna use is part of a recovery‑oriented day. Sauna days are more active, which fits how people actually use saunas, often as a post‑workout routine. Yet even after accounting for activity, nighttime minimum heart rate is lower on sauna days, suggesting a physiological recovery signal beyond exercise alone.
Mechanistically, this pattern is consistent with known heat‑stress physiology: heart rate increases during sauna exposure, followed by recovery dynamics that can reflect increased parasympathetic influence during cooling[6][7]. Within women, the strongest recovery signal in our dataset appears in the luteal phase, where the effect size crosses a meaningful threshold.
1. Ketelhut, S., & Ketelhut, R. G. (2019). The blood pressure and heart rate during sauna bath correspond to cardiac responses during submaximal dynamic exercise. Complementary Therapies in Medicine, 44, 218–222. https://doi.org/10.1016/j.ctim.2019.05.002
2. Laukkanen, T., Kunutsor, S. K., Khan, H., et al. (2018). Sauna bathing is associated with reduced cardiovascular mortality and improves risk prediction in men and women: a prospective cohort study. BMC Medicine, 16, 219. (PMC: PMC6262976)
3. Kuan, W. H., Chen, Y. L., & Liu, C. L. (2022). Excretion of Ni, Pb, Cu, As, and Hg in Sweat under Two Sweating Conditions. International Journal of Environmental Research and Public Health, 19(7), 4323. https://doi.org/10.3390/ijerph19074323
4. Marcussen, W. (2019, August 23). The Roman Baths in Bath: A Deep Dive into Britain’s Ancient History. World History Encyclopedia. https://www.worldhistory.org/article/1427/the-roman-baths-in-bath–a-deep-dive-into-britains/
5. Laukkanen, J. A., Laukkanen, T., & Kunutsor, S. K. (2018). Cardiovascular and Other Health Benefits of Sauna Bathing: A Review of the Evidence. Mayo Clinic Proceedings, 93(8), 1111–1121. (PubMed: 30077204)
...
Read the original on tryterra.co »
Today, we are super excited to announce the alpha-release of ggsql. As the name suggests, ggsql is an implementation of the grammar of graphics based on SQL syntax, bringing rich, structured visualization support to SQL. It is ready for use in Quarto, Jupyter notebooks, Positron and VS Code among others.
In this post we will go over some of the motivations that led us to develop this tool, as well as give you ample examples of its use, so you can hopefully get as excited about it as we are.
Before we discuss the why, let’s see what ggsql is all about with some examples.
To get our feet wet, let's start with the hello-world of visualizations: a scatterplot using the built-in penguins dataset:
That wasn’t too bad. Sure, it has the verbosity of SQL, but that also means that you can speak your plot code out loud and understand what it does. We can break down what is going on here line-by-line:
We initiate the visual query with VISUALIZE and provide a mapping from the built-in penguins dataset, relating x to the data in the bill_len column, and y in the bill_dep column.
We draw a point layer that, by default, uses the mapping we defined at the top.
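The rendered code block does not survive in this excerpt, so here is a rough sketch of the query being described, pieced together from the clause descriptions in this post. Treat it as an approximation: the `column AS aesthetic` form follows the "a mapping is like a SELECT" analogy used later, but the exact way the penguins source is named may differ from real ggsql syntax (see the documentation for the authoritative form).

```sql
-- Approximate shape only; consult the ggsql docs for exact syntax
VISUALIZE bill_len AS x, bill_dep AS y FROM penguins
DRAW point
```

Adding something like `species AS color` to the mapping, and later a second smooth-type layer, gives the colored and smoothed variants discussed next.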
With this in place, we can begin to add to the visualization:
We see that a single addition to the mappings adds colored categories to the plot. This gradual evolution of plot code is one of the biggest strengths of the grammar of graphics. There are no predefined plot types, only modular parts that can be combined, added, and removed. To further emphasize this, let’s add a smooth regression line to the plot:
We add a new layer on top of the point layer. This layer also borrows the same mapping as the point layer. Since we color by species, the smooth line is split into one for each species.
We can continue doing this, adding more mappings, adding or swapping layers, controlling how scales are applied, and so on, until we arrive at the plot we need, however simple or complicated it may be. In the above example we may well end up deciding we are more interested in looking at the distribution of species across the three islands the data was collected from:
While a completely different plot, you can see how much of the code from the previous plot carries over.
With our first couple of plots under the belt, let's move on to a complete example. It contains parts we have not seen before, but don't worry, we will go through it all below, even the parts we've already seen. The example is an adaptation of a visualization created by Jack Davison for TidyTuesday.
That was a lot of code, but on the flip-side we have now covered a lot of the most important aspects of the syntax with one example.
At the topmost level there are two parts to this query: The SQL query, and the visualization query. The SQL query is anything from the beginning to the VISUALIZE clause. It is your standard SQL, and it accepts anything your backend accepts (in this blog post we use a DuckDB backend). The result of the query is funnelled directly into the visualization rather than being returned as a table like you’d normally expect.
Since the point of this post is not to teach you SQL we won't spend much more time discussing the SQL query part. The main takeaway is that everything before the VISUALIZE clause is pure SQL, any resulting table is automatically used by your visualization, and any table or CTE created there is available for referencing in the visualization query.
As we saw in the first examples, the SQL query part is optional. If your data is already in the right shape for plotting you can skip it and instead name the source directly in the VISUALIZE clause:
Now, let's look at the visual query — everything from VISUALIZE and onwards. VISUALIZE marks the end of the SQL query and the beginning of the visualization query (or VISUALISE for those who prefer UK spelling). It can stand on its own or, as we do here, have one or more mappings which will become defaults for every subsequent layer. Mappings are purely for relating data to abstract visual properties. A mapping is like a SELECT where you alias columns to visual properties (called aesthetics in the grammar of graphics). In the visualization above we say that the age column holds the values used for x (position along the x axis) and the category column holds the values used for fill (the fill color of the entity). We do not say anything about how to draw it yet.
Following the VISUALIZE clause we have a DRAW clause. DRAW is how we add layers to our visualization. There is a large selection of different layer types in ggsql. Some are straightforward: e.g. point for drawing a scatterplot. Some are more involved: histogram (which we use here) requires calculation of derived statistics like binned counts. A visualization can have any number of layers, and layers will be rendered in the sequence they are defined. DRAW has a sibling clause called PLACE. It is used for annotation and works like DRAW except it doesn't get data from a table but rather from literal values provided directly in the clause. It follows that our visualization above contains three layers: a histogram layer showing data from our table, a rule annotation layer showing precomputed mean values for each category, and a text annotation layer adding context to the visualization. It is worth mentioning that a layer does not correspond to a single graphical entity. Like with the text layer above, each layer can render multiple separate entities of its type, so there is no need to have e.g. 3 line layers to render line plots for 3 different categories.
After the DRAW and PLACE clauses we have a SCALE clause. This clause controls how data values are translated into values that are meaningful for the aesthetic. In our case, the category column holds the strings “Age at mission” and “Age at selection” which doesn’t in itself translate to a color value. The clause SCALE fill TO accent tells ggsql to use the “accent” color palette when converting the values mapped to fill to actual colors. Scales can be used for much more, like applying transformations to continuous data, defining break points, and setting specific scale types (like ordinal or binned).
The last clause in our visual query is LABEL which allows us to add or modify various text labels like title, subtitle, and axis and legend titles.
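Putting the walkthrough together, the overall shape of the full query is roughly as follows. This is a hedged reconstruction from the descriptions above rather than the original code: the clause keywords (VISUALIZE, DRAW, PLACE, SCALE ... TO, LABEL) come straight from the post, but the argument forms, especially for PLACE and LABEL, are assumptions and are left as placeholders.

```sql
-- Pure SQL up front: reshape the astronaut data for plotting (DuckDB backend)
WITH ages AS (
    SELECT ... FROM astronauts          -- reshaping elided
)
-- Visualization query: default mappings, three layers, a scale, and labels
VISUALIZE age AS x, category AS fill
DRAW histogram
PLACE rule ...                          -- annotation: precomputed mean age per category
PLACE text ...                          -- annotation: contextual text labels
SCALE fill TO accent
LABEL ...                               -- title, subtitle, axis and legend titles
```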
That was a mouthful. But there are two silver linings to it all:
You now know the most important aspects of the syntax (there are more, of course, but you can grow into that)
Many visualization queries will be much simpler than the one above
We have already seen examples of shorter visual queries above but let’s continue with a boxplot of astronaut birth year split by sex:
That's much shorter than the last plot code but still, if you are coming from a different plotting system you may even think this is overly verbose (e.g. compared to something like boxplot(astronauts.sex, astronauts.year_of_birth)). Yes, it is longer, but it is also more structured, composable, and self-descriptive. These features (which are a direct result of its grammar of graphics lineage) mean that both you and your future LLM coding buddy will have an easier time internalizing the workings of all types of plots that can be made. The 18 years of dominance of ggplot2 (which shares these features) in the R ecosystem is a testament to this.
As an example, let’s change the above plot to instead show the same relationship as a jittered scatterplot.
Or perhaps the jitter follows the distribution of the data so it doubles as a violin plot:
As you can see the syntax and composable nature makes visualization iteration very ergonomic, something that is extremely valuable in both explorative analyses and visualization design.
Writing a new visualization library from scratch is a big task and you might wonder why we’re doing it again. Some of the reasons are:
* We want to engage with and help data analysts and data scientists that predominantly work in SQL
* SQL and the grammar of graphics fit together extremely well
* We want to create an extremely powerful, code-based visualization tool that doesn't require an entire programming language (like R or Python)
* LLMs speak SQL very well, and ggsql thus presents a new interface to data visualization creation
* We have learned so much from 18 years of ggplot2 development, and we're excited to apply it to a blank canvas
While first R and then Python captured all the attention of the data science revolution, SQL chugged along as the reliable and powerful workhorse beneath it all. There are many people who work with data that do so only or predominantly in SQL. The choices they have for visualizing their data are often suboptimal in our view:
* Export the data and use R or Python which may not be within their comfort zone
* Use a GUI-based BI tool with poor support for reproducibility
* Rely on one of the few tools that exist for creating visualizations directly within the query, which we feel are not powerful or ergonomic enough
Our goal when designing ggsql was that the syntax should immediately make sense to SQL users, tapping into their expectation of composable, declarative clauses.
Apart from offering a better way to visualize their data, ggsql is also a way to invite SQL users into our rich ecosystem of code-based report generation and sharing built on top of Quarto.
If you are reading this with no prior knowledge of SQL, here’s a very brief recap: SQL is a domain specific language for manipulating relational data stored in one or more tables. The syntax is based on the concept of relational algebra which is a structured way to think about data manipulation operations. The semantics defines a set of modular operations that are declarative rather than functional, allowing the user to compose very powerful and custom manipulations using a well-defined set of operations.
If you are reading this with no prior knowledge of the grammar of graphics, here’s a very brief recap: The grammar of graphics is a theoretical deconstruction of the concepts of data visualization into its modular parts. While purely theoretical, tools such as ggplot2 have implemented the idea in practice. The semantics defines a set of modular operations that are declarative rather than functional, allowing the user to compose very powerful and custom visualizations using a well-defined set of operations.
From the above, slightly hyperbolic, overview it is clear that both SQL and the grammar of graphics have a lot of commonality in their approach to their respective domains. Together they can offer a very powerful and natural solution to the full pipeline from raw data to final visualization.
Why does it matter that ggplot2 and plotnine require R and Python to be installed, respectively? There are clear benefits to a single, focused executable to handle data visualization:
* Embedding a small executable in other tools is much easier than bundling R/Python (or requiring them to be installed)
* A smaller scope makes it easier to sandbox and prevent malicious code execution (either deliberately or in error)
Both of the above points make ggsql a much more compelling option for integrating into tools such as AI agents assisting you in data analysis, or code based reporting tools that may execute code in different environments.
You may think we have had to swallow some bitter pills by moving away from an interpreted language, but the move has also given us a lot. Most importantly, the rigid structure means that we can execute the whole data pipeline as a single SQL query per layer on the backend. This means that if you want to create a bar plot of 10 billion transactions, you only ever fetch the count values for each bar from your data warehouse, not the 10 billion rows of data. The same is true for more complicated layer types such as boxplots and density plots. This is in stark contrast to most visualization tools, which must first materialize the complete data, then perform the necessary computations on it, and then plot it.
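As an illustration of that pushdown, a bar chart of transaction counts only needs one aggregated row per bar. The sketch below is not ggsql’s actual generated query, just the kind of standard SQL a count-per-category layer could send to the warehouse; the transactions table and category column are assumed for the example.

```sql
-- Only one row per bar comes back from the warehouse,
-- regardless of how many raw transactions exist.
SELECT category, COUNT(*) AS n
FROM transactions
GROUP BY category;
```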
LLMs have proven very effective at translating natural language into SQL, and we’re bullish that they can be just as effective with ggsql. We’ve already seen evidence of this in querychat, where you can now visually explore data using natural language via ggsql. And, since ggsql is a much safer and lighter runtime than R or Python, you can much more confidently ship coding agents into a production environment.
18 years of ggplot2 development and maintenance also means 18 years of thinking about data visualization syntax, use, and design. Without trying to be boastful, we do believe that gives us some expert knowledge of the subject matter. However, not all of this knowledge can be poured back into ggplot2. There are decisions and expectations established many years ago that we have to honor, or at least only challenge very gradually (which we do on occasion).
ggsql is a blank slate. Not only in the sense that we are building it from the ground up, but also in that it is built for an environment with no established expectations for a visualization tool. I cannot stress enough how liberating and invigorating this has felt, and I am positive that this shines through in how ggsql feels to the user.
We are nearing the end of a rather long announcement — thanks for sticking with us. In the very first line we called this an alpha-release which implies that we are not done yet. To get you as excited about the future as you hopefully are about the present state of ggsql, here is a non-exhaustive list of things we want to add.
* New high-performance writer, written from the ground up in Rust
If you are a current ggplot2 user you may have read this with a mix of fear and excitement (or maybe just one of them). Does this mean that we are leaving ggplot2 behind at Posit to focus on our new shiny toy? Not at all! ggplot2 is very mature and stable at this point but we will continue to support and build it out. We also hope that ggsql can pay back all the experience from ggplot2 that went into its development by informing new features in ggplot2.
If you can’t wait to learn more about ggsql and begin to use it, you can head to the Getting started section of the ggsql website for installation instructions and a tutorial, or head straight to the documentation to discover everything ggsql is capable of. We can’t wait for you to try it out and hear about your experiences with it.
...
Read the original on opensource.posit.co »
As the internet chokes on ever more slop, the one thing that gives me hope is this: people seem to loathe AI, and are actively resisting it. This won’t be a long post, as I’m personally so tired of writing and thinking about AI at this point in time, but I do want to draw your attention here to some recent anti-AI stuff that’s worth discussing.
r/PoisonFountain, created by individuals who claim to be concerned AI industry insiders, is a community with one goal: encourage as many people as possible to feed huge quantities of trash data (poison) to all of the web crawlers out there that are scraping our work for AI training sets. They aim to serve one terabyte of poison per day to these crawlers by the end of 2026.
The poison fountain itself is hosted on rnsaffn.com, sandwiched between several garbage links that look irresistible to AI crawlers; it produces a page of code that seems correct at first glance, but is actually riddled with subtle errors that render the code unusable. Filtering out these errors is possible, but expensive at scale. Since these companies can’t improve their AI models without fresh data created by human beings, the idea here is to waste their time and make it expensive for them to steal our data.
Miasma is one example of a tool that uses the fountain to serve massive amounts of garbage to malicious bots. The developer describes it as “an endless buffet of slop for the slop machines,” which is delightful. I can’t use Miasma with my site’s setup, but it may be of interest to those of you who could. I deliver my trash to crawlers using other means … some visible, some invisible. While I can’t serve it up to anywhere near the same extent as Miasma can, I do catch sneaky bots with my junk links every day.
If you’re pro-AI and feel outraged on behalf of these companies that anyone would dare try to make life difficult for them, please know that this is simply a case of tit for tat. The teams that send AI crawlers out into the world wide web are DDoSing small websites on the regular and raising hosting fees for everyone with their voracious desire to devour the entire internet. They do not obey robots.txt, and often hide their crawlers behind residential proxies. If they can’t source training data ethically, then I see absolutely no reason why any website operator should make it easy for them to steal it.
Someone Figured Out How To Poison AI Video Summarizers
Thanks to r/PoisonFountain, I learned that YouTube has no .ass. I could try to explain what that means, but the video is hilarious and well worth a watch, so I’ll leave it up to @f4mi.
Sadly, it looks like the poisoning technique used by the creator in this video no longer works; YouTube presumably fixed the transcript loophole she was exploiting here. I plugged a few of her video URLs into a few different video summarizers, and they all failed to tell me anything that wasn’t actually in the videos.
Still, it’s great to see people trying and succeeding at fucking with the slop machines — even if that success is only temporary.
All over Reddit and other social media platforms, I’m increasingly seeing stuff like this:
I mean, sure, it’s literally misinformation and you could indeed argue that there’s already enough misinformation on the internet as it is … but it’s important to note here that bots, not people, are the target audience of this misinformation.
I think most of us can understand from the context that Idris Elba did not ever play Raymond’s mother in an episode of Everybody Loves Raymond. Automated web scrapers, however, will just see what looks like good human-generated data, which is what they want. They’re going to merrily scrape that garbage from Reddit and send it back to OpenAI or whomever, who will then have to waste resources removing it from their training data sets.
This isn’t exactly the modern equivalent of angry textile workers destroying power looms, but (if you’ll forgive the pun) it’s cut from the same cloth. The difference here (I hope) is that if enough of us pollute public spaces with misinformation intended for bots, it might be enough to compel AI companies to rethink the way they source training data.
People hate what AI is doing to our world. They hate what it’s doing to our online communities, what it’s doing to our environment, what it’s doing to our elementary schools and universities, what it’s doing to at-risk individuals with mental health issues, what it’s doing (and may yet still do) to our livelihoods. While there are certainly plenty of people out there who happily consume and generate massive amounts of AI slop, they are — at least in my anecdotal experience within my own social circles, both offline and online — dwarfed by people who detest and want nothing to do with this technology.
Hatred of a thing seldom leads anywhere good, as recent events demonstrate, but I do think that if people are able to translate what they’re feeling about AI into peaceful, legal acts of resistance, then we might actually stand to change the way Silicon Valley does things.
...
Read the original on stephvee.ca »
The self-driving car promised a dream; for some road users, it is turning into a nightmare. An investigation reveals how Elon Musk and Tesla used public roads as a testing ground by rushing an AI-based autonomous driving system to market.
The automaker kept quiet about thousands of serious incidents. Some cost drivers and passengers their lives. Other road users found themselves involved without knowing it.
The investigation is based on a massive leak of internal Tesla data. These documents reveal the scale of the problem: the manufacturer had been aware of its systems’ failures for years.
The files show thousands of customer complaints. More than 2,400 concern spontaneous acceleration, and the number of accidents exceeds 1,000. In many cases, the status listed was “unresolved”.
Some Tesla cars accelerated or braked abruptly for no reason. In artificial intelligence, these malfunctions are called “hallucinations”, as when ChatGPT gives a completely wrong answer.
On the road, the consequences are disastrous. The autonomous driving system can misinterpret its surroundings, and at high speed these errors become deadly.
“I didn’t know Autopilot existed. When I found out, I felt like a guinea pig” - Dillon Angulo, involved in an accident with a Tesla
The problem affects all road users. While many never agreed to be Tesla’s guinea pigs, they find themselves exposed, despite themselves, to the failures of the “Autopilot” system.
>> Read more on this topic: Drivers still “guinea pigs” of driver-assistance systems
Naibel Benavides was 22 years old. Just a pedestrian, she was killed in an accident involving a Tesla in “Autopilot” mode. Her partner Dillon Angulo survived with serious injuries.
“I didn’t know Autopilot existed. When I found out, I felt like a guinea pig,” says Dillon Angulo, who still suffers from the consequences of the accident today.
Naibel’s family decided to take Tesla to court, accusing the manufacturer of hiding crucial information. Tesla has always put the blame on the driver.
Investigators ran into unusual obstacles. The crash data should have been available in the vehicle’s “black box”, yet Tesla claimed the data was corrupted.
The victims’ lawyers brought in experts, who managed to recover the deleted data. That information proves Tesla knew about the failure on the very evening of the accident.
The car, in “Autopilot” mode, had detected the obstacles. Yet it did nothing to avoid the collision; only an alert sounded just before impact.
A jury ordered Tesla to pay more than $243 million in damages. The penalty is a first in cases involving “Autopilot”. The jurors found that both Tesla and the driver were responsible.
“This is a historic day for justice,” said the victims’ lawyer. The verdict shows that manufacturers cannot use public roads as a laboratory.
Tesla tried to have the verdict overturned. In late February, a federal judge upheld the penalty against the manufacturer. The company can still appeal.
Tesla is the subject of several investigations in the United States. The Department of Justice is examining whether the manufacturer deceived consumers, and the National Highway Traffic Safety Administration is also investigating.
>> Also read: Tesla avoids a lengthy trial over its driver-assistance technology
Whistleblowers have testified to the authorities. They describe a company that puts speed ahead of safety: the test version of autonomous driving was rushed to market even though several employees had warned management about the dangers of “Autopilot”.
Experts expect further lawsuits to follow. The first verdict opens the way to new trials against Tesla.
...
Read the original on www.rts.ch »
Deezer announced on Monday that AI-generated tracks now represent 44% of all new music uploaded to its platform. The company said it’s receiving almost 75,000 AI-generated tracks per day and more than two million per month.
The consumption of AI-generated music on the platform is still very low, at 1-3% of total streams, and 85% of these streams are detected as fraudulent and demonetized by the company.
The latest figure from Deezer highlights a continuous surge in AI-generated music uploads to the platform. Deezer reported receiving around 60,000 AI tracks per day in January, up from 50,000 in November, 30,000 in September, and just 10,000 in January 2025, when it first launched its AI-music detection tool.
Songs tagged as AI-generated on Deezer are automatically removed from algorithmic recommendations and not included in editorial playlists. The company announced today that it will no longer store hi-res versions of AI tracks.
The updated figure comes as an AI-generated track topped the iTunes charts last week in the United States, United Kingdom, France, Canada, and New Zealand.
“AI-generated music is now far from a marginal phenomenon and as daily deliveries keep increasing, we hope the whole music ecosystem will join us in taking action to help safeguard artists’ rights and promote transparency for fans,” said Deezer CEO Alexis Lanternier in a press release. “Thanks to our technology and the proactive measures we put in place more than a year ago, we have shown that it’s possible to reduce AI-related fraud and payment dilution in streaming to a minimum.”
Today’s announcement also follows a survey Deezer conducted last November, which found that 97% of participants couldn’t tell the difference between fully AI-generated music and human-made music.
The survey also found that 52% of respondents said 100% AI-generated songs shouldn’t be included alongside human-made songs in the main charts. Meanwhile, 80% said 100% AI-generated music should be clearly labeled for listeners.
Deezer started tagging AI tracks at the platform level in June 2025, becoming the first streaming platform to do so. Over the course of 2025, Deezer tagged more than 13.4 million AI tracks on its platform.
In February, French streaming service Qobuz announced plans to tag AI-generated content on its platform. Other major streaming services, such as Spotify and Apple Music, take different approaches to AI-generated music, often combining the use of filters to identify low-quality AI music with other transparency efforts left up to the distributors.
...
Read the original on techcrunch.com »