10 interesting stories served every morning and every evening.
The U.S. Border Patrol is monitoring millions of American drivers nationwide in a secretive program to identify and detain people whose travel patterns it deems suspicious, The Associated Press has found.
The predictive intelligence program has resulted in people being stopped, searched and in some cases arrested. A network of cameras scans and records vehicle license plate information, and an algorithm flags vehicles deemed suspicious based on where they came from, where they were going and which route they took. Federal agents may then tip off local law enforcement.
Suddenly, drivers find themselves pulled over — with cited reasons such as speeding, failure to signal, the wrong window tint or even a dangling air freshener blocking the view. They are then aggressively questioned and searched, with no inkling that the roads they drove put them on law enforcement’s radar.
Once limited to policing the nation’s boundaries, the Border Patrol has built a surveillance system stretching into the country’s interior that can monitor ordinary Americans’ daily actions and connections for anomalies instead of simply targeting wanted suspects. Started about a decade ago to fight illegal border-related activities and the trafficking of both drugs and people, it has expanded over the past five years.
The Border Patrol has recently grown even more powerful through collaborations with other agencies, drawing information from license plate readers nationwide run by the Drug Enforcement Administration, private companies and, increasingly, local law enforcement programs funded through federal grants. Texas law enforcement agencies have asked Border Patrol to use facial recognition to identify drivers, documents show.
This active role beyond the borders is part of the quiet transformation of its parent agency, U.S. Customs and Border Protection, into something more akin to a domestic intelligence operation. Under the Trump administration’s heightened immigration enforcement efforts, CBP is now poised to get more than $2.7 billion to build out border surveillance systems such as the license plate reader program by layering in artificial intelligence and other emerging technologies.
The result is a mass surveillance network with a particularly American focus: cars.
This investigation, the first to reveal details of how the program works on America’s roads, is based on interviews with eight former government officials with direct knowledge of the program who spoke on the condition of anonymity because they weren’t authorized to speak to the media, as well as dozens of federal, state and local officials, attorneys and privacy experts. The AP also reviewed thousands of pages of court and government documents, state grant and law enforcement data, and arrest reports.
The Border Patrol has for years hidden details of its license plate reader program, trying to keep any mention of the program out of court documents and police reports, former officials say, even going so far as to propose dropping charges rather than risk revealing any details about the placement and use of their covert license plate readers. Readers are often disguised along highways in traffic safety equipment like drums and barrels.
The Border Patrol has defined its own criteria for which drivers’ behavior should be deemed suspicious or tied to drug or human trafficking, stopping people for anything from driving on backcountry roads to being in a rental car to making short trips to the border region. The agency’s network of cameras now extends along the southern border in Texas, Arizona and California, and also monitors drivers traveling near the U.S.-Canada border.
And it reaches far into the interior, impacting residents of big metropolitan areas and people driving to and from large cities such as Chicago and Detroit, as well as from Los Angeles, San Antonio, and Houston to and from the Mexican border region. In one example, AP found the agency has placed at least four cameras in the greater Phoenix area over the years, one of which was more than 120 miles (193 kilometers) from the Mexican frontier, beyond the agency’s usual jurisdiction of 100 miles (161 kilometers) from a land or sea border. The AP also identified several camera locations in metropolitan Detroit, as well as one placed near the Michigan-Indiana border to capture traffic headed towards Chicago or Gary, Indiana, or other nearby destinations.
Border Patrol’s parent agency, U.S. Customs and Border Protection, said it uses license plate readers to help identify threats and disrupt criminal networks and is “governed by a stringent, multi-layered policy framework, as well as federal law and constitutional protections, to ensure the technology is applied responsibly and for clearly defined security purposes.”
“For national security reasons, we do not detail the specific operational applications,” the agency said. While the U.S. Border Patrol primarily operates within 100 miles of the border, it is legally allowed “to operate anywhere in the United States,” the agency added.
While collecting license plates from cars on public roads has generally been upheld by courts, some legal scholars see the growth of large digital surveillance networks such as Border Patrol’s as raising constitutional questions. Courts have started to recognize that “large-scale surveillance technology that’s capturing everyone and everywhere at every time” might be unconstitutional under the Fourth Amendment, which protects people from unreasonable searches, said Andrew Ferguson, a law professor at George Washington University.
Today, predictive surveillance is embedded into America’s roadways. Mass surveillance techniques are also used in a range of other countries, from authoritarian governments such as China to, increasingly, democracies in the U.K. and Europe in the name of national security and public safety.
“They are collecting mass amounts of information about who people are, where they go, what they do, and who they know … engaging in dragnet surveillance of Americans on the streets, on the highways, in their cities, in their communities,” Nicole Ozer, the executive director of the Center for Constitutional Democracy at UC Law San Francisco, said in response to the AP’s findings. “These surveillance systems do not make communities safer.”
In February, Lorenzo Gutierrez Lugo, a driver for a small trucking company that specializes in transporting furniture, clothing and other belongings to families in Mexico, was driving south to the border city of Brownsville, Texas, carrying packages from immigrant communities in South Carolina’s low country.
Gutierrez Lugo was pulled over by a local police officer in Kingsville, a small Texas city near Corpus Christi that lies about 100 miles from the Mexican border. The officer, Richard Beltran, cited the truck’s speed of 50 mph (80 kph) in a 45 mph (72 kph) zone as the reason for the stop.
But speeding was a pretext: Border Patrol had requested the stop and said the black Dodge pickup with a white trailer could contain contraband, according to police and court records. U.S. Route 77 passes through Kingsville, a route that state and federal authorities scrutinize for trafficking of drugs, money and people.
Gutierrez Lugo, who through a lawyer declined to comment, was interrogated about the route he drove, based on license plate reader data, per the police report and court records. He consented to a search of his car by Beltran and Border Patrol agents, who eventually arrived to assist.
They unearthed no contraband. But Beltran arrested Gutierrez Lugo on suspicion of money laundering and engaging in organized criminal activity because he was carrying thousands of dollars in cash — money his supervisor said came directly from customers in local Latino communities, who are accustomed to paying in cash. No criminal charges were ultimately brought against Gutierrez Lugo and an effort by prosecutors to seize the cash, vehicle and trailer as contraband was eventually dropped.
Luis Barrios owns the trucking company, Paquetería El Guero, that employed the driver. He told AP he hires people with work authorization in the United States and was taken aback by the treatment of his employee and his trailer.
“We did everything right and had nothing to hide, and that was ultimately what they found,” said Barrios, who estimates he spent $20,000 in legal fees to clear his driver’s name and get the trailer out of impound.
Border Patrol agents and local police have many names for these kinds of stops: “whisper,” “intel” or “wall” stops. Those stops are meant to conceal — or wall off — that the true reason for the stop is a tip from federal agents sitting miles away, watching data feeds showing who’s traveling on America’s roads and predicting who is “suspicious,” according to documents and people interviewed by the AP.
In 2022, a man from Houston had his car searched from top to bottom by Texas sheriff’s deputies outside San Antonio after they got a similar tipoff from Border Patrol agents about the driver, Alek Schott.
Federal agents observed that Schott had made an overnight trip from Houston to Carrizo Springs, Texas, and back, court records show. They knew he stayed overnight in a hotel about 80 miles (129 kilometers) from the U.S.-Mexico border. They knew that in the morning Schott met a female colleague there before they drove together to a business meeting.
At Border Patrol’s request, Schott was pulled over by Bexar County sheriff’s deputies. The deputies held Schott by the side of the road for more than an hour, searched his car and found nothing.
“The beautiful thing about the Texas Traffic Code is there’s thousands of things you can stop a vehicle for,” said Joel Babb, the sheriff’s deputy who stopped Schott’s car, in a deposition in a lawsuit Schott filed alleging violations of his constitutional rights.
According to testimony and documents released as part of Schott’s lawsuit, Babb was on a group chat with federal agents called Northwest Highway. Babb deleted the WhatsApp chat off his phone but Schott’s lawyers were able to recover some of the text messages.
Through a public records act request, the AP also obtained more than 70 pages of the Northwest Highway group chats from June and July of this year from a Texas county that had at least one sheriff’s deputy active in the chat. The AP was able to associate numerous phone numbers in both sets of documents with Border Patrol agents and Texas law enforcement officials.
The chat logs show Border Patrol agents and Texas sheriff’s deputies trading tips about vehicles’ travel patterns — based on suspicions about little more than someone taking a quick trip to the border region and back. The chats show how thoroughly Texas highways are surveilled by this federal-local partnership and how much detailed information is informally shared.
In one exchange a law enforcement official included a photo of someone’s driver’s license and told the group the person, who they identified using an abbreviation for someone in the country illegally, was headed westbound. “Need BP?” responded a group member whose number was labeled “bp Intel.” “Yes sir,” the official answered, and a Border Patrol agent was en route.
Border Patrol agents and local law enforcement shared information about U.S. citizens’ social media profiles and home addresses with each other after stopping them on the road. Chats show Border Patrol was also able to determine whether vehicles were rentals and whether drivers worked for rideshare services.
In Schott’s case, Babb testified that federal agents “actually watch travel patterns on the highway” through license plate scans and other surveillance technologies. He added: “I just know that they have a lot of toys over there on the federal side.”
After finding nothing in Schott’s car, Babb said “nine times out of 10, this is what happens,” a phrase Schott’s lawyers claimed in court filings shows the sheriff’s department finds nothing suspicious in most of its searches. Babb did not respond to multiple requests for comment from AP.
The Bexar County sheriff’s office declined to comment due to pending litigation and referred all questions about the Schott case to the county’s district attorney. The district attorney did not respond to a request for comment.
The case is pending in federal court in Texas. Schott said in an interview with the AP: “I didn’t know it was illegal to drive in Texas.”
Today, the deserts, forests and mountains of the nation’s land borders are dotted with checkpoints and increasingly, surveillance towers, Predator drones, thermal cameras and license plate readers, both covert and overt.
Border Patrol’s parent agency got authorization to run a domestic license plate reader program in 2017, according to a Department of Homeland Security policy document. At the time, the agency said that it might use hidden license plate readers “for a set period of time while CBP is conducting an investigation of an area of interest or smuggling route. Once the investigation is complete, or the illicit activity has stopped in that area, the covert cameras are removed,” the document states.
But that’s not how the program has operated in practice, according to interviews, police reports and court documents. License plate readers have become a major — and in some places permanent — fixture of the border region.
In a budget request to Congress in fiscal year 2024, CBP said that its Conveyance Monitoring and Predictive Recognition System, or CMPRS, “collects license plate images and matches the processed images against established hot lists to assist … in identifying travel patterns indicative of illegal border related activities.” In recent months, several new developer job postings have sought applicants to help modernize its license plate surveillance system. Numerous Border Patrol sectors now have special intelligence units that can analyze license plate reader data and tie commercial license plate readers into the agency’s national network, according to documents and interviews.
Border Patrol worked with other law enforcement agencies in Southern California about a decade ago to develop pattern recognition, said a former CBP official who spoke on the condition of anonymity for fear of reprisal. Over time, the agency learned to develop what it calls “patterns of life” of vehicle movements by sifting through the license plate data and determining “abnormal” routes, evaluating if drivers were purposely avoiding official checkpoints. Some cameras can take photos of a vehicle’s plates as well as its driver’s face, the official said.
Another former Border Patrol official compared it to a more technologically sophisticated version of what agents used to do in the field — develop hunches based on experience about which vehicles or routes smugglers might use, find a legal basis for the stop like speeding and pull drivers over for questioning.
The cameras take pictures of vehicle license plates. Then, the photos are “read” by the system, which automatically detects and distills the images into numbers and letters, tied to a geographic location, former CBP officials said. The AP could not determine how precisely the system’s algorithm defines a quick turnaround or an odd route. Over time, the agency has amassed databases replete with images of license plates, and the system’s algorithm can flag an unusual “pattern of life” for human inspection.
The Border Patrol also has access to a nationwide network of plate readers run by the Drug Enforcement Administration, documents show, and was authorized in 2020 to access license plate reader systems sold by private companies. In documents obtained by the AP, a Border Patrol official boasted about being able to see that a vehicle had traveled to “Dallas, Little Rock, Arkansas and Atlanta” before ending up south of San Antonio.
Documents show that Border Patrol or CBP has in the past had access to data from at least three private sector vendors: Rekor, Vigilant Solutions and Flock Safety.
Through Flock alone, Border Patrol for a time had access to at least 1,600 license plate readers across 22 states, and some counties have reported looking up license plates on behalf of CBP even in states like California and Illinois that ban sharing data with federal immigration authorities, according to an AP analysis of police disclosures. A Flock spokesperson told AP the company “for now” had paused its pilot programs with CBP and a separate DHS agency, Homeland Security Investigations, and declined to discuss the type or volume of data shared with either federal agency, other than to say agencies could search for vehicles wanted in conjunction with a crime. No agencies currently list Border Patrol as receiving Flock data. Vigilant and Rekor did not respond to requests for comment.
Where Border Patrol places its cameras is a closely guarded secret. However, through public records requests, the AP obtained dozens of permits the agency filed with Arizona and Michigan for permission to place cameras on state-owned land. The permits show the agency frequently disguises its cameras by concealing them in traffic equipment like the yellow and orange barrels that dot American roadways, or by labeling them as jobsite equipment. An AP photographer in October visited the locations identified in more than two dozen permit applications in Arizona, finding that most of the Border Patrol’s hidden equipment remains in place today. Spokespeople for the Arizona and Michigan departments of transportation said they approve permits based on whether they follow state and federal rules and are not privy to details on how license plate readers are used.
Texas, California, and other border states did not provide documents in response to the AP’s public records requests.
CBP’s attorneys and personnel instructed local cities and counties in both Arizona and Texas to withhold records from the AP that might have revealed details about the program’s operations, even though they were requested under state open records laws, according to emails and legal briefs filed with state governments. For example, CBP claimed records requested by the AP in Texas “would permit private citizens to anticipate weaknesses in a police department, avoid detection, jeopardize officer safety, and generally undermine police efforts.” Michigan redacted the exact locations of Border Patrol equipment, but the AP was able to determine general locations from the name of the county.
One page of the group chats obtained by the AP shows that a participant enabled WhatsApp’s disappearing messages feature to ensure communications were deleted automatically.
The Border Patrol’s license plate reader program is just one part of a steady transformation of its parent agency, CBP, in the years since 9/11 into an intelligence operation whose reach extends far beyond borders, according to interviews with former officials.
CBP has quietly amassed access to far more information from ports of entry, airports and intelligence centers than other local, state and federal law enforcement agencies. And like a domestic spy agency, CBP has mostly hidden its role in the dissemination of intelligence on purely domestic travel through its use of whisper stops.
Border Patrol has also extended the reach of its license plate surveillance program by paying for local law enforcement to run plate readers on its behalf.
A federal grant program called Operation Stonegarden, which has existed in some form for nearly two decades, has handed out hundreds of millions of dollars to buy automated license plate readers, camera-equipped drones and other surveillance gear for local police and sheriff’s agencies. Stonegarden grant funds also pay for local law enforcement overtime, which deputizes local officers to work on Border Patrol enforcement priorities. Under President Donald Trump, the Republican-led Congress this year allocated $450 million for Stonegarden to be handed out over the next four fiscal years. In the previous four fiscal years, the program gave out $342 million.
In Cochise County, Arizona, Sheriff Mark Dannels said Stonegarden grants, which have been used to buy plate readers and pay for overtime, have let his deputies merge their mission with Border Patrol’s to prioritize border security.
“If we’re sharing our authorities, we can put some consequences behind, or deterrence behind, ‘Don’t come here,’” he said.
In 2021, the Ward County, Texas, sheriff sought grant funding from DHS to buy a “covert, mobile, License Plate Reader” to pipe data to Border Patrol’s Big Bend Sector Intelligence Unit. The sheriff’s department did not respond to a request for comment.
Other documents AP obtained show that Border Patrol connects locally owned and operated license plate readers bought through Stonegarden grants to its computer systems, vastly increasing the federal agency’s surveillance network.
How many people have been caught up in the Border Patrol’s dragnet is unknown. One former Border Patrol agent who worked on the license plate reader pattern detection program in California said the program had an 85% success rate of discovering contraband once he learned to identify patterns that looked suspicious. But another former official in a different Border Patrol sector said he was unaware of successful interdictions based solely on license plate patterns.
In Trump’s second term, Border Patrol has extended its reach and power as border crossings have slowed to historic lows and freed up agents for operations in the heartland. Border Patrol Sector Chief Gregory Bovino, for example, was tapped to direct hundreds of agents from multiple DHS agencies in the administration’s immigration sweeps across Los Angeles, more than 150 miles (241 kilometers) from his office in El Centro, California. Bovino later was elevated to lead the aggressive immigration crackdown in Chicago. Numerous Border Patrol officials have also been tapped to replace ICE leadership.
The result has been more encounters between the agency and the general public than ever before.
“We took Alek’s case because it was a clear-cut example of an unconstitutional traffic stop,” said Christie Hebert, who works at the nonprofit public interest law firm Institute for Justice and represents Schott. “What we found was something much larger — a system of mass surveillance that threatens people’s freedom of movement.”
In reviewing court records in border communities and along known smuggling routes in Texas and California, the AP found numerous other examples similar to what Schott and the delivery driver experienced. Several police reports and court records the AP examined cite “suspicious” travel patterns or vague tipoffs from the Border Patrol or other unnamed law enforcement agencies. In another federal court document filed in California, a Border Patrol agent acknowledged “conducting targeted analysis on vehicles exhibiting suspicious travel patterns” as the reason he singled out a Nissan Altima traveling near San Diego.
In cases reviewed by the AP, local law enforcement sometimes tried to conceal the role the Border Patrol plays in passing along intelligence. Babb, the deputy who stopped Schott, testified he typically uses the phrase “subsequent to prior knowledge” when describing whisper stops in his police reports, acknowledging that a tip came from another law enforcement agency without revealing too much in the written reports memorializing motorist encounters.
Once they pull over a vehicle deemed suspicious, officers often aggressively question drivers about their travels, their belongings, their jobs, how they know the passengers in the car, and much more, police records and body-worn camera footage obtained by the AP show. One Texas officer demanded details from a man about where he met his current sexual partner. Often drivers, such as the one working for the South Carolina moving company, were arrested on suspicion of money laundering merely for carrying a few thousand dollars’ worth of cash, with no apparent connection to illegal activity. Prosecutors filed lawsuits to try to seize money or vehicles on the suspicion they were linked to trafficking.
Schott warns that for every success story touted by Border Patrol, there are far more innocent people who don’t realize they’ve become ensnared in a technology-driven enforcement operation.
“I assume for every one person like me, who’s actually standing up, there’s a thousand people who just don’t have the means or the time or, you know, they just leave frustrated and angry. They don’t have the ability to move forward and hold anyone accountable,” Schott said. “I think there’s thousands of people getting treated this way.”
...
Read the original on apnews.com »
When it comes to sharing moments between family and friends, what device you have shouldn’t matter — sharing should just work. But we’ve heard from many people that they want a simpler way to share files between devices.
Today, we’re introducing a way for Quick Share to work with AirDrop. This makes file transfer easier between iPhones and Android devices, and starts rolling out today to the Pixel 10 family.
We built this with security at its core, protecting your data with strong safeguards that were tested by independent security experts. It’s just one more way we’re bringing better compatibility that people are asking for between operating systems, following our work on RCS and unknown tracker alerts.
We’re looking forward to improving the experience and expanding it to more Android devices. See it in action on the Pixel 10 Pro in this video, and try it out for yourself!
...
Read the original on blog.google »
Preserving code that shaped generations: Zork I, II, and III go Open Source
Today, we’re preserving a cornerstone of gaming history that is near and dear to our hearts. Together, Microsoft’s Open Source Programs Office (OSPO), Team Xbox, and Activision are making Zork I, Zork II, and Zork III available under the MIT License. Our goal is simple: to place historically important code in the hands of students, teachers, and developers so they can study it, learn from it, and, perhaps most importantly, play it.
A game that changed how we think about play
When Zork arrived, it didn’t just ask players to win; it asked them to imagine. There were no graphics, no joystick, and no soundtrack, only words on a screen and the player’s curiosity. Yet those words built worlds more vivid than most games of their time. What made that possible wasn’t just clever writing, it was clever engineering.
Beneath that world of words was something quietly revolutionary: the Z-Machine, a custom-built engine. The Z-Machine is a specification of a virtual machine, and the many Z-Machine interpreters in use today are software implementations of that VM. The original mainframe version of Zork was too large for early home computers to handle, so the team at Infocom made a practical choice. They split it into three games titled Zork I, Zork II, and Zork III, all powered by the same underlying system. This also meant that instead of rebuilding the game for each platform, they could use the Z-Machine to interpret the same story files on any computer. That design made Zork one of the first games to be truly cross-platform, appearing on Apple IIs, IBM PCs, and more.
Game preservation takes many forms, and it’s important to consider research as well as play. The Zork source code deserves to be preserved and studied. Rather than creating new repositories, we’re contributing directly to history. In collaboration with Jason Scott, the well-known digital archivist of Internet Archive fame, we have officially submitted upstream pull requests to the historical source repositories of Zork I, Zork II, and Zork III. Those pull requests add a clear MIT LICENSE and formally document the open-source grant.
* Accompanying documentation where available, such as build notes, comments, and historically relevant files.
* Clear licensing and attribution, via MIT LICENSE.txt and repository-level metadata.
This release focuses purely on the code itself. It does not include commercial packaging or marketing materials, and it does not grant rights to any trademarks or brands, which remain with their respective owners. All assets outside the scope of these titles’ source code are intentionally excluded to preserve historical accuracy.
More than forty years later, Zork is still alive and easier than ever to play. The games remain commercially available via The Zork Anthology on Good Old Games. For those who enjoy a more hands-on approach, the games can be compiled and run locally using ZILF, the modern ZIL compiler created by Tara McGrew. ZILF compiles ZIL files into Z3s that can be run with Tara’s own ZLR, which is a sentence I never thought I’d write, much less say out loud! There are a huge number of wonderful Z-machine runners across all platforms for you to explore.
Here’s how to get started running Zork locally with ZILF. From the command line, compile and assemble zork1.zil into a runnable z3 file.
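The exact invocation depends on how ZILF is installed, but the two-step flow looks roughly like this (a minimal sketch, assuming the ZILF suite’s zilf and zapf tools are on your PATH):

```
# Compile the ZIL source into Z-machine assembly (.zap)
zilf zork1.zil

# Assemble the generated .zap into a playable .z3 story file
zapf zork1.zap
```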
Then run your Z3 file in a Z-machine runner. I’m using Windows Frotz from David Kinder based on Stefan Jokisch’s Frotz core:
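If you’d rather stay in the terminal, the same story file runs under a command-line interpreter. For example, assuming the dumb-terminal build of Frotz is installed as dfrotz:

```
dfrotz zork1.z3
```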
Or, if you’re of a certain age as I am, you can apply a CRT filter to your Terminal and use a CLI implementation of a Z-machine like Matthew Darby’s “Fic,” written in Python:
We will use the existing historical repositories as the canonical home for Zork’s source. Once the initial pull requests land under the MIT License, contributions are welcome. We chose MIT for its simplicity and openness because it makes the code easy to study, teach, and build upon. File issues, share insights, or submit small, well-documented improvements that help others learn from the original design. The goal is not to modernize Zork but to preserve it as a space for exploration and education.
Zork has always been more than a game. It is a reminder that imagination and engineering can outlast generations of hardware and players. Bringing this code into the open is both a celebration and a thank you to the original Infocom creators for inventing a universe we are still exploring, to Jason Scott and the Internet Archive for decades of stewardship and partnership, and to colleagues across Microsoft OSPO, Xbox, and Activision who helped make open source possible.
...
Read the original on www.microsoft.com »
In October, I reported two security issues to Okta’s auth0/nextjs-auth0 project, here and here. The latter bug, an oauth parameter injection, allows for a range of types of abuse, like scoping tokens for unintended services, setting redirect_uri and scope to arbitrary values to leak tokens, and so on.
The patch was simple enough, so I opened a PR:
All’s well that ends well, right? Obviously, no.
The PR, 3 weeks later, was closed by the maintainer, an auth0 (an Okta company) employee, with the following comment:
This change is superseded by #2413. This was done to ensure that commits are signed. Orignal contribution history has been preserved. Hence closing this PR now.
Hmm, let’s take a look at that PR:
Hmm. That patch looks familiar. And who is Simen Olsen?
no it hasn’t. I don’t know who “Simen A. W. Olsen my@simen.io” is but it isn’t me and my commit here doesn’t reference that name or email address at all. Was it ai generated or something?
Of course, the answer was: yes. It was AI slop. Just like my previous post about gixy-ng (a fun read for anybody dealing with nginx), the developer had used Copilot to somehow generate their patches:
Hi @MegaManSec I sincerely apologize for this attribution error.
Can confirm that an AI workflow was used to created the rebased commit, which got confused with OP details. I’ve added a correction to #2413, and will ensure the changelog is updated.
Thank you for calling this out, we’ll make sure this doesn’t happen again.
Not only did the maintainer state the above, they also used AI to generate the response! In a now-deleted comment, they clearly used some AI to respond to my complaint:
With the classic ChatGPT “you are absolutely correct”, it’s pretty frustrating that this developer used AI to:
* Take my report/PR and commit it themselves.
* Use AI to commit it, removing my attribution.
* Use AI to “apologise” for using AI, then state that “it won’t happen again” — (yeah right; please provide a detailed explanation of how you’re going to ensure that, when clearly a 1-line code change is too much for your AI to handle without breaking).
* Refuse to fix the commit to remove the invalid / AI-generated-slop details, and add back mine.
I would appreciate force-pushing a fix for the commit to properly include my information in the commit.
I was told that they cannot change it. That seems like copyright infringement to me: taking somebody else’s code, then changing the author’s name?
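For what it’s worth, restoring attribution on a rebased commit is a small git operation. A hypothetical sketch (the author string and branch name here are illustrative, not the repo’s actual values):

```
# Rewrite the offending commit's author without changing its contents
git commit --amend --author="Original Author <original@example.com>" --no-edit

# Replace the remote branch, refusing to clobber anyone else's newer work
git push --force-with-lease origin main
```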
What I find most interesting is how this AI slop even came to be. I cannot find any reference to the email address “my@simen.io” anywhere online. On GitHub, the only reference to this email address is from the nextjs-auth0 PR. Simen Olsen has never contributed to any of the nextjs-auth0 repositories as far as I can tell (searching org:auth0 author:simenandre on GitHub), and that doesn’t even seem to be their real email address. So was this some type of AI hallucination? And why? The code change was tiny. I just totally don’t get it: I have literally never had any AI tooling fail like this and come up with some other person’s (fake) contact details. It’s simply absurd; are auth0’s engineers using some extremely (extremely) low quality local model or something? If ChatGPT failed like this for me even once every thousand times, I would simply never use it again.
In the end, at the time of writing this, the auth0/nextjs-auth0 maintainer, Tushar Pandey, who made all of these mistakes, has not fixed the attribution mistake in the commit history. In addition, that first bug, which allows for arbitrary account hijacking in this software, has been fixed after 3 weeks, with new versions of the nextjs-auth0 software released, but with Okta’s security people stating that “unless you create a video abusing this vulnerability, we aren’t going to accept this as a security issue” — LMAO; “yeah, it’s a vulnerability, we fixed it in the code, it can be used to take over accounts, but you need to create a video”. Hilarious. That’s just another case to add to my list of hilarious problems related to reporting security issues, which my next post will document.
...
Read the original on joshua.hu »
The f32 is an ultra-compact ESP32 development board designed to mount directly behind a USB-C receptacle. The PCB measures just 9.85 mm x 8.45 mm. It’s powered by the ESP32-C3FH4 microcontroller and was created primarily for research and as a bit of a stress test for the ESP32, since it intentionally ignores many standard design guidelines. There’s only one exposed GPIO and it is connected to an onboard LED, so development on it is mostly geared toward WiFi/web applications.
To test the f32 an example application was created that users can interact with. The application turns the f32 into a captive portal so when it’s powered on it will show up as an open access point that the user can select from available WiFi networks. The user is then automatically sent to the f32’s control page where they can interact with some of its basic functionality such as turning on an LED or scanning for surrounding WiFi networks. There’s also an “About” page that provides a small overview of the device. Below are some screenshots and a gif of interacting with the device.
Initially the f32 didn’t seem to want to work. I couldn’t get it to connect to any networks or broadcast its own network. I’m 100% sure this is due to the poor antenna circuitry, or lack thereof, but I did manage to get it functional after adding an additional tiny antenna onto the chip antenna as seen in the picture below. This was just a piece of bent wire soldered to the end lead and floating above the first lead.
Since I don’t have fancy signal testing equipment I relied on some manual testing such as seeing if I can still connect to the device and control the LED. In a clear line of sight test with the f32 placed about 3ft off the ground I was able to connect and perform scans/control the LED at roughly 120ft! This can be seen in my highly necessary depiction below.
The PCB was designed using DipTrace and manufactured by PCBWay with a board thickness of 0.6mm, min hole size of 0.2mm, and min track/spacing of 4/4mil. At the time of making this it only cost $10.75 for 5 boards shipped! That still blows my mind. PCBWay does also offer assembly services, but I chose to assemble this at home and suffer a bit. This took a bit of trial and error with such small parts, but I decided the best way for me was to ditch the stencil and make flux my best friend.
* Send the gerber file f32_gerber.zip found in the hardware folder to PCBWay with the specs mentioned above.
* Order the components noted in f32_bom.pdf. These parts can be found on both DigiKey and Mouser except the antenna. I don’t remember where I had originally ordered them, but I believe they are CrossAir CA-C03.
* Tip: Always order more than you need, especially with components as small as these.
* Clean the PCB really well with 99% alcohol.
* Starting with the top side (antenna side), apply a thin layer of soldering flux across the entire board using a toothpick.
* Using a soldering iron with a fine tip apply some solder to the tip and then go across all the exposed pads.
* Clean the board again with 99% alcohol and verify all the pads on this side have some solder on them.
* Apply another thin layer of flux to the same side.
* Using tweezers and a microscope/loupe start placing the top components following the reference guide f32_reference.pdf.
* Gently move the board onto the soldering hotplate or use the rework station to heat the solder back up and watch the components wiggle into place.
* Repeat with Bottom side.
* Bottom side must be done using a rework hot air gun; not possible with a hotplate.
After assembly you can use ESP-IDF VSCode extension or Arduino and upload whatever you’d like to the board or you can upload my example application using the steps below.
* Make sure you are in the base directory of this repo and have access to esptool.py.
* Make sure your esptool version is at v4+
* Run the following command, replacing the port argument with whichever port the device is connected to, i.e. on Windows typically something like COM5 or on Linux /dev/ttyACM0.
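A rough sketch of what that flash command looks like for an ESP32-C3 (the <PORT> placeholder and firmware filename are illustrative; your build’s binary names and offsets may differ):

```
esptool.py --chip esp32c3 --port <PORT> --baud 460800 write_flash 0x0 f32_example.bin
```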
Well that’s up to you to decide. I started this project for some personal research and also a fun learning experience. I had always wanted a project that used 01005 components ever since I had accidentally ordered some years ago. Whatever you choose to use it for, please note that this design intentionally neglects several fundamental components such as proper decoupling capacitors, an antenna matching circuit, USB termination resistors, and likely more. It does function, but it’s intentionally bare.
* Expose more GPIOs on the sides of the PCB to make it a mountable PCB.
Lastly, fun coincidence, the ESP32 chip, the antenna, and the LDO all are “C3” models!
...
Read the original on github.com »
ravynOS is a new open source OS project that aims to provide a similar experience and some compatibility with macOS on x86-64 (and eventually ARM) systems. It builds on the solid foundations of FreeBSD, existing open source packages in the same space, and new code to fill the gaps.
* Source compatibility with macOS applications (i.e. you could compile a Mac application on ravynOS and run it)
* Similar GUI metaphors and familiar UX (file manager, application launcher, top menu bar that reflects the open application, etc)
* Compatible with macOS folder layouts (/Library, /System, /Users, /Volumes, etc) and perhaps filesystems (HFS+, APFS) as well as fully supporting ZFS
* Self-contained applications in App Bundles, AppDirs, and AppImage files - an installer-less experience for /Applications
* Mostly maintain compatibility with the FreeBSD base system and X11 - a standard Unix environment under the hood
* Pleasant to use, secure, stable, and performant
Please visit ravynos.com for more info: Release Notes | Screenshots | FAQ
* Can you help build the dream? See the current projects/needs in CONTRIBUTING.md!
This is the top level of the FreeBSD source directory.
FreeBSD is an operating system used to power modern servers, desktops, and embedded platforms. A large community has continually developed it for more than thirty years. Its advanced networking, security, and storage features have made FreeBSD the platform of choice for many of the busiest web sites and most pervasive embedded networking and storage devices.
For copyright information, please see the file COPYRIGHT in this directory. Additional copyright information also exists for some sources in this tree - please see the specific source directories for more information.
The Makefile in this directory supports a number of targets for building components (or all) of the FreeBSD source tree. See build(7), config(8), FreeBSD handbook on building userland, and Handbook for kernels for more information, including setting make(1) variables.
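For reference, the canonical targets follow the usual FreeBSD pattern documented in build(7) (run from this directory; the install steps require root):

```
make buildworld       # build the userland
make buildkernel      # build the kernel
make installkernel    # install the new kernel
make installworld     # install the new userland
```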
For information on the CPU architectures and platforms supported by FreeBSD, see the FreeBSD website’s Platforms page.
For official FreeBSD bootable images, see the release page.
For information on synchronizing your source tree with one or more of the FreeBSD Project’s development branches, please see FreeBSD Handbook.
...
Read the original on github.com »
After building a software company to a multi-billion dollar exit, I made the jump to hardware. Now I’m working on carbon removal + steel at Charm Industrial, and electric long-haul trucking with Revoy. It’s epically fun to be building in the real world, but little did I expect that more than half the cost of building a hardware company would come from regulatory bottlenecks. Despite a huge push for climate fixes and the bipartisan geopolitical desire to bring industry back to the USA, I’ve been shocked to find that the single biggest barrier—by far—is over-regulation from the massive depth of bureaucracy.
Hardtech companies of all flavors are being forced to burn through limited capital while they wait for regulatory clarity and/or permits. This creates a constant cycle of cost increases that ultimately flows to consumers, lowers investment in the US manufacturing and industrial base, delays innovative new hardware getting into the hands of consumers and businesses, and at the end of the day leaves us all worse off, stuck with a quality of life pegged to technology developed decades ago.
From my two hardtech efforts alone, regulatory delays and bottlenecks have added millions of pounds of pollutants like PM2.5, NOₓ and CO₂ to our air, as business as usual continued instead of clean technologies being deployed. While CO₂ is a long-term climate issue, PM2.5 and NOₓ are immediate major drivers of asthma and excess morbidity. Both operations have high bipartisan appeal—and we’ve never been denied a permit—because we’re fundamentally cleaning up things that matter to everyone: dirty air, wildfires, orphaned oil wells. Revoy is also helping deflate the cost of long-haul freight. But none of that has made getting freedom to operate easy. For creative new technologies the default answer is “no” because there isn’t a clear path to permitting at all, and figuring out that path itself takes years, time that startups can’t afford.
Regulation obviously has a critical role in protecting people and the environment, but the sheer volume, over-specificity and sometimes ambiguity of those same regulations is now actively working against those goals! We’re unintentionally blocking the very things that would improve our environment. We’ve become a society that blocks all things, and we need to be a society that builds great things every day. The rest of this article gets very specific about the astronomical costs regulations are imposing on us as a society, and the massive positive impact that could be unleashed by cutting back regulation that is working against new, cost-saving, creative technology that could also be making people and the environment healthy again.
To make it concrete: both Charm and Revoy are capital-efficient hardtech companies, but Charm will spend low hundreds of millions to get to breakeven, and Revoy will spend tens of millions. In both cases, more than half of the total cost of building each company has gone to counterproductive regulatory burden. I’m hellbent on pushing through these barriers, but the unspoken reality is that our regulatory morass is the deathbed of thousands of hardtech companies that could be drastically improving our lives. We must unleash them.
$300M in Societal Cost & $125M in Burden for Charm
Charm produces and delivers verified carbon removal to companies like Google, Microsoft and JPMorgan. Charm’s breakthrough was realizing that you could take CO₂ captured in farm & forestry plant residues, convert it into a carbon-rich, BBQ sauce-like liquid (it’s literally the smoke flavor in BBQ sauce), and inject it into old oil wells to permanently remove carbon from the atmosphere. This has all kinds of co-benefits like reducing the massive overburden of wildfire fuels, cleaning up & plugging nasty orphaned oil wells, and improving PM2.5 and NOₓ air quality by avoiding that biomass being burned instead.
And yet… there was a hangup: what kind of injection well is this? Should it be permitted as a Class I disposal, Class II oilfield disposal, or Class V experimental? This question about the permitting path took four years to answer. Four years to decide which path to use, not even the actual permit! It took this long because regulators are structurally faced with no upside, only downside legal risk, in taking a formal position on something new, even though we’d done an enormous amount of lab and field work with bio-oil to understand its safety and behavior at surface and subsurface conditions. A regulator faces little cost to moving incredibly cautiously, but a major cost if they approve something that triggers activist pushback.
In the end, we’re grateful that—eventually—a state regulator took the reins and reviewed, managed, and issued the first-ever Class V bio-oil sequestration permit, through what was still an incredibly complex and detailed 14-month review process.
Now imagine that, instead of the 5.5 years from first contact to issued permit, it had taken only the 6 months actually required to get everyone across the regulatory establishment to agree on a Class V pathway: we would have had 5 additional years operating the well. That’s the equivalent, from our real supply chain, of sinking at least 30,000 tonnes of carbon per year at $600/tonne. Looking only at this one aspect, this delay came with a $90M price tag for Charm. We’ve also spent untold millions on regulatory affairs at all levels of government, not to mention the missed acceleration in sales, and other direct hard costs spent in R&D and processing bio-oil for inefficient and expensive injection into salt caverns instead.
But the public health burden created by this regulatory slowness is where it gets really crazy. This one regulatory delay meant we all got subjected to decreased air quality from an additional 30,000 tonnes per year of pile burning. The resulting particulate emissions alone are estimated to have caused a mind-blowing $40M/year in healthcare costs. This is $200M in additional healthcare burden over those five years, mostly borne by Medicare and Medicaid. There are additional costs to NOₓ emissions and more that take it to $300M.
In total, the cost to society of this single regulatory delay will be about $400M: $120-150M of unnecessary cost to Charm, and the bulk of it—$300M or so—borne by the public in healthcare costs. I’m not sharing these numbers to complain or make excuses; Charm is still on the path to having a huge impact and we’re among the lucky few that can survive these delays. What pains me most is the 5 years of lost carbon removal and pollutant reduction, and the compounding effect that has on all our health and healthcare costs. Over-regulation is now working against the very things it’s intended to protect.
Regulators do their absolute best with the system they have, but the combined effects of: (1) extremely detailed and complex regulation, (2) chaotic budgets and understaffing that disrupt an efficient process, and (3) endless lawsuits against regulators since 1970s-era Naderism have created an atmosphere of fear. If we want to solve the climate crisis, build abundance, lower costs, and generate wealth for all, this has to change. We need to delete and simplify reams of regulations. We need to pay regulators well, and we need to trust our regulators to operate quickly and decisively by putting reasonable limits on endless activist legal challenges.
Revoy’s breakthrough was realizing that you could lower long-haul freight costs and electrify long-haul semi trucks by leaving the diesel tractor in place and dropping an electric powertrain onto the back of the semi. Today, we boost semis from 7 mpg to 120 mpg, driving a 94% reduction in fuel consumption. This slashes emissions that negatively impact both air quality and climate.
And yet again… a hangup: what exactly is this electric doohickey? Is it a truck? A trailer? Something else? It was clear from the regulations that it was a “converter dolly”. But getting complete alignment on that simple fact across an alphabet soup of government agencies spanning both federal and state—NHTSA, FMCSA, FHWA, state transit authorities, air quality management districts, state DMVs, highway patrols and more—took years.
A “powered converter dolly” isn’t even a new thing! Here’s one from the sixties that ran on diesel to help trucks get over mountain passes:
There were some bright spots. The Federal Motor Carrier Safety Administration (FMCSA) and the National Highway Transportation Safety Administration (NHTSA) quickly converged on informal definitional clarity, and then eventually a Highway Patrol Captain who was eager to get innovative electric vehicles on the road pushed it through with a state DMV to register the first four Revoys. But bringing along the rest of the agencies, and the rest of the states, was not fast. It delayed deployments, soaked up hundreds of thousands of dollars of legal and lobbyist time (not to mention all the corresponding time on the government side that all of us taxpayers have to bear), and maybe most importantly… even with a formal memo from the Federal DOT, it is still not 100% resolved in some states.
As one example, one state agency has asked Revoy to do certified engine testing to prove that the Revoy doesn’t increase emissions of semi trucks. And that Revoy must do this certification across every single truck engine family. It costs $100,000 per certification and there are more than 270 engine families for the 9 engines that our initial partners use. That’s $27,000,000 for this one regulatory item. And keep in mind that this is to certify that a device—whose sole reason for existence is to cut pollution by >90%, and which has demonstrably done so across nearly 100,000 miles of testing and operations—is not increasing the emissions of the truck. It’s a complete waste of money for everyone.
And that $27M cost doesn’t include the cost to society. This over-regulation will delay deployment of EV trucks by years, increasing NOₓ and PM2.5 air pollution exposure for many of society’s least well-off who live near freeways. The delayed deployment will also increase CO₂ emissions that threaten the climate and environment. Revoy’s founder (Ian Rust) and I actually disagree on what exactly it is about the regulatory environment that needs to change, but we agree it’s completely broken and hurting both people and the planet.
In every interaction I have with regulators, I’m reminded that they’re good people doing god’s work operating in a fundamentally broken system. A regulatory system that structurally insists on legalistic, ultra-extreme caution is bound to generate a massive negative return for society.
If we had a regulatory system that could move fast to experiment with creative new technologies, we’d live in a world where our environment gets cleaned up faster, where awesome new hardware was constantly improving our lives by making things better and cheaper, and where large-scale hardtech innovation happened here at home in the USA, not in China.
As we collectively work to build more manufacturing capacity at home and build the next wave of technologies to power the economy, we need to grapple with the real bottlenecks holding us back. I hope other hardtech founders will publicly share more of their stories as well (the stories I’ve heard in private would shock you). Props to Blake Scholl for doing so.
We need a come-to-jesus about regulatory limits, timelines, and scope. Yes, we need basic and strong protections for clear harms, but we need to unleash every hardworking American, not just a few companies with massive funding, to invent and build hardware again. We need to combine many approaches to get there: expedited reviews for new technology, freedom to operate by default, permits by right-not-process, deleting as many regulatory steps as possible, and more. CA YIMBY’s successful push to pass a deluge of housing acceleration laws in the past two years could serve as a model. America building things again is the foundation of a prosperous, powerful, and clean America.
...
Read the original on rein.pk »
...
Read the original on mobomaps.com »
FEX allows you to run x86 applications on ARM64 Linux devices, similar to qemu-user and box64. It offers broad compatibility with both 32-bit and 64-bit binaries, and it can be used alongside Wine/Proton to play Windows games.
It supports forwarding API calls to host system libraries like OpenGL or Vulkan to reduce emulation overhead. An experimental code cache helps minimize in-game stuttering as much as possible. Furthermore, a per-app configuration system allows tweaking performance per game, e.g. by skipping costly memory model emulation. We also provide a user-friendly FEXConfig GUI to explore and change these settings.
On the technical side, FEX features an advanced binary recompiler that supports all modern extensions of the x86(-64) instruction set, including AVX/AVX2. The heart of this recompiler is a custom IR that allows us to generate more optimized code than a traditional splatter JIT. A comprehensive system call translation layer takes care of differences between the emulated and host operating systems and implements even niche features like seccomp. A modular core enables FEX to be used as a WoW64/ARM64EC backend in Wine.
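Getting a feel for it is a one-liner once FEX and an x86-64 RootFS are set up. A hedged example (FEXBash and FEXInterpreter are the launcher binaries FEX’s documentation describes; setup details vary by distro):

```
# Drop into an x86-64 shell on the ARM64 host
FEXBash

# Or run a single x86-64 binary directly
FEXInterpreter ./my_x86_app
```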
You would think doing this month after month we would eventually run out of things to work on, but in true emulator fashion the work never ends. Let’s jump into what has changed for the release this month!
We’re just gonna kick out this little release and be on our way. There might be some interesting things this month, read and find out!
After last month’s enormous improvements, this release will look quite tame in comparison. Although we still did a bunch of work, so let’s dive in.
...
Read the original on fex-emu.com »
Language models are often treated as snapshots—brief captures of a long and carefully curated development process. But sharing only the end result obscures the rich context needed to modify, adapt, and extend a model’s capabilities. Many meaningful adjustments require integrating domain-specific knowledge deep within the development pipeline, not merely at the final stage. To truly advance open AI development and research, the entire model flow — not just its endpoint — should be accessible and customizable. The model flow is the full lifecycle of an LM: every stage, checkpoint, dataset, and dependency required to create and modify it. By exposing this complete process, the goal is to engender greater trust and enable more effective adaptation, collaboration, and innovation.
With today’s release of Olmo 3, we’re empowering the open source community with not only state-of-the-art open models, but the entire model flow and full traceability back to training data.
Olmo 3 is a family of compact, dense models at 7 billion and 32 billion parameters that can run on everything from laptops to research clusters. At its center is Olmo 3-Think (32B), the best fully open 32B-scale thinking model, and the first that lets you inspect intermediate reasoning traces and trace those behaviors back to the data and training decisions that produced them.
Olmo 3-Base (7B, 32B) is our most powerful base model yet. Evaluated on our expanded, diverse evaluation suite, Olmo 3-Base delivers the strongest performance among fully open base models (those where training data, code, and weights are all publicly available, like Stanford's Marin and Swiss AI's Apertus) and achieves competitive performance with some of the best open-weights base models of comparable size and architecture, including Qwen 2.5 and Gemma 3. Achieving strong results in programming, reading comprehension, and math problem solving, Olmo 3-Base maintains performance at extended context lengths (up to ~65K tokens), providing a versatile foundation for continued pretraining, targeted fine-tuning, and reinforcement learning, and making it easy to build in specialized capabilities like reasoning, tool use (function calling), and instruction following through post-training.

Olmo 3-Think (7B, 32B) is our flagship post-trained reasoning set built on Olmo 3-Base. At a time when few organizations are releasing truly open models at this scale, Olmo 3-Think (32B) serves as a workhorse for RL research, long-horizon reasoning, and other advanced experiments that require substantial compute. On our suite of reasoning benchmarks (discussed below), it's the strongest fully open thinking model we're aware of, narrowing the gap to the best open-weight models of similar scale, such as Qwen 3 32B, while training on roughly 6x fewer tokens. Olmo 3-Think (7B) brings the same design and training approach to an even more efficient form factor, surfacing intermediate thinking steps for complex prompts while making open, inspectable reasoning accessible on more modest hardware.

Olmo 3-Instruct (7B) is a chat- and quick-response-focused post-train of Olmo 3-Base that handles multi-turn instruction following, tool use, and more. In our evaluations, it matches or outperforms open-weight models including Qwen 2.5, Gemma 3, and Llama 3.1, and narrows the gap with the Qwen 3 model family at a similar scale, delivering a strong, fully open alternative for high-quality conversational and tool-using agents.

Olmo 3-RL Zero (7B) is a fully open reinforcement learning pathway built on Olmo 3-Base, designed to bootstrap complex reasoning behaviors and enable clear benchmarking of RL algorithms. We release four series of checkpoints from domain-focused training on math, code, instruction following, and general chat, enabling careful study of reinforcement learning with verifiable rewards (RLVR).
Instead of a single set of frozen weights, Olmo 3 offers multiple, fully documented paths through development: the Instruct path for everyday chat and tool use, the RL Zero path for RL experimentation from base models, and the Think/reasoning path for models that leverage inference-time scaling to unlock complex reasoning and agentic behaviors. Each path is a concrete example of how to shape behavior from the same base model, and you’re free to fork or remix them—start with Olmo 3-Base, explore your own supervised fine-tuning (SFT) or direct preference optimization (DPO) recipe for instruct-style use cases, or plug in a new RL objective to probe different tradeoffs. The flow itself becomes a rich, reusable object—not just a record of how we built Olmo 3, but a scaffold for how you can build your own systems.
The Olmo 3 checkpoints we’re releasing represent our initial paths targeting our goals around reasoning, tool use, and general capabilities — we have exciting plans for other ways to leverage Olmo 3-Base 32B. But because we’re releasing the entire flow, you can intervene at any point: swap in domain-specific data during mid-training, adjust post-training for your use case, or build on an earlier checkpoint that better suits your needs.
As with Olmo and Olmo 2, we’re releasing all components of the Olmo 3 flow — data, code, model weights, and checkpoints — under permissive open source licenses.
Try Olmo 3 | Download the models & data | Read the report
We run the Olmo 3 checkpoints through a broad, updated benchmark suite, grouping dozens of industry-standard tasks (plus a few new ones we introduce) into several capability clusters, along with a set of held-out tasks. Together, these give us a capability profile of Olmo 3: a clear picture of how well it solves math problems, codes, uses tools, answers general-knowledge questions, and more.
At a high level, the Olmo 3 family delivers the strongest fully open base and thinking models we’re aware of. Olmo 3-Base 32B outperforms other fully open base models, and Olmo 3-Think 32B emerges as the strongest fully open thinking model.
Our results were made possible by rigorous data curation at every stage of training, a carefully designed training recipe for each model, and a set of new algorithmic and infrastructure advances across data processing, training, and reinforcement learning. We also introduce an enhanced reinforcement learning framework that guides the development of our models and is especially important for our thinking models. To design the training recipe and coordinate targeted improvements across a wide range of capabilities at each stage of the training pipeline, our development framework balances distributed innovation with centralized evaluation.
Olmo 3-Base follows a training pipeline that first focuses on broad coverage over diverse text, code, and math, then concentrates on harder distributions to sharpen programming, quantitative reasoning, and reading comprehension. The result is clearly the strongest set of fully open base models in our evaluations, and arguably the best 32B base model in the entire ecosystem of open-weights models, performing impressively in programming, reading comprehension, math problem solving, and long-context benchmarks like RULER, which tests information retrieval from lengthy texts. Olmo 3-Base (7B) and Olmo 3-Base (32B) maintain quality at extended context lengths and integrate cleanly with RL workflows, providing a robust foundation for continued pretraining and post-training.
Olmo 3-Think turns the Base into a reasoning model by training on multi-step problems spanning math, code, and general problem solving, then running the thinking SFT → thinking DPO → RLVR model flow to elicit high-quality reasoning traces. It competes with or exceeds several open-weight reasoning models of similar size. On math benchmarks, Olmo 3-Think (7B) matches Qwen 3 8B on MATH and comes within a few points on AIME 2024 and 2025. It also leads all comparison models on HumanEvalPlus for coding, and performs strongly on MBPP and LiveCodeBench, demonstrating particular strength in code-intensive reasoning. On broader reasoning tasks like BigBench Hard and AGI Eval English, Olmo 3-Think (7B) remains competitive with Qwen 3 8B reasoning and Qwen 3 VL 8B Thinker while staying fully open and slightly smaller.
For the 32B model, Olmo 3-Think scales these trends up and becomes one of the strongest fully open reasoning models in its class. Olmo 3-Think (32B) either wins or sits within roughly two points of the best open-weight model on MATH, OMEGA, BigBenchHard, HumanEvalPlus, PopQA, and IFEval. It ties Qwen 3 VL 32B Thinking for the top score on the OMEGA suite while staying clearly ahead of Gemma 3 27B Instruct and competitive with DeepSeek R1 Distill 32B on math and reasoning. On broader knowledge and QA, Olmo 3-Think (32B) is effectively neck-and-neck with the Qwen 3 models on PopQA. And in instruction following, Olmo 3-Think (32B) tops this subset on IFEval and remains solid on IFBench and AlpacaEval 2 LC—offering a strong default for reasoning workloads at the 32B scale.
Olmo 3-Instruct focuses on general chat, tool use, and synthetic data generation, and produces shorter sequences than the corresponding Olmo 3-Think models to improve inference efficiency. It outperforms comparably sized open-weight models, tying or surpassing Qwen 2.5, Gemma 3, and Llama 3.1 in our evaluations, and competes with the Qwen 3 family at similar scale, delivering strong function-calling and instruction-following capabilities in a fully open 7B model.
Olmo 3 uses a decoder-only transformer architecture and multi-stage training pipeline. Pretraining runs in three stages—an initial large-scale training run that builds broad capabilities; a mid-training phase that focuses on harder material like math, code, and reading comprehension; and a final long-context extension stage that trains the model on very long documents. Together with architectural enhancements, this yields a more capable, efficient base for the Olmo 3 family.
Post-training then specializes the pretrained model for different use cases. Building on Olmo 2, each pathway follows a three-stage recipe — SFT, preference tuning with DPO, and RLVR — but in Olmo 3, we expose this as a fully documented model flow with complete customization over each training stage and dataset mix.
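For readers who want the preference-tuning stage pinned down, the standard DPO objective from the literature is shown below. The announcement doesn't say whether Olmo 3 modifies it, so treat this as the textbook form rather than Ai2's exact loss:

```latex
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta;\pi_{\mathrm{ref}}) =
  -\mathbb{E}_{(x,\,y_w,\,y_l)\sim\mathcal{D}}\left[
    \log\sigma\!\left(
      \beta\log\frac{\pi_\theta(y_w\mid x)}{\pi_{\mathrm{ref}}(y_w\mid x)}
      - \beta\log\frac{\pi_\theta(y_l\mid x)}{\pi_{\mathrm{ref}}(y_l\mid x)}
    \right)\right]
```

Here y_w and y_l are the preferred and rejected responses for a prompt x, pi_ref is a frozen reference policy (typically the SFT checkpoint), and beta controls how far the tuned policy may drift from that reference.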
Instead of releasing only the final weights, we provide checkpoints from each major training milestone: the base pretrained model, the mid-trained model after targeted skill enhancement, the long-context-extended version, plus post-training checkpoints for the Olmo 3-Think, Olmo 3-Instruct, and Olmo 3-RL Zero flows. You can study how capabilities emerge over time, run ablations on specific stages, and fork the model at whatever point best fits your data, compute, and goals.
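As an illustration of what forking the model at an arbitrary point could look like in practice, here is a minimal sketch using the Hugging Face transformers API. The repo ID and revision name are hypothetical placeholders, since the announcement doesn't specify how the checkpoints are hosted:

```python
# Minimal sketch: loading an Olmo 3 checkpoint from an intermediate
# training stage via Hugging Face transformers. The repo ID and the
# revision tag below are hypothetical placeholders, not confirmed paths.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "allenai/Olmo-3-7B"           # hypothetical repo ID
revision = "mid-training-step100000"    # hypothetical checkpoint tag

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, revision=revision)

# From here you could continue pretraining on domain data, run SFT/DPO,
# or plug the model into an RLVR loop, as described above.
inputs = tokenizer("Open model flows matter because", return_tensors="pt")
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=32)[0]))
```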
Compared to Olmo 2, we scaled data collection and significantly strengthened our dataset curation methods. Continuing our commitment to full transparency, we’re releasing several new, higher-quality datasets that cover every stage of base model training and post-training—from initial learning to specialized skills like complex reasoning and long-context understanding. This means anyone can see exactly what data shaped the model’s capabilities, reproduce our results, and reuse these datasets to train their own AI systems.
Olmo 3 is pretrained on Dolma 3, a new ~9.3-trillion-token corpus drawn from web pages, science PDFs processed with olmOCR, codebases, math problems and solutions, and encyclopedic text. From this pool, we construct Dolma 3 Mix, a 5.9-trillion-token (~6T) pretraining mix with a higher proportion of coding and mathematical data than earlier Dolma releases, plus much stronger decontamination via extensive deduplication, quality filtering, and careful control over data mixing. We follow established web standards in collecting training data and don’t collect from sites that explicitly disallow it, including paywalled content.
On top of this, we introduce two Dolma 3-based mixes for later stages of base model training. Dolma 3 Dolmino is our mid-training mix: 100B training tokens sampled from a ~2.2T-token pool of high-quality math, science, code, instruction-following, and reading-comprehension data, including reasoning traces that also enable RL directly on the base model. Dolma 3 Longmino is our long-context mix: ~50B training tokens drawn from a 639B-token pool of long documents combined with mid-training data to teach Olmo 3 to track information over very long inputs (like reports, logs, and multi-chapter documents).
We also introduce Dolci, a new post-training data suite tailored specifically for reasoning, tool use, and instruction following. Dolci provides separate mixes for each stage of post-training: SFT, DPO, and RLVR. For SFT, Dolci aggregates state-of-the-art datasets that advance step-by-step reasoning, tool use, and high-quality conversational behavior; for DPO, it supplies high-quality contrastive preference data; and for RL, it includes hard, diverse prompts across math, coding, instruction following, and general chat.
Together, Dolma 3 and Dolci give Olmo 3 a fully open data curriculum from first token to final post-trained checkpoint.
We pretrained Olmo 3 on a cluster of up to 1,024 H100 GPUs; we achieved training throughput of 7.7K tokens per device per second for Olmo 3-Base (7B). We mid-trained on 128 H100 GPUs, and post-trained on a set of 256 H100s.
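To put those throughput numbers in perspective, a back-of-the-envelope estimate (assuming perfect linear scaling and ignoring restarts and stragglers, which real runs never avoid) suggests the ~6T-token pretraining mix is roughly a week-and-a-half job at this scale:

```python
# Back-of-the-envelope pretraining time, assuming ideal linear scaling.
gpus = 1024                      # H100s, per the announcement
tokens_per_gpu_per_s = 7_700     # reported throughput for Olmo 3-Base (7B)
mix_tokens = 5.9e12              # Dolma 3 Mix size

cluster_tokens_per_s = gpus * tokens_per_gpu_per_s   # ~7.9M tokens/s
days = mix_tokens / cluster_tokens_per_s / 86_400
print(f"{cluster_tokens_per_s/1e6:.1f}M tokens/s -> ~{days:.1f} days")
# ~7.9M tokens/s -> ~8.7 days (an idealized lower bound)
```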
For Olmo 3, building on the work we did for Olmo 2, we were able to significantly improve the efficiency of our post-training code. By moving SFT from Open Instruct (our post-training codebase, prioritizing flexibility) to Olmo Core (our pretraining codebase, designed to maximize efficiency), we increased throughput (tokens/second) by 8x. Similarly, by incorporating in-flight weight updates, continuous batching, and a lot of threading improvements, we made our RL training 4x more efficient—resulting in training runs that are significantly cheaper and faster.
A note on our 32B models: We believe 32B sits in a sweet spot for research and tinkering. 32B models are big enough to support strong, competitive performance, but still small enough that a wide audience can fine-tune and deploy them on accessible hardware.
For more details, including ablations, please read our technical report.
A core goal of Olmo 3 is not just to open the model flow, but to make it actionable for people who want to understand and improve model behavior. Olmo 3 integrates with OlmoTrace, our tool for tracing model outputs back to training data in real time.
For example, in the Ai2 Playground, you can ask Olmo 3-Think (32B) to answer a general-knowledge question, then use OlmoTrace to inspect where and how the model may have learned to generate parts of its response. This closes the gap between training data and model behavior: you can see not only what the model is doing, but why—and adjust data or training decisions accordingly.
To further promote transparency and explainability, we’re making every training and fine-tuning dataset available for download, all under a permissive license that allows for custom deployment and reuse. The datasets come in a range of mixes to accommodate different storage and hardware constraints, from several billion tokens all the way up to 6 trillion.
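Since the full mixes run to trillions of tokens, streaming is the practical way to inspect them. Here is a minimal sketch using the Hugging Face datasets library; the dataset ID is a hypothetical placeholder, so check the release pages for the real one:

```python
# Minimal sketch: streaming a released training mix without downloading
# it in full. The dataset ID below is a hypothetical placeholder.
from datasets import load_dataset

ds = load_dataset("allenai/dolma-3-mix", split="train", streaming=True)
for i, example in enumerate(ds):
    print(example)   # inspect raw pretraining records
    if i >= 2:
        break
```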
Our new tooling for data processing allows you to de-contaminate, tokenize, and de-duplicate data in the same way we did for Olmo 3’s corpora. All the tooling is open source, enabling you to replicate our training curves or run controlled ablations across data mixes and objectives.
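The announcement doesn't spell out the exact algorithms, so as a rough illustration of what n-gram-based decontamination involves (a toy version, not Ai2's actual pipeline; their released tooling is the authoritative implementation), see the sketch below:

```python
# Toy n-gram decontamination: drop training documents that share any
# 13-gram with an evaluation benchmark. Illustrative only; Ai2's open
# tooling is the real reference implementation.
def ngrams(text: str, n: int = 13) -> set[tuple[str, ...]]:
    toks = text.lower().split()
    return {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}

def decontaminate(train_docs: list[str], eval_docs: list[str]) -> list[str]:
    banned: set[tuple[str, ...]] = set()
    for doc in eval_docs:
        banned |= ngrams(doc)
    return [d for d in train_docs if not (ngrams(d) & banned)]
```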
Our Olmo utilities and software cover the whole development cycle, including a toolkit for reproducible evals that ships with our brand-new eval collection, OlmoBaseEval, which we used for Olmo 3 base model development.
Importantly, our tooling allows you to instrument complex tasks and analyze intermediate traces to understand where the models succeed—or struggle. Because the Olmo 3 data recipes, training pipeline, and checkpoints are open, independent teams can connect model behavior back to measurable properties.
Together, the Olmo 3 family makes it easier to build trustworthy features quickly, whether for research, education, or applications. By making every development step available and inspectable, we’re enabling entirely new categories of research. You can run experiments on any training phase, understand exactly how different techniques contribute to model capabilities, and build on our work at whatever stage makes sense for your project.
For scientists, the fully open flow exposes the model’s inner workings, so you can instrument experiments across coding, reasoning, RL, and tool use.
If you care about AI you can study, audit, and improve, Olmo 3 is for you. Try the demos in the Ai2 Playground, explore the documentation, and build on the released weights and checkpoints. Then tell us what you discover—we invite the community to validate, critique, and extend our findings.
True openness in AI isn’t just about access—it’s about trust, accountability, and shared progress. We believe the models shaping our future should be fully inspectable, not black boxes. Olmo 3 represents a different path: one where anyone can understand, verify, and build upon the AI systems that increasingly influence our world. This is what open-first means—not just releasing weights, but sharing the complete knowledge needed to advance AI responsibly: the flow.
Try Olmo 3 | Download the models & data | Read the report
...
Read the original on allenai.org »