10 interesting stories served every morning and every evening.
🚀 DeepSeek-V4 Preview is officially live & open-sourced! Welcome to the era of cost-effective 1M context length.
🔹 DeepSeek-V4-Pro: 1.6T total / 49B active params. Performance rivaling the world’s top closed-source models.
🔹 DeepSeek-V4-Flash: 284B total / 13B active params. Your fast, efficient, and economical choice.
Try it now at chat.deepseek.com via Expert Mode / Instant Mode. API is updated & available today!
📄 Tech Report: https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro/blob/main/DeepSeek_V4.pdf
🤗 Open Weights: https://huggingface.co/collections/deepseek-ai/deepseek-v4
DeepSeek-V4-Pro
🔹 Enhanced Agentic Capabilities: Open-source SOTA in Agentic Coding benchmarks.
🔹 Rich World Knowledge: Leads all current open models, trailing only Gemini-3.1-Pro.
🔹 World-Class Reasoning: Beats all current open models in Math/STEM/Coding, rivaling top closed-source models.
DeepSeek-V4-Flash
🔹 Reasoning capabilities closely approach V4-Pro.
🔹 Performs on par with V4-Pro on simple Agent tasks.
🔹 Smaller parameter size, faster response times, and highly cost-effective API pricing.
Structural Innovation & Ultra-High Context Efficiency
🔹 Novel Attention: Token-wise compression + DSA (DeepSeek Sparse Attention).
🔹 Peak Efficiency: World-leading long context with drastically reduced compute & memory costs.
🔹 1M Standard: 1M context is now the default across all official DeepSeek services.
Dedicated Optimizations for Agent Capabilities
🔹 DeepSeek-V4 is seamlessly integrated with leading AI agents like Claude Code, OpenClaw & OpenCode.
🔹 Already driving our in-house agentic coding at DeepSeek.
The figure below showcases a sample PDF generated by DeepSeek-V4-Pro.
API is Available Today!
🔹 Keep base_url, just update model to deepseek-v4-pro or deepseek-v4-flash.
🔹 Supports OpenAI ChatCompletions & Anthropic APIs.
🔹 Both models support 1M context & dual modes (Thinking / Non-Thinking): https://api-docs.deepseek.com/guides/thinking_mode
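For illustration, the migration described in the bullets above can be captured in a tiny helper (`migrate_request` is my name, not part of any SDK; the mapping of the retired slugs to deepseek-v4-flash follows DeepSeek's stated routing, and thinking mode is still selected separately per the docs):

```python
# Hypothetical helper: rewrite an old request payload for the V4 API.
# Only the model field changes; base_url and the rest of the request
# stay exactly as they were.
LEGACY_MODELS = {
    "deepseek-chat": "deepseek-v4-flash",      # was the non-thinking slug
    "deepseek-reasoner": "deepseek-v4-flash",  # was the thinking slug
}

def migrate_request(payload: dict) -> dict:
    out = dict(payload)
    out["model"] = LEGACY_MODELS.get(payload["model"], payload["model"])
    return out

print(migrate_request({"model": "deepseek-chat", "messages": []})["model"])
# deepseek-v4-flash
```

With the OpenAI or Anthropic client libraries, nothing else in the call needs to change: keep the same base_url and pass the new model string.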
⚠️ Note: deepseek-chat & deepseek-reasoner will be fully retired and inaccessible after Jul 24th, 2026, 15:59 (UTC). They currently route to deepseek-v4-flash non-thinking and thinking, respectively.
🔹 Amid recent attention, a quick reminder: please rely only on our official accounts for DeepSeek news. Statements from other channels do not reflect our views.
🔹 Thank you for your continued trust. We remain committed to longtermism, advancing steadily toward our ultimate goal of AGI.
First enthusiasm
A couple of weeks ago I subscribed to Claude Code, and during the first few weeks I had a really nice experience. It was fast, the token allowance was fair, and the quality was good.
I learned they had raised the token allowance for non-rush hours, and since they opposed some governmental rules, it felt good to support the right cause.
(づ  ̄ ³ ̄)づ
However… for about three weeks now my initial enthusiasm has been rapidly waning.
It began with an issue three weeks ago. I started working in the morning after about a ten-hour break; enough time for my tokens to refresh.
I sent two small questions to Claude Haiku. They were simple questions, not even related to the repository.
Suddenly, token usage spiked to 100%.
Have a nice break…
I contacted their “AI support bot”, which returned some default support nonsense and didn’t really understand the problem. So I asked for human support. A couple of days later, what appeared to be a human support person sent a reply. It began like this:
“Our systems are detecting your inquiry is regarding usage limits on your Pro or Max plan.”
Yeah, well — it’s the Pro plan. Seems like your systems weren’t actually queried; it was just a default intro and probably a default answer, because:
This was followed by what appears to be an extensive copy-and-paste answer from their docs explaining how daily and weekly limits work.
And it closed with the typically frustrating line that no customer likes to read at the end of an e-mail, the classic middle finger of customer support: we don’t care whether your problem is solved, we’ve declared it closed.
“Note that further replies to this ticket may not be monitored. If your request is not regarding usage limits on your Pro or Max plan, or you need additional support, please visit our help page at”
Great! Sending an automated e-mail that does not address the actual problem and then closing the channel. Thanks for nothing, I guess? Or was I wrong? I asked Claude Haiku:
@Haiku:
See the customer’s request here and the response from the AI and later W***** - did they answer the concern/question of the customer?
(╯°_°)╯︵ ┻━┻
Declining quality
In the following days and weeks, the quality was far from satisfying my needs or matching my initial experience. While I used to be able to work on up to three projects at once, now the token limit was exhausted after two hours on a single project.
And the quality was degrading. I am fully aware this is quite subjective and that the quality of the agent is always heavily impacted by the operator; the failure usually appears in front of the screen. But hey, I also develop using GitHub’s Copilot and OpenAI’s Codex, and I run my own inference with OMLX and Continue using Qwen3.5 – 9B. I’m not the expert, and I’m lazy sometimes, but I probably know a thing or two.
Let me give you this wonderful example: yesterday I asked Claude Opus to refactor a project.
While I was browsing the model’s thinking log (which I strongly suggest doing regularly, not just occasionally), I found this:
Rather than editing every slider in JSX, I’ll add a generic initializer in ui-events.js that auto-injects value displays for all range inputs that lack one.
This is clearly bad practice. It’s a cheap workaround you wouldn’t expect even from a junior dev; it reads like someone who just doesn’t want to deliver a good result. My response:
“you can’t be serious — is this how you fix things? just WORKAROUNDS????”
At least Opus admitted:
“You’re right, that was lazy. Let me do it properly — add the labels directly in the JSX and wire them explicitly.”
Needless to say, this shortcut cost me around 50% of my five-hour token allowance.
(ง •̀_•́)ง
And even more…
Now this cache topic comes up, among others. At least they are talking about it openly. The problem: when you get back to work after some time, your conversation cache is gone and the model starts reading your codebase again. Cost-wise this is smart. But experience-wise? It means you paid tokens for the initial load and, after a forced break because the five-hour token window hit its limit, you pay again for the same load.
Think that’s all? Wait, I also got this funny anecdote: all of a sudden the weekly window changed from today to Monday. OK, I was thankful because it came with a reset to zero. But still: what is going on, Anthropic? Not only that — while I was working on my project, watching token usage with Argus-eyed vigilance, this little warning popped up:
Wait, what? I’m neither part of an organization, nor do I see any hint of why I suddenly have to worry about a “monthly usage limit”; the hourly and weekly limits still weren’t exceeded. What is happening right now?
Turns out, two hours later it allowed me to continue working. The warning was gone.
At least this documentation does not mention a monthly usage limit. And the settings page only lists the limits for the current session and week.
So… what is this monthly limit all about, Anthropic?
Sorry to let you down, Anthropic
I am a huge fan of the product. Theoretically everything just works like a charm; it offers so many opportunities. I built my
Hi friends,
I’ll be attending Babashka Conf on May 8 and Dutch Clojure Days on May 9.
If you’re attending either (or just visiting Amsterdam), drop me a line!
When I have an idea for a project, it tends to go in one of these two directions:
I just do it. Maybe I make a few minor revisions, but often it turns out exactly how I’d imagined and I’m happy.
I think, “I should look for prior art”. There’s a lot of prior art, dealing with a much broader scope than I’d originally imagined. I start to wonder if I should incorporate that scope. Or perhaps try to build my thing on top of the existing sorta-nearby-solutions. Or maybe I should just use the popular thing. Although I could do a better job than that thing, if I put a bunch of time into it. But actually, I don’t want to maintain a big popular project, nor do I want to put that much time into this project. Uh oh, now I’ve spent a bunch of time, having neither addressed the original issue nor experienced the joy of creating something.
I prefer the first outcome, and I think the pivotal factor is how well I’ve internalized my own success criteria.
For example, last weekend I hosted my friend Marcin and we decided it’d be fun to do some woodworking, so we threw together this shelf and 3d-printed hangers for my kitchen:
Absolute banger of a project:
brainstormed the design over coffee
did a few 3d-print iterations for the Ikea bin hangers (OnShape CAD, if you want to print your own)
used material leftover from my workbench
rounded the corner by eye with a palm sander
sealed the raw plywood edge with some leftover paint from a friend
done in a weekend
The main success criterion was to jam on woodworking with a friend, and that helped me not overthink the object-level success criteria: just make a shelf for my exact kitchen!
In contrast, this past Friday I noticed difftastic did a poor job, so I decided to shop around for structural/semantic diff tools and related workflows (a topic I’ve never studied, that I’m increasingly interested in as I’m reviewing more and more LLM-generated code).
I spent 4 hours over the weekend researching existing tools (see my notes below), going through dark periods of both “semantic tree diffing is a PhD-level complex problem” and “why do all of these have MCP servers? I don’t want an MCP server”, before I came to my senses and remembered my original success criteria: I just want a nicer diffing workflow for myself in Emacs, I should just build it myself — should take about 4 hours.
I’m cautiously optimistic that, having had this realization and committing myself to a minimal scope, I’ll be able to knock out a prototype before running out of motivation.
However, other long-running interests of mine:
interfaces for prototyping hardware (discussed September 2023)
a programming language that fuses what I like about Clojure and Rust (November 2023)
a programming language for CAD (constraints, bidirectional editing, other dubious ideas)
seem to be deep in the well of outcome #2.
That is, I’ve spent hundreds of hours on background research and little prototypes, but haven’t yet synthesized anything that addresses the original motivating issue.
It’s not quite that I regret that time — I do love learning by reading — but I have a nagging sense of unease that my inner critic (fear of failure?) is silencing my generative tendencies, keeping me from the much more enjoyable (and productive!) learning by doing.
I think in these cases the success criteria have been much fuzzier: Am I trying to replace my own usage of Rust/Clojure?
Only for some subset of problems?
Or is it that I actually just need a playground to learn about language design/implementation, and it’s fine if I don’t end up using it?
Ditto for CAD: Am I trying to replace my commercial CAD tool in favor of my own?
Only for some subset of simple or particularly parametric parts?
Do I care if it’s useful for others?
Does my tool need to be legibly different from existing open-source tools?
It’s worth considering these questions, sure.
But at the end of the day, I’d much rather have done a lot than have only considered a lot.
So I’m trying to embrace my inner clueless 20-year-old and just do things — even if some turn out to be “obviously bad” in hindsight, I’ll still be coming out ahead on net =D
Conservation of scope creep
Of course, there’s only so much time to “just do things”, and there’s a balance to be had. I’m not sure how many times I’ll re-learn YAGNI (“you ain’t gonna need it”) in my career, but I was reminded of it again after writing a bunch of code with an LLM agent, then eventually coming to my senses and throwing it all out.
I wanted a Finda-style filesystem-wide fuzzy path search for Emacs.
Since I’ve built (by hand, typing the code myself!) this exact functionality before (walk filesystem to collect paths, index them by trigram, do fast fuzzy queries via bitmap intersections), I figured it’d only take a few hours to supervise an LLM to write all the code.
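That pipeline can be sketched in a few lines (a toy version, not Finda's actual code; plain set intersection stands in for the bitmap intersections):

```python
# Toy trigram index over file paths: collect paths, index each by its
# trigrams, answer fuzzy queries by intersecting the trigram posting sets.
from collections import defaultdict

def trigrams(s: str) -> set[str]:
    s = s.lower()
    return {s[i:i + 3] for i in range(len(s) - 2)}

class PathIndex:
    def __init__(self, paths):
        self.paths = list(paths)
        self.by_trigram = defaultdict(set)  # trigram -> set of path ids
        for pid, p in enumerate(self.paths):
            for t in trigrams(p):
                self.by_trigram[t].add(pid)

    def candidates(self, query: str):
        """Paths containing every trigram of the query: a superset of the
        true matches, which a scoring pass would then rank and filter."""
        tris = trigrams(query)
        if not tris:  # query shorter than 3 chars: no pruning possible
            return list(self.paths)
        ids = set.intersection(*(self.by_trigram.get(t, set()) for t in tris))
        return [self.paths[i] for i in sorted(ids)]

idx = PathIndex(["/home/u/notes.txt", "/home/u/src/finda/main.rs"])
print(idx.candidates("finda"))  # ['/home/u/src/finda/main.rs']
```

A real implementation would store the posting sets as bitmaps so the intersections are a handful of AND instructions per word, but the shape of the algorithm is the same.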
I started with a “plan mode” chat, and the LLM suggested a library, Nucleo, which turned up since I wrote Finda (10 years ago, eek!).
I read through it, found it quite well-designed and documented, and decided to use it so I’d get its smart case and Unicode normalization functionality.
(E.g., query foo matches Foo and foo, whereas query Foo won’t match foo; similarly for cafe and café.)
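Those two behaviors can be sketched like so (my own toy version, not Nucleo's implementation):

```python
# Smart case: matching is case-insensitive unless the query contains an
# uppercase letter. Unicode normalization: NFKD-decompose and drop
# combining marks, so "cafe" can match "café".
import unicodedata

def fold(s: str) -> str:
    return "".join(c for c in unicodedata.normalize("NFKD", s)
                   if not unicodedata.combining(c))

def smart_case_match(query: str, candidate: str) -> bool:
    q, c = fold(query), fold(candidate)
    if query.islower():  # no uppercase in query: ignore case entirely
        q, c = q.lower(), c.lower()
    return q in c

assert smart_case_match("foo", "Foo") and smart_case_match("foo", "foo")
assert not smart_case_match("Foo", "foo")   # uppercase query is exact-case
assert smart_case_match("cafe", "café")
```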
Finding a great library wasn’t the problem; the problem was that Nucleo also supported some extra functionality: anchors (^foo only matches at the beginning of a line).
This got me thinking about what that might mean in a corpus that consists entirely of file paths.
Anchoring to the beginning of a line isn’t useful (everything starts with /), so I decided to try and interpret the anchors with respect to the path segments.
E.g., ^foo would match /root/foobar/ but not /root/barfoo/.
But to do this efficiently, the index needs to keep track of segment boundaries so that the query can be checked against each segment quickly.
But then we also need to handle a slash occurring in an anchored query (e.g., ^foo/bar) since that wouldn’t get matched when only looking at segments individually (root, foo, bar, and baz of a matching path /root/foo/bar/baz/).
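A minimal sketch of that segment-anchored interpretation (a hypothetical helper, not code from the actual project): since a slash in the query can span adjacent segments, it suffices to test whether the anchored needle appears starting at any segment boundary of the full path.

```python
# Segment-anchored matching: "^foo" matches iff some path segment starts
# with "foo"; because we test suffixes of the whole path beginning at
# segment starts, "^foo/bar" also works across a segment boundary.
def anchored_match(query: str, path: str) -> bool:
    assert query.startswith("^")
    needle = query[1:]
    seg_starts = [i + 1 for i, ch in enumerate(path) if ch == "/"]
    return any(path.startswith(needle, start) for start in seg_starts)

assert anchored_match("^foo", "/root/foobar/")
assert not anchored_match("^foo", "/root/barfoo/")
assert anchored_match("^foo/bar", "/root/foo/bar/baz/")
```

An index that records segment-boundary offsets could answer this without scanning every candidate path, which is the bookkeeping the paragraph above alludes to.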
Working through this took several hours: first throwing around design ideas with an LLM, having it write code to wrap Nucleo’s types, then realizing its code was bloated and didn’t spark joy, so finally writing my own (smaller) wrapper.
Then, after a break, I realized:
I can’t think of a situation where I’d ever wished Finda had anchor functionality
In a corpus of paths, I can anchor by just adding / to the start or end of a query (this works for everything except anchoring to the end of a filename).
So I tossed all of the anchoring code.
I’m pretty sure I still came out ahead compared to if I’d tried to write everything myself sans LLM or discussion with others, but I’m not certain.
Perhaps there’s some kind of conservation law here: Any increases in programming speed will be offset by a corresponding increase in unnecessary features, rabbit holes, and diversions.
Structural diffing
Speaking of unnecessary diversions, let me tell you everything I’ve learned about structural diffing recently — if you have thoughts/feelings/references in this space, I’d love to hear about ’em!
When we’re talking about code, a “diff” usually means a summary of the line-by-line changes between two versions of a file.
This might be rendered as a “unified” view, where changed lines are prefixed with + or - to indicate whether they’re additions or deletions.
For example:
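The rendered diff is an image in the original post and is missing from this copy; reconstructed from the description (the surrounding list items are invented for illustration), the unified view would look something like:

```diff
 milk
-coffee
 bread
+apple
```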
We’ve removed coffee and added apple.
The same diff might also be rendered in a side-by-side view, which can be easier to read when there are more complex changes:
The problem with these line-by-line diffs is that they’re not aware of higher-level structure like functions, types, etc. — if some braces match up somehow between versions, they might not be shown at all, even if the braces “belong” to different functions.
There’s a wonderful tool, difftastic, which tries to address this by calculating diffs using treesitter-provided concrete syntax trees.
It’s a huge improvement over line-based diffs, but unfortunately it doesn’t always do a great job matching entities between versions.
Here’s the diff that motivated this entire foray:
Note that it doesn’t match up struct PendingClick; it shows it as deleted on the left and added on the right.
I haven’t dug into why difftastic fails to match here, but I do feel like it’s wrong — even if the overall diff would be longer, I’d still rather see PendingClickRequest and PendingClick matched up between both sides.
Here’s a summary of tools / references in the space:
The most “baked” and thoughtful semantic diff tool I found is, perhaps unsurprisingly, semanticdiff.com, a small German company with a free VSCode plugin and web app that shows diffs for github PRs. Unfortunately they don’t have any code libraries I can use as a foundation for the workflow I want.
this semanticdiff vs. difftastic blog post covers a lot of great details (including that difftastic doesn’t even show semantically meaningful indentation changes in python !!!)
one of the authors has great HN comments with hard-won background knowledge. E.g., they moved away from treesitter because it’s unreliable for semantics:
Context-sensitive keywords in particular were a constant source of annoyance. The grammar looks correct, but it will fail to parse because of the way the lexer works. You don’t want your tool to abort just because someone named their parameter “async”.
diffsitter
built on treesitter, has MCP server. README includes list of similar projects.
lots of github stars, but doesn’t seem particularly well-documented; I couldn’t find an explanation of how it works, but the difftastic wiki says it “runs longest-common-subsequence on the leaves of the tree”
gumtree
research / academic origin in 2014
requires Java, so no-go for my use case of a quick tool I can use via Emacs
mergiraf: treesitter-based merge-driver written in rust
very nice architecture overview; tool uses Gumtree algorithm
docs and adorable illustrations indicate this project was clearly written by a thoughtful human
semanticdiff.com author in HN comments:
> GumTree is good at returning a result quickly, but there are quite a few cases where it always returned bad matches for us, no matter how many follow-up papers with improvements we tried to implement. In the end we switched over to a dijkstra based approach that tries to minimize the cost of the mapping
Spinel — Ruby AOT Compiler
Spinel compiles Ruby source code into standalone native executables.
It performs whole-program type inference and generates optimized C code,
achieving significant speedups over CRuby.
Spinel is self-hosting: the compiler backend is written in Ruby and
compiles itself into a native binary.
How It Works
Ruby (.rb)
|
v
spinel_parse Parse with Prism (libprism), serialize AST
| (C binary, or CRuby + Prism gem as fallback)
v
AST text file
|
v
spinel_codegen Type inference + C code generation
| (self-hosted native binary)
v
C source (.c)
|
v
cc -O2 -Ilib -lm Standard C compiler + runtime header
|
v
Native binary Standalone, no runtime dependencies
Quick Start
# Fetch libprism sources (from the prism gem on rubygems.org):
make deps
# Build everything:
make
# Write a Ruby program:
cat > hello.rb <<'RUBY'
def fib(n)
if n < 2
n
else
fib(n - 1) + fib(n - 2)
end
end
puts fib(34)
RUBY
# Compile and run:
./spinel hello.rb
./hello # prints 5702887 (instantly)
Options
./spinel app.rb # compiles to ./app
./spinel app.rb -o myapp # compiles to ./myapp
./spinel app.rb -c # generates app.c only
./spinel app.rb -S # prints C to stdout
Self-Hosting
Spinel compiles its own backend. The bootstrap chain:
CRuby + spinel_parse.rb → AST
CRuby + spinel_codegen.rb → gen1.c → bin1
bin1 + AST → gen2.c → bin2
bin2 + AST → gen3.c
gen2.c == gen3.c (bootstrap loop closed)
Benchmarks
74 tests pass. 55 benchmarks pass.
Geometric mean: ~11.6x faster than miniruby (Ruby 4.1.0dev) across
the 28 benchmarks below. Baseline is the latest CRuby miniruby build
(without bundled gems), which is considerably faster than the system
ruby (3.2.3); Spinel’s advantage is correspondingly smaller but still
substantial on computation-heavy workloads.
Computation
Data Structures & GC
Real-World Programs
Supported Ruby Features
Core: Classes, inheritance, super, include (mixin), attr_accessor,
Struct.new, alias, module constants, open classes for built-in types.
Control Flow: if/elsif/else, unless, case/when,
case/in (pattern matching), while, until, loop, for..in
(range and array), break, next, return, catch/throw,
&. (safe navigation).
Blocks: yield, block_given?, &block, proc {}, Proc.new,
lambda -> x { }, method(:name). Block methods: each,
each_with_index, map, select, reject, reduce, sort_by,
any?, all?, none?, times, upto, downto.
Exceptions: begin/rescue/ensure/retry, raise,
custom exception classes.
Types: Integer, Float, String (immutable + mutable), Array, Hash,
Range, Time, StringIO, File, Regexp, Bigint (auto-promoted), Fiber.
Polymorphic values via tagged unions. Nullable object types (T?)
for self-referential data structures (linked lists, trees).
Global Variables: $name compiled to static C variables with
type-mismatch detection at compile time.
Strings: << automatically promotes to mutable strings (sp_String)
for O(n) in-place append. +, interpolation, tr, ljust/rjust/center,
and all standard methods work on both. Character comparisons like
s[i] == "c" are optimized to direct char array access (zero allocation).
Chained concatenation (a + b + c + d) collapses to a single malloc
via sp_str_concat4 / sp_str_concat_arr — N-1 fewer allocations.
Loop-local str.split(sep) reuses the same sp_StrArray across
iterations (csv_process: 4 M allocations eliminated).
Regexp: Built-in NFA regexp engine (no external dependency).
=~, $1-$9, match?, gsub(/re/, str), sub(/re/, str),
scan(/re/), split(/re/).
Bigint: Arbitrary precision integers via mruby-bigint. Auto-promoted
from loop multiplication patterns (e.g. q = q * k). Linked as static
library — only included when used.
Seeking breaks otherwise. We might be able to just fflush() before seeking instead?
Turns out DosBox-X was having trouble with the Sound Blaster or something;
standard DosBox works correctly directly from the interrupt handler, and
without doubling the buffer size.
This is MUCH faster than just leaving buffering disabled, and also works
around getting bogus reads after an fseek. SDL_LoadWAV on test/sample.wav
no longer takes several seconds to finish, and comes up with the correct
data.
I wonder if we’re triggering this in LoadWAV because we’re malloc’ing data
between seeks/reads, and it’s causing the djgpp transfer buffer to change. Or
maybe the Fat DS trick is confusing it? I don’t know, I haven’t had time to
debug it, it might just be a legit libc bug in djgpp too, for all I know.
This uses an old trick we used in SDL 1.2 for MacOS Classic, which did its
audio callback in a hardware interrupt. If the audio is locked when the
interrupt fires, make a note of it and return immediately. When the lock is
released, if the interrupt has been fired, run the audio device iteration
right then.
Since there isn’t a big device lock in SDL3 (available to the app, at least),
this keeps a counter of when any SDL_AudioStream is locked, which is probably
good enough.
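The scheme described in the two notes above reads naturally as a sketch (Python standing in for the actual C inside SDL, with invented names):

```python
# Deferred-interrupt pattern: if the (simulated) hardware interrupt fires
# while any audio stream is locked, record it and return immediately;
# when the last lock is released, run the deferred device iteration.
class AudioDevice:
    def __init__(self):
        self.lock_count = 0          # counter of locked SDL_AudioStreams
        self.interrupt_pending = False
        self.iterations = 0          # how many device iterations ran

    def on_interrupt(self):          # called from the interrupt handler
        if self.lock_count > 0:
            self.interrupt_pending = True   # defer: just make a note
        else:
            self.iterate()

    def lock(self):
        self.lock_count += 1

    def unlock(self):
        self.lock_count -= 1
        if self.lock_count == 0 and self.interrupt_pending:
            self.interrupt_pending = False
            self.iterate()           # run the deferred iteration now

    def iterate(self):
        self.iterations += 1

dev = AudioDevice()
dev.lock()
dev.on_interrupt()   # fires while locked: deferred
dev.unlock()         # deferred iteration runs here
assert dev.iterations == 1
```

In real code the counter increment/decrement and the pending flag would need to be atomic with respect to the interrupt, but the control flow is the same.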
This uses VESA interfaces to manage the display and works with the software
renderer.
Events aren’t hooked up yet, so prepare to close DosBox on each run. :)
…upport.
This gets most of the rendering examples, which use SDL_GetBasePath() to
find textures to load, working.
Of course Quake 1 solved this better, haha. It’s smart: less memory, dirt
simple, and you don’t even have to worry about synchronizing with the
interrupt handler, because it’s safe for both sides no matter when an
interrupt fires.
[sdl-ci-filter djgpp]
[sdl-ci-artifacts]
- SDL_runapp.c: Add SDL_PLATFORM_DOS to the exclusion list so the generic SDL_RunApp() is disabled when the DOS-specific one is compiled.
- SDL.c: Exclude SDL_Gtk_Quit() on DOS. DJGPP defines __unix__ which sets SDL_PLATFORM_UNIX, but DOS has no GTK/display server. The GTK source is not compiled (CMake UNIX is false for DOS) so this was a link error.
- sdlplatform.cmake: Add DOS case to SDL_DetectCMakePlatform so the platform is properly detected from CMAKE_SYSTEM_NAME=DOS.
- i586-pc-msdosdjgpp.cmake: Add i386-pc-msdosdjgpp-gcc as a fallback compiler name, since some DJGPP toolchain builds use the i386 prefix.
- Implement double-buffered page-flipping for VBE modes with >1 image page
- Save and restore full VBE state on video init/quit for clean mode switching
- Improve DOS keyboard handling: support extended scancodes and Pause key
- Lock ISR code/data to prevent page faults during interrupts
- Always vsync when blitting in single-buffered modes to reduce tearing
Move audio mixing out of the IRQ handler to the main loop for improved stability and to avoid reentrancy issues. Add SDL_DOS_PumpAudio function, update DMA buffer handling, and adjust sample rate to 22050 Hz.
Silence stale DMA buffer halves to prevent stutter during load.
Detect SB version and select 8-bit mono or 16-bit stereo mode.
Handle DMA and DSP setup for both SB16 and pre-SB16 hardware.
Add FORCE_SB_8BIT option for testing in DOSBox.
- Poll Sound Blaster DSP status instead of fixed delay after speaker-on
- Clarify DPMI conventional memory is always locked; update comments
- Document and justify DMA memory allocation strategy
- Free IRET wrapper after restoring interrupt vector to avoid leaks
- Throttle joystick axis polling to ~60 Hz to reduce BIOS timing loop cost
- Always poll joystick buttons directly for responsiveness
Implement banked framebuffer access for VBE 1.2+ modes without LFB.
Detect and initialize banked modes, copy framebuffer data using bank
switching, and blank the framebuffer on mode set. Page-flipping is
disabled in banked mode.
April, 2026
Apr 24
Feature
gpt-5.5
gpt-5.5-pro
v1/responses
v1/chat/completions
v1/batch
Released GPT-5.5, a new frontier model for complex professional work, to the Chat Completions and Responses API, and released GPT-5.5 pro for Responses API requests for tougher problems that benefit from more compute.
GPT-5.5 supports a 1M token context window, image input, structured outputs, function calling, prompt caching, Batch, tool search, built-in computer use, hosted shell, apply patch, Skills, MCP, and web search. Key updates include:
Reasoning effort now defaults to medium.
When image_detail is unset or set to auto, the model now uses original behavior.
Caching for GPT-5.5 only works with extended prompt caching. In-memory prompt caching is not supported.
Learn more here.
Apr 21
Feature
gpt-image-2
v1/images/generations
v1/images/edits
v1/batch
Released GPT Image 2, a state-of-the-art image generation model for image generation and editing. GPT Image 2 supports flexible image sizes, high-fidelity image inputs, token-based image pricing, and Batch API support with a 50% discount.
Apr 15
Update
Updated the Agents SDK with new capabilities, including:
running agents in controlled sandboxes;
inspecting and customizing the open-source harness; and
controlling when memories are created and where they’re stored.
March, 2026
Mar 17
Feature
gpt-5.4-mini
gpt-5.4-nano
v1/responses
v1/chat/completions
Released GPT-5.4 mini and GPT-5.4 nano to the Chat Completions and Responses API. GPT-5.4 mini brings GPT-5.4-class capabilities to a faster, more efficient model for high-volume workloads, while GPT-5.4 nano is optimized for simple high-volume tasks where speed and cost matter most.
GPT-5.4 mini supports tool search, built-in computer use, and compaction. GPT-5.4 nano supports compaction, but does not support tool search or computer use.
Mar 16
Update
gpt-5.3-chat-latest
Updated the gpt-5.3-chat-latest slug to point to the latest model currently used in ChatGPT.
Mar 13
Fix
gpt-5.4
v1/responses
v1/chat/completions
Updated our image encoder to fix a small bug with input_image inputs in GPT-5.4. Some image understanding use cases may now see improved quality. No action is required.
Mar 12
Feature
sora-2
sora-2-pro
v1/videos
v1/videos/characters
v1/videos/extensions
v1/batch
Expanded the Sora API with reusable character references, longer generations up to 20 seconds, 1080p output for sora-2-pro, video extensions, and Batch API support for POST /v1/videos. 1080p generations on sora-2-pro are billed at $0.70 per second. Learn more here.
Mar 12
Update
sora-2
sora-2-pro
v1/videos/edits
v1/videos/{video_id}/remix
Added POST /v1/videos/edits for editing existing videos. This will replace POST /v1/videos/{video_id}/remix, which will be deprecated in 6 months. Learn more here.
Mar 5
Feature
gpt-5.4
gpt-5.4-pro
v1/responses
v1/chat/completions
Released GPT-5.4, our newest frontier model for professional work, to the Chat Completions and Responses API, and released GPT-5.4 pro to the Responses API for tougher problems that benefit from more compute.
Also released:
Tool search in the Responses API, which lets models defer large tool surfaces until runtime to reduce token usage, preserve cache performance, and improve latency.
Built-in Computer use support in GPT-5.4 through the Responses API computer tool for screenshot-based UI interaction.
A 1M token context window and native Compaction support for longer-running agent workflows.
Mar 3
Feature
gpt-5.3-chat-latest
v1/chat/completions
v1/responses
Released gpt-5.3-chat-latest to the Chat Completions and Responses API. This model points to the GPT-5.3 Instant snapshot currently used in ChatGPT. Read more here.
February, 2026
Feb 24
Feature
v1/responses
v1/chat/completions
Expanded input_file support to accept more document, presentation, spreadsheet, code, and text file types. Learn more here.
Feb 24
Feature
v1/responses
Released the phase field in the Responses API. It labels an assistant message as either intermediate commentary (commentary) or the final answer (final_answer). Read more here.
Feb 24
Feature
gpt-5.3-codex
v1/responses
Released gpt-5.3-codex to the Responses API. Read more here.
Feb 23
Feature
v1/responses
Launched WebSocket mode for the Responses API. Learn more here.
Feb 23
Feature
Last year I bought a RODECaster Duo to solve some audio woes: my girlfriend and I each wanted a microphone into our respective computers for gaming together and talking on Discord in the same room without any echo, and I wanted to be able to swap mine over to my work PC easily. The RODECaster is really nice; it's pretty effortless to use and works great for our home. I would gladly recommend it to anyone looking for a similar solution.
As is usual for any device in my house, when it's time to update the firmware I try to ensure I have enough tooling in place to capture how firmware updates work, or at a minimum to capture a firmware blob to try and reverse engineer, poking around for fun and/or to see the often horrific reality that is the industry we work in.
fw update
I was feeling pretty lazy and assumed that RODE would dump the firmware somewhere on my computer before flashing the device, so I set up Instruments on macOS to capture disk activity and found where the firmware was dumped. Surprisingly, it was just a gzipped tarball. The device I did this update on happened to have writing to USB disks disabled, so the update actually failed.
Poking around a bit, I found the binaries of the stuff that actually runs on the device, as well as a shell script that handles the updates themselves. There are two partitions on the disk, so that if you brick one it boots from the other. There also aren't any signature checks on the incoming firmware. I'm used to many vendors of this style of device requiring signed firmware nowadays, so it's kind of nice to actually own a device I can modify. I also noticed that SSH seemed to be enabled by default, so I plugged in an ethernet cable and confirmed that SSH is indeed enabled, with pubkey auth only. Here are the keys that are added by default:
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQCX/bCFTDgViuPvdFL/VMMVRrw9b5S8HcDQk17qoCEYwmI+IIG8rEAsLiaeCOwyhf9IN+8/LRaN0Z5ZfU3WMbmsKEg8zd1Yvqq74nFbhO47vbtzmCi9S4ucIKkBEVOyvyN5lt9hWf5t5nZSmlfldZK3Pem5y8wHM5A+K/gSnzp4gwQ1QYfFb068uQ+ciIdOhb8SkUs8CwzotglIbp19I6ZmXmsNj/TmpbUf5rMfUAf1gysZ5j1UdRWrvWVh5daqvZRsBBPbXEeJfDU3Nr3HR14XYt9mgexrz/5oyKSj/lQYLmh9cDfsxvkGNIQ8fF9l+n2L1KZM4lLgiGk4KFBjQHaIBZx9OebCiiZCO4NTJUBDk9a+SZpiDiipADV07s7vTInYyFA6GrmKtnq3M6upT4WJBvVuL/BMnK5yY1RZtoqox2/pcCg2rH5S1GIy0v0HFJisl7kWInlaG2mdsaCx19wAjCFe/qT9LyxjQ6+0rArI55/JJFDkNeMjrewRQwNdASjCox8vqXCBfjvsR9qv70/ywcymgsnLAnq2LuYg5FYwMMDYOvVnhACC+BYTdNDTn5oeMIjQCUenY/DPCHpJkf4YOf3YCMUTEU9tExhtwW/X+m21hS3+STLtTfqbUeg9CeuPQZgfl9vc65n3tMxAdlEGEDoTaNMAgr2TzJv92Ka9iQ==
ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIDaNyzPfIcEeQsfzyQs/wyX6mX52kiS+4eNHfCaxFlgj
Since our update failed, I swapped to a Windows PC, set up Wireshark with USBPcap, and ran the update through the RODECaster app. I roughly glanced at the pcap file and looked for where the update started, since there was a lot of traffic (I was also using the device for audio on another computer). I wrote down the packet numbers I thought were interesting and threw them to Claude Code to dig through the pcap while I was doing other stuff.
A bit later (CC only took 10 minutes or so, but I was busy for a while) I came back to a breakdown of the structure and a Python script to manually update a device. The RODECaster app sends some HID control traffic to the device: one command to enter update mode (the 'M' command), and then another (the 'U' command) to trigger the update. Both are just single ASCII characters sent over HID report 1.
I am but a yaml-writing slave and sometimes a below-average Ghidra user, and I don't often interact with hardware devices, so getting some help from CC in discovery was useful, as was being pointed to resources to actually learn more about HID devices.
The structure was pretty simple: you send the 'M' command, then copy archive.tar.gz and archive.md5 (obviously just the md5sum of the archive) onto the newly exposed disk. Then you send the 'U' command to trigger the flashing itself.
so the flow is:
plug in the rodecaster and power it on (or vice versa)
send the ‘M’ command
mount the disk and copy archive.tar.gz and archive.md5 to it
chmod 777 both of them because I don't care to figure out how to do it properly
unmount the disk
send the ‘U’ command
wait for the thing to reboot into your new firmware
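The flow above can be sketched in Python. The 'M'/'U' bytes on HID report 1 and the archive.tar.gz / archive.md5 pair come straight from the pcap; whether the device wants the report padded to a fixed length, and the vendor/product IDs you'd need to actually open it, are details I'm leaving out here:

```python
import hashlib
from pathlib import Path

REPORT_ID = 1  # the update commands go out on HID report 1


def hid_report(cmd: str) -> bytes:
    """Build the raw output report for a single-character command.

    'M' exposes the update disk, 'U' triggers the flash. Whether the
    device expects the report zero-padded to a fixed length is something
    I haven't verified, so this is just report ID plus the command byte.
    """
    assert cmd in ("M", "U"), "only two commands are known"
    return bytes([REPORT_ID]) + cmd.encode("ascii")


def write_md5_file(archive: Path) -> Path:
    """Write archive.md5 next to the tarball, in md5sum's output format."""
    digest = hashlib.md5(archive.read_bytes()).hexdigest()
    md5_path = archive.parent / "archive.md5"
    md5_path.write_text(f"{digest}  {archive.name}\n")
    return md5_path


# The actual flashing needs a HID library (e.g. hidapi) plus the device's
# vendor/product IDs, so it's only outlined here:
#   1. dev.write(hid_report("M"))   -> the update disk appears
#   2. copy archive.tar.gz onto it, write_md5_file(...) beside it
#   3. unmount the disk
#   4. dev.write(hid_report("U"))   -> device flashes and reboots
```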
custom firmware
I was still working from my Mac and wanted to create some custom firmware to be able to SSH into the device, so I just used a container to enable password authentication for SSH (don't shoot me), add my own pubkey to the authorized keys, and dump out an archive for me to flash. You don't really need much to actually flash the device; see here for an example of the functions, and it's not really much work to add the rest.
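If you'd rather patch the tarball directly than use a container, a sketch with Python's tarfile looks like the following. The in-archive path for authorized_keys here is a guess on my part, not something I've quoted from the real firmware layout:

```python
import io
import tarfile
from pathlib import Path

# NOTE: placeholder path -- the real firmware may keep authorized_keys
# somewhere else in the root filesystem.
AUTH_KEYS = "root/.ssh/authorized_keys"


def add_pubkey(src: Path, dst: Path, pubkey: str) -> None:
    """Copy the firmware tarball, appending a pubkey to authorized_keys."""
    with tarfile.open(src, "r:gz") as tin, tarfile.open(dst, "w:gz") as tout:
        existing = b""
        for member in tin.getmembers():
            if member.name == AUTH_KEYS:
                # remember the current keys; we re-add the file below
                existing = tin.extractfile(member).read()
                continue
            fobj = tin.extractfile(member) if member.isfile() else None
            tout.addfile(member, fobj)
        data = existing + pubkey.encode("ascii") + b"\n"
        info = tarfile.TarInfo(AUTH_KEYS)
        info.size = len(data)
        info.mode = 0o600
        tout.addfile(info, io.BytesIO(data))
```

From there it's the same flash flow as before: 'M', copy the patched archive plus its md5, unmount, 'U'.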
Run your script to flash the thing, and bingo bongo, you can SSH into it.
conclusion
I was really surprised that I could flash firmware to this so easily, and it is really nice to actually own a device. It's a great piece of kit that just kinda blends into the background, and I never have to think about it. I don't really know why SSH was enabled or why it had this key added by default, but I submitted a ticket to RODE about it, as I could not find an obvious security email to report to. I did not hear back, but I will watch to see if future firmware updates change anything.
It's been a few months since I've done anything with this, and I am trying to just dump out my thoughts into a notepad, edit only very lightly, and then just poast. I really love all of the RODE stuff I have, and yet again I just want to buy more gear.
If you want to ask me any questions about this, you can reach me with the primary letter of this domain, at this domain.
thanks computer, until next time