10 interesting stories served every morning and every evening.
During our talks with F-Droid users at FOSDEM26, we were baffled to learn that most were relieved that Google had canceled its plans to lock down Android.
Why baffled? Because no such thing actually happened: the plans announced last August are still scheduled to take effect. We are watching a battle of PR campaigns in which whoever posts last is remembered by the media as the truth, and journalists simply copy/pasting Google's posts serves no one.
But Google said… Said what? That there's a magical "advanced flow"? Did you see it? Did anyone experience it? When is it scheduled to be released? Was it part of Android 16 QPR2 in December? Of 16 QPR3 Beta 2.1 last week? Of Android 17 Beta 1? No? That's the issue… As time marched on, people were left with the impression that everything was done and fixed, that Google "wasn't evil" after all, this time, yay!
While we all have bad memories of "banners" as the dreaded ad-delivery medium of the Internet, after FOSDEM we decided we had to raise the issue again and make sure that everyone who cares about Android as an open platform knows we are running out of time before Google becomes the gatekeeper of all users' devices.
Hence, starting today, both the website and our clients, with the latest updates of F-Droid and F-Droid Basic, feature a banner that reminds everyone how little time we have and how to voice concerns to whatever local authority is able to understand the dangers of the path Android is being led down.
We are not alone in this fight: IzzyOnDroid has added a banner too, more F-Droid clients will add the warning banner soon, and other app downloaders, like Obtainium, already show an in-app warning dialog.
Regarding the F-Droid Basic rewrite, development continues with a new release, 2.0-alpha3:
Note that if you are already using F-Droid Basic version 1.23.x, you won't receive this update automatically. You need to navigate to the app inside F-Droid and toggle "Allow beta updates" in the three-dot menu at the top right.
In app news, we're slowly getting back on track with post-Debian-upgrade fixes (if your app still uses Java 17, is there a chance you can upgrade to 21?) and post-FOSDEM delays. Every app is important to us, yet actions like Google's above waste time we could have put to better use in GitLab.
Buses was updated to 1.10 after a two-year hiatus.
Conversations and Quicksy were updated to 2.19.10+free, improving cleanup after banned users and adding a better QR workflow and better tablet rotation support. These are nice, but another change caught our interest: "Play Store flavor: Stop using Google library and interface directly with Google Play Service via IPC". Sounds interesting for your app too? Is this a path to having one single version for both F-Droid and Play that is fully FLOSS? We don't know yet, but we salute any trick that removes another proprietary dependency from the code. If curious, feel free to take a look at the commit.
Dolphin Emulator was updated to 2512. We missed one version in between, so the changelogs are huge; luckily, the devs publish highly detailed posts about updates. So we'll start with "Release 2509" (about 40 minutes to read), side-track with "Starlight Spotlight: A Hospital Wii in a New Light" (about 50 minutes), continue to the current release in "Release 2512" (40 more minutes), and finish with "Rise of the Triforce", delving into history for more than an hour.
Image Toolbox was updated to 3.6.1 adding many fixes and… some AI tools. Were you expecting such helpers? Will you use them?
Luanti was updated to 5.15.1, adding some welcome fixes. If your game world started flickering after the last update, make sure to update.
Nextcloud apps get an update almost every week: Nextcloud was updated to 33.0.0, Nextcloud Cookbook to 0.27.0, Nextcloud Dev to 20260219, Nextcloud Notes to 33.0.0, and Nextcloud Talk to 23.0.0.
But are you following the server side too? Nextcloud Hub 26 Winter was just released, adding a plethora of features. If you want to read about them, see the 30-minute post here, or watch the hour-long video presentation from the team here.
ProtonVPN - Secure and Free VPN was updated to 5.15.70.0, adding more control over auto-connects, countries, and cities. Also, all connections are now handled by the WireGuard and Stealth protocols; the older OpenVPN was removed, making the app almost 40% smaller.
Offi was updated to 14.0 with a bit of code polish. Unfortunately for Android 7 users, the app now needs Android 8 or later.
QUIK SMS was updated to 4.3.4 with many fixes. Vishal praised the duplicate remover and the default auto-deduplication function, and found that the bug that made deleted messages reappear is fixed.
SimpleEmail was updated to 1.5.4 after a two-year pause. It's a fixes-only release, updating translations and making the app compatible with Android 12 and later versions.
* NeoDB You: A native Android app for NeoDB designed with Material 3/You
Thank you for reading this week’s TWIF 🙂
Please subscribe to the RSS feed in your favourite RSS application to be notified when new TWIFs come out.
You are welcome to join the TWIF forum thread. If you have any news from the community, post it there, maybe it will be featured next week 😉
To help support F-Droid, please check out the donation page and contribute what you can.
...
Read the original on f-droid.org »
US President Donald Trump has announced that the US will raise global tariffs to 15%.
This is an increase from the 10% rate announced on Friday, when the president invoked a never-before-used law known as Section 122 after the Supreme Court struck down his previous tariffs with a 6-3 majority.
The law, which falls under the 1974 Trade Act, gives Trump the power to put in place tariffs up to a maximum of 15% for 150 days, at which point Congress must step in.
Trump has called the Supreme Court’s decision “ridiculous” and “extraordinarily anti-American”.
Some lawmakers are questioning the president’s decision to continue the levies, with Democratic congressman Ted Lieu saying Trump is taking out his anger towards the top court on Americans. “These temporary tariffs will be challenged in court and Democrats will kill them when they expire,” he writes on X.
American allies have also weighed in on the changes, with German Chancellor Friedrich Merz warning about the uncertainty they bring the global economy. Meanwhile, the UK says it expects to retain its “privileged trading position with the US”.
We're wrapping up our live coverage for now, but you can read more in our news article.
...
Read the original on www.bbc.com »
TL;DR: Over the past decade, I’ve worked to build the perfect family dashboard system for our home, called Timeframe. Combining calendar, weather, and smart home data, it’s become an important part of our daily lives.
When Caitlin and I got married a decade ago, we set an intention to have a healthy relationship with technology in our home. We kept our bedroom free of any screens, charging our devices elsewhere overnight. But we missed our calendar and weather apps.
So I set out to build a solution to our problem. First, I constructed a Magic Mirror using an off-the-shelf medicine cabinet and LCD display with its frame removed. It showed the calendar and weather data we needed:
But the text was hard to read, especially during the day, as we get significant natural light in Colorado. At night, it glowed like any backlit display, sticking out like a sore thumb in our living space.
I then spent about a year experimenting with various jailbroken Kindle devices, eventually landing on a design with calendar and weather data on a pair of screens. The Kindles took a few seconds to refresh and flash the screen to reset the ink pixels, so they only updated every half hour. I designed wood enclosures and laser-cut them at the local library makerspace:
Software-wise, I built a Ruby on Rails app for fetching the necessary data from Google Calendar and Dark Sky. The Kindles woke up on a schedule, loading a URL in the app that rendered a PNG using IMGKit. The prototype proved e-paper was the right solution: it was unobtrusive regardless of lighting:
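For illustration, here is a minimal sketch of that render-a-PNG-at-a-URL idea. It is in Python (using Flask and the Python port of IMGKit) rather than the author's actual Rails code, and the route, template name, and screen dimensions are assumptions; wkhtmltoimage must be installed for imgkit to work:

import imgkit
from flask import Flask, Response, render_template

app = Flask(__name__)

@app.route("/dashboard.png")
def dashboard_png():
    # Render the calendar/weather HTML, then convert it to a PNG that the
    # Kindle can fetch on its wake-up schedule. The template name and the
    # Kindle-class 600x800 dimensions are illustrative assumptions.
    html = render_template("dashboard.html")
    png = imgkit.from_string(html, False, options={"width": 600, "height": 800})
    return Response(png, mimetype="image/png")

The key design point is that the display stays dumb: all layout logic lives on the server, and the device only ever loads one image.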
The Kindles were a hack, requiring constant tinkering to keep them working. It was time for a more reliable solution. I tried an OLED screen to see if the lack of a global backlight would be less distracting, but it wasn’t much better than the Magic Mirror:
So it was back to e-paper. I found a system of displays from Visionect, which came in 6”/10”/13”/32” sizes and could update every ten minutes for 2-3 months on a single charge:
The 32” screen used an outdated lower-contrast panel and its resolution was too low to render text smoothly. The smaller sizes used a contrasty, high-PPI panel. I ended up using a combination of them around the house: a 6” in the mudroom for the weather, a 13” (with its built-in magnetic backing) in the kitchen attached to the side of the fridge, and a 10” in the bedroom.
The Visionect displays required running custom closed-source software, either as a SaaS or locally with Docker. I opted for a local installation on the Raspberry Pi already running the Rails backend. I had my best results pushing images to the Visionect displays every five minutes in a recurring background job. It used IMGKit to generate a PNG and send it to the Visionect API, logic I extracted into visionect-ruby. This setup proved to be incredibly reliable, without a single failure for months at a time.
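The push setup described above maps naturally onto a small recurring job. Below is a sketch in Python rather than the author's Ruby; the Visionect server address, device UUID, endpoint path, and authentication are illustrative assumptions (the real calls live in the visionect-ruby gem, which uses HMAC-signed requests):

import time
import imgkit
import requests

VISIONECT = "http://raspberrypi.local:8081"           # local Visionect server (assumed)
DEVICE_UUID = "00000000-0000-0000-0000-000000000000"  # placeholder device ID

def push_frame():
    # Render the dashboard page to PNG bytes...
    png = imgkit.from_url("http://localhost:3000/dashboard", False,
                          options={"width": 1600, "height": 1200})
    # ...and push it to the display. This endpoint is hypothetical; see
    # visionect-ruby for the actual API calls and request signing.
    requests.put(f"{VISIONECT}/api/device/{DEVICE_UUID}/image",
                 data=png, headers={"Content-Type": "image/png"}, timeout=30)

while True:
    push_frame()
    time.sleep(300)  # every five minutes, as in the setup described above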
Visiting friends often asked how they could have a similar system in their home. Three years after the initial prototype, I did my first market test with a potential customer. At their request, I experimented with different formats, including a month view on the 13” screen:
Unfortunately, the customer didn’t see enough value to justify the $1000 price tag (in 2019!) for the 13” device, let alone anything I’d charge for a subscription service. At around the same time, Visionect started charging a $7/mo per-device fee to run their backend software on premises with Docker, after years of it being free to use. I’d have needed to charge $10/month, if not more, for a single screen!
In late 2021, the Marshall Fire destroyed our home along with ~1,000 others. Our homeowner’s insurance gave us two years to rebuild, so we set off to redesign our home from the ground up.
Around the same time, Boox released the 25.3” Mira Pro, the first high-resolution option for large e-paper screens. Best of all, it could update in realtime! Unlike the Visionect devices, it was just a display with an HDMI port and needed to be plugged into power. A quick prototype powered by an old Mac Mini made it immediately obvious that it was a huge step forward in capability. The larger screen allowed for significantly more information to be displayed:
But the most compelling innovation was having the screen update in realtime. I added a clock, the current song playing on our Sonos system (using jishi/node-sonos-http-api) and the next-hour precipitation forecast from Dark Sky:
The working prototype was enough to convince me to build a place for it in the new house. We designed a “phone nook” on our main floor with an art light for the display:
We also ran power to two more locations for 13” Visionect displays, one in our bedroom and one by the door to our garage:
The real-time requirements of the Mira Pro immediately surfaced performance and complexity issues in the backend, prompting an almost complete rewrite.
While the Visionect system worked just fine with multiple-second response times, switching to long-polling every two seconds put a ceiling on how slow response times could be. To start, I moved away from generating images. The Visionect folks added the ability to render a URL directly in the backend, freeing up resources to serve the long-polling requests.
Most significantly, I started migrating towards Home Assistant (HA) as the primary data source. HA already had integrations for Google Calendar, Dark Sky (now Apple Weather), and Sonos, enabling me to remove over half of the code in the Timeframe codebase! I ended up landing a PR to Home Assistant to allow for the calendar behavior I needed, and will probably need to write a couple more before HA can be the sole data source.
With less data-fetching logic, I was able to remove both the database and Redis from the Rails application, a massive reduction in complexity. I now run the background tasks with Rufus Scheduler and save data fetching results with the Rails file store cache backend.
In addition to data retrieval, I've also worked to move as much of the application logic as possible into Home Assistant. I now automatically display the status of any sensor whose name begins with sensor.timeframe, using a simple ICON,Label CSV format.
For example, the other day I wanted to have a reminder to start or schedule our dishwasher after 8pm if it wasn’t set to run. It took me about a minute to write a template sensor using the power level from the outlet:
{% if states('sensor.kitchen_dishwasher_switched_outlet_power')|float < 2 and now().hour > 19 %}
utensils,Run the dishwasher!
{% endif %}
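For readers new to Home Assistant, here is a hedged sketch of how such a helper might be declared in configuration.yaml using the modern template integration. The sensor name and the float(0) default are my assumptions; the name is chosen so the entity lands under the sensor.timeframe prefix mentioned above:

template:
  - sensor:
      # Entity becomes sensor.timeframe_dishwasher, matching the
      # sensor.timeframe naming convention; name and default are assumptions.
      - name: "timeframe_dishwasher"
        state: >-
          {% if states('sensor.kitchen_dishwasher_switched_outlet_power')|float(0) < 2
                and now().hour > 19 %}
            utensils,Run the dishwasher!
          {% endif %}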
In the month since adding the helper, it reminded me twice when I’d have otherwise forgotten. And I didn’t have to commit or deploy any code!
Since moving into our new home, we’ve come to rely on the real-time functionality much more significantly. Effectively, we’ve turned the top-left corner of the displays into a status indicator for the house. For example, it shows what doors are open/unlocked:
Or whether the laundry is done:
It has a powerful function: if the status on the display is blank, the house is in a "healthy" state and does not need any attention. This approach of only showing the information that is relevant in a given moment flies in the face of how most smart homes approach communicating their status:
The single status indicator removes the need to scan an entire screen. This change in approach is possible because of one key difference: we have separated the control of our devices from the display of their status.
I continue to receive significant interest in the project and remain focused on bringing it to market. A few key issues remain:
While I have made significant progress in handling runtime errors gracefully, I have plenty to learn about creating embedded systems that do not need maintenance.
There are still several data sources I fetch directly outside of Home Assistant. Once HA is the sole source of data, I'll be able to make Timeframe a Home Assistant app, making it significantly easier to distribute.
The current hardware setup is not ready for adoption by the average consumer. The 25” Boox display is excellent but costs about $2000! It also doesn’t include the hardware needed to drive the display. There are a couple of potential options to consider, such as Android-powered devices from Boox and Philips or low-cost options from TRMNL.
Building Timeframe continues to be a passion of mine. While my day job has me building software for over a hundred million people, it’s refreshing to work on a project that improves my family’s daily life.
...
Read the original on hawksley.org »
And I don’t just mean that nobody uses it anymore. Like, I knew everyone under 50 had moved on, but I didn’t realize the extent of the slop conveyor belt that’s replaced us.
I logged on for the first time in ~8 years to see if there was a group for my neighborhood (there wasn’t). Out of curiosity I thought I’d scroll a bit down the main feed.
The first post was the latest xkcd (a page I follow). The next ten posts were not by friends or pages I follow. They were basically all thirst traps of young women, mostly AI-generated, with generic captions. Here’s a sampler — mildly NSFW, but I did leave out a couple of the lewder ones:
Yikes. Again, I don’t follow any of these pages. This is all just what Facebook is pushing on me.
I know Twitter/X has worse problems with spam bots in the replies, but this is the News Feed! It’s the main page of the site! It’s the product that defined modern social media!
It wasn’t all like that, though. There was also an AI video of a policeman confiscating a little boy’s bike, only to bring him a brand new one:
And there were some sloppy memes and jokes, mostly about relationships, like this (admittedly not AI) video sketch where a woman decides to intentionally start a fight with her boyfriend because she’s on her period:
Maybe that isn’t literally about sex, but I’d classify it as the same sort of lizard-brain-rot engagement bait as those selfies.
Several commenters have vouched that Yoleendadong makes funny, high-quality content and shouldn’t be lumped in with AI slop. I’m just saying I think there’s a reason this particular video of hers popped up, and it’s probably the kind of engagement created by the premise.
Meta even gives us some helpful ideas for sexist questions we can ask their AI about the video:
Yep, that’s another “yikes” from me. To be fair, though, sometimes that suggested questions feature is pretty useful! Like with this post, for example:
Why is she wearing pink heels? What is her personality? Great questions, Meta.
I said these were “mostly” AI-generated. The truth is with how good the models are getting these days, it’s hard to tell, and I think a couple of them might be real people.
Still, some of these are pretty obviously AI. Here’s one with a bunch of alien text and mangled logos on the scoreboard in the background:
Hmm, I wonder if anyone has noticed this is AI? Let’s check out the comments and see if anyone’s pointed that ou—
…never mind. (I dunno, maybe those are all bots too.)
So: is this just something wacky with my algorithm?
I mean… maybe? That’s part of the whole thing with these algorithmic feeds; it’s hard to know if anyone else is seeing what I’m seeing.
On the one hand, I doubt most (straight) women’s feeds would look like this. But on the other hand, I hadn’t logged in in nearly a decade! I hate to think what the feed looks like for some lonely old guy who’s been scrolling the lightly-clothed AI gooniverse for hours every day.
Did everyone but me know it was like this? I’d seen screencaps of stuff like the Jesus-statue-made-out-of-broccoli slop a year or two ago, but I thought that only happened to grandmas. I hadn’t heard it was this bad.
I wonder if this evolution was less noticeable for people who are logging in every day. Or maybe it only gets this bad when there aren’t any posts from your actual friends?
In any case, I stopped exploring after I saw a couple more of those AI-generated pictures but with girls that looked like they were about ~14, which made me sick to my stomach. So long Facebook, see you never, until one day I inexplicably need to use your platform to get updates from my kid’s school.
...
Read the original on pilk.website »
Claude Sonnet 4.6 is our most capable Sonnet model yet. It's a full upgrade of the model's skills across coding, computer use, long-context reasoning, agent planning, knowledge work, and design. Sonnet 4.6 also features a 1M token context window in beta. For those on our Free and Pro plans, Claude Sonnet 4.6 is now the default model in claude.ai and Claude Cowork. Pricing remains the same as Sonnet 4.5, starting at $3/$15 per million tokens.

Sonnet 4.6 brings much-improved coding skills to more of our users. Improvements in consistency, instruction following, and more have made developers with early access prefer Sonnet 4.6 to its predecessor by a wide margin. They often even prefer it to our smartest model from November 2025, Claude Opus 4.5.

Performance that would have previously required reaching for an Opus-class model, including on real-world, economically valuable office tasks, is now available with Sonnet 4.6. The model also shows a major improvement in computer use skills compared to prior Sonnet models.

As with every new Claude model, we've run extensive safety evaluations of Sonnet 4.6, which overall showed it to be as safe as, or safer than, our other recent Claude models. Our safety researchers concluded that Sonnet 4.6 has "a broadly warm, honest, prosocial, and at times funny character, very strong safety behaviors, and no signs of major concerns around high-stakes forms of misalignment."

Almost every organization has software it can't easily automate: specialized systems and tools built before modern interfaces like APIs existed. To have AI use such software, users would previously have had to build bespoke connectors. But a model that can use a computer the way a person does changes that equation.

In October 2024, we were the first to introduce a general-purpose computer-using model. At the time, we wrote that it was "still experimental—at times cumbersome and error-prone," but we expected rapid improvement. OSWorld, the standard benchmark for AI computer use, shows how far our models have come. It presents hundreds of tasks across real software (Chrome, LibreOffice, VS Code, and more) running on a simulated computer. There are no special APIs or purpose-built connectors; the model sees the computer and interacts with it in much the same way a person would: clicking a (virtual) mouse and typing on a (virtual) keyboard.

Across sixteen months, our Sonnet models have made steady gains on OSWorld. The improvements can also be seen beyond benchmarks: early Sonnet 4.6 users are seeing human-level capability in tasks like navigating a complex spreadsheet or filling out a multi-step web form, before pulling it all together across multiple browser tabs.

The model certainly still lags behind the most skilled humans at using computers. But the rate of progress is remarkable nonetheless. It means that computer use is much more useful for a range of work tasks, and that substantially more capable models are within reach.

Scores prior to Claude Sonnet 4.5 were measured on the original OSWorld; scores from Sonnet 4.5 onward use OSWorld-Verified. OSWorld-Verified (released July 2025) is an in-place upgrade of the original OSWorld benchmark, with updates to task quality, evaluation grading, and infrastructure.

At the same time, computer use poses risks: malicious actors can attempt to hijack the model by hiding instructions on websites in what's known as a prompt injection attack.
We've been working to improve our models' resistance to prompt injections. Our safety evaluations show that Sonnet 4.6 is a major improvement compared to its predecessor, Sonnet 4.5, and performs similarly to Opus 4.6. You can find out more about how to mitigate prompt injections and other safety concerns in our API docs.

Beyond computer use, Claude Sonnet 4.6 has improved on benchmarks across the board. It approaches Opus-level intelligence at a price point that makes it more practical for far more tasks. You can find a full discussion of Sonnet 4.6's capabilities and its safety-related behaviors in our system card; a summary and comparison to other recent models is below.

In Claude Code, our early testing found that users preferred Sonnet 4.6 over Sonnet 4.5 roughly 70% of the time. Users reported that it more effectively read the context before modifying code and consolidated shared logic rather than duplicating it. This made it less frustrating to use over long sessions than earlier models.

Users even preferred Sonnet 4.6 to Opus 4.5, our frontier model from November, 59% of the time. They rated Sonnet 4.6 as significantly less prone to overengineering and "laziness," and meaningfully better at instruction following. They reported fewer false claims of success, fewer hallucinations, and more consistent follow-through on multi-step tasks.

Sonnet 4.6's 1M token context window is enough to hold entire codebases, lengthy contracts, or dozens of research papers in a single request. More importantly, Sonnet 4.6 reasons effectively across all that context. This can make it much better at long-horizon planning. We saw this particularly clearly in the Vending-Bench Arena evaluation, which tests how well a model can run a (simulated) business over time, and which includes an element of competition, with different AI models facing off against each other to make the biggest profits.

Sonnet 4.6 developed an interesting new strategy: it invested heavily in capacity for the first ten simulated months, spending significantly more than its competitors, and then pivoted sharply to focus on profitability in the final stretch. The timing of this pivot helped it finish well ahead of the competition.

Sonnet 4.6 outperforms Sonnet 4.5 on Vending-Bench Arena by investing in capacity early, then pivoting to profitability in the final stretch.

Early customers also reported broad improvements, with frontend code and financial analysis standing out. Customers independently described visual outputs from Sonnet 4.6 as notably more polished, with better layouts, animations, and design sensibility than those from previous models. Customers also needed fewer rounds of iteration to reach production-quality results.

Claude Sonnet 4.6 matches Opus 4.6 performance on OfficeQA, which measures how well a model can read enterprise documents (charts, PDFs, tables), pull the right facts, and reason from those facts. It's a meaningful upgrade for document comprehension workloads.

The performance-to-cost ratio of Claude Sonnet 4.6 is extraordinary. It's hard to overstate how fast Claude models have been evolving in recent months.
Sonnet 4.6 outperforms on our orchestration evals, handles our most complex agentic workloads, and keeps improving the higher you push the effort settings.

Claude Sonnet 4.6 is a notable improvement over Sonnet 4.5 across the board, including long-horizon tasks and more difficult problems.

Out of the gate, Claude Sonnet 4.6 is already excelling at complex code fixes, especially when searching across large codebases is essential. For teams running agentic coding at scale, we're seeing strong resolution rates and the kind of consistency developers need.

Claude Sonnet 4.6 has meaningfully closed the gap with Opus on bug detection, letting us run more reviewers in parallel, catch a wider variety of bugs, and do it all without increasing cost.

For the first time, Sonnet brings frontier-level reasoning in a smaller and more cost-effective form factor. It provides a viable alternative if you are a heavy Opus user.

Claude Sonnet 4.6 meaningfully improves the answer retrieval behind our core product; we saw a significant jump in answer match rate compared to Sonnet 4.5 in our Financial Services Benchmark, with better recall on the specific workflows our customers depend on.

Box evaluated how Claude Sonnet 4.6 performs when tested on deep reasoning and complex agentic tasks across real enterprise documents. It demonstrated significant improvements, outperforming Claude Sonnet 4.5 in heavy reasoning Q&A by 15 percentage points.

Claude Sonnet 4.6 hit 94% on our insurance benchmark, making it the highest-performing model we've tested for computer use. This kind of accuracy is mission-critical to workflows like submission intake and first notice of loss.

Claude Sonnet 4.6 delivers frontier-level results on complex app builds and bug-fixing. It's becoming our go-to for the kind of deep codebase work that used to require more expensive models.

Claude Sonnet 4.6 produced the best iOS code we've tested for Rakuten AI. Better spec compliance, better architecture, and it reached for modern tooling we didn't ask for, all in one shot. The results genuinely surprised us.
Sonnet 4.6 is a significant leap forward on reasoning through difficult tasks. We find it especially strong on branched and multi-step tasks like contract routing, conditional template selection, and CRM coordination, exactly where our customers need strong model sense and reliability.

We've been impressed by how accurately Claude Sonnet 4.6 handles complex computer use. It's a clear improvement over anything else we've tested in our evals.

Claude Sonnet 4.6 has perfect design taste when building frontend pages and data reports, and it requires far less hand-holding to get there than anything we've tested before.

Claude Sonnet 4.6 was exceptionally responsive to direction, delivering precise figures and structured comparisons when asked, while also generating genuinely useful ideas on trial strategy and exhibit preparation.

On the Claude Developer Platform, Sonnet 4.6 supports both adaptive thinking and extended thinking, as well as context compaction in beta, which automatically summarizes older context as conversations approach limits, increasing effective context length.

On our API, Claude's web search and fetch tools now automatically write and execute code to filter and process search results, keeping only relevant content in context, improving both response quality and token efficiency. Additionally, code execution, memory, programmatic tool calling, tool search, and tool use examples are now generally available.

Sonnet 4.6 offers strong performance at any thinking effort, even with extended thinking off. As part of your migration from Sonnet 4.5, we recommend exploring across the spectrum to find the ideal balance of speed and reliable performance, depending on what you're building.

We find that Opus 4.6 remains the strongest option for tasks that demand the deepest reasoning, such as codebase refactoring, coordinating multiple agents in a workflow, and problems where getting it just right is paramount.

For Claude in Excel users, our add-in now supports MCP connectors, letting Claude work with the other tools you use day-to-day, like S&P Global, LSEG, Daloopa, PitchBook, Moody's, and FactSet. You can ask Claude to pull in context from outside your spreadsheet without ever leaving Excel. If you've already set up MCP connectors in Claude.ai, those same connections will work in Excel automatically. This is available on Pro, Max, Team, and Enterprise plans.

How to use Claude Sonnet 4.6

Claude Sonnet 4.6 is available now on all Claude plans, Claude Cowork, Claude Code, our API, and all major cloud platforms. We've also upgraded our free tier to Sonnet 4.6 by default; it now includes file creation, connectors, skills, and compaction.

If you're a developer, you can get started quickly by using claude-sonnet-4-6 via the Claude API.
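For reference, a minimal call with the Anthropic Python SDK, using the model ID given above; the prompt and token limit are illustrative:

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

message = client.messages.create(
    model="claude-sonnet-4-6",  # model ID from the announcement above
    max_tokens=1024,
    messages=[{"role": "user", "content": "Summarize this contract in three bullet points."}],
)
print(message.content[0].text)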
...
Read the original on www.anthropic.com »
🇬🇧->🇵🇱 Przejdź do polskiej wersji tego wpisu / Go to the Polish version of this post
Just a year ago, I was really deep into the Apple ecosystem. It seemed like there was no turning back from the orchard for me. Phone, laptop, watch, tablet, video and music streaming, cloud storage, and even a key tracker. All from one manufacturer. Plus shared family photo albums, calendars, and even shopping lists.
However, at some point I discovered Plenti, a company that rents out a really wide range of devices at quite reasonable prices. Casually, I threw the phrase "samsung fold" into the search engine on their website, and it turned out that the Samsung Galaxy Z Fold 6 could be rented for just 250-300 PLN per month. That was quite an interesting option, as I was insanely curious what it's like to live with a foldable phone that unfolds into the equivalent of a tablet. Besides, I would never dare to buy this type of device, because firstly, their prices are astronomical, and secondly, I have serious doubts about the longevity of the folding screen. I checked Plenti's rental conditions and nothing raised my suspicions. Renting seemed like a really cool option, so I decided to get the Fold 6 for half a year. That's how I broke out of the orchard and slightly reopened the doors of my heart to solutions without the apple logo. I even wrote a post about the whole process: I betrayed #TeamApple for a broken phone. What I'm getting at is that this is how Android returned to my living room, and I think I started liking it anew.
My adventure with Samsung ended after the planned 6 months. The Galaxy Z Fold 6 is a good phone, and the ability to unfold it to the size of a tablet is an amazing feature. However, what bothered me about it was:
* paying 300 PLN (~80 USD) per month in rent is a good short-term solution to get something to test, but not in the long run.
All the points above made me give up on extending the rental and start wondering what to do next. Interestingly, I liked Android enough that I didn't necessarily want to go back to iOS. Around this time, an article hit my RSS reader: Creators of the most secure version of Android fear France. Travel ban for the whole team (I think it was this one, but I'm not entirely sure; it doesn't really matter). It described how France wants to get its hands on the GrapheneOS system and thereby carry out a very serious attack on the privacy of its users. I thought then, "Hey! A European country wants to force a backdoor into the system because it is too well secured for them to surveil its users. Either the topic is being artificially blown out of proportion, or there is actually something special about this system!". At that moment, a somewhat forgotten nerd gene ignited in me. I decided to abandon not only iOS but also mainstream Android, and try a completely alternative system.
GrapheneOS is a custom, open-source operating system designed with the idea of providing users with the highest level of privacy and security. It is based on the Android Open Source Project (AOSP), but differs significantly from standard software versions found in smartphones. Its creators completely eliminated integration with Google services at the system level, which avoids tracking and data collection by corporations, while offering a modern and stable working environment.
The system is distinguished by advanced “hardening” of the kernel and key components, which minimizes vulnerability to hacking attacks and exploits. A unique feature of GrapheneOS is the ability to run Google Play Services in an isolated environment (sandbox), allowing the user to use popular applications without granting them broad system permissions. Currently, the project focuses on supporting Google Pixel series phones, utilizing their dedicated Titan M security chips for full data protection.
When I used to read about GrapheneOS, the list of compatible devices included items from several different manufacturers. Now it's only Google Pixel devices. This doesn't mean you can't run this system on, for example, a Samsung, but the creators simply don't guarantee it will work properly, and you may have to deal with porting the build yourself. It's quite funny that a system freed from Google services is best run precisely on Google devices. If anyone wants to read more about why Pixels are the best fit for GrapheneOS, I recommend looking up the following keywords: Verified Boot, Titan M, IOMMU, MTE.
At the stage of choosing a device to test GrapheneOS on, I wasn't yet sure whether such a solution would work for me at all or whether I'd stick with it in the long run, so it would have been unreasonable to lay out a significant amount of money. Because of this, probably the only sensible choice was the Google Pixel 9a. This was a few months ago, when not enough time had passed since the premiere of the 10-series models for them to make it onto the fully supported devices list. At the time, the Pixel 9a was the freshest device on the list (offering up to 7 YEARS of support!) and, on top of that, it was very attractively priced: I bought it for around 1600 PLN (~450 USD).
In retrospect, I still consider it a good choice and definitely recommend this path to anyone who is currently at the stage of deciding on what hardware to start their GrapheneOS adventure. The only thing that bothers me a bit about the Pixel 9a is the quality of the photos it takes. I switched to it having previously had the iPhone 15 Pro and Samsung Galaxy Z Fold 6, which are excellent in this regard, so it’s no wonder I’m a bit spoiled, because I was simply used to a completely different level of cameras. Now I also know that GrapheneOS will stay with me for longer, so it’s possible that knowing then what I know now, I would have opted for some more expensive gear. However, this isn’t important to me now, because for the time being I don’t plan to switch to another device, and by the time that changes, the market situation and the list of available options will certainly have changed too. Besides, I’m positively surprised by the battery life and overall performance of this phone.
* A suitable smartphone - in my case, it's a Google Pixel 9a.
* A cable to connect the phone to a computer; it can't be just any cable, but one that is used not only for charging but also for data transmission. It's best to just use the cable that came with the phone.
* A computer with a Chromium-based browser (e.g., Google Chrome, Brave, Microsoft Edge, Vivaldi?). Unfortunately, I must recommend Windows 10/11 here, because then you don't have to mess around with any drivers; it's the simplest option.
If it’s new, we take it out of the box and turn it on. If it was previously used, we restore it to factory settings (Settings -> System -> Reset options -> Erase all data (factory reset) -> Erase all data). I think it’s stating the obvious, but I’ll write it anyway - a factory reset results in the deletion of all user data from the device, so if you have anything important on it, you need to back it up.
We must go through the basic setup until we see the home screen. We do the absolute minimum. Here is a breakdown of the steps:
* we don't connect to Wi-Fi, so we skip this step too
* we don't need to do anything with the warranty terms, so just the Next button
* there is no need to waste time setting up biometrics, so we politely decline and skip the fingerprint and face scan
First of all, we need to make sure that our phone’s software is updated to the latest available version. For this purpose, we go to Settings -> System -> System update. If necessary, we update.
Next, we go to Settings -> About phone -> find the Build number field and tap it 7 times until we see the message You are now a developer. In the meantime, the phone will ask for the PIN we set during the phone setup.
We go back and now enter Settings -> System -> Developer options -> turn on the OEM unlocking option. The phone will ask for the PIN again. After entering it, we still have to confirm that we definitely want to remove the lock.
When the screen goes completely dark, we simultaneously press and hold the power and volume down buttons until the text-based Fastboot Mode interface appears. If the phone starts up normally, it means we performed one of the earlier steps incorrectly.
We go to the computer and open the browser (based on the Chromium engine) to the address https://grapheneos.org/install/web.
A window with a list of devices to choose from will pop up in the browser. There should basically be only one item on it, and that should be our Pixel. We select it and press the Connect button.
Changes will occur on the phone's display. A message will appear asking us to confirm that we actually want to unlock the bootloader. To do this, we must press one of the volume buttons so that Unlock the bootloader appears instead of Do not unlock the bootloader. At this point, we can confirm by pressing the power button.
On the GrapheneOS website, we scroll down to the Obtaining factory images section and press the Download release button. If the phone is still connected to the computer, the website will decide on its own which system image to download.
We wait for the download to finish. It is obvious that the time needed for this depends directly on the speed of the internet connection.
Locking the bootloader is crucial because it enables the full operation of the Verified Boot feature. It also prevents the use of fastboot mode to flash, format, or wipe partitions. Verified Boot detects any modifications to the OS partitions and blocks the reading of any altered or corrupted data. If changes are detected, the system uses error correction data to attempt to recover the original data, which is then verified again — thanks to this mechanism, the system is resilient to accidental (non-malicious) file corruption.
While in Fastboot Mode, when we see the Start message, we press the power button, which will cause the system to start normally. If Start is not shown next to the power button, we press the volume buttons to find this option.
This is a standard procedure, so we will only go through it briefly:
* I recommend turning off the location service, because it's better to configure it calmly later by granting permissions only to apps that really need it
* securing the phone with a fingerprint: I personally am an advocate of this solution, so I recommend using it. GrapheneOS does not (yet) support face unlock, so a fingerprint and a standard password are the only methods we have to choose from (of course, I reject pattern unlock out of hand, as a form of screen lock that cannot in good conscience be called any kind of security)
* I assume that if you are reading this post, you are a graphene freshman and have no backup to restore, so we just skip this step
We land back in Fastboot Mode. I assume the phone was connected to the computer the whole time (if not, reconnect it). We return to the browser on the computer. We find the Locking the bootloader section and press the Lock bootloader button.
Again, confirmation of this operation on the phone is required. It looks analogous to unlocking, except this time, using the volume buttons, we have to make the Lock the bootloader option active and confirm it with the power button.
Just like when removing the lock, we go to Settings -> About phone -> find the Build number field and tap it 7 times until we see the message You are now a developer. In the meantime, the phone will ask for the PIN we set during the phone setup.
We go back and now enter Settings -> System -> Developer options -> turn off the OEM unlocking option. The phone will ask us to restart to change this setting, but for now we cancel this request, because we still want to completely turn off Developer options, which is done by unchecking the box next to the first option at the very top, Use developer options.
Now the real fun begins. You'll hear/read as many opinions on what you should and shouldn't do regarding GrapheneOS hardening as there are people. Some are conservative, while others approach the topic a bit more liberally. In my opinion, there is no one right path, and everyone should dig around, test things out, and decide what suits them and fits their security profile. You'll quickly find out that GrapheneOS is really one big compromise between convenience and privacy. While the same rule applies to everything in the digital world, it's only here that you'll truly notice it, because GrapheneOS will show you how many things you can control that you can't on conventional Android.

I don't intend to use this post to promote some "one and only" method of using GrapheneOS. I'll simply present how I use this system. This way, I'll show the basics to people fresh to the topic; maybe I'll manage to suggest an interesting trick to those who have been users for a while; and thirdly, maybe some expert will show up who, after reading my ramblings, will suggest something interesting or point out what I'm doing wrong or could do better. I'm sure that's possible, since my adventure with GrapheneOS has only been going on for about 3 months. I warn you right away that I'm not sure I'll be able to maintain a logical train of thought, as I'll probably jump around topics a bit. The subject of GrapheneOS is vast, and in today's post I'll only manage to scratch the surface.
One of the first things I did after booting up the freshly installed system was to create a second user profile. This is done in Settings -> System -> Multiple users. The idea is for this feature to allow two (or more) people to use one phone, each having a separate profile with their own settings, apps, etc. Who in their right mind does that? While I can imagine sharing a home tablet, sharing a phone completely eludes me. It therefore seems like a dead feature, but nothing could be further from the truth.
For me, it works like this: on the Owner user, because that’s the name of the main account created automatically with the system, I installed the Google Play Store along with Google Play services and GmsCompatConfig. This is done through the App Store application, which is a component of the GrapheneOS system. Please don’t confuse this with Apple’s app store, even though the name is the same. From the Play Store I only installed the following applications:
And that's it. As you can see, this profile serves me only for apps that absolutely require integration with Google services. In practice, I switch to it only when I want to pay contactlessly in a store, which I actually do rarely lately, because when there's an option, I pay using BLIK codes. Right after switching from Samsung there were more apps on this profile, but one by one I gave up those that made me dependent on the big G.
It's on the second profile, which let's say I named Tommy, that I keep my entire digital life. What does this give me? For instance, the main profile cannot be easily deleted, but the additional one can. Let's imagine a situation where I need to quickly wipe my phone, but in a way that leaves its basic functions working, i.e., without a full factory reset. An example could be, say, arriving in the USA and undergoing immigration control. They want access to my phone, so I delete the Tommy user, switch to the Owner user, and hand them the phone. It makes calls, sends SMS messages, even has a banking app, so in theory it shouldn't arouse suspicion. However, it lacks all my contacts, a browser with my browsing history, a password manager, and messengers with chat histories. This is a rather drastic scenario, but not that improbable, as phone searches upon arrival in the States happen on a daily basis. Besides, a basic rule of security is not to use an account with administrator privileges day-to-day.
On GrapheneOS, Obtainium is my primary aggregator for obtaining .apk installation files and automating app updates. It’s like the Google Play Store, but privacy-respecting and for open-source applications. It would be a sin to use GrapheneOS and not at least try to switch to open-source apps. Below I present a list of apps that I use. Additionally, I’m tossing in links to the source code repositories of each of them.
To understand how Obtainium works and how to use it, I recommend checking out this video guide.
I have a few apps that are not open-source, but I still need them. In this case, I don't download them from the Google Play Store, but from the Aurora Store, which I mentioned above.
Aurora Store is an open-source client of the Google Play store (I guess you could call it a frontend) that allows downloading applications from Google servers without needing Google services (GMS) on the phone.
* Privacy - you don’t need to log in with a Google account to download free apps (you can use built-in anonymous accounts).
The thing with these anonymous accounts is that sometimes they work and sometimes they don't. The limits involved are unreachable for a normal account used by one person, but when a thousand people download apps through one account at once, it starts to look suspicious and the limits are hit quite quickly. Using the Aurora Store violates the Google Play Store terms of service, so if we use our own Google account instead, it might be temporarily blocked or permanently banned. One option here is to create a "burner" account just for this, but that takes away some of our privacy, because Google can still profile us based on what we downloaded. Anonymous accounts provide almost complete anonymity, because then we are just a drop in the ocean.
When it comes to security, yes, in theory we download .apk files from a verified source, but only on the condition that the Aurora Store creators don't mount a man-in-the-middle attack. Whether you trust the creators of this app is up to you.
Below I present a list of applications that I downloaded from the Aurora Store, checked, and can confirm that they work without GMS (Google Mobile Services).
* My municipality’s app - because I need to know when they’ll collect my trash :)
* OpenVPN - I use it as a tunnel to my home network
* Perplexity - I switched to Gemini, but I confirm it works
* Synology Photos - my home photo gallery on a NAS
* Pocket Casts - podcasts, I plan to migrate to AntennaPod
* TickTick - to-do lists, it’s hard for me to find a sensible alternative that is multiplatform and has all the features I need
Has anyone ever wondered whether all apps on a phone need Internet access? Indeed, in the vast majority of cases a mobile app without network access is useless, but you can't generalize like that. For example, the previously mentioned FUTO Voice Input uses a local LLM to convert speech to text, which works offline on the device. Why would such an app need Internet access, then? It doesn't, so it shouldn't have that permission. Now take apps like FairScan (document scanning), Catima (loyalty card aggregator), Collabora Office (office suite), or Librera (ebook reader). They don't need Internet access either!
The situation looks even more bizarre when you consider which apps actually need access to all of our device's sensors. If we think about it calmly, we'll conclude that here it's the complete opposite of the previous case: practically no app needs this information. And I remind you that by default on Android with Google services, all apps have such permissions.
To manage a given application's permissions, just tap and hold its icon, select App info from the pop-up menu, and find the Permissions tab. A list categorized into Allowed, Ask every time, and Not allowed will appear. I recommend reviewing this list for each app right after installing it. This is the foundation of GrapheneOS hardening.
A collective menu where you can view specific permissions and which apps have them granted is available in Settings -> Security & privacy -> Privacy -> Permission manager. Another interesting place is the Privacy dashboard available in the same location. It’s a tool that shows not only app permissions, but also how often a given app reaches for the permissions granted to it.
In GrapheneOS we don’t only have user profiles, but each user can also have something called a Private space. I encountered something similar on Samsung, where it was called Secure Folder, so I assume this might just be an Android feature implemented differently by each manufacturer.
Private space is turned on in Settings -> Security & privacy -> Private space. It acts as a kind of separate sandbox that is part of the environment you use but at the same time isolated from it. For me, it's a place that gives me quick access to apps that do require Google services. You might ask: why then do I keep the mBank and T-Mobile apps on the Owner user if I could keep them here? Well, for reasons unknown to me, I'm unable to configure my private space so that contactless BLIK payments via NFC work correctly in it. The same goes for Magenta Moments from T-Mobile, which doesn't work correctly despite GMS being installed in the private space.
* Google Drive - I use it as a cloud to share files with clients
* mObywatel - at first I kept this app in the main profile, downloaded from the Aurora Store, and everything somewhat worked, but every now and then the app would freeze completely and stop responding. I think it might be related to the fact that it sends some Google-services-related requests in the background and doesn't respond until such a request times out; I have this on my list to investigate
* Play Store - I have to download all these apps from somewhere; doing it via the Aurora Store in the private space wouldn't make sense since I have the whole Google services package installed here anyway
* XTB - another investing app… works without GMS, but as I said, I keep all financial apps in one place
Oof… I did it again, sorry. I'm just counting the characters and it comes out to just under 35,000… I'll probably break that barrier with these next few sentences. Well, long again, but pure meat again, so I don't think anyone has reason to complain. As I mentioned earlier, I've only scratched the surface of GrapheneOS, which is an extensive topic, and that's a good thing, because it's a great system, and the biggest respect goes to the people behind the project. It's thanks to them that we even have the option of at least partially freeing ourselves from Google (Android) and Apple (iOS). With that, I warmly invite you to the final chapter of this post.
Finally, I would like to encourage you to support the GrapheneOS project. The developers behind it are doing a really great job and in my opinion deserve to have some money thrown at them. Information on where and how this can be done can be found here.
...
Read the original on blog.tomaszdunia.pl »
A few days ago, people started tagging me on Bluesky and Hacker News about a diagram on Microsoft’s Learn portal. It looked… familiar.
In 2010, I wrote A successful Git branching model and created a diagram to go with it. I designed that diagram in Apple Keynote, at the time obsessing over the colors, the curves, and the layout until it clearly communicated how branches relate to each other over time. I also published the source file so others could build on it. That diagram has since spread everywhere: in books, talks, blog posts, team wikis, and YouTube videos. I never minded. That was the whole point: sharing knowledge and letting the internet take it by storm!
What I did not expect was for Microsoft, a trillion-dollar company, some 15+ years later, to apparently run it through an AI image generator and publish the result on their official Learn portal, without any credit or link back to the original.
The AI rip-off was not just ugly. It was careless, blatantly amateurish, and lacking any ambition, to put it gently. Unworthy of Microsoft. The carefully crafted visual language and layout of the original, the branch colors, the lane design, the dot and bubble alignment that made the original so readable: all of it had been muddled into a laughable form. Proper AI slop.
Missing arrows, arrows pointing in the wrong direction, and the obvious "continvoucly morged" text quickly gave it away as a cheap AI artifact.
It had the rough shape of my diagram, though. Enough, actually, that people recognized the original in it, started calling Microsoft out on it, and began reaching out to me. That so many people were upset about this was really nice, honestly. That, and "continvoucly morged" was a very fun meme. Thank you, internet! 😄
Oh god yes, Microsoft continvoucly morged my diagram there for sure 😬— Vincent Driessen (@nvie.com) 2026-02-16T20:55:54.762Z
Other than that, I find this whole thing mostly very saddening. Not because some company used my diagram. As I said, it's been everywhere for 15 years and I've always been fine with that. What's dispiriting is the (lack of) process and care: take someone's carefully crafted work, run it through a machine to wash off the fingerprints, and ship it as your own. This isn't a case of being inspired by something and building on it. It's the opposite of that. It's taking something that worked and making it worse. Is there even a goal here beyond "generating content"?
What worries me slightly is that this time around, the diagram was both well-known enough and obviously AI-slop-y enough that it was easy to spot as plagiarism. But we all know there will be more and more content like this that isn't so well-known, or that will soon be mutated or disguised in more advanced ways, so that the plagiarism is no longer recognizable as such.
I don't need much here. A simple link back and attribution to the original article would be a good start. I would also be interested in understanding how this Learn page at Microsoft came to be, what the goals were, what process led to the creation of this ugly asset, and how there seemingly was no proofreading of a document used as a learning resource by many developers.
...
Read the original on nvie.com »
Last week, we released a major update to Gemini 3 Deep Think to solve modern challenges across science, research and engineering. Today, we're releasing the upgraded core intelligence that makes those breakthroughs possible: Gemini 3.1 Pro. We are shipping 3.1 Pro across our consumer and developer products to bring this progress in intelligence to your everyday applications.

* For developers: in preview via the Gemini API in Google AI Studio, Gemini CLI, our agentic development platform Google Antigravity, and Android Studio
* For enterprises: in Vertex AI and Gemini Enterprise
* For consumers: via the Gemini app and NotebookLM

Building on the Gemini 3 series, 3.1 Pro represents a step forward in core reasoning. 3.1 Pro is a smarter, more capable baseline for complex problem-solving. This is reflected in our progress on rigorous benchmarks. On ARC-AGI-2, a benchmark that evaluates a model's ability to solve entirely new logic patterns, 3.1 Pro achieved a verified score of 77.1%. This is more than double the reasoning performance of 3 Pro.
3.1 Pro is designed for tasks where a simple answer isn’t enough, taking advanced reasoning and making it useful for your hardest challenges. This improved intelligence can help in practical applications, whether you’re looking for a clear, visual explanation of a complex topic, a way to synthesize data into a single view, or a way to bring a creative project to life.
Code-based animation: 3.1 Pro can generate website-ready, animated SVGs directly from a text prompt. Because these are built in pure code rather than pixels, they remain crisp at any scale and maintain incredibly small file sizes compared to traditional video.
Complex system synthesis: 3.1 Pro utilizes advanced reasoning to bridge the gap between complex APIs and user-friendly design. In this example, the model built a live aerospace dashboard, successfully configuring a public telemetry stream to visualize the International Space Station’s orbit.
Interactive design: 3.1 Pro codes a complex 3D starling murmuration. It doesn’t just generate the visual code; it builds an immersive experience where users can manipulate the flock with hand-tracking and listen to a generative score that shifts based on the birds’ movement. For researchers and designers, this provides a powerful way to prototype sensory-rich interfaces.
Creative coding: 3.1 Pro can translate literary themes into functional code. When prompted to build a modern personal portfolio for Emily Brontë’s “Wuthering Heights,” the model didn’t just summarize the text. It reasoned through the novel’s atmospheric tone to design a sleek, contemporary interface, creating a website that captures the essence of the protagonist.
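To make the “code-based animation” point concrete, here is a hand-written (not model-generated) example of the kind of asset meant: a pulsing dot in a few hundred bytes of SVG, embedded from TypeScript, that stays crisp at any size.

// Illustration only: an animated SVG as pure code, no pixel data.
const pulsingDot = `
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 100 100" width="100">
  <circle cx="50" cy="50" r="20" fill="royalblue">
    <animate attributeName="r" values="20;35;20" dur="2s" repeatCount="indefinite"/>
  </circle>
</svg>`;

// Render it in a browser context; it scales losslessly with the page.
document.body.insertAdjacentHTML("beforeend", pulsingDot);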
Since releasing Gemini 3 Pro in November, your feedback and the pace of progress have driven these rapid improvements. We are releasing 3.1 Pro in preview today to validate these updates and continue to make further advancements in areas such as ambitious agentic workflows before we make it generally available soon.

Starting today, Gemini 3.1 Pro in the Gemini app is rolling out with higher limits for users with the Google AI Pro and Ultra plans. 3.1 Pro is also now available on NotebookLM exclusively for Pro and Ultra users. And developers and enterprises can access 3.1 Pro now in preview in the Gemini API via AI Studio, Antigravity, Vertex AI, Gemini Enterprise, Gemini CLI and Android Studio.

We can’t wait to see what you build and discover with it.
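For developers who want to poke at the preview, a minimal TypeScript sketch using the @google/genai SDK might look like the following. The model identifier is an assumption based on this post, not a confirmed name; check the model list in AI Studio.

// Minimal sketch, assuming an API key in GEMINI_API_KEY and a hypothetical
// preview model id; adjust both to what AI Studio actually lists.
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const response = await ai.models.generateContent({
  model: "gemini-3.1-pro-preview", // assumed identifier
  contents: "Generate a small animated SVG of a bouncing ball.",
});

console.log(response.text);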
...
Read the original on blog.google »
Your page may be loading slowly because you’re building optimized sources. If you intended on using uncompiled sources, please click this link.
Google Cloud Console has failed to load JavaScript sources from www.gstatic.com.
Possible reasons are:
- www.gstatic.com or its IP addresses are blocked by your network administrator
- Google has temporarily blocked your account or network due to excessive automated requests
Please contact your network administrator for further assistance.
...
Read the original on console.cloud.google.com »
I’ve been using Claude Code as my primary development tool for approximately nine months, and the workflow I’ve settled into is radically different from what most people do with AI coding tools. Most developers type a prompt, sometimes use plan mode, fix the errors, repeat. The more terminally online are stitching together ralph loops, MCPs, gas towns (remember those?), etc. The results in both cases are a mess that completely falls apart for anything non-trivial.
The workflow I’m going to describe has one core principle: never let Claude write code until you’ve reviewed and approved a written plan. This separation of planning and execution is the single most important thing I do. It prevents wasted effort, keeps me in control of architecture decisions, and produces significantly better results with minimal token usage than jumping straight to code.
flowchart LR
R[Research] --> P[Plan]
P --> A[Annotate]
A -->|repeat 1-6x| A
A --> T[Todo List]
T --> I[Implement]
I --> F[Feedback & Iterate]
Every meaningful task starts with a deep-read directive. I ask Claude to thoroughly understand the relevant part of the codebase before doing anything else. And I always require the findings to be written into a persistent markdown file, never just a verbal summary in the chat.
read this folder in depth, understand how it works deeply, what it does and all its specificities. when that’s done, write a detailed report of your learnings and findings in research.md
study the notification system in great details, understand the intricacies of it and write a detailed research.md document with everything there is to know about how notifications work
go through the task scheduling flow, understand it deeply and look for potential bugs. there definitely are bugs in the system as it sometimes runs tasks that should have been cancelled. keep researching the flow until you find all the bugs, don’t stop until all the bugs are found. when you’re done, write a detailed report of your findings in research.md
Notice the language: “deeply”, “in great details”, “intricacies”, “go through everything”. This isn’t fluff. Without these words, Claude will skim. It’ll read a file, see what a function does at the signature level, and move on. You need to signal that surface-level reading is not acceptable.
The written artifact (research.md) is critical. It’s not about making Claude do homework. It’s my review surface. I can read it, verify Claude actually understood the system, and correct misunderstandings before any planning happens. If the research is wrong, the plan will be wrong, and the implementation will be wrong. Garbage in, garbage out.
The most expensive failure mode in AI-assisted coding isn’t wrong syntax or bad logic. It’s implementations that work in isolation but break the surrounding system. A function that ignores an existing caching layer. A migration that doesn’t account for the ORM’s conventions. An API endpoint that duplicates logic that already exists elsewhere. The research phase prevents all of this.
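As a hypothetical sketch of that first failure mode (all names invented, not from any real codebase):

// The convention the codebase already has: every read goes through the cache.
const cache = new Map<string, Promise<string>>();

async function queryUserFromDb(id: string): Promise<string> {
  return `user-record-${id}`; // stand-in for a real database query
}

function getUser(id: string): Promise<string> {
  let hit = cache.get(id);
  if (!hit) {
    hit = queryUserFromDb(id);
    cache.set(id, hit);
  }
  return hit;
}

// What an unresearched change tends to add: correct in isolation, wrong in
// context, because it silently bypasses the cache every other caller uses.
function getUserForReport(id: string): Promise<string> {
  return queryUserFromDb(id);
}

A deep read of the folder surfaces the getUser convention before any plan is written; a skim does not.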
Once I’ve reviewed the research, I ask for a detailed implementation plan in a separate markdown file.
I want to build a new feature that extends the system to perform <X>. write a detailed plan.md document outlining how to implement this. include code snippets
the list endpoint should support cursor-based pagination instead of offset. write a detailed plan.md for how to achieve this. read source files before suggesting changes, base the plan on the actual codebase
The generated plan always includes a detailed explanation of the approach, code snippets showing the actual changes, file paths that will be modified, and considerations and trade-offs.
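For a sense of scale, here is the kind of snippet a plan.md for the pagination prompt above might contain (a hypothetical sketch, not taken from the post):

// Cursor-based pagination over rows pre-sorted by id ascending: resume
// strictly after the cursor instead of counting an offset, so inserts and
// deletes between requests cannot skip or repeat rows.
type Page<T> = { items: T[]; nextCursor: string | null };

function paginate<T extends { id: string }>(
  rows: T[],
  cursor: string | null,
  limit: number,
): Page<T> {
  const start = cursor ? rows.findIndex((r) => r.id > cursor) : 0;
  const slice = start === -1 ? [] : rows.slice(start, start + limit);
  const nextCursor = slice.length === limit ? slice[slice.length - 1].id : null;
  return { items: slice, nextCursor };
}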
I use my own .md plan files rather than Claude Code’s built-in plan mode. The built-in plan mode sucks. My markdown file gives me full control. I can edit it in my editor, add inline notes, and it persists as a real artifact in the project.
One trick I use constantly: for well-contained features where I’ve seen a good implementation in an open source repo, I’ll share that code as a reference alongside the plan request. If I want to add sortable IDs, I paste the ID generation code from a project that does it well and say “this is how they do sortable IDs, write a plan.md explaining how we can adopt a similar approach.” Claude works dramatically better when it has a concrete reference implementation to work from rather than designing from scratch.
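For illustration, a reference like the sortable-ID one might boil down to a few lines in the ULID spirit (my sketch, not any repo’s actual code): a fixed-width timestamp prefix makes lexicographic order match creation order.

// Hypothetical sortable-ID sketch; real implementations add monotonicity
// guarantees within the same millisecond and stronger randomness.
function sortableId(): string {
  const time = Date.now().toString(36).padStart(9, "0"); // fixed width for millennia
  let rand = "";
  for (let i = 0; i < 16; i++) {
    rand += Math.floor(Math.random() * 36).toString(36);
  }
  return time + rand;
}

console.log(sortableId()); // sorts lexicographically after IDs created earlier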
But the plan document itself isn’t the interesting part. The interesting part is what happens next.
This is the most distinctive part of my workflow, and the part where I add the most value.
flowchart TD
W[Claude writes plan.md] --> R[I review in my editor]
R --> N[I add inline notes]
N --> S[Send Claude back to the document]
S --> U[Claude updates plan]
U --> D{Satisfied?}
D -->|No| R
D -->|Yes| T[Request todo list]
After Claude writes the plan, I open it in my editor and add inline notes directly into the document. These notes correct assumptions, reject approaches, add constraints, or provide domain knowledge that Claude doesn’t have.
The notes vary wildly in length. Sometimes a note is two words: “not optional” next to a parameter Claude marked as optional. Other times it’s a paragraph explaining a business constraint or pasting a code snippet showing the data shape I expect.
“use drizzle:generate for migrations, not raw SQL” — domain knowledge Claude doesn’t have
“no — this should be a PATCH, not a PUT” — correcting a wrong assumption
“remove this section entirely, we don’t need caching here” — rejecting a proposed approach
“the queue consumer already handles retries, so this retry logic is redundant. remove it and just let it fail” — explaining why something should change
“this is wrong, the visibility field needs to be on the list itself, not on individual items. when a list is public, all items are public. restructure the schema section accordingly” — redirecting an entire section of the plan
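That last note is the kind of correction that is hard to convey in chat but trivial inline. In TypeScript terms, the redirect amounts to something like this (hypothetical types, not from the article):

// Before: the plan put visibility on each item.
type ItemBefore = { id: string; listId: string; visibility: "public" | "private" };

// After: visibility lives on the list; items inherit it.
type List = { id: string; visibility: "public" | "private" };
type Item = { id: string; listId: string };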
Then I send Claude back to the document:
I added a few notes to the document, address all the notes and update the document accordingly. don’t implement yet
This cycle repeats 1 to 6 times. The explicit “don’t implement yet” guard is essential. Without it, Claude will jump to code the moment it thinks the plan is good enough. It’s not good enough until I say it is.
Why This Works So Well
The markdown file acts as shared mutable state between me and Claude. I can think at my own pace, annotate precisely where something is wrong, and re-engage without losing context. I’m not trying to explain everything in a chat message. I’m pointing at the exact spot in the document where the issue is and writing my correction right there.
This is fundamentally different from trying to steer implementation through chat messages. The plan is a structured, complete specification I can review holistically. A chat conversation is something I’d have to scroll through to reconstruct decisions. The plan wins every time.
Three rounds of “I added notes, update the plan” can transform a generic implementation plan into one that fits perfectly into the existing system. Claude is excellent at understanding code, proposing solutions, and writing implementations. But it doesn’t know my product priorities, my users’ pain points, or the engineering trade-offs I’m willing to make. The annotation cycle is how I inject that judgement.
Once the plan has converged, I ask for one final addition:

add a detailed todo list to the plan, with all the phases and individual tasks necessary to complete the plan - don’t implement yet
This creates a checklist that serves as a progress tracker during implementation. Claude marks items as completed as it goes, so I can glance at the plan at any point and see exactly where things stand. Especially valuable in sessions that run for hours.
When the plan is ready, I issue the implementation command. I’ve refined this into a standard prompt I reuse across sessions:
implement it all. when you’re done with a task or phase, mark it as completed in the plan document. do not stop until all tasks and phases are completed. do not add unnecessary comments or jsdocs, do not use any or unknown types. continuously run typecheck to make sure you’re not introducing new issues.
This single prompt encodes everything that matters:
“implement it all”: do everything in the plan, don’t cherry-pick
“mark it as completed in the plan document”: the plan is the source of truth for progress
“do not stop until all tasks and phases are completed”: don’t pause for confirmation mid-flow
“do not add unnecessary comments or jsdocs”: keep the code clean
“do not use any or unknown types”: maintain strict typing
“continuously run typecheck”: catch problems early, not at the end
I use this exact phrasing (with minor variations) in virtually every implementation session. By the time I say “implement it all,” every decision has been made and validated. The implementation becomes mechanical, not creative. This is deliberate. I want implementation to be boring. The creative work happened in the annotation cycles. Once the plan is right, execution should be straightforward.
Without the planning phase, what typically happens is Claude makes a reasonable-but-wrong assumption early on, builds on top of it for 15 minutes, and then I have to unwind a chain of changes. The “don’t implement yet” guard eliminates this entirely.
Once Claude is executing the plan, my role shifts from architect to supervisor. My prompts become dramatically shorter.
flowchart LR
I[Claude implements] --> R[I review / test]
R --> C{Correct?}
C -->|No| F[Terse correction]
F --> I
C -->|Yes| N{More tasks?}
N -->|Yes| I
N -->|No| D[Done]
Where a planning note might be a paragraph, an implementation correction is often a single sentence:
“You built the settings page in the main app when it should be in the admin app, move it.”
Claude has the full context of the plan and the ongoing session, so terse corrections are enough.
Frontend work is the most iterative part. I test in the browser and fire off rapid corrections.
For visual issues, I sometimes attach screenshots. A screenshot of a misaligned table communicates the problem faster than describing it.
“this table should look exactly like the users table, same header, same pagination, same row density.”
This is far more precise than describing a design from scratch. Most features in a mature codebase are variations on existing patterns. A new settings page should look like the existing settings pages. Pointing to the reference communicates all the implicit requirements without spelling them out. Claude would typically read the reference file(s) before making the correction.
When something goes in a wrong direction, I don’t try to patch it. I revert and re-scope by discarding the git changes:
“I reverted everything. Now all I want is to make the list view more minimal — nothing else.”
Narrowing scope after a revert almost always produces better results than trying to incrementally fix a bad approach.
Even though I delegate execution to Claude, I never give it total autonomy over what gets built. I do the vast majority of the active steering in the plan.md documents.
This matters because Claude will sometimes propose solutions that are technically correct but wrong for the project. Maybe the approach is over-engineered, or it changes a public API signature that other parts of the system depend on, or it picks a more complex option when a simpler one would do. I have context about the broader system, the product direction, and the engineering culture that Claude doesn’t.
flowchart TD
P[Claude proposes changes] --> E[I evaluate each item]
E --> A[Accept as-is]
E --> M[Modify approach]
E --> S[Skip / remove]
E --> O[Override technical choice]
A & M & S & O --> R[Refined implementation scope]
Cherry-picking from proposals: When Claude identifies multiple issues, I go through them one by one: “for the first one, just use Promise.all, don’t make it overly complicated; for the third one, extract it into a separate function for readability; ignore the fourth and fifth ones, they’re not worth the complexity.” I’m making item-level decisions based on my knowledge of what matters right now.
Trimming scope: When the plan includes nice-to-haves, I actively cut them. “remove the download feature from the plan, I don’t want to implement this now.” This prevents scope creep.
Protecting existing interfaces: I set hard constraints when I know something shouldn’t change: “the signatures of these three functions should not change, the caller should adapt, not the library.”
Overriding technical choices: Sometimes I have a specific preference Claude wouldn’t know about: “use this model instead of that one” or “use this library’s built-in method instead of writing a custom one.” Fast, direct overrides.
Claude handles the mechanical execution, while I make the judgement calls. The plan captures the big decisions upfront, and selective guidance handles the smaller ones that emerge during implementation.
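To make the first of those item-level calls concrete, “just use Promise.all” usually means replacing a bespoke concurrency helper with the built-in when the reads are independent. A minimal sketch with invented loader names:

// Hypothetical stubs standing in for real data loaders.
async function fetchProfile(id: string) { return { id }; }
async function fetchOrders(id: string) { return [] as string[]; }
async function fetchNotifications(id: string) { return [] as string[]; }

// The three reads are independent, so run them concurrently; no queue or
// worker abstraction needed.
async function loadDashboard(userId: string) {
  const [profile, orders, notifications] = await Promise.all([
    fetchProfile(userId),
    fetchOrders(userId),
    fetchNotifications(userId),
  ]);
  return { profile, orders, notifications };
}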
I run research, planning, and implementation in a single long session rather than splitting them across separate sessions. A single session might start with deep-reading a folder, go through three rounds of plan annotation, then run the full implementation, all in one continuous conversation.
I am not seeing the performance degradation everyone talks about once the context window passes 50%. On the contrary, by the time I say “implement it all,” Claude has spent the entire session building understanding: reading files during research, refining its mental model during annotation cycles, and absorbing my domain-knowledge corrections.
When the context window fills up, Claude’s auto-compaction maintains enough context to keep going. And the plan document, the persistent artifact, survives compaction in full fidelity. I can point Claude to it at any point in time.
The Workflow in One Sentence
Read deeply, write a plan, annotate the plan until it’s right, then let Claude execute the whole thing without stopping, checking types along the way.
That’s it. No magic prompts, no elaborate system instructions, no clever hacks. Just a disciplined pipeline that separates thinking from typing. The research prevents Claude from making ignorant changes. The plan prevents it from making wrong changes. The annotation cycle injects my judgement. And the implementation command lets it run without interruption once every decision has been made.
Try my workflow and you’ll wonder how you ever shipped anything with coding agents without an annotated plan document sitting between you and the code.
...
Read the original on boristane.com »