10 interesting stories served every morning and every evening.
Anna’s Blog
Updates about Anna’s Archive, the largest truly open library in human history.
We backed up Spotify (metadata and music files). It’s distributed in bulk torrents (~300TB), grouped by popularity.
This release includes the largest publicly available music metadata database with 256 million tracks and 186 million unique ISRCs.
It’s the world’s first “preservation archive” for music which is fully open (meaning it can easily be mirrored by anyone with enough disk space), with 86 million music files, representing around 99.6% of listens.
Anna’s Archive normally focuses on text (e.g. books and papers). We explained in “The critical window of shadow libraries” that we do this because text has the highest information density. But our mission (preserving humanity’s knowledge and culture) doesn’t distinguish among media types. Sometimes an opportunity comes along outside of text. This is such a case.
A while ago, we discovered a way to scrape Spotify at scale. We saw a role for us here to build a music archive primarily aimed at preservation.
Generally speaking, music is already fairly well preserved. There are many music enthusiasts in the world who digitized their CD and LP collections, shared them through torrents or other digital means, and meticulously catalogued them.
However, these existing efforts have some major issues:
Over-focus on the most popular artists. There is a long tail of music which only gets preserved when a single person cares enough to share it. And such files are often poorly seeded.
Over-focus on the highest possible quality. Since these are created by audiophiles with high end equipment and fans of a particular artist, they chase the highest possible file quality (e.g. lossless FLAC). This inflates the file size and makes it hard to keep a full archive of all music that humanity has ever produced.
No authoritative list of torrents aiming to represent all music ever produced. An equivalent of our book torrent list (which aggregates torrents from LibGen, Sci-Hub, Z-Lib, and many more) does not exist for music.
This Spotify scrape is our humble attempt to start such a “preservation archive” for music. Of course Spotify doesn’t have all the music in the world, but it’s a great start.
Before we dive into the details of this collection, here is a quick overview:
Spotify has around 256 million tracks. This collection contains metadata for an estimated 99.9% of tracks.
We archived around 86 million music files, representing around 99.6% of listens. It’s a little under 300TB in total size.
We primarily used Spotify’s “popularity” metric to prioritize tracks. View the top 10,000 most popular songs in this HTML file (13.8MB gzipped).
For popularity>0, we got close to all tracks on the platform. The quality is the original OGG Vorbis at 160kbit/s. Metadata was added without reencoding the audio (and an archive of diff files is available to reconstruct the original files from Spotify, as well as a metadata file with original hashes and checksums).
For popularity=0, we got files representing about half the number of listens (either original or a copy with the same ISRC). The audio is reencoded to OGG Opus at 75kbit/s — sounding the same to most people, but noticeable to an expert.
The cutoff is 2025-07; anything released after that date may not be present (though in some cases it is).
This is by far the largest music metadata database that is publicly available. For comparison, we have 256 million tracks, while others have 50-150 million. Our data is well-annotated: MusicBrainz has 5 million unique ISRCs, while our database has 186 million.
This is the world’s first “preservation archive” for music which is fully open (meaning it can easily be mirrored by anyone with enough disk space).
The data will be released in different stages on our Torrents page:
[ ] .zstdpatch files (to reconstruct original files before we added embedded metadata)
For now this is a torrents-only archive aimed at preservation, but if there is enough interest, we could add downloading of individual files to Anna’s Archive. Please let us know if you’d like this.
Please help preserve these files:
Seed these torrents (on the Torrents page of Anna’s Archive). Even seeding a few torrents helps!
With your help, humanity’s musical heritage will be forever protected from destruction by natural disasters, wars, budget cuts, and other catastrophes.
In this blog we will analyze the data and look at details of the release. We hope you enjoy.
Let’s dive into the data! Here are some high-level statistics pulled from the metadata:
The most convenient available way to sort songs on Spotify is using the popularity metric, defined as follows:
The popularity of a track is a value between 0 and 100, with 100 being the most popular. The popularity is calculated by algorithm and is based, in the most part, on the total number of plays the track has had and how recent those plays are.
Generally speaking, songs that are being played a lot now will have a higher popularity than songs that were played a lot in the past. Duplicate tracks (e.g. the same track from a single and an album) are rated independently. Artist and album popularity is derived mathematically from track popularity.
If we group songs by popularity, we see that there is an extremely large tail end:
≥70% of songs are ones almost no one ever listens to (stream count < 1000). To see some detail, we can plot this on a logarithmic scale:
The top 10,000 songs span popularities 70-100. You can view them all in this HTML file (13.8MB gzipped).
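For readers who want to reproduce this grouping themselves, here is a minimal sketch in Python, assuming the spotify_clean.sqlite3 database and the tracks table used in the shuffle examples later in this post:

import sqlite3

# Connect to the released metadata database (same file name as in the shuffle examples below).
con = sqlite3.connect("spotify_clean.sqlite3")

# Count tracks per popularity value (0-100) to reproduce the long-tail distribution.
rows = con.execute(
    "select popularity, count(*) from tracks group by popularity order by popularity"
).fetchall()

total = sum(n for _, n in rows)
for popularity, n in rows:
    print(f"popularity={popularity:3d}  tracks={n:>12,}  share={n / total:.4%}")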
Additionally, we can estimate the number of listens per track and the total number of listens per popularity value. The stream count data is estimated since it is difficult to fetch at scale, so we sampled it randomly.
As we can see, most of the listens come from songs with a popularity between 50 and 80, even though there are only around 210,000 songs with popularity ≥50, around 0.1% of songs. Note the huge (subjectively estimated) error bar on pop=0 — the reason for this is that Spotify does not publish stream counts for songs with < 1000 streams.
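The estimation itself is simple once a random sample of stream counts exists. A rough sketch of the idea, where sampled_streams is a hypothetical sample of (popularity, stream_count) pairs (the post does not specify how the sample was collected or stored, and the values below are placeholders):

import sqlite3
from collections import defaultdict

# Hypothetical random sample of (popularity, stream_count) pairs; placeholder values.
sampled_streams = [(73, 12_400_000), (41, 85_000), (12, 2_300), (0, 150)]

# Mean stream count per popularity value, estimated from the sample.
buckets = defaultdict(list)
for popularity, streams in sampled_streams:
    buckets[popularity].append(streams)
mean_streams = {p: sum(v) / len(v) for p, v in buckets.items()}

# Multiply by the number of tracks at each popularity to estimate total listens per bucket.
con = sqlite3.connect("spotify_clean.sqlite3")
for popularity, n_tracks in con.execute("select popularity, count(*) from tracks group by popularity"):
    if popularity in mean_streams:
        print(popularity, int(mean_streams[popularity] * n_tracks))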
We can also estimate that the top three songs (as of writing) have a higher total stream count than the bottom 20-100 million songs combined:
-- Top three tracks by popularity, with each track's artists aggregated into a JSON array.
select json_group_array(artists.name), tracks.name, tracks.popularity
from tracks
join track_artists on track_rowid = tracks.rowid
join artists on artist_rowid = artists.rowid
where tracks.id in (select id from tracks order by popularity desc limit 3)
group by tracks.id;
Note that the popularity is very time-dependent and not directly translatable into stream counts, so these top songs are basically arbitrary.
We have archived around 86 million songs from Spotify, ordering by popularity descending. While this only represents 37% of songs, it represents around 99.6% of listens:
Put another way, for any random song a person listens to, there is a 99.6% likelihood that it is part of the archive. We expect this number to be higher if you filter to only human-created songs. Do remember though that the error bar on listens for popularity 0 is large.
For popularity=0, we ordered tracks by a secondary importance metric based on artist followers and album popularity, and fetched in descending order.
We stopped here because of the long tail of diminishing returns (700TB+ of additional storage for minor benefit), as well as the poor quality of songs with popularity=0 (many are AI-generated and hard to filter out).
Before diving into more fun stats, let’s look at how the collection itself is structured. It’s in two parts: metadata and music files, both of which are distributed through torrents.
The metadata torrents contain, based on statistical analysis, around 99.9% of artists, albums, and tracks. The metadata is published as compact, queryable SQLite databases. Care was taken, by reconstructing the API responses from the databases, to ensure there is (almost) no data loss in the conversion from the API JSON.
The metadata for artists, albums, and tracks is less than 200 GB compressed. The secondary audio-analysis metadata is 4TB compressed.
We look in more detail at the structure of the metadata at the end of this blog post.
The data itself is distributed in the Anna’s Archive Containers (AAC) format. This is a standard which we created a few years ago for distributing files across multiple torrents. It is not to be confused with the Advanced Audio Coding (AAC) encoding format.
Since the original files contain zero metadata, as much metadata as possible was added to the OGG files, including title, URL, ISRC, UPC, album art, ReplayGain information, etc. The invalid OGG data packet that Spotify prepends to every track file was stripped; it is present in the track_files db.
For popularity>0, the quality is the original OGG Vorbis at 160kbit/s. Metadata was added without reencoding the audio (and an archive of diff files is available to reconstruct the original files from Spotify).
For popularity=0, the audio is reencoded to OGG Opus at 75kbit/s — sounding the same to most people, but noticeable to an expert.
There is a known bug where the REPLAYGAIN_ALBUM_PEAK vorbiscomment tag value is a copy-paste of REPLAYGAIN_ALBUM_GAIN instead of the correct value for many files.
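For anyone inspecting downloaded files, here is a minimal sketch of reading the embedded tags and flagging that bug, using the third-party mutagen library (the file name is a placeholder, the "isrc" tag key follows common vorbiscomment convention, and treating any peak value containing "dB" as affected is our own heuristic):

from mutagen.oggvorbis import OggVorbis

audio = OggVorbis("track.ogg")  # placeholder path to a popularity>0 (OGG Vorbis) file
tags = audio.tags

# Embedded vorbiscomment tags described above: title, ISRC, ReplayGain, etc.
print(tags.get("title"), tags.get("isrc"))

# Known bug: REPLAYGAIN_ALBUM_PEAK sometimes holds a copy of REPLAYGAIN_ALBUM_GAIN.
# Peak values are plain ratios, so a value containing "dB" points at an affected file.
peak = tags.get("replaygain_album_peak", [""])[0]
gain = tags.get("replaygain_album_gain", [""])[0]
if "dB" in peak or (peak and peak == gain):
    print("REPLAYGAIN_ALBUM_PEAK looks wrong:", peak)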
Many people complain about how Spotify shuffles tracks. Since we have metadata for 99.9+% of tracks on Spotify, we can create a true shuffle across all songs on Spotify!
$ sqlite3 spotify_clean.sqlite3
sqlite> .mode table
sqlite> with random_ids as (
            select value as inx,
                   (abs(random()) % (select max(rowid) from tracks)) as trowid
            from generate_series(0)
        )
        select inx, tracks.id, tracks.popularity, tracks.name
        from random_ids
        join tracks on tracks.rowid = trowid
        limit 20;
| inx | id | popularity | name |
| 0 | 7KS7cm2arAGA2VZaZ2XvNa | 0 | Just Derry |
| 1 | 1BkLS2tmxD088l2ojUW5cv | 0 | Kapitel 37 - Aber erst wird gegessen - Schon wieder Weihnachten mit der buckligen Verwandtschaft |
| 2 | 5RSU7MELzCaPweG8ALmjLK | 0 | El Buen Pastor |
| 3 | 1YNIl8AKIFltYH8O2coSoT | 0 | You Are The One |
| 4 | 1GxMuEYWs6Lzbn2EcHAYVx | 0 | Waorani |
| 5 | 4NhARf6pjwDpbyQdZeSsW3 | 0 | Magic in the Sand |
| 6 | 7pDrZ6rGaO6FHk6QtTKvQo | 0 | Yo No Fui |
| 7 | 15w4LBQ6rkf3QA2OiSMBRD | 25 | 你走 |
| 8 | 5Tx7jRLKfYlay199QB2MSs | 0 | Soul Clap |
| 9 | 3L7CkCD9595MuM0SVuBZ64 | 1 | Xuân Và Tuổi Trẻ |
| 10 | 4S6EkSnfxlU5UQUOZs7bKR | 1 | Elle était belle |
| 11 | 0ZIOUYrrArvSTq6mrbVqa1 | 0 | Kapitel 7.2 - Die Welt der Magie - 4 in 1 Sammelband: Weiße Magie | Medialität, Channeling & Trance | Divination & Wahrsagen | Energetisches Heilen |
| 12 | 4VfKaW1X1FKv8qlrgKbwfT | 0 | Pura energia |
| 13 | 1VugH5kD8tnMKAPeeeTK9o | 10 | Dalia |
| 14 | 6NPPbOybTFLL0LzMEbVvuo | 4 | Teil 12 - Folge 2: Arkadien brennt |
| 15 | 1VSVrAbaxNllk7ojNGXDym | 3 | Bre Petrunko |
| 16 | 4NSmBO7uzkuES7vDLvHtX8 | 0 | Paranoia |
| 17 | 7AHhiIXvx09DRZGQIsbcxB | 0 | Sand Underfoot Moments |
| 18 | 0sitt32n4JoSM1ewOWL7hs | 0 | Start Over Again |
| 19 | 080Zimdx271ixXbzdZOqSx | 3 | Auf all euren Wegen |
Or, filtering to only somewhat popular songs:
sqlite> with random_ids as (
            select value as inx,
                   (abs(random()) % (select max(rowid) from tracks)) as trowid
            from generate_series(0)
        )
        select inx, tracks.id, tracks.popularity, albums.name as album_name, tracks.name
        from random_ids
        join tracks on tracks.rowid = trowid
        join albums on albums.rowid = album_rowid
        where tracks.popularity >= 10
        limit 20;
| inx | id | popularity | album_name | name |
| 32 | 1om6LphEpiLpl9irlOsnzb | 23 | The Essential Widespread Panic | Love Tractor |
| 47 | 2PCtPCRDia6spej5xcxbvW | 20 | Desatinos Desplumados | Sirena |
| 65 | 5wmR10WloZqVVdIpYhdaqq | 20 | Um Passeio pela Harpa Cristã - Vol 6 | As Santas Escrituras |
| 89 | 5xCuYNX3QlPsxhKLbWlQO9 | 11 | No Me Amenaces | No Me Amenaces |
| 96 | 2GRmiDIcIwhQnkxakNyUy4 | 16 | Very Bad Truth (Kingston Universi… | Kapitel 8.3 - Very Bad Truth |
| 98 | 5720pe1PjNXoMcbDPmyeLW | 11 | Kleiner Eisbär: Hilf mir fliegen! | Kapitel 06: Hilf mir fliegen! |
| 109 | 1mRXGNVsfD9UtFw6r5YtzF | 11 | Lunar Archive | Outdoor Seating |
| 110 | 5XOQwf6vkcJxWG9zgqVEWI | 19 | Teenage Dream | Firework |
| 125 | 0rbHOp8B4CpPXXZSekySvv | 15 | Previa y Cachengue 2025 | Debi tirar mas fotos |
...
Read the original on annas-archive.li »
Sharyn Alfonsi’s “Inside CECOT” for 60 Minutes, which was censored by Bari Weiss, as it appeared on Canada’s Global TV app.
...
Read the original on archive.org »
...
Read the original on www.jmail.world »
...
Read the original on dosaygo-studio.github.io »
...
Read the original on github.com »
ACR uses visual and audio data to identify what you’re watching on TV, including shows and movies on streaming services and cable TV, YouTube videos, Blu-ray discs, and more. Attorney General Paxton alleges that ACR also captures security and doorbell camera streams, media sent using Apple AirPlay or Google Cast, as well as the displays of other devices connected to the TV’s HDMI port, such as laptops and game consoles.
The lawsuit accuses Samsung, Sony, LG, Hisense, and TCL of “deceptively” prompting users to activate ACR, while “disclosures are hidden, vague, and misleading.” Samsung and Hisense, for example, capture screenshots of a TV’s display “every 500 milliseconds,” Paxton claims. The lawsuit alleges that TV manufacturers siphon viewing data back to each company “without the user’s knowledge or consent,” which they can then sell for targeted advertising.
Along with these allegations, Attorney General Paxton also raises concerns about TCL and Hisense’s ties to China, as they’re both based in the country. The lawsuit claims the TVs made by both companies are “Chinese-sponsored surveillance devices, recording the viewing habits of Texans at every turn.”
Attorney General Paxton accuses the five TV makers of violating the state’s Deceptive Trade Practices Act, which is meant to protect consumers from false, deceptive, or misleading practices. Paxton asks the court to impose a civil penalty and to block each company from collecting, sharing, or selling the ACR data they collect about Texas-based consumers. Samsung, Sony, LG, Hisense, and TCL didn’t immediately respond to a request for comment.
Vizio, which is now owned by Walmart, paid $2.2 million to the Federal Trade Commission and New Jersey in 2017 over similar allegations related to ACR.
“This conduct is invasive, deceptive, and unlawful,” Paxton says in a statement. “The fundamental right to privacy will be protected in Texas because owning a television does not mean surrendering your personal information to Big Tech or foreign adversaries.”
...
Read the original on www.theverge.com »
How we pwned X (Twitter), Vercel, Cursor, Discord, and hundreds of companies through a supply-chain attack
...
Read the original on gist.github.com »
People examining documents released by the Department of Justice in the Jeffrey Epstein case discovered that some of the file redaction can be undone with Photoshop techniques, or by simply highlighting text to paste into a word processing file.
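The failure mode here is that the "redactions" are drawn on top of the text rather than removing it, so the underlying text layer survives in the PDF. A minimal sketch of what that looks like with the pypdf library (the file name is a placeholder):

from pypdf import PdfReader

# If a redaction is only a black rectangle drawn over the page, the text layer
# underneath is still part of the PDF and comes back with ordinary text extraction.
reader = PdfReader("exhibit.pdf")  # placeholder file name
for page in reader.pages:
    print(page.extract_text())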
Un-redacted text from these documents began circulating through social media on Monday evening. An exhibit in a civil case in the Virgin Islands against Darren K Indyke and Richard D Kahn, two executors of Epstein’s estate, contains redacted allegations explaining how Epstein and his associates had facilitated the sexual abuse of children. The exhibit was the second amended complaint in the state case against Indyke and Kahn.
In section 85, the redacted portion states: “Between September 2015 and June 2019, Indyke signed (FAC) for over $400,000 made payable to young female models and actresses, including a former Russian model who received over $380,000 through monthly payments of $8,333 made over a period of more than three and a half years until the middle of 2019.”
Prosecutors in the Virgin Islands settled their civil sex-trafficking case against Epstein’s estate, Indyke, and Kahn in 2022 for $105m, plus one half of the proceeds from the sale of Little St James, the island on which Epstein resided and on which many of his crimes occurred. The justice department press release announcing the settlement did not include an admission of liability.
Indyke, an attorney who represented Epstein for decades, has not been criminally indicted by federal authorities. He was hired by the Parlatore Law Group in 2022, before the justice department settled the Epstein case. That firm represents the defense secretary, Pete Hegseth, and previously represented Donald Trump in his defense against charges stemming from the discovery of classified government documents stored at Trump’s Florida estate. Calls and email seeking comment from Indyke and the Parlatore Law Group have not yet been returned.
Trump has repeatedly denied any knowledge of or involvement in Epstein’s criminal activities and any wrongdoing.
Other sections further allege how Epstein’s enterprise concealed crimes.
“Defendants also attempted to conceal their criminal sex trafficking and abuse conduct by paying large sums of money to participant-witnesses, including by paying for their attorneys’ fees and case costs in litigation related to this conduct,” reads one redacted passage.
“Epstein also threatened harm to victims and helped release damaging stories about them to damage their credibility when they tried to go public with their stories of being trafficked and sexually abused. Epstein also instructed one or more Epstein Enterprise participant-witnesses to destroy evidence relevant to ongoing court proceedings involving Defendants’ criminal sex trafficking and abuse conduct.”
Redactions of sections 184 through 192 of the document describe property taxes paid by companies incorporated by Epstein on properties that were not on the balance sheet for those firms.
“For instance, Cypress’s Balance Sheet as of December 31, 2018 did not reflect any assets other than cash of $18,824. Further, Cypress reported only $301 in expenses for the year ended December 31, 2018, despite it paying $106,394.60 in Santa Fe property taxes on November 6, 2018,” reads one redacted passage.
“Similarly, in 2017, Cypress reported as its only asset cash in the amount of $29,736 and expenses of $150, despite it paying $55,770.41 and $113,679.56 in Santa Fe property taxes during 2017.”
The Epstein Files Transparency Act signed into law last month permits the Department of Justice “to withhold certain information such as the personal information of victims and materials that would jeopardize an active federal investigation”.
It was unclear how the property-tax material complies with the redaction standard under the law. An inquiry to the Department of Justice has not yet been answered.
...
Read the original on www.theguardian.com »
In all of the debates about the value of AI-assistance in software development there’s one depressing anecdote that I keep on seeing: the junior engineer, empowered by some class of LLM tool, who deposits giant, untested PRs on their coworkers—or open source maintainers—and expects the “code review” process to handle the rest.
This is rude, a waste of other people’s time, and is honestly a dereliction of duty as a software developer.
Your job is to deliver code you have proven to work.
As software engineers we don’t just crank out code—in fact these days you could argue that’s what the LLMs are for. We need to deliver code that works—and we need to include proof that it works as well. Not doing that directly shifts the burden of the actual work to whoever is expected to review our code.
There are two steps to proving a piece of code works. Neither is optional.
The first is manual testing. If you haven’t seen the code do the right thing yourself, that code doesn’t work. If it does turn out to work, that’s honestly just pure chance.
Manual testing skills are genuine skills that you need to develop. You need to be able to get the system into an initial state that demonstrates your change, then exercise the change, then check and demonstrate that it has the desired effect.
If possible I like to reduce these steps to a sequence of terminal commands which I can paste, along with their output, into a comment in the code review. Here’s a recent example.
Some changes are harder to demonstrate. It’s still your job to demonstrate them! Record a screen capture video and add that to the PR. Show your reviewers that the change you made actually works.
Once you’ve tested the happy path where everything works you can start trying the edge cases. Manual testing is a skill, and finding the things that break is the next level of that skill that helps define a senior engineer.
The second step in proving a change works is automated testing. This is so much easier now that we have LLM tooling, which means there’s no excuse at all for skipping this step.
Your contribution should bundle the change with an automated test that proves the change works. That test should fail if you revert the implementation.
The process for writing a test mirrors that of manual testing: get the system into an initial known state, exercise the change, assert that it worked correctly. Integrating a test harness to productively facilitate this is another key skill worth investing in.
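As a minimal sketch of that shape, here is a hypothetical function and a pytest-style test for it:

def apply_discount(price: float, percent: float) -> float:
    # Hypothetical function under test.
    return round(price * (1 - percent / 100), 2)

def test_apply_discount():
    # 1. Known initial state: a fixed price and discount.
    price, percent = 200.0, 15.0
    # 2. Exercise the change.
    result = apply_discount(price, percent)
    # 3. Assert it worked. Reverting the implementation (e.g. returning the
    #    price unchanged) should make this assertion fail.
    assert result == 170.0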
Don’t be tempted to skip the manual test because you think the automated test has you covered already! Almost every time I’ve done this myself I’ve quickly regretted it.
The most important trend in LLMs in 2025 has been the explosive growth of coding agents—tools like Claude Code and Codex CLI that can actively execute the code they are working on to check that it works and further iterate on any problems.
To master these tools you need to learn how to get them to prove their changes work as well.
This looks exactly the same as the process I described above: they need to be able to manually test their changes as they work, and they need to be able to build automated tests that guarantee the change will continue to work in the future.
Since they’re robots, automated tests and manual tests are effectively the same thing.
They do feel a little different though. When I’m working on CLI tools I’ll usually teach Claude Code how to run them itself so it can do one-off tests, even though the eventual automated tests will use a system like Click’s CliRunner.
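As a rough illustration of that pattern, here is a hypothetical Click command with a test that drives it through CliRunner:

import click
from click.testing import CliRunner

@click.command()
@click.argument("name")
def greet(name):
    # Hypothetical CLI command under test.
    click.echo(f"Hello, {name}!")

def test_greet():
    runner = CliRunner()
    # Invoke the command in-process, the same way one-off manual runs exercise it from the shell.
    result = runner.invoke(greet, ["world"])
    assert result.exit_code == 0
    assert result.output == "Hello, world!\n"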
When working on CSS changes I’ll often encourage my coding agent to take screenshots when it needs to check if the change it made had the desired effect.
The good news about automated tests is that coding agents need very little encouragement to write them. If your project has tests already most agents will extend that test suite without you even telling them to do so. They’ll also reuse patterns from existing tests, so keeping your test code well organized and populated with patterns you like is a great way to help your agent build testing code to your taste.
Developing good taste in testing code is another of those skills that differentiates a senior engineer.
A computer can never be held accountable. That’s your job as the human in the loop.
Almost anyone can prompt an LLM to generate a thousand-line patch and submit it for code review. That’s no longer valuable. What’s valuable is contributing code that is proven to work.
Next time you submit a PR, make sure you’ve included your evidence that it works as it should.
...
Read the original on simonwillison.net »
Flock Exposed Its AI-Powered Cameras to the Internet. We Tracked Ourselves
Flock left at least 60 of its people-tracking Condor PTZ cameras live streaming and exposed to the open internet.
I am standing on the corner of Harris Road and Young Street outside of the Crossroads Business Park in Bakersfield, California, looking up at a Flock surveillance camera bolted high above a traffic signal. On my phone, I am watching myself in real time as the camera records and livestreams me—without any password or login—to the open internet. I wander into the intersection, stare at the camera and wave. On the livestream, I can see myself clearly. Hundreds of miles away, my colleagues are remotely watching me too through the exposed feed.

Flock left livestreams and administrator control panels for at least 60 of its AI-enabled Condor cameras around the country exposed to the open internet, where anyone could watch them, download 30 days’ worth of video archive, change settings, see log files, and run diagnostics. Unlike many of Flock’s cameras, which are designed to capture license plates as people drive by, Flock’s Condor cameras are pan-tilt-zoom (PTZ) cameras designed to record and track people, not vehicles. Condor cameras can be set to automatically zoom in on people’s faces as they walk through a parking lot, down a public street, or play on a playground, or they can be controlled manually, according to marketing material on Flock’s website.

We watched Condor cameras zoom in on a woman walking her dog on a bike path in suburban Atlanta, follow a man walking through a Macy’s parking lot in Bakersfield, surveil children swinging on a swingset at a playground, and film high-res video of people sitting at a stoplight in traffic. In one case, we were able to watch a man rollerblade down Brookhaven, Georgia’s Peachtree Creek Greenway bike path. The Flock camera zoomed in on him and tracked him as he rolled past. Minutes later, he showed up on another exposed camera livestream further down the bike path. The camera’s resolution was good enough that we were able to see that, when he stopped beneath one of the cameras, he was watching rollerblading videos on his phone.

The exposure was initially discovered by YouTuber and technologist Benn Jordan and was shared with security researcher Jon “GainSec” Gaines, who recently found numerous vulnerabilities in several other models of Flock’s automated license plate reader (ALPR) cameras. They shared the details of what they found with me, and I verified many of the details seen in the exposed portals by driving to Bakersfield to walk in front of two cameras there while I watched myself on the livestream. I also pulled Flock’s contracts with cities for Condor cameras, pulled details from company presentations about the technology, and geolocated a handful of the cameras to cities and towns across the United States. Jordan also filmed himself in front of several of the cameras on the Peachtree Creek Greenway bike path.

Jordan said he and Gaines discovered many of the exposed cameras with Shodan, an internet of things search engine that researchers regularly use to identify improperly secured devices. After finding links to the feed, “immediately, we were just without any username, without any password, we were just seeing everything from playgrounds to parking lots with people, Christmas shopping and unloading their stuff into cars,” Jordan told me in an interview. “I think it was like the first time that I actually got like immediately scared … I think the one that affected me most was a playground. You could see unattended kids, and that’s something I want people to know about so they can understand how dangerous this is.”

In a YouTube video about his research, Jordan said he was able to use footage pulled from the exposed feed to identify specific people using open source investigation tools in order to show how trivially an exposure like this could be abused.
...
Read the original on www.404media.co »