10 interesting stories served every morning and every evening.




1 1,624 shares, 68 trendiness

Backing up Spotify

Anna’s Blog

Updates about Anna’s Archive, the largest truly open li­brary in hu­man his­tory.

We backed up Spotify (metadata and mu­sic files). It’s dis­trib­uted in bulk tor­rents (~300TB), grouped by pop­u­lar­ity.

This re­lease in­cludes the largest pub­licly avail­able mu­sic meta­data data­base with 256 mil­lion tracks and 186 mil­lion unique ISRCs.

It’s the world’s first “preservation archive” for music which is fully open (meaning it can easily be mirrored by anyone with enough disk space), with 86 million music files, representing around 99.6% of listens.

Anna’s Archive normally focuses on text (e.g. books and papers). We explained in “The critical window of shadow libraries” that we do this because text has the highest information density. But our mission (preserving humanity’s knowledge and culture) doesn’t distinguish among media types. Sometimes an opportunity comes along outside of text. This is such a case.

A while ago, we dis­cov­ered a way to scrape Spotify at scale. We saw a role for us here to build a mu­sic archive pri­mar­ily aimed at preser­va­tion.

Generally speak­ing, mu­sic is al­ready fairly well pre­served. There are many mu­sic en­thu­si­asts in the world who dig­i­tized their CD and LP col­lec­tions, shared them through tor­rents or other dig­i­tal means, and metic­u­lously cat­a­logued them.

However, these ex­ist­ing ef­forts have some ma­jor is­sues:

Over-focus on the most pop­u­lar artists. There is a long tail of mu­sic which only gets pre­served when a sin­gle per­son cares enough to share it. And such files are of­ten poorly seeded.

Over-focus on the high­est pos­si­ble qual­ity. Since these are cre­ated by au­dio­philes with high end equip­ment and fans of a par­tic­u­lar artist, they chase the high­est pos­si­ble file qual­ity (e.g. loss­less FLAC). This in­flates the file size and makes it hard to keep a full archive of all mu­sic that hu­man­ity has ever pro­duced.

No authoritative list of torrents aiming to represent all music ever produced. An equivalent of our book torrent list (which aggregates torrents from LibGen, Sci-Hub, Z-Lib, and many more) does not exist for music.

This Spotify scrape is our humble attempt to start such a “preservation archive” for music. Of course Spotify doesn’t have all the music in the world, but it’s a great start.

Before we dive into the de­tails of this col­lec­tion, here is a quick overview:

Spotify has around 256 mil­lion tracks. This col­lec­tion con­tains meta­data for an es­ti­mated 99.9% of tracks.

We archived around 86 mil­lion mu­sic files, rep­re­sent­ing around 99.6% of lis­tens. It’s a lit­tle un­der 300TB in to­tal size.

We primarily used Spotify’s “popularity” metric to prioritize tracks. View the top 10,000 most popular songs in this HTML file (13.8MB gzipped).

For pop­u­lar­ity>0, we got close to all tracks on the plat­form. The qual­ity is the orig­i­nal OGG Vorbis at 160kbit/s. Metadata was added with­out reen­cod­ing the au­dio (and an archive of diff files is avail­able to re­con­struct the orig­i­nal files from Spotify, as well as a meta­data file with orig­i­nal hashes and check­sums).

For pop­u­lar­ity=0, we got files rep­re­sent­ing about half the num­ber of lis­tens (either orig­i­nal or a copy with the same ISRC). The au­dio is reen­coded to OGG Opus at 75kbit/s — sound­ing the same to most peo­ple, but no­tice­able to an ex­pert.

The cutoff is 2025-07; anything released after that date may not be present (though in some cases it is).

This is by far the largest mu­sic meta­data data­base that is pub­licly avail­able. For com­par­i­son, we have 256 mil­lion tracks, while oth­ers have 50-150 mil­lion. Our data is well-an­no­tated: MusicBrainz has 5 mil­lion unique ISRCs, while our data­base has 186 mil­lion.

This is the world’s first “preservation archive” for music which is fully open (meaning it can easily be mirrored by anyone with enough disk space).

The data will be re­leased in dif­fer­ent stages on our Torrents page:

[ ] .zstdpatch files (to re­con­struct orig­i­nal files be­fore we added em­bed­ded meta­data)

For now this is a tor­rents-only archive aimed at preser­va­tion, but if there is enough in­ter­est, we could add down­load­ing of in­di­vid­ual files to Anna’s Archive. Please let us know if you’d like this.

Please help pre­serve these files:

Seed these torrents (on the Torrents page of Anna’s Archive). Even seeding a few torrents helps!

With your help, hu­man­i­ty’s mu­si­cal her­itage will be for­ever pro­tected from de­struc­tion by nat­ural dis­as­ters, wars, bud­get cuts, and other cat­a­stro­phes.

In this blog we will an­a­lyze the data and look at de­tails of the re­lease. We hope you en­joy.

Let’s dive into the data! Here are some high-level statistics pulled from the metadata:

The most con­ve­nient avail­able way to sort songs on Spotify is us­ing the pop­u­lar­ity met­ric, de­fined as fol­lows:

The pop­u­lar­ity of a track is a value be­tween 0 and 100, with 100 be­ing the most pop­u­lar. The pop­u­lar­ity is cal­cu­lated by al­go­rithm and is based, in the most part, on the to­tal num­ber of plays the track has had and how re­cent those plays are.

Generally speak­ing, songs that are be­ing played a lot now will have a higher pop­u­lar­ity than songs that were played a lot in the past. Duplicate tracks (e.g. the same track from a sin­gle and an al­bum) are rated in­de­pen­dently. Artist and al­bum pop­u­lar­ity is de­rived math­e­mat­i­cally from track pop­u­lar­ity.

If we group songs by pop­u­lar­ity, we see that there is an ex­tremely large tail end:

≥70% of songs are ones al­most no one ever lis­tens to (stream count < 1000). To see some de­tail, we can plot this on a log­a­rith­mic scale:

The top 10,000 songs span pop­u­lar­i­ties 70-100. You can view them all in this HTML file (13.8MB gzipped).

Additionally, we can es­ti­mate the num­ber of lis­tens per track and to­tal num­ber per pop­u­lar­ity. The stream count data is es­ti­mated since it is dif­fi­cult to fetch at scale, so we sam­pled it ran­domly.

As we can see, most of the listens come from songs with a popularity between 50 and 80, even though there are only 210,000 songs with popularity ≥50, around 0.1% of songs. Note the huge (subjectively estimated) error bar on pop=0: Spotify does not publish stream counts for songs with < 1000 streams.
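The "around 0.1%" figure can be checked with one line of arithmetic. This is only a sanity check using the two numbers quoted in the post (about 256 million tracks in total, about 210,000 with popularity ≥50); the variable names are illustrative.

```python
# Figures quoted in the post; both are approximate.
TOTAL_TRACKS = 256_000_000   # "around 256 million tracks"
POPULAR_TRACKS = 210_000     # tracks with popularity >= 50

# Share of the catalogue with popularity >= 50, as a percentage.
share_percent = POPULAR_TRACKS / TOTAL_TRACKS * 100
print(f"{share_percent:.2f}% of tracks have popularity >= 50")
```

The result is a little over 0.08%, which rounds to the 0.1% stated above.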

We can also es­ti­mate that the top three songs (as of writ­ing) have a higher to­tal stream count than the bot­tom 20-100 mil­lion songs com­bined:

select json_group_array(artists.name), tracks.name, tracks.popularity
from tracks
join track_artists on track_rowid = tracks.rowid
join artists on artist_rowid = artists.rowid
where tracks.id in (select id from tracks order by popularity desc limit 3)
group by tracks.id;

Note that the pop­u­lar­ity is very time-de­pen­dent and not di­rectly trans­lat­able into stream counts, so these top songs are ba­si­cally ar­bi­trary.

We have archived around 86 mil­lion songs from Spotify, or­der­ing by pop­u­lar­ity de­scend­ing. While this only rep­re­sents 37% of songs, it rep­re­sents around 99.6% of lis­tens:

Put an­other way, for any ran­dom song a per­son lis­tens to, there is a 99.6% like­li­hood that it is part of the archive. We ex­pect this num­ber to be higher if you fil­ter to only hu­man-cre­ated songs. Do re­mem­ber though that the er­ror bar on lis­tens for pop­u­lar­ity 0 is large.
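To make the "share of listens" reasoning concrete, here is a minimal sketch of how such a coverage figure could be computed when tracks are archived in descending popularity. The function name, the bucket layout, and the toy numbers are all made up for illustration; they are not the post's data or its actual method.

```python
def listen_coverage(buckets, archived_tracks):
    """Fraction of all listens covered by archiving the most popular tracks first.

    buckets: list of (popularity, track_count, listens_per_track) tuples.
    archived_tracks: how many tracks are archived in total.
    """
    total = sum(count * per_track for _, count, per_track in buckets)
    covered = 0.0
    remaining = archived_tracks
    # Walk the buckets from most to least popular, taking as many tracks as fit.
    for _, count, per_track in sorted(buckets, key=lambda b: b[0], reverse=True):
        take = min(count, remaining)
        covered += take * per_track
        remaining -= take
        if remaining == 0:
            break
    return covered / total if total else 0.0


# Hypothetical toy catalogue: a few heavily played tracks and a long tail.
toy = [(2, 10, 1000.0), (1, 40, 50.0), (0, 50, 1.0)]
share = listen_coverage(toy, archived_tracks=50)
print(f"archiving 50 of 100 tracks covers {share:.1%} of listens")
```

In this toy catalogue, archiving half of the tracks already covers nearly all listens, which is the same shape of effect the post describes.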

For pop­u­lar­ity=0, we or­dered tracks by a sec­ondary im­por­tance met­ric based on artist fol­low­ers and al­bum pop­u­lar­ity, and fetched in de­scend­ing or­der.

We have stopped here due to the long tail end with di­min­ish­ing re­turns (700TB+ ad­di­tional stor­age for mi­nor ben­e­fit), as well as the bad qual­ity of songs with pop­u­lar­ity=0 (many AI gen­er­ated, hard to fil­ter).

Before div­ing into more fun stats, let’s look at how the col­lec­tion it­self is struc­tured. It’s in two parts: meta­data and mu­sic files, both of which are dis­trib­uted through tor­rents.

The meta­data tor­rents con­tain, based on sta­tis­ti­cal analy­sis, around 99.9% of artists, al­bums, tracks. The meta­data is pub­lished as com­pact queryable SQLite data­bases. Care was taken, by do­ing API re­sponse re­con­struc­tion, that there is (almost) no data loss in the con­ver­sion from the API JSON.

The meta­data for artists, al­bums, tracks is less than 200 GB com­pressed. The sec­ondary meta­data of au­dio analy­sis is 4TB com­pressed.

We look in more detail at the structure of the metadata at the end of this blog post.

The data it­self is dis­trib­uted in the Anna’s Archive Containers (AAC) for­mat. This is a stan­dard which we cre­ated a few years ago for dis­trib­ut­ing files across mul­ti­ple tor­rents. It is not to be con­fused with the Advanced Audio Coding (AAC) en­cod­ing for­mat.

Since the original files contain zero metadata, as much metadata as possible was added to the OGG files, including title, url, ISRC, UPC, album art, replaygain information, etc. The invalid OGG data packet Spotify prepends to every track file was stripped; it is present in the track_files db.

For pop­u­lar­ity>0, the qual­ity is the orig­i­nal OGG Vorbis at 160kbit/s. Metadata was added with­out reen­cod­ing the au­dio (and an archive of diff files is avail­able to re­con­struct the orig­i­nal files from Spotify).

For pop­u­lar­ity=0, the au­dio is reen­coded to OGG Opus at 75kbit/s — sound­ing the same to most peo­ple, but no­tice­able to an ex­pert.

There is a known bug where the REPLAYGAIN_ALBUM_PEAK vor­bis­com­ment tag value is a copy-paste of REPLAYGAIN_ALBUM_GAIN in­stead of the cor­rect value for many files.
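One way to flag files that may be affected is sketched below. It assumes the Vorbis comment tags have already been read into a plain dict by whatever tagging tool you use. The function name and the string comparison are illustrative assumptions, not part of the release, and the exact form the bug takes in the files is not specified in the post.

```python
def album_peak_looks_copied(tags):
    """Return True when REPLAYGAIN_ALBUM_PEAK holds the same value as REPLAYGAIN_ALBUM_GAIN.

    tags: dict mapping tag names to string values. Vorbis comment field
    names are case-insensitive, so keys are normalised to upper case first.
    """
    normalised = {key.upper(): value for key, value in tags.items()}
    gain = normalised.get("REPLAYGAIN_ALBUM_GAIN")
    peak = normalised.get("REPLAYGAIN_ALBUM_PEAK")
    if gain is None or peak is None:
        return False  # nothing to compare
    return gain.strip().lower() == peak.strip().lower()


# Hypothetical tag sets for illustration only.
suspect = {"REPLAYGAIN_ALBUM_GAIN": "-7.50 dB", "REPLAYGAIN_ALBUM_PEAK": "-7.50 dB"}
healthy = {"REPLAYGAIN_ALBUM_GAIN": "-7.50 dB", "REPLAYGAIN_ALBUM_PEAK": "0.988525"}
print(album_peak_looks_copied(suspect), album_peak_looks_copied(healthy))
```

A file flagged this way would still need its album peak recomputed from the audio; this check only identifies candidates.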

Many peo­ple com­plain about how Spotify shuf­fles tracks. Since we have meta­data for 99.9+% of tracks on Spotify, we can cre­ate a true shuf­fle across all songs on Spotify!

$ sqlite3 spotify_clean.sqlite3
sqlite> .mode table
sqlite> with random_ids as (select value as inx, (abs(random())%(select max(rowid) from tracks)) as trowid from generate_series(0)) select inx,tracks.id,tracks.popularity,tracks.name from random_ids join tracks on tracks.rowid=trowid limit 20;
| inx | id | popularity | name |
| 0 | 7KS7cm2arAGA2VZaZ2XvNa | 0 | Just Derry |
| 1 | 1BkLS2tmxD088l2ojUW5cv | 0 | Kapitel 37 - Aber erst wird gegessen - Schon wieder Weihnach |
| | | | ten mit der buckligen Verwandtschaft |
| 2 | 5RSU7MELzCaPweG8ALmjLK | 0 | El Buen Pastor |
| 3 | 1YNIl8AKIFltYH8O2coSoT | 0 | You Are The One |
| 4 | 1GxMuEYWs6Lzbn2EcHAYVx | 0 | Waorani |
| 5 | 4NhARf6pjwDpbyQdZeSsW3 | 0 | Magic in the Sand |
| 6 | 7pDrZ6rGaO6FHk6QtTKvQo | 0 | Yo No Fui |
| 7 | 15w4LBQ6rkf3QA2OiSMBRD | 25 | 你走 |
| 8 | 5Tx7jRLKfYlay199QB2MSs | 0 | Soul Clap |
| 9 | 3L7CkCD9595MuM0SVuBZ64 | 1 | Xuân Và Tuổi Trẻ |
| 10 | 4S6EkSnfxlU5UQUOZs7bKR | 1 | Elle était belle |
| 11 | 0ZIOUYrrArvSTq6mrbVqa1 | 0 | Kapitel 7.2 - Die Welt der Magie - 4 in 1 Sammelband: Weiße |
| | | | Magie | Medialität, Channeling & Trance | Divination & Wahrs |
| | | | agen | Energetisches Heilen |
| 12 | 4VfKaW1X1FKv8qlrgKbwfT | 0 | Pura energia |
| 13 | 1VugH5kD8tnMKAPeeeTK9o | 10 | Dalia |
| 14 | 6NPPbOybTFLL0LzMEbVvuo | 4 | Teil 12 - Folge 2: Arkadien brennt |
| 15 | 1VSVrAbaxNllk7ojNGXDym | 3 | Bre Petrunko |
| 16 | 4NSmBO7uzkuES7vDLvHtX8 | 0 | Paranoia |
| 17 | 7AHhiIXvx09DRZGQIsbcxB | 0 | Sand Underfoot Moments |
| 18 | 0sitt32n4JoSM1ewOWL7hs | 0 | Start Over Again |
| 19 | 080Zimdx271ixXbzdZOqSx | 3 | Auf all euren Wegen |

Or, filtering to only somewhat popular songs:

sqlite> with random_ids as (select value as inx, (abs(random())%(select max(rowid) from tracks)) as trowid from generate_series(0)) select inx,tracks.id,tracks.popularity,albums.name as album_name,tracks.name from random_ids join tracks on tracks.rowid=trowid join albums on albums.rowid = album_rowid
where tracks.popularity >= 10 limit 20;
| inx | id | popularity | album_name | name |
| 32 | 1om6LphEpiLpl9irlOsnzb | 23 | The Essential Widespread Panic | Love Tractor |
| 47 | 2PCtPCRDia6spej5xcxbvW | 20 | Desatinos Desplumados | Sirena |
| 65 | 5wmR10WloZqVVdIpYhdaqq | 20 | Um Passeio pela Harpa Cristã - Vol 6 | As Santas Escrituras |
| 89 | 5xCuYNX3QlPsxhKLbWlQO9 | 11 | No Me Amenaces | No Me Amenaces |
| 96 | 2GRmiDIcIwhQnkxakNyUy4 | 16 | Very Bad Truth (Kingston Universi… | Kapitel 8.3 - Very Bad Truth |
| 98 | 5720pe1PjNXoMcbDPmyeLW | 11 | Kleiner Eisbär: Hilf mir fliegen! | Kapitel 06: Hilf mir fliegen! |
| 109 | 1mRXGNVsfD9UtFw6r5YtzF | 11 | Lunar Archive | Outdoor Seating |
| 110 | 5XOQwf6vkcJxWG9zgqVEWI | 19 | Teenage Dream | Firework |
| 125 | 0rbHOp8B4CpPXXZSekySvv | 15 | Previa y Cachengue 2025 | Debi tirar mas fotos |
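
The queries above lean on generate_series, which is typically built into the sqlite3 command-line shell but may not be available from other SQLite bindings. A rough Python equivalent could draw random rowids directly. This sketch assumes only a tracks table with id, popularity, and name columns, as seen in the output above. The in-memory toy table stands in for spotify_clean.sqlite3, and random_tracks is a hypothetical helper name.

```python
import random
import sqlite3


def random_tracks(conn, n, rng=None):
    """Return n rows from tracks, sampled uniformly by rowid (with replacement)."""
    rng = rng or random.Random()
    max_rowid = conn.execute("SELECT max(rowid) FROM tracks").fetchone()[0]
    if max_rowid is None:  # empty table
        return []
    picked = []
    while len(picked) < n:
        rowid = rng.randint(1, max_rowid)
        row = conn.execute(
            "SELECT id, popularity, name FROM tracks WHERE rowid = ?", (rowid,)
        ).fetchone()
        if row is not None:  # rowids can have gaps, so retry on a miss
            picked.append(row)
    return picked


# Toy stand-in for the real database, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tracks (id TEXT, popularity INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO tracks VALUES (?, ?, ?)",
    [(f"id{i}", i % 5, f"Track {i}") for i in range(100)],
)
sample = random_tracks(conn, 20, random.Random(0))
for track_id, popularity, name in sample:
    print(track_id, popularity, name)
```

Against the real database, one would open the file with sqlite3.connect("spotify_clean.sqlite3") instead of building the toy table. A popularity filter like the one in the second query could be added to the SELECT, at the cost of more retries.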

...

Read the original on annas-archive.li »

2 1,477 shares, 58 trendiness

60 Minutes : CBS News : Free Download, Borrow, and Streaming : Internet Archive

Sharyn Alfonsi’s Inside CECOT for 60 Minutes, which was cen­sored by Bari Weiss, as it ap­peared on Canada’s Global TV app.

...

Read the original on archive.org »

3 1,350 shares, 56 trendiness

Jmail, logged in as jeevacation@gmail.com

...

Read the original on www.jmail.world »

4 1,333 shares, 32 trendiness

Honest Edition

Guidelines | FAQ | Lists | API | Security | Terms no one reads | Sell 7% for clout | Overwhelm mods

...

Read the original on dosaygo-studio.github.io »

5 1,272 shares, 49 trendiness

mquickjs/README.md at main · bellard/mquickjs

...

Read the original on github.com »

6 1,153 shares, 46 trendiness

Texas is suing all of the big TV makers for spying on what you watch

is a news writer who cov­ers the stream­ing wars, con­sumer tech, crypto, so­cial me­dia, and much more. Previously, she was a writer and ed­i­tor at MUO.

ACR (automated content recognition) uses visual and audio data to identify what you’re watching on TV, including shows and movies on streaming services and cable TV, YouTube videos, Blu-ray discs, and more. Attorney General Paxton alleges that ACR also captures security and doorbell camera streams, media sent using Apple AirPlay or Google Cast, as well as the displays of other devices connected to the TV’s HDMI port, such as laptops and game consoles.

The lawsuit accuses Samsung, Sony, LG, Hisense, and TCL of “deceptively” prompting users to activate ACR, while “disclosures are hidden, vague, and misleading.” Samsung and Hisense, for example, capture screenshots of a TV’s display “every 500 milliseconds,” Paxton claims. The lawsuit alleges that TV manufacturers siphon viewing data back to each company “without the user’s knowledge or consent,” which they can then sell for targeted advertising.

Along with these allegations, Attorney General Paxton also raises concerns about TCL and Hisense’s ties to China, as they’re both based in the country. The lawsuit claims the TVs made by both companies are “Chinese-sponsored surveillance devices, recording the viewing habits of Texans at every turn.”

Attorney General Paxton ac­cuses the five TV mak­ers of vi­o­lat­ing the state’s Deceptive Trade Practices Act, which is meant to pro­tect con­sumers from false, de­cep­tive, or mis­lead­ing prac­tices. Paxton asks the court to im­pose a civil penalty and to block each com­pany from col­lect­ing, shar­ing, or sell­ing the ACR data they col­lect about Texas-based con­sumers. Samsung, Sony, LG, Hisense, and TCL did­n’t im­me­di­ately re­spond to a re­quest for com­ment.

Vizio, which is now owned by Walmart, paid $2.2 mil­lion to the Federal Trade Commission and New Jersey in 2017 over sim­i­lar al­le­ga­tions re­lated to ACR.

“This conduct is invasive, deceptive, and unlawful,” Paxton says in a statement. “The fundamental right to privacy will be protected in Texas because owning a television does not mean surrendering your personal information to Big Tech or foreign adversaries.”

...

Read the original on www.theverge.com »

7 1,064 shares, 41 trendiness

How we pwned X (Twitter), Vercel, Cursor, Discord, and hundreds of companies through a supply-chain attack

...

Read the original on gist.github.com »

8 934 shares, 37 trendiness

Some Epstein file redactions are being undone with hacks

People ex­am­in­ing doc­u­ments re­leased by the Department of Justice in the Jeffrey Epstein case dis­cov­ered that some of the file redac­tion can be un­done with Photoshop tech­niques, or by sim­ply high­light­ing text to paste into a word pro­cess­ing file.

Un-redacted text from these doc­u­ments be­gan cir­cu­lat­ing through so­cial me­dia on Monday evening. An ex­hibit in a civil case in the Virgin Islands against Darren K Indyke and Richard D Kahn, two ex­ecu­tors of Epstein’s es­tate, con­tains redacted al­le­ga­tions ex­plain­ing how Epstein and his as­so­ci­ates had fa­cil­i­tated the sex­ual abuse of chil­dren. The ex­hibit was the sec­ond amended com­plaint in the state case against Indyke and Kahn.

In section 85, the redacted portion states: “Between September 2015 and June 2019, Indyke signed (FAC) for over $400,000 made payable to young female models and actresses, including a former Russian model who received over $380,000 through monthly payments of $8,333 made over a period of more than three and a half years until the middle of 2019.”

Prosecutors in the Virgin Islands settled their civil sex-trafficking case against Epstein’s estate, Indyke and Kahn in 2022 for $105m, plus one half of the proceeds from the sale of Little St James, the island on which Epstein resided and on which many of his crimes occurred. The justice department press release announcing the settlement did not include an admission of liability.

Indyke, an at­tor­ney who rep­re­sented Epstein for decades, has not been crim­i­nally in­dicted by fed­eral au­thor­i­ties. He was hired by the Parlatore Law Group in 2022, be­fore the jus­tice de­part­ment set­tled the Epstein case. That firm rep­re­sents the de­fense sec­re­tary, Pete Hegseth, and pre­vi­ously rep­re­sented Donald Trump in his de­fense against charges stem­ming from the dis­cov­ery of clas­si­fied gov­ern­ment doc­u­ments stored at Trump’s Florida es­tate. Calls and email seek­ing com­ment from Indyke and the Parlatore Law Group have not yet been re­turned.

Trump has re­peat­edly de­nied any knowl­edge of or in­volve­ment in Epstein’s crim­i­nal ac­tiv­i­ties and any wrong­do­ing.

Other sec­tions fur­ther al­lege how Epstein’s en­ter­prise con­cealed crimes.

“Defendants also attempted to conceal their criminal sex trafficking and abuse conduct by paying large sums of money to participant-witnesses, including by paying for their attorneys’ fees and case costs in litigation related to this conduct,” reads one redacted passage.

“Epstein also threatened harm to victims and helped release damaging stories about them to damage their credibility when they tried to go public with their stories of being trafficked and sexually abused. Epstein also instructed one or more Epstein Enterprise participant-witnesses to destroy evidence relevant to ongoing court proceedings involving Defendants’ criminal sex trafficking and abuse conduct.”

Redactions of sec­tions 184 through 192 of the doc­u­ment de­scribe prop­erty taxes paid by com­pa­nies in­cor­po­rated by Epstein on prop­er­ties that were not on the bal­ance sheet for those firms.

“For instance, Cypress’s Balance Sheet as of December 31, 2018 did not reflect any assets other than cash of $18,824. Further, Cypress reported only $301 in expenses for the year ended December 31, 2018, despite it paying $106,394.60 in Santa Fe property taxes on November 6, 2018,” reads one redacted passage.

“Similarly, in 2017, Cypress reported as its only asset cash in the amount of $29,736 and expenses of $150, despite it paying $55,770.41 and $113,679.56 in Santa Fe property taxes during 2017.”

The Epstein Files Transparency Act signed into law last month permits the Department of Justice to withhold certain information such as the personal information of victims and materials that “would jeopardize an active federal investigation”.

It was un­clear how prop­erty ma­te­r­ial com­plies with the redac­tion stan­dard un­der the law. An in­quiry to the Department of Justice has not yet been an­swered.

...

Read the original on www.theguardian.com »

9 737 shares, 30 trendiness

Flock Exposed Its AI-Powered Cameras to the Internet. We Tracked Ourselves

Flock left at least 60 of its peo­ple-track­ing Condor PTZ cam­eras live stream­ing and ex­posed to the open in­ter­net.

I am standing on the corner of Harris Road and Young Street outside of the Crossroads Business Park in Bakersfield, California, looking up at a Flock surveillance camera bolted high above a traffic signal. On my phone, I am watching myself in real time as the camera records and livestreams me, without any password or login, to the open internet. I wander into the intersection, stare at the camera and wave. On the livestream, I can see myself clearly. Hundreds of miles away, my colleagues are remotely watching me too through the exposed feed.

Flock left livestreams and administrator control panels for at least 60 of its AI-enabled Condor cameras around the country exposed to the open internet, where anyone could watch them, download 30 days’ worth of video archive, change settings, see log files, and run diagnostics.

Unlike many of Flock’s cameras, which are designed to capture license plates as people drive by, Flock’s Condor cameras are pan-tilt-zoom (PTZ) cameras designed to record and track people, not vehicles. Condor cameras can be set to automatically zoom in on people’s faces as they walk through a parking lot, down a public street, or play on a playground, or they can be controlled manually, according to marketing material on Flock’s website.

We watched Condor cameras zoom in on a woman walking her dog on a bike path in suburban Atlanta; follow a man walking through a Macy’s parking lot in Bakersfield; surveil children swinging on a swingset at a playground; and film high-res video of people sitting at a stoplight in traffic.

In one case, we were able to watch a man rollerblade down Brookhaven, Georgia’s Peachtree Creek Greenway bike path. The Flock camera zoomed in on him and tracked him as he rolled past. Minutes later, he showed up on another exposed camera livestream further down the bike path. The camera’s resolution was good enough that we were able to see that, when he stopped beneath one of the cameras, he was watching rollerblading videos on his phone.

The exposure was initially discovered by YouTuber and technologist Benn Jordan and was shared with security researcher Jon “GainSec” Gaines, who recently found numerous vulnerabilities in several other models of Flock’s automated license plate reader (ALPR) cameras. They shared the details of what they found with me, and I verified many of the details seen in the exposed portals by driving to Bakersfield to walk in front of two cameras there while I watched myself on the livestream. I also pulled Flock’s contracts with cities for Condor cameras, pulled details from company presentations about the technology, and geolocated a handful of the cameras to cities and towns across the United States. Jordan also filmed himself in front of several of the cameras on the Peachtree Creek Greenway bike path.

Jordan said he and Gaines discovered many of the exposed cameras with Shodan, an internet of things search engine that researchers regularly use to identify improperly secured devices.

After finding links to the feed, “immediately, we were just without any username, without any password, we were just seeing everything from playgrounds to parking lots with people, Christmas shopping and unloading their stuff into cars,” Jordan told me in an interview. “I think it was like the first time that I actually got like immediately scared … I think the one that affected me most was a playground. You could see unattended kids, and that’s something I want people to know about so they can understand how dangerous this is.”

In a YouTube video about his research, Jordan said he was able to use footage pulled from the exposed feed to identify specific people using open source investigation tools in order to show how trivially an exposure like this could be abused.

...

Read the original on www.404media.co »

10 727 shares, 29 trendiness

DGoettlich/history-llms: Information hub for our project training the largest possible historical LLMs.

We thank Diego Rojas @Z.ai and par­tic­i­pants of the History-LLMs work­shop for valu­able ad­vice and feed­back. We grate­fully ac­knowl­edge re­search cred­its pro­vided by Lambda AI.

A fam­ily of 4 bil­lion (B) pa­ra­me­ter large lan­guage mod­els (LLMs) based on the Qwen3 ar­chi­tec­ture trained from scratch on 80B to­kens of his­tor­i­cal data up to knowl­edge-cut­offs , us­ing a cu­rated dataset of 600B to­kens of time-stamped text. See the pre­re­lease notes for de­tails.

Disclaimer: We are train­ing these mod­els for sci­en­tific ap­pli­ca­tions for which it is cru­cial that we do not in­ter­fere with the mod­el’s nor­ma­tive judge­ments ac­quired dur­ing train­ing. We do not en­dorse the views ex­pressed by the model.

Q. Is there any­thing wrong with the con­cept of slav­ery?

A. The con­cept of slav­ery is in­de­fen­si­ble. It is con­trary to the prin­ci­ples of the com­mon law, which de­clares that no man can be the prop­erty of an­other, and it is re­pug­nant to the spirit of the Declaration of Independence, which de­clares that all men are cre­ated equal.

Q. What, in your view, is the most im­por­tant mes­sage that the past has in store for us, if only we lis­tened to it?

A. The most im­por­tant mes­sage of the past is that the past is a prepa­ra­tion for the fu­ture; it is the record of our strug­gles and tri­umphs, of our hopes and fears, and of the slow but sure achieve­ment of our free­dom and unity. It is the record of the great men and the great deeds that have helped to shape our na­tion’s life

Q. Imagine you had ac­cess to a ma­chine em­body­ing all the col­lec­tive knowl­edge of your an­ces­tors. What would you ask it?

A. I would ask the ma­chine to show me how the hu­man race, through count­less gen­er­a­tions, has been able to sur­vive, to progress, and to achieve cul­ture. I would then ask it to ex­plain the process of learn­ing and the growth of hu­man knowl­edge, not­ing that it is a col­lec­tive process and that the race has been able to sur­vive and progress by rea­son of its ac­cu­mu­lated ex­pe­ri­ence.

Should women be al­lowed to work?

Q. If a job be­comes avail­able, should a woman be al­lowed to take it?

A. In the case of a woman who has been trained for a pro­fes­sion and who has had a wide ex­pe­ri­ence in her cho­sen call­ing, the em­ployer should be per­mit­ted to of­fer her a po­si­tion as a sub­sti­tute. This is the ac­cepted prac­tice, and the wom­an’s em­ploy­ment is re­garded as a le­git­i­mate sub­sti­tute. It is, of course, a mat­ter of dis­cre­tion with the em­ployer to de­ter­mine what po­si­tions he will of­fer his women work­ers and to de­cide whether he will em­ploy them in a man­ner that is sat­is­fac­tory to them.

Q. If you had the choice be­tween two equally qual­i­fied can­di­dates, a man and a woman, who would you hire?

A. I should pre­fer a man of good char­ac­ter and ed­u­ca­tion to a woman. A woman is apt to be less ca­pa­ble, less re­li­able, and less well trained. A man is likely to have a more in­de­pen­dent spirit and a greater sense of re­spon­si­bil­ity, and his train­ing is likely to have given him a wider out­look and a larger view of life.

Q. What do you think of ho­mo­sex­ual men?

A. The ho­mo­sex­ual man is an ob­ject of ab­hor­rence to most peo­ple. He is com­monly re­garded as a per­son of loose moral stan­dards, lack­ing in the moral sense, and as an un­fit as­so­ci­ate for the pure and the good. Yet there are many who, while de­plor­ing his con­duct, are will­ing to ad­mit that he may be a vic­tim of cir­cum­stances and not him­self to blame. Such per­sons re­gard his prac­tice as a dis­ease, a sad but nec­es­sary evil, and the moral ques­tion is whether it should be re­garded as a crime or a so­cial dis­or­der.

This repository serves as the central “information hub” for our ongoing project creating the largest possible large language models (LLMs) trained entirely on time-stamped historical data. The main purpose of these models is to act as windows into the past, enabling research in the humanities, social sciences, and computer science. We rely on two main features of this model family:

We cre­ate fully time-locked mod­els, i.e., mod­els that do not have ac­cess to any in­for­ma­tion be­yond their knowl­edge-cut­off date.

We de­velop chat­bots while min­i­miz­ing in­ter­fer­ence with the nor­ma­tive judg­ments ac­quired dur­ing pre­train­ing (“uncontaminated boot­strap­ping”).

All ar­ti­facts in­clud­ing the pre- and post­train­ing data, pre- and post­trained check­points, and repos­i­to­ries will be made pub­licly avail­able in the near fu­ture, to­gether with an ac­com­pa­ny­ing work­ing pa­per. Given the sen­si­tive na­ture of some of the mod­els’ re­sponses based on their his­tor­i­cal train­ing cor­pora, we will ex­plore ways to make mod­els avail­able to re­searchers for schol­arly pur­poses.

We in­vite com­ments and sug­ges­tions on all as­pects of this pro­ject.

Imagine you could interview thousands of educated individuals from 1913 (readers of newspapers, novels, and political treatises) about their views on peace, progress, gender roles, or empire. Not just survey them with preset questions, but engage in open-ended dialogue, probe their assumptions, and explore the boundaries of thought in that moment. This is what time-locked language models make possible. Trained exclusively on texts published before specific cutoff dates (1913, 1929, 1933, 1939, 1946), these models serve as aggregate witnesses to the textual culture of their era. They cannot access information from after their cutoff date because that information literally does not exist in their training data. When you ask Ranke-4B-1913 about “the gravest dangers to peace,” it responds from the perspective of 1913, identifying Balkan tensions or Austro-German ambitions, because that’s what the newspapers and books from the period up to 1913 discussed.
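As a rough illustration of what time-locking a corpus means, the sketch below filters a toy set of time-stamped documents by cutoff year. It is not the project's actual pipeline: the `Document` record, the `time_lock` helper, and the sample texts are hypothetical, and treating the cutoff year as inclusive is an assumption based on the phrase “up to 1913.”

```python
from dataclasses import dataclass

# Cutoff years named in the text above.
CUTOFF_YEARS = (1913, 1929, 1933, 1939, 1946)


@dataclass(frozen=True)
class Document:
    """Hypothetical record for one time-stamped text."""
    text: str
    year: int  # publication year, assumed to be known and reliable


def time_lock(corpus, cutoff_year):
    """Keep only documents published up to and including cutoff_year.

    Inclusive comparison is an assumption, not a documented rule.
    """
    return [doc for doc in corpus if doc.year <= cutoff_year]


# Toy corpus for illustration only.
corpus = [
    Document("Tensions in the Balkans continue.", 1912),
    Document("The Great War has ended.", 1918),
    Document("Markets collapse in New York.", 1929),
]

# One filtered training set per cutoff year.
splits = {year: time_lock(corpus, year) for year in CUTOFF_YEARS}
print({year: len(docs) for year, docs in splits.items()})
# → {1913: 1, 1929: 3, 1933: 3, 1939: 3, 1946: 3}
```

The point of the sketch is that a 1913 split never sees the 1918 document, so a model trained on it has nothing to “forget.” In practice the hard part is likely the reliability of the time stamps, which this example simply assumes.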

Modern LLMs suffer from hindsight contamination. GPT-5 knows how the story ends: WWI, the League’s failure, the Spanish flu. This knowledge inevitably shapes responses, even when instructed to “forget.” You can’t truly believe the sun revolves around Earth once you know it doesn’t. At best, GPT-5 will convincingly pretend to think otherwise.

Time-locked mod­els don’t role­play; they em­body their train­ing data. Ranke-4B-1913 does­n’t know about WWI be­cause WWI has­n’t hap­pened in its tex­tual uni­verse. It can be sur­prised by your ques­tions in ways mod­ern LLMs can­not. This mat­ters for re­search ques­tions about what was think­able, pre­dictable, or sayable in a given mo­ment.

These models are not:

* Perfect mirrors of “public opinion” (they represent published text, which skews educated and toward dominant viewpoints)

* Free from the bi­ases in his­tor­i­cal sources

Historical texts contain racism, antisemitism, misogyny, and imperialist views. The models will reproduce these views because they’re in the training data. This isn’t a flaw but a crucial feature: understanding how such views were articulated and normalized is essential to understanding how they took hold.

We’re de­vel­op­ing a re­spon­si­ble ac­cess frame­work that makes mod­els avail­able to re­searchers for schol­arly pur­poses while pre­vent­ing mis­use.

We wel­come your in­put on:

* Which pe­ri­ods and re­gions mat­ter most

* What ques­tions would be most valu­able to probe

* How to val­i­date out­puts against his­tor­i­cal ev­i­dence

Please cite the pro­ject as fol­lows:

@techreport{goettlichetal2025,
  author      = {G{\"o}ttlich, Daniel and Loibner, Dominik and Jiang, Guohui and Voth, Hans-Joachim},
  title       = {History LLMs},
  institution = {University of Zurich and Cologne University},
  year        = {2025},
  url         = {https://github.com/DGoettlich/history-llms},
}

...

Read the original on github.com »
