10 interesting stories served every morning and every evening.

1 1,035 shares, 37 trendiness, 781 words and 7 minutes reading time

Async-await on stable Rust!

On this coming Thursday, November 7, async-await syntax hits stable Rust, as part of the 1.39.0 release. This work has been a long time in development — the key ideas for zero-cost futures, for example, were first proposed by Aaron Turon and Alex Crichton in 2016! — and we are very proud of the end result. We believe that Async I/O is going to be an increasingly important part of Rust’s story.

While this first release of “async-await” is a momentous event, it’s also only the beginning. The current support for async-await marks a kind of “Minimum Viable Product” (MVP). We expect to be polishing, improving, and extending it for some time.

Already, in the time since async-await hit beta, we’ve made a lot of great progress, including making some key diagnostic improvements that help to make async-await errors far more approachable. To get involved in that work, check out the Async Foundations Working Group; if nothing else, you can help us by filing bugs about polish issues or by nominating those bugs that are bothering you the most, to help direct our efforts.

Many thanks are due to the people who made async-await a reality. The implementation and design would never have happened without the leadership of cramertj and withoutboats, the implementation and polish work from the compiler side (davidtwco, tmandry, gilescope, csmoe), the core generator support that futures builds on (Zoxc), the foundational work on Future and the Pin APIs (aturon, alexcrichton, RalfJ, pythonesque), and of course the input provided by so many community members on RFC threads and discussions.

Now that async-await is approaching stabilization, all the major Async I/O runtimes are at work adding and extending their support for the new syntax.

So, what is async-await? Async-await is a way to write functions that can “pause”, return control to the runtime, and then pick up from where they left off. Typically those pauses are to wait for I/O, but there can be any number of uses.

You may be familiar with async-await from JavaScript or C#. Rust’s version of the feature is similar, but with a few key differences.

To use async-await, you start by writing async fn instead of fn:

async fn first_function() -> u32 { .. }

Unlike a regular function, calling an async fn doesn’t have any immediate effect. Instead, it returns a Future. This is a suspended computation that is waiting to be executed. To actually execute the future, use the .await operator:

async fn another_function() {
    // Create the future:
    let future = first_function();

    // Await the future, which will execute it (and suspend
    // this function if we encounter a need to wait for I/O):
    let result: u32 = future.await;
}

This example shows the first difference between Rust and other languages: we write future.await instead of await future. This syntax integrates better with Rust’s ? operator for propagating errors (which, after all, are very common in I/O). You can simply write future.await? to await the result of a future and propagate errors. It also has the advantage of making method chaining painless.
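As a small sketch of how the two compose (the `fetch_text` function and its body here are hypothetical, standing in for any fallible async operation):

```rust
use std::io;

// Hypothetical fallible async operation, standing in for real I/O
// (the name `fetch_text` is illustrative, not from the release post).
async fn fetch_text(url: &str) -> io::Result<String> {
    Ok(format!("response from {}", url))
}

async fn caller() -> io::Result<usize> {
    // `.await?` first awaits the future, then propagates any error
    // with `?`; the postfix position also chains naturally with methods:
    let len = fetch_text("https://example.com").await?.len();
    Ok(len)
}
```

With a prefix keyword this would have read `(await fetch_text(url))?.len()`, which nests awkwardly.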

The other difference between Rust futures and futures in JS and C# is that they are based on a “poll” model, which makes them zero cost. In other languages, invoking an async function immediately creates a future and schedules it for execution: awaiting the future isn’t necessary for it to execute. But this implies some overhead for each future that is created.

In contrast, in Rust, calling an async function does not do any scheduling in and of itself, which means that we can compose a complex nest of futures without incurring a per-future cost. As an end-user, though, the main thing you’ll notice is that futures feel “lazy”: they don’t do anything until you await them.
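That laziness, and the poll model underneath it, can be made concrete with a toy executor (a minimal busy-polling sketch; a real program would use a runtime such as tokio or async-std):

```rust
use std::future::Future;
use std::pin::pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// A minimal busy-loop "executor": polls a future until it is ready.
// Only suitable for futures that never actually park waiting on I/O.
fn block_on<F: Future>(fut: F) -> F::Output {
    fn raw_waker() -> RawWaker {
        fn noop(_: *const ()) {}
        fn clone(_: *const ()) -> RawWaker { raw_waker() }
        static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
        RawWaker::new(std::ptr::null(), &VTABLE)
    }
    let waker = unsafe { Waker::from_raw(raw_waker()) };
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

async fn side_effect() -> u32 {
    println!("now running");
    7
}

fn main() {
    let fut = side_effect(); // nothing printed yet: the future is inert
    let n = block_on(fut);   // polling drives the body to completion
    assert_eq!(n, 7);
}
```

Until `block_on` polls it, the future built by `side_effect()` is just a suspended state machine; no scheduling has happened.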

If you’d like a closer look at how futures work under the hood, take a look at the executor section of the async book, or watch the excellent talk that withoutboats gave at Rust LATAM 2019 on the topic.

We believe that having async-await on stable Rust is going to be a key enabler for a lot of new and exciting developments in Rust. If you’ve tried Async I/O in Rust in the past and had problems — particularly if you tried the combinator-based futures of the past — you’ll find async-await integrates much better with Rust’s borrowing system. Moreover, there are now a number of great runtimes and other libraries available in the ecosystem to work with. So get out there and build stuff!


Read the original on blog.rust-lang.org »

2 882 shares, 32 trendiness, 1091 words and 9 minutes reading time

Give Firefox A Chance For A Faster, Calmer And Distraction-Free Internet

We’re living in the age of Google Chrome browser dominance (65% of the market share worldwide), but for the first time in a few years, Chrome has some very serious competition.

Firefox is an open-source browser made by a non-profit organization named Mozilla. The mission of the Mozilla Foundation is to help build a healthier, more open and accessible internet.

Over the last year Firefox has gotten a lot faster and more resource-friendly.

The team behind it has made some ethical and people-friendly decisions that make the web more private, much faster and distraction-free for everyone.

Using Firefox gives you peace of mind and keeps you away from the advertising companies constantly following you around, profiling you and tempting you to purchase their products.

Firefox currently stands at 4% of the browser market share worldwide, and that’s a shame. Many more people would find great value in using it.

You can simply install Firefox and start surfing right away, but here’s a brief look at some of the Firefox features you can explore.

When I install Firefox, the first thing I do is use the Customize Firefox section. Many interesting options in there let you make Firefox look and feel the way you want to.

I remove everything from the toolbar. I hide the title bar, menu bar and bookmarks bar. I select a dark theme and set the density to compact. This gives me a nice, clean and minimal browser.

I untick everything in this section, with the result being a nice blank, dark home page and new tab page.

You can tick “Web Search” if you prefer a minimalist home page with a nice Firefox logo and a search box like this:

Select a more human-friendly search engine. DuckDuckGo is the best option of all the default choices.

Then visit the search engine you’d like to use as the default. Click on the search box on the right-hand side of your toolbar and select “Add” to add it.

Now go back to your Search Preferences and select the newly added search engine as your new default.

You can now also remove the search bar from the toolbar if you wish.

Want more options for good Google alternatives? These are three that I like:

* StartPage.com — Gives you the exact same search results as Google but without all that tracking and profiling.
* Qwant.com — A private search engine with a similar philosophy to DuckDuckGo, but based in France, Europe.
* Ecosia.org — They plant trees for each search that you make.

Choose “Strict” or go “Custom” to make it even stricter.

In “Cookies” block “All third-party cookies” and in “Tracking Content” select “All Windows”. Tick “Cryptominers” and “Fingerprinters” too.

This way I am a bit safer from all the trackers. All the third-party trackers and cookies are blocked automatically. As a side effect of this, pretty much all the intrusive advertising is blocked.

Some ads that are served from the first party, that are not personalized and that have no tracking are visible. Examples are ads on DuckDuckGo, Twitter and Reddit.

If you want to block all the ads (and do even more content blocking) you’ll need an ad blocker such as uBlock Origin.

I’m happy to be exposed to contextual and text-based ads without tracking, personalization and surveillance, so plain Firefox without any extensions works well.

On a typical Wired.com article, Firefox blocks four social media trackers from Facebook, Twitter and LinkedIn.

In addition, it also blocks 24 pieces of tracking content from companies such as Hotjar and Amazon.

And last but not least, it blocks eight third-party cookies from companies such as Google and Snapchat.

That’s more than 35 blocks on every page load.

Firefox actually ends up blocking the majority of the data Wired tries to load, but all the content still looks the same as it would without any of the blocks.

And Wired still gets to place some non-intrusive advertising, such as promoting the option to subscribe to the magazine.

I tick “Delete cookies and site data when Firefox is closed”, and use “Manage Permissions” to exempt cookies from websites I want to stay logged into.

I recommend you enable everything here, especially if you’re not using a password manager.

Firefox will not only save all your passwords, but it will also auto-fill them, generate strong passwords for you when signing up for new accounts, and even alert you in case websites you visit have been breached.

I block new requests asking for access to things such as “Notifications”, “Location” and “Autoplay” of video and audio.

Simply click on the “Settings” button next to the different items and select “Block new requests asking to access”.

The web is so much calmer without all those prompts asking you to enable or allow this and that.

I manually enable these for the specific sites where I really need or want them.

I love the Firefox Reader View. A “Reader View” icon will show on the right-hand side of your toolbar on supported sites. It looks like this:

Reader View basically strips away all the distractions such as buttons, ads and other website elements. It gives you pure content and content only.

You can even change the default layout of the Reader View by choosing a light or dark mode, and changing the font and the font size.

Here’s how a typical New York Times article looks with the Reader View off and on:

Type about:config in the address bar to visit the Configuration Editor, which is full of hidden Firefox preferences aimed at advanced users.

Click the “I accept the risk!” button on the humorous “This might void your warranty!” warning message.

Search for “privacy.firstparty.isolate” and set its value to “true”.

First-Party Isolation is a great feature and I expect it to make it into the default settings in the near future.

This feature restricts cookies, cache and other site data so they can only be accessed by the first-party domain name.

This stops advertising companies from being able to follow and track your behavior across the different sites that you visit.

This was meant as a brief beginner introduction to what Firefox offers out of the box, using the built-in preferences.

Firefox features many other options, such as containers, themes and thousands of extensions that you can enable to add any feature that you may wish for.

This is something you can start to explore as you get more used to Firefox and more comfortable within the Firefox environment.

Give Firefox a chance now and enjoy a more open, private and human-centric web experience!


Read the original on marko.fyi »

3 759 shares, 29 trendiness, 809 words and 8 minutes reading time

Postgres is a great pub/sub & job server

If you’re making any project of sufficient complexity, you’ll need a publish/subscribe server to process events. This article will introduce you to Postgres, explain the alternatives, and walk you through an example use case of pub/sub and its solution.

If you aren’t too familiar with Postgres, it’s a feature-packed relational database that many companies use as a traditional central data store. By storing your “users” table in Postgres, you can immediately scale to 100 columns and a row for every living person.

It’s possible to scale Postgres to storing a billion 1KB rows entirely in memory - this means you could quickly run queries against the full name of everyone on the planet on commodity hardware and with little fine-tuning.

I’m not going to belabor the point that something called “PostgreSQL” is a good SQL database. I’ll show you a more interesting use case for it where we combine a few features to turn Postgres into a powerful pub/sub / job server.

If you do enough system design, you’ll inevitably need to solve a problem with publish/subscribe architecture. We hit it quickly at LayerCI - we needed to keep the viewers of a test run’s page and the GitHub API notified about a run as it progressed.

For your pub/sub server, you have a lot of options.

There are very few use cases where you’d need a dedicated pub/sub server like Kafka. Postgres can easily handle 10,000 insertions per second, and it can be tuned to even higher numbers. It’s rarely a mistake to start with Postgres and then switch out the most performance-critical parts of your system when the time comes.

Among those options, I skipped things similar to pub/sub servers called “job queues” - they only let one “subscriber” watch for new “events” at a time, and keep a queue of unprocessed events.

It turns out that Postgres generally supersedes job servers as well. You can have your workers “watch” the “new events” channel and try to claim a job whenever a new one is pushed. As a bonus, Postgres lets other services watch the status of the events with no added complexity.

At LayerCI, we run “test runs”, which start by cloning a repository and then running some user-specified tests. There are microservices that do various initialization steps for the test run, and additional microservices (such as the websocket gateway) that need to listen to the status of the runs.

An instance of an API server creates a run by inserting a row into the “runs” table of a Postgres database:

CREATE TYPE ci_job_status AS ENUM ('new', 'initializing', 'initialized', 'running', 'success', 'error');

CREATE TABLE ci_jobs(
    id SERIAL PRIMARY KEY,
    repository varchar(256),
    status ci_job_status,
    status_change_time timestamp
);

/*on API call*/
INSERT INTO ci_jobs(repository, status, status_change_time) VALUES ('https://github.com/colinchartier/layerci-color-test', 'new', NOW());

How do the workers “claim” a job? By setting the job status atomically:

UPDATE ci_jobs SET status='initializing'
WHERE id = (
    SELECT id
    FROM ci_jobs
    WHERE status='new'
    FOR UPDATE SKIP LOCKED
    LIMIT 1
)
RETURNING *;
Finally, we can use a trigger and a channel to notify the workers that there might be new work available:

CREATE OR REPLACE FUNCTION ci_jobs_status_notify()
RETURNS trigger AS
$$
BEGIN
    PERFORM pg_notify('ci_jobs_status_channel', NEW.id::text);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER ci_jobs_status
    AFTER INSERT OR UPDATE OF status
    ON ci_jobs
    FOR EACH ROW
EXECUTE PROCEDURE ci_jobs_status_notify();

All the workers have to do is “listen” on this status channel and try to claim a job whenever a job’s status changes:

tryPickupJob := make(chan interface{})
// equivalent to 'LISTEN ci_jobs_status_channel;'
// (listener here being a lib/pq *pq.Listener)
listener.Listen("ci_jobs_status_channel")
go func() {
    for event := range listener.Notify {
        select {
        case tryPickupJob <- event:
        default: // a pickup attempt is already pending; drop the notification
        }
    }
}()

When we combine these elements, we get the full architecture: the API server inserts a job, the trigger notifies the status channel, and a listening worker atomically claims it.

This architecture scales to many sequential workers processing the job in a row; all you need is a “processing” state and a “processed” state for each worker. For LayerCI that looks like: new, initializing, initialized, running, complete.

It also allows other services to watch the ci_jobs_status_channel - our websocket gateway for the /run page and GitHub notification services simply watch the channel and notify any relevant parties of the published events.

There are also a bunch of other benefits to using Postgres instead of something like Redis Pub/Sub:

* Many SQL users will already have Postgres installed for use as a database, so there are no extra setup costs for using it for pub/sub.

* As a database, Postgres has very good persistence guarantees - it’s easy to query “dead” jobs with, e.g., SELECT * FROM ci_jobs WHERE status='initializing' AND NOW() - status_change_time > '1 hour'::interval to handle workers crashing or hanging.

* Since jobs are defined in SQL, it’s easy to generate graphql and protobuf representations of them (i.e., to provide APIs that check the run status).

* It’s easy to have multiple watchers of status changes; other services can simply use the same “LISTEN ci_jobs_status_channel”.

* Postgres has very good language support, with bindings for most popular languages. This is a stark difference from most other pub/sub servers.

* You can also run complicated SQL queries on things that are still in your “work queues” to give highly tailored API endpoints to your users. LayerCI has pages like https://layerci.com/github/distributed-containers-inc/sanic that show the status of various jobs (even running ones).

If you need a publish/subscribe or job server at any point in your project, it’s not a bad idea to start with Postgres. It’ll give you lots of data integrity and performance guarantees, and it doesn’t require you or your team to learn any new technology.

LayerCI runs tests up to 95% faster by taking snapshots of the VM running the test as it progresses. You can try LayerCI for free and get running in under five minutes by clicking here.


Read the original on layerci.com »

4 700 shares, 25 trendiness, 829 words and 8 minutes reading time

Former Twitter employees charged with spying for Saudi Arabia by digging into the accounts of kingdom critics

The charges, unveiled Wednesday in San Francisco, came a day after the arrest of one of the former Twitter employees, Ahmad Abouammo, a U.S. citizen who is alleged to have spied on the accounts of three users — including one whose posts discussed the inner workings of the Saudi leadership — on behalf of the government in Riyadh.

The second former Twitter employee — Ali Alzabarah, a Saudi citizen — was accused of accessing the personal information of more than 6,000 Twitter accounts in 2015 on behalf of Saudi Arabia. One of those accounts belonged to a prominent dissident, Omar Abdulaziz, who later became close to Khashoggi, a Washington Post contributing columnist who advocated for free expression in the Arab world.

“The criminal complaint unsealed today alleges that Saudi agents mined Twitter’s internal systems for personal information about known Saudi critics and thousands of other Twitter users,” said U.S. Attorney David L. Anderson. “We will not allow U.S. companies or U.S. technology to become tools of foreign repression in violation of U.S. law.”

“Twitter restricts access to sensitive account information to a limited group of trained and vetted employees,” said a spokesman, who spoke on the condition of anonymity to protect the “safety” of Twitter personnel. “We understand the incredible risks faced by many who use Twitter to share their perspectives with the world and to hold those in power accountable. We have tools in place to protect their privacy and their ability to do their vital work.”

The three men are accused of working with a Saudi official who leads a charitable organization belonging to Mohammed. Based on a description of the charity, the official is Bader Al Asaker, which was confirmed by a person familiar with the case, who spoke on the condition of anonymity to discuss an ongoing case. Asaker’s charity, MiSK, belongs to Mohammed, who is referred to in the complaint as Royal Family Member 1.

According to the complaint, Asaker was “working for and at the direction of” Mohammed with respect to his “online presence” on Twitter. In 2015, when most of the activity took place, Mohammed was a rising figure in the Saudi royal family.

After King Abdullah died in January 2015 and Mohammed’s father, Salman, took the throne, he was appointed defense minister and, a few months later, deputy crown prince. The activity was particularly pronounced — with Abouammo repeatedly viewing the dissident Abdulaziz’s data and calling Asaker — during that period, according to the complaint.

“The case is incredibly significant,” said Adam Coogle, a Human Rights Watch researcher who just published a study on Saudi Arabia’s targeting of dissidents. “Twitter is the de facto public space of Saudi Arabia — the place where Saudi citizens come and discuss issues. It’s a space in which the Saudi authorities have used various means to curtail critical voices, including by seeking to unmask anonymous accounts.”

Abouammo, who was arrested in Seattle, worked for Twitter as a media partnerships manager. He met Asaker in London in late 2014. Within a week, he began illicitly accessing data for the Saudis, according to the complaint. One of his targets was referred to in the complaint as “Twitter User 1,” a “prominent critic” of the Saudi kingdom and royal family with more than 1 million followers. The description matches the account of @Mujtahidd, the Twitter handle for an anonymous person whose disclosures about corruption in the Saudi leadership have angered officials there. The identity of the targeted account was confirmed by the person familiar with the case.

Last fall, an FBI agent interviewed Abouammo at his home about the watch and his communications with Asaker and others. According to the complaint, Abouammo created a false receipt using his home computer during the interview to show a $100,000 payment received from Asaker, to disguise the payments as media strategy work.

In May 2015, Alzabarah flew to Washington, D.C., where he had plans to meet Asaker, according to the complaint. Almutairi was also in Washington at the time, as was Mohammed, who paid a visit to the White House.

Within a week of returning to San Francisco, Alzabarah began to “trawl through Twitter users’ private data en masse,” the complaint alleges. As Abouammo had done, Alzabarah looked at @Mujtahidd’s account. He also scrutinized the account of a “well-known and influential critic” of the government who lives in asylum in Canada, a description that matches Abdulaziz, whose identity in the case was confirmed by the person familiar with the matter.

Abdulaziz last month sued Twitter, alleging it failed to alert him that his account had been hacked by Alzabarah despite, the suit says, Twitter having reason to believe that had happened. Abdulaziz, whose two brothers have been detained by the Saudi government, forged a friendship with Khashoggi in 2017. The two were organizing a project inside Saudi Arabia to challenge pro-government trolls on Twitter in the months before Khashoggi’s killing.


Read the original on www.washingtonpost.com »

5 699 shares, 28 trendiness, 462 words and 4 minutes reading time

My name causes an issue with any booking! (names end with MR and MRS)

Airlines were early pioneers in communication technologies, and have been very slow to modernize. For example, today, airline IT systems still communicate extensively using TTY: Type-A for synchronous communication, and Type-B for asynchronous communications.

There is a standard for TTY, which nobody follows; a de-facto standard by SITA, which is mostly followed; and many parties have quirks in their implementation, either not being able to parse some fields/special indicators, or emitting incorrect ones — everything you’d expect from a 100-year-old format which grew organically as new needs and ideas arose.

This is a pervasive theme in airline IT, with multiple epochs of technology being used side by side as companies migrate very slowly.

The airlines will swear that they received the name as A and the agent will swear that the name was sent as Amr.

They are both right, quite likely, and the issue lies between the Travel Agency and the GDS.

GDSs — such as Amadeus and Sabre — generally offer multiple interfaces into their systems, from old ones kept for compatibility reasons to more modern ones. More modern interfaces will accept structured messages which leave no room for ambiguity; the old ones, however, are full of quirks.

In general, Travel Agencies are loath to modernize their IT: it requires re-training the agents and buying new software, which costs quite a bit of money with little to no benefit to them.

In the case of a Travel Agency connected to Amadeus, for example, this means that they are likely using ATE: the Amadeus Terminal Emulator, which as the name implies emulates the terminals of old.

Check the Quick Reference Guide, p. 33, on how to create a PNR:

NM1ELADAWY/AMR MR

* 1: 1 passenger with the following surname.

Using a space, the parsing is unambiguous; however, not all agents put a space, thus if instead the agent types:

NM1ELADAWY/AMR

then the command will be parsed as (NM, 1, ELADAWY, A, MR) to be “helpful”.

As mentioned, internally a GDS will use structured records. If you solve the data-entry issue, and your first name is properly recorded into the system, then you should not have to worry about further issues.

You’ll need to double-check the agents’ work. Just because they type AMR does not mean that the system will interpret it as AMR, as we’ve seen.

They can fix the issue by explicitly specifying the title: NM1ELADAWY/AMR MR.

Agents may not be entering your name in the system immediately, for a variety of reasons. If they do not, you cannot double-check that they did it properly. You may have to insist that they do it immediately.

OTAs generally have more modern, automated systems. As such, they are more likely to rely on the more modern interfaces of a GDS.

Using an OTA may be a simpler way for you to ensure your name is properly entered in the system.


Read the original on travel.stackexchange.com »

6 682 shares, 24 trendiness, 0 words and 0 minutes reading time

429 Too Many Requests


Read the original on stripe.com »

7 616 shares, 43 trendiness, 1308 words and 12 minutes reading time

Makers, Don't Let Yourself Be Forced Into the 'Manager Schedule'

In Masters of Doom, a book about the game development company id Software and its influence on popular culture, David Kushner reflected on the unconventional working style of the company’s ace coder, John Carmack.

To increase his productivity and find a break from distraction while working on his breakthrough Quake engine, Carmack adopted an aggressive tactic — he began gradually shifting the start of his workday.

Eventually, he was starting his programming in the evening and finishing just before dawn.

These uninterrupted stretches of silence, isolation, and deep work allowed Carmack to reinvent gaming with the world’s first lightning-fast 3D game engine.

While Carmack’s schedule may have made it harder for the rest of his team to reach him at times, the value he produced while working at his full cognitive capacity far outweighed that inconvenience.

The Carmacks of the world — those whose work involves writing code, getting creative, and problem-solving — operate on what tech investor Paul Graham refers to as the maker schedule. In his 2009 essay titled “Maker’s Schedule, Manager’s Schedule”, he argued that people who make things operate on a different schedule from those who manage things.

“Managers’ days are cut into one-hour intervals. You can block off several hours for a single task if you need to, but by default, you change what you’re doing every hour.” Makers, on the other hand, “generally prefer to use time in units of half a day at least. You can’t write or program well in units of an hour. That’s barely enough time to get started.”

For managers, interruptions in the form of meetings, phone calls, and Slack notifications are normal. For someone on the maker schedule, however, even the slightest distraction can have a disruptive effect.

Research shows that it can take as long as 30 minutes for makers to get into the flow, and we can’t simply switch from one task to another. Instead, switching changes the whole mode in which we work, and this constant context switching prevents our brains from fully engaging with the task at hand. A study conducted by Gloria Mark, a Professor of Informatics at the University of California, revealed that it takes us an average of 23 minutes and 15 seconds to refocus on a task after an interruption, and even when we do, we experience a decrease in productivity.

A single standup meeting can, therefore, blow a whole afternoon by breaking it into two pieces, each too small to do anything substantial in. And if you know your work is going to be interrupted, why bother starting anything ambitious?

Working in an open office renders us even more vulnerable.

Separately, managers and makers work fine. Friction happens when they meet. And since most powerful people operate on the manager schedule, they’re in a position to force everyone to adapt to their schedule, potentially wrecking the makers’ productivity.

And the predictable result is that almost no organizations today support maker schedules.

The reasons why most managers fail to accommodate the makers and their schedule are quite straightforward.

Instant messaging tools like Slack transformed the way we communicate at work, empowering managers to collaborate with makers at their convenience. The work style these tools enable fits the managers’ schedule so neatly that they often don’t see the costs to the maker. Immediate response becomes the implicit expectation, with barely any barriers or restrictions in place.

And in the absence of barriers, convenience always wins.

The reason why many managers fail to see and address this problem is that they are used to looking at communication and assuming it’s a good thing, because they see activity. People are attending meetings, talking to each other, the online presence indicators are bright green. Clearly, a lot of work is happening!

At the same time, real work is not getting done. Meaningful work is usually done quietly and in solitude.

Most makers don’t have the levels of control and autonomy necessary to block out half a day without any calls or meetings. So instead of pushing the issue with management, we try to compensate by attempting to multitask — unfortunately, that rarely works. Building context can take hours, and context switching between communication and creative work only kills the quality of both.

Being busy feels like work to us, but it’s not the work that needs to be done.

In many companies, the choice that makers face is between caving to the managers and sacrificing their deep work time and productivity — or offending people.

But there are smarter com­pro­mises.

The first technique that Paul Graham recommends to simulate the manager’s schedule within the maker’s is “office hours”.

Office hours are chunks of time that mak­ers set aside for meet­ings, while the rest of the time they are free to go into a Do Not Disturb mode. Managers get their (brief) face time with the mak­ers on their team, while mak­ers get long stretches of time to get stuff done.

During his time as a tech­ni­cal lead at Buffer, Harrison Harnisch de­cided to ap­ply this con­cept to his sched­ule, split­ting his week up, and set­ting clear ex­pec­ta­tions about how a day should be treated. On Mondays and Fridays, he fo­cused solely on col­lab­o­rat­ing with his team, while re­serv­ing the rest of the week for heads-down cod­ing.

We have adopted a similar schedule at Nuclino, reserving several days per week for our “maker time” while working from home. It doesn’t mean that we ignore all messages and only look up from our work when something is on fire — but the general expectation is that it’s okay to not be immediately available to your teammates when you are focusing on your work.

“It is important to note that deep work time can be interrupted by things that are both urgent and important. However, treating every question as urgent is likely to do more harm than good.”

It’s a nat­ural knee-jerk re­ac­tion for many man­agers to sched­ule a meet­ing when­ever a de­ci­sion needs to be made. Most of the time, such meet­ings quickly morph into ad hoc group brain­storm­ing ses­sions that may feel pro­duc­tive be­cause of all the talk­ing, but at the end of the day yield no tan­gi­ble re­sults, dis­rupt­ing every­one’s work for no good rea­son.

On the days that are re­served for col­lab­o­ra­tion, it does not al­ways need to hap­pen syn­chro­nously. Nor does it have to be face-to-face for it to be mean­ing­ful and pro­duc­tive.

Instead, com­mu­ni­ca­tion can hap­pen at a qui­eter asyn­chro­nous fre­quency in the form of thought­ful, writ­ten dis­cus­sions rather than soul-suck­ing meet­ings or er­ratic one-line-at-a-time chat mes­sages.

People think it’s ef­fi­cient to dis­trib­ute in­for­ma­tion all at the same time to a bunch of peo­ple around a room. But it’s ac­tu­ally a lot less ef­fi­cient than dis­trib­ut­ing it asyn­chro­nously by writ­ing it up and send­ing it out and let­ting peo­ple ab­sorb it when they’re ready to so it does­n’t break their days into smaller bits.”

In our ex­pe­ri­ence, the best way to pre­vent a use­less meet­ing is to write up our goals and thoughts first. Despite work­ing in the same of­fice, our team at Nuclino has con­verted nearly all of our meet­ings into asyn­chro­nously writ­ten re­ports.

Not only does that pre­serve a de­tailed log — every meet­ing and pro­ject we’ve ever had is neatly doc­u­mented — it also helps every team mem­ber have a say, prop­erly ex­press their thoughts, and ab­sorb the in­put of oth­ers at a time and pace that is con­ve­nient for them.

A lot of the interruptions happen because people have repetitive questions and can’t find the answers on their own. If the issue is a blocker, having to wait until the “office hours” start can be frustrating.

The most straight­for­ward way to ad­dress this is to build a team knowl­edge base. Not only does that min­i­mize the num­ber of repet­i­tive ques­tions bounced around the of­fice, it al­lows new team mem­bers to ba­si­cally on­board them­selves.

But at the end of the day, it’s a mat­ter of cul­ture. None of these rules would work if the man­age­ment fails to see that mak­ers need to fol­low a dif­fer­ent sched­ule — and to make an ef­fort to re­spect it.

The truth is, though there is a time and place for syn­chro­nous, in­stant, and face-to-face com­mu­ni­ca­tion, that time is not all the time. In fact, very few things are ur­gent enough to jus­tify the po­ten­tial cost of an in­ter­rup­tion. Most are triv­ial. And while the of­fices we work in and the col­lab­o­ra­tion tools we use may nudge us to adopt the ASAP cul­ture, be­ing al­ways avail­able and keep­ing busy are not sus­tain­able sub­sti­tutes for chal­leng­ing, thought­ful work.

Keep calm and fol­low the no­hello rule.


Read the original on blog.nuclino.com »

8 613 shares, 21 trendiness, 2420 words and 26 minutes reading time

Facebook Libra is Architecturally Unsound

Coming out of blogging retirement with a post that diverges from my usual nerdy pursuits about Haskelling and maths. I’ve spent the last few years working on financial technology in the EU, and I felt it was appropriate to write on what I see as an undercovered topic in tech journalism.

In the last few months Facebook has released what they claim is a new financial services platform called Libra. Libra aims to be a digital settlement system based on a basket of international currencies that are managed on a “blockchain” and held in a cash pool governed in Switzerland. The stated goals of the project are lofty and have massive geopolitical consequences.

There are many sensible articles written in the Financial Times and New York Times about the unsound monetary and economic assumptions that underlie the proposed financial structure, but there aren’t enough technologists who have given their analysis from a technical perspective. Not many people who work on financial infrastructure speak publicly about their work, and so this project has gotten quite a bit of a pass in tech journalism. Financial journalism really needs to do its due diligence on this project, as the internals are exposed for the world to see. For reference, I am referring to the code open-sourced at this Github repository under the Calibra Organisation.

What is laid bare for the world to see is an ar­chi­tec­turally schiz­o­phrenic code ar­ti­fact claim­ing to be a new re­li­able plat­form for global pay­ment in­fra­struc­ture. Yet the ac­tual im­ple­men­ta­tion di­verges from this goal in bizarre ways when one ac­tu­ally dives into the code­base. I’m sure there is an in­ter­est­ing story about the in­ter­nal cor­po­rate pol­i­tics of this pro­ject and as such I thought it apt to do some dili­gence on what I see as a truly strange set of ar­chi­tec­tural choices that break the en­tire sys­tem and put con­sumers at risk.

I won’t pre­tend to have an ob­jec­tive opin­ion about Facebook as a com­pany. Few peo­ple in tech view the com­pany in a pos­i­tive light any­more. Reading through the pub­li­ca­tions re­leased, it is clear there is a fun­da­men­tal de­cep­tion in the stated goal and im­ple­men­ta­tion of the pro­ject. Put con­cisely, this pro­ject will not em­power any­one. It is a pivot from a com­pany whose ad­ver­tis­ing busi­ness is so em­broiled in scan­dal and cor­rup­tion that it has no choice but to try to di­ver­sify into pay­ments and credit scor­ing to sur­vive. The clear long term goal is to act as a data bro­ker and me­di­ate con­sumers ac­cess to credit based on their pri­vate so­cial me­dia data. This is such an ut­terly ter­ri­fy­ing and dystopian story that should cause more alarm than it does.

The only sav­ing grace of this story is the ar­ti­fact they open sourced is so hi­lar­i­ously un­suited for the task they set out to do it can only be re­garded as an act of hubris. There are sev­eral core ar­chi­tec­tural er­rors in this pro­ject:

Byzantine fault tol­er­ance is a fairly niche area of dis­trib­uted sys­tems re­search that con­cerns the abil­ity of a net­worked sys­tem to en­dure ar­bi­trary fail­ures of its com­po­nents while tak­ing cor­rec­tive ac­tions crit­i­cal to the sys­tem’s op­er­a­tion. Networks that are byzan­tine tol­er­ant must re­sist sev­eral types of at­tacks in­clud­ing restarts, crashes, ma­li­cious pay­loads, and ma­li­cious vot­ing in leader elec­tions. This de­sign de­ci­sion is cen­tral to Libra and it makes zero sense.

The time complexity overhead from this additional structure varies by algorithm. There is a wide body of literature on Paxos and Raft derivative algorithms that have been enriched with Byzantine-tolerant features, but all of these structures come with an additional communication overhead on top of the \(O(n^2)\) communication cost to maintain the quorum. The algorithm chosen by Libra still has a worst-case \(O(n^2)\) communication cost in the case of leadership failure, and it incurs additional overhead from potential leadership re-elections on several types of network failure events.
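To make that overhead concrete, here is a rough back-of-envelope sketch (not Libra’s actual protocol; the validator counts are illustrative) of how worst-case all-to-all message counts grow quadratically with quorum size:

```rust
// Worst-case all-to-all communication in a BFT quorum: during events like a
// leader re-election, each of the n validators may message every other one.
fn worst_case_messages(n: u64) -> u64 {
    n * (n - 1) // n senders, n - 1 recipients each => O(n^2)
}

fn main() {
    for n in [4, 28, 100] {
        println!("{n:>3} validators -> {:>5} messages per round", worst_case_messages(n));
    }
}
```

A classical crash-fault-tolerant ordering service avoids most of this cost in the steady state, which is the point of the critique: the quadratic worst case buys protection against a threat model that does not apply here.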

For a system that is designed to be run in a consortium of highly regulated multinational corporates, all running Facebook-signed code and access-controlled by Facebook, it simply makes no sense to deal with malicious actors at the consensus level. Why is this system designed to be byzantine tolerant at all, rather than just maintaining a consistent audit log for compliance checks? The possibility that a Libra node run by Mastercard or Andreessen Horowitz would suddenly start running malicious code is such a bizarre scenario to plan for, and is better solved by simply enforcing protocol integrity and through non-technical (i.e. legal) means.

In congressional testimony, the product was presented as a challenger to emerging international payment protocols such as WeChat, Alipay, and M-Pesa. Yet none of these systems are designed to run on byzantine tolerant pools of validators. They are simply designed as the traditional high-throughput bus that orders ledger transactions according to a fixed set of rules. This is the natural approach to designing a payment system. Preventing double-spends and forks is simply not an issue that a properly designed payment rail should ever have to deal with, by design.

The overhead from the consensus algorithm serves no purpose and will only limit the throughput of the whole system; it appears to be there for no reason other than cargo-culting public blockchain technology, which is not designed for this use case.

By the admission of the whitepaper, the system is designed to be pseudonymous, meaning the addresses used at the protocol level are derived from elliptic curve public keys and contain no metadata about the accounts. Yet nowhere in the governance structure description for the organisation, or in the protocol itself, does it indicate how the economic data involved in transactions would be obscured from the validators. The system is, in effect, a very large mechanism for replicating transactions to a number of external parties who, under existing European and US bank secrecy laws, should not be privy to the economic details.

Data poli­cies are dif­fi­cult to co­or­di­nate across bor­ders, es­pe­cially with dis­parate laws and reg­u­la­tions across ju­ris­dic­tions hav­ing dif­fer­ing cul­tural views on data pro­tec­tion and pri­vacy. The pro­to­col it­self is com­pletely open to con­sor­tia mem­bers by de­fault which is a clear tech­ni­cal de­sign de­fi­ciency un­suited to meet the re­quire­ments it was de­signed for.

In the United Kingdom, clearing systems like BACS are capable of clearing something on the order of 580,000,000 transactions a month, while highly tuned systems like Visa are capable of achieving 150,000,000 transactions a day. The performance of these systems is a function of the size of transactions, network routing, system load, and AML (anti-money laundering) checks.
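For a sense of scale, here is a rough, hypothetical conversion of those figures to average sustained rates (it assumes uniform load and ignores the much higher peaks real systems must handle):

```rust
// Average sustained throughput implied by the figures above.
fn main() {
    let bacs_per_month: f64 = 580_000_000.0;
    let visa_per_day: f64 = 150_000_000.0;

    let secs_per_day = 24.0 * 3600.0;
    let bacs_tps = bacs_per_month / (30.0 * secs_per_day); // ~224 tx/s average
    let visa_tps = visa_per_day / secs_per_day; // ~1,736 tx/s average

    println!("BACS: ~{bacs_tps:.0} tx/s average");
    println!("Visa: ~{visa_tps:.0} tx/s average");
}
```

Even these averages are far beyond what published BFT consensus benchmarks sustain under leadership churn, which is the gap the article goes on to describe.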

For domestic transfers, the efficiency problems that Libra tries to solve aren’t really problems in nation states which have modernised their clearing infrastructure in the last decade. For retail consumers in the European Union, moving money is simply a non-issue. It can be done with a standard smartphone in seconds using traditional infrastructure. For large corporate treasury departments there are different mechanisms and regulations involved for moving large quantities of money.

There is no tech­ni­cal rea­son that cross bor­der pay­ments could also not set­tle in­stantly, ex­cept for the dif­fer­ences in rules and re­quire­ments across the ju­ris­dic­tions in­volved. If the re­quired pre­ven­tive mea­sures (customer due dili­gence, sanc­tions screen­ing, etc) are com­pleted mul­ti­ple times at dif­fer­ent steps in the trans­ac­tion chain this can de­lay the trans­ac­tion. Nevertheless, this de­lay is purely a func­tion of the gov­ern­ing law and com­pli­ance rather than the tech­nol­ogy.

For con­sumers there is no rea­son why a trans­ac­tion in the United Kingdom won’t clear in sec­onds. For re­tail trans­ac­tions in the EU, they are re­ally only rate-lim­ited by KYC (know your cus­tomer) and AML con­straints im­posed by gov­ern­ments and reg­u­la­tors which would equally ap­ply to Libra pay­ments. Even if Facebook were to over­come the hur­dles on in­ter­na­tional money and pri­vate data move­ment, the model as pro­posed is hun­dreds of per­son-years away from be­ing able to han­dle global trans­ac­tion through­put and would likely have to be com­pletely re­designed from first prin­ci­ples.

The whitepa­per makes a bold set of claims about a new untested lan­guage called Move which are quite du­bi­ous from a pro­gram­ming lan­guage the­ory (PLT) per­spec­tive.

“Move” is a new programming language for implementing custom transaction logic and “smart contracts” on the Libra Blockchain. Because of Libra’s goal to one day serve billions of people, Move is designed with safety and security as the highest priorities.

The key fea­ture of Move is the abil­ity to de­fine cus­tom re­source types with se­man­tics in­spired by lin­ear logic

In the public blockchains, smart contracts refer to logic deployed on public networks which allows escrowing, laundering money, and the issuance of extralegal securities and gambling products. These are typically written in a shockingly badly designed language called Solidity, which, from an academic PL perspective, makes PHP look like a work of genius. Oddly, the new language designed by Facebook seems to have no use case in common with these technologies, as it is effectively a scripting language designed for unclear corporate use cases.

In pri­vate dis­trib­uted ledgers, smart con­tracts are one of those terms that are thrown around by con­sul­tants with­out much re­gard for clear de­f­i­n­i­tions or pur­pose. Enterprise soft­ware con­sul­tants gen­er­ally thrive on am­bi­gu­ity and smart con­tracts are the apoth­e­o­sis of en­ter­prise ob­scu­ran­tism be­cause they can be de­fined to mean lit­er­ally any­thing.

On the nature of its alleged safety we have to look at the language’s semantics. Soundness in PLT generally consists of two different proofs, called “progress” and “preservation”, which determine the consistency of the whole space of evaluation rules for the language. More concretely, in type theory a function is “linear” if it consumes its argument exactly once and “affine” if it consumes it at most once. A linear type system gives a static guarantee that a claimed linear function really is linear by prescribing types to all of a function’s subexpressions and tracking call sites. This is a subtle property to prove and requires quite a bit of machinery to do for whole-program analysis. Linear typing is still a very academic research area that has inspired uniqueness typing in Clean and ownership typing in Rust. There are some tentative proposals for adding linear types to the Glasgow Haskell Compiler.
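Rust’s ownership typing, mentioned above, gives a feel for the affine case: a moved value can be used at most once, which is exactly the property you would want for a value representing money. (The `Payment` and `settle` names here are hypothetical, for illustration only; this is not Move or Libra code.)

```rust
// An affine resource: once `settle` takes ownership of a Payment, the caller
// cannot use it again -- the compiler statically rejects a double spend.
struct Payment {
    amount: u64,
}

fn settle(p: Payment) -> u64 {
    p.amount // consumes the Payment here
}

fn main() {
    let p = Payment { amount: 100 };
    let settled = settle(p);
    // settle(p); // error[E0382]: use of moved value: `p`
    println!("settled {settled}");
    // Note: Rust is affine, not linear -- a Payment may also be silently
    // dropped without ever being settled, which a linear discipline forbids.
}
```

Delivering the linear guarantee the whitepaper claims requires a typechecker that enforces “exactly once” consumption, which is precisely what the next paragraph argues is missing from the implementation.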

The claim that the Move language uses linear types appears to be unsubstantiated: a dive into the compiler reveals no such typechecker logic. As far as one can tell, the whitepaper cites the canonical literature from Girard and Pierce and does nothing of the sort in the actual implementation.

On top of this, the for­mal se­man­tics of the sup­pos­edly safe lan­guage are nowhere to be found in ei­ther the im­ple­men­ta­tion or the pa­per. The lan­guage is small enough that a full cor­rect­ness proof of the se­man­tics in Coq or Isabelle is tractable. Indeed, an end to end com­piler which did a full proof-car­ry­ing trans­for­ma­tion down to byte­code is quite vi­able us­ing mod­ern tool­chains in­vented in the last decade. We’ve known how to do this since the work of George Necula and Peter Lee all the way back in 1996.

From a pro­gram­ming lan­guage the­ory per­spec­tive the claim that Move is sound and se­cure is im­pos­si­ble to an­swer as the claims seem to re­duce to noth­ing more than hand­wav­ing and mar­ket­ing rather than ac­tual proof. This is an alarm­ing po­si­tion for a lan­guage en­gi­neer­ing pro­ject which ex­pects the pub­lic to trust it to han­dle bil­lions of dol­lars.

Building sound cryptosystems is a very difficult engineering problem, and a healthy dose of paranoia around cryptography is always the right attitude when dealing with dangerous code. There are major leaps forward in this space, like the Microsoft Everest project building a verifiably secure TLS stack. The tools to build verifiable primitives exist today and, while expensive to use, are certainly not outside the economic capacity of Facebook. Yet the team has chosen not to do so on a project stated to be reliable for global finance.

The Libra project depends on several fairly new “wild west” libraries for building experimental cryptosystems that have only emerged in the last few years. It is impossible to say whether the dependencies on the following tools are secure or not, as none of these libraries have had security audits, nor do they have standard-practice disclosure policies. Several core libraries in particular are indeterminate about their vulnerabilities to side-channel and timing attacks.

The li­brary gets even more ex­per­i­men­tal and ven­tures quite out­side the Cryptography Standard Model by fold­ing in very novel tech­niques like ver­i­fi­able ran­dom func­tions, bi­lin­ear pair­ings and thresh­old sig­na­tures. These tech­niques and li­braries might be sound, but the amal­ga­ma­tion of all of them into one sys­tem should raise some se­ri­ous con­cerns about at­tack sur­face area. The com­bi­na­tion of all of these new tools and tech­niques makes the bur­den of proof much higher.

It should be assumed that this entire crypto stack is vulnerable to a variety of attacks until proven otherwise. The “move fast and break things” model should not apply to cryptographic tools handling consumer financial data.

A defining feature of a payment rail is the ability to reverse transactions in case payments need to be undone by legal action or as the result of accident or system malfunction. The Libra system is designed to have “total finality” and does not include a transaction type to reverse a payment. In the United Kingdom, all payments between £100 and £30,000 are governed by the Consumer Credit Act. This means the payment provider has equal responsibility and liability with the seller if there’s a problem with the items bought or the payee fails to render services. There are similar regulations across the EU, Asia, and North America.

The current Libra design includes no protocol to comply with consumer protection laws and no clear plan to build one. Even worse, from a data architecture perspective, the finality of the core authenticated data structure, based on a Merkle accumulator state, admits no mechanism to build this into the core ledger without a redesign of the core.
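As a toy illustration of why finality is structural rather than a policy choice (this is a drastic simplification, not Libra’s actual Merkle accumulator), consider an append-only commitment chain: each new root commits to the previous root plus the new transaction, so history can never be rewritten, and the only remedy is appending a compensating transaction:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Append-only accumulator sketch: the old root is folded into every new one,
// so there is no operation that removes or rewrites a past entry.
struct Accumulator {
    root: u64,
}

impl Accumulator {
    fn new() -> Self {
        Accumulator { root: 0 }
    }

    fn append(&mut self, tx: &str) {
        let mut h = DefaultHasher::new();
        (self.root, tx).hash(&mut h);
        self.root = h.finish(); // no "undo" exists on this structure
    }
}

fn main() {
    let mut acc = Accumulator::new();
    acc.append("alice pays bob 10");
    let after_first = acc.root;
    acc.append("bob pays alice 10"); // a "reversal" is just another append
    assert_ne!(acc.root, after_first);
    println!("root: {:x}", acc.root);
}
```

Supporting legally mandated reversals on such a structure means either adding compensating-transaction semantics at the protocol level or redesigning the ledger, which is the article’s point.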

The final conclusion one must take away after doing technical due diligence on this project is simply that it would not pass muster in any respected journal on distributed systems research or financial engineering. Before trying to disrupt global monetary policy, there is a massive amount of technical work needed to build a reliable network the public and regulators could trust to securely handle user data.

I see no reason to believe that Facebook has done the technical work needed to overcome these technical issues in its project, nor does it have any technical advantage over existing infrastructure that already works. Claiming one’s company needs regulatory flexibility to explore innovation is not an excuse for not doing the work in the first place.


Read the original on www.stephendiehl.com »

9 595 shares, 19 trendiness, 2098 words and 16 minutes reading time

Intel Performance Strategy Team Publishing Intentionally Misleading Benchmarks

Update 2019-11-06 3:50PM Pacific: Update to the Intel Xeon Platinum 9282 GROMACS Benchmarks Piece — Please Read.

Today some­thing hap­pened that many may not have seen. Intel pub­lished a set of bench­marks show­ing its ad­van­tage of a dual Intel Xeon Platinum 9282 sys­tem ver­sus the AMD EPYC 7742. Vendors pre­sent bench­marks to show that their prod­ucts are good from time-to-time. There is one dif­fer­ence in this case: we checked Intel’s work and found that they pre­sented a num­ber to in­ten­tion­ally mis­lead would-be buy­ers as to the com­pa­ny’s rel­a­tive per­for­mance ver­sus AMD.

For years, even through the 2017 introduction of the Skylake Xeon and Naples EPYC parts, on the server side the company had been relatively good about presenting a balanced view. In late 2018, the company brought on a new team, allegedly to look over the performance benchmarks the company produced. That has culminated in the “Performance at Intel” Medium blog. This is described as:

“Intel’s blog to share timely and candid information specific to the performance of Intel technologies and products.” In only its seventh post, it has betrayed that motto.

Here is the post in ques­tion. HPC Leadership Where it Matters — Real-World Performance

Just to be clear, I know and personally like the Intel performance labs folks as well as the folks on their new performance strategy team. This is just a gaffe that needed to be pointed out since, in theory, Intel now does more diligence than in 2017, when it was doing a good job.

First, here is a chart Intel pro­duced as part of the story to show that it has su­pe­rior per­for­mance to AMD, and we are go­ing to high­light one of the re­sults, the GROMACS re­sult:

The reason we highlighted this result is that it looked off to us. It seemed a bit strange that a 400W, 56-core part was only 20% faster here.

We followed the footnote on configuration details, which eventually led us to the referenced #31 corresponding to this result. Here is the configuration for the test:

Intel® Xeon® Platinum 9282 proces­sor: Intel® Compiler 2019u4, Intel® Math Kernel Library (Intel® MKL) 2019u4, Intel MPI 2019u4, AVX-512 build, BIOS: HT ON, Turbo OFF, SNC OFF, 2 threads per core;

AMD EPYC™ 7742: Intel® Compiler 2019u4, Intel® MKL 2019u4, Intel MPI 2019u4, AVX2 build, BIOS: SMT ON, Boost ON, NPS 4, 1 threads per core. (Source: Intel)

We split the para­graph on the source page into three lines and will dis­cuss the first, fol­lowed by the last two.

The first line is damning. Intel used GROMACS 2019.3. To be fair, they used the same version on both systems, which makes it a valid test. GROMACS 2019.3 was released on June 14, 2019, just after the 2nd Gen Intel Xeon Scalable series. On October 2, 2019, the GROMACS team released GROMACS 2019.4. Keep in mind that this was over a month before Intel published its article.

In GROMACS 2019.4, there was a small but very important fix for the comparison Intel was trying to show, aptly called “Added AMD Zen 2 detection”, which says:

The AMD Zen 2 ar­chi­tec­ture is now de­tected as dif­fer­ent from Zen 1 and uses 256-bit wide AVX2 SIMD in­struc­tions (GMX_SIMD=AVX2_256) by de­fault. Also the non-bonded ker­nel pa­ra­me­ters have been tuned for Zen 2. This has a sig­nif­i­cant im­pact on per­for­mance. (Source: GROMACS Manual)

In the industry, it is (or should have been) well known that older versions of GROMACS did not properly support the new “Rome” EPYC architecture. We cited this specifically in our launch piece since we found the issue. We even specifically call it out on every EPYC 7002 v. Xeon chart we have produced for GROMACS, since the results did not meet expectations:

That is just one of our test cases, one considered a “small” case, which is frankly too small for the size of these nodes. Still, from the data it was very easy to spot that something was awry.

What Intel per­haps did not know, is that we also had one of the lead de­vel­op­ers on GROMACS, a pop­u­lar HPC tool, on our dual AMD EPYC 7002 sys­tem to ad­dress some of the very ba­sic op­ti­miza­tions for the 2019.4 re­lease. I be­lieve there may be more com­ing, but this is one where we found the lack of op­ti­miza­tion, and ac­tu­ally helped ever so slightly in get­ting it fixed.

By using the version of GROMACS released after the 2nd-generation Intel Xeon Scalable launch but before the AMD EPYC 7002 optimizations (which had been out for over a month), Intel’s numbers are highly skewed, and even so the Platinum 9282 only has a 20% lead.

Again, tech­ni­cally this was a valid test by us­ing the same ver­sion. On the other hand, Intel specif­i­cally used a ver­sion that was prior to the pack­age get­ting any AMD Zen 2 op­ti­miza­tions.

Moving to the test con­fig­u­ra­tion lines for Intel and AMD, here are the lines in table form for eas­ier com­par­i­son:

Intel used its compiler, MKL, and MPI for this test. In the 2017 era, Intel tested Xeon and EPYC with a variety of compilers and picked the best one. We are going to give the lab team that runs the tests the benefit of the doubt here that Intel’s compiler and MKL/ MPI implementation yield the best results. Indeed, for Intel it is better that AMD does well than Arm, since a customer staying on x86 is a much easier TAM to fight back for in 2021.

The AVX sta­tus we ad­dressed in the sec­tion above. Using AVX2 on GROMACS 2019.3 would have dis­ad­van­taged the AMD EPYC parts.

On both CPUs, we see that there are two hardware threads per core, which means 56 cores/ 112 threads on the Platinum 9282 and 64 cores/ 128 threads on each of the two AMD EPYC 7742 CPUs.

Then things change. Turbo was en­abled on the EPYC 7742, but not on the Xeon Platinum 9282. In GROMACS, tran­si­tions in and out of AVX-512 code can lead to dif­fer­ences in boost clocks which can im­pact per­for­mance. We are just go­ing to point out the delta here.

SNC is off for Intel but NPS=4 is set for AMD. Sub-NUMA clus­ter­ing al­lows for each mem­ory con­troller to be split into two do­mains. On a stan­dard Xeon die, that means two NUMA nodes per CPU. Assuming it works the same on the dual-die Platinum 9282, it would be four NUMA nodes per pack­age.

The AMD EPYC NPS setting is very similar, as it allows one to go from one NUMA node per socket to two or four. Here, Intel is running four NUMA nodes per socket, or eight total, for the dual AMD EPYC 7742 system versus only two NUMA nodes per socket, or four total, on the Platinum 9282 system. SNC/ NPS usually increases memory bandwidth to cores by localizing memory access. What is slightly interesting here is how Intel characterizes GROMACS as being compute- versus memory-bound.

Finally, threads per core. On the Intel platform, it is 2. On AMD, it is 1. That means Intel is using 224 threads on the 112 cores of the Xeon Platinum 9282 system, and 128 threads on the 128 cores (256 hardware threads) of the dual EPYC 7742 system. Putting the translation of configuration words into a table, this is what Intel did with the test configurations:

What we do not know is whether Intel needed to do this due to problem sizes. GROMACS can error out if you have too many threads, which is why we have an STH Small Case that will not run on many 4P systems and is struggling, as shown above, even on the dual EPYC 7742 system. Running one GROMACS thread on a two-thread core does require very solid thread pinning, otherwise performance can degrade quickly with this configuration.

Even assuming that Intel’s software tooling is superior, Intel changed the boost setting, added more NUMA nodes for AMD, and used fewer threads per core on AMD than on Intel. Perhaps that is how Intel got the best results using an older version of GROMACS, but that is a fair number of changes.

One of the other, very in­ter­est­ing points here is that Intel tested on a Naples gen­er­a­tion test plat­form.

Here is the Intel con­fig­u­ra­tion:

Here is the AMD EPYC con­fig­u­ra­tion:

Intel is us­ing 16GB DIMMs ver­sus 32GB DIMMs for EPYC. They have dif­fer­ent num­bers of mem­ory chan­nels so we can let that pass. One item was very in­ter­est­ing, the test server. Intel used its S9200WKL which we cov­ered in In­tel Xeon Platinum 9200 Formerly Cascade Lake-AP Launched.

What is more in­ter­est­ing is the AMD EPYC 7742 con­fig­u­ra­tion. Here, Intel is us­ing the Supermicro AS-2023-TR4 that it shows is built upon the Supermicro HD11DSU-iN which is sim­i­lar to the moth­er­board we re­viewed in our Su­per­mi­cro AS-1123US-TR4 Server Review server.

Most likely, it has to be a Revision 2.0 moth­er­board to sup­port the EPYC 7002 gen­er­a­tion and DDR4-3200 speeds. Being a H11 plat­form, it will only sup­port PCIe Gen3, not Gen4. Again, this is an off-the-shelf con­fig­urable sys­tem that the sock­eted EPYC 7742 al­lows for ver­sus the Intel-only Platinum 9200 so­lu­tion. We ex­plained why that is an ex­tra­or­di­nar­ily im­por­tant nu­ance in Why the Intel Xeon Platinum 9200 Series Lacks Mainstream Support.

The base cTDP of the EPYC 7742 is 225W as Intel notes in its ar­ti­cle. Technically, the Supermicro server us­ing a Rev 2.0 board is ca­pa­ble of run­ning the AMD EPYC 7742 se­ries at a cTDP of 240W. If some­one was to com­pare, on a socket-to-socket ba­sis, an EPYC 7742 to a 400W Platinum 9282, one may ex­pect that push­ing the cTDP up to 240W would be a com­mon set­ting. Of note, in our test­ing, even with a cTDP of 240W the power con­sump­tion to TDP ra­tio for EPYC 7742 chips is much closer than the Platinum 8280’s AVX-512 power con­sump­tion to TDP ra­tio is. Extrapolating, there is a 100% chance that the EPYC 7742 is us­ing con­sid­er­ably less power here to the point that cTDP should have been set to 240W.

In the main ar­ti­cle’s text, Intel states 225W for the part and cTDP was not men­tioned in the #31 con­fig­u­ra­tion de­tails. We are also go­ing to note that there is a 280W AMD EPYC 7H12 part avail­able, but it is un­likely Intel had ac­cess to this for in­ter­nal lab test­ing at this point (STH has not been able to get a pair ei­ther.) That would have at least been a some­what bet­ter com­par­i­son.

Finally, Intel is us­ing a 3.10.0-1062.1.1 CentOS/ RHEL ker­nel. Newer Linux ker­nels tend to per­form bet­ter with the newer EPYC chips but it is valid that Intel is be­ing con­sis­tent even if it po­ten­tially dis­ad­van­tages AMD.

That was around 1800 words on a single benchmark that Intel presented. Should the text leave any doubt, personally I tend to give the benefit of the doubt to the folks in Intel’s performance labs, since they did a fairly good job in the 2017 era. However, those folks now have a performance strategy team sitting above them that is publishing articles like this, with misses that a reasonably prudent performance arbiter should catch.

One can only conclude that Intel's “Performance at Intel” blog is not a reputable attempt to present factual information. It is simply a way for Intel to publish misinformation to the market in the hope that people do not do the diligence to see what is backing the claims. Once one does the diligence, things fall apart quickly.

The fact that Intel documented their procedures means that they had valid tests. It is just that the tests presented, and validated by the Performance at Intel team, were clearly conducted in a way to misinform a potential customer about the current state of performance. This GROMACS example has been publicly known for almost four months and has not reflected the current state for over a month. AMD does the same things as part of marketing (e.g., AMD EPYC Rome NAMD and the Intel Xeon Response at Computex 2019), so it is, perhaps, par for the course. So perhaps the best course of action is to ignore these claims.

It was once told to me, “you only lose your reputation once in the valley.” With this, the “Performance at Intel” blog just had that moment. Perhaps in marketing, one gets multiple attempts.


Read the original on www.servethehome.com »

10 580 shares, 22 trendiness, 3876 words and 37 minutes reading time

Parse, don’t validate

Historically, I’ve struggled to find a concise, simple way to explain what it means to practice type-driven design. Too often, when someone asks me “How did you come up with this approach?” I find I can’t give them a satisfying answer. I know it didn’t just come to me in a vision—I have an iterative design process that doesn’t require plucking the “right” approach out of thin air—yet I haven’t been very successful in communicating that process to others.

However, about a month ago, I was re­flect­ing on Twitter about the dif­fer­ences I ex­pe­ri­enced pars­ing JSON in sta­t­i­cally- and dy­nam­i­cally-typed lan­guages, and fi­nally, I re­al­ized what I was look­ing for. Now I have a sin­gle, snappy slo­gan that en­cap­su­lates what type-dri­ven de­sign means to me, and bet­ter yet, it’s only three words long:

Alright, I’ll con­fess: un­less you al­ready know what type-dri­ven de­sign is, my catchy slo­gan prob­a­bly does­n’t mean all that much to you. Fortunately, that’s what the re­main­der of this blog post is for. I’m go­ing to ex­plain pre­cisely what I mean in gory de­tail—but first, we need to prac­tice a lit­tle wish­ful think­ing.

One of the wonderful things about static type systems is that they can make it possible, and sometimes even easy, to answer questions like “is it possible to write this function?” For an extreme example, consider the following Haskell type signature:
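A signature fitting the description below would look something like this (the concrete argument type is an arbitrary stand-in; any ordinary type works):

```haskell
-- Void has no values, so no total function can ever return one.
foo :: Integer -> Void
```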

Is it pos­si­ble to im­ple­ment foo? Trivially, the an­swer is no, as Void is a type that con­tains no val­ues, so it’s im­pos­si­ble for any func­tion to pro­duce a value of type Void. That ex­am­ple is pretty bor­ing, but the ques­tion gets much more in­ter­est­ing if we choose a more re­al­is­tic ex­am­ple:
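Given the description that follows, the signature in question is the classic one for head:

```haskell
-- Promises the first element of any list, even an empty one.
head :: [a] -> a
```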

This func­tion re­turns the first el­e­ment from a list. Is it pos­si­ble to im­ple­ment? It cer­tainly does­n’t sound like it does any­thing very com­pli­cated, but if we at­tempt to im­ple­ment it, the com­piler won’t be sat­is­fied:
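A natural attempt, covering only the non-empty case, might read:

```haskell
head :: [a] -> a
head (x:_) = x
-- No equation for [], hence the non-exhaustive pattern warning below.
```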

warning: [-Wincomplete-patterns]
    Pattern match(es) are non-exhaustive
    In an equation for ‘head’: Patterns not matched: []

This mes­sage is help­fully point­ing out that our func­tion is par­tial, which is to say it is not de­fined for all pos­si­ble in­puts. Specifically, it is not de­fined when the in­put is [], the empty list. This makes sense, as it is­n’t pos­si­ble to re­turn the first el­e­ment of a list if the list is empty—there’s no el­e­ment to re­turn! So, re­mark­ably, we learn this func­tion is­n’t pos­si­ble to im­ple­ment, ei­ther.

To someone coming from a dynamically-typed background, this might seem perplexing. If we have a list, we might very well want to get the first element in it. And indeed, the operation of “getting the first element of a list” isn’t impossible in Haskell, it just requires a little extra ceremony. There are two different ways to fix the head function, and we’ll start with the simplest one.

As established, head is partial because there is no element to return if the list is empty: we’ve made a promise we cannot possibly fulfill. Fortunately, there’s an easy solution to that dilemma: we can weaken our promise. Since we cannot guarantee the caller an element of the list, we’ll have to practice a little expectation management: we’ll do our best to return an element if we can, but we reserve the right to return nothing at all. In Haskell, we express this possibility using the Maybe type:
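The weakened signature would read:

```haskell
-- Nothing signals that no element could be produced.
head :: [a] -> Maybe a
```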

This buys us the free­dom we need to im­ple­ment head—it al­lows us to re­turn Nothing when we dis­cover we can’t pro­duce a value of type a af­ter all:
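A sketch of the total implementation:

```haskell
head :: [a] -> Maybe a
head (x:_) = Just x  -- an element exists: return it
head []    = Nothing -- empty list: admit defeat
```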

Problem solved, right? For the mo­ment, yes… but this so­lu­tion has a hid­den cost.

Returning Maybe is undoubtedly convenient when we’re implementing head. However, it becomes significantly less convenient when we want to actually use it! Since head always has the potential to return Nothing, the burden falls upon its callers to handle that possibility, and sometimes that passing of the buck can be incredibly frustrating. To see why, consider the following code:

getConfigurationDirectories :: IO [FilePath]
getConfigurationDirectories = do
  configDirsString <- getEnv "CONFIG_DIRS"
  let configDirsList = split ',' configDirsString
  when (null configDirsList) $
    throwIO $ userError "CONFIG_DIRS cannot be empty"
  pure configDirsList

main :: IO ()
main = do
  configDirs <- getConfigurationDirectories
  case head configDirs of
    Just cacheDir -> initializeCache cacheDir
    Nothing -> error "should never happen; already checked configDirs is non-empty"

When get­Con­fig­u­ra­tionDi­rec­to­ries re­trieves a list of file paths from the en­vi­ron­ment, it proac­tively checks that the list is non-empty. However, when we use head in main to get the first el­e­ment of the list, the Maybe FilePath re­sult still re­quires us to han­dle a Nothing case that we know will never hap­pen! This is ter­ri­bly bad for sev­eral rea­sons:

First, it’s just an­noy­ing. We al­ready checked that the list is non-empty, why do we have to clut­ter our code with an­other re­dun­dant check?

Second, it has a po­ten­tial per­for­mance cost. Although the cost of the re­dun­dant check is triv­ial in this par­tic­u­lar ex­am­ple, one could imag­ine a more com­plex sce­nario where the re­dun­dant checks could add up, such as if they were hap­pen­ing in a tight loop.

Finally, and worst of all, this code is a bug waiting to happen! What if getConfigurationDirectories were modified to stop checking that the list is empty, intentionally or unintentionally? The programmer might not remember to update main, and suddenly the “impossible” error becomes not only possible, but probable.

The need for this re­dun­dant check has es­sen­tially forced us to punch a hole in our type sys­tem. If we could sta­t­i­cally prove the Nothing case im­pos­si­ble, then a mod­i­fi­ca­tion to get­Con­fig­u­ra­tionDi­rec­to­ries that stopped check­ing if the list was empty would in­val­i­date the proof and trig­ger a com­pile-time fail­ure. However, as-writ­ten, we’re forced to rely on a test suite or man­ual in­spec­tion to catch the bug.

Clearly, our mod­i­fied ver­sion of head leaves some things to be de­sired. Somehow, we’d like it to be smarter: if we al­ready checked that the list was non-empty, head should un­con­di­tion­ally re­turn the first el­e­ment with­out forc­ing us to han­dle the case we know is im­pos­si­ble. How can we do that?

Let’s look at the orig­i­nal (partial) type sig­na­ture for head again:

The pre­vi­ous sec­tion il­lus­trated that we can turn that par­tial type sig­na­ture into a to­tal one by weak­en­ing the promise made in the re­turn type. However, since we don’t want to do that, there’s only one thing left that can be changed: the ar­gu­ment type (in this case, [a]). Instead of weak­en­ing the re­turn type, we can strengthen the ar­gu­ment type, elim­i­nat­ing the pos­si­bil­ity of head ever be­ing called on an empty list in the first place.

To do this, we need a type that represents non-empty lists. Fortunately, the existing NonEmpty type from Data.List.NonEmpty is exactly that. It has the following definition:
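NonEmpty is defined in terms of the :| constructor, pairing a mandatory head with a possibly-empty tail:

```haskell
-- The first element is stored apart from the (possibly empty) rest.
data NonEmpty a = a :| [a]
```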

Note that NonEmpty a is re­ally just a tu­ple of an a and an or­di­nary, pos­si­bly-empty [a]. This con­ve­niently mod­els a non-empty list by stor­ing the first el­e­ment of the list sep­a­rately from the list’s tail: even if the [a] com­po­nent is [], the a com­po­nent must al­ways be pre­sent. This makes head com­pletely triv­ial to im­ple­ment:
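The total version simply projects out the guaranteed first element:

```haskell
head :: NonEmpty a -> a
head (x :| _) = x
```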

Unlike be­fore, GHC ac­cepts this de­f­i­n­i­tion with­out com­plaint—this de­f­i­n­i­tion is to­tal, not par­tial. We can up­date our pro­gram to use the new im­ple­men­ta­tion:

getConfigurationDirectories :: IO (NonEmpty FilePath)
getConfigurationDirectories = do
  configDirsString <- getEnv "CONFIG_DIRS"
  let configDirsList = split ',' configDirsString
  case nonEmpty configDirsList of
    Just nonEmptyConfigDirsList -> pure nonEmptyConfigDirsList
    Nothing -> throwIO $ userError "CONFIG_DIRS cannot be empty"

main :: IO ()
main = do
  configDirs <- getConfigurationDirectories
  initializeCache (head configDirs)

Note that the re­dun­dant check in main is now com­pletely gone! Instead, we per­form the check ex­actly once, in get­Con­fig­u­ra­tionDi­rec­to­ries. It con­structs a NonEmpty a from a [a] us­ing the non­Empty func­tion from Data.List.NonEmpty, which has the fol­low­ing type:
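Its type:

```haskell
nonEmpty :: [a] -> Maybe (NonEmpty a)
```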

The Maybe is still there, but this time, we han­dle the Nothing case very early in our pro­gram: right in the same place we were al­ready do­ing the in­put val­i­da­tion. Once that check has passed, we now have a NonEmpty FilePath value, which pre­serves (in the type sys­tem!) the knowl­edge that the list re­ally is non-empty. Put an­other way, you can think of a value of type NonEmpty a as be­ing like a value of type [a], plus a proof that the list is non-empty.

By strength­en­ing the type of the ar­gu­ment to head in­stead of weak­en­ing the type of its re­sult, we’ve com­pletely elim­i­nated all the prob­lems from the pre­vi­ous sec­tion:

The code has no re­dun­dant checks, so there can’t be any per­for­mance over­head.

Furthermore, if get­Con­fig­u­ra­tionDi­rec­to­ries changes to stop check­ing that the list is non-empty, its re­turn type must change, too. Consequently, main will fail to type­check, alert­ing us to the prob­lem be­fore we even run the pro­gram!

What’s more, it’s triv­ial to re­cover the old be­hav­ior of head from the new one by com­pos­ing head with non­Empty:

Note that the in­verse is not true: there is no way to ob­tain the new ver­sion of head from the old one. All in all, the sec­ond ap­proach is su­pe­rior on all axes.

You may be won­der­ing what the above ex­am­ple has to do with the ti­tle of this blog post. After all, we only ex­am­ined two dif­fer­ent ways to val­i­date that a list was non-empty—no pars­ing in sight. That in­ter­pre­ta­tion is­n’t wrong, but I’d like to pro­pose an­other per­spec­tive: in my mind, the dif­fer­ence be­tween val­i­da­tion and pars­ing lies al­most en­tirely in how in­for­ma­tion is pre­served. Consider the fol­low­ing pair of func­tions:

validateNonEmpty :: [a] -> IO ()
validateNonEmpty (_:_) = pure ()
validateNonEmpty []    = throwIO $ userError "list cannot be empty"

parseNonEmpty :: [a] -> IO (NonEmpty a)
parseNonEmpty (x:xs) = pure (x:|xs)
parseNonEmpty []     = throwIO $ userError "list cannot be empty"

These two func­tions are nearly iden­ti­cal: they check if the pro­vided list is empty, and if it is, they abort the pro­gram with an er­ror mes­sage. The dif­fer­ence lies en­tirely in the re­turn type: val­i­dateNonEmpty al­ways re­turns (), the type that con­tains no in­for­ma­tion, but parseNonEmpty re­turns NonEmpty a, a re­fine­ment of the in­put type that pre­serves the knowl­edge gained in the type sys­tem. Both of these func­tions check the same thing, but parseNonEmpty gives the caller ac­cess to the in­for­ma­tion it learned, while val­i­dateNonEmpty just throws it away.

These two functions elegantly illustrate two different perspectives on the role of a static type system: validateNonEmpty obeys the typechecker well enough, but only parseNonEmpty takes full advantage of it. If you see why parseNonEmpty is preferable, you understand what I mean by the mantra “parse, don’t validate.” Still, perhaps you are skeptical of parseNonEmpty’s name. Is it really parsing anything, or is it merely validating its input and returning a result? While the precise definition of what it means to parse or validate something is debatable, I believe parseNonEmpty is a bona fide parser (albeit a particularly simple one).

Consider: what is a parser? Really, a parser is just a func­tion that con­sumes less-struc­tured in­put and pro­duces more-struc­tured out­put. By its very na­ture, a parser is a par­tial func­tion—some val­ues in the do­main do not cor­re­spond to any value in the range—so all parsers must have some no­tion of fail­ure. Often, the in­put to a parser is text, but this is by no means a re­quire­ment, and parseNonEmpty is a per­fectly cro­mu­lent parser: it parses lists into non-empty lists, sig­nal­ing fail­ure by ter­mi­nat­ing the pro­gram with an er­ror mes­sage.

Under this flex­i­ble de­f­i­n­i­tion, parsers are an in­cred­i­bly pow­er­ful tool: they al­low dis­charg­ing checks on in­put up-front, right on the bound­ary be­tween a pro­gram and the out­side world, and once those checks have been per­formed, they never need to be checked again! Haskellers are well-aware of this power, and they use many dif­fer­ent types of parsers on a reg­u­lar ba­sis:

The ae­son li­brary pro­vides a Parser type that can be used to parse JSON data into do­main types.

Likewise, opt­parse-ap­plica­tive pro­vides a set of parser com­bi­na­tors for pars­ing com­mand-line ar­gu­ments.

Database li­braries like per­sis­tent and post­gresql-sim­ple have a mech­a­nism for pars­ing val­ues held in an ex­ter­nal data store.

The ser­vant ecosys­tem is built around pars­ing Haskell datatypes from path com­po­nents, query pa­ra­me­ters, HTTP head­ers, and more.

The com­mon theme be­tween all these li­braries is that they sit on the bound­ary be­tween your Haskell ap­pli­ca­tion and the ex­ter­nal world. That world does­n’t speak in prod­uct and sum types, but in streams of bytes, so there’s no get­ting around a need to do some pars­ing. Doing that pars­ing up front, be­fore act­ing on the data, can go a long way to­ward avoid­ing many classes of bugs, some of which might even be se­cu­rity vul­ner­a­bil­i­ties.

One draw­back to this ap­proach of pars­ing every­thing up front is that it some­times re­quires val­ues be parsed long be­fore they are ac­tu­ally used. In a dy­nam­i­cally-typed lan­guage, this can make keep­ing the pars­ing and pro­cess­ing logic in sync a lit­tle tricky with­out ex­ten­sive test cov­er­age, much of which can be la­bo­ri­ous to main­tain. However, with a sta­tic type sys­tem, the prob­lem be­comes mar­velously sim­ple, as demon­strated by the NonEmpty ex­am­ple above: if the pars­ing and pro­cess­ing logic go out of sync, the pro­gram will fail to even com­pile.

Hopefully, by this point, you are at least some­what sold on the idea that pars­ing is prefer­able to val­i­da­tion, but you may have lin­ger­ing doubts. Is val­i­da­tion re­ally so bad if the type sys­tem is go­ing to force you to do the nec­es­sary checks even­tu­ally any­way? Maybe the er­ror re­port­ing will be a lit­tle bit worse, but a bit of re­dun­dant check­ing can’t hurt, right?

Unfortunately, it is­n’t so sim­ple. Ad-hoc val­i­da­tion leads to a phe­nom­e­non that the lan­guage-the­o­retic se­cu­rity field calls shot­gun pars­ing. In the 2016 pa­per, The Seven Turrets of Babel: A Taxonomy of LangSec Errors and How to Expunge Them, its au­thors pro­vide the fol­low­ing de­f­i­n­i­tion:

Shotgun parsing is a programming antipattern whereby parsing and input-validating code is mixed with and spread across processing code—throwing a cloud of checks at the input, and hoping, without any systematic justification, that one or another would catch all the “bad” cases.

They go on to ex­plain the prob­lems in­her­ent to such val­i­da­tion tech­niques:

Shotgun pars­ing nec­es­sar­ily de­prives the pro­gram of the abil­ity to re­ject in­valid in­put in­stead of pro­cess­ing it. Late-discovered er­rors in an in­put stream will re­sult in some por­tion of in­valid in­put hav­ing been processed, with the con­se­quence that pro­gram state is dif­fi­cult to ac­cu­rately pre­dict.

In other words, a program that does not parse all of its input up front runs the risk of acting upon a valid portion of the input, discovering a different portion is invalid, and suddenly needing to roll back whatever modifications it already executed in order to maintain consistency. Sometimes this is possible—such as rolling back a transaction in an RDBMS—but in general it may not be.

It may not be immediately apparent what shotgun parsing has to do with validation—after all, if you do all your validation up front, you mitigate the risk of shotgun parsing. The problem is that validation-based approaches make it extremely difficult or impossible to determine if everything was actually validated up front or if some of those so-called “impossible” cases might actually happen. The entire program must assume that raising an exception anywhere is not only possible, it’s regularly necessary.

Parsing avoids this prob­lem by strat­i­fy­ing the pro­gram into two phases—pars­ing and ex­e­cu­tion—where fail­ure due to in­valid in­put can only hap­pen in the first phase. The set of re­main­ing fail­ure modes dur­ing ex­e­cu­tion is min­i­mal by com­par­i­son, and they can be han­dled with the ten­der care they re­quire.

So far, this blog post has been something of a sales pitch. “You, dear reader, ought to be parsing!” it says, and if I’ve done my job properly, at least some of you are sold. However, even if you understand the “what” and the “why,” you might not feel especially confident about the “how.”

My ad­vice: fo­cus on the datatypes.

Suppose you are writ­ing a func­tion that ac­cepts a list of tu­ples rep­re­sent­ing key-value pairs, and you sud­denly re­al­ize you aren’t sure what to do if the list has du­pli­cate keys. One so­lu­tion would be to write a func­tion that as­serts there aren’t any du­pli­cates in the list:
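Such an assertion might have a shape like the following (the AppError type and the use of MonadError here are assumptions for illustration):

```haskell
-- Fails if any key appears more than once; returns nothing useful otherwise.
checkNoDuplicateKeys :: (MonadError AppError m, Eq k) => [(k, v)] -> m ()
```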

However, this check is frag­ile: it’s ex­tremely easy to for­get. Because its re­turn value is un­used, it can al­ways be omit­ted, and the code that needs it would still type­check. A bet­ter so­lu­tion is to choose a data struc­ture that dis­al­lows du­pli­cate keys by con­struc­tion, such as a Map. Adjust your func­tion’s type sig­na­ture to ac­cept a Map in­stead of a list of tu­ples, and im­ple­ment it as you nor­mally would.

Once you’ve done that, the call site of your new func­tion will likely fail to type­check, since it is still be­ing passed a list of tu­ples. If the caller was given the value via one of its ar­gu­ments, or if it re­ceived it from the re­sult of some other func­tion, you can con­tinue up­dat­ing the type from list to Map, all the way up the call chain. Eventually, you will ei­ther reach the lo­ca­tion the value is cre­ated, or you’ll find a place where du­pli­cates ac­tu­ally ought to be al­lowed. At that point, you can in­sert a call to a mod­i­fied ver­sion of chec­kN­oDu­pli­cateKeys:

Now the check can­not be omit­ted, since its re­sult is ac­tu­ally nec­es­sary for the pro­gram to pro­ceed!

Use a data struc­ture that makes il­le­gal states un­rep­re­sentable. Model your data us­ing the most pre­cise data struc­ture you rea­son­ably can. If rul­ing out a par­tic­u­lar pos­si­bil­ity is too hard us­ing the en­cod­ing you are cur­rently us­ing, con­sider al­ter­nate en­cod­ings that can ex­press the prop­erty you care about more eas­ily. Don’t be afraid to refac­tor.

Push the bur­den of proof up­ward as far as pos­si­ble, but no fur­ther. Get your data into the most pre­cise rep­re­sen­ta­tion you need as quickly as you can. Ideally, this should hap­pen at the bound­ary of your sys­tem, be­fore any of the data is acted upon.

If one par­tic­u­lar code branch even­tu­ally re­quires a more pre­cise rep­re­sen­ta­tion of a piece of data, parse the data into the more pre­cise rep­re­sen­ta­tion as soon as the branch is se­lected. Use sum types ju­di­ciously to al­low your datatypes to re­flect and adapt to con­trol flow.

In other words, write func­tions on the data rep­re­sen­ta­tion you wish you had, not the data rep­re­sen­ta­tion you are given. The de­sign process then be­comes an ex­er­cise in bridg­ing the gap, of­ten by work­ing from both ends un­til they meet some­where in the mid­dle. Don’t be afraid to it­er­a­tively ad­just parts of the de­sign as you go, since you may learn some­thing new dur­ing the refac­tor­ing process!

Here are a hand­ful of ad­di­tional points of ad­vice, arranged in no par­tic­u­lar or­der:

Let your datatypes in­form your code, don’t let your code con­trol your datatypes. Avoid the temp­ta­tion to just stick a Bool in a record some­where be­cause it’s needed by the func­tion you’re cur­rently writ­ing. Don’t be afraid to refac­tor code to use the right data rep­re­sen­ta­tion—the type sys­tem will en­sure you’ve cov­ered all the places that need chang­ing, and it will likely save you a headache later.

Treat func­tions that re­turn m () with deep sus­pi­cion. Sometimes these are gen­uinely nec­es­sary, as they may per­form an im­per­a­tive ef­fect with no mean­ing­ful re­sult, but if the pri­mary pur­pose of that ef­fect is rais­ing an er­ror, it’s likely there’s a bet­ter way.

Don’t be afraid to parse data in mul­ti­ple passes. Avoiding shot­gun pars­ing just means you should­n’t act on the in­put data be­fore it’s fully parsed, not that you can’t use some of the in­put data to de­cide how to parse other in­put data. Plenty of use­ful parsers are con­text-sen­si­tive.

Avoid de­nor­mal­ized rep­re­sen­ta­tions of data, es­pe­cially if it’s mu­ta­ble. Duplicating the same data in mul­ti­ple places in­tro­duces a triv­ially rep­re­sentable il­le­gal state: the places get­ting out of sync. Strive for a sin­gle source of truth.

Keep de­nor­mal­ized rep­re­sen­ta­tions of data be­hind ab­strac­tion bound­aries. If de­nor­mal­iza­tion is ab­solutely nec­es­sary, use en­cap­su­la­tion to en­sure a small, trusted mod­ule holds sole re­spon­si­bil­ity for keep­ing the rep­re­sen­ta­tions in sync.

Use abstract datatypes to make validators “look like” parsers. Sometimes, making an illegal state truly unrepresentable is just plain impractical given the tools Haskell provides, such as ensuring an integer is in a particular range. In that case, use an abstract newtype with a smart constructor to “fake” a parser from a validator.

As always, use your best judgement. It probably isn’t worth breaking out singletons and refactoring your entire application just to get rid of a single error “impossible” call somewhere—just make sure to treat those situations like the radioactive substance they are, and handle them with the appropriate care. If all else fails, at least leave a comment to document the invariant for whoever needs to modify the code next.

That’s all, re­ally. Hopefully this blog post proves that tak­ing ad­van­tage of the Haskell type sys­tem does­n’t re­quire a PhD, and it does­n’t even re­quire us­ing the lat­est and great­est of GHCs shiny new lan­guage ex­ten­sions—though they can cer­tainly some­times help! Sometimes the biggest ob­sta­cle to us­ing Haskell to its fullest is sim­ply be­ing aware what op­tions are avail­able, and un­for­tu­nately, one down­side of Haskell’s small com­mu­nity is a rel­a­tive dearth of re­sources that doc­u­ment de­sign pat­terns and tech­niques that have be­come tribal knowl­edge.

None of the ideas in this blog post are new. In fact, the core idea—“write to­tal func­tions”—is con­cep­tu­ally quite sim­ple. Despite that, I find it re­mark­ably chal­leng­ing to com­mu­ni­cate ac­tion­able, prac­ti­ca­ble de­tails about the way I write Haskell code. It’s easy to spend lots of time talk­ing about ab­stract con­cepts—many of which are quite valu­able!—with­out com­mu­ni­cat­ing any­thing use­ful about process. My hope is that this is a small step in that di­rec­tion.

Sadly, I don’t know very many other re­sources on this par­tic­u­lar topic, but I do know of one: I never hes­i­tate to rec­om­mend Matt Parson’s fan­tas­tic blog post Type Safety Back and Forth. If you want an­other ac­ces­si­ble per­spec­tive on these ideas, in­clud­ing an­other worked ex­am­ple, I’d highly en­cour­age giv­ing it a read. For a sig­nif­i­cantly more ad­vanced take on many of these ideas, I can also rec­om­mend Matt Noonan’s 2018 pa­per Ghosts of Departed Proofs, which out­lines a hand­ful of tech­niques for cap­tur­ing more com­plex in­vari­ants in the type sys­tem than I have de­scribed here.

As a clos­ing note, I want to say that do­ing the kind of refac­tor­ing de­scribed in this blog post is not al­ways easy. The ex­am­ples I’ve given are sim­ple, but real life is of­ten much less straight­for­ward. Even for those ex­pe­ri­enced in type-dri­ven de­sign, it can be gen­uinely dif­fi­cult to cap­ture cer­tain in­vari­ants in the type sys­tem, so do not con­sider it a per­sonal fail­ing if you can­not solve some­thing the way you’d like! Consider the prin­ci­ples in this blog post ideals to strive for, not strict re­quire­ments to meet. All that mat­ters is to try.



Read the original on lexi-lambda.github.io »
