10 interesting stories served every morning and every evening.
...
Read the original on musclewiki.com »
If you are here in a panic because Google Safe Browsing has blacklisted your website or SaaS, skip ahead to the section describing how to handle the situation. There are also a lot of very interesting comments on the Hacker News comments page.
In the old days, when Google (or any poorly tuned AI that Google unleashed) decided it wanted to kill your business, it would usually resort to denying access to one of its multiple walled gardens, and that was that. You’ve probably heard the horror stories.
They all fit the same mold. First, a business, by choice, uses Google services in a way that makes its survival entirely dependent on them. Second, Google, being the automated behemoth that Google is, does its thing: it ever so slightly adjusts the position of its own butt on its planet-sized leather armchair, and, without really noticing, crushes a myriad of (relatively) ant-sized businesses in the process. Third, and finally, the ant-sized businesses desperately try to inform Google that they are being crushed, but they can only reach an automated suggestions box.
Sometimes, the ant-sized CEO knows a higher-up at Google because they were college buddies, or the CTO writes an ant-sized Medium post that somehow makes it to the front page of the Hacker News mound. Then Google notices the ant-sized problem and sometimes deems it worthy of solving, usually for fear of the regulatory repercussions that the ant revolution might entail.
For this reason, conventional ant-sized wisdom dictates that if possible, you should not build your business to be overly reliant on Google’s services. And if you manage to avoid depending on Google’s multiple walled gardens to survive, you will probably be OK.
In today’s episode of “the Internet is not what it used to be”, let’s talk about a fresh new avenue for Google to inadvertently crush your startup that does not require you to use Google services in any (deliberate) way.
Did you know that it’s possible for your site’s domains to be blacklisted by Google for no particular reason, and that this blacklist is not only enforced directly in Google Chrome, but also by several other software and hardware vendors? Did you know that these other vendors synchronize this list with wildly variable timings and interpretations, in a way that can make fixing any issues extremely stressful and unpredictable? Did you know that Google’s ETA for reviewing a blacklist report, no matter how invalid, is measured in weeks?
This blacklist “feature” is called Google Safe Browsing, and the screenshot in the original post shows the subtle message your users will see if one of your domains happens to be flagged in the Safe Browsing database. Warning texts range from “deceptive site ahead” to “the site ahead contains malware” (see here for a full list), but they all share an equally scary red background design and a borderline-impossible UI for people to skip the warning and use the site anyway.
The first time we experienced this issue, we learned about it from a surge of customer reports that said that they were seeing the red warning page when trying to use our SaaS. The second time, we were better prepared and therefore had some free time to write this post.
For context, InvGate (our company) is a SaaS platform for IT departments that runs on AWS with over 1000 SME and enterprise customers, serving millions of end users. This means our product is used by IT teams to manage issues and requests from their own users. You can imagine the pleasant reaction of IT Managers when suddenly their IT ticketing system starts displaying such ominous security warnings to their end users.
When we first bumped into this problem, we frantically tried to understand what was going on and to learn how Google Safe Browsing (GSB from now on) worked, while our technical support team tried to keep up with customers reporting the issue. We quickly realized that an Amazon CloudFront CDN URL that we used to serve static assets (CSS, JavaScript and other media) had been flagged, and this was causing our entire application to fail for the customer instances that were using that particular CDN. A quick review of the allegedly affected system showed that everything appeared normal.
While our DevOps team was working in full emergency mode to get a new CDN set up and preparing to move customers over onto a new domain, I found that Google’s documentation claims that GSB provides additional explanations about why a site has been flagged in the Google Search Console (GSC from now on) of the offending site. I won’t bore you with the details, but in order to access this information, you have to claim ownership of the site in GSC, which requires you to set up a custom DNS record or upload some files onto the root of the offending domain. We scrambled to do exactly that and after 20 minutes, managed to find the report about our site.
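For reference, the DNS flavor of that ownership check boils down to publishing a single TXT record on the domain. A minimal sketch of what the record looks like (the token below is a placeholder, not a real one):

```
company.com.  IN  TXT  "google-site-verification=PLACEHOLDER_TOKEN"
```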
The report looked something like this (screenshot in the original post):
The report also contained a “Request Review” button that I promptly clicked without actually taking any action on the site, since there was no information whatsoever about the alleged problem. I filed for a review with a message noting that there were no offending URLs listed, despite documentation indicating that example URLs are always provided by Google to assist webmasters in identifying issues.
Around an hour later, and before we had finished moving customers out of that CDN, our site was cleared from the GSB database. I received an automated email confirming that the review had been successful about two hours after the fact. No clarification was given about what caused the problem in the first place.
Over the week that followed this incident, and despite having had our URL cleared from the Safe Browsing blacklist, we continued to receive sporadic reports of companies having trouble accessing our systems.
Google Safe Browsing provides two different APIs for commercial and non-commercial software developers to use the blacklist in their products. In particular, we identified that at least some customers using Firefox were also running into issues, and that antivirus/antimalware software and network-wide security appliances at customer sites were still flagging our site and preventing users from accessing it many days after the issue had been resolved.
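If you want to catch a flag before your customers do, the Safe Browsing Lookup API lets you poll the blacklist status of your own domains. Below is a minimal sketch in Rust of querying the v4 threatMatches:find endpoint; it assumes the reqwest crate (with the blocking and json features) and serde_json, a hypothetical GSB_API_KEY environment variable, and a made-up CDN URL to check:

```rust
use serde_json::json;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Assumption: API key is provided via the environment.
    let api_key = std::env::var("GSB_API_KEY")?;
    let body = json!({
        "client": { "clientId": "yourcompany", "clientVersion": "1.0" },
        "threatInfo": {
            "threatTypes": ["MALWARE", "SOCIAL_ENGINEERING", "UNWANTED_SOFTWARE"],
            "platformTypes": ["ANY_PLATFORM"],
            "threatEntryTypes": ["URL"],
            // The domain you want to monitor (hypothetical).
            "threatEntries": [{ "url": "https://eucdn.company.net/" }]
        }
    });
    let resp: serde_json::Value = reqwest::blocking::Client::new()
        .post(format!(
            "https://safebrowsing.googleapis.com/v4/threatMatches:find?key={api_key}"
        ))
        .json(&body)
        .send()?
        .json()?;
    // An empty JSON object means no matches; a "matches" array means the URL is flagged.
    println!("{resp}");
    Ok(())
}
```

Running something like this on a schedule, alongside the Search Console email alerts, gives you two independent ways to learn about a flag quickly.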
We continued to move all the customers off the formerly blacklisted CDN and onto a new one, and the issue was therefore resolved for good. We never properly established the cause of the issue, but we chalked it up to some AI tripping on acid at Google’s HQ.
My 2 cents: If you run a SaaS business with an availability SLA, getting flagged by Google Safe Browsing for no particular reason represents a very real risk to business continuity.
Sadly, given the oh-so-Googly opacity of the mechanism for flagging and reviewing sites, I don’t think there is a way you can fully prevent this from happening to you. But you can certainly architect your app and processes to minimize the chances of it happening, lower the impact of actually being flagged, and minimize the time needed to circumvent the issue if it arises.
Here are the steps we are taking, and that I therefore recommend:
* Don’t keep all your eggs in one basket, domain-wise. GSB appears to flag entire domains or subdomains. For that reason, it’s a good idea to spread your applications over multiple domains, as that will reduce the impact of any single domain getting flagged. For example: company.com for your website, app.company.net for your application, eucdn.company.net for customers in Europe, useastcdn.company.net for customers in the US East coast, etc.
* Don’t host any customer-generated data in your main domains. A lot of the cases of blacklisting that I found while researching this issue were caused by SaaS customers unknowingly uploading malicious files onto servers. Those files are harmless to the systems themselves, but their very existence can cause the whole domain to be blacklisted. Anything that your users upload onto your apps should be hosted outside your main domains. For example: use companyusercontent.com to store files uploaded by customers.
* Proactively claim ownership of all your production domains in Google Search Console. Doing so won’t prevent your site from being blacklisted, but you will get an email as soon as it happens, which will allow you to react quickly to the issue. Claiming ownership takes a little while, and that is precious time when you are actually dealing with an incident of this sort that is impacting your customers.
* Be ready to jump domains if you need to. This is the hardest thing to do, but it’s the only effective tool against being blacklisted: engineer your systems so that their referenced service domain names can easily be modified (by having scripts or orchestration tools available to perform this change), and possibly even have alternative names available and standing by. For example, have eucdn.company2.net be a CNAME for eucdn.company.net, and if the first domain is blocked, update the configuration of your app to load its assets from the alternate domain (a sketch of this kind of configuration indirection follows this list).
If you do get flagged despite these precautions:
* If you can easily and quickly switch your app to a different domain name, that is the only thing that will reliably, quickly and pseudo-definitively resolve the incident. If possible, do that. You’re done.
* Failing that, once you identify the blocked domain, review the reports that appear on Google Search Console. If you had not claimed ownership of the domain before this point, you will have to do it right now, which will take a while.
* If your site has actually been hacked, fix the issue (i.e. delete offending content or hacked pages) and then request a security review. If your site has not been hacked or the Safe Browsing report is nonsensical, request a security review anyway and state that the report is incomplete.
* Then, instead of waiting in agony, assuming that downtime is critical for your system or business, get to work on moving to a new domain name anyway. The review might take weeks.
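To make the domain-jumping preparation concrete, here is a minimal sketch (all env var names and domains are hypothetical) of the kind of configuration indirection that turns the switch into a config flip plus redeploy rather than a code change:

```rust
// Resolve the CDN base URL from configuration with a standby fallback,
// so a blacklisted domain can be swapped out without touching code.
fn asset_base_url() -> String {
    let primary = std::env::var("CDN_PRIMARY")
        .unwrap_or_else(|_| "https://eucdn.company.net".to_string());
    // Standby on a different registered domain, kept ready as a CNAME.
    let standby = std::env::var("CDN_STANDBY")
        .unwrap_or_else(|_| "https://eucdn.company2.net".to_string());
    // Ops flips this flag when the primary domain gets flagged.
    match std::env::var("CDN_USE_STANDBY").as_deref() {
        Ok("1") | Ok("true") => standby,
        _ => primary,
    }
}

// Build a full asset URL against whichever CDN domain is active.
fn asset_url(path: &str) -> String {
    format!("{}/{}", asset_base_url(), path.trim_start_matches('/'))
}
```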
The second time around, months after the first incident, we received an email from the Search Console warning us that one of our domains had been flagged. A few hours after this initial email report, being a G Suite domain administrator, I received another interesting email.
Let me summarize it, because it’s quite mind-blowing. This email refers to the Search Console blacklist alert emails. What this second email says is that G Suite’s automated phishing filter thinks Google Search Console’s email about our domain being blacklisted is fake. It most certainly is not, since our domain was indeed blacklisted when we received the email. So Google can’t even decide whether its own email alerts about phishing are phishing. (LOL? 🤔)
It’s very clear to anyone working in tech that large corporate technology behemoths are, to a great extent, gatekeepers of the Internet. But I tend to interpret that in a loose, metaphorical way. The Safe Browsing incident described in this post made it very clear that Google literally controls who can access your website, no matter where and how you operate it. With Chrome having around 70% market share, and both Firefox and Safari using the GSB database to some extent, Google can, with a flick of a bit, single-handedly make any site virtually inaccessible on the Internet.
This is an extraordinary amount of power, and one that is not suitable for Google’s “an AI will review your problem when and if it finds it convenient to do so” approach.
...
Read the original on gomox.medium.com »
We recently announced a license change: Blog, FAQ. We posted some additional guidance on the license change this morning. I wanted to share why we had to make this change.
This was an incredibly hard decision, especially with my background and history around Open Source. I take our responsibility very seriously. And to be clear, this change most likely has zero effect on you, our users. It has no effect on our customers that engage with us either in cloud or on premises. Its goal, hopefully, is pretty clear.
So why the change? AWS and Amazon Elasticsearch Service. They have been doing things that we think are just NOT OK since 2015 and it has only gotten worse. If we don’t stand up to them now, as a successful company and leader in the market, who will?
Our license change is aimed at preventing companies from taking our Elasticsearch and Kibana products and providing them directly as a service without collaborating with us.
Our license change comes after years of what we believe to be Amazon/AWS misleading and confusing the community - enough is enough.
We’ve tried every avenue available including going through the courts, but with AWS’s ongoing behavior, we have decided to change our license so that we can focus on building products and innovating rather than litigating.
AWS’s behavior has forced us to take this step and we do not do so lightly. If they had not acted as they have, we would not be having this discussion today.
We think that Amazon’s behavior is inconsistent with the norms and values that are especially important in the open source ecosystem. Our hope is to take our presence in the market and use it to stand up to this now so others don’t face these same issues in the future.
In the open source world, trademarks are considered a great and positive way to protect product reputation. Trademarks have been used and enforced broadly. They are considered sacred by the open source community, from small projects to foundations like Apache to companies like Red Hat. So imagine our surprise when Amazon launched their service in 2015 based on Elasticsearch and called it Amazon Elasticsearch Service. We consider this to be a pretty obvious trademark violation. NOT OK.
I took a personal loan to register the Elasticsearch trademark in 2011 believing in this norm in the open source ecosystem. Seeing the trademark so blatantly misused was especially painful to me. Our efforts to resolve the problem with Amazon failed, forcing us to file a lawsuit. NOT OK.
We have seen that this trademark issue drives confusion with users thinking Amazon Elasticsearch Service is actually a service provided jointly with Elastic, with our blessing and collaboration. This is just not true. NOT OK.
When the service launched, imagine our surprise when the Amazon CTO tweeted that the service was released in collaboration with us. It was not. And over the years, we have heard repeatedly that this confusion persists. NOT OK.
When Amazon announced their Open Distro for Elasticsearch fork, they used code that we believe was copied by a third party from our commercial code and provided it as part of the Open Distro project. We believe this further divided our community and drove additional confusion.
More on this here. NOT OK.
Recently, we found more examples of what we consider to be ethically challenged behavior. We have differentiated with proprietary features, and now we see these feature designs serving as “inspiration” for Amazon, telling us their behavior continues and is more brazen. NOT OK.
We collaborate with cloud service providers, including Microsoft, Google, Alibaba, Tencent, Clever Cloud, and others. We have shown we can find a way to do it. We even work with other parts of Amazon. We are always open to doing that; it just needs to be OK.
I believe in the core values of the Open Source Community: transparency, collaboration, openness. Building great products to the benefit of users across the world. Amazing things have been built and will continue to be built using Elasticsearch and Kibana.
And to be clear, this change most likely has zero effect on you, our users. And no effect on our customers that engage with us either in cloud or on premises.
We created Elasticsearch; we care about it more than anyone else. It is our life’s work. We will wake up every day and do more to move the technology forward and innovate on your behalf.
Thanks for listening. If you have more questions or you want more clarification please read here or contact us at elastic_license@elastic.co.
Thank you. It is a privilege to be on this journey with you.
...
Read the original on www.elastic.co »
Over the past several months, the Brave team has been working with Protocol Labs on adding InterPlanetary File System (IPFS) support in Brave. This is the first deep integration of its kind and we’re very proud to outline how it works in this post.
IPFS is an exciting technology that can help content creators distribute content without high bandwidth costs, while taking advantage of data deduplication and data replication. There are performance advantages for loading content over IPFS by leveraging its geographically distributed swarm network. IPFS is important for blockchain and for self-described data integrity. Previously viewed content can even be accessed offline with IPFS! The IPFS network gives access to content even if it has been censored by corporations and nation-states, such as, for example, parts of Wikipedia.
IPFS support allows Brave desktop users to download content by using a content hash, known as the Content identifier (CID). Unlike HTTP(S), there is no specified location for the content.
Each node in the IPFS network is a potential host for the content being requested, and if a node doesn’t have the content being requested, the node can retrieve the content from the swarm of peers. The retrieved content is verified locally, removing the need to trust a third party’s integrity.
HTTP(S) uses Uniform Resource Locators (URLs) to specify the location of content. This system can be easily censored, since the content is hosted in specific locations on behalf of a single entity, and it is susceptible to distributed denial-of-service (DDoS) attacks. IPFS identifies its content by content paths and/or CIDs inside Uniform Resource Identifiers (URIs), not URLs.
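One way to see the difference in practice: the same CID resolves through any IPFS gateway (or natively via ipfs:// in Brave), because the identifier names the bytes rather than a server. A minimal sketch in Rust, assuming the reqwest crate with the blocking feature; the CID below is a placeholder, not a real one:

```rust
fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Hypothetical CID; any gateway can serve it, since the CID names the
    // content itself and the fetched bytes can be verified against it.
    let cid = "bafybeigdyr...PLACEHOLDER";
    let url = format!("https://ipfs.io/ipfs/{cid}");
    let bytes = reqwest::blocking::get(url)?.bytes()?;
    println!("fetched {} bytes", bytes.len());
    Ok(())
}
```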
...
Read the original on brave.com »
Last week, Elastic announced they will change their software licensing strategy, and will not release new versions of Elasticsearch and Kibana under the Apache License, Version 2.0 (ALv2). Instead, new versions of the software will be offered under the Elastic License (which limits how it can be used) or the Server Side Public License (which has requirements that make it unacceptable to many in the open source community). This means that Elasticsearch and Kibana will no longer be open source software. In order to ensure open source versions of both packages remain available and well supported, including in our own offerings, we are announcing today that AWS will step up to create and maintain an ALv2-licensed fork of open source Elasticsearch and Kibana.
We launched Open Distro for Elasticsearch in 2019 to provide customers and developers with a fully featured Elasticsearch distribution that provides all of the freedoms of ALv2-licensed software. Open Distro for Elasticsearch is a 100% open source distribution that delivers functionality practically every Elasticsearch user or developer needs, including support for network encryption and access controls. In building Open Distro, we followed the recommended open source development practice of “upstream first.” All changes to Elasticsearch were sent as upstream pull requests (#42066, #42658, #43284, #43839, #53643, #57271, #59563, #61400, #64513), and we then included the “oss” builds offered by Elastic in our distribution. This ensured that we were collaborating with the upstream developers and maintainers, and not creating a “fork” of the software.
Choosing to fork a project is not a decision to be taken lightly, but it can be the right path forward when the needs of a community diverge—as they have here. An important benefit of open source software is that when something like this happens, developers already have all the rights they need to pick up the work themselves, if they are sufficiently motivated. There are many success stories here, like Grafana emerging from a fork of Kibana 3.
When AWS decides to offer a service based on an open source project, we ensure that we are equipped and prepared to maintain it ourselves if necessary. AWS brings years of experience working with these codebases, as well as making upstream code contributions to both Elasticsearch and Apache Lucene, the core search library that Elasticsearch is built on—with more than 230 Lucene contributions in 2020 alone.
Our forks of Elasticsearch and Kibana will be based on the latest ALv2-licensed codebases, version 7.10. We will publish new GitHub repositories in the next few weeks. In time, both will be included in the existing Open Distro distributions, replacing the ALv2 builds provided by Elastic. We’re in this for the long haul, and will work in a way that fosters healthy and sustainable open source practices—including implementing shared project governance with a community of contributors.
You can rest assured that neither Elastic’s license change, nor our decision to fork, will have any negative impact on the Amazon Elasticsearch Service (Amazon ES) you currently enjoy. Today, we offer 18 versions of Elasticsearch on Amazon ES, and none of these are affected by the license change.
In the future, Amazon ES will be powered by the new fork of Elasticsearch and Kibana. We will continue to deliver new features, fixes, and enhancements. We are committed to providing compatibility to eliminate any need to update your client or application code. Just as we do today, we will provide you with a seamless upgrade path to new versions of the software.
This change will not slow the velocity of enhancements we offer to our customers. If anything, a community-owned Elasticsearch codebase presents new opportunities for us to move faster in improving stability, scalability, resiliency, and performance.
Developers embrace open source software for many reasons, perhaps the most important being the freedom to use that software where and how they wish.
The term “open source” has had a specific meaning since it was coined in 1998. Elastic’s assertions that the SSPL is “free and open” are misleading and wrong. They’re trying to claim the benefits of open source, while chipping away at the very definition of open source itself. Their choice of SSPL belies this. SSPL is a non-open source license designed to look like an open source license, blurring the lines between the two. As the Fedora community states, “[to] consider the SSPL to be ‘Free’ or ‘Open Source’ causes [a] shadow to be cast across all other licenses in the FOSS ecosystem.”
In April 2018, when Elastic co-mingled their proprietary licensed software with the ALv2 code, they promised in “We Opened X-Pack”: “We did not change the license of any of the Apache 2.0 code of Elasticsearch, Kibana, Beats, and Logstash — and we never will.” Last week, after reneging on this promise, Elastic updated that same page with a footnote that says “circumstances have changed.”
Elastic knows what they’re doing is fishy. The community has told them this (e.g., see Brasseur, Quinn, DeVault, and Jacob). It’s also why they felt the need to write an additional blustery blog (on top of their initial license change blog) to try to explain their actions as “AWS made us do it.” Most folks aren’t fooled. We didn’t make them do anything. They believe that restricting their license will lock others out of offering managed Elasticsearch services, which will let Elastic build a bigger business. Elastic has a right to change their license, but they should also step up and own their own decision.
In the meantime, we’re excited about the long-term journey we’ve embarked on with Open Distro for Elasticsearch. We look forward to providing a truly open source option for Elasticsearch and Kibana using the ALv2 license, and building and supporting this future with the community.
An earlier version of this post incorrectly indicated that the Jenkins CI tool was a fork. We thank @abayer for the correction.
...
Read the original on aws.amazon.com »
Signal is up and running.
...
Read the original on status.signal.org »
I know a number of folks use The Great Suspender to automatically suspend inactive browser tabs in Chrome. Apparently recent versions of this extension have been taken over by a shady anonymous entity, and the extension is now flagged by Microsoft as malware. Notably, the most recent version of the extension (v7.1.8) has added integrated analytics that can track all of your browsing activity across all sites. Yikes.
Recommendations for users of The Great Suspender (7.1.8):
* Disable analytics tracking by opening the extension options for The Great Suspender and checking the box “Automatic deactivation of any kind of tracking”.
* Pray that the shady developer doesn’t issue a malicious update to The Great Suspender later. (There’s no sensible way to disable updates of an individual extension.)
* Close as many unneeded tabs as you can.
* Download the latest good version of The Great Suspender (7.1.6) from GitHub, and move it to some permanent location outside your Downloads folder. (It should be commit 9730c09.)
* Load your downloaded copy as an unpacked extension (via chrome://extensions, with Developer mode enabled, using the “Load unpacked” button). (This copy will not auto-update to future untrusted versions of the extension.)
Caveat: My understanding is that installing an unpacked extension in this way will cause Chrome to issue a new kind of security prompt every time it is launched, which you’ll have to ignore. 😕
Other browser extensions for suspending tabs exist, as mentioned in the Hacker News discussion for this article. However, I have not conducted my own security review of any of those other extensions, so buyer beware.
...
Read the original on dafoster.net »
You have a mind-shattering headache. You’re standing in the aisle of your local CVS, massaging your temples while scanning the shelves for something—anything—to make the pain stop.
What do you reach for? Tylenol? Advil? Aleve?
Most people, I imagine, grab whatever’s cheapest, or closest, or whatever they always use. But if you’re scrupulous enough to ask Google for the best painkiller, here’s how your friendly neighborhood tech behemoth would answer:
Oh thanks Google that’s just all of them.
If you’re among the 77% of Americans that Google their health problems, insipid answers like this won’t surprise you. But we should be surprised, because researchers carry out tens of thousands of clinical trials every year. And hundreds of clinical trials have examined the effectiveness of painkillers. So why can’t I Google those results?
And so in the year of our lord 2017 I had a Brilliant Startup Idea: use a structured database of clinical trials to provide simple, practical answers to common medical questions.
As a proof-of-concept I tried this by hand: I made a spreadsheet with every OTC painkiller trial I could find and used R to run a network meta-analysis, the gold standard of evidence-based medicine.
The results were pretty interesting, and exactly the kind of thing I was looking for back in the sad sterile aisles of CVS:
A wave of exhilaration washed over me. Here was a problem that I had personally felt, that researchers had already gathered the data to answer, and that nobody had built a product around. A perfect bullseye. After a few hours searching domains I came up with a name for my project: GlacierMD.
Over the next nine months I would quit my job, write over 200,000 lines of code, hire five contractors, create a Delaware C-Corp, add four doctors to my advisory board, and demo GlacierMD for twelve Bay Area medical practices. I would spend $40K of my own savings buying clinical trials and paying contractors to enter said trials into the GlacierMD database.
On July 2, 2018, GlacierMD powered the world’s largest depression meta-analysis, using data from 846 trials, beating Cipriani’s previous record of 522.
Choirs of angels sang in my ears. Here I was, living the Silicon Valley dream: making the world a better place through technology.
Two weeks later GlacierMD was dead.
“That’s an awesome idea,” said Carl. “It sounds like something worth working on.”
Carl was my boss. We worked at a startup that leveraged autonomous blockchains to transfer money from naïve investors to slightly less naïve twenty-somethings. There are worse gigs.
And here was Carl telling me that my startup idea would bring such benefit to humanity that I simply had to quit, his roadmap be damned. I nodded knowingly, feeling the weight of this responsibility resting on my proud shoulders.
“Thanks Carl,” I said. “I’ll try to mention you when I accept my Nobel.”
I quit two weeks later and started coding at a blistering pace. I drew all sorts of inscrutable diagrams with dry-erase pens on my parents’ windows. I hired a motley crew of Egyptian contractors to start entering clinical trials into my database. I commissioned a logo, registered my domain, and started obsessing over color schemes.
When I finally finished the MVP I showed it to the head of product at the company I’d just left. I watched him as he watched my demo, waiting for his eyes to melt with the glory of it all. Instead he just sorta shrugged.
“Lots of people make medical claims on the internet,” he said. “Why should I trust yours?”
I started babbling about network meta-analyses, statistical power, and p-values, but he cut me off.
“Yeah okay that’s great but nobody cares about this math crap. You need doctors.”
Goddamnit he was right. If nobody could be bothered with the math, then I was no better than Gwyneth Paltrow hawking vagina eggs. To build trust I needed to get endorsements from trustworthy people.
So I called up some friends, some buddies, some friends-of-friends. “Would you like to be an advisor for my cutting-edge health-tech startup?” I’d ask, while eating Dominos in my parents’ laundry room. I’d give them 1% of this extremely valuable, high-growth startup and in exchange I could plaster their faces all over my website.
Four of these doctors agreed. This is called making deals ladies and gentlemen and I was like the lovechild of Warren Buffet and Dr. Oz.
Things are going great. My friends and family all tell me they love the site. Even some strangers on the internet love it. “I know right,” I tell them. “So how much would you pay for this?”
“Hahahahahahah,” they say in unison. “Good one!”
I forgot that the first law of consumer tech is nobody pays for consumer tech. But no problemo, I say to myself. This is why Eric Schmidt invented ads. I’ll just plaster a few banners on GlacierMD and bing bang boom I’ll be seasteading with Peter Thiel before Burning Man.
But then I look at WebMD’s 10-Qs and start to spiral. Turns out the world’s biggest health website makes about $0.50/year per user. That is…not enough money to bootstrap GlacierMD. I’m pouring money into my rent, into my Egyptian contractors, into AWS—I need some cash soon.
What I need are people willing to pay for this thing. What about doctors? Doctors have money, right? Maybe doctors, or practices, or whatever—someone in the medical industry—maybe they would shell out some cash for my on-demand meta-analyses.
So I listened to a few podcasts and became a sales expert. I started cold calling people using scripts from the internet and tried to convince them to sit through a GlacierMD demo.
In the meantime I receive some worrying messages from my Egyptian contractors.
“I think it’s time to talk about a raise,” one of them says.
“I feel that I have become exceptional at my job,” says another. “Please consider a raise or I will stop working.”
“Please increase my pay,” says the third, including helpful screenshots demonstrating how to give said raise through the Upwork website.
Are my contractors unionizing? I wonder. I glance obliquely at my shrinking bank account statement, grit my teeth, and approve the raises. At this rate I’ll hit zero in a matter of weeks.
But my sales calls start paying off. Miraculously I find some doctors that are willing to talk to me. So I borrow my parents’ car and drive out to the burbs to meet a doctor I’ll call Susan.
Susan has a small practice in downtown Redwood City, a Silicon Valley town that looks 3-D printed from the Google Image results for main street.
Susan is a bit chatty (she’s a psychiatrist) but eventually I demo GlacierMD. I show her how you can filter studies based on the demographic data of the patient, how you can get treatment recommendations based on a preferred side effect profile, how you can generate a dose-response curve. She oohs and aahs at all the right points. By the end of the interview she’s practically drooling.
Hook, line, and sinker I think to myself. I’m already contemplating what color Away bags would look best in the back of my Cybertruck when Susan interrupts my train of thought.
“What a fun project!” she says enthusiastically.
Something in her tone makes me pause. “Uh, yeah,” I say. “So what would you imagine a product like this—one that could change the very practice of medicine—how much would you pay for such a service?”
“Oh, uh—hmmmm,” she said. “I don’t know if we can spare the budget here, to be honest. It’s very fun…but I’m not sure if our practice can justify this cost.”
If you read enough sales books most of them tell you that when people say your product is too expensive what they really mean is your product isn’t valuable enough. Susan acted like I was offering her Nirvana as a Service, so the conversation had taken quite a wild turn.
“So you don’t think this product is useful?”
“Oh sure! I mean, I think in many cases I’ll just prescribe what I normally do, since I’m comfortable with it. But you know it’s possible that sometimes I’ll prescribe something different, based on your metastudies.”
“And that isn’t worth something? Prescribing better treatments?”
“Hmmmm,” she said, picking at her fingernails. “Not directly. Of course I always have the best interests of my patients in mind, but, you know, it’s not like they’ll pay more if I prescribe Lexapro instead of Zoloft. They won’t come back more often or refer more friends. So I’d sorta just be, like, donating this money if I paid you for this thing, right?”
I had literally nothing to say to that. It had been a bit of a working assumption of mine over the past few weeks that if you could improve the health of the patients then, you know, the doctors or the hospitals or whatever would pay for that. There was this giant thing called healthcare right, and its main purpose is improving health—trillions of dollars are spent trying to do this. So if I built a thing that improves health someone should pay me, right?
I said goodbye to Susan and tried to cheer myself up. I had ten more meetings with doctors all over the Bay Area—surely not all of them were ruthless capitalists like Susan. Maybe they would see the towering genius of GlacierMD and shell out some cash.
But in fact everyone gave me some version of Susan’s answer. “We just can’t justify the cost,” a pediatrician told me. “I’m not sure it’s in the budget,” said a primary care physician. “It’s awesome,” said a hospitalist. “You should try to sell this!” Ugh.
So in July 2018, nine months and $40K after starting GlacierMD, I shut it down. I fired my contractors, archived the database, and shut down the servers. GlacierMD was dead.
Make something people want. It’s Y Combinator’s motto and a maxim of aspiring internet entrepreneurs. The idea is that if you build something truly awesome, you’ll figure out a way to make some money off of it.
So I built something people wanted. Consumers wanted it, doctors wanted it, I wanted it. Where did I go wrong?
Occasionally I like to disconnect from the IV drip of internet pseudoknowledge and learn stuff from books. I know, it’s weird—maybe even a bit hipster. But recently I read Wharton’s introductory marketing textbook, Strategic Marketing Management. The very first chapter has this to say:
“To succeed, an offering must create value for all entities involved in the exchange—target customers, the company, and its collaborators.”
All stakeholders. You can’t just create value for the user: that’s a charity. You also can’t just create value for your company: that’s a scam. Your goal is to set up some kind of positive-sum exchange, where everyone benefits, including you. A business plan, according to this textbook, starts with this simple question: how will you create value for yourself and the company?
I winced audibly when I read this. How much time I could’ve saved! If I’d articulated at the beginning how I expected to extract value from GlacierMD, maybe I would’ve researched the economics of an ad-based model, or I would’ve validated that doctors were willing to pay, or hospitals, or insurance companies.
A few months after shuttering GlacierMD and returning to corporate life my buddy pitched me a new startup idea.
“It’s called Doppelganger,” he said. “It’s super simple—you upload a selfie to the database, and then it uses AI or whatever to instantly find everyone in the database who—”
“Looks like you,” I finished for him.
“Exactly,” he said, grinning ear to ear. “How awesome would that be? You should build it!”
I mean, I dunno, it sounds like something fun to do at parties. In a narrow sense, it’s something I want, but there’s no way in hell I’m going to devote any time to this. Doppelganger has created value for the customer but not for the company.
“Call me when you have a business plan,” I said, lacing up my Allbirds and riding my Lime scooter into the sunset.
...
Read the original on tjcx.me »
Effort estimation is an important component of any project, software or otherwise. While effort estimation is something that everybody in industry is involved with on a regular basis, it is a niche topic in software engineering research. The problem is researcher attitude (e.g., they are unwilling to venture into the wilds of industry), which has stopped them acquiring the estimation data needed to build realistic models. A few intrepid people have risked an assault on their ego and talked to people in industry; the outcome has been, until very recently, a small collection of tiny estimation datasets.
In a research context the term effort estimation is actually a hangover from the 1970s; effort correction more accurately describes the behavior of most models since the 1990s. In the 1970s, models took various quantities (e.g., estimated lines of code) and calculated an effort estimate. Later models have included an estimate as input to the model, producing a corrected estimate as output. For the sake of appearances I will use existing terminology.
Which effort estimation datasets do researchers tend to use?
A 2012 review of datasets used for effort estimation using machine learning between 1991-2010 found that the top three were: Desharnais with 24 papers (29%), COCOMO with 19 papers (23%), and ISBSG with 17. A 2019 review of datasets used for effort estimation using machine learning between 1991 and 2017 found the top three to be NASA with 17 papers (23%), the COCOMO data and ISBSG joint second with 16 papers (21%), and Desharnais third with 14. The 2012 review included more sources in its search than the 2019 review, and subjectively your author has noticed a greater use of the NASA dataset over the last five years or so.
How large are these datasets that have attracted so many research papers?
The NASA dataset contains 93 rows (that is not a typo, there is no power-of-ten missing), COCOMO 63 rows, Desharnais 81 rows, and ISBSG is licensed by the International Software Benchmarking Standards Group (academics can apply for a limited time use for research purposes, i.e., not pay the $3,000 annual subscription). The China dataset contains 499 rows, and is sometimes used (there is no mention of a supercomputer being required for this amount of data ;-).
Why are researchers involved in software effort estimation feeding tiny datasets from the 1990s into machine learning algorithms?
Grant money. Research projects are more likely to be funded if they use a trendy technique, and for the last decade machine learning has been the trendiest technique in software engineering research. What data is available to learn from? Those estimation datasets that were flogged to death in the 1990s using non-machine learning techniques, e.g., regression.
Use of machine learning also has the advantage of not needing to know anything about the details of estimating software effort. Everything can be reduced to a discussion of the machine learning algorithms, with performance judged by a chosen error metric. Nobody actually looks at the predicted estimates to discover that the models are essentially producing the same answer, e.g., one learner predicts 43 months, 2 weeks, 4 days, 6 hours, 47 minutes and 11 seconds, while a ‘better’ fitting one predicts 43 months, 2 weeks, 2 days, 6 hours, 27 minutes and 51 seconds.
How many ways are there to do machine learning on datasets containing less than 100 rows?
A paper from 2012 evaluated the possibilities using 9 learners times 10 data-preprocessing options (e.g., log transform or discretization) times 7 error-estimation metrics, giving 630 possible final models; they picked the top 10 performers.
This 2012 study has not stopped researchers from continuing to twiddle the knobs available to them; anything to keep the paper mills running.
To quote the authors of one review paper: “Unfortunately, we found that very few papers (including most of our own) paid any attention at all to properties of the data set.”
Agile techniques are widely used these days, and datasets from the 1990s are not applicable. What datasets do researchers use to build Agile effort estimation models?
A 2020 review of Agile development effort estimation found 73 papers. The most popular data set, containing 21 rows, was used by nine papers. Three papers used simulated data! At least some authors were going out and finding data, even if it contains fewer rows than the NASA dataset.
As researchers in business schools have shown, large datasets can be obtained from industry; ISBSG actively solicits data from industry and now has data on 9,500+ projects (as far as I can tell a small amount for each project, but that is still a lot of projects).
Are there any estimates on Github? Some Open source projects use JIRA, which includes support for making estimates. Some story point estimates can be found on Github, but the actuals are missing.
A handful of researchers have obtained and released estimation datasets containing thousands of rows, e.g., the SiP dataset contains 10,100 rows and the CESAW dataset contains over 40,000 rows. These datasets are generally ignored, perhaps because when presented with lots of real data researchers have no idea what to do with it.
...
Read the original on shape-of-code.coding-guidelines.com »
The windows crate lets you call any Windows API past, present, and future using code generated on the fly directly from the metadata describing the API and right into your Rust package where you can call them as if they were just another Rust module.
The Rust language projection follows in the tradition established by C++/WinRT of building language projections for Windows using standard languages and compilers, providing a natural and idiomatic way for Rust developers to call Windows APIs.
Start by adding the following to your Cargo.toml file:
```toml
[dependencies]
windows = "0.2.1"

[build-dependencies]
windows = "0.2.1"
```
This will allow Cargo to download, build, and cache Windows support as a package. Next, specify which types you need inside of a build.rs build script and the windows crate will generate the necessary bindings:
```rust
fn main() {
    windows::build!(
        windows::data::xml::dom::*
        windows::win32::system_services::{CreateEventW, SetEvent, WaitForSingleObject}
        windows::win32::windows_programming::CloseHandle
    );
}
```
Finally, make use of any Windows APIs as needed.
```rust
mod bindings {
    ::windows::include_bindings!();
}

use bindings::{
    windows::data::xml::dom::*,
    windows::win32::system_services::{CreateEventW, SetEvent, WaitForSingleObject},
    windows::win32::windows_programming::CloseHandle,
};

fn main() -> windows::Result<()> {
    // ... call the imported APIs here; the example body is elided in this excerpt ...
    Ok(())
}
```
To reduce build time, use a bindings crate rather than simply a module. This will allow Cargo to cache the results and build your project far more quickly.
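A hedged sketch of that layout: a tiny helper crate (the name bindings here is hypothetical) whose build.rs contains the windows::build! invocation shown earlier, and whose entire library is the single include line below. Your application then depends on it via a path dependency, and Cargo caches the generated code between builds:

```rust
// bindings/src/lib.rs — the helper crate's entire library; pulls in the
// code generated by the windows::build! macro in this crate's build.rs.
::windows::include_bindings!();
```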
There is an experimental documentation generator for the Windows API. The documentation is published here. This can be useful to figure out how the various Windows APIs map to Rust modules and which use paths you need to use from within the build macro.
For a more complete example, take a look at Robert Mikhayelyan’s Minesweeper. More simple examples can be found here.
...
Read the original on github.com »
10HN is also available as an iOS App
If you visit 10HN only rarely, check out the best articles from the past week.
Visit pancik.com for more.