A replacement for Data Science. HASH: a free online platform for modeling the world (from the creator of Stack Overflow)
Sometimes, when you try to understand how the world works, basic math is enough. If we increase the flow of hot water by x, the temperature of the mixture rises by y.
Sometimes you work on more complex things, and you can’t even begin to guess how the input affects the output. Everything seems to go well at the warehouse with up to four employees, but when you hire a fifth, they start getting in each other's way, and the fifth adds nothing.
This is where hash.ai comes in. Read David's launch blog post, then try creating your own simulations!
Today, along with Joel Spolsky and Jude Allred, I am pleased to introduce HASH, the company we founded a little over a year ago. We believe that most of the problems in our world arise from information failures of one kind or another. Whether it is economic collapse, war, illness, or choosing the right life partner or university degree, our mission is to help everyone make the right decisions and overcome information failures.
Brilliant innovators have sought to organize the world's information and make it accessible to all; the next step on this path is to make that information understandable and usable by everyone.
Well-funded, high-tech organizations (such as hedge funds) are able to process huge amounts of the world's information efficiently, squeezing out marginal gains and shaving fractions of a second off economic transactions. At the same time, the vast majority of enterprises and individuals are not able to systematically analyze the full variety of signals the world contains.
Modeling can make the world a better place: it can improve our understanding and perception of the world around us. Modeling is not only a useful tool for human cognition, it also lets people create computer representations of real-world problems. In fact, models are universal interfaces available to both humans and artificial intelligence, and we believe that models can become the connective tissue between the world of people and the world of machines.
We hope that models will help people and computers make decisions more efficiently. In particular, they will help promote the rational resolution of conflicts, reduce and eliminate market failures, and also help people live a happy and healthy life. And we do not want to wait for the advent of this bright future.
If you also don’t want to wait, register now - or read on to find out more.
I used to run a digital consulting company in London that developed websites, software, and campaigns based on data. Our company has worked for a wide range of clients: from private investment companies and start-ups to the largest public clients.
From time to time, we faced really interesting tasks, such as tracking the spread of diseases (for example, sexually transmitted infections), evaluating the effectiveness of measures to combat them (for example, public information campaigns), and optimizing advertising spend (that is, identifying the influential nodes in a network whose targeting is most likely to prevent the spread of disease).
It turns out that in order to find answers to such questions both in epidemiology and in behavioral advertising, there is a single gold standard - “agent-based modeling” (ABM). ABM works as follows.
- Agents represent participants: individuals, companies, households, machines in a factory, or anything else. Different models describe systems at varying levels of detail. In theory, an “agent” can even be a molecule.
- Agents have properties, that is, values attached to them. Properties vary by agent. For a person, a property can be logical (registered voter - yes/no), numerical (annual income) or multiple choice (party affiliation).
- Agents exist in a particular environment (often in several): for example, in a geospatial environment or a network graph.
- Agents are defined by their behaviors: in essence, a behavior is code that describes how an agent should interact with the outside world and respond to it.
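The four ingredients above can be sketched in a few lines of Python. This is only an illustrative toy, not HASH code: the `Agent` class, the ring-shaped contact network, the `spread` behavior and the `p_transmit` parameter are all assumptions made up for this example.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Agent:
    agent_id: int
    infected: bool            # logical property
    contacts_per_step: int    # numerical property
    neighbors: list = field(default_factory=list)  # environment: network edges


def spread(agent, agents, rng, p_transmit):
    """Behavior: an infected agent may infect some of its neighbors."""
    if not agent.infected or not agent.neighbors:
        return []
    k = min(agent.contacts_per_step, len(agent.neighbors))
    newly_infected = []
    for nid in rng.sample(agent.neighbors, k):
        if not agents[nid].infected and rng.random() < p_transmit:
            newly_infected.append(nid)
    return newly_infected


def build_ring(n, rng):
    """Environment: a ring network in which each agent knows two neighbors."""
    agents = [Agent(i, False, rng.randint(1, 2)) for i in range(n)]
    for a in agents:
        a.neighbors = [(a.agent_id - 1) % n, (a.agent_id + 1) % n]
    agents[0].infected = True  # a single initial case
    return agents


def run(n=50, steps=20, p_transmit=0.5, seed=0):
    """Run the model and return the number of infected agents after each step."""
    rng = random.Random(seed)
    agents = build_ring(n, rng)
    history = []
    for _ in range(steps):
        newly_infected = []
        for a in agents:
            newly_infected.extend(spread(a, agents, rng, p_transmit))
        for nid in newly_infected:  # apply updates after all agents have acted
            agents[nid].infected = True
        history.append(sum(a.infected for a in agents))
    return history
```

Calling `run()` returns the infection count per step; changing `p_transmit` or the network builder is the kind of "what if" experiment such a model is meant for.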
ABMs can be built from first principles and are useful for testing “what if” hypotheses, allowing you to safely explore digital twins of real-world systems. This makes agent-based modeling useful for far more than predicting the spread of disease and information online.
Solving problems that Data Science cannot solve
A number of properties of complex systems make predictive modeling difficult. They concern agents and their characteristics: non-linearity, emergence, adaptation, interdependence, and the feedback loops between them. The resulting “black swan” events are by definition not reflected in existing patterns and historical data, and are therefore completely ignored.
There are no systems that exist in isolation - all of them are part of our complex real world, and therefore all problems of business, politics and people, in the long run, are problems of understanding complex systems. In most cases, a reasonable abstraction allows us to discount most of the extraneous factors, but sometimes it is difficult to understand what, when and under what circumstances can be of interest.
In some systems, all this is unimportant, but when answering some questions (for example, how can we help create a more stable economy or good foreign relations), we may be dealing with matters of life and death. To fully understand these extremely important, critical-risk problems, we need to conduct a generalized search of the space in which they exist, based on the observed dynamics of these systems. Pattern recognition and analysis of historical results alone are well suited to establishing a baseline, but they do not reveal the essence of the problems.
Since the space around these problems, representing all the possible configurations of the world, is much larger than the historical space in which the problems were observed, it is sometimes tempting to write off proper scientific modeling as infeasible. Yet proper simulation does not seek to simulate every possible version of the world that might ever arise (there are, of course, infinitely many). Rather, it helps people understand which of these versions could become reality, and draws attention to possible new scenarios that human analysts would not have thought of, precisely because of their nature.
Crises like the financial collapse of 2007-08 became disasters precisely because decision makers did not understand, and did not take into account, the fundamental dynamics of complex systems - the economy, in this case. Regulations such as Basel II introduced capital reserve requirements which, combined with mark-to-market accounting practices, led to forced asset sales, with participants pushed to sell into collapsing markets, deepening the decline.
Although historical and current data can be used to pre-populate and backtest agent-based models, such data is not required to create an ABM. This opens the door to direct formal modeling in a wide range of areas where machine learning cannot currently be applied.
Moreover, simulations combine the advantages of formal modeling with the richness of qualitative description, which makes them highly explainable and easy for humans to understand. Unlike models that sometimes look like a black box, agent-based simulations are verifiable: users can follow step by step exactly how a given result was obtained and which factors contributed to it.
So why is so little said about simulations, and why do they remain underrated and rarely used?
Current issues of agent-based modeling
The modeling process requires a lot of effort, and the cost of servicing, operating and supporting simulations is quite high. Modeling requires knowledge of specialized tools, frameworks, and even strange proprietary programming languages. Resulting simulations are often not portable or repurposable. In cases where the simulation logic is based on guesswork or does not lend itself to calibration, the results can lead to a false sense of confidence or security, which can exacerbate the existing poor-quality decision logic.
Although simulations are said to be widespread in supply chains, manufacturing, finance, defense and other fields, today's market-leading software packages for agent-based modeling work only at limited scale and are built on outdated technologies and paradigms that do not lend themselves to real-time distributed computing. Their user interfaces have not changed since the 1990s, the developer experience they offer is dated, they don’t run at all in the browser or on mobile devices, and users often have to install special software just to access them.
For the most part, these simulations are toy models created to demonstrate certain dynamics, and lacking interoperability. After these models are built, they become fragmented, few people share them, and no one relies on the results of colleagues in their work. Most of the constructed models are so limited (to ensure their timely operation) that they capture only a small part of the dynamics of the systems that they represent. Instead of building rich virtual worlds and selectively incorporating relevant aspects based on the results of experiments, developers create cheap and easy to explore toy abstractions that do not inspire confidence in users. There is deep and justified skepticism regarding the “scientific” nature of these toy models, as well as doubts that more complex models can be properly calibrated and parameterized.
Add to this the difficulty of finding suitable, granular agent-level data, the difficulty of converting domain expertise into code, and the wide range of structural barriers to creating ABMs, and you will understand why general-purpose modeling is losing ground and is rarely used in modern business.
Simulation Available to Everyone
We faced many systemic problems, and now we want to create system-level solutions. HASH strives to solve the simulation problem by vertically integrating the entire stack, creating a single platform for building, running and learning based on simulations.
Today we publicly launch two parts of HASH:
- HASH Core: a web-based development environment and simulation viewer.
- HASH Index: a collection of simulations and modular components.
All simulations in HASH consist of agents (represented by descriptive schemas) and behaviors (which are typically represented by pure functions). Agents are driven by behaviors, and datasets are used to initialize and update them so that the model is grounded in real observations. These datasets can also be used to validate and calibrate models. Behaviors and datasets are attached to the corresponding objects and schemas, so developers can easily search for models using the HASH Index and combine them using HASH Core.
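To make the idea of "behaviors as pure functions" concrete, here is a minimal Python sketch. It is not the HASH API: the `(state, context)` signature, the `DATASET` rows, and the `earn` and `spend` behaviors are hypothetical names chosen for illustration. Each behavior takes an agent's state and returns a new state without mutating its input, and a small dataset initializes the agents.

```python
from functools import reduce

# Hypothetical dataset: one row per agent, as might come from real observations.
DATASET = [
    {"name": "alice", "income": 30_000, "savings": 1_000},
    {"name": "bob", "income": 50_000, "savings": 0},
]


def earn(state, context):
    """Pure behavior: return a new state with one month's income added."""
    return {**state, "savings": state["savings"] + state["income"] / 12}


def spend(state, context):
    """Pure behavior: return a new state after spending a share of monthly income."""
    cost = state["income"] / 12 * context["spend_rate"]
    return {**state, "savings": state["savings"] - cost}


def step(agents, behaviors, context):
    """Apply the behaviors in order to every agent; inputs are never mutated."""
    return [reduce(lambda s, b: b(s, context), behaviors, agent) for agent in agents]


def simulate(rows, behaviors, context, steps):
    """Initialize agents from dataset rows and advance them `steps` times."""
    agents = [dict(row) for row in rows]
    for _ in range(steps):
        agents = step(agents, behaviors, context)
    return agents
```

Because each behavior is an independent function over plain data, behaviors written by different people can be combined in one list, for example `simulate(DATASET, [earn, spend], {"spend_rate": 0.5}, steps=12)`.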
All models, datasets, and behaviors are available in the HASH Index. All HASH Index content is currently free. The HASH Index is designed to be a cross between GitHub and a package manager. In the future, it will be expanded into a marketplace for buying and selling paid behaviors, datasets and simulation models. In our view, companies will publish free components to gain trust and authority, and will then sell more complete simulations and consulting services.
Our future plans for the H-Index include forks, branches, discussions and pull requests - we want to add functionality from Git, which, like the use of package managers, is now second nature to most modern software developers.
The impact of these changes on the developers' workflow is significant: as the H-Index develops, industry specialists with limited programming knowledge will be able to fork and adapt (or fully integrate) existing behavioral models into their simulations. This will allow them to simulate complex dynamics without having to program large-scale projects from scratch.
However, work on our products is not yet complete. Although our lightning-fast HASH Engine lets you simulate at unbeatable speed, it is currently only available through the H-Core web interface, which inevitably limits it to the memory and processor resources available to a browser tab. This means that although the H-Engine is designed to handle truly world-scale simulations, our early beta users have been limited to relatively small models. In its current iteration, H-Core is therefore comparable to something like NetLogo, an academic agent-based modeling tool. NetLogo is useful for illustrating the effects of homogeneous agents in complex systems and explaining the dynamics of these systems, but is limited when it comes to modeling real-world environments with high fidelity or at large scale. Because of these limitations, tools for running optimization experiments (parameter sweeps, Monte Carlo simulations and more exotic approaches such as reinforcement learning) are not yet available - but they are very important to us.
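For readers unfamiliar with these experiment types, the sketch below combines a parameter sweep with Monte Carlo repetition on a toy warehouse model. It is an illustration only and does not use any HASH tooling: the model (workers who block each other when they choose the same aisle), the function names and all parameter values are assumptions made up for this example.

```python
import random
import statistics


def simulate_shift(n_workers, rng, aisles=4, ticks=200):
    """Toy model: each tick every worker picks a random aisle, and a worker
    completes a pick only if nobody else chose the same aisle."""
    picks = 0
    for _ in range(ticks):
        choices = [rng.randrange(aisles) for _ in range(n_workers)]
        picks += sum(1 for aisle in set(choices) if choices.count(aisle) == 1)
    return picks


def monte_carlo(n_workers, runs=30, seed=0):
    """Monte Carlo: repeat the stochastic simulation and summarize the results."""
    rng = random.Random(seed)
    results = [simulate_shift(n_workers, rng) for _ in range(runs)]
    return statistics.mean(results), statistics.stdev(results)


def sweep(worker_counts):
    """Parameter sweep: run the Monte Carlo experiment for each staffing level."""
    return {n: monte_carlo(n) for n in worker_counts}


if __name__ == "__main__":
    for n, (mean, spread) in sweep(range(1, 9)).items():
        print(f"{n} workers: {mean:.1f} picks per shift (stdev {spread:.1f})")
```

The sweep varies one input (staffing level) while the repeated runs average out randomness, which is how one would look for the point at which adding another worker stops helping.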
We are releasing our roadmap on realizing these opportunities and using simulation to make everyday decisions in the real world:
HASH Core and the HASH Index are now officially in beta.
- In the coming weeks, we will work intensively on both platforms and would welcome your contributions.
We are proud to announce that at the end of this year we will open the source code for the HASH Engine, the heart of our simulation system.
- Our goal is to make the platform accessible to all, and give people the opportunity to run H-Engine locally and on closed systems.
- We are currently planning to release a public version of the H-Engine under an open source license by the end of 2020.
This year, we will also begin deploying the HASH Cloud and recruiting beta users.
- H-Cloud is the part of our platform that will allow users to run simulations in the cloud with one click from the existing H-Core authoring and viewing interface (as well as from the command line using the open-source H-Engine build).
- In parallel, an experiments interface will open in H-Core, giving users a way to understand the commercial benefits of simulation at scale.
- Through the H-Cloud, users will be able to access simulations and experiment results programmatically, which will allow them to drive algorithms and applications outside of HASH.
You can learn more about our upcoming products on the public roadmap at hash.ai/roadmap
We started out together a little over a year ago, and there are now about ten people on our team. I am incredibly proud of the team we have built and of what we have achieved during this time.
We are keen to meet HASH users and have launched a community on Slack, which can be accessed through the icon in the lower right corner of any hash.ai page - we will be happy to help you build your models, answer your questions, and take your suggestions and bug reports.
To prevent information failures, we need to create tools that did not previously exist, to solve problems that cannot be solved today. We must give people superpowers; this is our mission.
If you want to build a model using HASH, you can register at hash.ai/signup.
If you want to take part in our mission and help everyone make the right decisions, you can publish simulations, behaviors, and data to the H-Index. You can also apply for any of our open positions at hash.ai/careers.
And finally, if you make business decisions and are interested in learning how to apply HASH, contact us at hash.ai/contact.
We are grateful to the early HASH investors for their support: amazing community builders such as Stack Overflow founder Joel Spolsky and Kaggle founder Anthony Goldbloom, as well as Ash Fontana and Lee Edwards of Zetta Venture Partners and Root Ventures. We are pleased to begin our public mission.
Founder and CEO of HASH