The main drawback of the current solutions, Kubernetes being the prime example, is the value-to-complexity ratio. All those pods, kubelets, groups, tons of error-prone configuration, dozens of daemons, and yet it does not solve even the simplest needs:
- Transparent service discovery
- Dynamic reconfiguration
- Log aggregation
- Simple monitoring
In my view, a simple and useful PaaS needs the following:
1. An application is described by a single, simple YAML file stored in a central configuration repository (a sketch follows the field list):
— name: taken from the name of the YAML file
— provided services as a list of ports (TCP/UDP)
— required services as a list of application: port -> local port entries
— image name and version in the artifact repository
— optional list of tags to classify the application
— optional config file location; the PaaS will update it atomically in real time
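A minimal sketch of such a descriptor, with hypothetical key names (provides, requires, image, tags, config), since the list above only names the fields in prose:

```yaml
# billing.yaml; the application name "billing" comes from the file name
provides:                 # services this app offers, as TCP/UDP ports
  - port: 8500
    protocol: tcp
requires:                 # application: port -> local loopback port
  - application: accounts
    port: 8100
    local_port: 5001
  - application: postgres
    port: 5432
    local_port: 5002
image: registry.example.com/billing:1.4.2   # image name and version in the artifact repository
tags: [backend, payments]                   # optional classification
config: /etc/billing/config.yaml            # optional; updated atomically in real time
```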
2. An instance of an application gets all of its required services mapped to distinct ports on the loopback interface. A config file "magically" appears at the hardcoded path. A developer box, a dev cluster, a staging cluster, a production cluster: all environments look the same to the app instance. Packaged as a container or not, started under a debugger or not, it behaves the same.
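Continuing the hypothetical billing example, the instance would reach accounts at 127.0.0.1:5001 and postgres at 127.0.0.1:5002 on a laptop and in production alike, and the config it finds at the hardcoded path could look like this (purely illustrative):

```yaml
# /etc/billing/config.yaml; delivered and updated by the PaaS, same path everywhere
accounts_url: http://127.0.0.1:5001
database_dsn: postgres://127.0.0.1:5002/billing
feature_flags:
  new_invoicing: true
```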
3. The ports mapped on loopback are served by a smart proxy that balances traffic, handles (partial) outages of the target application, does service discovery, and retries with a fallback policy.
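Nothing above prescribes how this proxy is configured; one hypothetical per-port policy, with every key an assumption, might look like this:

```yaml
# sketch of a policy for the local 127.0.0.1:5001 -> accounts:8100 mapping
listen: 127.0.0.1:5001
target:
  application: accounts
  port: 8100            # actual endpoints come from service discovery, not from this file
balancing: round_robin
retries:
  attempts: 3
  per_try_timeout: 200ms
fallback: fail_fast      # e.g. fail_fast | queue | shed_load, all hypothetical values
```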
4. An application instance logs to stdout or stderr; the output gets forwarded wherever appropriate.
5. Apps in the cluster are identified by image name + image version + instance number.
6. External systems are exposed to the cluster apps just like any other app, and the proxies work the same way.
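An external dependency, say a database running outside the cluster, could then get a descriptor of the same shape, with fixed endpoints instead of cluster-managed instances (the key names are again assumptions):

```yaml
# postgres.yaml; an external system registered as if it were just another app
provides:
  - port: 5432
    protocol: tcp
endpoints:                # fixed addresses instead of discovered instances
  - db-1.internal.example.com:5432
  - db-2.internal.example.com:5432
```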
7. Exposing apps to the outside world is again done with a list of simple YAML files stored in the centralized config repository.
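Such an exposure entry could be equally small; this is a sketch, as the text does not fix the exact keys:

```yaml
# public/billing.yaml; expose the billing app to the outside world
application: billing
port: 8500
public:
  host: billing.example.com
  port: 443
  tls: true
```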
8. Since all communication goes through smart proxies, they accumulate metrics on all network events — connects/disconnects/traffic/request timings (for HTTP and other well-known protocols).
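Purely as an illustration, a snapshot of what a proxy might report for one mapped port (the metric names and structure are invented for this example):

```yaml
# hypothetical per-port counters exported by the smart proxy
port: 5001
target: accounts:8100
connects: 10482
disconnects: 10431
bytes_in: 734003201
bytes_out: 98231114
http:
  requests: 52340
  p99_latency_ms: 87
```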
9. Deployment is a simple process: there is a YAML file for each app instance, stored in the central configuration repository (a sketch follows below):
— application name
— cluster node to run on
— instance id, a positive number (optional)
Stopping an app is simply deleting the file (or moving it elsewhere).
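Under these conventions the per-instance file could be as small as this (again a sketch; the field names are assumptions):

```yaml
# deploy/billing-3.yaml; delete this file to stop the instance
application: billing
node: node-07.cluster.example.com
instance: 3              # optional positive number
```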
This takes just enough opinionated convention, and leaves just enough flexibility, to get the following:
1. High developer productivity: the app works on your box the same way it works in the cloud.
2. Freedom from most configuration headaches around network dependencies, while business config can still be kept in a single coherent place.
3. Reliability under network connectivity issues, log aggregation, and monitoring are all handled by the infrastructure.
What is not covered:
1. Automatic distribution of app instances across the cluster. I've never seen it work without extensive tweaking, so it is best left to a script tailored to your infrastructure.
2. Auto scaling, for the same reason.
Now for the interesting part: is it really too hard to implement? Maybe it can be done without intricate cluster managers and layers upon layers of abstraction, each with its own unique configuration dance.