Dive into tdp-lib, the SDK in charge of TDP cluster management

submitted by
Style Pass
2023-01-25 21:00:08

All deployments are automated, and Ansible plays a central role. As the code base grew more complex, we needed a new system to overcome Ansible's limitations and let us tackle new challenges.

TDP is a 100% open source big data platform based on the Hadoop ecosystem. Alliage offers support and professional services on TDP.

Scheduling in Ansible is not easy. Triggering one task at the end of another (with handlers) or selecting which tasks to run as needed (with tags) does not scale.

In TDP, the main way to control the deployment is through variables. Defining and versioning variables in Ansible is an easy way to complicate your life: there are currently 22 different locations where you can define variables, and as the project grows in complexity, it becomes difficult to keep track of where each variable is defined or redefined. Moreover, it can sometimes lead to defaults being defined outside our control. We decided to add a 23rd way, one that makes variables easy to version and lets us add custom behavior. To do so, we had to develop the tools needed to version these variables correctly.
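To make the problem concrete, here is a minimal Python sketch of what a single, explicit variable source could look like: layers of variables are merged in a known order, so it is always clear which definition wins. This is only an illustration under assumptions. The function names `merge_vars` and `_merge_into`, the layer names, and the sample keys are hypothetical and are not taken from tdp-lib.

```python
from copy import deepcopy


def _merge_into(target, source):
    """Recursively copy `source` into `target`; values in `source` win."""
    for key, value in source.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            # Both sides are mappings: merge key by key instead of replacing.
            _merge_into(target[key], value)
        else:
            # Copy so the merged result never shares state with the inputs.
            target[key] = deepcopy(value)


def merge_vars(*layers):
    """Merge variable layers left to right; later layers override earlier ones."""
    result = {}
    for layer in layers:
        _merge_into(result, layer)
    return result


if __name__ == "__main__":
    # Hypothetical layers: defaults shipped with a collection, then
    # overrides kept under version control by the cluster operator.
    collection_defaults = {"hdfs": {"replication": 3, "heap_size": "1g"}}
    versioned_overrides = {"hdfs": {"heap_size": "4g"}}
    print(merge_vars(collection_defaults, versioned_overrides))
```

Because the merge order is fixed and the overrides live in plain files, a change to a variable can be reviewed and versioned like any other change.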

The answer to these two requirements is tdp-lib. This SDK allows compatible collections to define a DAG (directed acyclic graph) that contains all the relationships between components and services and details the execution order of the tasks. It also allows variables to be defined per service and per component in YAML files.
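The idea of deriving an execution order from a DAG can be sketched with Python's standard `graphlib` module. This is not tdp-lib's API: the operation names, the `DEPENDENCIES` mapping, and the helpers `execution_order` and `plan_for` are assumptions made for illustration only.

```python
from graphlib import TopologicalSorter

# Hypothetical operations, each mapped to the operations it depends on.
DEPENDENCIES = {
    "zookeeper_install": set(),
    "zookeeper_start": {"zookeeper_install"},
    "hdfs_install": set(),
    "hdfs_namenode_start": {"hdfs_install", "zookeeper_start"},
    "hdfs_datanode_start": {"hdfs_namenode_start"},
}


def execution_order(dependencies):
    """Return one valid order in which every operation follows its dependencies."""
    # static_order() raises graphlib.CycleError if the graph is not acyclic.
    return list(TopologicalSorter(dependencies).static_order())


def plan_for(target, dependencies):
    """Return only the operations required to reach `target`, in execution order."""
    needed = set()
    stack = [target]
    while stack:
        operation = stack.pop()
        if operation not in needed:
            needed.add(operation)
            stack.extend(dependencies[operation])
    return [op for op in execution_order(dependencies) if op in needed]


if __name__ == "__main__":
    print(plan_for("hdfs_namenode_start", DEPENDENCIES))
```

Unlike handlers or tags, the graph states every dependency explicitly, so running a single target pulls in exactly the operations it needs and nothing else.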
