Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature Req] Tool/Script to generate "cluster.yml" configs from existing exported CDP cluster template json files #39

Open
lhoss opened this issue Jun 11, 2021 · 1 comment
Labels
enhancement New feature or request

Comments

@lhoss
Copy link

lhoss commented Jun 11, 2021

Not sure I'm missing a simpler strategy, on howto prepare a new "cluster.yml" file required by this playbook (to be configured in the definition_path), from an exportable CDP (7.1.x / private-cloud) cluster ?
Would be great to get input from Cloudera guys :)
Alternatively, or in addition, it would be highly useful/helpful if much more advanced "cluster.yml" example would be added to the repo (for ex. to deploy a 3-master node HA cluster, as this was nicely done in the former HDP repo: https://github.com/hortonworks/ansible-hortonworks/blob/master/playbooks/group_vars/example-hdp-ha-3-masters-with-ranger-atlas )

Once I learn here in the community, there's indeed no such existing script .. I'm happy to write&contribute myself something

Minimal features:

  • create the mapping of the contained services to the host-groups
  • create the mapping of all the found "configs" elements to the key/value pairs in the "cluster.yml" service's "dict" element
    • (later) nice to have: an option to skip any config values which are/were just defaults from the beginning
  • many other things .. that are required to make it useable/work?
  • handling these "refName" values in the sourc template json

Script Input / Output examples

  • Input file, just small extract (from exported cluster template)
    • Can give more infos later howto that for people not familiar.
{
  "cdhVersion" : "7.1.4",
  "displayName" : "Basic Cluster",
  "cmVersion" : "7.1.4",
  "repositories" : [ ... ],
  "products" : [ {
    "version" : "7.1.4-1.cdh7.1.4.p0.6300266",
    "product" : "CDH"
  } ],
  "services" : [ {
    "refName" : "zookeeper",
    "serviceType" : "ZOOKEEPER",
    "serviceConfigs" : [ {
      "name" : "zookeeper_datadir_autocreate",
      "value" : "true"
    } ],
    "roleConfigGroups" : [ {
      "refName" : "zookeeper-SERVER-BASE",
      "roleType" : "SERVER",
      "configs" : [ {
        "name" : "zk_server_log_dir",
        "value" : "/var/log/zookeeper"
      }, {
        "name" : "dataDir",
        "variable" : "zookeeper-SERVER-BASE-dataDir"
      }, {
        "name" : "dataLogDir",
        "variable" : "zookeeper-SERVER-BASE-dataLogDir"
      } ],
      "base" : true
    } ]
  }, 
...

  } ],
  "hostTemplates" : [ {
    "refName" : "HostTemplate-0-from-eval-cdp-public[1-3].internal.cloudapp.net",
    "cardinality" : 3,
    "roleConfigGroupsRefNames" : [ "hdfs-DATANODE-BASE", "spark_on_yarn-GATEWAY-BASE", "yarn-NODEMANAGER-BASE" ]
  }, {
    "refName" : "HostTemplate-1-from-eval-cdp-public0.internal.cloudapp.net",
    "cardinality" : 1,
    "roleConfigGroupsRefNames" : [ "hdfs-NAMENODE-BASE", "hdfs-SECONDARYNAMENODE-BASE", "spark_on_yarn-GATEWAY-BASE", "spark_on_yarn-SPARK_YARN_HISTORY_SERVER-BASE", "yarn-JOBHISTORY-BASE", "yarn-RESOURCEMANAGER-BASE", "zookeeper-SERVER-BASE" ]
  } ],
...

Output file, following the format of cluster.yml, for ex: roles/cloudera_deploy/defaults/basic_cluster.yml

clusters:
  - name: Basic Cluster
    services: [HDFS, YARN, ZOOKEEPER]
    repositories:
      - https://archive.cloudera.com/cdh7/7.1.4.0/parcels/
    configs:
      ZOOKEEPER:
        SERVICEWIDE:
          zookeeper_datadir_autocreate: true
          zk_server_log_dir": "/var/log/zookeeper"
     
      HDFS:
        DATANODE:
          dfs_data_dir_list: /dfs/dn
        NAMENODE:
          dfs_name_dir_list: /dfs/nn
...
    host_templates:
      Master1:
        HDFS: [NAMENODE, SECONDARYNAMENODE]
        YARN: [RESOURCEMANAGER, JOBHISTORY]
        ZOOKEEPER: [SERVER]
      Workers:
        HDFS: [DATANODE]
        YARN: [NODEMANAGER]
@tmgstevens
Copy link
Contributor

I definitely agree this is a useful thing to have. In theory it shouldn't be too hard to do a naive mapping from an exported cluster template into a cluster.yml, however the things that might be tricky are:

  • Not all running clusters have host templates, so you may need to re-engineer those
  • Any configs that are created by the overlays (tls, kerberos, oom/heapdump-settings, logdir settings, etc) would be hard to re-overlay.
  • We don't currently support custom role groups, so things could fall down slightly there.

But if you fancy a stab I'd be happy to review and contribute. It could probably work quite nicely in j2 and then it's potentially possible to operate a "clone-a-cluster" type feature if we're already integrated with Ansible?

@wmudge wmudge added the enhancement New feature or request label Nov 3, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Development

No branches or pull requests

3 participants