Merge balazs-patchset #13
base: trunk
Fixed a typo ("yum" -> "git")
Merge upstream changes
Change Lab 6 from Hue to CDP Data Visualization
…in cdsw setup script.
Updated Lab 6 instructions for CDV
Fixing Kudu timestamp column issue
Thanks a lot, @balazsgaspar! This is great! Thanks again!
for project in r.json():
    if project['name'] == project_name:
        project_id = project['id']
Any particular reason for removing the try-except block?
I added that to help with troubleshooting. Without it, the HTTP messages are not shown properly, which makes it more difficult to find the root cause of failures in certain cases.
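The removed block is not shown in this thread, so the following is only a sketch of the kind of handling meant here, assuming `r` is a `requests`-style response with `json()`, `status_code` and `text`. The helper name `find_project_id` is hypothetical and not part of the setup script.

```python
def find_project_id(r, project_name):
    """Return the ID of the named project from a list-projects response, or None."""
    try:
        for project in r.json():
            if project['name'] == project_name:
                return project['id']
    except Exception:
        # Show what the server actually returned before re-raising,
        # so the root cause is visible in the setup log.
        print('Unexpected response (HTTP %s): %s' % (r.status_code, r.text))
        raise
    return None
```

With something like this, a non-JSON error page or an error payload shows up in the output next to the traceback instead of only a parsing error.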
r = s.post(CDSW_API + '/users/admin/projects', json={'template': 'git',
                                                     'project_visibility': 'private',
                                                     'name': project_name,
                                                     'gitUrl': 'https://github.com/balazsgaspar/edge2ai-workshop'})
replace balazsgaspar with cloudera-labs
OK, done.
models = [m for m in r.json() if m['name'] == model_name]
if models and models[0]['latestModelDeployment']['status'] == 'deployed':
    print('Model is already deployed!! Skipping.')
    #exit(0)
In case the model has already been deployed, we don't want to re-run the steps below either (Add engine, variables, etc). At this point we could jump directly to the beginning of the VizApps section.
It would be better to encapsulate the model deployment and the vizapps deployment in separate functions and call these functions from the main script section.
This exit(0) would turn into a return, so that the whole deploy_model function would terminate and the deploy_vizapps one would be executed.
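A rough sketch of that structure, assuming the list of models has already been fetched. The function bodies are placeholders and the `main` signature is hypothetical; only the names `deploy_model` and `deploy_vizapps` come from the suggestion above.

```python
def deploy_model(models, model_name):
    """Deploy the model unless it is already deployed. Returns True if work was done."""
    existing = [m for m in models if m['name'] == model_name]
    if existing and existing[0]['latestModelDeployment']['status'] == 'deployed':
        print('Model is already deployed!! Skipping.')
        return False  # return instead of exit(0), so the caller keeps going
    print('# Deploying model %s' % (model_name,))
    # Placeholder: create the project, build and deploy the model here.
    return True


def deploy_vizapps():
    """Placeholder for the VizApps section (engine, project, application)."""
    print('# Deploying VizApps')
    return True


def main(models, model_name):
    # Each section decides on its own whether there is anything to do.
    model_deployed_now = deploy_model(models, model_name)
    viz_done = deploy_vizapps()
    return model_deployed_now, viz_done
```

This way an already-deployed model skips only the model steps, and the VizApps section still runs.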
print('Viz project ID: %s'% (viz_project_id,))

print('# Add custom engine for Data Visualization server')
#docker.repository.cloudera.com/cloudera/cdv/cdswdataviz:6.2.3-b18
I don't think this comment is necessary
Done. Need to test with a more recent Viz image and if everything works, will update the image tag in the script as well.
print('Engine Image ID: %s'% (engine_image_id,))

print('# Set new engine image as default for the viz project')
#docker.repository.cloudera.com/cloudera/cdv/cdswdataviz:6.2.3-b18
same here
Done
print('Project image default engine Image ID set to: %s'% (project_engine_image_id))

print('# Create application with Data Visualization server')
# https://github.infra.cloudera.com/Sense/cloudera-sense/blob/master/services/web/server/api/v1/controllers/applications/partial.swagger.yaml
what's this URL for?
remove?
Done.
I like the idea of the "lite" versions of the workshops. However, maintaining content is not an easy task when there are changes/updates to the product.
Having a duplicate of every lab with a few changes here and there will make it more difficult to keep them up-to-date and they will end up diverging over time.
Do you think it would be possible to have a single document with optional steps that could be skipped for a "lite" lab? That would make it a lot easier to maintain.
if app_status == 'running':
    print('# Viz server app is running. CDSW setup complete!')
    break
elif build_status == 'stopped':
this should be app_status instead of build_status
elif build_status == 'stopped':
    # Additional error handling - if the app exists and is stopped, start it?
    break
elif build_status == 'failed':
ditto
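For illustration, the wait loop with both branches checking app_status could look roughly like this. The helper `wait_for_app`, its `get_status` callable and the attempt/delay defaults are assumptions made for the sketch; only the status strings come from the diff.

```python
import time


def wait_for_app(get_status, attempts=60, delay=10, sleep=time.sleep):
    """Poll get_status() until the app is running, stopped, or failed.

    Returns the final status string, or None if the attempts ran out.
    """
    for _ in range(attempts):
        app_status = get_status()
        if app_status == 'running':
            print('# Viz server app is running. CDSW setup complete!')
            return app_status
        elif app_status in ('stopped', 'failed'):
            # Terminal states: hand the decision (restart, abort) to the caller.
            return app_status
        sleep(delay)
    return None
```

Returning the status rather than breaking out of the loop lets the caller tell a stopped app apart from a failed one or a timeout.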
print('Set unauthenticated access flag to: %s'% (r.json()["allow_unauthenticated_access_to_app"],))

print('# Add project for Data Visualization server')
project_name = 'viz'
Can we give it a nicer name so that it looks better on CDSW? ;) E.g. "VizApps Workshop"
+
image::images/hue_timestamp_fixed.png[width=800]
+
. We will now create a simple interactive real-time dashboard. Open CDP Data Visualization and log in. Select the Data tab and click NEW CONNECTION.
It might be worth explaining how to "Open CDP Data Visualization and log in", through CDSW. In real life they won't have the handy web server link to help.
The first time I had to find a VizApps link in CDSW it probably took me 1 hour to figure out :)
Also, log in with which credentials?
I found that admin/supersecret1 doesn't work. We could document vizapps_admin/vizapps_admin, but I'd rather figure out how to use admin/supersecret1, so that it's consistent with everything else.
'description': 'viz server app',
'kernel': 'python3',
'memory': 2,
'name': 'viz-server',
Nicer name for display? :) <Something> Application, so it's more obvious in CDSW
+
When both tabs have been filled, test the connection. You should see "Connection Verified". Hit Connect.

. We can explore the databases and tables of the newly added connection. From the newly created Impala connection open the default database and select "sensors". A preview with sample data will be loaded.
Can you please mention the "Connection Explorer" tab?
. On the Cloudera Manager console, click on the Cloudera logo at the top-left corner to ensure you are at the home page and then click on the *SQL Stream Builder* service.

. Click on the *SQLStreamBuilder Console* link to open the SSB UI.
.. On the first access you will be prompted to accept the use of cookies on this page. Click *Accept* to dismiss that message.
This is an example of what I mentioned about multiple similar documents being harder to maintain. This prompt no longer exists in the latest version. I had already updated the original lab to remove it, but the copy retained the deprecated bit :)
Fixes #8, #9, #11 and #12
Tested against latest cdp716 stack
stack.cdp716.sh
instance                 ip address     WEB  CM  CEM  NIFI  NREG  SREG  SMM  HUE  CDSW  Model Status  Viz Status
aws_instance.web[null]   3.127.215.88   Ok
aws_instance.cluster[0]  18.196.81.128       Ok  Ok   Ok    Ok    Ok    Ok   Ok   Ok    deployed      running
aws_instance.cluster[1]  18.184.166.69       Ok  Ok   Ok    Ok    Ok    Ok   Ok   Ok    deployed      running