Merge balazs-patchset #13

Open
wants to merge 45 commits into
base: trunk
from

balazsgaspar
Contributor

Fixes #8, #9, #11 and #12
Tested against the latest cdp716 stack (stack.cdp716.sh).

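| instance | ip address | WEB | CM | CEM | NIFI | NREG | SREG | SMM | HUE | CDSW | Model Status | Viz Status |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| aws_instance.web[null] | 3.127.215.88 | Ok | | | | | | | | | | |
| aws_instance.cluster[0] | 18.196.81.128 | | Ok | Ok | Ok | Ok | Ok | Ok | Ok | Ok | deployed | running |
| aws_instance.cluster[1] | 18.184.166.69 | | Ok | Ok | Ok | Ok | Ok | Ok | Ok | Ok | deployed | running |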

balazsgaspar and others added 30 commits March 7, 2021 18:55
Fixed a typo ("yum" -> "git")
Change Lab 6 from Hue to CDP Data Visualization
Updated Lab 6 instructions for CDV
Fixing Kudu timestamp column issue
@asdaraujo
Collaborator

Thanks a lot, @balazsgaspar ! This is great!
Could you please take a look at the items I've commented on?

Thanks again!

for project in r.json():
    if project['name'] == project_name:
        project_id = project['id']

Collaborator

Any particular reason for removing the try-except block?
I added that to help with troubleshooting. Without it, the HTTP messages are not shown properly, which makes it more difficult to find the root cause of failures in certain cases.
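A minimal sketch of the kind of try-except wrapper being asked about, not the block that was removed from the script. The helper name `find_project_id` and the `FakeResponse` stand-in are hypothetical; the only assumption about the real response object is that it exposes `status_code`, `text`, and `json()` the way a `requests.Response` does.

```python
def find_project_id(response, project_name):
    """Return the ID of the project named project_name, or None if absent.

    `response` is assumed to behave like a requests.Response.
    """
    try:
        for project in response.json():
            if project['name'] == project_name:
                return project['id']
    except (ValueError, KeyError, TypeError) as exc:
        # The body was not JSON, or not the expected list of projects
        # (for example an error object). Surface the raw HTTP details
        # so the root cause is visible in the log.
        raise RuntimeError(
            'Unexpected response while listing projects: HTTP %s\n%s'
            % (response.status_code, response.text)
        ) from exc
    return None


class FakeResponse:
    """Hypothetical stand-in for requests.Response, for demonstration only."""

    def __init__(self, status_code, text, payload=None):
        self.status_code = status_code
        self.text = text
        self._payload = payload

    def json(self):
        if self._payload is None:
            raise ValueError('response body is not JSON')
        return self._payload


ok = FakeResponse(200, '[...]', [{'name': 'viz', 'id': 7}])
print(find_project_id(ok, 'viz'))
```

With this shape, a failed call reports the status code and body text instead of a bare `KeyError` or `TypeError` traceback.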

r = s.post(CDSW_API + '/users/admin/projects', json={'template': 'git',
                                                     'project_visibility': 'private',
                                                     'name': project_name,
                                                     'gitUrl': 'https://github.com/balazsgaspar/edge2ai-workshop'})
Collaborator

replace balazsgaspar with cloudera-labs

Contributor Author

OK, done.

models = [m for m in r.json() if m['name'] == model_name]
if models and models[0]['latestModelDeployment']['status'] == 'deployed':
    print('Model is already deployed!! Skipping.')
    #exit(0)
Collaborator

In case the model has already been deployed, we don't want to re-run the steps below either (Add engine, variables, etc). At this point we could jump directly to the beginning of the VizApps section.

It would be better to encapsulate the model deployment and the vizapps deployment in separate functions and call these functions from the main script section.

This exit(0) would turn into a return, so that the whole deploy_model function would terminate and the deploy_vizapps one would be executed.
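The restructuring suggested above could look roughly like the sketch below. The function names `deploy_model` and `deploy_vizapps` come from the comment; the parameters, the `steps` list, and the placeholder bodies are assumptions made only to keep the example self-contained.

```python
def deploy_model(existing_models, model_name, steps):
    """Deploy the model unless it is already deployed."""
    matching = [m for m in existing_models if m['name'] == model_name]
    if matching and matching[0]['latestModelDeployment']['status'] == 'deployed':
        print('Model is already deployed!! Skipping.')
        return  # replaces exit(0): only this function ends
    # Placeholder for the real work (add engine, variables, build, deploy).
    steps.append('deploy_model')


def deploy_vizapps(steps):
    """Set up the Data Visualization project and application."""
    # Placeholder for the VizApps section of the script.
    steps.append('deploy_vizapps')


def main(existing_models, model_name):
    steps = []  # records which sections actually ran
    deploy_model(existing_models, model_name, steps)
    deploy_vizapps(steps)  # runs whether or not the model step was skipped
    return steps


already = [{'name': 'demo', 'latestModelDeployment': {'status': 'deployed'}}]
print(main(already, 'demo'))
```

Because the early exit is a `return`, the VizApps section still runs when the model step is skipped.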

print('Viz project ID: %s'% (viz_project_id,))

print('# Add custom engine for Data Visualization server')
#docker.repository.cloudera.com/cloudera/cdv/cdswdataviz:6.2.3-b18
Collaborator

I don't think this comment is necessary

Contributor Author

Done. Need to test with a more recent Viz image and if everything works, will update the image tag in the script as well.

print('Engine Image ID: %s'% (engine_image_id,))

print('# Set new engine image as default for the viz project')
#docker.repository.cloudera.com/cloudera/cdv/cdswdataviz:6.2.3-b18
Collaborator

same here

Contributor Author

Done

print('Project image default engine Image ID set to: %s'% (project_engine_image_id))

print('# Create application with Data Visualization server')
# https://github.infra.cloudera.com/Sense/cloudera-sense/blob/master/services/web/server/api/v1/controllers/applications/partial.swagger.yaml
Collaborator

what's this URL for?

Collaborator

remove?

Contributor Author

Done.

Collaborator

@asdaraujo left a comment

I like the idea of the "lite" versions of the workshops. However, maintaining content is not an easy task when there are changes/updates to the product.

Having a duplicate of every lab with a few changes here and there will make it more difficult to keep them up-to-date and they will end up diverging over time.

Do you think it would be possible to have a single document with optional steps that could be skipped for a "lite" lab? That would make it a lot easier to maintain.

if app_status == 'running':
    print('# Viz server app is running. CDSW setup complete!')
    break
elif build_status == 'stopped':
Collaborator

this should be app_status instead of build_status

elif build_status == 'stopped':
    # Additional error handling - if the app exists and is stopped, start it?
    break
elif build_status == 'failed':
Collaborator

ditto
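With both branches switched to `app_status`, the polling logic might look like the sketch below. The function `wait_for_app`, its parameters, and the `get_app_status` callable are hypothetical, and raising on `failed` or on timeout is an assumption; the script under review may handle those cases differently.

```python
import time


def wait_for_app(get_app_status, max_polls=60, delay_secs=5):
    """Poll until the application reaches a terminal state.

    `get_app_status` is an assumed callable that returns the current
    application status string.
    """
    for _ in range(max_polls):
        app_status = get_app_status()
        if app_status == 'running':
            print('# Viz server app is running. CDSW setup complete!')
            return app_status
        elif app_status == 'stopped':
            # The app exists but is stopped; the caller decides whether to start it.
            return app_status
        elif app_status == 'failed':
            raise RuntimeError('Viz server app failed to start')
        time.sleep(delay_secs)
    raise TimeoutError('Viz server app did not reach a terminal state')


statuses = iter(['starting', 'starting', 'running'])
print(wait_for_app(lambda: next(statuses), delay_secs=0))
```

Every branch tests the same variable, so a stopped or failed application can no longer be missed because a stale build status was checked instead.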

print('Set unauthenticated access flag to: %s'% (r.json()["allow_unauthenticated_access_to_app"],))

print('# Add project for Data Visualization server')
project_name = 'viz'
Collaborator

Can we give it a nicer name so that it looks better on CDSW? ;) E.g. "VizApps Workshop"

+
image::images/hue_timestamp_fixed.png[width=800]
+
. We will now create a simple interactive real-time dashboard. Open CDP Data Visualization and log in. Select the Data tab and click NEW CONNECTION.
Collaborator

It might be worth explaining how to "Open CDP Data Visualization and log in", through CDSW. In real life they won't have the handy web server link to help.

The first time I had to find a VizApps link in CDSW it probably took me 1 hour to figure out :)

Collaborator

Also, log in with which credentials?
I found that admin/supersecret1 doesn't work. We could document vizapps_admin/vizapps_admin, but I'd rather figure out how to use admin/supersecret1, so that it's consistent with everything else.

'description': 'viz server app',
'kernel': 'python3',
'memory': 2,
'name': 'viz-server',
Collaborator

@asdaraujo, May 21, 2021

Nicer name for display? :) <Something> Application, so it's more obvious in CDSW

+
When both tabs have been filled, test the connection. You should see "Connection Verified". Hit Connect.

. We can explore the databases and tables of the newly added connection. From the newly created Impala connection open the default database and select "sensors". A preview with sample data will be loaded.
Collaborator

Can you please mention the "Connection Explorer" tab?

. On the Cloudera Manager console, click on the Cloudera logo at the top-left corner to ensure you are at the home page and then click on the *SQL Stream Builder* service.

. Click on the *SQLStreamBuilder Console* link to open the SSB UI.
.. On the first access you will be prompted to accept the use of cookies on this page. Click *Accept* to dismiss that message.
Collaborator

This is an example of what I mentioned about multiple similar documents being harder to maintain. This prompt no longer exists in the latest version. I had already updated the original lab to remove it, but the copy retained the deprecated bit :)

Successfully merging this pull request may close these issues.

Improve registration/login page UI
2 participants