Added label to Cluster Page
Hive 4 DB OWNER DDL syntax for ALTERing DB OWNER requires 'USER' : #139
dstreev committed Sep 12, 2024
1 parent 6abc80a commit a0a38e3
Showing 22 changed files with 386 additions and 38 deletions.
2 changes: 2 additions & 0 deletions Writerside/hms-mirror.tree
@@ -71,6 +71,8 @@
<toc-element topic="hms-mirror-iceberg_migration.md"/>
</toc-element>
<toc-element topic="Index-of-Settings.md">
<toc-element topic="filter_settings.md"/>
<toc-element topic="cluster-settings.md"/>
<toc-element topic="Transfer.md">
<toc-element topic="Transfer-Storage-Migration.md"/>
</toc-element>
3 changes: 1 addition & 2 deletions Writerside/topics/Hive-Conversions.md
@@ -27,8 +27,7 @@ Modify the `hms-mirror` configuration to include the following settings:
``` yaml
clusters:
LEFT|RIGHT:
legacyHive: true|false
hdpHive3: true|false
platformType: HDP2|HDP3|CDH5|CDH6|CDP7.1|CDP7_2|..
```
</tab>
</tabs>
156 changes: 156 additions & 0 deletions Writerside/topics/Index-of-Settings.md
@@ -1,2 +1,158 @@
# Index of Settings

## Copy Avro Schema Urls

`copyAvroSchemaUrls` is a boolean value that determines if the Avro schema URLs should be copied from the source to the target. This is useful if you're using Avro schemas in your source and target environments and want to maintain the same schema URLs in both environments. The default value is
`false`.

```yaml
copyAvroSchemaUrls: true|false
```
## Data Strategy
`dataStrategy` identifies how/what will be migrated between clusters. The following values are supported:

- SCHEMA_ONLY
- SQL
- EXPORT_IMPORT
- HYBRID
- DUMP
- STORAGE_MIGRATION
- COMMON
- LINKED

```yaml
dataStrategy: "SCHEMA_ONLY|SQL|EXPORT_IMPORT|HYBRID|DUMP|STORAGE_MIGRATION|COMMON|LINKED"
```

## Database Only

`databaseOnly` is a boolean value that determines if only the database objects should be migrated. The default value is
`false`.

```yaml
databaseOnly: true|false
```

## Dump Test Data

`dumpTestData` is a boolean value that determines if test data should be dumped to the target. The default value is
`false`.

```yaml
dumpTestData: true|false
```

## Load Test Data File

`loadTestDataFile` is a string value that identifies the file containing the test data to be loaded.

```yaml
loadTestDataFile: "<path_to_test_data_file>"
```

## Skip Link Check

`skipLinkCheck` is a boolean value that determines if the link check should be skipped. The default value is
`false`. Each cluster identifies an HCFS namespace and the link check will verify that the namespace is accessible
from the `hms-mirror` host. In addition, if the `targetNamespace` is defined, the link check will check that as well.

```yaml
skipLinkCheck: true|false
```

## Databases

`databases` is a list of databases to be migrated. It works in concert with 'Warehouse Plans' to provide a list of
databases to be migrated.

```yaml
databases:
- db_name
- db_name2
```

## Database Prefix

`dbPrefix` is a value to prepend to the database name when creating the database in the target cluster. This is
a way to avoid conflicts with existing databases in the target cluster. The default value is an empty string.

```yaml
dbPrefix: "<prefix>"
```

## Database Rename

`dbRename` is a string value that identifies the new name of the database in the target cluster. This is useful for
testing a single database migration to an alternate database in the target cluster. This is only valid for a single
database migration.

```yaml
dbRename: "<new_db_name>"
```
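
As a rough illustration of how `dbPrefix` and `dbRename` could combine into a target database name, here is a minimal Java sketch. The class and method names are hypothetical and not part of the `hms-mirror` API. The precedence shown (a rename wins over a prefix) is an assumption, not something stated above.

```java
final class TargetDbNameSketch {

    // Hypothetical helper: resolves the database name used on the target cluster.
    // Assumption: dbRename, when set, overrides dbPrefix.
    static String targetName(String sourceDb, String dbPrefix, String dbRename) {
        if (dbRename != null && !dbRename.isEmpty()) {
            return dbRename;
        }
        // The documented default for dbPrefix is an empty string.
        String prefix = (dbPrefix == null) ? "" : dbPrefix;
        return prefix + sourceDb;
    }

    public static void main(String[] args) {
        System.out.println(targetName("sales", "dr_", null));
        System.out.println(targetName("sales", "", "sales_test"));
    }
}
```

Because `dbRename` is only valid for a single database migration, a real implementation would also need to reject it when more than one database is listed.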

## Execute

`execute` is a boolean value that determines if the migration should be executed. The default value is `false`,
which is the dry-run mode. In the 'dry-run' mode, all the reports are generated and none of the actual migration is
done.

<tip>
You should ALWAYS run a 'dry-run' before executing a migration. This will give you a good idea of what will be done
and provide you with the reports to review.
</tip>

```yaml
execute: true|false
```

## Migrate Non-Native Tables

`migrateNonNative` is a boolean value that determines if non-native tables should be migrated. The default is
`false`. A non-native table is a table that doesn't have a LOCATION element in the table definition. This is
typical of tables in Hive that rely on other technologies to store the data, e.g., HBase, Kafka, or JDBC Federation.

```yaml
migrateNonNative: true|false
```

## Output Directory

`outputDirectory` is a string value that identifies the directory where the reports will be written. The default is
`$HOME/.hms-mirror/reports`. When this value is defined, the reports will be written to the specified output
directory with the timestamp as the 'name' of the report.

```yaml
outputDirectory: "<path_to_output_directory>"
```
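
The directory resolution described above can be sketched in Java as follows. This is only an illustration: the class name, the method, and the timestamp pattern are assumptions, and the sketch also assumes the timestamped sub-directory is used for the default location as well.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

final class ReportDirSketch {

    // Assumed timestamp pattern; the pattern hms-mirror actually uses is not given above.
    static final DateTimeFormatter STAMP = DateTimeFormatter.ofPattern("yyyy-MM-dd_HH-mm-ss");

    // Hypothetical helper: picks the base directory, then appends the timestamp as the report name.
    static Path reportDir(String outputDirectory, String userHome, LocalDateTime now) {
        Path base = (outputDirectory == null || outputDirectory.isEmpty())
                ? Paths.get(userHome, ".hms-mirror", "reports") // documented default
                : Paths.get(outputDirectory);
        return base.resolve(now.format(STAMP));
    }

    public static void main(String[] args) {
        String home = System.getProperty("user.home");
        System.out.println(reportDir(null, home, LocalDateTime.now()));
        System.out.println(reportDir("/tmp/hms-reports", home, LocalDateTime.now()));
    }
}
```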

## Encrypted Passwords

`encryptedPasswords` is a boolean value that determines if the passwords in the configuration file are encrypted.

```yaml
encryptedPasswords: true|false
```

## Read-Only

`readOnly` is a boolean value that determines if the migration should be read-only. The default value is `false`.
When this value is set to `true`, the table schemas created will not have the 'purge' flag set, which ensures they can't drop
data. This is useful for testing migrations and for DR scenarios where you want to limit the exposure of potential
changes on the target cluster.

```yaml
readOnly: true|false
```

## Skip Features

`skipFeatures` is a boolean value that is `false` by default, so feature checks will be made. Features are a
framework of checks that examine a table definition and make corrections to it to ensure it's compatible with the
target cluster. We've found several circumstances where definitions extracted from the source cluster can NOT be replayed on the target cluster.
These features attempt to correct those issues during the migration.

```yaml
skipFeatures: true|false
```
7 changes: 5 additions & 2 deletions Writerside/topics/JDBC-Drivers-and-Configuration.md
@@ -40,15 +40,15 @@ hiveServer2:
</tab>
</tabs>
Starting with the Apache Standalone driver shipped with CDP 7.1.8 cumulative hot fix parcels, you will need to include additional jars in the `jarFile` configuration, due to some packaging adjustments.
Starting with the Apache Standalone driver shipped with **CDP 7.1.8 cumulative** hot fix parcels, you will need to include additional jars in the `jarFile` configuration, due to some packaging adjustments.

For example: `jarFile: "<cdp_parcel_jars>/hive-jdbc-3.1.3000.7.1.8.28-1-standalone.jar:<cdp_parcel_jars>/log4j-1.2-api-2.18.0.jar:<cdp_parcel_jars>/log4j-api-2.18.0.jar:<cdp_parcel_jars>/log4j-core-2.18.0.jar"` NOTE: The jar file with the Hive Driver MUST be the first in the list of jar files.
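
The ordering rule in the note above could be checked before startup with something like the following Java sketch. It is not part of `hms-mirror`. The class and method names are hypothetical, and matching the driver by a `hive-jdbc` file-name prefix is an assumption based on the example jar name.

```java
import java.util.Arrays;
import java.util.List;

final class JarFileOrderSketch {

    // Splits a colon-separated jarFile value into its individual entries.
    static List<String> entries(String jarFile) {
        return Arrays.asList(jarFile.split(":"));
    }

    // Hypothetical check: is the first entry the Hive JDBC driver jar?
    // Assumption: the driver jar's file name starts with "hive-jdbc".
    static boolean driverJarIsFirst(String jarFile) {
        String first = entries(jarFile).get(0);
        String name = first.substring(first.lastIndexOf('/') + 1);
        return name.startsWith("hive-jdbc") && name.endsWith(".jar");
    }

    public static void main(String[] args) {
        // "/opt/jars" is a placeholder directory used only for this illustration.
        String jarFile = "/opt/jars/hive-jdbc-3.1.3000.7.1.8.28-1-standalone.jar"
                + ":/opt/jars/log4j-api-2.18.0.jar"
                + ":/opt/jars/log4j-core-2.18.0.jar";
        System.out.println(driverJarIsFirst(jarFile));
    }
}
```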

The Cloudera JDBC driver shouldn't require additional jars.

## Kerberized HS2 Connections

We currently have validated **kerberos** HS2 connections to CDP clusters using the Hive JDBC driver you'll find in your target CDP distribution.
We currently have validated **kerberos** HS2 connections to CDP clusters using the Hive JDBC driver you'll find in your target CDP distribution.

<warning>
Connections to Kerberized HS2 endpoints on NON-CDP clusters are NOT currently supported. You will need to use KNOX in HDP to connect to a kerberized HS2 endpoint. For CDH, you can set up a non-kerberized HS2 endpoint to support the migration.
@@ -72,3 +72,6 @@ Once you have everything configured, you can validate all connections required b
'CONNECTIONS --> Validate' left menu option in the UI. This will test the connectivity to the various endpoints
required by `hms-mirror`.

## HDP 3 Connections

The JDBC driver for HDP Hive 3 has some embedded classes for `log4j` that conflict with the `log4j` classes in the `hms-mirror` application. To resolve this, you can use the Cloudera Apache JDBC driver for HDP 3 Hive. This driver is compatible with HDP 3 and does not have the `log4j` conflict.
36 changes: 36 additions & 0 deletions Writerside/topics/Release-Notes.md
@@ -13,6 +13,42 @@ found [here](https://github.com/cloudera-labs/hms-mirror/issues?q=is%3Aissue+is%
If there is
something you'd like to see, add a new issue [here](https://github.com/cloudera-labs/hms-mirror/issues)

## 2.2.0.10

**What's New**

### [Hive 4 DB OWNER DDL syntax for ALTERing DB OWNER requires 'USER'](https://github.com/cloudera-labs/hms-mirror/issues/139)

This change resulted in a simplification of how we determine what the cluster platform is. Previously we used two attributes (`legacyHive` and `hdpHive3`) to determine the platform. This information would direct logic around translations and other features.

Unfortunately, this isn't enough for us to determine all the scenarios we're encountering. These attributes have been replaced with a new attribute called `platformType`. A list of the platform types can be found [here]().

We will make automatic translations of legacy configurations to the new `platformType` attribute. The translation will be pretty basic and result in either the platform type being defined as `HDP2` or `CDP_7.1`. If you have a more complex configuration, you'll need to adjust the `platformType` attribute manually. Future persisted configurations will use the new `platformType` attribute and drop the `legacyHive` and `hdpHive3` attributes.
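
The basic translation described above might look like the following Java sketch. It is a guess at the shape of the logic, not the actual implementation. The class and method names are hypothetical, the mapping of `legacyHive: true` to `HDP2` is an assumption, and the platform type strings are copied from the paragraph above.

```java
final class LegacyPlatformSketch {

    // Hypothetical translation of the two legacy attributes into a platform type name.
    // Assumption: legacyHive=true maps to HDP2 and everything else maps to CDP_7.1.
    // Only those two outcomes are named, so hdpHive3 does not change the result here.
    // A cluster that previously set hdpHive3 would need platformType adjusted manually.
    static String platformType(boolean legacyHive, boolean hdpHive3) {
        return legacyHive ? "HDP2" : "CDP_7.1";
    }

    public static void main(String[] args) {
        System.out.println(platformType(true, false));
        System.out.println(platformType(false, true));
    }
}
```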

### [Add "Property Overrides" to Web Interface](https://github.com/cloudera-labs/hms-mirror/issues/111)

A feature that was late in making it into the Web UI is now here.

### [For Web UI Service, default to prefer IPV4](https://github.com/cloudera-labs/hms-mirror/issues/134)

To ensure the right IP stack is used when the Web UI starts up, we're forcing this JDK configuration with the Web UI.

### [Forcibly set Java Home via -Duser.home](https://github.com/cloudera-labs/hms-mirror/issues/136)

We had a few requests and issues with implementations where the target environment isn't always set up with the normal user 'home' standards that we can rely on. This change allows us to set the 'home' directory for the user running the application and ensure it's translated correctly in hms-mirror for storing and reading configurations, reports, and logs.

If you are in an environment that doesn't follow user `$HOME` standards, you can set the `HOME` environment variable to a custom directory **BEFORE** starting `hms-mirror` to alter the default behavior.

### Cleanup SQL has been added to Web Reporting UI

We've added a 'Cleanup SQL' tab to the Web Reporting UI. This will show you the SQL that was generated to clean up the source cluster after the migration. This is useful to see what will be done before you execute the migration.

**Bugs (Fixed)**

- [DATABASE set OWNER ALTER statement is incorrect](https://github.com/cloudera-labs/hms-mirror/issues/135)
- [SQL ACID Migrations from HDPHive3 cluster Ordering](https://github.com/cloudera-labs/hms-mirror/issues/138)
- [DB Location for HDP3 migrations is flipped](https://github.com/cloudera-labs/hms-mirror/issues/140)

## 2.2.0.9

**Bugs (Fixed)**
3 changes: 3 additions & 0 deletions Writerside/topics/cluster-settings.md
@@ -0,0 +1,3 @@
# Cluster

Start typing here...
3 changes: 3 additions & 0 deletions Writerside/topics/filter_settings.md
@@ -0,0 +1,3 @@
# Filter

Start typing here...
3 changes: 1 addition & 2 deletions Writerside/topics/hms-mirror-features.md
@@ -124,8 +124,7 @@ But there is a corner case where these optimizations can get in the way and caus
```yaml
clusters:
LEFT:
legacyHive: false
hdpHive3: true
platformType: 'HDP3'
```

## Compress Text Output
2 changes: 1 addition & 1 deletion pom.xml
@@ -28,7 +28,7 @@

<groupId>com.cloudera.utils.hadoop</groupId>
<artifactId>hms-mirror</artifactId>
<version>2.2.0.9.4</version>
<version>2.2.0.10</version>
<packaging>jar</packaging>

<name>hms-mirror</name>
@@ -68,6 +68,7 @@ public interface MirrorConf {
String RENAME_TABLE = "ALTER TABLE {0} RENAME TO {1}";
String SET_TABLE_OWNER_DESC = "Set table owner";
String SET_TABLE_OWNER = "ALTER TABLE {0} SET OWNER USER {1}";
String SET_DB_OWNER_W_USER_TYPE = "ALTER DATABASE {0} SET OWNER USER {1}";
String SET_DB_OWNER = "ALTER DATABASE {0} SET OWNER {1}";
String SET_DB_OWNER_DESC = "Set database owner";
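
A minimal sketch of how the two owner templates above might be selected and filled in. The `{0}`/`{1}` placeholders suggest `java.text.MessageFormat`, but that is an assumption. The class name, the method, and the `requiresUserKeyword` flag are hypothetical stand-ins for whatever platform check the project performs.

```java
import java.text.MessageFormat;

final class DbOwnerDdlSketch {

    // Templates copied from the MirrorConf constants in this commit.
    static final String SET_DB_OWNER_W_USER_TYPE = "ALTER DATABASE {0} SET OWNER USER {1}";
    static final String SET_DB_OWNER = "ALTER DATABASE {0} SET OWNER {1}";

    // Hypothetical selector: platforms that need the USER keyword get the newer template.
    static String setOwnerSql(String database, String owner, boolean requiresUserKeyword) {
        String template = requiresUserKeyword ? SET_DB_OWNER_W_USER_TYPE : SET_DB_OWNER;
        return MessageFormat.format(template, database, owner);
    }

    public static void main(String[] args) {
        System.out.println(setOwnerSql("sales", "etl", true));  // ALTER DATABASE sales SET OWNER USER etl
        System.out.println(setOwnerSql("sales", "etl", false)); // ALTER DATABASE sales SET OWNER etl
    }
}
```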

@@ -757,7 +757,7 @@ CommandLineRunner configMigrateAcidOnly(HmsMirrorConfig hmsMirrorConfig, @Value(
CommandLineRunner configMigrateNonNativeTrue(HmsMirrorConfig hmsMirrorConfig) {
return args -> {
log.info("migrate-non-native: {}", Boolean.TRUE);
hmsMirrorConfig.setMigratedNonNative(Boolean.TRUE);
hmsMirrorConfig.setMigrateNonNative(Boolean.TRUE);
};
}

@@ -769,7 +769,7 @@ CommandLineRunner configMigrateNonNativeTrue(HmsMirrorConfig hmsMirrorConfig) {
CommandLineRunner configMigrateNonNativeFalse(HmsMirrorConfig hmsMirrorConfig) {
return args -> {
log.info("migrate-non-native: {}", Boolean.FALSE);
hmsMirrorConfig.setMigratedNonNative(Boolean.FALSE);
hmsMirrorConfig.setMigrateNonNative(Boolean.FALSE);
};
}

@@ -781,7 +781,7 @@ CommandLineRunner configMigrateNonNativeFalse(HmsMirrorConfig hmsMirrorConfig) {
CommandLineRunner configMigrateNonNativeOnlyTrue(HmsMirrorConfig hmsMirrorConfig) {
return args -> {
log.info("migrate-non-native-only: {}", Boolean.TRUE);
hmsMirrorConfig.setMigratedNonNative(Boolean.TRUE);
hmsMirrorConfig.setMigrateNonNative(Boolean.TRUE);
};
}

@@ -793,7 +793,7 @@ CommandLineRunner configMigrateNonNativeOnlyTrue(HmsMirrorConfig hmsMirrorConfig
CommandLineRunner configMigrateNonNativeOnlyFalse(HmsMirrorConfig hmsMirrorConfig) {
return args -> {
log.info("migrate-non-native-only: {}", Boolean.FALSE);
hmsMirrorConfig.setMigratedNonNative(Boolean.FALSE);
hmsMirrorConfig.setMigrateNonNative(Boolean.FALSE);
};
}
