Run with mkdocs build

pflooky · Jul 13, 2023 · 2753c59
1 parent 966dccb
Showing 67 changed files with 12,696 additions and 11 deletions.
diff --git a/docs/tech/advanced.md b/docs/tech/advanced.md
@@ -6,7 +6,7 @@ There are many options available for you to use when you have a scenario when da
 
 1. Create expression [datafaker](https://www.datafaker.net/documentation/expressions/)
  1. Can be used to create names, addresses, or anything that can be found
- under [here](tech/sample/datafaker/expressions.txt)
+ under [here](sample/datafaker/expressions.txt)
 2. Create regex
 
 ## Foreign keys across data sets
@@ -27,7 +27,7 @@ sinkOptions:
  - "transaction-cassandra.transactions.account_id"
 ```
 
-[Sample can be found here.](tech/sample/plan/foreign-key-example-plan.yaml)
+[Sample can be found here.](sample/plan/foreign-key-example-plan.yaml)
 You can define any number of foreign key relationships as you want.
 
 ## Edge cases
@@ -57,8 +57,8 @@ You can alter the `status` column in the account data to only generate `open` ac
 and define a foreign key between Postgres and parquet to ensure the same `account_id` is being used. 
 Then in the parquet task, define 1 to 10 transactions per `account_id` to be generated.
 
-[Postgres account generation example task](tech/sample/task/jdbc/postgres/postgres-account-task.yaml) 
-[Parquet transaction generation example task](tech/sample/task/file/parquet/parquet-transaction-task.yaml) 
-[Plan](tech/sample/plan/scenario-based-plan.yaml)
+[Postgres account generation example task](sample/task/jdbc/postgres/postgres-account-task.yaml) 
+[Parquet transaction generation example task](sample/task/file/parquet/parquet-transaction-task.yaml) 
+[Plan](sample/plan/scenario-based-plan.yaml)
 
 ## Generating JSON data
diff --git a/docs/tech/docker.md b/docs/tech/docker.md
@@ -8,7 +8,7 @@
 
 ## Run with custom data connections
 
-1. Use sample `application.conf` from [here](../../app/src/main/resources/application.conf) and put under folder `/tmp/datagen`
+1. Use sample `application.conf` from [here](sample/conf/application.conf) and put under folder `/tmp/datagen`
  1. `cp app/src/main/resources/application.conf /tmp/datagen`
 2. Fill in details of data connections as found [here](connections.md)
 3. `docker run -v /tmp/datagen:/opt/app/data-caterer -e APPLICATION_CONFIG_PATH=/opt/app/datagen/application.conf pflookyy/data-caterer:0.1`
diff --git a/docs/tech/generators.md b/docs/tech/generators.md
@@ -41,15 +41,18 @@ descriptions:
 |------------|---------|-------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
 | minLen | 1 | minLen: "2" | Ensures that all generated strings have at least length `minLen` |
 | maxLen | 10 | maxLen: "15" | Ensures that all generated strings have at most length `maxLen` |
-| expression | <empty> | expression: "#{Name.name}"<br/> expression:"#{Address.city}/#{Demographic.maritalStatus}" | Will generate a string based on the faker expression provided. All possible faker expressions can be found [here](tech/sample/datafaker/expressions.txt)<br/> Expression has to be in format `#{<faker expression name>}` |
+| expression | <empty> | expression: "#{Name.name}"<br/> expression:"#{Address.city}/#{Demographic.maritalStatus}" | Will generate a string based on the faker expression provided. All possible faker expressions can be found [here](sample/datafaker/expressions.txt)<br/> Expression has to be in format `#{<faker expression name>}`  |
 | enableNull | false | enableNull: "true" | Enable/disable null values being generated |
 
 **Edge cases**: ("", "\n", "\r", "\t", " ", "\\u0000", "\\ufff")
 
 ### Numeric
+
 For all the numeric data types, there are 4 options to choose from: min, minValue, max and maxValue.
 Generally speaking, you only need to define one of min or minValue, similarly with max or maxValue. 
-The reason why there are 2 options for each is because of when metadata is automatically gathered, we gather the statistics of the observed min and max values. Also, it will attempt to gather any restriction on the min or max value as defined by the data source (i.e. max value as per database type).
+The reason why there are 2 options for each is because of when metadata is automatically gathered, we gather the
+statistics of the observed min and max values. Also, it will attempt to gather any restriction on the min or max value
+as defined by the data source (i.e. max value as per database type).
 
 #### Integer/Long/Short/Decimal
 
@@ -62,7 +65,7 @@ The reason why there are 2 options for each is because of when metadata is autom
 
 **Edge cases Integer**: (2147483647, -2147483648, 0) 
 **Edge cases Long/Decimal**: (9223372036854775807, -9223372036854775808, 0) 
-**Edge cases Short**: (32767, -32768, 0) 
+**Edge cases Short**: (32767, -32768, 0)
 
 #### Double/Float
 
@@ -73,7 +76,8 @@ The reason why there are 2 options for each is because of when metadata is autom
 | maxValue | 1000.0 | maxValue: "25.9" | Ensures that all generated values are less than or equal to `maxValue` |
 | max | 1000.0 | max: "25.9" | Ensures that all generated values are less than or equal to `maxValue`. If `maxValue` is defined, `maxValue` will define the largest possible generated value |
 
-**Edge cases Double**: (+infinity, 1.7976931348623157e+308, 4.9e-324, 0.0, -0.0, -1.7976931348623157e+308, -infinity, NaN) 
+**Edge cases Double**: (+infinity, 1.7976931348623157e+308, 4.9e-324, 0.0, -0.0, -1.7976931348623157e+308, -infinity,
+NaN) 
 **Edge cases Float**: (+infinity, 3.4028235e+38, 1.4e-45, 0.0, -0.0, -3.4028235e+38, -infinity, NaN)
 
 ### Date
@@ -85,7 +89,8 @@ The reason why there are 2 options for each is because of when metadata is autom
 | enableNull | false | enableNull: "true" | Enable/disable null values being generated |
 
 **Edge cases**: (0001-01-01, 1582-10-15, 1970-01-01, 9999-12-31)
-(Reference: https://github.com/apache/spark/blob/master/sql/catalyst/src/test/scala/org/apache/spark/sql/RandomDataGenerator.scala#L206)
+(
+Reference: https://github.com/apache/spark/blob/master/sql/catalyst/src/test/scala/org/apache/spark/sql/RandomDataGenerator.scala#L206)
 
 ### Timestamp