Dataset Name | GBs in Memory | Rows | Cols | Cells | Num Execs | Num Cores | Exec Memory | Network Time | Total Time |
NYSE | 0.1G | 103K | 9 | 816K | 1 | 1 | 1G | 00:00:15 | 00:00:48 |
AUM | 14G | 9M | 48 | 432M | 5 | 1 | 4G | 00:01:20 | 00:03:50 |
ENERGY | 5G | 43M | 6 | 258M | 8 | 3 | 3G | 00:00:00 | 00:04:35 |
INVEST_DATA | 20G | 3.8M | 158 | 590M | 3 | 2 | 3G | 00:00:40 | 00:03:32 |
Postgres database call, no concurrent processing, simple case, small data.
-bhtimeoff -numexecutors 1-lib "/opt/owl/drivers/postgres"-executormemory 1g-h metastore01.us-east1-b.c.owl-hadoop-cdh.internal:5432/dev?currentSchema=public-drivermemory 1g -master k8s:// -ds public.nyse_128 -deploymode cluster-q "select * from public.nyse" -bhlb 10 -rd "2020-10-26"-driver "org.postgresql.Driver" -bhminoff-loglevel INFO -cxn postgres-gcp -bhmaxoff
Postgres database call uses parallel JDBC, split on aum_id serial id.
-owluser kirk-lib "/opt/owl/drivers/postgres" -datashapeoff-numpartitions 6 -ds public.aum_dt2_50-deploymode cluster -bhlb 10 -bhminoff-cxn postgres-gcp -bhmaxoff -bhtimeoff-numexecutors 6-executormemory 4g -semanticoff-h metastore01.us-east1-b.c.owl-hadoop-cdh.internal:5432/dev?currentSchema=public-columnname aum_id -corroff -drivermemory 4g -master k8s://-q "select * from public.aum_dt2" -histoff -rd "2020-10-27"-driver "org.postgresql.Driver" -loglevel INFO -agentjobid 7664
HDFS file with 43 million rows, converting a string date to date type, deploy mode client.
-f "hdfs:///demo/owl_usage_all.csv" \-rd "2019-02-02" \-ds energy_file \-loglevel DEBUG -readonly \-d "," -df dd-MMM-yy \-master yarn \-deploymode client \-numexecutors 3 \-executormemory 10g
-bhtimeoff -owluser kirk -numexecutors 1-lib "/opt/owl/drivers/postgres" -executormemory 1g-dl -h metastore01.us-east1-b.c.owl-hadoop-cdh.internal:5432/dev?currentSchema=public-drivermemory 1g -master k8s:// -ds public.nyse_128 -deploymode cluster-q "select * from public.nyse" -bhlb 10-rd "2020-10-27" -driver "org.postgresql.Driver"-bhminoff -loglevel INFO -cxn postgres-gcp -bhmaxoff -agentjobid 7721