A Structured Streaming job deployed to production has been resulting in higher-than-expected cloud storage costs. During normal execution, each microbatch of data is processed in under 3 seconds, and at least 12 times per minute the job processes a microbatch containing zero records. The streaming write was configured with the default trigger settings. The production job is scheduled alongside many other Databricks jobs in a workspace with instance pools provisioned to reduce start-up time for jobs with batch execution.
Holding all other variables constant and assuming records need to be processed in less than 10 minutes, which adjustment will meet the requirement?
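To see why the trigger interval matters here, a back-of-envelope calculation helps: with the default trigger, a new microbatch starts as soon as the previous one finishes, so a ~3-second batch time means roughly 20 microbatches per minute, each producing writes to cloud storage even when it contains zero records. Lengthening the interval with a processing-time trigger such as `.trigger(processingTime="10 minutes")` (one candidate adjustment, still within the stated under-10-minute latency requirement) cuts the write frequency dramatically. The numbers below are a sketch based only on the figures given in the question:

```python
# Assumption: with the default trigger, a new microbatch starts as soon as
# the previous one finishes, so a ~3 s batch time yields ~20 batches/minute.
SECONDS_PER_MINUTE = 60
batch_seconds = 3  # observed processing time per microbatch

batches_per_minute_default = SECONDS_PER_MINUTE // batch_seconds      # ~20
batches_per_day_default = batches_per_minute_default * 60 * 24        # 28,800

# With a 10-minute processing-time trigger, exactly one microbatch fires
# per interval -- still inside the "under 10 minutes" latency requirement.
trigger_minutes = 10
batches_per_day_10min = (24 * 60) // trigger_minutes                  # 144

# Fewer microbatches means proportionally fewer small-file writes and
# metadata operations against cloud storage.
reduction_factor = batches_per_day_default / batches_per_day_10min    # 200x
print(batches_per_day_default, batches_per_day_10min, reduction_factor)
```

The roughly 200x drop in daily write operations is what drives the storage-cost savings; the exact figure depends on the assumption that batches run back-to-back under the default trigger.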