I’m trying to add some metrics to a PySpark pipeline that feeds an index. I’ve found that obs.get blocks even after the write completes, and I can work around it by triggering another action on the DataFrame, such as the count below. Is a workaround like this necessary, or am I missing something in this code?
from pyspark.sql import Observation
from pyspark.sql.functions import count, lit

obs = Observation("metrics")
df = spark.read.load(format="jdbc", **pgoptions) \
    .withColumns(chpt_cols) \
    .select(*target_cols) \
    .observe(obs,
             count(lit(1)).alias("metric1"))
df.write.save(mode='append', format='es', **esoptions)
df.count()  # workaround: without this extra action, obs.get below never returns
print(obs.get)
Thanks in advance for your replies.