pytd.writer.SparkWriter

class pytd.writer.SparkWriter(td_spark_path=None, download_if_missing=True, spark_configs=None)[source]

A writer module that loads Python data to Treasure Data.

Parameters
td_spark_path : string, optional

Path to td-spark-assembly_x.xx-x.x.x.jar. If not given, the path returned by TDSparkContextBuilder.default_jar_path() is used by default.

download_if_missing : boolean, default: True

Download td-spark if it does not exist at the time of initialization.

spark_configs : dict, optional

Additional Spark configurations to be set via SparkConf’s set method.

__init__(self, td_spark_path=None, download_if_missing=True, spark_configs=None)[source]

Initialize self. See help(type(self)) for accurate signature.
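For illustration, additional Spark settings can be collected in a plain dict and passed as spark_configs. The option names below are standard Spark configuration keys, not pytd-specific; the SparkWriter call itself is commented out because it requires td-spark and valid Treasure Data credentials.

```python
# Standard Spark configuration keys, passed through to SparkConf.set().
spark_configs = {
    "spark.executor.memory": "2g",
    "spark.driver.memory": "1g",
}

# Requires td-spark and Treasure Data credentials; shown for illustration:
# import pytd.writer
# writer = pytd.writer.SparkWriter(spark_configs=spark_configs)
```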

Methods

__init__(self[, td_spark_path, …])

Initialize self.

close(self)

Close a PySpark session connected to Treasure Data.

from_string(writer, **kwargs)

write_dataframe(self, dataframe, table, …)

Write a given DataFrame to a Treasure Data table.

Attributes

closed


property closed
write_dataframe(self, dataframe, table, if_exists)[source]

Write a given DataFrame to a Treasure Data table.

This method internally converts the given pandas.DataFrame into a Spark DataFrame and writes it directly to Plazma, Treasure Data's main storage, through a PySpark session.

Parameters
dataframe : pandas.DataFrame

Data to be loaded into the target table.

table : pytd.table.Table

Target table.

if_exists : {‘error’, ‘overwrite’, ‘append’, ‘ignore’}

What happens when a target table already exists.

  • error: raise an exception.

  • overwrite: drop it, recreate it, and insert data.

  • append: insert data; create the table if it does not exist.

  • ignore: do nothing.
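The four modes above can be sketched with a small local emulation. The helper apply_if_exists below is illustrative only and not part of pytd; it mimics each mode's behavior on an in-memory dict of pandas DataFrames.

```python
import pandas as pd

def apply_if_exists(tables, name, df, if_exists):
    """Emulate the four if_exists modes on an in-memory dict of tables.

    Illustrative sketch only; the real SparkWriter operates on
    Treasure Data tables, not a local dict.
    """
    exists = name in tables
    if if_exists == "error":
        if exists:
            raise RuntimeError(f"table {name!r} already exists")
        tables[name] = df
    elif if_exists == "overwrite":
        # Drop, recreate, and insert data.
        tables[name] = df
    elif if_exists == "append":
        # Insert data; create the table if it does not exist.
        tables[name] = (
            pd.concat([tables[name], df], ignore_index=True) if exists else df
        )
    elif if_exists == "ignore":
        # Do nothing when the table already exists.
        if not exists:
            tables[name] = df
    else:
        raise ValueError(f"unknown if_exists mode: {if_exists!r}")
    return tables
```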

close(self)[source]

Close a PySpark session connected to Treasure Data.