pytd.writer.SparkWriter
class pytd.writer.SparkWriter(td_spark_path=None, download_if_missing=True, spark_configs=None)[source]

    A writer module that loads Python data to Treasure Data.
Parameters:
    td_spark_path : string, optional
        Path to td-spark-assembly_x.xx-x.x.x.jar. If not given, the default
        path returned by TDSparkContextBuilder.default_jar_path() is used.
    download_if_missing : boolean, default: True
        Download td-spark if it does not exist at the time of initialization.
    spark_configs : dict, optional
        Additional Spark configurations to be set via SparkConf's set method.
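As a sketch of how the spark_configs entries are consumed, each key/value pair is applied to a Spark configuration object before the session starts. The FakeSparkConf class below is a hypothetical stand-in for pyspark's SparkConf (which exposes the same set method), so the example runs without a Spark installation:

```python
class FakeSparkConf:
    """Minimal stand-in for pyspark.SparkConf (assumption, not the real API)."""

    def __init__(self):
        self._conf = {}

    def set(self, key, value):
        # SparkConf.set stores a key/value pair and returns the conf
        # object so calls can be chained.
        self._conf[key] = value
        return self

    def get(self, key, default=None):
        return self._conf.get(key, default)


def apply_spark_configs(conf, spark_configs=None):
    """Apply each spark_configs entry via the conf's set method,
    mirroring how SparkWriter forwards its spark_configs argument."""
    for key, value in (spark_configs or {}).items():
        conf.set(key, value)
    return conf


conf = apply_spark_configs(FakeSparkConf(),
                           {"spark.executor.memory": "2g"})
```

With the real library, the equivalent would be passing `spark_configs={"spark.executor.memory": "2g"}` to the SparkWriter constructor.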
__init__(self, td_spark_path=None, download_if_missing=True, spark_configs=None)[source]

    Initialize self. See help(type(self)) for accurate signature.
Methods

    __init__(self[, td_spark_path, …])
        Initialize self.
    close(self)
        Close a PySpark session connected to Treasure Data.
    from_string(writer, **kwargs)
    write_dataframe(self, dataframe, table, …)
        Write a given DataFrame to a Treasure Data table.
Attributes
property closed
write_dataframe(self, dataframe, table, if_exists)[source]

    Write a given DataFrame to a Treasure Data table.
    This method internally converts the given pandas.DataFrame into a Spark DataFrame and writes it directly to Treasure Data's main storage, so-called Plazma, through a PySpark session.
    Parameters:
        dataframe : pandas.DataFrame
            Data loaded to a target table.
        table : pytd.table.Table
            Target table.
        if_exists : {'error', 'overwrite', 'append', 'ignore'}
            What happens when the target table already exists.
            - error: raise an exception.
            - overwrite: drop the table, recreate it, and insert data.
            - append: insert data; create the table if it does not exist.
            - ignore: do nothing.
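The four if_exists modes above can be sketched as a small dispatch function. This is an illustrative model of the documented behavior, not pytd's actual implementation; it returns the sequence of actions the writer would take:

```python
def resolve_if_exists(if_exists, table_exists):
    """Return the actions implied by an if_exists mode, per the
    documented semantics of write_dataframe (illustrative sketch)."""
    if if_exists not in ("error", "overwrite", "append", "ignore"):
        raise ValueError("invalid if_exists: %r" % (if_exists,))
    if not table_exists:
        # No table yet: every mode needs one created before inserting
        # (the docs state this explicitly only for 'append').
        return ["create", "insert"]
    if if_exists == "error":
        # error: raise an exception when the table already exists.
        raise RuntimeError("table already exists")
    if if_exists == "overwrite":
        # overwrite: drop it, recreate it, and insert data.
        return ["drop", "create", "insert"]
    if if_exists == "append":
        # append: insert data into the existing table.
        return ["insert"]
    # ignore: do nothing.
    return []
```

For example, `resolve_if_exists("overwrite", table_exists=True)` yields drop, create, then insert, matching the description above.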