pytd.Client
-
class pytd.Client(apikey=None, endpoint=None, database='sample_datasets', default_engine='presto', header=True, **kwargs)
Treasure Data client interface.
A client instance establishes a connection to Treasure Data. This interface gives easy and efficient access to the Presto and Hive query engines and to Plazma primary storage.
- Parameters
- apikey : string, optional
Treasure Data API key. If not given, the value of the environment variable TD_API_KEY is used by default.
- endpoint : string, optional
Treasure Data API server. If not given, https://api.treasuredata.com is used by default. The list of available endpoints is at https://support.treasuredata.com/hc/en-us/articles/360001474288-Sites-and-Endpoints
- database : string, default: ‘sample_datasets’
Name of connected database.
- default_engine : string, {‘presto’, ‘hive’}, or pytd.query_engine.QueryEngine, default: ‘presto’
Query engine. If a QueryEngine instance is given, apikey, endpoint, and database are overwritten by the values configured in the instance.
- header : string or boolean, default: True
Prepend comment strings, in the form “-- comment”, as a header of queries. Set False to disable the header.
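The defaults described above can be sketched as a small helper that assembles the constructor arguments. This is a minimal illustration, not pytd's own logic: `client_kwargs` and `connect` are hypothetical names, the helper mirrors only what the parameter list states, and `connect` assumes that pytd is installed and that a valid API key is available.

```python
import os

DEFAULT_ENDPOINT = "https://api.treasuredata.com"  # documented default


def client_kwargs(apikey=None, endpoint=None, database="sample_datasets"):
    """Resolve constructor arguments as the parameter list describes.

    Hypothetical helper for illustration only; pytd.Client applies its
    own defaults when these arguments are omitted.
    """
    # Fall back to the TD_API_KEY environment variable when no key is passed.
    apikey = apikey or os.environ.get("TD_API_KEY")
    if apikey is None:
        raise ValueError("pass apikey or set the TD_API_KEY environment variable")
    return {
        "apikey": apikey,
        "endpoint": endpoint or DEFAULT_ENDPOINT,
        "database": database,
        "default_engine": "presto",
        "header": True,
    }


def connect(**overrides):
    """Create a client; pytd is imported lazily so the helper above runs without it."""
    import pytd

    return pytd.Client(**client_kwargs(**overrides))
```

Calling `connect(database="sample_datasets")` would then open a connection, provided TD_API_KEY is set in the environment.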
-
__init__(self, apikey=None, endpoint=None, database='sample_datasets', default_engine='presto', header=True, **kwargs)
Initialize self. See help(type(self)) for accurate signature.
Methods
- __init__(self[, apikey, endpoint, database, …]): Initialize self.
- close(self): Close a client I/O session to Treasure Data.
- get_job(self, job_id): Get a td-client-python Job object from job_id.
- get_table(self, database, table): Create a pytd table control instance.
- list_databases(self): Get a list of td-client-python Database objects.
- list_jobs(self): Get a list of td-client-python Job objects.
- list_tables(self[, database]): Get a list of td-client-python Table objects.
- load_table_from_dataframe(self, dataframe, …): Write a given DataFrame to a Treasure Data table.
- query(self, query[, engine]): Run query and get results.
-
list_databases(self)
Get a list of td-client-python Database objects.
- Returns
- list of tdclient.models.Database
-
list_tables(self, database=None)
Get a list of td-client-python Table objects.
- Parameters
- database : string, optional
Database name. If not given, list tables in the database associated with this pytd.Client instance.
- Returns
- list of tdclient.models.Table
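As a rough sketch of how the two listing calls might be combined, the helper below collects table names per database. `table_names_by_database` is a hypothetical name, and the code assumes that the returned tdclient model objects expose a `name` attribute. The stand-in client at the bottom uses made-up names and exists only so that the snippet runs without network access.

```python
from types import SimpleNamespace


def table_names_by_database(client):
    """Map each database name to the names of its tables.

    Assumes tdclient Database and Table objects expose a ``name`` attribute.
    """
    return {
        db.name: [t.name for t in client.list_tables(db.name)]
        for db in client.list_databases()
    }


# Offline stand-in for a pytd.Client, with made-up database and table names.
_fake_tables = {"db_a": ["t1", "t2"], "db_b": []}
fake_client = SimpleNamespace(
    list_databases=lambda: [SimpleNamespace(name=n) for n in _fake_tables],
    list_tables=lambda database=None: [
        SimpleNamespace(name=n) for n in _fake_tables[database]
    ],
)

print(table_names_by_database(fake_client))
```

With a real client, the same function would be called as `table_names_by_database(client)`.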
-
list_jobs(self)
Get a list of td-client-python Job objects.
- Returns
- list of tdclient.models.Job
-
get_job(self, job_id)
Get a td-client-python Job object from job_id.
- Parameters
- job_id : integer
Job ID.
- Returns
- tdclient.models.Job
-
query(self, query, engine=None, **kwargs)
Run query and get results.
- Parameters
- query : string
Query issued on a specified query engine.
- engine : string, {‘presto’, ‘hive’}, or pytd.query_engine.QueryEngine, optional
Query engine. If not given, the default query engine created in the constructor is used.
- **kwargs
Treasure Data-specific optional query parameters. Giving these keyword arguments forces the query engine to issue the query via the Treasure Data REST API provided by tdclient; that is, if engine is Presto, the efficient direct access to the query engine provided by prestodb is not used.
- db (str): use the database
- result_url (str): result output URL
- priority (int or str): priority
    -2: “VERY LOW”
    -1: “LOW”
    0: “NORMAL”
    1: “HIGH”
    2: “VERY HIGH”
- retry_limit (int): max number of automatic retries
- wait_interval (int): sleep interval until job finish
- wait_callback (function): called every interval against job itself
- Returns
- dict : keys (‘data’, ‘columns’)
- ‘data’
List of rows. Every single row is represented as a list of column values.
- ‘columns’
List of column names.
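Because rows come back as plain lists, it is often convenient to pair each value with its column name. The following is a minimal sketch under that assumption: `rows_as_dicts` is a hypothetical helper, and `sample` is made-up data with the documented shape, standing in for the value returned by `client.query(...)`.

```python
def rows_as_dicts(result):
    """Pair every row in result['data'] with the names in result['columns']."""
    columns = result["columns"]
    return [dict(zip(columns, row)) for row in result["data"]]


# Made-up result in the documented shape; a real one would come from
# client.query(...).
sample = {
    "columns": ["name", "cnt"],
    "data": [["a", 1], ["b", 2]],
}

for record in rows_as_dicts(sample):
    print(record)
```

The same two keys can also be passed to other consumers, for example as the data and column names of a tabular structure.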
-
get_table(self, database, table)
Create a pytd table control instance.
- Parameters
- database : string
Database name.
- table : string
Table name.
- Returns
- pytd.table.Table
-
load_table_from_dataframe(self, dataframe, destination, writer='bulk_import', if_exists='error', **kwargs)
Write a given DataFrame to a Treasure Data table.
This function may initialize a Writer instance. Note that, as a part of the initialization process for SparkWriter, the latest version of td-spark will be downloaded.
- Parameters
- dataframe : pandas.DataFrame
Data loaded to a target table.
- destination : string, or pytd.table.Table
Target table.
- writer : string, {‘bulk_import’, ‘insert_into’, ‘spark’}, or pytd.writer.Writer, default: ‘bulk_import’
Writer that determines how data is written to Treasure Data. If omitted or given as a string, a temporary Writer instance is created.
- if_exists : {‘error’, ‘overwrite’, ‘append’, ‘ignore’}, default: ‘error’
What happens when a target table already exists.
- error: raise an exception.
- overwrite: drop it, recreate it, and insert data.
- append: insert data. Create the table if it does not exist.
- ignore: do nothing.
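The `if_exists` modes can be read as a small decision table. The sketch below spells it out as a pure function and is not how pytd implements it: `plan_write` and `load` are hypothetical names, the step labels and the `RuntimeError` are illustrative choices, and treating a missing table as "create, then insert" in every mode is an assumption. `load` additionally assumes a live client, a pandas DataFrame, and that `get_table` can describe the target table.

```python
def plan_write(if_exists, table_exists):
    """Return the steps implied by ``if_exists`` as a list of labels."""
    if if_exists not in ("error", "overwrite", "append", "ignore"):
        raise ValueError(f"invalid if_exists: {if_exists!r}")
    if not table_exists:
        # Assumption: a missing table is created and filled in every mode.
        return ["create", "insert"]
    if if_exists == "error":
        # The exception type is an illustrative choice.
        raise RuntimeError("target table already exists")
    if if_exists == "overwrite":
        return ["drop", "create", "insert"]
    if if_exists == "append":
        return ["insert"]
    return []  # ignore: leave the existing table untouched


def load(client, dataframe, database, table, if_exists="error"):
    """Hypothetical wrapper; needs a live pytd.Client and a pandas.DataFrame."""
    destination = client.get_table(database, table)
    client.load_table_from_dataframe(
        dataframe, destination, writer="bulk_import", if_exists=if_exists
    )
```

For example, `plan_write("overwrite", table_exists=True)` yields the drop, create, insert sequence described for that mode.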