
Commit 0a93a47

Adds a sample which demonstrates how to publish a plain multi-table .hyper file to Tableau Server (without wrapping it into a TDSX file). This is supported in Tableau Online/Tableau Server starting with version 2021.4. Renamed the old sample which demonstrates how to wrap a multi-table extract into a TDSX file.
1 parent 5fb89d6 commit 0a93a47

9 files changed: +374 -176 lines changed
Community-Supported/README.md

Lines changed: 2 additions & 0 deletions
@@ -25,6 +25,8 @@ The community samples focus on individual use cases and are Python-only. They ha
 - [__publish-hyper__](https://github.com/tableau/hyper-api-samples/tree/main/Community-Supported/publish-hyper)
   - Simple example of publishing single-table `.hyper` file.
 - [__publish-multi-table-hyper__](https://github.com/tableau/hyper-api-samples/tree/main/Community-Supported/publish-multi-table-hyper)
+  - Demonstrates how to create a multi-table `.hyper` file and publish it to Tableau Server version 2021.4+.
+- [__publish-multi-table-hyper-legacy__](https://github.com/tableau/hyper-api-samples/tree/main/Community-Supported/publish-multi-table-hyper-legacy)
   - Demonstrates the full end-to-end workflow of how to create a multi-table `.hyper` file, place the extract into a `.tdsx`, and publish to Tableau Online or Server.
 - [__s3-to-hyper__](https://github.com/tableau/hyper-api-samples/tree/main/Community-Supported/s3-to-hyper)
   - Demonstrates how to create a `.hyper` file from a wildcard union on text files held in an AWS S3 bucket. The extract is then placed in a `.tdsx` file and published to Tableau Online or Server.
Lines changed: 71 additions & 0 deletions
@@ -0,0 +1,71 @@
# publish-multi-table-hyper-legacy
## __Publishing a Multi-Table Hyper File to Tableau Online/Server (for Tableau Server versions < 2021.4)__

![Community Supported](https://img.shields.io/badge/Support%20Level-Community%20Supported-53bd92.svg)

In contrast to single-table `.hyper` files, with multi-table `.hyper` files it is not obvious which data model you want to analyze in the data source on Tableau Server. Thus, when publishing to Tableau Server < 2021.4, you can publish only single-table `.hyper` files directly; a multi-table `.hyper` file must be wrapped into a Packaged Data Source (.tdsx). Starting with version 2021.4, you can publish multi-table `.hyper` files to Tableau Server and Tableau Online; the data model will be automatically inferred (as specified by assumed table constraints).

This sample demonstrates how to wrap the `.hyper` file into a Packaged Data Source (.tdsx). If you are looking for how to publish a plain multi-table `.hyper` file to Tableau Server 2021.4+, have a look at [this sample](https://github.com/tableau/hyper-api-samples/tree/main/Community-Supported/publish-multi-table-hyper).

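The version cutoff described above can be expressed as a small check that picks between the two samples. This is only a sketch: `supports_plain_multi_table` is a hypothetical helper that is not part of this sample, and it assumes the server version is available as a `major.minor[.patch]` string such as `2021.4.2`.

```python
def supports_plain_multi_table(product_version: str) -> bool:
    """Return True if the given version string is 2021.4 or later.

    Hypothetical helper: assumes a 'major.minor[.patch]' version string.
    """
    parts = product_version.split(".")
    major, minor = int(parts[0]), int(parts[1])
    # Tuple comparison handles both the year and the release within the year.
    return (major, minor) >= (2021, 4)


if __name__ == "__main__":
    for version in ["2021.3.5", "2021.4", "2022.1.0"]:
        if supports_plain_multi_table(version):
            approach = "publish the plain .hyper file"
        else:
            approach = "wrap the .hyper file into a .tdsx"
        print(f"{version}: {approach}")
```

For versions below the cutoff, the workflow in this sample applies; otherwise the non-legacy sample linked above is the simpler route.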
This sample demonstrates how to leverage the Hyper API, Tableau Server Client Library, and Tableau Tools to do the following:
- Create a multi-table `.hyper` file
- Swap the newly created extract into an existing Packaged Data Source file (.tdsx)
- Publish the data source to a specified project on Tableau Online/Server

It should serve as a starting point for anyone looking to automate the publishing process of multi-table extracts and data sources to Tableau.

# Get started

## __Prerequisites__
To run the script, you will need:
- Windows or Mac
- Tableau Desktop v10.5 or higher
- Python 3
- The Python packages listed in `requirements.txt` (run `pip install -r requirements.txt`)
- Tableau Online/Server credentials or Personal Access Token

## __Configuration File__
Modify `config.json` and fill in the following fields:
- Name of the `.hyper` file (`hyper_name`)
- Name of the .tdsx file (`tdsx_name`)
- Server/Online URL (`server_address`)
- Site name (`site_name`)
- Project name (`project_name`)
- Authentication information (`tableau_token_name` and `tableau_token`)

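Because a missing or empty field only surfaces later as a sign-in or publishing error, it can help to validate the configuration up front. The sketch below is not part of the sample script; `missing_config_keys` is a hypothetical helper, and the key names are the ones used in the sample `config.json`.

```python
import json

# Keys taken from the sample config.json shipped with this sample.
REQUIRED_KEYS = [
    "hyper_name",
    "tdsx_name",
    "server_address",
    "site_name",
    "project_name",
    "tableau_token_name",
    "tableau_token",
]


def missing_config_keys(config: dict) -> list:
    """Return the required keys that are absent or have an empty value."""
    return [key for key in REQUIRED_KEYS if not config.get(key)]


if __name__ == "__main__":
    # Inline example; in practice this would be the parsed config.json.
    example = json.loads('{"hyper_name": "extract.hyper", "tdsx_name": ""}')
    print("Missing or empty fields:", missing_config_keys(example))
```

Note that this check treats an empty string as missing; if an empty value is meaningful for one of your fields, relax the check for that key.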
## __Data and Table Definitions__
If you simply want to run the sample to test the publishing process, you do not need to make any changes to the Python file. Ensure that you have installed the requirements, update the config file with authentication information, and execute the Python file.

Once you are ready to use your own data, you will need to change the `build_tables()` and `get_data()` functions. `build_tables()` returns the schema of all tables to be created (an array of `TableDefinition` objects, one for each table). `get_data()` returns the data to be inserted (an array of arrays, one for each table, each holding that table's rows). Those functions could be part of an existing ETL workflow, grab the data from an API request, or pull CSVs from cloud storage like AWS, Azure, or GCP. In any case, writing that code is up to you. See [this doc](https://help.tableau.com/current/api/hyper_api/en-us/reference/py/tableauhyperapi.html?tableauhyperapi.Inserter) for information on how to pass data to Hyper's `Inserter` class and [this doc](https://help.tableau.com/current/api/hyper_api/en-us/reference/py/tableauhyperapi.html?tableauhyperapi.SqlType) for more information on the Hyper API's `SqlType` class.

__Note:__ The current example features two tables, but in theory the same approach supports as many tables as you like. Just be sure to add the proper table definitions and make sure that the order of the table data matches the order of the table definitions.

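Since the table data and the table definitions are matched purely by position, a mismatch is easy to introduce when adding tables. The following sketch shows one way to catch it before inserting anything. `check_table_shapes` is a hypothetical helper, not part of the sample, and it works on plain column counts so that it runs without the Hyper API installed.

```python
def check_table_shapes(table_data, column_counts):
    """Raise ValueError if tables or rows do not line up with the definitions.

    table_data:    list of tables, each a list of rows (as returned by get_data()).
    column_counts: expected number of columns per table, in the same order.
    """
    if len(table_data) != len(column_counts):
        raise ValueError(
            f"{len(table_data)} data tables but {len(column_counts)} table definitions"
        )
    for table_index, (rows, expected) in enumerate(zip(table_data, column_counts)):
        for row_index, row in enumerate(rows):
            if len(row) != expected:
                raise ValueError(
                    f"table {table_index}, row {row_index}: "
                    f"expected {expected} values, got {len(row)}"
                )


if __name__ == "__main__":
    # Same shape as the sample data: a 4-column and a 3-column table.
    sales = [[123, 40, "John", "Order#1"], [456, 110, "John", "Order#3"]]
    products = [[123, "Lemonade", "Beverage"], [456, "Cookie", "Food"]]
    check_table_shapes([sales, products], [4, 3])
    print("Table data matches the definitions.")
```

With the real definitions, the column counts could presumably be derived as `[len(d.columns) for d in build_tables()]`; that wiring is an assumption and is not exercised here. The check covers only counts, not column types.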
## __Creating the .tdsx File__
As mentioned, one key step needed for the automatic publishing of multi-table `.hyper` files is a Packaged Data Source, or .tdsx. As of now, this is a step that must be completed manually as a part of the setup process. _You will only need to do this once_. If a .tdsx is not present in the directory, the script will prompt you to create one. At this point, you should have entered the required config fields and have run the Python script once to create the multi-table `.hyper` file.

Packaged Data Sources contain important metadata needed for Tableau Desktop and Server/Online. This includes things like defined joins and join clauses, relationships, calculated fields, and more.

To create the .tdsx, [follow these steps](https://help.tableau.com/current/pro/desktop/en-us/export_connection.htm):
- Run the script without a data source present to create the initial `.hyper` file
- Double-click the `.hyper` file to open it in Tableau Desktop
- Click and drag the relevant tables and create the joins or relationships
- Head to 'Sheet 1'
- In the top-left corner, right-click on the data source and select 'Add to Saved Data Sources...'
- Name the file to match the value in `config.json`
- Select 'Tableau __Packaged__ Data Source (*.tdsx)' from the dropdown
- Save it in the directory with the script and `.hyper` file

Now you are free to rerun the script and validate the swapping and publishing process. Unless you change how the `.hyper` file is being created (schema, column names, joins, etc.), you will not need to remake the .tdsx.

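A `.tdsx` file is a zip archive, so you can check that the file saved from Tableau Desktop really contains an extract before running the swap. The sketch below uses only the standard library. `hyper_files_in_package` is a hypothetical helper, and the file names in the demo are illustrative rather than taken from a real package.

```python
import os
import tempfile
import zipfile


def hyper_files_in_package(tdsx_path):
    """Return the names of all .hyper files inside a .tdsx (zip) package."""
    with zipfile.ZipFile(tdsx_path) as package:
        return [name for name in package.namelist() if name.lower().endswith(".hyper")]


if __name__ == "__main__":
    # Build a throwaway package with illustrative contents to demonstrate the check.
    with tempfile.TemporaryDirectory() as folder:
        demo_path = os.path.join(folder, "demo.tdsx")
        with zipfile.ZipFile(demo_path, "w") as package:
            package.writestr("demo.tds", "<datasource/>")
            package.writestr("Data/Extracts/extract.hyper", b"placeholder")
        print(hyper_files_in_package(demo_path))
```

An empty result suggests the data source was saved as a `.tds` rather than a packaged `.tdsx`, or without an extract.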
## __Additional Customization__
If you end up needing to change more about how the extract is built (e.g., inserting directly from a CSV file), then you will also need to change the `add_to_hyper()` function, but most likely nothing else.

Leverage the [official Hyper API samples](https://github.com/tableau/hyper-api-samples/tree/master/Python) to learn more about what's possible.


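For small CSV files, a lighter alternative to changing `add_to_hyper()` is to keep it as-is and have `get_data()` read the rows from the CSV. The sketch below is one possible approach under that assumption. `rows_from_csv` is a hypothetical helper, and the converters must be listed in the same column order as the matching table definition.

```python
import csv
import io


def rows_from_csv(csv_file, converters):
    """Read a CSV with a header row into a list of typed rows.

    converters: one callable per column (e.g. int, str), in column order.
    """
    reader = csv.reader(csv_file)
    next(reader, None)  # Skip the header row, if there is one.
    return [
        [convert(value) for convert, value in zip(converters, row)]
        for row in reader
    ]


if __name__ == "__main__":
    # Inline stand-in for a file opened with open(path, newline="").
    sample = io.StringIO(
        "Product Key,Product Name,Category\n"
        "123,Lemonade,Beverage\n"
        "456,Cookie,Food\n"
    )
    print(rows_from_csv(sample, [int, str, str]))
```

The resulting list has the same shape as one entry of the array returned by `get_data()`. For large files, loading every row into memory this way may not be appropriate.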
## __Resources__
Check out these resources to learn more:
- [Hyper API docs](https://help.tableau.com/current/api/hyper_api/en-us/index.html)
- [TSC Docs](https://tableau.github.io/server-client-python/docs/)
- [REST API docs](https://help.tableau.com/current/api/rest_api/en-us/REST/rest_api.htm)
- [Tableau Tools](https://github.com/bryantbhowell/tableau_tools)
Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
{
    "hyper_name": "extract.hyper",
    "tdsx_name": "test.tdsx",
    "server_address": "my.tableau.server",
    "site_name": "my_site",
    "project_name": "my_project",
    "tableau_token_name": "token_name",
    "tableau_token": "token_goes_here"
}
Lines changed: 169 additions & 0 deletions
@@ -0,0 +1,169 @@
# -----------------------------------------------------------------------------
#
# This file is the copyrighted property of Tableau Software and is protected
# by registered patents and other applicable U.S. and international laws and
# regulations.
#
# You may adapt this file and modify it to fit into your context and use it
# as a template to start your own projects.
#
# -----------------------------------------------------------------------------

from tableauhyperapi import *
from tableau_tools import *
from tableau_tools.tableau_documents import *
import tableauserverclient as TSC
import os, json, sys

def get_data():
    '''This function is responsible for returning the two tables as nested arrays, as shown with the example below.'''

    # Sample data held as arrays. Column order must be consistent and match the table definitions defined below.
    table_one = [[123, 40, 'John', 'Order#1'], [123, 90, 'Jane', 'Order#2'], [456, 110, 'John', 'Order#3'], [456, 80, 'Jane', 'Order#4']]
    table_two = [[123, 'Lemonade', 'Beverage'], [456, 'Cookie', 'Food']]

    # Create an array of the data_tables to pass to Hyper
    table_data = [table_one, table_two]

    return table_data

def build_tables():
    '''Builds the two tables for the multitable extract.'''
    # Since the table names are not prefixed with an explicit schema name, the tables will reside in the default "public" namespace.
    # It is important to match the order of the table definitions with the data tables returned in get_data()
    table_one = TableDefinition(
        table_name="sales",
        columns=[
            TableDefinition.Column("Product Key", SqlType.int()),
            TableDefinition.Column("Sales", SqlType.int()),
            TableDefinition.Column("Customer", SqlType.text()),
            TableDefinition.Column("Order ID", SqlType.text())
        ]
    )
    table_two = TableDefinition(
        table_name="products",
        columns=[
            TableDefinition.Column("Product Key", SqlType.int()),
            TableDefinition.Column("Product Name", SqlType.text()),
            TableDefinition.Column("Category", SqlType.text())
        ]
    )
    table_definitions = [table_one, table_two]
    return table_definitions


def load_config():
    '''Loads a config file in the current directory called config.json.'''

    # Opens the config file and loads as a dictionary.
    try:
        with open('config.json', 'r') as f:
            config = json.load(f)
            print("Config file loaded.")
            return config
    except (OSError, json.JSONDecodeError) as e:
        # Covers a missing or unreadable file as well as invalid JSON.
        message = 'Could not read config file.'
        print("Unexpected error: ", e)
        sys.exit(message)


def add_to_hyper(table_data, table_definitions, hyper_name):
    '''Uses the Hyper API to build and insert data into the Hyper file.'''

    # Starts the Hyper Process with telemetry enabled to send data to Tableau.
    # To opt out, simply set telemetry=Telemetry.DO_NOT_SEND_USAGE_DATA_TO_TABLEAU.
    print("Starting Hyper process.")
    with HyperProcess(telemetry=Telemetry.SEND_USAGE_DATA_TO_TABLEAU) as hyper:

        # Creates new Hyper file "[hyper_name].hyper".
        # Replaces file with CreateMode.CREATE_AND_REPLACE if it already exists.
        print("Opening connection to Hyper file.")
        with Connection(endpoint=hyper.endpoint, database=hyper_name, create_mode=CreateMode.CREATE_AND_REPLACE) as connection:

            # Creates multiple tables.
            for data, definition in zip(table_data, table_definitions):
                connection.catalog.create_table(definition)
                print(f"Creating table {definition.table_name} in Hyper...")

                # Inserts data into table.
                with Inserter(connection, definition) as inserter:
                    print(f"Inserting {len(data)} rows into table {definition.table_name}...")
                    inserter.add_rows(data)
                    inserter.execute()

        print("The connection to the Hyper file has been closed.")
    print("The Hyper process has been shut down.")


def swap_hyper(hyper_name, tdsx_name, logger_obj=None):
    '''Uses tableau_tools to open a local .tdsx file and replace the hyperfile.'''

    # Checks to see if TDSX exists, otherwise, as a one-time step, user will need to create using Desktop.
    if os.path.exists(tdsx_name):
        print("Found TDSX file.")
    else:
        message = "--Could not find existing TDSX file. Please use Desktop to create one from the newly created hyper file or update the config file.--"
        sys.exit(message)

    # Uses tableau_tools to replace the hyper file in the TDSX.
    try:
        local_tds = TableauFileManager.open(filename=tdsx_name, logger_obj=logger_obj)
    except TableauException as e:
        sys.exit(e)
    filenames = local_tds.get_filenames_in_package()
    for filename in filenames:
        if '.hyper' in filename:
            print("Overwriting Hyper in original TDSX...")
            local_tds.set_file_for_replacement(filename_in_package=filename,
                                               replacement_filname_on_disk=hyper_name)
            break

    # Overwrites the original TDSX file locally.
    tdsx_name_before_extension, tdsx_name_extension = os.path.splitext(tdsx_name)
    tdsx_updated_name = tdsx_name_before_extension + '_updated' + tdsx_name_extension
    local_tds.save_new_file(new_filename_no_extension=tdsx_updated_name)
    os.remove(tdsx_name)
    os.rename(tdsx_updated_name, tdsx_name)


def publish_to_server(site_name, server_address, project_name, tdsx_name, tableau_token_name, tableau_token):
    '''Publishes updated, local .tdsx to Tableau, overwriting the original file.'''

    # Creates the auth object based on the config file.
    tableau_auth = TSC.PersonalAccessTokenAuth(
        token_name=tableau_token_name, personal_access_token=tableau_token, site_id=site_name)
    server = TSC.Server(server_address)
    print(f"Signing in to site: {site_name}.")

    # Signs in and finds the specified project.
    with server.auth.sign_in(tableau_auth):
        # Initialized up front so that a missing project is reported cleanly
        # instead of raising an error on an unassigned variable.
        project_id = None
        for project in TSC.Pager(server.projects):
            if project.name == project_name:
                project_id = project.id
                break
        if project_id is None:
            message = "Could not find project. Please update the config file."
            sys.exit(message)
        print(f"Publishing to {project_name}.")

        # Publishes the data source.
        overwrite_true = TSC.Server.PublishMode.Overwrite
        datasource = TSC.DatasourceItem(project_id)
        file_path = os.path.join(os.getcwd(), tdsx_name)
        datasource = server.datasources.publish(
            datasource, file_path, overwrite_true)
        print(f"Publishing of datasource '{tdsx_name}' complete.")


# Run
if __name__ == '__main__':
    config = load_config()

    try:
        add_to_hyper(get_data(), build_tables(), config['hyper_name'])
    except HyperException as e:
        sys.exit(e)

    swap_hyper(config['hyper_name'], config['tdsx_name'])
    publish_to_server(config['site_name'], config['server_address'], config['project_name'],
                      config['tdsx_name'], config['tableau_token_name'], config['tableau_token'])
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
tableauhyperapi==0.0.10899
tableauserverclient==0.10
tableau_tools==5.1.3
