Tableau Admin

Login
Insights Home Tableau Admin Home

Automating Your Extract Refresh Process

I’ve spoken with quite a few people who run into this problem so hopefully this will be helpful to you.

Setting Up the Refresh Job

The key to solving the issue is to set up a tableauserverclient extract refresh job as the final step in your source refresh process.

How It Works

Step 1: SQL Server uses xpcmd to execute a .bat file.

Step 2: The .bat file contains the command to run a Python script.

Step 3: The Python script kicks off the extract refresh job using tableauserverclient, ensuring that your Tableau extracts are always up-to-date.

Running the Refresh Job with a Batch File

To automate the execution of your Python script, create a batch file (.bat) with the following content. This file, when executed, will run the Python script that initiates the extract refresh:

@echo off
REM Run the Python script to kick off the Tableau extract refresh job
python "C:\path\to\your_extract_refresh_script.py"

Python Script to Kick Off the Extract Refresh Job

Below is an example Python script that searches for a datasource by a name, retrieves its ID, and then triggers the extract refresh job. This script uses environment variables and the tableauserverclient library:


            import os
            import tableauserverclient as TSC
            from dotenv import load_dotenv
            
            load_dotenv()  # Load environment variables from .env file
            
            SERVER_URL = os.getenv('TABLEAU_SERVER_URL')
            PAT_NAME = os.getenv('TABLEAU_PAT_NAME')
            PAT_SECRET = os.getenv('TABLEAU_PAT_SECRET')
            SITE_ID = os.getenv('TABLEAU_SITE_ID')
            
            DATASOURCE_NAME = 'rocket_launch'  # Partial name of the datasource to search for
            
            tableau_auth = TSC.PersonalAccessTokenAuth(PAT_NAME, PAT_SECRET, SITE_ID)
            server = TSC.Server(SERVER_URL, use_server_version=True)
            
            with server.auth.sign_in(tableau_auth):
                # Create a RequestOptions object to filter datasources by partial name
                options = TSC.RequestOptions()
                options.filter.add(TSC.Filter(TSC.RequestOptions.Field.Name, TSC.RequestOptions.Operator.Equals, DATASOURCE_NAME))
                
                all_datasources, pagination_item = server.datasources.get(req_options=options)
                
                if all_datasources:
                    # Select the first matching datasource
                    datasource = all_datasources[0]
                    # Kick off the extract refresh job for the selected datasource
                    job = server.datasources.refresh(datasource)
                    print(f"Started extract refresh job for datasource: {datasource.name}")
                else:
                    print("No datasource found with the provided partial name.")
            

The Benefits

By integrating this automated step, you eliminate mismatches between the source refresh times and the static extract refresh times—resulting in more reliable and consistent data updates.

Conclusion

Incorporating an extract refresh job at the end of your source refresh process not only streamlines your workflow but also boosts data accuracy. This simple yet effective setup can save time and reduce manual oversight, making your Tableau environment more efficient.