Google BigQuery

Overview

HockeyStack’s Google BigQuery integration lets you sync data from your BigQuery datasets and tables into HockeyStack. Once your tables follow the structure described below, you can analyze your BigQuery data within the HockeyStack platform.


Connecting HockeyStack to Google BigQuery

To connect HockeyStack to BigQuery, prepare a table with the required schema and provide secure access credentials, along with column descriptions for the data you plan to sync.


Step 1: Prepare the Table

Ensure your BigQuery table adheres to the following structure for seamless syncing:

Required Columns:

  • Timestamp: A column representing the timestamp of each record (e.g., event time or action date).

  • Identity (Email): A unique identifier (e.g., user email) to associate records with specific users or entities.

  • Other Action Data: Additional columns representing user actions, events, or attributes (e.g., page_views, purchase_amount).
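
The required columns above can be checked before you hand a table over. The following is a minimal sketch, not part of the HockeyStack integration: the function name, the default column names (timestamp, email), and the accepted type names are assumptions made for illustration.

```python
# Hypothetical pre-flight check for the required columns.
# Column names and accepted types are assumptions, not HockeyStack requirements.
TIMESTAMP_TYPES = {"TIMESTAMP", "DATETIME", "DATE"}


def check_required_columns(schema, timestamp_col="timestamp", identity_col="email"):
    """Return a list of problems; an empty list means the schema looks usable.

    `schema` maps column name -> BigQuery type name, e.g. {"email": "STRING"}.
    """
    problems = []

    # A timestamp column must exist and hold a date/time type.
    if timestamp_col not in schema:
        problems.append(f"missing timestamp column '{timestamp_col}'")
    elif schema[timestamp_col].upper() not in TIMESTAMP_TYPES:
        problems.append(f"'{timestamp_col}' should be a date/time type")

    # An identity column (here: email) must exist and be a string.
    if identity_col not in schema:
        problems.append(f"missing identity column '{identity_col}'")
    elif schema[identity_col].upper() != "STRING":
        problems.append(f"'{identity_col}' should be a STRING")

    # At least one further column should carry the action data.
    if not (set(schema) - {timestamp_col, identity_col}):
        problems.append("no action-data columns found")

    return problems


example_schema = {"timestamp": "TIMESTAMP", "email": "STRING", "page_views": "INT64"}
print(check_required_columns(example_schema))
```

An empty list means all three kinds of columns are present. Each entry in a non-empty list names one thing to fix.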

Incremental Sync Support:

  • HockeyStack uses an internal Last Sync Date to ingest new or updated rows.

  • Include an added_at or updated_at timestamp column if you sync historical data or backdate records. With it, HockeyStack can still identify rows as new or changed when their event timestamp is earlier than the Last Sync Date.

  • For partitioned tables, partition by date (e.g., DATE(event_time)) to make incremental syncs more efficient.
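
To illustrate why an added_at column matters for backdated rows, here is a small simulation of incremental selection. It is only a sketch: HockeyStack's actual sync logic is not described here, and the function name, row layout, and fallback behavior are assumptions.

```python
from datetime import datetime, timezone


def rows_to_sync(rows, last_sync, change_col="added_at", event_col="timestamp"):
    """Select rows considered new since `last_sync`.

    Assumption: compare on the change column when a row has one,
    otherwise fall back to the event timestamp.
    """
    selected = []
    for row in rows:
        marker = row.get(change_col) or row[event_col]
        if marker > last_sync:
            selected.append(row)
    return selected


def utc(year, month, day):
    return datetime(year, month, day, tzinfo=timezone.utc)


last_sync = utc(2024, 6, 1)
rows = [
    # New event, loaded after the last sync.
    {"email": "a@example.com", "timestamp": utc(2024, 6, 2), "added_at": utc(2024, 6, 2)},
    # Backdated event: old event time, but loaded after the last sync.
    {"email": "b@example.com", "timestamp": utc(2024, 1, 15), "added_at": utc(2024, 6, 3)},
    # Old event that was loaded before the last sync.
    {"email": "c@example.com", "timestamp": utc(2024, 5, 1), "added_at": utc(2024, 5, 1)},
]

with_added_at = rows_to_sync(rows, last_sync)
# Pointing change_col at a column that does not exist forces the fallback.
event_time_only = rows_to_sync(rows, last_sync, change_col="no_such_column")
```

In this example, with_added_at contains the rows for a@example.com and b@example.com, while event_time_only misses the backdated row for b@example.com.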


Step 2: Provide Access Credentials

Grant HockeyStack secure access to your BigQuery data using the following steps:

1. Create a Service Account:

  • In Google Cloud Console, create a dedicated service account for HockeyStack (e.g., hockeystack-sync).

  • Assign the BigQuery Data Viewer role to grant read-only access to the target dataset/table.

2. Share Dataset and Table Names:

  • Provide the Project ID, Dataset ID, and Table ID that HockeyStack should access.

3. Generate and Share a JSON Key:

  • Create a JSON key for the service account and securely share it with HockeyStack.

4. Provide Column Descriptions:

For each column in your table, include a brief description to help HockeyStack map and interpret your data. For example:

  • timestamp: The exact date/time of the event.

  • email: The user’s email address (unique identifier).

  • page_view_count: Total pages viewed by the user in the session.

  • added_at: Timestamp when the row was created in BigQuery.
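
One way to collect the table reference and column descriptions in a single place is sketched below. The function, the output layout, and the project, dataset, and table names are hypothetical; HockeyStack does not prescribe this format. The JSON key is deliberately left out, since it should travel through a separate secure channel.

```python
import json


def build_handoff(project_id, dataset_id, table_id, schema_columns, column_descriptions):
    """Bundle the table reference and column descriptions into one dict.

    Raises ValueError if any column in `schema_columns` has no description.
    """
    missing = [
        col for col in schema_columns
        if not column_descriptions.get(col, "").strip()
    ]
    if missing:
        raise ValueError("columns without a description: " + ", ".join(missing))

    return {
        # Fully qualified BigQuery table ID: project.dataset.table
        "table": f"{project_id}.{dataset_id}.{table_id}",
        "columns": {col: column_descriptions[col] for col in schema_columns},
    }


descriptions = {
    "timestamp": "The exact date/time of the event.",
    "email": "The user's email address (unique identifier).",
    "page_view_count": "Total pages viewed by the user in the session.",
    "added_at": "Timestamp when the row was created in BigQuery.",
}

handoff = build_handoff(
    "my-project", "analytics", "user_events",  # hypothetical IDs
    schema_columns=list(descriptions),
    column_descriptions=descriptions,
)
print(json.dumps(handoff, indent=2))
```

Raising an error on a missing description catches undocumented columns before the details are shared.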


Post-Setup Process

Once credentials and column details are shared, HockeyStack’s team will:

  1. Validate the connection to your BigQuery table.

  2. Configure the data sync on your preferred schedule (typically daily).

  3. Map your schema to HockeyStack’s analytics model for actionable insights.

After setup, your BigQuery data will flow into HockeyStack, enabling advanced tracking, funnel analysis, and customer journey insights.


Best Practices

  • Use partitioning and clustering in BigQuery to optimize query performance and reduce sync costs.

  • For large historical datasets, backfill data incrementally using the added_at column.

  • Avoid schema changes after setup. If a change is unavoidable, notify HockeyStack so the columns can be remapped.
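
As a sketch of the partitioning and clustering advice, the helper below builds a BigQuery CREATE TABLE statement that partitions by event date and clusters by email. The table name, column names, and the choice of clustering column are assumptions for illustration; adapt them to your own data.

```python
def partitioned_table_ddl(table, columns, partition_col, cluster_cols):
    """Build BigQuery DDL for a date-partitioned, clustered table.

    `columns` maps column name -> BigQuery type; `partition_col` must be
    a TIMESTAMP column, since it is wrapped in DATE().
    """
    column_lines = ",\n".join(f"  {name} {type_}" for name, type_ in columns.items())
    return (
        f"CREATE TABLE `{table}` (\n{column_lines}\n)\n"
        f"PARTITION BY DATE({partition_col})\n"
        f"CLUSTER BY {', '.join(cluster_cols)}"
    )


ddl = partitioned_table_ddl(
    "my-project.analytics.user_events",  # hypothetical table
    {
        "event_time": "TIMESTAMP",
        "email": "STRING",
        "page_views": "INT64",
        "added_at": "TIMESTAMP",
    },
    partition_col="event_time",
    cluster_cols=["email"],
)
print(ddl)
```

You can run the generated statement in the BigQuery console or with the bq command-line tool. Clustering on the identity column is only one reasonable option, so pick the columns your queries filter on most often.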

