By using JDBC, step by step
From the first moment that I used Google BigQuery, I felt the necessity to connect to a local IDE instead of using the Web Interface. Still, I gave the time to use it, thinking that maybe I am outdated because of many years of using professional tools like DataGrip, Toad, SQLDeveloper, SQL Server Management Studio, Workbench J, etc.
Maybe I am not patient enough, or I was right, but in any case, if any of the following questions is happening to you, perhaps this post is going to help you:
- Did you feel that the Google BigQuery Web Interface is making you slower?
- Did you have some weird error messages when you know your SQL is correct?
- Did you want to reuse your code and have a local file with SQLs?
- Did you lose using the tabs?
- When creating a new database object, did you need to refresh the entire page?
DBeaver is a free, popular, and open-source universal database tool for developers and database administrators.
I choose to download and install the DBeaver Community Edition at the following link:
To connect BigQuery using JDBC, you will need:
- Project: Project ID
- User: Service Account Email
- Key path: Service Account KEY file
The Project Name doesn’t work. You heard well that you are looking for the Project ID, and you can get it after creating a project and selecting the project name at the top up of the page like the following image:
You can use your Google account by selecting User-based Authentication, but when you close and open the IDE, it will ask you to log in again. And as best practice, It is recommended to use a Service Account.
If you don’t know how to create a Service Account, you can check my previous post at https://towardsdatascience.com/run-bigquery-sql-using-python-api-client-b5287ac05b99
In DBeaver newer version is not requesting the user, but in case you are using an older version, be aware that the user field is expecting the service account email. You can get in at IAM & Admin → Service Account:
The KEY path refers to the JSON file generated when you created the KEY. For more information, you can check my previous post at https://towardsdatascience.com/run-bigquery-sql-using-python-api-client-b5287ac05b99
We are going to connect to BigQuery by going to File → New and selecting Database Connection:
Then write bigquery in the search bar, select All at left, the Google BigQuery icon, and the Next button:
For the connection setting, enter the Project ID, select Service Base, the path for the KEY JSON file, and in the case is asking the user field, then use the Service Account Email.
Then select the button Test Connection at the left and download the Simba JDBC Driver:
We are going to use the USA Names to Test the Connection. To add the Public Dataset, follow the steps of my previous post at https://towardsdatascience.com/run-bigquery-sql-using-python-api-client-b5287ac05b99
Use the following SQL query and press the play orange triangle at the left:
SELECT name, SUM(number) as total_people
WHERE state = 'TX'
GROUP BY name, state
ORDER BY total_people DESC
You will get the following result:
There are two ways to connect to Google BigQuery with a local IDE, one is using ODBC, and the other is with JDBC.
I found that connecting with JDBC can be confusing, and It can take too much time if you don’t know what to do, which is why I explained how to configure it, and I am pretty sure the same procedure can work with other Database Tool that supports it.
Be aware that one important thing you will lose is the visibility of the volume to be scanned:
To my wife Diana, who has accompanied, endured, and pushed me to continue sharing.