I have released a free, public seven-part article series on Medium.com, “How to build a modern data platform on the free tier of Google Cloud Platform”. The lead article is available at: https://medium.com/@markwkiehl/building-a-data-platform-on-gcp-0427500f62e8
Part One “Building a Data Platform on GCP” defined the functional requirements and detailed how to install the required software.
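As a quick sanity check after installation, something like the following (assuming the tools were added to your PATH) verifies the toolchain the series relies on:

```sh
# Confirm the required tooling is installed and on the PATH.
python --version
docker --version
gcloud --version

# Initialize the Google Cloud CLI and choose a default project (interactive).
gcloud init
```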
Part Two “GCP Infrastructure & Authentication” explained how to use Google Application Default Credentials (ADC) to authenticate as a user-managed service account.
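A minimal sketch of what ADC looks like from Python, assuming the GOOGLE_APPLICATION_CREDENTIALS environment variable points at the service account's JSON key file (the file name below is hypothetical):

```python
# Minimal ADC sketch: google.auth.default() discovers credentials from the
# GOOGLE_APPLICATION_CREDENTIALS environment variable (or from gcloud's
# cached application default credentials) without any key handling in code.
import os

import google.auth

# Hypothetical path to a user-managed service account key file.
os.environ.setdefault("GOOGLE_APPLICATION_CREDENTIALS", "svc-acct-key.json")

credentials, project_id = google.auth.default()
print(f"Authenticated to project: {project_id}")
```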
Part Three “Google Cloud Pub/Sub Messaging” showed how to use a Python script to publish messages to, and subscribe to, the Google Cloud Pub/Sub messaging service.
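A rough sketch of that publish/subscribe pattern using the google-cloud-pubsub client library (the project, topic, and subscription names below are placeholders):

```python
# Publish a message to a topic, then pull messages from a subscription.
# Requires: pip install google-cloud-pubsub
from concurrent.futures import TimeoutError

from google.cloud import pubsub_v1

PROJECT_ID = "my-gcp-project"        # placeholder
TOPIC_ID = "my-topic"                # placeholder
SUBSCRIPTION_ID = "my-subscription"  # placeholder

# Publish a message.
publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path(PROJECT_ID, TOPIC_ID)
future = publisher.publish(topic_path, data=b"hello from the data platform")
print(f"Published message id: {future.result()}")

# Subscribe via streaming pull with a callback.
subscriber = pubsub_v1.SubscriberClient()
subscription_path = subscriber.subscription_path(PROJECT_ID, SUBSCRIPTION_ID)

def callback(message: pubsub_v1.subscriber.message.Message) -> None:
    print(f"Received: {message.data!r}")
    message.ack()

streaming_pull_future = subscriber.subscribe(subscription_path, callback=callback)
with subscriber:
    try:
        streaming_pull_future.result(timeout=30)  # listen for 30 seconds
    except TimeoutError:
        streaming_pull_future.cancel()
```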
Part Four “Containerization using Docker” covered how to build a local Docker image for a Python script, run it locally, and then push it to a Google Artifact Registry repository.
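The general shape of that workflow from the command line looks roughly like this (the project, region, repository, and image names are placeholders, and an Artifact Registry repository is assumed to already exist):

```sh
# Build the image locally from a Dockerfile in the current directory.
docker build -t my-image:latest .

# Run it locally to verify the container works.
docker run --rm my-image:latest

# Allow Docker to authenticate to Artifact Registry in the chosen region.
gcloud auth configure-docker us-central1-docker.pkg.dev

# Tag and push the image to the Artifact Registry repository.
docker tag my-image:latest us-central1-docker.pkg.dev/my-gcp-project/my-repo/my-image:latest
docker push us-central1-docker.pkg.dev/my-gcp-project/my-repo/my-image:latest
```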
Part Five “Google Cloud Run Jobs & Scheduler” demonstrated how to use the Google Cloud CLI to configure Cloud Run Jobs and Cloud Scheduler jobs that execute a containerized Python script stored in Google Artifact Registry on a specified interval, from any Google Cloud region.
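A sketch of that CLI configuration (the job names, image path, schedule, and service account email are placeholders; the Scheduler target URI follows the documented pattern for triggering a Cloud Run job over HTTP):

```sh
# Create a Cloud Run job from the image pushed to Artifact Registry.
gcloud run jobs create my-job \
  --image=us-central1-docker.pkg.dev/my-gcp-project/my-repo/my-image:latest \
  --region=us-central1

# Run it once manually to verify it executes.
gcloud run jobs execute my-job --region=us-central1

# Schedule the job hourly via Cloud Scheduler, authenticating as a service
# account that has permission to invoke the Cloud Run job.
gcloud scheduler jobs create http my-job-schedule \
  --location=us-central1 \
  --schedule="0 * * * *" \
  --http-method=POST \
  --uri="https://us-central1-run.googleapis.com/apis/run.googleapis.com/v1/namespaces/my-gcp-project/jobs/my-job:run" \
  --oauth-service-account-email="scheduler-sa@my-gcp-project.iam.gserviceaccount.com"
```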
Part Six “Google BigQuery Cloud Database” set up a Google BigQuery dataset and table using the Google Cloud CLI, and then used a Python script to write and query data with SQL.
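In Python, writing to and querying such a table looks roughly like this (the table ID and column names are placeholders; requires the google-cloud-bigquery package):

```python
# Write rows to a BigQuery table and read them back with SQL.
# Requires: pip install google-cloud-bigquery
from google.cloud import bigquery

client = bigquery.Client()  # uses ADC for authentication

TABLE_ID = "my-gcp-project.my_dataset.sensor_data"  # placeholder

# Stream a couple of rows into the table.
rows = [
    {"sensor": "temp-01", "value": 21.5},
    {"sensor": "temp-02", "value": 19.8},
]
errors = client.insert_rows_json(TABLE_ID, rows)
if errors:
    raise RuntimeError(f"Insert failed: {errors}")

# Query the table back with standard SQL.
query = f"SELECT sensor, value FROM `{TABLE_ID}` ORDER BY sensor"
for row in client.query(query).result():
    print(row["sensor"], row["value"])
```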
Part Seven “Google Cloud Analytics” explored how to extract data from a Google BigQuery table, load it into a Pandas DataFrame, and perform analysis and visualization, all from a Python script.
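A minimal sketch of that extraction step, assuming the same placeholder table as above and the db-dtypes package that to_dataframe() depends on:

```python
# Pull BigQuery query results into a Pandas DataFrame and plot them.
# Requires: pip install google-cloud-bigquery db-dtypes matplotlib
import matplotlib.pyplot as plt
from google.cloud import bigquery

client = bigquery.Client()

# Placeholder table from the earlier parts of the series.
query = """
    SELECT sensor, value
    FROM `my-gcp-project.my_dataset.sensor_data`
"""
df = client.query(query).to_dataframe()

print(df.describe())  # quick summary statistics

# Simple visualization of the values by sensor.
df.groupby("sensor")["value"].mean().plot(kind="bar", title="Mean value by sensor")
plt.tight_layout()
plt.show()
```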