GCP BigQuery
What is Big Query:
Big Query is Google Cloud’s fully managed and serverless database
Scalability: BigQuery is a highly scalable data warehouse that can handle petabyte-scale data. It is designed to accommodate growing data volumes without the need for manual intervention or performance tuning.
Speed: BigQuery offers lightning-fast query performance, as it’s a distributed architecture and columnar storage format.
Cost-effective: BigQuery is a fully managed, serverless platform that eliminates the need for infrastructure management. It allows businesses to pay only for what they use, which makes it an affordable option for organizations of all sizes.
Integration: BigQuery integrates seamlessly with other Google Cloud Platform (GCP) services, such as Google Data Studio and Google Cloud Storage, making it easy to build end-to-end analytics pipelines.
Big Query Features:
Serverless and easy to setup, No infra
Realtime data Analytics — Live stream from different sources like SaaS, IoT, Data flow into BigQuery
High Availability — 99.99% commit by google
Standard SQL
Flexible pricing models
Public datasets
Easy operations and maintenance
Big Query Pricing
Demo Queries:
============================
SELECT
state,
SUM(confirmed_cases) AS total_cases,
SUM(deaths) AS total_deaths
FROM
`bigquery-public-data.covid19_usafacts.summary`
GROUP BY
state
ORDER BY
total_cases DESC
LIMIT
10
============================
SELECT
tag_name,
SUM(view_count) AS total_views
FROM
`bigquery-public-data.stackoverflow.posts_questions`
JOIN UNNEST(SPLIT(tags, ‘|’)) AS tag_name
WHERE
tag_name IN (‘java’, ‘python’, ‘javascript’, ‘c#’, ‘php’, ‘html’, ‘css’, ‘mysql’, ‘android’, ‘jquery’)
GROUP BY
tag_name
ORDER BY
total_views DESC
LIMIT
10