How can I use Google BigQuery and Hadoop together to analyze data?
Google BigQuery and Hadoop can be used together by exporting a BigQuery table to Google Cloud Storage, copying the exported file into the Hadoop Distributed File System (HDFS), and then running a Hadoop job on it. For example, the following script exports a table, copies it into HDFS, and runs a Hadoop streaming job on the result:
#!/bin/bash
# Use the bq command line tool to export a table from BigQuery to Google Cloud Storage
bq extract --destination_format=NEWLINE_DELIMITED_JSON <your_table> gs://<your_bucket>/<your_file>.json
# Use the hadoop fs command to copy the file from Google Cloud Storage into HDFS
# (gs:// paths only resolve if the Cloud Storage connector is installed on the cluster;
# -copyFromLocal would not work here because the source is not a local file)
hadoop fs -cp gs://<your_bucket>/<your_file>.json <your_file>.json
# Use hadoop streaming to run analytics on the data
# (-files is a generic option and must come before the streaming options)
hadoop jar /usr/lib/hadoop-mapreduce/hadoop-streaming.jar -files <your_mapper>.py,<your_reducer>.py -mapper <your_mapper>.py -reducer <your_reducer>.py -input <your_file>.json -output <your_output>
Code explanation
- bq extract: exports data from BigQuery to Google Cloud Storage
- hadoop fs: copies data from Google Cloud Storage to the Hadoop Distributed File System
- hadoop streaming: runs analytics on the data
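The script above leaves the mapper and reducer as placeholders. As a rough illustration only, here is a minimal sketch of what they might look like for newline-delimited JSON input, assuming you want to count records per value of a single field. The field name `category` and the function names `map_lines` and `reduce_lines` are hypothetical, not part of the original answer.

```python
#!/usr/bin/env python3
"""Hypothetical streaming job: count records per value of one JSON field."""
import json
import sys
from itertools import groupby

FIELD = "category"  # hypothetical column name; replace with one from your table


def map_lines(lines, field=FIELD):
    """Yield 'key<TAB>1' for each newline-delimited JSON record."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed rows instead of failing the whole task
        if not isinstance(record, dict):
            continue  # each exported row is expected to be a JSON object
        yield f"{record.get(field, 'UNKNOWN')}\t1"


def reduce_lines(lines):
    """Sum the counts per key; streaming delivers reducer input sorted by key."""
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines if line.strip())
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        yield f"{key}\t{sum(int(count) for _, count in group)}"


if __name__ == "__main__" and len(sys.argv) > 1:
    # "map" selects the mapper; any other argument selects the reducer.
    step = map_lines if sys.argv[1] == "map" else reduce_lines
    for out in step(sys.stdin):
        print(out)
```

If both steps live in one file like this (say, a hypothetical `count_by_field.py`), the streaming job could ship it with `-files count_by_field.py` and use `-mapper "python3 count_by_field.py map"` and `-reducer "python3 count_by_field.py reduce"`, assuming `python3` is available on the worker nodes.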