{"id":192839,"date":"2021-08-07T15:33:36","date_gmt":"2021-08-07T22:33:36","guid":{"rendered":"https:\/\/m2msupport.net\/m2msupport\/?page_id=192839"},"modified":"2021-08-08T08:40:40","modified_gmt":"2021-08-08T15:40:40","slug":"setup-a-dataflow-to-write-topics-to-biqquery-in-google-cloud-platform-gcp","status":"publish","type":"page","link":"https:\/\/m2msupport.net\/m2msupport\/setup-a-dataflow-to-write-topics-to-biqquery-in-google-cloud-platform-gcp\/","title":{"rendered":"Set up a dataflow to write messages from pub\/sub topics to a BigQuery table in Google Cloud Platform (GCP)"},"content":{"rendered":"<p>Google Cloud Dataflow is a data processing service that can be used for streaming and batch applications. Users can set up pipelines in Dataflow to integrate and process large datasets.<\/p>\n<p>With pub\/sub, users can set up Dataflow pipelines to write messages from a pub\/sub topic or subscription to a BigQuery table.<\/p>\n<p>The <a href=\"https:\/\/m2msupport.net\/m2msupport\/download-iot-cloud-tester\/\">IoT Cloud Tester<\/a> application provides an easy interface to set up a dataflow that writes messages from a pub\/sub topic to a BigQuery table in Google Cloud Platform.<\/p>\n<h1>To set up a dataflow to write messages from a pub\/sub topic to a BigQuery table:<\/h1>\n<ul>\n<li>In the &#8216;Dataflow&#8217; tab, click the &#8216;Create Job&#8217; tab.<\/li>\n<li>Enter the job name.<\/li>\n<li>Select the &#8216;Pub\/Sub Topic to BigQuery&#8217; option.<\/li>\n<li>Get the list of topics and select one.<\/li>\n<li>Get the available Cloud Storage buckets for the project and select one.<\/li>\n<li>Enter the file name. Dataflow uses this path in the selected bucket as the job&#8217;s temporary location.<\/li>\n<li>Set up the dataset and table to be used to write the messages from the pub\/sub topic. 
Note that the table schema should match the structure of the pub\/sub topic messages.<\/li>\n<\/ul>\n<p><a href=\"https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/dataflow1.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-192933\" src=\"https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/dataflow1.png\" alt=\"\" width=\"849\" height=\"749\" srcset=\"https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/dataflow1.png 849w, https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/dataflow1-300x265.png 300w, https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/dataflow1-768x678.png 768w, https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/dataflow1-600x529.png 600w\" sizes=\"auto, (max-width: 849px) 100vw, 849px\" \/><\/a><\/p>\n<p>The Dataflow job &#8216;topic_to_bq&#8217; is created immediately with a pending status.<\/p>\n<p><a href=\"https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/dataflow2.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-192934\" src=\"https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/dataflow2.png\" alt=\"\" width=\"850\" height=\"731\" srcset=\"https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/dataflow2.png 850w, https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/dataflow2-300x258.png 300w, https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/dataflow2-768x660.png 768w, https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/dataflow2-600x516.png 600w\" sizes=\"auto, (max-width: 850px) 100vw, 850px\" \/><\/a><\/p>\n<p>A POST request is made to GCP to create the Dataflow job. 
In this case, we&#8217;re using the pre-built template PubSub_to_BigQuery to write pub\/sub topic messages to BigQuery.<\/p>\n<p>POST https:\/\/dataflow.googleapis.com\/v1b3\/projects\/second-inquiry-315605\/locations\/asia-east1\/templates:launch?gcsPath=gs:\/\/dataflow-templates\/latest\/PubSub_to_BigQuery HTTP\/1.1<\/p>\n<p><strong>Request body for job creation.<\/strong><\/p>\n<pre>{\n  &quot;jobName&quot;: &quot;topic_to_bq&quot;,\n  &quot;environment&quot;: {\n    &quot;tempLocation&quot;: &quot;gs:\/\/my-iot-bucket-5\/temp_file_for_dataflow&quot;,\n    &quot;additionalExperiments&quot;: [],\n    &quot;bypassTempDirValidation&quot;: false,\n    &quot;ipConfiguration&quot;: &quot;WORKER_IP_UNSPECIFIED&quot;\n  },\n  &quot;parameters&quot;: {\n    &quot;outputTableSpec&quot;: &quot;second-inquiry-315605:device_data.environment&quot;,\n    &quot;inputTopic&quot;: &quot;projects\/second-inquiry-315605\/topics\/environment&quot;\n  }\n}<\/pre>\n<p><strong>Server response for job creation.<\/strong><\/p>\n<pre>{\n  &quot;job&quot;: {\n    &quot;id&quot;: &quot;2021-08-08_08_10_53-4543477010421041882&quot;,\n    &quot;projectId&quot;: &quot;second-inquiry-315605&quot;,\n    &quot;name&quot;: &quot;topic_to_bq&quot;,\n    &quot;type&quot;: &quot;JOB_TYPE_STREAMING&quot;,\n    &quot;currentStateTime&quot;: &quot;1970-01-01T00:00:00Z&quot;,\n    &quot;createTime&quot;: &quot;2021-08-08T15:10:54.353075Z&quot;,\n    &quot;location&quot;: &quot;asia-east1&quot;,\n    &quot;startTime&quot;: &quot;2021-08-08T15:10:54.353075Z&quot;\n  }\n}<\/pre>\n<p>The newly created Dataflow job can be viewed in the Google Cloud console.<\/p>\n<p><a href=\"https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/dataflow3.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-192936\" src=\"https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/dataflow3.png\" alt=\"\" width=\"1636\" height=\"816\" 
srcset=\"https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/dataflow3.png 1636w, https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/dataflow3-300x150.png 300w, https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/dataflow3-768x383.png 768w, https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/dataflow3-1024x511.png 1024w, https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/dataflow3-600x299.png 600w\" sizes=\"auto, (max-width: 1636px) 100vw, 1636px\" \/><\/a><\/p>\n<p>Now let us see the dataflow in action. We&#8217;ll have a device publish messages to that topic and verify that the data is written to the BigQuery table.<\/p>\n<p>Below, device &#8216;dev_23992&#8217; is publishing to the topic &#8216;environment&#8217;. The Dataflow job set up above writes that topic&#8217;s messages to the &#8216;environment&#8217; table in the device_data dataset in BigQuery.<\/p>\n<p><a href=\"https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/dataflow4.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-192938\" src=\"https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/dataflow4.png\" alt=\"\" width=\"1092\" height=\"922\" srcset=\"https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/dataflow4.png 1092w, https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/dataflow4-300x253.png 300w, https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/dataflow4-768x648.png 768w, https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/dataflow4-1024x865.png 1024w, https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/dataflow4-600x507.png 600w\" sizes=\"auto, (max-width: 1092px) 100vw, 1092px\" \/><\/a><\/p>\n<p>We can verify in BigQuery that the streamed topic messages are written to the table.<\/p>\n<p><a 
href=\"https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/dataflow5.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-192939\" src=\"https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/dataflow5.png\" alt=\"\" width=\"1630\" height=\"819\" srcset=\"https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/dataflow5.png 1630w, https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/dataflow5-300x151.png 300w, https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/dataflow5-768x386.png 768w, https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/dataflow5-1024x515.png 1024w, https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/dataflow5-600x301.png 600w\" sizes=\"auto, (max-width: 1630px) 100vw, 1630px\" \/><\/a><\/p>\n<div class=\"video-responsive\"><iframe loading=\"lazy\" id=\"youTubePlayer\" src=\"https:\/\/www.youtube.com\/embed\/snV7iAAOoF8?hd=1\" width=\"750\" height=\"421\" frameborder=\"0\" allowfullscreen=\"allowfullscreen\"><\/iframe><\/div>\n","protected":false},"excerpt":{"rendered":"<p>Google Cloud Dataflow is a data processing service that can be used for streaming and batch applications. Users can set up pipelines in Dataflow to integrate and process large datasets. 
With pub\/sub, users can set up Dataflow pipelines to write messages from a &hellip; <a href=\"https:\/\/m2msupport.net\/m2msupport\/setup-a-dataflow-to-write-topics-to-biqquery-in-google-cloud-platform-gcp\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"iot_tutorial_template.php","meta":{"footnotes":""},"class_list":["post-192839","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/m2msupport.net\/m2msupport\/wp-json\/wp\/v2\/pages\/192839","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/m2msupport.net\/m2msupport\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/m2msupport.net\/m2msupport\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/m2msupport.net\/m2msupport\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/m2msupport.net\/m2msupport\/wp-json\/wp\/v2\/comments?post=192839"}],"version-history":[{"count":6,"href":"https:\/\/m2msupport.net\/m2msupport\/wp-json\/wp\/v2\/pages\/192839\/revisions"}],"predecessor-version":[{"id":192940,"href":"https:\/\/m2msupport.net\/m2msupport\/wp-json\/wp\/v2\/pages\/192839\/revisions\/192940"}],"wp:attachment":[{"href":"https:\/\/m2msupport.net\/m2msupport\/wp-json\/wp\/v2\/media?parent=192839"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}