{"id":192943,"date":"2021-08-08T10:26:54","date_gmt":"2021-08-08T17:26:54","guid":{"rendered":"https:\/\/m2msupport.net\/m2msupport\/?page_id=192943"},"modified":"2021-08-08T10:29:28","modified_gmt":"2021-08-08T17:29:28","slug":"setup-a-dataflow-to-write-messages-from-pub-sub-subscription-to-biqquery-table-in-google-cloud-platform-gcp","status":"publish","type":"page","link":"https:\/\/m2msupport.net\/m2msupport\/setup-a-dataflow-to-write-messages-from-pub-sub-subscription-to-biqquery-table-in-google-cloud-platform-gcp\/","title":{"rendered":"Set up a Dataflow job to write messages from a Pub\/Sub subscription to a BigQuery table in Google Cloud Platform (GCP)"},"content":{"rendered":"<p>Google Cloud Dataflow is a data processing service for streaming and batch applications. Users can set up pipelines in Dataflow to integrate and process large datasets.<\/p>\n<p>With Pub\/Sub, users can set up Dataflow pipelines that write messages from a Pub\/Sub topic or subscription to a BigQuery table.<\/p>\n<p>The <a href=\"https:\/\/m2msupport.net\/m2msupport\/download-iot-cloud-tester\/\">IoT Cloud Tester<\/a> application provides an easy interface for setting up a Dataflow job that writes messages from a Pub\/Sub topic to a BigQuery table in Google Cloud Platform.<\/p>\n<h1>To set up a Dataflow job that writes messages from a Pub\/Sub topic to a BigQuery table:<\/h1>\n<ul>\n<li>In the &#8216;Dataflow&#8217; tab, click the &#8216;Create Job&#8217; tab.<\/li>\n<li>Enter the job name.<\/li>\n<li>Select the &#8216;Pub\/Sub Topic to BigQuery&#8217; option.<\/li>\n<li>Get the list of topics and select one.<\/li>\n<li>Get the available Cloud Storage buckets for the project and select one.<\/li>\n<li>Enter the file name. This file is used by the Dataflow job.<\/li>\n<li>Set up the dataset and table to be used for writing the messages from the Pub\/Sub topic. 
Note that the table schema must match the structure of the Pub\/Sub topic messages.<\/li>\n<\/ul>\n<p><a href=\"https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/subscription1.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-192950\" src=\"https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/subscription1.png\" alt=\"\" width=\"850\" height=\"729\" srcset=\"https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/subscription1.png 850w, https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/subscription1-300x257.png 300w, https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/subscription1-768x659.png 768w, https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/subscription1-600x515.png 600w\" sizes=\"auto, (max-width: 850px) 100vw, 850px\" \/><\/a><\/p>\n<p>The Dataflow job &#8216;subscription_to_bq&#8217; is created immediately with a pending status.<\/p>\n<p><a href=\"https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/subscription3.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-192944\" src=\"https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/subscription3.png\" alt=\"\" width=\"847\" height=\"731\" srcset=\"https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/subscription3.png 847w, https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/subscription3-300x259.png 300w, https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/subscription3-768x663.png 768w, https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/subscription3-600x518.png 600w\" sizes=\"auto, (max-width: 847px) 100vw, 847px\" \/><\/a><\/p>\n<p>A POST request is made to GCP to create the Dataflow job. 
In this case, we&#8217;re using the prebuilt template PubSub_Subscription_to_BigQuery to write Pub\/Sub subscription messages to BigQuery.<\/p>\n<p>POST https:\/\/dataflow.googleapis.com\/v1b3\/projects\/second-inquiry-315605\/locations\/asia-east1\/templates:launch?gcsPath=gs:\/\/dataflow-templates\/latest\/PubSub_Subscription_to_BigQuery HTTP\/1.1<\/p>\n<p><strong>Server response for job creation:<\/strong><\/p>\n<pre>{\n  &quot;job&quot;: {\n    &quot;id&quot;: &quot;2021-08-08_10_07_27-12462317361521435203&quot;,\n    &quot;projectId&quot;: &quot;second-inquiry-315605&quot;,\n    &quot;name&quot;: &quot;subscription_to_bq&quot;,\n    &quot;type&quot;: &quot;JOB_TYPE_STREAMING&quot;,\n    &quot;currentStateTime&quot;: &quot;1970-01-01T00:00:00Z&quot;,\n    &quot;createTime&quot;: &quot;2021-08-08T17:07:28.696060Z&quot;,\n    &quot;location&quot;: &quot;asia-east1&quot;,\n    &quot;startTime&quot;: &quot;2021-08-08T17:07:28.696060Z&quot;\n  }\n}<\/pre>\n<p>The newly created Dataflow job can be viewed in the Google Cloud console.<\/p>\n<p><a href=\"https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/subscription4.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-192945\" src=\"https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/subscription4.png\" alt=\"\" width=\"1636\" height=\"822\" srcset=\"https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/subscription4.png 1636w, https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/subscription4-300x151.png 300w, https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/subscription4-768x386.png 768w, https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/subscription4-1024x515.png 1024w, https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/subscription4-600x301.png 600w\" sizes=\"auto, (max-width: 
1636px) 100vw, 1636px\" \/><\/a><\/p>\n<p>Now let us see the Dataflow job in action. We&#8217;ll have a device publish messages to the subscription&#8217;s topic and verify that the data is written to the BigQuery table.<\/p>\n<p>Below, device &#8216;dev_23992&#8217; is publishing to the topic &#8216;environment&#8217;. The subscription &#8216;subscription_628331&#8217; is attached to that topic. The Dataflow job set up above writes that subscription&#8217;s messages to the &#8216;environment_subscription&#8217; table in the &#8216;device_data&#8217; dataset in BigQuery.<\/p>\n<p><a href=\"https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/subscription5.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-192946\" src=\"https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/subscription5.png\" alt=\"\" width=\"1038\" height=\"886\" srcset=\"https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/subscription5.png 1038w, https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/subscription5-300x256.png 300w, https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/subscription5-768x656.png 768w, https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/subscription5-1024x874.png 1024w, https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/subscription5-600x512.png 600w\" sizes=\"auto, (max-width: 1038px) 100vw, 1038px\" \/><\/a><\/p>\n<p>We can verify in BigQuery that the streamed subscription messages are written to the table.<\/p>\n<p><a href=\"https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/subscription2.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-192947\" src=\"https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/subscription2.png\" alt=\"\" width=\"1630\" height=\"819\" srcset=\"https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/subscription2.png 1630w, 
https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/subscription2-300x151.png 300w, https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/subscription2-768x386.png 768w, https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/subscription2-1024x515.png 1024w, https:\/\/m2msupport.net\/m2msupport\/wp-content\/uploads\/2021\/08\/subscription2-600x301.png 600w\" sizes=\"auto, (max-width: 1630px) 100vw, 1630px\" \/><\/a><\/p>\n<div class=\"video-responsive\"><iframe loading=\"lazy\" id=\"youTubePlayer\" src=\"https:\/\/www.youtube.com\/embed\/mrrn2TyGMZo?hd=1\" width=\"750\" height=\"421\" frameborder=\"0\" allowfullscreen=\"allowfullscreen\"><\/iframe><\/div>\n","protected":false},"excerpt":{"rendered":"<p>Google Cloud Dataflow is a data processing service for streaming and batch applications. Users can set up pipelines in Dataflow to integrate and process large datasets. With Pub\/Sub, users can set up Dataflow pipelines that write messages from a &hellip; <a href=\"https:\/\/m2msupport.net\/m2msupport\/setup-a-dataflow-to-write-messages-from-pub-sub-subscription-to-biqquery-table-in-google-cloud-platform-gcp\/\">Continue reading <span 
class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"iot_tutorial_template.php","meta":{"footnotes":""},"class_list":["post-192943","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/m2msupport.net\/m2msupport\/wp-json\/wp\/v2\/pages\/192943","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/m2msupport.net\/m2msupport\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/m2msupport.net\/m2msupport\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/m2msupport.net\/m2msupport\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/m2msupport.net\/m2msupport\/wp-json\/wp\/v2\/comments?post=192943"}],"version-history":[{"count":4,"href":"https:\/\/m2msupport.net\/m2msupport\/wp-json\/wp\/v2\/pages\/192943\/revisions"}],"predecessor-version":[{"id":192953,"href":"https:\/\/m2msupport.net\/m2msupport\/wp-json\/wp\/v2\/pages\/192943\/revisions\/192953"}],"wp:attachment":[{"href":"https:\/\/m2msupport.net\/m2msupport\/wp-json\/wp\/v2\/media?parent=192943"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}