How to update values from PCollection. #1

@SnoozingSimian

Hey, I have been trying to use your library (which is very useful, by the way) to update a few records in my database. Here is a snippet of code that should give some insight into what I am trying to achieve.

import apache_beam as beam
from beam_postgres.io import WriteToPostgres
from dataclasses import dataclass

# Placeholders: replace with real connection settings.
DB_USERNAME = "<db user name>"
DB_PASSWORD = "<db pass>"
DB_HOST = "<db host>"
DB_PORT = "<db port>"
DB_DATABASENAME = "<db name>"

@dataclass
class DocId:
    document_id: str

class ProcessVals(beam.DoFn):
    def process(self, value):
        if value is None:
            yield DocId(None)
        else:
            yield DocId(value)

def check_if_none(value):
    # Keep only records that carry a document_id.
    return value.document_id is not None

with beam.Pipeline() as p:
    data = p | "Creating" >> beam.Create(
        [
            '02267a6d-0a9f-40bf-9051-4971961cb0ac',
            '05db919e-adda-41c2-9197-54623dff6d1a',
            '04c607ec-64e8-420e-b5a5-bb63e035ba2f',
            None,
            None,
        ]
    )

    data2 = (
        data
        | "Molding" >> beam.ParDo(ProcessVals())
        | "Filter" >> beam.Filter(check_if_none)
    )

    data2 | "Writing example records to database" >> WriteToPostgres(
        conninfo=f"host={DB_HOST} dbname={DB_DATABASENAME} user={DB_USERNAME} password={DB_PASSWORD}",
        statement="UPDATE documents SET is_available = true WHERE document_id = %s",
    )

Unfortunately, this does not seem to be working: the pipeline runs fine, but I do not see the changes reflected in my database. I am really new to Apache Beam and am having trouble troubleshooting this. Could you help?
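
As a first troubleshooting step, it may help to confirm which elements actually reach the write step. The sketch below mirrors the `ProcessVals` and `check_if_none` logic from the snippet above in plain Python, without Beam or a database, so it says nothing about how `WriteToPostgres` itself behaves. The `params` list is purely illustrative: it assumes each record would be bound to the single `%s` placeholder as a one-element tuple, which is an assumption and not confirmed library behavior.

```python
from dataclasses import astuple, dataclass
from typing import Optional


@dataclass
class DocId:
    document_id: Optional[str]


def process(value):
    # Mirrors ProcessVals.process: wrap each raw value in a DocId.
    yield DocId(value)


def check_if_none(value):
    # Mirrors the Filter step: keep only records that carry a document_id.
    return value.document_id is not None


raw = [
    "02267a6d-0a9f-40bf-9051-4971961cb0ac",
    "05db919e-adda-41c2-9197-54623dff6d1a",
    "04c607ec-64e8-420e-b5a5-bb63e035ba2f",
    None,
    None,
]

# Apply the two stages in order, as the pipeline would.
records = [rec for value in raw for rec in process(value) if check_if_none(rec)]

# Hypothetical: the positional parameters a single %s placeholder would take.
params = [astuple(rec) for rec in records]

for p in params:
    print(p)
```

If the three expected IDs come out here, the transform logic is probably not the cause, and the next things to check would be whether those `document_id` values exist in the `documents` table and whether the write is being committed.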

Labels: question (Further information is requested)