I miss pandas' very useful I/O operations in this list, such as read_csv(), to_csv(), or to_sql().
I suggest either leaving them out for now (and documenting how to feed data into it) or implementing just the most straightforward ones.
At first sight pandas.DataFrame.to_sql() seems hard to port, but pandas.read_gbq() seems much easier since we already have BigQuery IO.
Robert Bradshaw
Aug 15, 2020
Approver
Yes, I was just thinking about how to implement read_csv (and other IOs) the other day; one interesting question is how to deal with (liquid) sharding.
The current offering requires providing the input as a PCollection, but native IO support is certainly on the roadmap.
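To make the sharding question concrete, here is a minimal sketch of one common way to split a CSV across workers: each shard gets a byte range and reads only the lines that start inside it. This is an illustration under stated assumptions, not how Beam does or will implement read_csv. The helper names `read_shard` and `read_csv_sharded` are hypothetical, the shards are static (no liquid sharding, i.e. no dynamic rebalancing), the whole input is held in memory, and quoted fields containing newlines are not handled.

```python
import csv
import io
import math


def read_shard(data, start, end):
    """Return the lines of `data` (bytes) whose first byte lies in [start, end)."""
    pos = start
    if start > 0:
        # If the byte before `start` is not a newline we are mid-line; that
        # line belongs to the previous shard, so skip to the next line start.
        newline = data.find(b"\n", start - 1)
        if newline == -1:
            return []
        pos = newline + 1
    lines = []
    while pos < end and pos < len(data):
        newline = data.find(b"\n", pos)
        stop = len(data) if newline == -1 else newline + 1
        lines.append(data[pos:stop])  # may run past `end` to finish the line
        pos = stop
    return lines


def read_csv_sharded(data, num_shards):
    """Parse CSV bytes as `num_shards` independent shards (assumes a header line)."""
    header_end = data.find(b"\n") + 1
    header, body = data[:header_end], data[header_end:]
    size = max(1, math.ceil(len(body) / num_shards))
    shards = []
    for i in range(num_shards):
        lines = read_shard(body, i * size, min((i + 1) * size, len(body)))
        # Every shard re-parses the header so it can name its columns.
        text = (header + b"".join(lines)).decode("utf-8")
        shards.append(list(csv.DictReader(io.StringIO(text))))
    return shards


if __name__ == "__main__":
    sample = b"a,b\n1,2\n3,4\n5,6\n"
    for shard in read_csv_sharded(sample, 2):
        print(shard)
```

Because a line is owned by the shard containing its first byte, every row is read exactly once regardless of where the byte boundaries fall. Liquid sharding would additionally need to move the `end` of a range while it is being read, which this sketch does not attempt.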