Do you mind sharing the targeted use case of the added implementation? For example, do we expect TFX users who train PyTorch/scikit-learn models to use it someday? @jyzhao@google.com
Kerry Donny-Clark
Feb 11, 2022
Approver
The use case is to allow users that are developing non-TFX models to use those models for inference in Dataflow
J Z
Feb 11, 2022
Approver
Curious: do they share the same RunInference interfaces with Tensorflow? If so, it would be good if we could embed them into the TFX-BSL/BulkInfer component. TFX is not Tensorflow-only; it supports sklearn, JAX, etc.
J Z
Feb 11, 2022
Approver
go/run-inference-beam plans to extract the bulk inference part from TFX-BSL to Beam. Are we able to converge the interface so that we can use a single BulkInfer component (with maybe slightly different configs) to handle these use cases?
Kerry Donny-Clark
Feb 11, 2022
Approver
Beam users may want to run outside of tensorflow. We need to allow users to upload the model of their choice, and shouldn't force tight coupling with Tensorflow
Kerry Donny-Clark
Feb 11, 2022
Approver
It is good to know that running other models through TFX is an option. Where can we find documentation for that?
Jiayi Zhao
Feb 11, 2022
Approver
Yep, it shouldn't be tightly coupled with Tensorflow, but since it's all batch inference, I wonder if the interface (input/output format) can be aligned.
For TFX we currently have xgboost/sklearn/jax examples; our current bulk inference supports Tensorflow/JAX and cloud bulk inference but not sklearn/xgboost, so I wonder if, with the interface aligned, we could add them to the TFX BulkInfer component too.
The current design calls for input/output format to not be aligned (TF protos) but rather the same as what users of the framework are familiar with. This is done with the purpose of easing usage for developers that are mainly familiar with their framework.
J Z
Feb 11, 2022
Approver
Just wondering if we can utilize TFXIO/TFX-BSL (or a similar idea) so that we can have different formats but a unified in-memory representation and processing.
TFXIO operates on pyarrow.RecordBatches that represent batches of data columns that have the same length. If the intent is to do batched inference (e.g. for efficiency reasons), then it may also make sense to use this format as the intermediate representation (tfx_bsl has a number of utils to convert to/from different formats, including tensors and ndarrays -> RecordBatch, and pyarrow also supports pandas conversions).
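As a rough illustration of that intermediate-representation idea, here is a minimal sketch assuming plain numpy inputs (the tfx_bsl converters mentioned above offer richer conversions):

```python
import numpy as np
import pyarrow as pa

# Two equal-length columns become one RecordBatch.
features = pa.array(np.array([1.0, 2.0, 3.0]))
labels = pa.array(np.array([0, 1, 0]))
batch = pa.RecordBatch.from_arrays([features, labels], names=["feature", "label"])

# pyarrow also round-trips to pandas for downstream processing.
df = batch.to_pandas()
print(df)
```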
Ryan Thompson
Feb 15, 2022
Approver
I also wanted to say, thanks for the links. I think the relevant file that is interesting here is:
1. Users who need the whole of the input data
2. Users who need a reference to the input data only
The distinction between these two user requirements is significant:
An input signature of KV (also supported by TFX Inference) allows the user to provide the reference as the K, which can then be used in post-processing, with the <K, V> becoming <K, Predict<V>> on the output. For users in category 2 this means we do not need to replicate the input data into the output (making this optional would be ideal).
Concrete example:
I have a PCollection of images as KV: K=image_uuid, V=bytes()
Having the input bytes in the output will only be useful in a subset of cases; often it's the image_uuid and the prediction that are needed in the resulting PCollection.
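A minimal sketch of the KV pattern in the concrete example above (run_model here is a stand-in for framework-specific inference, not part of any proposed API):

```python
import apache_beam as beam

def run_model(image_bytes):
    # Stand-in for framework-specific inference.
    return {"score": len(image_bytes)}

with beam.Pipeline() as p:
    keyed_images = p | beam.Create([
        ("image-uuid-1", b"\x00\x01\x02"),
        ("image-uuid-2", b"\x03\x04"),
    ])
    # <K, V> becomes <K, prediction>; the input bytes are not replicated
    # into the output PCollection.
    predictions = keyed_images | beam.Map(lambda kv: (kv[0], run_model(kv[1])))
    predictions | beam.Map(print)
```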
Thanks Reza. I've modified the wording. Please see the Note below about the TFX implementation. Their output type contains information about the input data.
Reza Rokni
Feb 11, 2022
Approver
Thanks! Yes, in their implementation the input is always included, but they also have KV support; this is there to allow metadata to be passed in without using Tensorflow Signature tricks for metadata passthrough. With Tensorflow signatures you can have some data bypass the input_0 layer of the model and still be available in the output, but by having the KV option users don't need to mess with the Signatures for this.
For our implementation we will need KV for the same reason; however, I think an extra nice feature for us would be to allow the user to omit the input from the output to reduce the bytes processed in the resulting PCollection.
Ahmet Altay
Feb 11, 2022
Approver
How is it done in the current API?
Andy Ye
Feb 11, 2022
Approver
process() maps the key of the example to a prediction log proto, i.e. it returns Iterable[Tuple[_K, prediction_log_pb2.PredictionLog]].
+1 to a Python dataclass. It will be the easier on-ramp for our users. Although I understand that this has drawbacks in language portability etc., I believe the trade-off is worth it in this case.
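For illustration, a minimal sketch of what a dataclass-based output could look like (the field names here are hypothetical, not a committed API):

```python
from dataclasses import dataclass
from typing import Any, Optional

@dataclass
class PredictionResult:
    example: Any                     # optionally, the input example
    inference: Any                   # framework-native prediction output
    model_id: Optional[str] = None   # e.g. useful if the model is updated via side input
```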
Ryan Thompson
Feb 11, 2022
Approver
Our assumption is that the language portability will not be an issue since the current beam coders should handle a data class easily.
Ahmet Altay
Feb 12, 2022
Approver
Dataclasses sound like a better option. Just to make sure that we have considered everything:
- Drawbacks in language portability -> what are those? Could you explicitly mention them here?
- Would this result in a split API between the existing RunInference and the new one?
(Side note: Consider adding an "alternatives considered" section for parts that could go different ways. This, and 1-or-n transforms etc. It would streamline the discussion and agreement process once reviewers can see that different options were considered and they can compare pros and cons easily.)
Brian Hulette
Feb 14, 2022
Approver
Note that just because existing Beam coders support a type doesn't mean it's portable. In Python we have a catch-all coder that just serializes data with pickle - this can't be interpreted by non-Python runners or SDKs. I think this would only be a problem when/if we want to expose this as an external transform though.
To work around this, we could add support for inferring Beam schemas from dataclasses, then use the row coder. We could also do this for protobuf-generated types. And we already have support for inferring schemas from NamedTuple types (perhaps less familiar to Python developers than dataclasses, though).
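For reference, a minimal sketch of the existing NamedTuple path mentioned above (the dataclass equivalent would be the new support):

```python
import typing
import apache_beam as beam

class Prediction(typing.NamedTuple):
    example_id: str
    score: float

# Registering RowCoder makes the type portable as a Beam schema.
beam.coders.registry.register_coder(Prediction, beam.coders.RowCoder)

with beam.Pipeline() as p:
    _ = (p
         | beam.Create([Prediction("a", 0.9), Prediction("b", 0.1)])
         | beam.Map(print))
```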
Robert Bradshaw
Feb 14, 2022
Approver
+1 to supporting schema coder for known dataclasses (regardless of its use here).
Robert Bradshaw
Feb 14, 2022
Approver
Is this just for PyTorch?
Andy Ye
Feb 15, 2022
Approver
For illustrative purposes, we just created PytorchOutputData, but scikit-learn should have one too. Or, as discussed in the other comments, we could just have a unified dataclass that is agnostic of framework.
Robert Bradshaw
Feb 14, 2022
Approver
Are these 1:1? Is batching going to be explicit in the API?
Assigned to rezarokni@google.com
Andy Ye
Feb 11, 2022
Approver
@rezarokni@google.com Should we be replicating how TFX adds serialized examples to the output prediction log?
Reza Rokni
Feb 11, 2022
Approver
I would suggest thinking about this problem more generally.
Given n frameworks which may have m output types, do we have an output signature which:
- Changes by the framework being used,
- Or do we have a single output type to which we do the conversion,
- Or do we have a generic type where input<T> will be output<T>,
- Or some other variation?
I suspect us having to coerce the output into 'our' thing is too much tech debt and impractical. Again, I suspect that the easiest is to have each framework output its own format (which for Tensorflow will be tf.Example). This option, however, will need us to think about the RunInference(config) API marked in the future section below. If we want inference to be a single transform which is easily composable into a DAG then maybe we need a <bytes> output option.
@robertwb@google.com @bhulette@google.com's view of this is, I believe, for us to output using Schemas, which falls into the option of us coercing output into a known 'thing'.
But here I would need to defer to your good selves, as I may be thinking about this wrong. It feels like a very important design decision that needs to be done right so we don't have tech debt in the future.
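For illustration only, a minimal sketch of the "generic type where input<T> will be output<T>" variant from the list above (all names here are hypothetical, not a proposed API):

```python
from typing import Generic, Iterable, TypeVar

ExampleT = TypeVar("ExampleT")
PredictionT = TypeVar("PredictionT")

class ModelSpec(Generic[ExampleT, PredictionT]):
    """Hypothetical per-framework spec; each framework keeps its native types."""

    def run_inference(self, batch: Iterable[ExampleT]) -> Iterable[PredictionT]:
        raise NotImplementedError
```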
Robert Bradshaw
Feb 11, 2022
Approver
Note that in Java the notion of having a schema doesn't force a concrete representation, just the ability to convert to and from a row object (or, equivalently, the ability to enumerate and extract fields). This hasn't been fleshed out as much in Python (less need, due to language differences like the ease of duck typing), but we have given it some thought.
Yifan Mai
Feb 11, 2022
Approver
@rongliu@google.com could you review this portion? TLDR is that the TFX RunInference type signature is different from what a "canonical" inference signature would look like.
Reza Rokni
Feb 11, 2022
Approver
Good point by Robert. Maybe we need a signature per framework, and (once the Python bit is fleshed out for <Row>) we add a <Row> output, which solves for the single RunInference transform being easily composable into a DAG while still allowing the config to change the framework being used.
@rongliu@google.com I assume this is a solved problem for the TFX runinference as the output is standardized on Predictlog proto.
Rose Liu
Feb 11, 2022
Approver
Reza, you're correct that TFX considers the inference output to be PredictionLog for downstream jobs.
Brian Hulette
Feb 15, 2022
Approver
Note the schema concept doesn't have support for dynamic schemas - the schema must be known (or at least discoverable) at pipeline construction time. I don't think that's a blocker here though IIUC.
Robert Bradshaw
Feb 14, 2022
Approver
Can we standardize on a single type?
Yifan Mai
Feb 11, 2022
Approver
@iindyk@google.com could you review this portion re: TFX IO and Apache Arrow?
Ihor Indyk
Feb 12, 2022
Approver
TFX components already use the Arrow format on the input path, and some use it on the output path as well.
Ihor Indyk
Feb 12, 2022
Approver
And we had significant efficiency gains from doing so; one of the reasons is that it's columnar.
Having a single transform which calls our other transforms is actually a P0.
Some features that this provides:
- If we build inference Templates, where the configuration dictates the model+framework info and the DAG does not need to be changed at all, then when we support new frameworks the template does not change, just the config that needs to be sent to it.
- All docs/patterns and examples will make use of a single transform, which is changed simply by the configuration we send to it, which is a very powerful message.
- Any future additions to this framework from outside contributors are much more likely to follow the same example and launch as RunInference().
- There is potential for integration into KFP as an inference endpoint using this transform; just like templates, having a single transform that allows the config to indicate its implementation would simplify the work needed there.
Ahmet Altay
Feb 11, 2022
Approver
+1 to having a single transform as a P0, and it should be done in the initial work. Changing APIs is very hard. It is better to start where we already know we want to go.
Andy Ye
Feb 11, 2022
Approver
We can definitely pursue having a single wrapper around these individual transforms. It was what we had in mind initially, but we wondered if developers would feel more comfortable working with framework-specific transforms.
Here are the reasons I see for one transform per framework:
1) We will need to support TFX RunInference() separately, at least for some time. Legacy customers should not need to rebuild or refactor pipelines using TFX.
2) Different frameworks can take and return different data types, and require different parameters. This is very explicit if you have a transform matched to the framework. It allows users to express themselves using the native types and signatures of the framework, without learning Beam-specific configuration.
3) It is easier to add new frameworks when they are decoupled. A generic interface builds in assumptions, and a framework that violates those assumptions will be difficult to assimilate.
Overall, I see frameworks as akin to IOs. They are specific enough that a generic approach causes more issues than it solves.
Ahmet Altay
Feb 12, 2022
Approver
These are also very good reasons.
I would suggest getting a second opinion from klk@ / robertwb@, and committing to one option and moving on to the execution phase.
I see having a single transform as a P0. We may need to expose the TFX one for legacy reasons, but it may be acceptable to just leave that one where it is for those that expect the (exact) legacy behavior.
IMHO, the way it should be used is that every system should support conversion of its own data type to/from schemas (logically; there's not necessarily a need for physical transformation of records). RunInference(TypedModel) would accept any schema'd PCollection, as well as a PCollection of its own specific type, apply the model, and output a schema'd PCollection (again, possibly in its own specific type, but in such a way that it could be easily followed by further transforms in that framework, or any schema-understanding post-processing). All inference is of the form [features] -> [more features], ideally paired with the input features in a standard way.
The configuration and implementation may need to be quite specific, but that's more swappable (e.g. we should be able to swap out a PyTorch model for a TF one by swapping the designation of the model, not how it's used).
Note that even for IOs we are trying to move towards things like SchemaIO and IO pushdown and the ability to treat IOs in a unified way (e.g. you can just swap out one for another) rather than a disparate set of transforms. From what I've seen, the concerns here are even more similar than IOs (and I have yet to see a compelling reason any implementation differences force a difference in API here). This also fits in well with the relational-first push. Ideally this could even open up the possibility of mix-and-match models, e.g. using an online CloudVision API composed with a PyTorch-trained model to act on it.
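A hedged sketch of the "schema'd PCollection in, schema'd PCollection out" shape described above (the inference function here is a stand-in, not a proposed API):

```python
import apache_beam as beam

def fake_infer(row):
    # [features] -> [more features], paired with the input in a standard way.
    return beam.Row(image_id=row.image_id, score=sum(row.pixels))

with beam.Pipeline() as p:
    _ = (p
         | beam.Create([beam.Row(image_id="a", pixels=[0.1, 0.2]),
                        beam.Row(image_id="b", pixels=[0.3, 0.4])])
         | beam.Map(fake_infer)
         | beam.Map(print))
```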
Kenneth Knowles
Feb 17, 2022
Approver
I'm still thinking about this one. I'm generally not automatically for unification of things just because they fit a category or have similarities. If the surrounding code cannot also be swappable, then there's also less benefit.
One example of something that isn't really swappable is SqlTransform with zetasql vs standard sql: there is not that much chance of a context where this swapping matters. But there is some benefit to unification anyhow. SchemaIO is another case where you have a certain generalized API but not for the purposes of swapping but rather the purpose of programmatic inspection/instantiation.
It isn't straightforward to me which way to go in the case of RunInference so I am not ready to weigh in.
Ahmet Altay
Feb 11, 2022
Approver
Could there be a single RunInference transform that could be parametrized (or, even better, auto-detect?) instead of N different transforms for N different frameworks?
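For illustration, a sketch of what the parametrized single-transform option might look like (all names here are hypothetical, not a committed API):

```python
from dataclasses import dataclass

# Hypothetical per-framework configuration objects for a single, parametrized
# RunInference transform; names are illustrative only.

@dataclass
class PyTorchModelConfig:
    state_dict_path: str
    model_class: type

@dataclass
class SklearnModelConfig:
    pickled_model_path: str

# Usage would then differ only in the config passed in, e.g.
#   examples | RunInference(PyTorchModelConfig(...))
#   examples | RunInference(SklearnModelConfig(...))
```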
Reza Rokni
Feb 11, 2022
Approver
+1
Kerry Donny-Clark
Feb 11, 2022
Approver
I would say that a framework is like an IO. We don't have a single very generic IO, with complex parameters that get parsed to make a specific connection to a given source.
Ryan Thompson
Feb 11, 2022
Approver
After having looked through the TFX RunInference code closely and trying to make generic sections of it, the idea of trying to combine all these frameworks into a single implementation scares me. There are a lot of very TFX-specific sections in that code.
I anticipate a lot of complexity and framework specific code entering into the individual runInference implementations.
Ahmet Altay
Feb 12, 2022
Approver
(It looks like I created a second comment thread on the same topic. Let's use the other comment thread as the single place to discuss this issue. Ryan, if you do not mind you can add your comment there as well. Both Reza and Kerry already commented there.)
Robert Bradshaw
Feb 14, 2022
Approver
I just realized I started a new thread on this as well. I think there are two considerations here: the API question (in which case there seems to be a clear preference for unification) and the implementation one (where I think a lot could be shared, but if it contorts things too much / there's a lot of platform-specific stuff we don't have to share everything).
We think this is worthwhile, but maybe for a future iteration.
Rose Liu
Feb 11, 2022
Approver
Keeping the same style of inputs as RunInference could make these two extensible for future changes. For example, we have a PR to enable streaming model input [1], where a collection of model parameters is consumed.
I agree. I think we are looking to find some sort of feature parity between what we can support with newer frameworks and what TFX does and is doing. I know TFX RunInference is being very actively developed and we are not as familiar with the roadmap.
Our feeling is that as long as we have an extensible object as the parameters for each framework we should be able to extend features going forward.
Fleshing out a concrete implementation of PyTorch and/or Sklearn would help validate this.
Andy Ye
Feb 24, 2022
Approver
Thanks for this sketch! Ryan and I will leverage this to guide our implementation.
FYI: the downside to the composition paradigm here is that we will have to do a bit more refactoring of TFX's logic to support Tensorflow that way. @rezarokni@google.com
Robert Bradshaw
Feb 24, 2022
Approver
That is true. Cleanly introducing a new extension point often requires a bit of refactoring, but IMHO that is generally preferable to a sub-optimal structuring of the code for historic reasons only. Fortunately the amount of code in question here is relatively small.
Reza Rokni
Feb 25, 2022
Approver
+1 to Robert's comment; if we need a hidden shim layer etc. as temporary tech debt, while painful, the end goal will be worth it.
Minor note (from reading the sketch):
Can we avoid the word Batch in any external areas of the code? This, for example, has led to confusion with TFX RunInference, where folks have assumed it's 'batch' due to the naming when digging into the code.
Robert Bradshaw
Feb 14, 2022
Approver
Prefer composition to inheritance.
Reza Rokni
Feb 10, 2022
Approver
Is this @Setup in the DoFn? If yes, then it might be worth mentioning that there would need to be a check in the process() function (I think) to see if the side input has changed version to a new model.
Andy Ye
Feb 14, 2022
Approver
Good point! Yes, it's in the DoFn. Just added that note.
Robert Bradshaw
Feb 23, 2022
Approver
Side inputs are not available from setup(). Above it looks like the model is specified by a URL at pipeline construction. Is the intent to be able to specify it in both ways?
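A hedged sketch of the pattern being discussed (names are illustrative; this is not the proposed implementation): since side inputs cannot be read in setup(), process() checks the side input's version and reloads the model lazily.

```python
import apache_beam as beam

class InferenceDoFn(beam.DoFn):
    """Sketch: the model cannot be loaded in setup() from a side input, so
    process() checks the version and reloads when a new model arrives."""

    def setup(self):
        self._model = None
        self._model_version = None

    def process(self, element, model_metadata):
        # model_metadata is a side input, e.g. a (version, model_path) tuple.
        version, model_path = model_metadata
        if version != self._model_version:
            self._model = self._load_model(model_path)
            self._model_version = version
        yield self._model(element)

    def _load_model(self, model_path):
        # Stand-in for framework-specific loading from model_path.
        return lambda x: {"input": x, "model_path": model_path}
```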
Robert Bradshaw
Feb 14, 2022
Approver
How hard would it be to coerce the data into the right format as needed? It'd be extremely useful to just read schema'd data (e.g. from a CSV, or another ML-agnostic transform) and have it automatically adapt to the framework-specific input (as well as handling intelligent batching as needed). This is the kind of thing that I think will add a lot of value for users.
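A rough sketch of that adaptation, assuming illustrative column names and using BatchElements for the batching:

```python
import numpy as np
import apache_beam as beam

def rows_to_matrix(rows):
    # Coerce a batch of schema'd rows into one framework-friendly ndarray.
    return np.array([[r.f0, r.f1] for r in rows], dtype=np.float32)

with beam.Pipeline() as p:
    _ = (p
         | beam.Create([beam.Row(f0=1.0, f1=2.0), beam.Row(f0=3.0, f1=4.0)])
         | beam.BatchElements(min_batch_size=2, max_batch_size=32)
         | beam.Map(rows_to_matrix)
         | beam.Map(print))
```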
Robert Bradshaw
Feb 23, 2022
Approver
This should be something that is in the generic code, and specific models don't have to worry about.
Andy Ye
Feb 24, 2022
Approver
Sounds good. Reworded!
Anand Inguva
Feb 10, 2022
Approver
Do we pre-process the data somewhere when the user passes numpy arrays to RunInference(Pytorch)?
Andy Ye
Feb 10, 2022
Approver
Yes, we can have a preprocessing step to check the type. If it's Tensor, then keep it. If it's numpy, then we can do a transform to Tensor.
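A minimal sketch of that check (assumed behavior, not the final implementation):

```python
import numpy as np
import torch

def to_tensor(example):
    # Keep torch.Tensor inputs as-is; convert numpy arrays; reject the rest.
    if isinstance(example, torch.Tensor):
        return example
    if isinstance(example, np.ndarray):
        return torch.from_numpy(example)
    raise TypeError(f"Unsupported input type: {type(example)}")
```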
Andy Ye
Feb 14, 2022
Approver
Added a note that for the first iteration, we will assume input is Tensor. In future iterations, we can add logic to do conversions.
Valentyn Tymofieiev
Feb 15, 2022
Approver
This sounds pipeline-specific. Would users need to implement this logic?
Robert Bradshaw
Feb 23, 2022
Approver
+1. I'm not quite following what the role of this function is either.
Andy Ye
Feb 24, 2022
Approver
Sorry for this misleading description here. I've updated it. Essentially, TFX does this to their data to make it conform to their protobufs, but other framework implementations may not need this.
Andy Ye
Feb 24, 2022
Approver
Marked as resolved
Robert Bradshaw
Feb 24, 2022
Approver
Re-opened
Does this have to be done as part of RunInference, or could it be a subsequent DoFn?
Andy Ye
Feb 24, 2022
Approver
TFX currently needs a very specific post-process object that matches the type of inference being done (regression vs classification vs multi-inference), so there needs to be this post-process step. But we don't plan to do any framework-level post-processing for Scikit-learn and Pytorch.
Robert Bradshaw
Feb 24, 2022
Approver
OK, in that case I would just call it part of the inference.
Replace:“num_model_versions ** new - Note: This is relevant to models changing via side input.” with “F”
Brian Hulette
Feb 23, 2022
Approver
It would be great if we could leverage s.apache.org/batched-dofns here (when it's available in Beam). Note that batching up Beam schema types into common "table" types (pyarrow RecordBatches, pandas DataFrames, etc..) is a primary goal of that effort.
Unfortunately that effort is going on in parallel with this one. I don't want my work on Batched DoFns to block you, but I do think this could be a valuable use case for it and I'd be happy to collaborate.
Andy Ye
Feb 23, 2022
Approver
Sounds good, looking forward to incorporating it! 👍
Anand Inguva
Feb 16, 2022
Approver
Making a note here
BatchElements returns a list. We need to convert the list into an array/tensor, and apparently converting a list to a tensor is extremely slow [1].
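An illustration of that concern (assuming torch): building a tensor directly from a Python list of arrays converts element by element, whereas stacking into one ndarray first avoids that.

```python
import numpy as np
import torch

batch = [np.random.rand(3, 8, 8).astype(np.float32) for _ in range(32)]

# Slow: torch converts the Python list of ndarrays element by element.
slow = torch.tensor(batch)

# Faster: stack into a single contiguous ndarray, then wrap it without a copy.
fast = torch.from_numpy(np.stack(batch))
```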
@anandinguva@google.com We started a section on testing here. Feel free to edit, and even link to a dedicated testing doc
Assigned to anandinguva@google.com
Yifan Mai
Feb 11, 2022
Approver
Beam or Dataflow benchmarks would be nice. In particular, TFX would need performance parity before migrating to the Beam implementation.
Anand Inguva
Feb 14, 2022
Approver
Do we have current TFX benchmarks on Beam or Dataflow? It would be helpful to see how TFX calculated the benchmarks.
Ahmet Altay
Feb 11, 2022
Approver
What platforms we can integrate with? Does CAIP support scikit and pytorch models?
Andy Ye
Feb 11, 2022
Approver
Initial scope is to integrate with Vertex AI. Yes, it supports TF, Pytorch, scikit-learn, XGB.
Ahmet Altay
Feb 12, 2022
Approver
Nice!
Ahmet Altay
Feb 12, 2022
Approver
(Let's explicitly add that information to the document.)
Andy Ye
Feb 14, 2022
Approver
👍
Ryan Thompson
Feb 14, 2022
Approver
I think iteration #1 doesn't call for support for remote inferences. Vertex AI is supported with the TFX implementation but we don't plan on supporting it for our first pass with sklearn or pytorch.