Yeah, but LLMs don’t train on data automatically. Training is a separate, dedicated process; it doesn’t happen just because you use the model. So a company can train on your data in the background even if you never touch their LLM, and it can also choose not to train on your data even when you do use it. I guess when you are using it they have a bigger incentive to train on what you send, but privacy-wise it seems like basically the same thing to me.
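To make the inference-vs-training point concrete, here's a toy sketch. It's not a real LLM, just a one-weight model I made up (`ToyModel`, `predict`, and `train_step` are hypothetical names), but it shows the same split: using the model only reads the weight, and the weight only changes when someone deliberately runs a training step.

```python
# Toy one-parameter "model": y = w * x. Purely illustrative, not a real LLM.
class ToyModel:
    def __init__(self, w=0.5):
        self.w = w

    def predict(self, x):
        # Inference: reads the weight, never writes it.
        return self.w * x

    def train_step(self, x, target, lr=0.1):
        # Training: a separate, explicit step that updates the weight
        # with one gradient-descent step on squared error.
        error = self.predict(x) - target
        self.w -= lr * 2 * error * x


model = ToyModel()
before = model.w

# "Using" the model any number of times leaves the weight untouched.
for x in [1.0, 2.0, 3.0]:
    model.predict(x)
after_use = model.w

# The weight only moves when a training step is run on purpose.
model.train_step(x=2.0, target=4.0)
after_train = model.w

print(before == after_use)    # True: inference changed nothing
print(after_train != before)  # True: training changed the weight
```

Whether a company actually feeds your inputs into a step like `train_step` is a policy decision on their side, which is why you can't tell from usage alone.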
If they’re exposing their LLM to the public, there’s a higher chance of it leaking training data to the public. You don’t know what they trained it on, but there’s a chance it’s customer data. Sure, they may not have trained it on any of that, but why assume they didn’t? An internal LLM is less of a concern, because it would probably only show employees data they already have access to.