So what about Snowflake buying Datavolo?
Some thoughts on what this means.
Datavolo is integration software focused on unstructured data, aimed at feeding LLMs. It's based on Apache NiFi, and starter costs are $35k / year at the moment. It seems to have lots of integrations, as you'd expect. It's been around a little while: according to the press release in April this year, it had raised $21m in total across its seed and Series A rounds.
Snowflake clearly wants to make it easier to do things with data on its platform, differentiating it further from the more technical/engineering focus of Databricks.
Integration has been part of the roadmap for a while, but it has focused on data sharing for structured data. Datavolo seems to solve more for unstructured data, which isn't as developed a space as structured data.
On their website, Datavolo proposes 'ETL' rather than ELT for unstructured data. The reason is that you need to make sense of unstructured data before you can do anything with it, while structured data is usually an observation of something that happened, so it can be loaded as-is and transformed later.
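To make the ETL point concrete, here is a rough sketch of what "transform before load" could look like for unstructured text. This is purely my own illustration, not how Datavolo or NiFi actually does it: the `Chunk` type, the function names, and the in-memory list standing in for a table are all made up for the example.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass
class Chunk:
    """One loadable row: a piece of a document plus where it came from."""
    doc_id: str
    position: int
    text: str


def extract(raw: str) -> str:
    """Extract: strip markup-like tags and collapse whitespace."""
    no_tags = re.sub(r"<[^>]+>", " ", raw)
    return re.sub(r"\s+", " ", no_tags).strip()


def transform(doc_id: str, text: str, max_words: int = 50) -> list[Chunk]:
    """Transform: split the text into fixed-size word chunks.

    This has to happen BEFORE loading, because a raw blob of text is not
    queryable in any useful way.
    """
    words = text.split()
    return [
        Chunk(doc_id, i // max_words, " ".join(words[i:i + max_words]))
        for i in range(0, len(words), max_words)
    ]


def load(chunks: list[Chunk], table: list[Chunk]) -> None:
    """Load: append rows to a hypothetical table (here just a Python list)."""
    table.extend(chunks)


def run_etl(docs: dict[str, str], max_words: int = 50) -> list[Chunk]:
    """Run extract -> transform -> load for every document."""
    table: list[Chunk] = []
    for doc_id, raw in docs.items():
        load(transform(doc_id, extract(raw), max_words), table)
    return table
```

In a real pipeline the transform step would likely also include things like parsing PDFs or generating embeddings, but the ordering is the point: the structure is created before anything lands in the warehouse.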
It will be interesting to see what happens with the integration: does Datavolo remain a separate product, or will it be fully integrated into Snowflake? I'd assume Datavolo customers who are using Databricks or other downstream platforms will start asking questions about this.
I would expect Snowflake to completely change the pricing model for 'Datavolo on Snowflake' to align it with the usage-based pricing of the rest of Snowflake, but retain something like the current pricing model for Datavolo outside Snowflake.
Having an easy-to-use LLM / RAG data integration / generator ‘ready to go’ within Snowflake’s walled garden will improve both ease of use and security.
These are all just my thoughts based on reading the releases.