If you're a Snowflake user, chances are you're either also on Databricks or have considered it, and vice versa. In 2024, ~40% of Snowflake users were also on Databricks and ~60% of Databricks users were also on Snowflake, per a survey by ETR and SiliconAngle.
The most common reasons? Databricks has a much more mature MLOps and model training platform. If you wanted to create custom models based on your data, Databricks was the place to do it. Meanwhile, Snowflake's managed nature and foundation in warehousing give it a perceived edge in the BI and analytics space.
For leadership, this can be frustrating. You're maintaining multiple cloud data platforms and are receiving bills from Snowflake, Databricks, and your cloud providers every month.
Snowflake's answer to this is their new Cortex AI feature set. Now instead of building your dataset in Snowflake and taking it to Databricks, you can do the entire operation inside of Snowflake's managed platform.
That means you get to retain all the compliance and managed niceties of Snowflake while also training models and building RAG databases. That's fewer bills, less overhead, and fewer platforms to manage accounts for.
Data Quality Comes Before AI Features
Both of these platforms have plenty of ways for you to run up costs, whether that's poorly written queries in Snowflake or leaving everything as All-Purpose compute in Databricks. One of the most common drivers of these costs, though, is the underlying health of the data in the platform.
Well-organized data is just as important as your query optimization or the type of compute you're using, if not more so. In one example, when Smart Data was working with a large Medicaid provider, we discovered a datetime discrepancy in the query logic for Medicaid claims for Skilled Nursing Facilities. The cost? $16M in lost revenue per year. A well-optimized query on that data doesn't get you back your $16M. It just reduces your cloud bill while you lose it.
Similarly, a high-powered model, whether it's a random forest regressor or an enterprise LLM, won't fix that data. It might just return it faster, and the LLM will throw in an 'Absolutely! Here's that data you asked for!' for good measure.
Cortex AI is not immune to this. Nor is a model you build in Databricks.
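To make the datetime point concrete, here is a minimal Python sketch of one way a date boundary can quietly drop revenue. It is not the actual logic from the Medicaid engagement. The claims, amounts, and function names are all made up for illustration. The assumption is that the period end is stored as a date (so its time is midnight) while claims carry full timestamps, which means anything billed later on the last day falls out of the total.

```python
from datetime import datetime

# Hypothetical claims: (claim_id, service timestamp, billed amount in dollars).
CLAIMS = [
    ("C1", datetime(2024, 3, 30, 9, 15), 1200.0),
    ("C2", datetime(2024, 3, 31, 0, 0), 800.0),
    ("C3", datetime(2024, 3, 31, 14, 45), 950.0),  # afternoon of the last day
    ("C4", datetime(2024, 4, 1, 8, 0), 400.0),     # belongs to the next period
]

PERIOD_START = datetime(2024, 3, 1)
PERIOD_END = datetime(2024, 3, 31)  # built from a date, so the time part is 00:00


def total_buggy(claims):
    """Compare full timestamps to a midnight end bound: most of the last day is lost."""
    return sum(amount for _, ts, amount in claims if PERIOD_START <= ts <= PERIOD_END)


def total_fixed(claims):
    """Compare calendar dates instead, so the whole last day is counted."""
    return sum(
        amount
        for _, ts, amount in claims
        if PERIOD_START.date() <= ts.date() <= PERIOD_END.date()
    )
```

With this sample data, total_buggy(CLAIMS) returns 2000.0 and total_fixed(CLAIMS) returns 2950.0. No amount of query tuning or model horsepower recovers the missing 950. Only fixing the comparison does.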
The Transformation Layer Is the Foundation
So how do you ensure you can get value from a feature like Cortex AI and maintain data quality? The data transformation layer is what drives this. Snowflake makes use of what they call the Extract, Load, Transform (ELT) pattern, rather than the more common Extract, Transform, Load (ETL) framework you may be used to. In ELT, raw data lands in the warehouse first and is transformed there. If you have some Databricks experience, you're going to be familiar with the term Medallion architecture. We can think of ELT as the Bronze and Silver steps in Medallion, with Gold being the final queries and analytics you're writing for the business using Snowflake's columnar cloud data warehouse.
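Here is a rough sketch of that Bronze, Silver, Gold flow. It uses Python's built-in sqlite3 module purely as a stand-in for a warehouse so the example runs anywhere. It is not Snowflake, dbt, or Snowpark code, and the table names, columns, and sample rows are assumptions made up for illustration. The point is the ordering: load the raw data untouched first, then clean it with SQL inside the database.

```python
import sqlite3

# Hypothetical raw rows exactly as they arrive from a source system.
RAW_ROWS = [
    ("1001", " 2024-03-30 ", "1200.50"),
    ("1002", "2024-03-31", "800"),
    ("1002", "2024-03-31", "800"),           # duplicate delivery
    ("1003", "2024-03-31", "not_a_number"),  # bad amount
]


def run_elt(rows):
    conn = sqlite3.connect(":memory:")

    # Extract + Load (Bronze): land everything as text, no cleanup yet.
    conn.execute(
        "CREATE TABLE bronze_claims (claim_id TEXT, service_date TEXT, amount TEXT)"
    )
    conn.executemany("INSERT INTO bronze_claims VALUES (?, ?, ?)", rows)

    # Transform (Silver): trim, cast, deduplicate, and drop non-numeric amounts,
    # all done in SQL inside the database.
    conn.execute(
        """
        CREATE TABLE silver_claims AS
        SELECT DISTINCT
            CAST(claim_id AS INTEGER) AS claim_id,
            TRIM(service_date)        AS service_date,
            CAST(amount AS REAL)      AS amount
        FROM bronze_claims
        WHERE amount <> '' AND amount NOT GLOB '*[^0-9.]*'
        """
    )

    # Gold: the business-facing aggregate built on the cleaned table.
    return conn.execute(
        """
        SELECT service_date, COUNT(*), SUM(amount)
        FROM silver_claims
        GROUP BY service_date
        ORDER BY service_date
        """
    ).fetchall()
```

Calling run_elt(RAW_ROWS) gives one row per service date with a claim count and a total. The duplicate and the unparseable amount never reach the Gold query, which is the whole job of the Silver step.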

There are plenty of tools to help in this area. You might be using dbt and their new Fusion engine. Maybe you're doing all of the work in Snowflake with things like Snowpark and having engineers write Python or Java.
Whatever your setup is, this is where the bulk of your time and investment should live. It is the foundation of your house of data. It doesn't matter how energy efficient your windows are if the house can't stand.
If you're seeing a lot of costs in Snowflake, or you feel like Cortex AI just isn't working for you, start by taking a look at the underlying data. Chances are the roots of your issues can be found there. Who knows, a misread datetime might be costing you $16M.
So What Actually Is Cortex AI?
"Wait a second," I can hear you saying. "What the heck actually IS Cortex AI?" That's fair. AI is a big, broad buzzword these days, and Snowflake calling itself the "AI Data Cloud" doesn't really help answer this either. Cortex can be broken down into three big feature sets: Cortex Analyst, Cortex Search, and Cortex Agents.
Cortex Analyst
Cortex Analyst is like an AI SQL developer, but for your datasets. You describe the query you want, and Cortex generates the specific SQL for your data. If you don't know SQL but do know what metrics and analytics you want to see, then Cortex Analyst is going to be your best friend.
Cortex Search
Cortex Search is an LLM-powered search service over unstructured data, so it tackles things like documentation, contracts, and meeting transcripts. If you're trying to remember "hey, what did accounting say in that meeting 3 weeks ago?" then Cortex Search is the answer to your problems. Maybe your company has a large internal API with extensive documentation and you're trying to find the exact API call and the params you'll need for some data you want. Once again, Cortex Search is your friend.
Cortex Agents
Cortex Agents are multi-step retrieval reasoning agents that work across structured and unstructured data sources. However, this technical explanation doesn't do them justice. Let me give you an example instead. Let's say your sales team contacts you and they want to know which contracts are up for renewal this quarter and what the last communication with each of those clients was. The answer to this question is found somewhere in a pile of email threads and CRM tables. Cortex Agents can pull from both and quickly return what you're looking for.
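To show the shape of that two-step question, here is a small hand-written Python sketch of the work an agent would be doing for you. This is not the Cortex Agents API and makes no Snowflake calls. The contracts, emails, and function names are hypothetical stand-ins: one list plays the CRM table, the other plays the pile of email threads.

```python
from datetime import date

# Hypothetical structured source: CRM contract rows.
CONTRACTS = [
    {"client": "Acme", "renewal": date(2025, 5, 15)},
    {"client": "Globex", "renewal": date(2025, 9, 1)},
    {"client": "Initech", "renewal": date(2025, 6, 30)},
]

# Hypothetical unstructured source: email threads as free text.
EMAILS = [
    {"client": "Acme", "sent": date(2025, 3, 2), "body": "Pricing questions for next term."},
    {"client": "Acme", "sent": date(2025, 4, 20), "body": "Asked for a renewal quote."},
    {"client": "Initech", "sent": date(2025, 1, 11), "body": "Requested onboarding help."},
    {"client": "Globex", "sent": date(2025, 4, 1), "body": "Invoice received, thanks."},
]


def quarter_bounds(today):
    """Return (first day of this quarter, first day of the next quarter)."""
    first_month = 3 * ((today.month - 1) // 3) + 1
    start = date(today.year, first_month, 1)
    if first_month == 10:
        end = date(today.year + 1, 1, 1)
    else:
        end = date(today.year, first_month + 3, 1)
    return start, end


def renewals_with_last_contact(today):
    start, end = quarter_bounds(today)
    # Step 1 (structured): which contracts renew this quarter?
    due = [c for c in CONTRACTS if start <= c["renewal"] < end]
    # Step 2 (unstructured): what was the latest message for each of those clients?
    report = []
    for contract in sorted(due, key=lambda c: c["renewal"]):
        messages = [e for e in EMAILS if e["client"] == contract["client"]]
        last = max(messages, key=lambda e: e["sent"], default=None)
        report.append(
            (contract["client"], contract["renewal"], last["body"] if last else None)
        )
    return report
```

Calling renewals_with_last_contact(date(2025, 5, 1)) returns the Acme and Initech renewals, each paired with the most recent email body. The real value of an agent is that nobody has to write or maintain this glue, and the "emails" can be actual free text rather than tidy dictionaries.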
See Cortex AI in Action
If you were hoping to try these tools or see them in action beyond what the docs provide, we at Smart Data built an open-source lab specifically to demo the various aspects of Cortex AI in Snowflake. You can present it or work through it yourself.
Should You Use Cortex AI?
That's all well and good, but you're probably thinking "I'm not using Cortex AI, how do I know if I should use it?" Great question.
First, if you're already in Snowflake, it's a pretty low barrier to entry. There's nothing you have to enable or opt into. For our compliance-focused friends in healthcare and government work, don't worry. All the Cortex AI features maintain the exact same compliance standards that you've grown to love, or at least deal with: FedRAMP, SOC 2, HIPAA, CMMC 2.0, and more. Snowflake Horizon also maintains all of your existing governance, access control, and lineage.
Cortex is a feature that sits inside your existing architecture, rather than existing outside of it. Furthermore, thanks to Snowflake's three-layer architecture (storage, compute, and cloud services), where each layer is independently scalable, running AI queries with Cortex won't tank the performance of your existing reporting.
However, if you're not on Snowflake and you're wondering if Cortex AI is something that should on its own convince you to get onto it, the answer is honestly no. Cortex AI is just one feature set in what is a very large ecosystem that Snowflake provides.
For example, take a look at the Snowflake Marketplace. The Marketplace allows you, as a Snowflake user, to make use of clean, maintained datasets from other Snowflake account holders. These third-party datasets let you enrich your own work without worrying about the overhead of maintaining or storing the data.
The Marketplace also makes available native apps that users have created in Snowflake, and these apps will run on your internal data. The specialized data and data solutions you've been looking for might very well exist already in the Snowflake Marketplace and could help fast track you to your data goals.
Where Smart Data Fits
Whether Snowflake is the right fit for your company, I can't answer today. If you're weighing that decision, reach out to Smart Data.
We're a Snowflake Select Tier Services Partner, and we have the experience to give you a straight answer. More than once, that's meant talking a client out of an expensive idea when a simpler one made more sense. Our job isn't to sell you on a particular solution; it's to point you towards the right solution.



