The Behavioral Health Analytics Stack: Secure, Scalable, and Under $4,000
How CCBHCs and behavioral health organizations new to automated analytics can build their first secure, scalable data stack for under $4K
My first couple of months building data analytics at my organization were somewhat messy. I was working for an organization run by clinicians who knew they needed an analyst but didn’t know what a good analyst looked like. With few places to turn, I spent countless hours researching, fretting, and searching for solutions. Looking back, I wish I’d had a guide like this—it would have saved me from a lot of heartache and frustration.
That confusion wasn’t unique to me. Five years into my behavioral health career, I’m still finding that analytics and measurement-based solutions are new to most organizations. Many data teams operate like startups, and in a non-profit environment where funding isn’t guaranteed, leadership can be reluctant to spend on tools whose value hasn’t yet been demonstrated. Yet with CCBHCs and significant federal and state investment aimed at increasing measurement-informed care, small analytics teams carry huge responsibilities. This is a Windows-friendly, low-cost, scalable, and secure stack that small analytics teams with overwhelming reporting demands can actually implement.
Building this entire stack will run you approximately $3,851 upfront (in the most expensive scenario) with about $612 in annual subscriptions. Compare that to commercial analytics platforms that routinely cost $50,000+ per year, and you’re looking at a fraction of the price for a solution you actually control.
Even better? Most of these costs can be reduced significantly if you’re a nonprofit. Many vendors offer nonprofit discounts ranging from 30-50%, and some tools have free tiers that might cover your needs entirely as you’re getting started.
We’ll start by summarizing the essential components every organization needs for an analytics stack. Then we’ll dive into each tool in detail, with some advanced options thrown in for those ready to push the envelope.
Essential Components:
Think of these as your “data analytic building blocks.” For all our building blocks I will stick to secure (or easily securable) solutions to minimize your security headaches. Remember, you’re a data analyst, not a cybersecurity expert. Our goal is to give you as much access to automation as possible while leaving the security responsibilities to others. At minimum, you need:
Storage and Processing Power: A computer/server that’s easy to back up, move, and scale as your team grows - Your stage and equipment
Data Orchestration: Software that schedules reports, automates pipelines, and alerts you if something breaks – Your director, managing when everything happens
Data Pipelining: Tools to clean, transform, and prepare your raw EHR and claims data - Your actors, transforming raw data into something meaningful
Data Storage: A secure database where transformed data lives, ready for reporting – Your script library, holding the refined story
Analytics & Visualization: Software that turns your data into dashboards, reports, and insights with role-based security – Your final production for the audience
Storage and Processing Power:
Let’s start from the ground up, with the hardware and system setup that everything else depends on.
Hardware: $2,000–$3,000:
Start simple and scale later is our motto. For most small to mid-sized clinics (up to ~5,000 patients annually), a powerful desktop computer is enough to get started:
Intel i7, i9 or AMD equivalent.
32GB RAM (64GB is even better)
1TB SSD
NVIDIA GPU (optional, RTX 5080-5090)
Enough processing power, memory and storage will keep your pipelines running fast and smooth.
Windows Server VM – $70–$300/year:
Lets multiple users log in, ensures easier backups, and keeps your system portable.
Make sure to have your IT department enable backups and file tracking (In Windows Server 2019 this isn’t enabled by default)
Advanced Tip: If you have an NVIDIA GPU, you can use libraries like cuDF to process data up to 300x faster. This isn’t necessary when you’re starting out, but it’s a nice upgrade when data volumes grow.
I recommend that you have IT do installation for all steps in this category.
Data Orchestration:
Power Automate – $20/user/month:
Automates workflows and reporting schedules. I’ll talk more about this later. Power Automate Desktop Installation Instructions.
Security: Microsoft, HIPAA BAA available.
Windows Built-In Batch File - Free:
An alternative to Power Automate is to create a batch (.bat) file that runs your pipelines using Windows’ built-in Task Scheduler. The tricky part is that you can’t easily see when something goes wrong, so you’ll have to build your own alerts for pipeline failures (in Python, for example).
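If you go this route, a minimal sketch of such an alert might look like the following. The script path, sender, recipient, and SMTP host are all placeholders (assumptions) you’d swap for your organization’s values:

```python
# Sketch: wrap a pipeline script so a Task Scheduler run emails you on failure.
# All addresses and the SMTP host below are placeholders, not real values.
import smtplib
import subprocess
from email.message import EmailMessage

def build_alert(pipeline_name: str, returncode: int, stderr: str) -> EmailMessage:
    """Compose a failure notification for a pipeline run."""
    msg = EmailMessage()
    msg["Subject"] = f"Pipeline failed: {pipeline_name} (exit {returncode})"
    msg["From"] = "alerts@example.org"   # placeholder sender
    msg["To"] = "analyst@example.org"    # placeholder recipient
    msg.set_content(stderr[-2000:] or "No error output captured.")
    return msg

def run_with_alert(script_path: str) -> int:
    """Run a pipeline script; email an alert if it exits non-zero."""
    result = subprocess.run(["python", script_path], capture_output=True, text=True)
    if result.returncode != 0:
        alert = build_alert(script_path, result.returncode, result.stderr)
        # Send through your organization's SMTP relay (hostname is a placeholder).
        with smtplib.SMTP("smtp.example.org") as server:
            server.send_message(alert)
    return result.returncode
```

Your batch file would then call this wrapper instead of the pipeline script directly.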
Data Pipelines:
Python – Free:
Python is a powerful programming language with many free, open-source tools. We’ll use it for data pipelines (cleaning and transforming data).
Recommended library: Polars (a fast, modern alternative to pandas), though pandas works fine if you’re new to this.
As this is the only open-source item in our setup, security setup should be guided by your IT/cybersecurity vendor. However, I chose polars/pandas for their popularity. Knowing they’ve been reviewed and used by tens of thousands of programmers should ease our minds a little. (See my other article for more on open-source security in healthcare)
You can download and install python here. I recommend using the “add to PATH” option when installing.
Advanced tip: Python can automate data collection and even documentation in ways that Power Query can’t. Popular libraries like Selenium let you automate browser tasks (like manually clicking that download button for data you didn’t have automated access to, until now :) )
Optional: GitHub ($4–$21/month):
Code version control. Nice to have, but small teams can get by without it if file tracking and backups are in place. If you’re going the non-GitHub route, I advise you to create a “Production” and “Testing” folder.
Visual Studio Code – Free
Coding environment for Python.
Data Storage:
SQL Server (~$500):
The easier option for Microsoft shops and for HIPAA compliance. Download SSMS to set up and interact with your database. I recommend having IT install this as well.
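To connect your Python pipelines to SQL Server, a sketch using SQLAlchemy might look like this. The server and database names are placeholders; Windows authentication (“trusted_connection”) keeps credentials out of your code:

```python
from sqlalchemy.engine import URL

# Build a SQL Server connection URL (server/database names are placeholders).
url = URL.create(
    "mssql+pyodbc",
    host="MYSERVER",
    database="AnalyticsDB",
    query={
        "driver": "ODBC Driver 17 for SQL Server",
        "trusted_connection": "yes",
    },
)
# Later, in your pipeline (requires pyodbc installed):
#   engine = sqlalchemy.create_engine(url)
#   cleaned_df.write_database("dbo.my_table", engine)
print(url)
```

Building the URL with `URL.create` avoids hand-escaping the driver string and keeps the pieces readable.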
Postgres (Free):
Solid option, but open-source tools may add security complexity.
Power BI Semantic Models:
These are the Power BI cloud databases that connect to your local database (SQL Server). You’ll find they’re quite useful because once you have one strong star schema data model (a topic for a different article) you’ll be able to build multiple dashboards off the same semantic model, reducing refresh times, increasing efficiency and simplifying analytic capabilities.
Advanced tip: You can also connect Excel directly to your semantic model for ad-hoc data requests, using the “Get Data → From Power BI” option under the Data tab. This lets you take advantage of the relationships you’ve already built in Power BI to automatically merge data into the correct format.
Analytics & Visualization
Power BI – $10/user/month:
Power BI allows you to create graphs, build dashboards, and publish your dashboards to the cloud so the right people can see them.
Row-level security (RLS):
Lets you set up roles with different levels of data access (e.g., clinicians can only see their assigned patients).
Check out GuyInACube’s (Popular Power BI educator) YouTube video for a tutorial.
Advanced Tip: If you want to dynamically assign users access to data based on the access they’ve already been assigned in your EHR, use USERPRINCIPALNAME in Power BI’s RLS. However, this means you need access to a table mapping their Microsoft emails to their EHR usernames.
Power BI Gateways:
Allows your dashboards to automatically refresh. More importantly, it ensures the secure transfer of data from your local system to the cloud.
GuyInACube’s Installation instructions
Security: Microsoft, HIPAA BAA available.
Implementation Guide:
Now that we’ve covered the core components, let’s talk about how to actually put this system together. If you have any problems with the instructions below, googling it is a great first step for troubleshooting. You should be able to find simple step-by-step instructions to solve almost any problem you have.
Work with IT to install and secure a Windows Server VM with daily backups on your new Desktop. Make sure to enable the recycle bin and file tracking.
Create a dedicated admin account that all pipelines run under.
Install:
SQL Server,
Power Automate Desktop, and make sure you have a license for Power Automate Online
Power BI Gateways
Python (make sure to select the “add to path” option during install),
Visual Studio Code (install Python, and Jupyter Notebook extensions).
In Visual Studio Code, run the following in the terminal (if you don’t see one by default you should see an option at the top of the page):
Copy and paste the below one at a time.
pip install sqlalchemy (for database connections)
pip install polars (for data pipelining)
Optionally, pip install pandas for familiarity.
Create a Testing and Production folder to help keep your code organized.
Try to keep all projects in their own folders with names that clearly label their purpose. Ex: “Get most recent encounter per patient” or “CCBHC Quality Measurements”
Power Automate: I apologize ahead of time for Power Automate’s confusing setup, but it’s significantly cheaper than the alternatives. I’ll walk you through it step by step. We’re going to divide Power Automate into two parts:
Power Automate Desktop (PAD): For any Python file you want to run, you’ll use Power Automate Desktop and either its “Run PowerShell” or “Run Application” flows. These flows produce errors you can access in PAO when your pipelines fail. You can use these errors to send yourself email alerts or attempt backup options as necessary.
Advanced Tip: If you’ve created different python environments beyond the base then you can activate that specific environment either as part of your PowerShell script or as part of your “Run Application” flows:
Using the PowerShell flow, you can activate the correct environment directly (google how to activate Python environment in PowerShell)
Using the “Run Application” flow, you can activate the correct environment by using the python.exe file within the environment you want in the “Program path” parameter.
Power Automate Online (PAO): Schedule a daily pipeline by using the cloud version of Power Automate to run your Power Automate Desktop flows. I know that’s confusing, but once you open Power Automate Desktop, you’ll understand why: PAD doesn’t have the same intuitive layout as PAO, nor the ability to run multiple items at the same time. And at the time of writing, Microsoft has no plans to change this.
Expect installation to take a couple of days, and go easy on yourself—there’s a learning curve.
Other Advanced Options (For the Technically Curious)
If your team has stronger technical skills, you can layer in:
NVIDIA cuDF library – GPU-accelerated pipelines for very large datasets.
Airflow, Prefect, or Kestra – More powerful open-source orchestration than Power Automate, but with higher setup and maintenance cost.
GitHub Actions – Automate testing and deployment of pipelines.
Logging - Logging errors to a log file will help you troubleshoot problems with your pipelines as they arise.
These aren’t necessary to get started — think of them as “Phase 2 or 3” upgrades.
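For the logging upgrade, a minimal sketch might wrap each pipeline step so failures land in a log file with a full traceback (the log filename is a placeholder):

```python
# Sketch of basic pipeline logging; "pipeline.log" is a placeholder path.
import logging

logging.basicConfig(
    filename="pipeline.log",
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
)

def run_step(name, func):
    """Run one pipeline step, logging success or the full traceback on failure."""
    try:
        result = func()
        logging.info("Step succeeded: %s", name)
        return result
    except Exception:
        # logging.exception records the traceback, which is what you'll
        # actually need when a 2 a.m. refresh fails.
        logging.exception("Step failed: %s", name)
        raise
```

Wrapping steps this way means the log file tells you which step broke and why, without rerunning anything by hand.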
Final Thoughts
After five years of building analytics in behavioral health, here’s what I think: you don’t need Silicon Valley money to build Silicon Valley-level analytics.
This stack addresses the three challenges every behavioral health analytics team faces:
Compliance becomes manageable: Instead of scrambling every time a CCBHC or Medicaid report is due, you’ll have automated processes and secure systems ready to go.
Your team gets their time back: No more staff spending hours on repetitive Excel work when they could be focusing on what matters—your patients and programs.
You can grow at your own pace: Start with one machine and expand to cloud or GPU acceleration as your comfort level and needs increase.
For under $4,000 upfront, behavioral health organizations can build a secure, scalable system that doesn’t just check boxes; it transforms how you work with data. And I sincerely believe that.
If you’re a small clinic or a new data team feeling overwhelmed (like I was), start with the essentials. As your needs grow, you’ll already have the foundation in place to expand. And unlike my first messy months, you’ll have a roadmap to follow.
And remember, you’re not alone in this. The behavioral health/CCBHC analytics community is small but growing, and we’re all figuring this out together.