The Story of Netdata: Building Infrastructure Monitoring That Actually Works
Costa Tsaousis spent a couple of million dollars on monitoring tools that didn’t work. He was a C-level executive at a fintech company in Greece that was migrating its infrastructure to the cloud, and the monitoring systems that were supposed to help created more problems instead.
So he did what frustrated engineers do: he started writing code nights and weekends to fix it himself.
In a recent episode of Category Visionaries, Costa Tsaousis, CEO and Founder of Netdata, shared the journey from burning cash on broken tools to building a monitoring platform that now leads the CNCF observability category with 66,000 GitHub stars.
The Frustration That Sparked Everything
The problems Costa faced were universal. “I spent a couple of million in monitoring just to figure out what is happening and what is wrong,” he recalls. “I realized that monitoring systems have something is wrong there.”
What bothered him wasn’t that monitoring was hard—it was that it seemed unnecessarily hard. “Why there is so big time, such a big learning curve, such a big setup and preparation that you have to do, why it’s not real time, why you have to know every metric and go through all the burden to understand exactly what’s happening in very detail.”
What began as curiosity quickly turned into obsession. He started experimenting, writing code, working weekends.
The Breakthrough
After months of weekend coding, something clicked. “I managed to solve the problems that [we] were facing,” Costa says. “Actually, these were very nasty bugs at the cloud provider infrastructure, and we managed to find them.”
The real breakthrough was realizing he’d built something fundamentally different. His tool didn’t just monitor better; it eliminated the entire setup process.
Costa released it on GitHub. “Nothing happened,” he admits. “You have spent a lot of time working on a project. It solves your problems. You release it and nothing.”
The Reddit Post That Changed Everything
For weeks, the project sat on GitHub collecting digital dust. Then Costa tried something different. “One morning, I write a post on Reddit and say, okay, guys, I build this tool, check it on GitHub if you like it.”
What happened next rarely happens to open source projects. “And boom, it went viral at the top of Hacker News. Hundreds of people, thousands of installations.”
The response wasn’t just big—it was overwhelming. “It was crazy, amazing. I have never even. It’s something unique. I think that for people to leave this love of this acceptance, this adoption, so after that point, my life changed completely.”
Why It Resonated
The viral adoption wasn’t random. Costa had solved a problem every infrastructure team faced: time.
“The problem with monitoring is that you really have to spend a tremendous amount of time and you need serious skills to understand, to set up a monitoring system and start using it,” Costa explains. His insight: “All of the companies across the world have to go through the same process.”
Every company uses standardized infrastructure—database servers, web servers, containers. Yet every company manually configured monitoring from scratch. “Why people across all the companies have to go through the same process again in order to monitor their standardized infrastructure?”
His solution: build that knowledge into the tool. “Once you build a thing and you know, this starts up and starts collecting stuff by itself, you don’t do anything,” he explains. “It finds a database server, it connects to, it starts connecting stuff from the database, it finds these containers, network interfaces, whatever it is there.”
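The zero-configuration idea can be illustrated with a toy sketch. The Python below is not Netdata’s implementation: the `KNOWN_SERVICES` table, the ports listed in it, and the `discover` function are all assumptions made for illustration. It probes a few well-known local ports and reports which services it found, which is roughly the kind of knowledge Costa describes building into the tool.

```python
import socket
from typing import Callable, Dict, List

# Hypothetical table: well-known default ports mapped to a service name.
# A real agent would carry far richer detection logic than a port check.
KNOWN_SERVICES: Dict[int, str] = {
    80: "web server",
    3306: "mysql",
    5432: "postgres",
    6379: "redis",
}


def port_is_open(port: int, host: str = "127.0.0.1", timeout: float = 0.2) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def discover(probe: Callable[[int], bool] = port_is_open) -> List[str]:
    """Return the names of known services whose default port answers the probe."""
    return [name for port, name in sorted(KNOWN_SERVICES.items()) if probe(port)]


if __name__ == "__main__":
    for name in discover():
        print(f"found {name}: enabling its collector")
```

Passing the probe in as a parameter keeps the sketch testable without real services: `discover(lambda port: port == 6379)` reports only `redis`.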
Building in Public
What started as a side project became Costa’s full-time focus. Today, Netdata has 66,000 stars on GitHub and leads the observability category in the CNCF landscape, surpassing Elastic. The platform attracts 5,000 to 10,000 new users daily, with 250,000 Docker Hub downloads. The SaaS offering brings in 150 to 200 business signups daily and monitors 100,000 nodes.
The Enterprise Validation
The most telling validation came from Fortune 500 companies that had built their own monitoring systems. “Today we have many Fortune 500 companies that they stop. They shut down the monitoring systems that they have developed themselves using, of course, open source tools or proprietary tools or whatever, in order to use [Netdata],” Costa notes.
Why? “They find that the completeness of the data is such that they can never do it by themselves. They don’t have the skills, the time, the effort.”
The Technical Innovation
Netdata’s architecture differs fundamentally from that of traditional monitoring tools: where they centralize data, Netdata distributes it. “You install as many [Netdata] agents as you need out there on all your servers,” Costa explains. “When all of them connect together, they [build] a massive distributed database that is spread all over the infrastructure.”
The result: “You can scale to infinity, and still you don’t need to scale up the servers by bigger servers and the likes just for monitoring.”
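A minimal sketch of that distributed layout, assuming a simple in-memory model: each agent stores only its own server’s samples, and a query fans out to every agent and merges the answers. The `Agent` class and `fleet_query` function are hypothetical names for illustration, not Netdata’s API.

```python
from typing import Dict, List


class Agent:
    """Hypothetical per-server agent that stores only its own samples."""

    def __init__(self, host: str) -> None:
        self.host = host
        self._series: Dict[str, List[float]] = {}

    def record(self, metric: str, value: float) -> None:
        """Append one sample to this agent's local store."""
        self._series.setdefault(metric, []).append(value)

    def query(self, metric: str) -> List[float]:
        """Return a copy of the locally stored samples for a metric."""
        return list(self._series.get(metric, []))


def fleet_query(agents: List[Agent], metric: str) -> Dict[str, List[float]]:
    """Fan a query out to every agent and merge the results by host."""
    results: Dict[str, List[float]] = {}
    for agent in agents:
        samples = agent.query(metric)
        if samples:  # skip hosts that do not collect this metric
            results[agent.host] = samples
    return results


if __name__ == "__main__":
    web, db = Agent("web-1"), Agent("db-1")
    web.record("cpu", 12.0)
    db.record("cpu", 40.0)
    db.record("queries", 900.0)
    print(fleet_query([web, db], "cpu"))
```

In this model, adding a server adds both the data and the storage for it, so no central store has to grow with the fleet.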
The team also trains machine learning models at the edge. “We train machine learning models on each server. So each server collects its own metrics and it trains its own models at the edge.” To avoid false positives, they look for synchronized anomalies: “When metrics, all these anomalies get synchronized and a lot of metrics have a lot of anomalies together concurrently, then for sure we know that there is something bad happening in the infrastructure.”
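The synchronized-anomaly idea can be sketched in a few lines. As a loud assumption, a simple z-score check stands in here for the per-server trained models, which the interview does not describe in detail; the function names and both thresholds are hypothetical. The point is the second step: one odd metric is ignored, while many metrics turning anomalous at the same moment trip the alert.

```python
from statistics import mean, pstdev
from typing import Dict, List


def is_anomalous(history: List[float], latest: float, z_threshold: float = 3.0) -> bool:
    """Flag `latest` if it is more than z_threshold standard deviations from the history mean.

    Stand-in for a trained model; assumed for illustration only.
    """
    if len(history) < 2:
        return False  # not enough data to judge
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        return latest != mu
    return abs(latest - mu) / sigma > z_threshold


def anomaly_rate(histories: Dict[str, List[float]], latest: Dict[str, float]) -> float:
    """Fraction of metrics whose latest sample is anomalous at the same instant."""
    if not latest:
        return 0.0
    flagged = sum(is_anomalous(histories.get(m, []), v) for m, v in latest.items())
    return flagged / len(latest)


def node_is_unhealthy(
    histories: Dict[str, List[float]],
    latest: Dict[str, float],
    rate_threshold: float = 0.5,
) -> bool:
    """Alert only when enough metrics are anomalous concurrently."""
    return anomaly_rate(histories, latest) >= rate_threshold


if __name__ == "__main__":
    histories = {
        "cpu": [10, 11, 9, 10, 10],
        "disk": [5, 5, 6, 5, 4],
        "net": [100, 101, 99, 100, 100],
    }
    print(node_is_unhealthy(histories, {"cpu": 50, "disk": 5, "net": 100}))
    print(node_is_unhealthy(histories, {"cpu": 50, "disk": 40, "net": 900}))
```

Requiring concurrency trades some sensitivity for fewer false alarms: a single noisy metric cannot page anyone on its own.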
The Unexpected Challenge
With over $30 million in funding and clear product-market fit, Costa identifies a surprising challenge: remote work. “Building the product is not that hard. But managing a company that is 100% remote is probably the toughest,” he admits. Without proper systems, employees face “a lot of noise or misunderstanding or things that you heard here and there that are not company decisions.”
Racing Against Themselves
When asked about competition, Costa frames it differently: “We are racing against ourselves, we’re not racing against someone else, because the product is so unique.”
This confidence comes from market need. Even Fortune 500 companies “need solutions, they need tools.”
The opportunity ahead isn’t about beating competitors—it’s about execution speed. The market is “thirsty” for solutions that work, and Netdata has proven it can deliver at scale. From a Reddit post to 66,000 GitHub stars, from weekend project to monitoring 100,000 nodes, the journey validates a simple principle: solve real problems completely, and growth follows.