Navigating Open Source and In-house Tooling in Data Platform Engineering

Stan Chen
3 min read · Oct 2, 2023

In the evolving world of data platform engineering, a common debate is whether to use open source tools or to create custom in-house solutions. This decision isn’t just about the technology; it’s also about managing people and helping them grow in their careers.

Open source tools are created and improved by communities of developers. They often provide ready-to-use solutions for common problems. On the other hand, building in-house means creating a tool from scratch to meet the unique needs of a company.

From a manager’s perspective, building everything in-house can be a drawback for the team. If engineers work only with custom-built tools, they might not gain skills that are recognized outside the company. This can limit their career growth and, in turn, lead to higher turnover. Also, if a company spends too many resources building tools that already exist in the market, it is reinventing the wheel. This can slow progress and reduce the efficiency of the data platform, resulting in higher costs and lower user adoption.

Points to Consider for In-house Development

Technical Considerations

Are there existing tools in the market that can do the job?

  • Assess if there are already solutions that meet your requirements to avoid reinventing the wheel.

How will this tool affect the users’ work?

  • Consider the learning curve, ease of use, and how it will fit into existing workflows.

Does the tool follow good coding practices like CI/CD, linting, and automated testing?

  • Ensuring best practices can make maintenance easier and reduce bugs.

Scalability

  • Will the tool be able to handle increased loads or more data over time?

Security

  • How will the tool ensure data privacy and resist unauthorized access?

Interoperability

  • Can the tool easily integrate with other systems and tools in use?

Any other unique needs or goals of the company?

  • This could include compliance with industry regulations or specific customization needs.

Business Considerations

Is there a real need for this feature?

  • Validate the demand for the feature among your user base.

What’s the return on investment (ROI) on this feature?

  • Calculate both the direct and indirect returns, including time saved or revenue generated.

Does it align with the company’s strategy and current tech market trends?

  • Ensure that the tool or feature fits into the larger business strategy.

Cost Analysis

  • Consider the total cost of ownership, including development, maintenance, and training costs.

Time to Market

  • How quickly can the tool be developed and deployed?

Resource Availability

  • Do you have the necessary skills and manpower to build and maintain the tool?

User Training

  • What will be needed to train users on the new tool?

Feedback Loops

  • How will you collect and implement user feedback?

Exit Strategy

  • If the tool is not successful or needs to be replaced, what is the plan for phasing it out?
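The ROI and cost questions above can be roughed out with simple arithmetic before any deeper analysis. The sketch below is only an illustration: the `ToolEstimate` fields, the three-year horizon, and every figure are hypothetical assumptions, and real estimates would come from your own finance and usage data.

```python
from dataclasses import dataclass


@dataclass
class ToolEstimate:
    """Hypothetical cost and return figures for one option (illustrative only)."""
    development_cost: float    # one-off build or integration cost
    yearly_maintenance: float  # upkeep, upgrades, support
    yearly_training: float     # onboarding and user training
    yearly_time_saved: float   # monetary value of time saved (indirect return)
    yearly_revenue: float      # revenue attributed to the tool (direct return)


def total_cost_of_ownership(e: ToolEstimate, years: int) -> float:
    """Development cost plus recurring maintenance and training over the horizon."""
    return e.development_cost + years * (e.yearly_maintenance + e.yearly_training)


def roi(e: ToolEstimate, years: int) -> float:
    """Simple ROI: (total returns - total cost) / total cost."""
    cost = total_cost_of_ownership(e, years)
    if cost == 0:
        raise ValueError("total cost must be non-zero to compute ROI")
    returns = years * (e.yearly_time_saved + e.yearly_revenue)
    return (returns - cost) / cost


# Made-up numbers for comparing two options over an assumed three-year horizon.
in_house = ToolEstimate(300_000, 80_000, 20_000, 150_000, 50_000)
open_source = ToolEstimate(60_000, 40_000, 10_000, 120_000, 50_000)

for name, est in [("in-house", in_house), ("open source", open_source)]:
    tco = total_cost_of_ownership(est, 3)
    print(f"{name}: TCO={tco:,.0f}, ROI={roi(est, 3):.0%}")
```

A calculation like this does not settle the decision, but it puts both options on the same footing and makes the assumptions behind each number visible for discussion.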

Highlighting the pros and cons

Table created using https://stancsz.github.io/md-to-img/

Both product and engineering managers should work together on these topics. Their decisions can affect many users and shape the company’s tech environment.

Additionally, it’s important to match the data platform’s features with the skills and needs of the users. Some tech-savvy users might enjoy the freedom to write their own code, while others might prefer user-friendly, hands-off tools to get their work done.

In closing, finding the right balance between using open source tools and building in-house is essential for a productive data platform and happy, productive teams.

I’m a platform capabilities lead, and I wrote this article to share insights on this aspect of data platform engineering. Until next time.
