Data pipelines and ETL can be ridiculously time-consuming. Luckily, data-ingest-as-a-service tools are getting more robust with their pipeline offerings.
We’ve all had this problem: “All of my data is siloed in different apps, and I want to be able to analyze it all in one place. What do I do?”
My favorite solutions are Segment, Fivetran, and Stitch. Additionally, Zapier deserves an honorable mention here. It has the most integrations of any connector service (plus I work there!), but we don’t offer a managed sync like these other tools do. The opinions in this article are mine, not Zapier’s.
When I last wrote on this topic, Fivetran was the only option that had the integrations I needed. Since then, both Stitch and Segment’s Sources product have come a long way, so I thought it was time to revisit the comparison. Each option has benefits for certain situations. Hopefully, I can help you decide which to use for your data infrastructure setup.
Who has the integrations I need? Choose that one.
Do we need to send data to apps other than our data warehouse? Use Segment.
Do we like open source? Use Stitch.
Do we need the most robust option? Use Fivetran.
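The checklist above can be sketched as a small decision function. This is only an illustration of the order of the questions, not a real selection tool: the `Needs` class, the `recommend` function, and the shape of `catalogs` are all made up for this example, and you would have to fill in each vendor's actual integration list yourself.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Needs:
    """Hypothetical summary of what a team needs from a pipeline tool."""
    integrations: set[str] = field(default_factory=set)
    other_destinations: bool = False  # send data to apps other than the warehouse
    open_source: bool = False
    most_robust: bool = False


def recommend(needs: Needs, catalogs: dict[str, set[str]]) -> str | None:
    """Apply the checklist in order; `catalogs` maps tool name -> supported sources."""
    # Question 1: who has the integrations I need?
    candidates = [tool for tool, sources in catalogs.items()
                  if needs.integrations <= sources]
    if not candidates:
        return None  # no single tool covers everything; you may need to mix tools
    if len(candidates) == 1:
        return candidates[0]

    # Remaining questions act as tie-breakers, in the order listed above.
    preferences = [
        (needs.other_destinations, "Segment"),
        (needs.open_source, "Stitch"),
        (needs.most_robust, "Fivetran"),
    ]
    for wanted, tool in preferences:
        if wanted and tool in candidates:
            return tool
    return candidates[0]
```

For example, with placeholder catalogs where only one tool supports a source you need, `recommend` returns that tool regardless of the other preferences, which matches the first rule of the checklist.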
Check what integrations you need
The universe of available integrations for this type of data pipeline and ETL is outrageously large. Each tool featured here has been systematically expanding its catalog, and most have double or triple the number of integrations they had a year ago.
Even so, in many of the implementations I’ve done, only one tool supported the integrations the team needed. Some teams even had to mix and match tools. While this is not ideal, and is expensive, it’s still less than half of what you’d pay a data engineer to manage these pipelines for you. Worth it.
Segment gets top billing here because I’ve written as far back as 2015 about how much I like their primary user-event-tracking product. The addition of data pipeline, sources, and warehouses came more recently.
Best Reason to Use: You can leverage the main Segment data pipeline product, too.
This gives us a clear primary reason to use Segment. If you’re already using their event tracking, it feels ridiculous not to rely on them as the first choice for your data pipeline, too.
Additionally, Segment offers many more data destinations than these other tools. Fivetran and Stitch focus on piping your data into a data warehouse (i.e. a powerful analytics database) so that you can analyze it. Segment offers that functionality as a by-product of wanting to pipe your data everywhere.
This means that you can use events from your event-tracking service to trigger workflows in your marketing automation software. Or, make tickets from your customer support service automatically assign a bill in your invoice software.
If you need data to co-mingle inside other third-party tools, Segment’s the best option of the three.
Segment is the largest company of the three (and it’s not close). That means there’s a bit more process and procedure, and each customer matters a little bit less. I don’t have anything negative to say about the support team in general, but my experiences have been just average: your concern gets put in a queue and dealt with in a reasonable timeframe. For a company at their stage, that’s normal. It’s just not the founder’s touch that you’re liable to get at some of their competitors.
Because this data pipeline is a secondary product, it doesn’t quite get the attention that it might get at the other organizations. But you do get the peace of mind that the company’s quite stable and you can count on them to support you indefinitely. Each of the other two companies is much smaller and therefore carries more startup risk.
Best Reason to Use: In my experience, the most robust of these options.
This is a hard claim to back up with specific data, but it’s supported by some of the client claims from Fivetran, as well as my own experience. The team over there claims to have replicated “some of the largest recorded Marketo, Zendesk, Zuora accounts.”
When I used it and interacted with my data warehouse, the resulting data just felt more consistent and less flaky than with the other tools here. I completely recognize that this is a tiny sample size, but that was how it felt to me. As I mentioned before, they were also first to this particular use case.
Support / Other Thoughts
“If schemas change or if something goes wrong with your sync, the Fivetran team will fix it immediately. We also offer around the clock technical support if for some reason you do need to get ahold of us.” – Fivetran Team
This quote comes directly from Fivetran, but I chose to call it out here because this was certainly the case when I used them. We were extremely early users, and core functionality broke with the Stripe integration. They investigated and fixed it after hours so that we were up and running the next day.
Best Reason to Use: Open Source Taps
Stitch has 2 big things going for it:
Open Source Integrations – Stitch’s Singer Tap Open Source framework means that you can build a connector into their data infrastructure. Plus, if you make it well enough, I’ve seen cases where they’ll take over support for it. Given how difficult it is for each of these apps to have the particular integration that you need, Stitch is the most likely to get new ones quickly.
Tight relationship with Fishtown Analytics. (Full disclosure here, I contracted with Fishtown – and loved it!) Fishtown’s a great analytics agency no matter what tools you use. The reason to mention them here is related to the Singer tap ecosystem mentioned above. If you don’t want to build the Singer Tap that you need, you might be able to hire Fishtown to build it for you. That flexibility and ability to accelerate your timelines is hugely beneficial.
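To give a feel for what building a Singer tap involves, here is a minimal sketch. As I understand the Singer spec, a tap is just a program that writes JSON messages (SCHEMA, RECORD, STATE) to stdout, one per line. The `invoices` stream, its fields, and the helper names below are hypothetical, and a production tap would usually lean on the singer-python helper library and handle config, discovery, and incremental state properly.

```python
import json
import sys


def tap_messages(stream, schema, key_properties, rows, bookmark_field):
    """Yield Singer-style messages for one stream (illustrative only)."""
    # SCHEMA describes the records that follow for this stream.
    yield {
        "type": "SCHEMA",
        "stream": stream,
        "schema": schema,
        "key_properties": key_properties,
    }
    last_bookmark = None
    for row in rows:
        # One RECORD message per row of source data.
        yield {"type": "RECORD", "stream": stream, "record": row}
        last_bookmark = row[bookmark_field]
    if last_bookmark is not None:
        # STATE lets the next run resume where this one stopped.
        yield {"type": "STATE", "value": {stream: last_bookmark}}


def write_messages(messages, out=sys.stdout):
    """Write each message as one line of JSON, as a tap does on stdout."""
    for message in messages:
        out.write(json.dumps(message) + "\n")


if __name__ == "__main__":
    # Hypothetical sample data standing in for an API response.
    invoice_schema = {
        "type": "object",
        "properties": {"id": {"type": "integer"}, "amount": {"type": "number"}},
    }
    invoices = [{"id": 1, "amount": 20.0}, {"id": 2, "amount": 35.5}]
    write_messages(tap_messages("invoices", invoice_schema, ["id"], invoices, "id"))
```

The point of the design is that the tap knows nothing about the destination: anything that can read these lines from stdin can load them into a warehouse.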
They seem to have the most appealing free plan for small / low volume startups. I appreciate that, since getting budget for a tool like this can be a struggle early on if your team hasn’t seen the value in a previous life. That means that you can go give it a try whenever you want!
All three choices use some sort of variable pricing to charge more as you grow (as they should).
Segment and Stitch are both up front with their pricing schedules and list them on their pricing pages. Fivetran, for now, makes you contact them for pricing. You should go get a quote, as the pricing will almost certainly change between when I write this and when you’re reading it. That said, here’s been my experience.
Note: Stitch and Segment have options for small, free plans. Fivetran offers a 14-day free trial, but no ongoing free plan. This only matters to you if you are a very early startup – as otherwise you’ll have scaled out of the free options anyway.
For small-scale projects (<10,000 users), Stitch and Segment have solid free plans, and remain cheaper options than Fivetran.
Typical usage patterns that I’ve seen end up somewhere between $0 and $500 / month for Segment and Stitch, and $1,000 to $1,500 / month for Fivetran. Note: these numbers have been updated to reflect updated pricing structures (12/19/17), but as always, pricing for SaaS companies is subject to change without notice.
For medium-scale projects, we’re probably looking at $300-$750 / month for Segment and Stitch, with Fivetran still in the previously mentioned range. The cost depends roughly on how much data you’re recording and generating for Segment and Stitch, and on how many, and what types of, data sources you connect for Fivetran.
At scale, the numbers start to diverge, and you absolutely need to get quotes. It’s much less useful to look at what others paid because it could really be anything from $500 to $10k+. Fivetran positions itself as the premium option, and touts the fact that they’ve “replicated some of the largest recorded Marketo, Zendesk, Zuora accounts.” That said, because Fivetran currently bills based on which data sources you’re connecting and how many you need, they could end up as a lower-priced option on the high end.
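If you want to turn those ballparks into a rough annual budget line, a lookup like the following is enough. The figures are only the monthly ranges quoted above (as of the 12/19/17 update), not current vendor pricing, and the table and function names are made up for this sketch. Anything beyond small and medium scale deliberately returns nothing, because that is where you need a real quote.

```python
# Monthly ballpark ranges in USD, copied from the ranges quoted in this article.
# These are the author's observations as of 12/19/17, not vendor price lists.
BALLPARK_MONTHLY_USD = {
    "small": {"Segment": (0, 500), "Stitch": (0, 500), "Fivetran": (1000, 1500)},
    "medium": {"Segment": (300, 750), "Stitch": (300, 750), "Fivetran": (1000, 1500)},
}


def ballpark_annual(scale, tool):
    """Return a (low, high) annual USD range, or None when a quote is needed."""
    tiers = BALLPARK_MONTHLY_USD.get(scale)
    if tiers is None:
        return None  # at scale the numbers diverge too much to guess
    low, high = tiers[tool]
    return low * 12, high * 12


if __name__ == "__main__":
    for tool in ("Segment", "Stitch", "Fivetran"):
        print(tool, ballpark_annual("medium", tool))
```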
No matter who you use, you’re likely to end up with a more cost effective solution than building it yourself!