
JBoss Fuse - JBoss Fuse or ETL?

While I was hosting a workshop on JBoss Fuse, I got a question from one of the partners: why not use ETL?

"ETL (Extract, Transform and Load) is a process in data warehousing responsible for pulling data out of the source systems and placing it into a data warehouse."

To me, it is unusual to compare Fuse with an ETL solution, because I think the two should work together and complement each other. They solve different integration problems. JBoss Fuse is a lightweight, more agile implementation of an ESB. When you are trying to introduce SOA or even microservices into your system, JBoss Fuse helps in many ways. I found this excellent decision tree diagram for deciding when it is best to use which.

The most obvious question when choosing between an ESB and ETL is whether you need real-time processing, or whether it is OK to have the data prepared a couple of hours beforehand. To me, ETL is more of a data-tier integration solution: if you have two or more databases with massive amounts of data that you want to join to produce some kind of report, then yes, ETL works best. So if you ever hear words like data, massive amounts, and report, ETL is probably your best bet.
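As a rough illustration, the rule of thumb above can be written down as a tiny decision function. This is only a sketch: the class name, the parameters, and the one-hour cut-off between "real-time" and "prepared beforehand" are my own assumptions for the example, not something defined by JBoss Fuse or by any ETL tool.

```java
import java.time.Duration;

// Illustrative only: the names and the one-hour cut-off are assumptions,
// not a rule taken from JBoss Fuse or from any ETL product.
final class IntegrationChoice {
    enum Style { ETL, ESB }

    // maxDataAge: how old the data may be when the consumer reads it.
    // bulkReporting: true when the job joins large data sets to build reports.
    static Style choose(Duration maxDataAge, boolean bulkReporting) {
        boolean batchIsAcceptable = maxDataAge.compareTo(Duration.ofHours(1)) >= 0;
        return (bulkReporting && batchIsAcceptable) ? Style.ETL : Style.ESB;
    }

    public static void main(String[] args) {
        // A nightly sales report that tolerates data a few hours old.
        System.out.println(choose(Duration.ofHours(6), true));
        // A customer lookup that must reflect the current state.
        System.out.println(choose(Duration.ofSeconds(1), false));
    }
}
```

In practice the decision has more inputs than these two, but freshness and volume are usually the first things to ask about.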

On the other hand, integration in an enterprise is not simply a matter of joining data. An enterprise has many departments and systems, each providing different services to end users. If you rely on ETL to give each of them suitable information, it can soon get very complicated, because different end users require different kinds and granularity of data, and they also differ in how recent the data needs to be. You soon run into a very complex, tightly coupled architecture. From the developer's point of view, getting the data they need also requires a broad understanding of the other systems' database architecture. And a growing number of services are now moving to the cloud; integrating those services at the data level is going to be a challenge. That's why we need a layer that hides the implementation details for us, allowing loose coupling between systems, making them much more agile, and hiding unnecessary information.

JBoss Fuse ESB can help you in this service layer. Developers no longer have to worry about which protocols the information comes from, because of the prebuilt connectors in Camel. The Enterprise Integration Patterns in Camel let developers freely route data between base services, or even use it to provide new services or combine existing ones. Since data can come in many forms, Camel also has different built-in mappers to transform between data types. It also helps with moving your services to the cloud.
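To make the routing and transformation idea more concrete, here is a minimal plain-Java sketch of two Enterprise Integration Patterns, a Content-Based Router and a Message Translator. It is deliberately not written against the Camel API: `RouterSketch`, `when`, `route`, `csvToXml` and the endpoint names are all hypothetical, and the "endpoints" are just in-memory lists standing in for real connectors.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Predicate;

// Plain-Java illustration of a Content-Based Router plus a Message Translator.
// This is NOT the Camel API; every name here is hypothetical.
final class RouterSketch {
    // One routing rule: a condition, a transformation, and a target endpoint name.
    private static final class Rule {
        final Predicate<String> matches;
        final Function<String, String> translate;
        final String endpoint;

        Rule(Predicate<String> matches, Function<String, String> translate, String endpoint) {
            this.matches = matches;
            this.translate = translate;
            this.endpoint = endpoint;
        }
    }

    private final List<Rule> rules = new ArrayList<>();
    // Stand-in for real endpoints: the messages delivered to each endpoint name.
    final Map<String, List<String>> delivered = new LinkedHashMap<>();

    RouterSketch when(Predicate<String> matches, Function<String, String> translate, String endpoint) {
        rules.add(new Rule(matches, translate, endpoint));
        return this;
    }

    // Deliver to the first matching rule; unmatched messages go to "deadLetter".
    void route(String message) {
        for (Rule rule : rules) {
            if (rule.matches.test(message)) {
                deliver(rule.endpoint, rule.translate.apply(message));
                return;
            }
        }
        deliver("deadLetter", message);
    }

    private void deliver(String endpoint, String body) {
        delivered.computeIfAbsent(endpoint, k -> new ArrayList<>()).add(body);
    }

    // Message Translator: turn an "id,name" CSV line into a small XML document.
    static String csvToXml(String csv) {
        String[] fields = csv.split(",", 2);
        return "<customer><id>" + fields[0].trim() + "</id><name>" + fields[1].trim() + "</name></customer>";
    }

    // XML passes through unchanged; CSV is translated first. Both end up at one endpoint.
    static RouterSketch demo() {
        return new RouterSketch()
            .when(m -> m.startsWith("<"), Function.identity(), "xmlService")
            .when(m -> m.contains(","), RouterSketch::csvToXml, "xmlService");
    }

    public static void main(String[] args) {
        RouterSketch router = demo();
        router.route("<customer><id>1</id><name>Ann</name></customer>");
        router.route("2, Bob");
        router.route("hello");
        System.out.println(router.delivered);
    }
}
```

The consuming service only ever sees one format on one endpoint, whatever form the data arrived in. That is the kind of decoupling a Camel route gives you, with real connectors in place of the in-memory map.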

In real life there is no one-size-fits-all solution; it is our job to find the solution that best fits our needs. This is how I think ETL and an ESB should work together.

Here you can see there are a number of data sources; they could be files, databases, or messages. When it comes to providing data for BI and reports, a large amount of data needs to be processed, so it's best to use ETL to generate it in batch. To hide the complexity, services or functions that require a smaller amount of data can be provided through JBoss Fuse ESB, which ensures they expose just the right amount of data, ideally through a RESTful endpoint. The data generated by ETL can also be extracted and offered as a service to others in the enterprise through JBoss Fuse ESB; again, to keep things loosely coupled, we can expose it in the form of a CSV or XML file. Then, when it comes to integrating services in the cloud or exposing our services in the cloud, JBoss Fuse lets us achieve high availability and load balancing for our services through the Fabric Gateway. We can also orchestrate these already-built services to provide more functionality to users.

In summary, JBoss Fuse and ETL address different aspects of integration. ETL focuses on data-tier integration, which is perfect for Big Data or reports. But when it comes to decoupling services and making them more agile, reusing code, and taking your enterprise to the cloud, JBoss Fuse is a much better solution.



Popular posts from this blog

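JBoss EAP 6 - Performance Tuning (1): The DataSource Connection Pool

There is no real best practice for performance; there are only so many things you can tune. Usually, about 70-80% of an application's performance actually depends on how the application itself is written.

My beloved cat Puji recently passed away during surgery for bladder stones. I am in a bad mood, so please forgive me if my tone is not very friendly; I have not felt like blogging either.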

Puji R.I.P.



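The areas to tune are the JBoss subsystems (Datasource, Web, Web Service, EJB, Hibernate, JMS, JCA), the JVM, and the OS (operating system).

Let's look at the DataSource subsystem first. For the DataSource, tuning is mainly about the connection pool.

Applications usually need to talk to a database. Computation on the local machine, especially in memory, is fast, but connecting to an external resource is very expensive. That is why today's application servers keep a pool of already-established database connections, so that one can be handed out immediately when the application needs it, without spending extra resources on connecting to the database.

This is why we tune the connection pool.

The parameters discussed below are the ones most related to performance. The datasource has many other parameters, such as those that check whether a connection is still valid, which I will not cover. If you are after the fastest possible performance, I suggest not adding any validation at all. Of course, that brings risk to the applications running on the server. This is the trade-off you have to make between performance on one side and correctness and safety on the other. (As a friend of mine says, you cannot expect the horse to run well and also not eat grass.)

The most important tuning parameter is the size of the connection pool (that is, how many connections to keep in the pool). The right value is different for every application.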


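min-pool-size: the minimum number of connections kept in the connection pool.


max-pool-size: the maximum number of connections the connection pool may open.


prefill: create min-pool-size connections in the connection pool ahead of time.

My suggestion is to observe how many connections the application normally uses and set min-pool-size to that.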
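To show how a minimum size, a maximum size, and prefilling interact, here is a toy pool in plain Java. It is only a sketch for illustration and says nothing about how JBoss EAP implements its datasource pool: `PoolSketch`, `borrow`, `giveBack` and `openCount` are hypothetical names, and the "connections" are just strings produced by a supplier.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

// Toy pool that mimics a minimum size, a maximum size, and prefilling.
// Illustration only; not how JBoss EAP implements its datasource pool.
final class PoolSketch<C> {
    private final Deque<C> idle = new ArrayDeque<>();
    private final Supplier<C> connect;   // the expensive "open a connection" step
    private final int maxPoolSize;
    private int open;                    // idle plus borrowed connections

    PoolSketch(Supplier<C> connect, int minPoolSize, int maxPoolSize, boolean prefill) {
        if (minPoolSize < 0 || minPoolSize > maxPoolSize) {
            throw new IllegalArgumentException("need 0 <= min <= max");
        }
        this.connect = connect;
        this.maxPoolSize = maxPoolSize;
        // This sketch never closes idle connections, so the minimum only matters here:
        // with prefill, the minimum number of connections is opened up front.
        if (prefill) {
            for (int i = 0; i < minPoolSize; i++) {
                idle.push(connect.get());
                open++;
            }
        }
    }

    // Hand out an idle connection if there is one, otherwise open a new one
    // as long as the maximum has not been reached.
    synchronized C borrow() {
        if (!idle.isEmpty()) {
            return idle.pop();
        }
        if (open >= maxPoolSize) {
            throw new IllegalStateException("pool exhausted");
        }
        open++;
        return connect.get();
    }

    synchronized void giveBack(C connection) {
        idle.push(connection);
    }

    synchronized int openCount() {
        return open;
    }

    public static void main(String[] args) {
        int[] created = {0};
        PoolSketch<String> pool = new PoolSketch<>(() -> "conn-" + (++created[0]), 2, 3, true);
        System.out.println("open after prefill: " + pool.openCount());
        String c = pool.borrow();          // served from the prefilled connections
        pool.giveBack(c);
        System.out.println("open after borrow and return: " + pool.openCount());
    }
}
```

A real pool would block or time out instead of throwing when it is exhausted, and would also close idle connections above the minimum. The sketch leaves both out to keep the three settings easy to see.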

My 2cents on the future of Integration - With Service Mesh/Istio and Serverless/KNative

It's been a year and a half since I blogged about "Agile Integration architecture" (gosh, time just flies). With the "microservices" and "cloud-native" hype, I was especially curious about how all these new concepts and technologies affect the way we architect integration systems. If you pay close attention to the latest and greatest news from the Kubernetes community, I am sure you have heard a lot about the new "Service Mesh". Rumor has it that this is how integration can and should be done in the cloud-native world, but is that so? Anyone who has ever worked on an integration project will tell you it's a LOT more COMPLEX and can get worse over time. I gave a talk with Christian Posta at Red Hat Tech Exchange, taking a more comprehensive view of how different Red Hat technologies are applied under different patterns when building integration solutions. In fact, he also wrote a great blog post about it.

Since then, another topics has be…

Red Hat Fuse - Announcing Fuse 7 Tech preview 3 release.

Red Hat Fuse 7.0 technical preview three is out today! On its way to becoming one of the best cloud-native integration platforms, Fuse gives developers the freedom to choose how they want to develop their integration solution and where they want to deploy it, and adds capabilities for new integration personas who do not have development experience.
By supporting the three major runtimes, Fuse leaves developers free to work on the runtime of their choice. By supporting both standalone and cloud deployment, it removes the complexity of distinguishing between these environments, allowing applications to be deployed freely in the environment of your choice. Developers of all levels are welcome: you can either dive deep into creating customized, complex integration logic, or use the new low-code platform to quickly build a simple integration. In this Tech Preview release you get it all.
Fuse Standalone: Spring-boot for microservice, Karaf 4 for OSGi lover, JBoss EAP for JavaEE developers. Fuse on OpenShift: Plugins for easy co…