JBoss Fuse - JBoss Fuse or ETL?

While I was hosting a workshop on JBoss Fuse, one of the partners asked me: why not use ETL?

"ETL (Extract, Transform and Load) is a process in data warehousing responsible for pulling data out of the source systems and placing it into a data warehouse."

To me, it is unusual to compare Fuse with an ETL solution, because I think the two should work together and complement each other. They solve different integration problems. JBoss Fuse is a lightweight, more agile implementation of an ESB. When you are introducing SOA or even microservices into your system, JBoss Fuse helps in many ways. I found this excellent decision tree diagram for deciding when to use which.

The most obvious difference between an ESB and ETL is whether you need real-time processing, or whether it is OK to have the data prepared a couple of hours beforehand. To me, ETL is more of a data-tier integration solution: if you have two or more databases with massive amounts of data that you want to join to produce some kind of report, then yes, ETL works best. So if you hear words like "data", "massive amounts", and "report", ETL is probably your best bet.

On the other hand, integration in an enterprise is not simply a matter of joining data. An enterprise has many departments and systems, each providing different services to end users. Using ETL alone to deliver suitable information to all of them soon gets very complicated, because different end users require different kinds and granularity of data, and differ in how recent the data needs to be. You quickly end up with a complex, tightly coupled architecture. From a developer's point of view, getting the data they need also requires a broad understanding of the other systems' database structures. In addition, more and more services are moving to the cloud, and integrating those services at the data level is going to be a challenge. That is why we need a layer that hides the implementation details for us, allowing loose coupling between systems, more agility, and less exposure of unnecessary information.

JBoss Fuse, as an ESB, can help at this service layer. Developers no longer have to worry about which protocols the information arrives over, thanks to the prebuilt connectors in Camel. Camel's Enterprise Integration Patterns let developers freely route data between base services, or even use it to provide or combine services. Since data can come in many forms, Camel also has built-in mappers to transform between data types. It helps with moving your services to the cloud, too.
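To make the routing and transformation ideas a bit more concrete, here is a minimal sketch in plain Java of two Enterprise Integration Patterns: a content-based router and a message translator. It deliberately does not use the Camel API. The class `RoutingSketch`, the message types, the `key=value` payload format, and the endpoint names (`orders`, `invoices`, `deadLetter`) are all hypothetical, chosen only for illustration. In Camel the same idea would be expressed with its route DSL and prebuilt components rather than hand-written code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: a hand-rolled content-based router and message translator.
// This is NOT the Camel API; every name here is hypothetical.
final class RoutingSketch {

    // Each hypothetical endpoint is just an in-memory list of delivered messages.
    private final Map<String, List<String>> endpoints = new HashMap<>();

    // Message translator: turn a "key=value;key=value" payload into a small JSON-like string.
    static String translate(String payload) {
        StringBuilder out = new StringBuilder("{");
        String[] pairs = payload.split(";");
        for (int i = 0; i < pairs.length; i++) {
            String[] kv = pairs[i].split("=", 2);
            if (i > 0) {
                out.append(",");
            }
            out.append('"').append(kv[0].trim()).append("\":\"")
               .append(kv.length > 1 ? kv[1].trim() : "").append('"');
        }
        return out.append("}").toString();
    }

    // Content-based router: pick the destination from the message type.
    static String chooseEndpoint(String type) {
        if ("order".equals(type)) {
            return "orders";
        }
        if ("invoice".equals(type)) {
            return "invoices";
        }
        return "deadLetter"; // anything unrecognised is parked for inspection
    }

    // Route one message: translate it, deliver it to the chosen endpoint, and report where it went.
    String route(String type, String payload) {
        String endpoint = chooseEndpoint(type);
        endpoints.computeIfAbsent(endpoint, k -> new ArrayList<>()).add(translate(payload));
        return endpoint;
    }

    // Messages delivered to an endpoint so far (empty list if none).
    List<String> delivered(String endpoint) {
        return endpoints.getOrDefault(endpoint, new ArrayList<>());
    }

    public static void main(String[] args) {
        RoutingSketch bus = new RoutingSketch();
        bus.route("order", "id=42;item=book");
        bus.route("unknown", "id=7");
        System.out.println(bus.delivered("orders"));
        System.out.println(bus.delivered("deadLetter"));
    }
}
```

The point of the sketch is that the sender only hands over a message; it never needs to know which system ends up receiving it or what format that system expects. That is the loose coupling the integration layer is there to provide.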

In real life there is no one-size-fits-all solution; it is our job to find the solution that best fits our needs. This is how I think ETL and an ESB should work together.

Here you can see a number of data sources: they could be files, databases, or messages. When it comes to providing data for BI and reports, a large amount of data needs to be processed, so it is best to use ETL to generate it in batches. To hide the complexity, services or functions that require smaller amounts of data can be provided through the JBoss Fuse ESB, which ensures that only the right amount of data is exposed, preferably through a RESTful endpoint call. The data generated by ETL can also be extracted and offered as a service to others in the enterprise through the JBoss Fuse ESB; again, to keep things loosely coupled, we can expose it in the format of a CSV data file or an XML file. Then, when it comes to integrating services in the cloud or exposing our own services in the cloud, JBoss Fuse can achieve high availability and load balancing for our services through the Fabric Gateway. We can also orchestrate these existing services to provide more functionality to users.
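As a rough illustration of the hand-off described above, the sketch below assumes an ETL job has already written a CSV report, and shows a thin service layer that loads it and hands each caller only the fields it asks for. The class `ReportService`, the column names, and the sample rows are hypothetical. In a Fuse deployment this lookup would sit behind a RESTful endpoint rather than be called directly, and the CSV would be read from wherever the ETL job puts it.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: serves a narrow view of a CSV report produced by a batch ETL job.
// All names and data are hypothetical.
final class ReportService {

    private final List<Map<String, String>> rows = new ArrayList<>();

    // Parse simple CSV text (header line first, no quoted fields) into rows keyed by column name.
    ReportService(String csv) {
        String[] lines = csv.trim().split("\\r?\\n");
        String[] header = lines[0].split(",");
        for (int i = 1; i < lines.length; i++) {
            String[] cells = lines[i].split(",", -1);
            Map<String, String> row = new LinkedHashMap<>();
            for (int c = 0; c < header.length && c < cells.length; c++) {
                row.put(header[c].trim(), cells[c].trim());
            }
            rows.add(row);
        }
    }

    // Return only the requested fields of the first row whose key column matches;
    // an empty map means no row matched.
    Map<String, String> lookup(String keyColumn, String keyValue, String... fields) {
        for (Map<String, String> row : rows) {
            if (keyValue.equals(row.get(keyColumn))) {
                Map<String, String> view = new LinkedHashMap<>();
                for (String field : fields) {
                    if (row.containsKey(field)) {
                        view.put(field, row.get(field));
                    }
                }
                return view;
            }
        }
        return new LinkedHashMap<>();
    }

    public static void main(String[] args) {
        String csv = "customerId,name,region,totalSales\n"
                   + "c1,Alice,EMEA,1200\n"
                   + "c2,Bob,APAC,800\n";
        ReportService service = new ReportService(csv);
        // The caller asks for two fields and never sees the rest of the report.
        System.out.println(service.lookup("customerId", "c2", "name", "totalSales"));
    }
}
```

The caller depends only on the field names it requests, not on how the report was produced or how the underlying databases are laid out, so the ETL job can change without breaking its consumers.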

In summary, JBoss Fuse and ETL address different aspects of integration. ETL focuses on data-tier integration, which is perfect for big data or reports. But when it comes to decoupling services and making them more agile, reusing code, and taking your enterprise to the cloud, JBoss Fuse is a much better solution.
