Knowledge.ToString()

Improve Attunity Driver Performance When Extracting Millions of Rows From Oracle

If you are extracting data from Oracle, the Attunity driver helps you extract it faster. But if you are extracting millions of rows and the load is slower than anticipated, here are a few advanced tips that will drastically improve performance.

Advanced Tips to Improve Performance

Right-click the Oracle Source and click “Show Advanced Editor…”.

Oracle Source Right Click

Go to the “Component Properties” tab and, under “Custom Properties”, locate “BatchSize”.

Oracle Source Advanced Editor

The default batch size is 100, which means that as soon as 100 rows are fetched, they are sent to the SSIS pipeline.

If you are using a cloud environment, sending each batch of records has extra overhead. You will not notice the difference with a few thousand rows, but if you are fetching millions of rows, the per-batch overhead adds up: at a batch size of 100, 10 million rows means 100,000 separate handoffs to the pipeline.

Instead of the default Batch Size of 100, increase the value to 10,000 and you will feel the performance difference. 10,000 is my magic number and it has worked for me in various situations, but you may have to find your own magic number by experimenting.
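
You can see the same round-trip effect outside SSIS. In SQL*Plus, the analogous knob is ARRAYSIZE (the number of rows fetched per round trip, capped at 5000); timing the same query at two different sizes is a quick, rough way to gauge how much batch overhead matters in your environment. The table name below is illustrative:

-- SQL*Plus sketch; spool the output to a file first so display time
-- does not skew the timings.
SET TIMING ON
SET ARRAYSIZE 100
-- small batches: many network round trips
SELECT * FROM ORDER_DETAIL;
SET ARRAYSIZE 5000
-- large batches: far fewer round trips
SELECT * FROM ORDER_DETAIL;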


Comments

13 responses to “Improve Attunity Driver Performance When Extracting Millions of Rows From Oracle”

  1. Siushil Rajendra Bandal

Hello Vishal, I am facing a performance issue. I am loading data from an Oracle source to a SQL Server destination, and it takes 9 hours to load 6 crore (60 million) rows. The Oracle source has a join on 3 to 4 tables. Can you please suggest how I can improve the performance? Thanks in advance

    1. Vishal Monpara

      Hello Sushil,

There are a couple of performance improvements that can be done. Make sure your query is optimized, the physical distance between the servers is minimal, use Attunity to fetch the Oracle data, use a proper Batch Size parameter, keep the destination database transaction batch optimal, and disable indexes on the destination table before loading (a sketch of that last tip follows below).
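
      A minimal T-SQL sketch of that last tip, assuming a SQL Server destination table dbo.OrderDetail with a hypothetical nonclustered index (leave the clustered index alone, or the table becomes inaccessible):

      -- Disable the nonclustered index before the bulk load
      ALTER INDEX IX_OrderDetail_UpdtDtTm ON dbo.OrderDetail DISABLE;
      -- ... run the SSIS data flow here ...
      -- REBUILD re-enables the index once the load finishes
      ALTER INDEX IX_OrderDetail_UpdtDtTm ON dbo.OrderDetail REBUILD;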

      Regards,
      Vishal Monpara

  2. nancy

    Hi Vishal,

I had a question which is a little different from the topic, but I couldn’t get an answer to it anywhere else, so I am back here 🙂
    I am trying to do an incremental load using the Attunity connector to connect to the database, but I am not sure what I should be using. I read that we can’t use Lookup, so I have been trying Merge Join, but somehow I haven’t been able to achieve anything. There is a picture of the package at the link below. It would be great if you could help me out:
    https://stackoverflow.com/questions/51777894/incremental-loading-using-attunity-connector
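
    One common alternative when Lookup is not available is to land the Attunity extract in a staging table and then MERGE it into the target. A rough T-SQL sketch; the staging/target tables and key columns here are hypothetical:

    -- Assumes an SSIS data flow has just loaded stg.OrderDetail from Oracle
    MERGE dbo.OrderDetail AS tgt
    USING stg.OrderDetail AS src
        ON tgt.ORDER_ID = src.ORDER_ID AND tgt.LINE_NO = src.LINE_NO
    WHEN MATCHED THEN
        UPDATE SET tgt.AMOUNT = src.AMOUNT, tgt.UPDT_DT_TM = src.UPDT_DT_TM
    WHEN NOT MATCHED THEN
        INSERT (ORDER_ID, LINE_NO, AMOUNT, UPDT_DT_TM)
        VALUES (src.ORDER_ID, src.LINE_NO, src.AMOUNT, src.UPDT_DT_TM);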

  3. nancy

Thanks a ton, Vishal.

    I just tried loading about 24.5 million rows and it took only 23.5 minutes; last time it took 4 hours. I am still trying to reduce it further, but this is amazing for such a small change!!!

    My next target is “load 1 Million rows from Oracle in 30 secs”

    I had been trying to find more articles on improving Attunity ETLs, and this blog was the best and easiest one to find.

    1. Vishal Monpara

      Hello Nancy,

I am glad that you had a drastic performance improvement. For further improvements, check the physical distance between the servers (source db, destination db, SSIS execution server) and the indexes on the source/destination. If the destination is a database, check for bottlenecks in record insertion; one quick check is sketched below.
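
      A quick way to see what the destination SQL Server is waiting on during the load (a standard DMV, available on SQL Server 2005 and later):

      -- Top waits on the destination server; look for I/O or log-related waits
      SELECT TOP 10 wait_type, wait_time_ms, waiting_tasks_count
        FROM sys.dm_os_wait_stats
       ORDER BY wait_time_ms DESC;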

      Regards,
      Vishal Monpara

  4. nancy

Thanks a lot Vishal. I tried the new query; although it didn’t improve the timing, it actually started extracting rows as soon as I started the process. Previously, data extraction and loading started after almost 2 minutes. The query seems to have done that trick. I’ll have to do some reading on why.
    I later changed my DefaultBufferMaxRows from 10000 to 1000 and it reduced the time by a further 15 seconds. Can you suggest any more properties that can be tuned?

  5. nancy

    Hi Vishal,

I have to extract almost 300 million rows per table from Oracle. I tried a dataset of approx 6 million rows and it takes almost 5 minutes to extract from Oracle (using Attunity/OLE DB) and load into SQL Server. I tried modifying the BatchSize using Attunity and it reduced the time to 3:30 minutes. Can you suggest more ways of improving the speed? I am really new to DWH. I am querying a single table:
    SELECT *
    FROM ORDER_DETAIL where to_char(UPDT_DT_TM,'YYYY-MM-DD') >= '2018-07-01'
    Can you guide me in the right direction?
    Thanks
    Nancy

    1. Vishal Monpara

      Hello Nancy,

Try this query; comparing the raw DATE column keeps the predicate sargable, so Oracle can use an index on UPDT_DT_TM instead of converting every row to a string:
      SELECT *
      FROM ORDER_DETAIL where UPDT_DT_TM >= TO_DATE('2018-07-01', 'YYYY-MM-DD')

      Based on your NLS settings, you may need to tweak the parameters of the TO_DATE function.
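
      To see what your session is currently using (a standard Oracle dictionary view):

      -- Check the session's date settings before relying on date conversions
      SELECT parameter, value
        FROM nls_session_parameters
       WHERE parameter IN ('NLS_DATE_FORMAT', 'NLS_DATE_LANGUAGE');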

      Regards,
      Vishal Monpara

  6. Joseph

You are too wonderful, it works like magic.

  7. Jeffrey

    Thanks a Million Vishal,

It made my day; it runs at least 20 times quicker for me now thanks to this option.
    With the default settings, I couldn’t see the point of using Oracle Attunity; I had the same performance as with the Oracle Client connector.
    I changed the BatchSize to 100000 and it’s a life changer. I load 1 million rows from Oracle in 30 secs. It took me 20 mins before.
    Thanks again

    1. Vishal Monpara

Hi Jeffrey,

      Good to see a performance improvement with this little setting.

      Regards,
      Vishal Monpara

  8. prasanth

I have 1 billion records to process from an Oracle Attunity source whose output is a join of 4 large tables. What properties should be set on the source to extract the data faster?

    1. Vishal Monpara

      Hi Prashanth,

      There are two things to consider when you want to improve performance.
1) The query should be optimized (left/right joins, where conditions, indexes, etc.)
      2) As mentioned in this post, change the BatchSize to 5000/10000/15000 or any other number and experiment with it to find the right value for your environment. For an extract this large, one more option is sketched below.
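
      One more option worth experimenting with for extracts this large: split the join output into key ranges and run several data flows in parallel. An illustrative Oracle sketch; the table and column names are hypothetical:

      -- Each SSIS data flow extracts one slice of the joined result
      SELECT h.ORDER_ID, h.ORDER_DT, d.LINE_NO, d.AMOUNT
        FROM ORDER_HEADER h
        JOIN ORDER_DETAIL d ON d.ORDER_ID = h.ORDER_ID
       WHERE h.ORDER_ID BETWEEN :range_start AND :range_end;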
