June 13th, 2012 03:00

Getting back data from Centera

Our customer wants to pull out the entire data set stored on their Centera. Retrieving the data through our application takes a long time. Is there any utility, script, or method the customer can use to get all the data stored on the Centera? Could we write a script on top of it to restore the objects (files) into their respective directory structure?

Thanks

Sreenidhi

41 Posts

June 13th, 2012 05:00

There are several possibilities:

- Use casscript to read the clips and the blobs. This works if you have only a small number of clips (tens of clips), but it involves quite a lot of manual labour.

- Depending on your expertise, you can also write your own tool on top of the SDK to extract the blobs and write them to disk in the right directory structure. This works with large numbers of clips, provided your tool is written to deal with that scale (checkpointing, threading, etc.).

- The company I work for (Datadobi) has a product, DobiMiner, that offers the functionality to migrate from Centera to a directory structure. Contact us through our company website if you're interested in this.

Regards,

Kim Marivoet

Datadobi (www.datadobi.com)

208 Posts

June 13th, 2012 05:00

Hello Sreenidhi -

Recalling archived data through an application for mass migration can be slow even if the application functions fine in normal day-to-day usage scenarios.

My company (Interlock Technology, www.interlock-tech.com) is an EMC Partner offering a service to migrate application data from Centera to another platform such as VNX, Data Domain or Isilon at high speed (multiple TB per day), with retention policy and chain-of-custody preservation, full reporting, and without taking the current system offline until migration is complete. Our system is designed to allow us to write 'plug-ins' to support any Centera API application, and we can provide consulting services if needed to advise you on any changes needed in your application to support the storage platform change.

You can find us on PowerLink or contact us directly at info@interlock-tech.com.

Regards,

Mike Horgan

409 Posts

June 13th, 2012 05:00

You don't say at what rate you are getting the objects back from the Centera through your application. Normally I would expect retrieval via your application to be the quickest. This is because, to get a list of content addresses to restore by other means, you would need to use the jcasscript tool to run a query against the Centera, and this is slow relative to querying your application's database.

Having said that, if the retrieval part is still slow, then:

1) Get a list of content addresses of the clips you want, either by querying your database for them or by running the query command with jcasscript.

2) With that list of clips, either write a tool or script jcasscript to retrieve the blob(s) to your local file system. If you divide the list up into, say, 10 lists, then you could run 10 parallel copies of your tool or scripted jcasscript. Your number may be 4, 8, 10, 20 or even more, depending on your environment.

How long both these steps would take depends on object size and the number of them.
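As a rough illustration of dividing the list up in step 2, here is a minimal Python sketch. It assumes you already have the content addresses as plain strings (for example, exported from your database); the function name `split_into_lists` and the sample values are made up for this example and are not part of jcasscript or the SDK.

```python
from typing import List, Sequence


def split_into_lists(addresses: Sequence[str], parts: int) -> List[List[str]]:
    """Deal addresses round-robin into `parts` lists of near-equal size."""
    if parts < 1:
        raise ValueError("parts must be at least 1")
    # Slice i takes items i, i + parts, i + 2 * parts, ...
    return [list(addresses[i::parts]) for i in range(parts)]


if __name__ == "__main__":
    # Placeholder strings, not real content addresses.
    sample = [f"CA{i:04d}" for i in range(25)]
    for index, chunk in enumerate(split_into_lists(sample, 10)):
        print(index, len(chunk))
```

Each resulting list could then be written to its own file and handed to one copy of your tool or script.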

You should be able to get this set up reasonably quickly, although as always there is a learning curve if you've never done it before. The good news is that jcasscript is free. You can download it from the Centera tools section of the community, and it comes with documentation.

If you have any questions on its use, please just post and the community will help out.

June 13th, 2012 07:00

Thanks for the quick response. Of course my application has the content address of each file, but getting back the entire data set (millions of files accumulated over several years) through the SDK would take time. If there is a way to work directly on the Centera, my guess is it would be faster. Furthermore, the customer has their own application and wants to move all the data stored on the Centera (irrespective of our application) to a new device.

Thanks,
Sreenidhi

208 Posts

June 13th, 2012 07:00

Hi Sreenidhi -

I agree that transferring millions of Centera objects takes time, but probably not as much as you might think. Our system routinely transfers millions of objects per day, at typical rates of 3TB - 6TB per day (sometimes even higher depending on the Centera cluster configuration and data profile).  We would be happy to have a conversation with your customer to see if our solution meets their needs.

Best Regards,

Mike Horgan
www.interlock-tech.com

41 Posts

June 13th, 2012 12:00

As a matter of fact, there is no way to get data off a Centera in a scalable way other than through the SDK (except for ssh-ing directly into the cluster and dumping all blob partitions manually, but that's not a viable option for most people). All tools, like casscript, use the SDK to get the data off the Centera, and this isn't necessarily a problem for performance: you can get very good performance using the SDK if you read the data back using multiple threads and an appropriate read pattern to avoid random I/O on your Centera cluster.
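To make the multi-threading point concrete, here is a minimal Python sketch of the general shape. It is not SDK code: `process_one` is a hypothetical callable that you would supply yourself to read one object and write it to disk, and the default of 8 workers is just a placeholder to tune for your environment. The sketch does not address read ordering on the cluster.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Iterable, List, Tuple


def retrieve_all(
    addresses: Iterable[str],
    process_one: Callable[[str], None],
    workers: int = 8,
) -> Tuple[List[str], List[str]]:
    """Run `process_one` for every address on a pool of worker threads.

    `process_one` is a hypothetical stand-in for "read one object and
    write it to disk"; it is not a real Centera SDK function.
    Returns (addresses that succeeded, addresses that failed).
    """
    succeeded: List[str] = []
    failed: List[str] = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(process_one, ca): ca for ca in addresses}
        for future in as_completed(futures):
            ca = futures[future]
            try:
                future.result()
                succeeded.append(ca)
            except Exception:
                # Record the failure and keep going; retry these later.
                failed.append(ca)
    return succeeded, failed
```

Writing the succeeded list to a file as you go gives you a simple checkpoint, so an interrupted run can skip what is already done.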

I think your main problem will not be performance but how to transform the clip/metadata/blob association into the right directory structure, and how to make sure that all clips are transformed correctly into this structure, especially if you are migrating a live system. Depending on how complex these transformation rules are, you can either use casscript (if your rules are simple) or a more complex third-party migration tool that allows you to define these transformation rules and migrate a set of clips using them.
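As an illustration of what a simple transformation rule might look like, here is a minimal Python sketch. The metadata keys (`department`, `created`, `filename`) and the layout `<root>/<department>/<year>/<filename>` are invented for this example; your application's clip metadata will have its own attribute names and rules.

```python
from pathlib import PurePosixPath
from typing import Mapping


def _component(value: str, fallback: str) -> str:
    """Keep only the last path part so a stored value cannot escape the tree."""
    name = PurePosixPath(value.replace("\\", "/")).name
    return name if name not in ("", ".", "..") else fallback


def target_path(
    root: str, metadata: Mapping[str, str], content_address: str
) -> PurePosixPath:
    """Example rule (an assumption): <root>/<department>/<year>/<filename>."""
    department = _component(metadata.get("department", "unknown"), "unknown")
    year = _component(metadata.get("created", "0000")[:4], "0000")
    # Fall back to the content address when no file name was stored.
    filename = _component(metadata.get("filename") or content_address, content_address)
    return PurePosixPath(root) / department / year / filename
```

Falling back to the content address keeps objects without a stored name restorable, though two clips that map to the same path would still need a collision rule of your own.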

Regards,

Kim Marivoet

Datadobi (http://www.datadobi.com/services/centera-migration)

June 13th, 2012 22:00

Thank you very much

regards
Sreenidhi
