IRMS
The interactive hub of the information world

IRMS Blog

Ramblings

Thursday, 04 April 2013

 

I had considered another week of Big Data as a key theme for this Blog Post, especially after reading this interesting article from Tom Davenport (http://www.zdnet.com/tom-davenport-big-data-is-too-important-to-be-left-to-the-quants-7000013317/). It reminded me of web development and those early days of coders being king (I spent a great deal of my early career coding HTML and other wondrous languages that built the earliest iterations of the internet).

Big Data relies on analysts, statisticians and other suitably big-brained folk to extract, manipulate and create information that business decision makers can consume. I have wondered for some time how long this will remain the case as new technology slowly spreads through the business environment and enables decision makers to run queries and analysis themselves through more usable interfaces, something akin to the WYSIWYG interface for website design.

The article is worth a read and is certainly food for thought if analysis and analytics are your thing. In terms of relevance to the world of IRMS members, it brought me back to the fundamentals of good Governance: regardless of the analytics or the processes that use information and data, the reliability and integrity of information are critical to its usefulness, and the principles of good information and records management practice are what provide the appropriate degree of assurance of the integrity of information used in our business processes.

This also played out in a discussion on one of our new blogs, this time the Blog of the South West Group, who have invited their network to suggest agenda items for their forthcoming meeting. The current front-runners for discussion are:

  • SharePoint as a solution for Records and Information Management
  • Retention, Destruction and Disposal
  • Providing User Guidance and Awareness Raising

What’s interesting here is that these same topics have been present as a challenge for business and organisations for as long as I have been an Information and Records Management professional (albeit the SharePoint aspect is technology specific, the EDRM/Shared Network/Personal Network Drive/Local Storage etc. challenges have been kicking around for a while!).

I’ll be attending this meeting and will be interested to hear the discussions around their agenda items and of course I will be reporting back. 

I’d encourage any of our members to participate in local and specialist Group events (and if you’re not a member, our groups always open their doors to guests if you get in touch with the chair and register your attendance). These meetings are always a fabulous opportunity for interesting and rich debate on all things that impact or affect the world of Information and Records Management and are also a fantastic opportunity to meet like-minded people and learn from their experience as well as collaborate on shared challenges.

Emily Overton is the Groups Director and is doing a great deal of work supporting the Groups and promoting their work along with the Groups Officer, Scott Sammon. This week they’re attending the IRMS Ireland Group Meeting, which is focussed on Data Protection from the perspectives of the Private and Public Sectors; more details of their meeting can be found on the IRMS Ireland Blog. It’s also worth keeping an eye on the Group Blogs for any follow up from the meeting and also minutes and presentations for those unable to attend.

We are planning to introduce a feed to the homepage of the IRMS Website that aggregates all of our blog posts so that you can be confident that you won’t miss out on content; we’re hoping this will be available in the next month or so. You won’t miss it on the homepage and I’ll be sure to mention it in the weekly post here.

 

 

 

  

Big Data #2

Thursday, 28 March 2013

So, let’s resume the exploration of the subject of Big Data.

As usual, I headed to Wikipedia for a definition to get us started:

Big Data is a collection of data sets so large and complex that it becomes difficult to process using database management tools or data processing applications. 

You only really get a sense of just how big Big Data can be when you consider the byte sizes involved. The Wikipedia page linked above gives this example:

If all sensor data were to be recorded in LHC, the data flow would be extremely hard to work with. The data flow would exceed 150 million petabytes annual rate, or nearly 500 exabytes per day, before replication. To put the number in perspective, this is equivalent to 500 quintillion (5×10^20) bytes per day, almost 200 times higher than all the other sources combined in the world.

So that’s pretty big! The article does state that only 0.001% of the sensor stream data is currently recorded, but that still equates to 200 petabytes annually after replication.

Now, if you’re wondering what a petabyte looks like, it’s 1 million gigabytes or 1 thousand terabytes and if, like me, you’ve been around IT a while, you’ll remember when a single terabyte server was much coveted and highly prized! *
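
For anyone who wants to sanity-check these sizes, the arithmetic can be sketched in a few lines of Python. This is only an illustration: it assumes decimal (SI) units, where each step up is a factor of 1,000, and the names UNITS and convert are made up for this example rather than taken from any of the articles mentioned.

```python
# Decimal (SI) storage units in bytes.
# Assumption: 1 petabyte = 10**15 bytes (not the binary 2**50).
UNITS = {
    "gigabyte": 10**9,
    "terabyte": 10**12,
    "petabyte": 10**15,
    "exabyte": 10**18,
}


def convert(value, from_unit, to_unit):
    """Convert a quantity from one storage unit to another (hypothetical helper)."""
    return value * UNITS[from_unit] / UNITS[to_unit]


# How many gigabytes and terabytes make up one petabyte.
petabyte_in_gigabytes = convert(1, "petabyte", "gigabyte")
petabyte_in_terabytes = convert(1, "petabyte", "terabyte")

# Daily rate implied by the quoted 150 million petabytes per year.
exabytes_per_day = convert(150_000_000, "petabyte", "exabyte") / 365

print(f"1 petabyte = {petabyte_in_gigabytes:,.0f} gigabytes")
print(f"1 petabyte = {petabyte_in_terabytes:,.0f} terabytes")
print(f"150 million petabytes a year is about {exabytes_per_day:,.0f} exabytes a day")
```

On these assumptions a petabyte comes out at 1,000,000 gigabytes or 1,000 terabytes, and 150 million petabytes a year works out at roughly 411 exabytes a day, which is the same order of magnitude as the "nearly 500" in the quoted passage.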

I tried to research what volumes of stuff different organisations are currently handling, started with some household names and found this super interesting article:

http://paulwallbank.com/2012/08/23/how-much-server-space-do-internet-companies-need-to-run-their-sites/ 

This article is well worth a look and suggests that Google actually uses 2% of the world’s servers! I don’t doubt it, and in my search I found this video on YouTube: http://www.youtube.com/watch?v=avP5d16wEp0

You’ll notice at the end that you can continue your tour using Street View here: http://www.google.com/about/datacenters/inside/streetview/ (according to the YouTube comments, there is a Storm Trooper standing in the server room if you fancy a game of ‘spot the Storm Trooper’).

I realised in my quest that it’s easy to get tied up in the numbers; petabytes are huge but yottabytes are bigger! The world we live in now is built upon these vast tides of information that wash the planet with knowledge and frippery. For us as a community of information and records management professionals this can be a daunting prospect, but is it any more daunting than looking into an archives centre?

The National Archives (UK) contains 1000 years of history and over 20 million descriptions of records created by UK Central Government and the Courts of Law of England and Wales. If you consider trying to find a single term or reference to a single person or place in amongst the billions of words recorded in the millions of records at the National Archives, that’s an equally huge task.

The difference though is that the physical format record relies on the classification of the object in order for it to be found, whereas the technological solution uses the power of that technology to retrieve a simple term in a nanosecond. It doesn’t happen by magic though; it depends upon computing power to retrieve the data, format compatibility to render it useful and a whole gamut of other things before you can determine its reliability or usefulness (although reliability and usefulness are equally challenging when dealing with physical format records).

Done right - Big Data introduces a great new opportunity to business and organisations. It means that the data retained in your organisation can be virtualised, optimised and made available a million-times-a-minute**, giving business decision makers real-time access to business intelligence through analytics tools that let them choose how they consume information and make it useful to them. It removes the need for teams of analysts interpreting the requirements (or whims) of decision makers and it provides a mind-blowing degree of insight that can be recreated, recut and reused to suit the audience.

Big Data enables agility and responsiveness within business which is essential if the world continues to produce and consume information at its current rate.  

Done badly - Big Data is a burden; many organisations already creak under the pressure of systems that fail to communicate, data that is unreliable, untrustworthy and undiscoverable, and the seemingly endless demands of decision makers. The advance of even bigger volumes of information within an organisational context means that businesses that rely on data and information simply must keep up and must keep on top of their data volumes, either through better management or through better systems.

* Alarmingly, you can now buy a terabyte hard drive for less than £50, which is mind-boggling if you can remember how much a 10 terabyte server was costing as little as 5 years ago!

** not guaranteed

 

Big Data, Open Data

Thursday, 21 March 2013

This week’s theme was going to be Big Data; it fits with the current series of blog posts regarding the terminology that is used to describe the various aspects of information management and use within contemporary organisations and across society.

Big Data will mean many things to our membership; after all, we’re lucky enough to welcome colleagues from across the Information and Records Management profession, whether you’re interested in traditional IRM concepts or those at the very forefront of change within the sector. The likelihood is that all of our membership will be familiar with the term, especially if you’ve recently received your conference flyer (and yes, that is me on it and yes, I was enthralled by the wondrous table magician that entertained us at last year’s gala dinner).

The decision to theme this year’s conference around Big Data and Open Data is in response to feedback from members at last year’s Conference, at Regional and Group Meetings and also through the bulletin. Big Data and Open Data are concepts that are a significant part of our day jobs now and a key theme of our future strategies. It’s something that isn’t going to go out of fashion anytime soon (unless someone does manage to invent the miracle solution to the data and information needs of our organisations and stakeholders) and so it’s essential that we enable our membership to keep in step with industry and sector changes through the conference and afterwards through the bulletin and our membership communities.

I always look forward to the IRMS Conference; for me it is a great opportunity to catch up with colleagues from across the profession, hear what they’ve been up to and what they’re thinking about in terms of the future. The insight I gain at Conference from the wealth of experience and industry expertise is more than I ever manage to achieve through (seemingly endless) use of the internet, various professional publications that I subscribe to and the ceaseless reading and research that I undertake as part of my everyday.

This year I am especially looking forward to hearing from the Rt Hon Jack Straw, who is our Guest Speaker at the Gala Dinner, and also the keynotes; I always enjoy these big sessions where you get to hear from someone doing amazing things or thinking great thoughts. This year is no different; we welcome keynote addresses from SCISYS, LSE, BBC Archive Development and Intelligence. The Keynote addresses cover everything from Outer Space to the BBC and, considering Gareth Meatyard will host a breakout session on how to ‘Become an Information Hero’ and ‘Master your Dark Data’, it sounds like I might need to dust off my lightsaber and recharge R2 for a great adventure!

You can see more of the Programme over on the Conference pages and in the coming weeks the IRMS Conference Blog will showcase some of our speakers, sponsors and exhibitors, giving you an insight into what you can expect at Conference. We’ll be tweeting and sharing any articles but if you want to be sure that you receive all updates to the Conference Blog, you can subscribe via the RSS feed option.

So, I guess I'll talk more about Big Data next week but in the meantime, I'd encourage you to book your place at Conference and get the lowdown on all things Big Data from a wealth of industry experts, your peers and colleagues.

Information Optimisation

Wednesday, 13 March 2013

I use Google Alerts to find me interesting things on information-related topics on the internet and I am never disappointed with what it delivers. It helps me keep on top of the changes that constantly cause our professional sphere to shift and expand to incorporate new technologies and developments and move away from less current concepts.

Today I had a timely notification of a blog post from www.information-management.com; http://www.information-management.com/blogs/big-data-check-under-the-hood-10024058-1.html

The reason this post was timely is that I am currently scratching my head over the dilemma of old meets new: how can information professionals provide assurance and governance by applying old techniques to new technology?

This article reminded me that information is only valuable when the right information is available to the right people, at the right time. I would also add that it is essential that this information can be relied upon, which means that some of the traditional governance and quality methodologies continue to be relevant when applied to the core information and data assets of an organisation. These are old concepts and a foundation stone in the world of Records Management practice; in an ideal world they would already be embedded as fundamental constructs that support the managed retention of information.

So, how can the right people gain access to the right information (at the right time) if it is strewn across multiple data platforms, buried in end-user computing (EUC) applications and managed without any form of standardisation that would at least enable some kind of discovery technology to be deployed? This article suggests that to pursue a traditional route is pointless; the problem is too great and the resource isn’t available to solve the problem that way.

And so the alternative to trying to fix and manage your legacy could be the implementation and deployment of an information optimisation platform that

“...must be able to handle the volume of information requests from groups and individuals based on all the data residing inside and outside the organization. Moreover, as these are certain to increase, it must offer scalability and the potential to meet users’ increasing performance needs. To satisfy the many types of users and skill levels, it must rank high for usability, offer flexibility in user interfaces and be accessible from many applications, portals and mobile devices. In short, the platform must become an effective foundation element of the organization’s information architecture and support a range of needs that can span from capturing data, modeling and converting it while maintaining security, automating the process and sharing and distributing information.”

Information Optimisation enables an organisation to move away from the manual application of controls, the manual checking and compliance audits necessary to demonstrate appropriate information governance practice and the manual way in which data is classified, accessed and used. Instead, access to information and reporting tools is provided via a simple interface that lets a business user select the key data items needed to produce the information they require, without the technical skills needed to build complex reporting solutions.

Some of these tools have been around for a while; 

Quite some time ago I got very excited about the Autonomy EDC product that looked to provide a variety of opportunities to integrate and exploit structured and unstructured information from various sources. I’m not 100% sure where in the Autonomy product suite this now sits and I am uncertain of its development (although I am really interested to learn how it’s been adopted).

The much-publicised acquisition of Autonomy by HP has no doubt added to HP’s product portfolio for Information Optimisation, and they currently have a wealth of information supporting their Information Optimisation products and solutions.

I have also long been excited about Active Navigation; when I first learned of Active Navigation and the possibilities available via its products and solutions I felt it was nothing short of a miracle. I’d recently emerged from a three-year programme that had seen the implementation of an EDRM solution with associated controls to bring governance to the unstructured information estate and a forward agenda to scale out those controls across all information and data assets within the organisation. I recognised early on that Active Navigation would accelerate the implementation of the controls necessary to bring about good governance and also quickly enable the organisation to make full use of the information that it retained.

The really interesting thing about these providers, and many others in the Information Optimisation space, is that any attendee of the IRMS Conference will be familiar with them as they’re regular exhibitors and speakers at Conference. I’ve learnt about their products through conversations with their staff in the exhibition hall, through presentations and seminars that they have sponsored and also at Regional Group Meetings where they have cascaded information and insight to our Regional Group Members.

Optimisation isn’t new to us; we’ve been waiting for it for some time as it will enable any IRM practitioner engaged in any size of organisation to ensure better governance and assurance of the information that they retain and also to enable better access, retrieval and usage for business users. I am certainly looking forward to its adoption as a key discipline within the Information Architecture of organisations.

   

 

 

 

 

 

eDisclosure - eDiscovery

Thursday, 07 March 2013

This week’s post is courtesy of Nicholas Cooper, our Secretary, who offered up some thoughts on eDisclosure and eDiscovery:

Background

The UK has addressed eDisclosure since 2006, when the post-Enron legislation in the US started to have an impact on firms that were registered on Wall Street or had offices that operated in the US.

What it means for UK businesses

Quite simply, eDisclosure is where a judge or statutory regulatory body demands that an organisation hand over information in any format, whether digital (including data), physical or communications based (including email, social media or telephony), to support a legal investigation. eDiscovery is the process and capability necessary to make eDisclosure a lot less costly and a lot less painful.

One of the main issues is the conflict between an organisation's understanding of what is disclosable and the view of the legal entity that requires the disclosure. eDisclosure can require the disclosure of literally anything that an organisation holds and can involve searching the entire information infrastructure, including electronic systems and servers, back-ups, archives, disaster recovery sites, CDs, key sticks, laptops, old floppy discs, fiche etc. Not having a means to read obsolete storage media is no excuse: if an organisation is legally obliged to disclose information, it will need to secure access to whatever facilities are necessary to find files stored on any medium.

The fundamentals that mitigate the potential impact of eDisclosure are not new: they amount to basic good housekeeping of files. That means applying laws and guidelines on how long to keep information and records and, where legislation doesn't define a retention period, creating business rules that reflect the Board's risk appetite regarding retention and disposal. Whilst many organisations have this reasonably well managed for physical format information, there is usually far less control over digital assets. In many organisations there is still a conceptual split between the physical and electronic storage of information; moreover, there is a conceptual split between corporate governance and information governance.

In the US it is this issue that has caused organisations to incur many hundreds of thousands of dollars in legal costs and people resource (and court fines) to manually make sense of their information assets, because they have failed to implement proper control and effective management practices for electronic information. There are a great many horror stories and case studies available on the internet and we plan to review some of these incidents via this blog in the future.

 
