The world of storage has changed radically. The transition of fixed content from shared network drives and home folders is overshadowed by a massive increase in machine- and user-generated data (audio, photos and video). Expensive high-performance block or file storage can no longer economically hold this volume of data, nor is it advantageous to do so. This shift in datasets is bringing with it new requirements for accessibility and distribution—flipping the storage paradigm upside down. If the definition of Tier 1 is based on importance or value to the organization, then for many data-driven organizations, object storage will become the new Tier 1.
Databases and analytical environments (Spark, HDFS, etc.) are considered transient destinations for running analysis, and cloud compute is ideal for processing data sets. However, once results are obtained, data sets are often deleted because of ongoing storage costs. In addition, web applications now need to service millions of distributed customers. The impact on the infrastructure? Expensive file or block storage (with high-performance single-file or block writes) has become the temporary target for analyzing data, while object storage, which was designed to handle massive throughput, has become the more permanent environment that supports storage and management of ever-expanding, distributed data sets.
To remain competitive and manage the costs of storage, you need to go with the only infrastructure that can absorb the massive influx of data, which is object or scale-out storage. Object storage is capable of handling many connections, writing in parallel, providing massive throughput, consolidating data from distributed sources leveraging the web, and handling many protocol inputs (for seamless integration with analysis and cloud platforms). This provides the most cost-effective platform for data-driven organizations.
The management of data should be thought of in terms of “gathering,” “cataloging,” “analyzing,” “annotating” and “distributing.” A proper object storage platform with built-in metadata management leveraging NoSQL infrastructure can serve four of these five functions. With the many parallel options for inputting and outputting data, object storage can readily place data into temporary compute space for analysis. The output can be stored with rich metadata, and the data process can have its metadata annotated as necessary to constantly improve the organization of your corporate data resource.
Object storage optimizes your compute infrastructure and costs by allowing the sharing of these resources or leveraging the temporary compute infrastructure that the cloud provides. So, if object storage provides an instantly accessible platform for your data that reduces your current storage TCO while enabling analysis and distribution in an elastic fashion that you can quickly scale up and scale down, then I argue that object storage should be the new Tier 1 for any organization that is serious about extracting value from continued access to its unstructured data sets.
The value of the cloud and mobile devices is undeniable. The elastic resources provided and the ability to create and access data from any location have changed society. However, as with any disruptive technological movement, it presents new challenges. Privacy concerns run rampant, and laws and regulations that have struggled to keep up are now being retrofitted to existing workflows—testing the limits of most information governance and IT execs. Unless you have been avoiding email, you have undoubtedly received a General Data Protection Regulation (GDPR) compliance request of some sort. Why? Because penalty fees for non-compliance can go all the way up to €20 million! So, how do you enable information governance in the cloud age?
This is a topic we are going to explore in our upcoming webinar titled Enabling Information Governance for Rapidly Scaling Data Sets. Our presentation will focus on solving 3 pain points:
- Where is the sensitive information?
- Who owns it?
- How do we protect the organization and its data?
In collaboration with our longtime partner and customer NetGovern (formerly Netmail), we will present a high-level framework to help you develop your information governance approach. In addition, we will show how NetGovern integrates Caringo Swarm, leveraging our secure software appliance in a way that is completely transparent to administrators and end users.
Expect this webinar to be a bit different from some of our more technical webinars. Jacques Sauvé, Information Governance Professional (IGP) and Director Partner Enablement at NetGovern, and I will focus on helping any executive new to information governance understand the steps in electronic discovery and the importance of providing a secure archive. We hope you can join us live on May 17 at 10 AM Pacific so that you have the chance to ask questions.
The webinar will also be available on demand shortly after the broadcast.
Last week, I had the unique opportunity to attend and speak at the Salishan Conference on High Speed Computing at the Salishan Lodge on Gleneden Beach, Oregon. The conference was founded in 1981, and this year’s theme was “Maximizing Return on Investment for HPC in a Changing Computing Landscape,” with the majority of attendees hailing from Los Alamos, Lawrence Livermore, and Sandia National Laboratories. From many of the talks, I got the distinct impression that the topic of ROI may have been an undercurrent of virtually all 27 past conferences. Given the nature of the computational simulations they are undertaking, it’s no wonder that many of the applications and much of the hardware are quite distinctive to this space, with somewhat limited applicability elsewhere. In fact, I learned from one of the attendees from D-Wave Systems that there are only 3 of their quantum computers in production at customer sites, and all of those customers also attended the conference.
Setting aside the hardware though, they seem to have made some great strides on the software side, leveraging DoE budgets to develop open source software, and even creating communities around some of those projects. But here’s where it gets a bit sticky, as was evidenced by Dan Stanzione’s presentation, “A University HPC Center Perspective on HPC and Cloud Providers.” He concluded that while HPC centers and Cloud Providers potentially share some similar use cases, for the most part, you wouldn’t use an HPC system to run your company’s email, nor would you use a cloud service provider’s HPC service if you truly required highly optimized, high-performance simulations like many of the attendees conduct on a daily basis.
A great example on the need to optimize (customize) software for the hardware to squeeze every last drop of performance was presented by Andrew Connolly from the University of Washington, “Surveying the Sky with LSST: Software as the Instrument of the Next Decade.” Over the first ten years of its lifetime, this new generation of telescopes will survey half of the sky in six optical colors, discovering 37 billion stars and galaxies and detecting about 10 million variable sources every night. The telescope will gather 15 Terabytes per night and will release over 12 Petabytes of queryable data annually.
So, what does all this have to do with object storage? Well, in addition to object storage as an economical back-end archive for all this simulation data that is being generated (as is the case at Argonne National Laboratory), my talk on whether object storage could actually replace parallel file systems for read-intensive HPC workloads (which is, in fact, what is happening in phase 4 of the JASMIN project with our customer Rutherford Appleton Laboratory—more on that at another time), seemed to resonate with much of the audience. And, it spawned some internal debate on whether there could be a reduced need for POSIX front-ends to back-end object stores. This is, of course, a debate that will play out over time and is yet another example of the tension between rewriting applications to take advantage of the latest hardware (or software for that matter) versus running more simulations and analysis with the existing software. Big trade-offs to think about…which was the entire point of the conference.
I’d like to extend my appreciation to the organizers of the Salishan Conference for the opportunity to speak and to learn about the challenges still facing the individuals and teams in this important industry, and invite you to contact us if you have questions about the role of object storage in high-performance computing.
While NAB stands for National Association of Broadcasters, the attendees in Las Vegas at the 95th NAB Show earlier this month represented a far broader audience than traditional broadcasters. From houses of worship, education, government, defense contractors, and the more obvious creators of television, movies, documentaries, music videos and news agencies, attendees came by to discuss use cases that involve rapidly scaling media libraries that need to remain instantly accessible.
Seeing “The Miracle Season” movie this week and a story on the “Bobby Kennedy for President” Documentary Series that will be airing on Netflix this coming weekend, remind me of just how often actual footage and artistic recreations are blended to tell stories and create art. Without the technology to keep yesterday’s and today’s unrelenting explosion of digital video accessible, we would quickly lose a wealth of information and history. “Moving pictures” date back to the 1890s, and the challenge for archivists and organizations around the world is how to safely archive those “moving pictures” so they can protect that history and keep it accessible.
From government and law enforcement agencies to local news stations and major television and motion picture studios, we hear the same pain points: how difficult it is not just to store this type of data, but how trying it can be to find what you need years later. This is just the problem that Caringo set out to solve in 2005, as we pioneered the concept of Content Addressable Storage (CAS). Throughout the past 13 years, many organizations have turned to the experts at Caringo to provide the technology that enables them to meet their specific requirements, as our VP of Product Tony Barbagallo expounded upon last summer in the blog How Object Storage Meets Vertical Market Requirements.
Tony discussed in depth why object storage is a smart alternative to traditional storage systems (such as file-based storage systems, SAN and NAS) as well as the requirements that lead organizations to consider object storage solutions. Most importantly, he explained how Swarm Object Storage rises to meet those challenges. As our list of customers has grown, we’ve not only field-hardened our solution, we’ve expanded our product line and amassed a wealth of best practices that we use to help our customers implement the right solution to protect their data and support their business objectives. In fact, we actually help our customers turn storage into a competitive advantage. To learn more, I invite you to read this article recently published in the MESA Winter Journal.
Have questions? Our experts can help. Contact us and learn how Caringo Solutions can enable your business.
The post Merging Past & Present: NAB Show, M&E and Object Storage appeared first on Caringo.
Does anyone remember the children’s book The Hungry Thing? It’s a simple story about a Hungry Thing coming to town, sitting on his tail and pointing to a sign around his neck that says, “Feed Me.” He asks the townspeople for “Shmancakes,” which any smart preschooler knows rhymes with pancakes and so goes the story. Why do I bring this up? Well, because Swarm can feed data to other “hungry” clusters for collaboration and disaster recovery.
Feeds is the name of Swarm’s object routing mechanism, which simply uses your internet connection to distribute data from a source cluster to a destination cluster. There are two other uses for Feeds in Swarm as well, but we’ll talk about them later.
Swarm Storage protects against various disk failures and other hardware failures that might take out a machine, but it can’t protect against a true disaster like a flood. Feeds enables that protection by making copies of your data elsewhere. What gets replicated is a high fidelity copy of the complete object, metadata and all, so it’s accessible and usable in any cluster that the object resides. Feeds provide a backup and disaster recovery solution for environments with a network connection between the source and target cluster — the internet works quite well for Feeds. In these environments, feeds operate continuously in the background to keep up with source cluster intake. When Swarm recognizes new or updated objects in a domain that has been configured to be replicated, it copies these objects to an internal queue for transport.
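Conceptually, the continuous background behavior described above—watch for writes in replicated domains, queue them, ship them—can be sketched as follows. This is a simplified illustration in Python; the class, domain names, and transport step are hypothetical stand-ins, not Swarm’s actual internals:

```python
from collections import deque

# Hypothetical sketch of a replication feed: objects written to a
# domain configured for replication are queued and shipped, with their
# metadata, to the destination cluster.
REPLICATED_DOMAINS = {"video.example.com", "archive.example.com"}

class ReplicationFeed:
    def __init__(self, replicated_domains):
        self.replicated_domains = set(replicated_domains)
        self.queue = deque()   # internal transport queue
        self.shipped = []      # objects delivered to the target cluster

    def on_write(self, domain, obj_id, data, metadata):
        """Called whenever the source cluster stores or updates an object."""
        if domain in self.replicated_domains:
            # The complete object travels with its metadata, so the copy
            # is accessible and usable in the destination cluster as-is.
            self.queue.append((domain, obj_id, data, metadata))

    def drain(self):
        """Background worker: push queued objects to the target cluster."""
        while self.queue:
            self.shipped.append(self.queue.popleft())

feed = ReplicationFeed(REPLICATED_DOMAINS)
feed.on_write("video.example.com", "obj-1", b"...", {"camera": "lot-b"})
feed.on_write("scratch.example.com", "tmp-1", b"...", {})  # not replicated
feed.drain()
```

The key design point mirrored here is that replication is selective (domain by domain) and asynchronous: writes are never blocked waiting on the remote cluster.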
Replication can be as simple or complex as you require. You can use Feeds to create an offsite DR cluster or even create n-way replication for collaboration and data locality. I will also mention that data is replicated on a domain-by-domain basis, so you can choose what data to replicate and to where. Check out the diagram for a couple of examples:
And you can even monitor your feeds from the Swarm UI:
I mentioned two other uses for Feeds earlier, and here they are:
First, Swarm also uses feeds to speed up searching through objects’ metadata. Metadata Search provides real-time metadata indexing and ad-hoc search capabilities within Swarm by name or metadata. The integrated Elasticsearch service (view this on-demand webinar for more on Elasticsearch) collects the metadata for each object and updates the search database in your Swarm network. When you update an object’s metadata or create a new object, domain, or bucket, the service collects only the metadata and not the actual content. Once metadata is indexed, I can search through all of the metadata in the cluster, both system-defined as well as custom metadata. If my cluster consists of surveillance video for instance, I can create a search to identify all the surveillance videos from the back parking lot of corporate headquarters for the last 24 hours. Watch this short video to learn more about metadata and how it is used in Swarm object storage.
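The surveillance example above translates naturally into an Elasticsearch-style query against the indexed metadata. The sketch below builds such a query in Python; the field names (`x_camera_meta`, `last_modified`) are illustrative assumptions, not Swarm’s actual index schema:

```python
import json

# Hypothetical example: find surveillance clips tagged with the back-lot
# camera and modified in the last 24 hours. The field names here are
# illustrative stand-ins for whatever custom metadata your objects carry.
query = {
    "query": {
        "bool": {
            "must": [
                # custom metadata match: which camera recorded the clip
                {"term": {"x_camera_meta": "back-lot"}},
                # system metadata range: only the last 24 hours
                {"range": {"last_modified": {"gte": "now-24h"}}},
            ]
        }
    }
}
print(json.dumps(query, indent=2))
```

Because only metadata is indexed (never the content itself), queries like this stay fast regardless of how large the video objects are.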
Second, we took the replication capability of Feeds, leveraging our original methodology of sending data between clusters, and extended it to Azure Blob storage. With Feeds, you can now replicate objects on a domain-by-domain basis to native Azure blobs. Once the data is on Azure, you can leverage Azure’s compute, data-protection and long-term archive services. All data that remains on-premises is protected and managed by Caringo Swarm.
In other words, Feeds satisfies any data “hungry” process and is a robust, standard feature of Swarm enabling replication for collaboration, disaster recovery and search indexing. Any version of Swarm over X.X supports Feeds, and it is just one of the many features that come standard. To learn more about some of these unique features, I recommend reading ‘Emergent Behavior: The Smarts of Swarm,’ and if you have any questions, don’t hesitate to contact us.
Today, we announced a technology partnership with Square Box Systems, who recently certified Caringo Swarm with their CatDV media asset management (MAM) suite. I wanted to share a few of my thoughts from the NAB Show floor, where we’ve been meeting many talented and smart IT and Creative Professionals from media and entertainment (M&E) as well as government organizations, educational institutions and houses of worship.
Our technology partnership with Square Box Systems is part of our ongoing mission to integrate with best-of-breed solutions and to simplify workflows and the underlying storage infrastructure. As media libraries continue to scale at an impressive rate, our object storage solution is an ideal way to bring security and accessibility that evolves with content creation and viewer consumption patterns.
“What Happens in Vegas, Stays in Vegas” may be the most famous marketing slogan ever attached to a city, and certainly represents the ambiance of “Sin City.” Since 1991, professionals from the Media & Entertainment (M&E) industry as well as other verticals that produce, store or distribute video have converged at the National Association of Broadcasters Show (NAB Show) in Las Vegas.
Now in its 95th year, the NAB Show is billed as the “ultimate event for the media, entertainment and technology industry, showcasing ground-breaking innovation and powerful solutions for professionals looking to create, manage, deliver and monetize content on any platform.” For those of us in the high-tech world with products that are used in M&E, the NAB Show is the place to be next week. Our Caringo crew is preparing to head to Vegas, and can be found first at the JB&A Pre-NAB Technology Event, where we will preview how organizations can leverage the benefits of hassle-free scale while enabling direct streaming, S3 support and protecting their media libraries from file system exploits that lead to hacks and ransomware events.
Then, we head to the NAB Show expo (booth SL11807) where we will be hosting happy hour on Monday, April 9, and Tuesday, April 10, starting at 1:30 p.m. Come by for a cold brew and a demo, where our engineers can show you Swarm and explain how it provides limitless scale, the ability for just 1 administrator to literally manage hundreds of petabytes, and built-in protection from hacks and ransomware.
Swarm does all this while reducing storage total cost of ownership (TCO) to the point where it is almost sinful (often by up to 75%). This makes Swarm ideal as a target for media movers and asset managers such as Marquis Project Parking, Pixit PixStor/Ngenea, CatDV and ReachEngine. Read Turning Your Storage into a Competitive Advantage, originally published in Broadcast Beat, to learn more.
Next week, what happens in Vegas won’t stay in Vegas. The knowledge gained, the connections and memories made, and swag like the infamous Caringo light-up yo-yos will return with attendees to the far reaches of the globe. And, for a limited time, Caringo is offering a no-charge, full-featured 100 TB Swarm software license for qualified M&E firms (including but not limited to recording studios, content creation and post-production houses, broadcasters, and studios). Stop by our booth to learn more or visit https://www.caringo.com/media-entertainment/. You can also email us at info@Caringo.com.
The post What Happens in Vegas, Stays in Vegas…Except Object Storage appeared first on Caringo.
We are less than two weeks away from the JB&A Pre-NAB Technology Event and the NAB Show, where Caringo will be focusing on workflow integration and optimization, specifically with asset managers, storage managers and media-moving solutions from Pixit, CatDV, Levels Beyond, Marquis Broadcast, FileCatalyst and others. Some of our technology partners have storage as part of their solution; many partner with other object storage vendors as well as with Caringo. With so many storage options on the market, it’s beneficial to take a step back every now and then, put the information in context, and think about how your requirements have evolved and will continue to evolve in the future. So what does any of this have to do with macro-economics? Read on.
Historically, M&E storage has been focused on ‘creating and sharing,’ where SAN and NAS devices are needed to provide high performance and features like file-locking to enable collaboration. The goal is enabling a quick turnaround, handing files off between editors, VFX, colorists, composers, etc. Then, everything is rendered, encoded, transcoded and packaged. You need a lot of horsepower for these tasks, and they are, generally, driven by specific applications. Performance is often measured in IOPS, with capacities ranging from TBs to single-digit PBs. File counts may reach a few million, and features are measured through a file system lens. As file sizes grow, the need for SSDs and faster network technologies like NVMe increases. Because of this, for the foreseeable future, you will continue to need NAS and SAN and (possibly) tape. But if you look down the road 5 years, how will your requirements evolve? It all depends on where the growth is and where the viewers and resulting advertising and subscription dollars are.
We are in the midst of a macro-economic shift that is upending many social structures, economic models and, interestingly enough, traditional M&E workflows. We now live in a mobile economy where you can go direct to the source for any product or service, instantly check prices, and find any song you want. Everyone has a platform. Tech companies are pushing into the content creation space, bringing with them web-scale approaches (like object storage) and architectures developed during the rise of the Internet. These new workflows are all about delivery, analysis and “active” archive (i.e., video on demand, direct-to-consumer streaming, frame analysis, metadata extraction, etc.). Performance is measured in throughput (i.e., how many concurrent files you can deliver), capacities range from TBs to hundreds of PBs, file counts reach into the billions and features are often measured through interface support. In this new mobile economy, nothing is ever deleted, and new methods of accessing and finding content are being developed every day. Content is compounding. Now, with this in mind, take another look down the road 5 years from now. Where are the advertising and subscription dollars going, and how are your requirements going to evolve?
How will your organization keep up with the new mobile economy so that you can focus on developing new methods and approaches for end-users to consume content? And, will you be able to bolt on to existing workflows while enabling new workflows, or is there a way to grow your storage and data management capabilities seamlessly while keeping your data secure and instantly accessible? Of course, this leads me to invite you to come visit with our object storage experts at the JB&A Pre-NAB Technology Event and the NAB Show (booth #SL11807), so you can learn just how Caringo Swarm can provide you with a hassle-free, limitless, secure object storage platform that will evolve with your needs. You can schedule a meeting with us at NAB or email us at email@example.com. And don’t forget to join us for our happy hours at the NAB expo Monday and Tuesday at 1:30 pm.
The post Macro-Economics, M&E Workflows and…Object Storage? appeared first on Caringo.
Back in 2012, Storage Spaces replaced Logical Disk Manager (technology licensed from Veritas) with a Microsoft implementation of virtual volume management into the core of the Windows ecosystem. This introduced advanced storage management technology into the heart of Microsoft Windows Servers. Storage Spaces has been enhanced with each major release of Windows and today includes a limited choice of software RAID options, virtual volumes that can be made up of one or more virtual disks and basic automated tiering of files between different types of physical disk assigned to a storage pool. These functions combined can be utilised to create a cost-effective in-house Windows based storage solution, all on standard server hardware, without purchasing expensive proprietary storage hardware from a vendor such as NetApp or Dell EMC.
However, Storage Spaces does have some limitations in functionality, scale and ease of management. For example, Storage Spaces is generally implemented using physical internal disks or JBODs attached directly to the Windows Server. This physical storage cannot be shared or utilised in raw form between Windows systems. Not allowing physical storage to be shared between Windows Servers dramatically increases storage costs, especially when an organisation’s requirement is greater than a couple of hundred terabytes. Scaling storage using traditional methods such as those implemented by Storage Spaces brings an additional challenge: the more data retained, the more data that must be protected using legacy or cloud backup solutions. And, we all know that can introduce complexity and quickly burn through capital and operating resources.
Wouldn’t it be great if you could have a cost-effective, massively scalable central pool of modern self-healing storage that can be tiered to by all of your Storage Spaces—allowing you to provision just enough high-performance local storage such as SSDs to meet hot data needs?
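The hot/cold split described above comes down to a simple policy decision per file. The sketch below is a conceptual illustration of such an age-based tiering rule, not FileFly’s actual policy engine; the 30-day threshold and file names are hypothetical:

```python
from datetime import datetime, timedelta

# Conceptual sketch of an age-based tiering policy: files untouched for
# N days migrate from fast local storage (SSD) to the object tier.
COLD_AFTER = timedelta(days=30)   # hypothetical threshold

def tier_for(last_access, now):
    """Return which tier a file belongs on, given its last access time."""
    return "object" if now - last_access > COLD_AFTER else "ssd"

now = datetime(2018, 3, 1)
files = {
    "render.mov":  datetime(2018, 2, 27),   # hot: accessed 2 days ago
    "archive.mxf": datetime(2017, 11, 5),   # cold: untouched for months
}
placement = {name: tier_for(ts, now) for name, ts in files.items()}
```

The payoff of a rule like this is that the expensive fast tier only needs to be sized for the working set, while the object tier absorbs everything else.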
These types of storage challenges are something that we are quite familiar with at Caringo and that we work to solve for our customers all over the globe. When we originally launched FileFly in 2015, this was top of mind for us. Since then, FileFly has been awarded 4.9 out of 5 stars by Brien Posey in his TechGenix Review (read the review) and received the Silver Award in the TechTarget’s Storage magazine and SearchStorage.com 2017 Product of the Year Data Storage Management Tools category (as detailed in our blog).
By combining Storage Spaces with award-winning Caringo FileFly, you can now scale from TBs to 100s of PBs at a fraction of the cost. FileFly will also protect all of that data, further decreasing costs by reducing your reliance on backup. On March 27, Caringo VP of Marketing Adrian “AJ” Herrera and I will present a webinar with strategies, a live demo and use cases for re-architecting your storage by combining a secure tier of storage that is easy to expand and manage, and that has continuous data protection built in. Join us during the live webinar and feel free to bring your questions. We will have live Q&A throughout the presentation. Or, watch on demand at your convenience.
The post Tiering from Microsoft Storage Spaces to Object Storage appeared first on Caringo.
Now in my 7th year at Caringo, I’ve observed a surge of interest in object storage. With Caringo as the leading pioneer in the field, I’ve had the opportunity to work with many established experts. Given my focus on the design, implementation and management of object storage solutions, I’d like to share what we’ve learned at Caringo from our many installations.
I’m passionate about helping our customers and regularly see the benefit that our technology brings to them. From providing massive scale and protocol compatibility with a true global namespace to continuous data protection, our hassle-free storage eliminates concerns about losing valuable data or experiencing expensive downtime to restore systems after failure. The “pay-as-you-grow” pricing model also helps avoid a large total cost of acquisition (TCA) while enabling a lower storage TCO (total cost of ownership). In addition, the searchability, data management features and organizational benefits of eliminating storage silos enables compliance adherence and helps organizations mine the long-tail value of their data.
Caringo’s most frequently accessed content on our website is educational in nature, ranging from webinars (like A Deeper Dive into Object Storage and The Power of Metadata) to blogs (such as Back to Basics: What is Object Storage and Object Storage Basics for 2018) and eBooks (our most popular being NAS vs. Object—Which is Best for Your Data Center?). This was the inspiration for our Tech Tuesday webinar series, where we present a new topic monthly that dives into the details of adding object storage into your IT solutions.
In January, we started with How Does Object Storage Fit into Your IT Infrastructure? and in February we continued with Hardware Selection Criteria for Object Storage. Now, we are gearing up for our third webinar in the series: Capacity Planning and Scaling for Object Storage. Join us live on March 27, or watch on demand after the event.
In the meantime, if you have questions about how object storage can be used in a specific use case, contact us. We would be happy to answer your questions and schedule a demo with you.
Recently, we announced performance and interoperability enhancements to Swarm Scale-Out Hybrid Storage and SwarmNFS, further narrowing the gap between existing block-based filesystems and object storage. At Caringo, we continue to bring to market exciting enhancements in response to customer demand and an ever-evolving data landscape that needs a cohesive, scalable and highly intelligent storage strategy. These releases deliver on the commitment we have made to our customers to develop technology innovations that help manage the unbridled growth of unstructured data while meeting both modern and traditional data access requirements.
The Origin of SwarmNFS
When we released Caringo SwarmNFS in 2016, it was the first lightweight file protocol converter to bring the benefits of scale-out object storage—including built-in data protection, high-availability, and powerful metadata—to NFSv4. Our CEO and Co-Founder, Jonathan Ring, spoke with Storage Switzerland’s George Crump about the problems with NFS on object storage, and just how Caringo solved this issue.
Unlike cumbersome file gateways and connectors still in use by a number of object storage solutions, Caringo SwarmNFS is a stateless Linux® process facilitating in-memory protocol conversion. It integrates directly with Caringo Swarm, allowing mount points to be accessed across campus, across the country or across the world, delivering a truly global namespace over NFSv4, S3, HDFS, and SCSP/HTTP, and providing data distribution and data management at scale without the high cost and complexity of legacy solutions.
What’s New in SwarmNFS 2.0?
SwarmNFS 2.0 leverages a powerful new patent-pending feature in Swarm 9.5 that allows a client to send only the changed data of an object. Swarm then combines those changes with the unchanged data from the existing object, resulting in a new object version. This dramatically reduces the bandwidth requirements, the complexity and the time it takes for a client to update an existing object. Prior to this breakthrough, an object storage client such as SwarmNFS would often retrieve and re-upload all unchanged data bytes in an object, even if only a small change was needed. The ability to send just a few bytes of modified data when updating an object, resulting in a new complete object, is not only an object storage industry first, it’s also a game changer, closing the gap between file and object. With this enhancement alone, for combined concurrent client access and file operations, we are achieving improved overall NFS performance of 20x and greater.
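To illustrate why sending only changed bytes matters, here is a minimal conceptual sketch (this is not Swarm’s wire protocol, and it assumes for simplicity that the two versions are the same length): the client diffs the versions block by block, ships only the differing ranges, and the server splices them into the existing object to form the new version.

```python
# Conceptual illustration of a delta update: the client sends only the
# byte ranges that changed; the server combines them with the unchanged
# data from the existing object to produce a new version.

def changed_ranges(old: bytes, new: bytes, block: int = 4):
    """Compare two equal-length versions block by block; return
    (offset, data) pairs for the blocks that differ."""
    deltas = []
    for off in range(0, len(new), block):
        if new[off:off + block] != old[off:off + block]:
            deltas.append((off, new[off:off + block]))
    return deltas

def apply_deltas(old: bytes, deltas):
    """Server side: splice changed ranges into the existing object."""
    buf = bytearray(old)
    for off, data in deltas:
        buf[off:off + len(data)] = data
    return bytes(buf)

old = b"AAAABBBBCCCCDDDD"
new = b"AAAAXXXXCCCCDDDD"          # only one 4-byte block changed
deltas = changed_ranges(old, new)
bytes_sent = sum(len(d) for _, d in deltas)   # 4 bytes instead of 16
```

Even in this toy case the client ships a quarter of the object; for a multi-gigabyte file with a few edited bytes, the savings in bandwidth and round-trip time are what drive the performance gains described above.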
Want to learn more about SwarmNFS? Contact us with your questions or for a custom demo today.
Are you a gambler by nature? What risks are you willing to take? As we prepare for the NAB Show in Las Vegas (April 9–12; booth SL11807), I cannot help but think about how high the stakes are when you select storage technology. No matter the industry, nature of your organization or type of data, not having the right storage solution is risky business. You could jeopardize your content, company earnings and your organization’s reputation if data is not properly secured and cannot be accessed when needed.
Media & Entertainment (M&E) IT professionals from Studios, Production Houses, Broadcasters and Service Providers at the NAB Show will be looking to optimize storage and access at every stage of the digital asset lifecycle—from production to delivery to long-term preservation. We are committed to making certain they can store what they need, ensure media integrity and keep assets online and accessible for reuse and monetization. To help M&E IT professionals understand how Caringo’s object storage technologies work, we will have live demos of our products and technology integrations. For example:
- Integration with leading Asset Managers and Tiering solutions from Pixit Media, Levels Beyond, and Marquis (among others) so you can experience how easy it is to combine Swarm S3-compliant storage as a limitless target for nearline storage & archive with high-performance storage for production purposes.
- Video streaming from the storage layer: The native interface to Swarm is based on HTTP, so you can stream content directly from storage enabling origin storage, video-on-demand and OTT infrastructure use cases.
- Universal Access: Keeping content online and accessible from any application is quickly becoming a requirement. Caringo will be demonstrating mobile and web-browser access to all content directly from the storage layer.
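The video streaming point above comes down to the fact that Swarm’s native interface is plain HTTP, so a player or origin service can pull content with ordinary byte-range requests. The sketch below shows the generic pattern; the URL is an invented placeholder, not a real Swarm endpoint.

```python
# Hedged sketch: streaming an object from HTTP-native storage in chunks
# using standard byte-range requests. Endpoint and object names are
# illustrative assumptions, not actual Swarm paths.
import urllib.request

def byte_range_header(start, length):
    """Build the HTTP Range header for one chunk of a streamed object."""
    return {"Range": f"bytes={start}-{start + length - 1}"}

def stream_object(url, chunk_size=1 << 20):
    """Yield an object's content chunk by chunk via HTTP range requests."""
    start = 0
    while True:
        req = urllib.request.Request(url, headers=byte_range_header(start, chunk_size))
        with urllib.request.urlopen(req) as resp:
            data = resp.read()
        if not data:
            return
        yield data
        if len(data) < chunk_size:
            return  # short read: end of object reached
        start += chunk_size

# e.g.: for chunk in stream_object("http://swarm.example.com/bucket/video.mp4"): ...
```

This is the same request pattern that video-on-demand and OTT origin infrastructure relies on, which is why serving it directly from the storage layer removes a tier from the delivery chain.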
When Caringo was founded in 2005, it was with the goal of changing the economics of storage. Caringo solutions give you many of the benefits of the cloud without actually sending your data to the cloud. You can easily scale to Petabytes of data on S3-compliant storage, and you can keep it in your own data center on the hardware of your choice or opt for a hybrid cloud solution.
Since Caringo Swarm gives you automated protection without RAID; instant access over HTTP, S3 or NFS; the ability to use policies to optimize data center footprint; and integrated search, you can reduce overall storage TCO by up to 75%. And, you can do this while ensuring that you, your colleagues and your viewers can always securely access content. With continuous built-in data protection, your odds of sleeping well at night improve dramatically.
If you are in the Las Vegas area and would like a free pass to the NAB Show Expo, we invite you to use our discount code (LV8799; valid until March 2, 2018).
Join us for happy hour Monday and Tuesday at 1:30 p.m. as we tap a keg of local brew. You can meet with our object storage engineers, executives, sales and marketing teams to learn more about how we help customers solve issues with unbridled data growth.
You can also find us at the JB&A Pre-NAB Show Technology Event April 7–8 at the SLS Las Vegas—a must-stop on everyone’s NAB itinerary for a jump-start on the new technologies you’ll see on the NAB Show floor.
Two weeks ago, I blogged about the challenges of implementing technology to enable General Data Protection Regulation (GDPR) compliance, and last week, I explained how object storage enables compliance. In today’s blog, we will take a look at what should be considered the most important aspect of GDPR, which is of course “data protection,” and how best-of-breed object storage can help secure your data. More specifically, we will examine the features of Caringo Swarm Scale-Out Hybrid Storage that provide Data Protection Officers with the information needed to monitor compliance.
Data Protection Capabilities of Object Storage
“Data protection” is a key stipulation of GDPR, specifically protecting personal data from theft and unauthorised access. To achieve this, it is imperative that data access control is incorporated within the storage system. Here the challenge lies with traditional block storage, as data security is generally controlled by the file system. This makes it possible for anyone with the know-how to bypass the filesystem driver and access the data directly at the storage, thus bypassing any filesystem security. Caringo Swarm Object Storage was designed so that you can police all data access requests and require a valid login and password or security token. When this is implemented correctly, the possibility of unauthorised data access can be eliminated. Even in a worst-case scenario (e.g., a client system is compromised), without valid storage credentials the intruder would remain isolated from all data stored within the Caringo Swarm ecosystem. Caringo Swarm’s inherent data security combined with advanced architecture protects your data from attacks utilising the likes of Meltdown and Spectre as well as ransomware attacks.
Data Encryption Brings Peace of Mind
Data encryption plays an important role in compliance as, over time, physical media such as disks will fail and be replaced. When that happens, the failed media becomes a security risk. In the right hands (or worse, in the wrong hands), data can still be recovered from failed media. In fact, there are a number of companies that specialise in just this. Encryption at rest is included as a standard offering with Caringo Swarm. By enabling encryption at rest, you gain peace of mind knowing that even if your old media makes its way into the wrong hands, any data recovered from the media would be encrypted and useless to third parties.
Monitoring of the Proper Use of Data
Data Protection Officers overseeing the security of data is only the beginning of GDPR compliance. Data Protection Officers must also make certain that those who have approved access to data are really only accessing what they need to do their job at any point in time. It is well publicised that many of the highest profile data breaches have been inside jobs originating from trusted employees.
How can such inside breaches be avoided? The unfortunate answer is that often they cannot be avoided. Therefore, we must monitor what data is being accessed and by whom, looking for unusual data access patterns. Then, the Data Protection Officer must stop such breaches while they are still in progress and, hopefully, before any data is transferred offsite.
Monitoring Data Access Activity
Monitoring unauthorised data access attempts is critical to stopping possible data breaches in their tracks before they have a chance to take hold. For example, if an unusually high number of failed access attempts is occurring, it is likely that data is being improperly accessed, and immediate investigation is warranted so the perpetrator can be stopped. It is for that reason that Swarm Object Storage creates a real-time audit log of every data access attempt, whether or not the attempt is successful. These logs can be ingested into analytics applications for data access monitoring, both real-time and historical. Such analytics give an organisation and its Data Protection Officers a powerful tool to not only monitor GDPR compliance, but also alert the proper personnel in real time when compliance is jeopardised.
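The kind of analysis such an analytics application performs can be sketched very simply. The log format below is invented for illustration (Swarm’s actual audit log format will differ), but the idea of counting failed attempts per client and flagging outliers is the same.

```python
# Hedged sketch: flag clients with suspicious failed-access activity in an
# audit log. Line format "<client> <verb> <object> <status>" is an invented
# illustration, not Swarm's real log format.
from collections import Counter

def flag_suspicious(log_lines, threshold=3):
    """Return the set of clients whose failed-access count (HTTP 401/403)
    meets or exceeds the threshold."""
    failures = Counter()
    for line in log_lines:
        client, _verb, _obj, status = line.split()
        if status in ("401", "403"):
            failures[client] += 1
    return {c for c, n in failures.items() if n >= threshold}

log = [
    "alice GET /bucket/a 200",
    "mallory GET /bucket/a 403",
    "mallory GET /bucket/b 401",
    "mallory GET /bucket/c 403",
]
print(flag_suspicious(log))  # -> {'mallory'}
```

A real deployment would run this sort of query continuously in a log analytics platform, with the threshold tuned per environment, and wire the result to an alert so the Data Protection Officer is notified while a breach is still in progress.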
Register today for our upcoming webinar on February 27, when Alex Oldfield, Solutions Architect, and I will discuss the challenges organisations face, how Caringo Swarm provides a cost-effective solution to meet GDPR requirements, and how Data Protection Officers can use Swarm to both monitor and enable compliance.
In the meantime, if you have any questions, you can always reach us at firstname.lastname@example.org. We will be happy to have one of our object storage experts answer your questions or give you a demo.
When I co-founded Caringo with Paul Carpentier and Mark Goros 13 years ago, the IT landscape was different. In storage, you had large monolithic devices purpose-built for very specific applications. Innovation was measured in cost and speed (IOPS), and the primary interface to these devices was based on standards (NFS and CIFS). Many systems were closed, RESTful interfaces did not exist (we were grateful when Amazon announced S3, so we had company) and IT infrastructure was sold as a package with hardware and software combined. But digital content was growing.
Here is what the world looked like in 2005:
- Caringo was launched with the vision to change the economics of storage
- Raw storage was around $0.70/GB with hard-drive sizes in the 250-400 GB range
- Amazon AWS was not officially launched and Microsoft Azure and Google Cloud were still years away (launched in 2008)
- Facebook was only 1 year old with about 1 million users and had closed its initial round of financing
- Netflix had 4.2M subscribers to its DVD delivery service; streaming was still 2 years away
- Twitter didn’t exist and YouTube had just launched.
What we saw was enterprise storage requirements evolving from specific application needs to content-driven needs. Capacity growth, file growth and over-the-web access requirements were straining file systems. We saw the need to disconnect storage software from hardware so that organizations could take advantage of advancements in standard x86 servers and protect data without RAID. We also saw the need for the native interface of content-focused storage to be based on HTTP, the language of the Web.
Fast forward 13 years to 2018:
- Caringo has enabled hundreds of organizations worldwide to store billions of objects
- A 3TB drive is $75 (about $0.025/GB) and you can get a 12TB hard drive
- AWS generated $17.4B in revenue for Amazon, on track for $20B in 2018
- Facebook has 1.4B daily active users and 2.13B monthly active users
- Netflix has 118M subscribers, is about to overtake Cable Subscribers and has an $8B content budget
- Twitter has 1.3B accounts and 500M tweets are sent out daily, that’s 6K tweets every second
- Every day on YouTube 300 hours of video are uploaded and 5B videos are watched
It’s exciting to see the fire we started in 2005 taking hold; it is changing the whole storage industry for the better. Object storage has become a part of everyday vocabulary, HTTP as a storage protocol is now an accepted practice, large volumes of data can be stored cost effectively, and enterprises can leverage new generations of hardware without expensive migrations. We have achieved much of what we set out to do.
The next phase is for the industry to not only see Software-Defined Object Storage as cheap and deep, but as a new way of managing data and solving problems. There is untapped potential in metadata-driven applications replacing databases and leveraging big data and search engine technology to build applications more rapidly and cost effectively. Using Swarm pure Object Storage, data can now be self-describing, portable, and live forever separate from the hardware on which it resides. This has wide-ranging ramifications for the deluge of data coming from IoT and the ability to leverage existing data sets across the enterprise. As companies wake up to the new possibilities and let go of the old model of managing metadata in databases while storage remains dumb file systems, they will reap the benefits of object storage as a data management platform and create competitive advantages for themselves while reducing costs.
In last week’s blog, I looked at the challenges GDPR presents. In this week’s blog, we will take a look at how Caringo Scale-Out Hybrid Storage provides a simple and cost-effective solution that can enable GDPR compliance.
The first challenge: how do you find all the relevant data when you receive a “Right to access,” a “Data Portability” or a “Right to be forgotten” request from an EU resident? Expecting that anyone can manually log in to every application and storage system used by your organization and locate the right data is unrealistic. Not only would this be time-consuming, but it opens the door to the risk of data being missed or data being included that is not within scope.
It would be far more practical to run a single search request against the storage system that returns all relevant data within scope. Of course, depending on the type of request, the scope may change. For example, a “Right to access” request may need to include data that cannot be removed during a “Right to be forgotten” request, and of course a “Right to be forgotten” request cannot include any data that must be retained by law.
Metadata Radically Simplifies Search
This is where the power of metadata in Caringo Swarm comes into play. Unlike traditional storage and other object storage solutions, Caringo Swarm object storage allows metadata to be directly attached to the object (data), not stored in a separate database. Swarm also allows this metadata to be modified without having to rewrite the entire file. Most importantly, all of this metadata is indexed and becomes searchable. Watch this on-demand webinar to learn more.
For example, for all data related to myself, I can attach a piece of metadata such as ‘person-name: Glen Olsen’ and ‘person-DOB: YYYY/MM/DD.’ Now, I can run a query directly against Caringo Swarm requesting a list of all data with matching metadata attached. As long as this metadata has been attached directly to the object, it will be listed regardless of which application or system wrote the object. Additional metadata I can attach might include the application that owns and wrote the data, why we retained that data in the first place, and if that data is eligible for “Right to be forgotten” removal. I now have all the information attached as metadata and searchable to create a list using a simple query against the Swarm storage system, without the need to manually log into any application. Yes, with Caringo Swarm, it can be as simple as that.
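The matching logic behind such a query can be sketched in a few lines. Real Swarm queries run over HTTP against its integrated search index; the in-memory model below only illustrates the idea, and the field names (person-name, app-owner, rtbf-eligible) are the illustrative ones used above.

```python
# Hedged sketch: metadata attached directly to objects makes a GDPR search
# a single query. This models the matching in-memory; Swarm's actual search
# API is HTTP-based and the field names here are illustrative.
objects = [
    {"id": "obj-1", "meta": {"person-name": "Glen Olsen", "person-DOB": "YYYY/MM/DD",
                             "app-owner": "crm", "rtbf-eligible": "true"}},
    {"id": "obj-2", "meta": {"person-name": "Someone Else", "app-owner": "billing"}},
]

def search(objects, **criteria):
    """List IDs of objects whose metadata matches every given key=value pair.
    Keyword underscores map to the hyphenated metadata field names."""
    norm = {k.replace("_", "-"): v for k, v in criteria.items()}
    return [o["id"] for o in objects
            if all(o["meta"].get(k) == v for k, v in norm.items())]

print(search(objects, person_name="Glen Olsen"))  # -> ['obj-1']
print(search(objects, person_name="Glen Olsen", rtbf_eligible="true"))  # -> ['obj-1']
```

Because the metadata travels with the object itself, the same query works no matter which application wrote the data, which is exactly what makes the single-request compliance search possible.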
Replication Automates Data Removal
As we discussed last week, the “Right to be forgotten” can present a whole new set of challenges when this comes to Dev, Test and QA, since often a copy of production data is taken for these environments. How can we ensure that any data removal related to a “Right to be forgotten” request is populated to any copies created of the data? Caringo again offers a simple solution to this problem. Instead of Dev taking a copy of production data, Dev can instead configure replication of data from production so that if the data is removed from production it will also automatically be removed from any replication destination. Moreover, each time a department needs to refresh their copy of data, they can replicate the data from production, ensuring that no data that was removed during a “Right to be forgotten” request reappears in any other environment.
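The difference between copying and replicating can be reduced to a toy model. This is not Swarm’s replication API, just an illustration of why re-deriving Dev’s view from production propagates deletions automatically.

```python
# Toy model of refresh-by-replication: Dev never keeps an independent copy;
# each refresh re-derives its view from production, so a "right to be
# forgotten" deletion in production disappears from Dev on the next refresh.
def replicate(source: dict) -> dict:
    """Refresh a downstream environment as a copy of the current source."""
    return dict(source)

production = {"obj-1": b"personal data", "obj-2": b"other data"}
dev = replicate(production)        # initial replication to Dev

del production["obj-1"]            # "right to be forgotten" removal

dev = replicate(production)        # next scheduled refresh
print("obj-1" in dev)              # -> False: the removal propagated
```

The design point is that the authoritative copy lives in one place; every other environment is a disposable projection of it, so compliance actions only ever need to be applied once.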
Continuous Protection without Backups Significantly Reduces Discovery Time
Traditional backup will quickly become a thorn in the side of any organisation trying to maintain GDPR compliance, as any “Right to be forgotten” request will require data to be removed not only from Production, Dev, Test/QA and similar systems, but also every copy of that data held in backup. This includes physical media such as tape or other archive. The only easy answer here? Stop using traditional backup systems and count on the continuous, built-in data protection features of Caringo Swarm. That way, your Chief Security & Risk Officer can breathe easy, knowing that your data is protected and that you have the tools in place to enable compliance with GDPR.
Next week, in the last of this GDPR blog series, we will look at “data protection” and, more specifically, how Caringo Scale-Out Hybrid Storage helps GDPR data protection officers maintain and monitor compliance.
Don’t forget to register for our upcoming webinar on February 27, when my colleague Alex Oldfield, Solutions Architect, and I will discuss the challenges organisations face, how Caringo Swarm provides a cost-effective solution to meet GDPR requirements, and how Data Protection Officers can use Swarm to monitor and ensure compliance.
Register now to watch live or on demand.
February 27, 10:30 a.m. GMT
GDPR’s Dirty Little Secret & How Object Storage Enables Compliance
In the meantime, if you have any questions, feel free to reach out to us at email@example.com, and we will be happy to have one of our object storage experts get back to you.
While Olympic athletes from around the world are battling it out for Gold, Silver and Bronze medals in PyeongChang, South Korea at the XXIII Olympic Winter Games, we’ve been waiting to see who wins in TechTarget’s Storage magazine and SearchStorage 2017 Storage Products of the Year Data Management category. For 16 years, Storage magazine and SearchStorage have held the annual Products of the Year awards to recognize the best data storage products the tech industry has to offer.
Yesterday, the winners were announced and we are pleased that FileFly 2.0 Secondary Storage Platform received the Silver Award in the Data Management Tools category. Rodney Brown, TechTarget Senior Site Editor reported that one judge called the Caringo FileFly 2.0 appliance an “excellent solution and innovative approach to increasing adoption of object storage” and that another judge said Caringo FileFly Secondary Storage Platform is a “good Swiss army knife solution for integrating file and object in a unified, policy-managed repository.”
FileFly 2.0 can be used by organizations whose Windows file servers and NetApp filers are filling up too quickly, enabling them to store nearly unlimited data and improve performance while avoiding expensive additions or upgrades, all without modifying applications or end-user behavior. Glen Olsen, Caringo Product Manager, explains these powerful tiering and data protection enhancements in his on-demand webinar Introducing FileFly 2.0.
As a pioneer in object storage technology and with a number of nominations under our belt (Swarm in 2012 and 2014, the original FileFly for Swarm in 2015 and SwarmNFS in 2016), Caringo might be considered a veteran. For their Product of the Year awards, SearchStorage levels the playing field for products that don’t necessarily have broad name recognition or long track records by only accepting products released or significantly upgraded during the previous year. And, with more than 100 entries, the judges narrowed the field to 47 finalists, with only 14 of those winning gold, silver or bronze awards. This makes the recognition particularly meaningful to us.
Of course, as Adrian Herrera noted in his blog on our nomination, the best prize for us is being able to help so many customers who are struggling to find effective ways to store massive amounts of unstructured data. If you are evaluating storage solutions and have questions about object storage solutions, email us at firstname.lastname@example.org. We will have one of our experts work with you to determine if our solutions are right for your organization.
We will be exhibiting at the JB&A pre-NAB Technology event April 8–9, 2018 as well as in booth number SL11807 of the NAB Show Expo in Las Vegas, NV April 9–12, 2018. We will be demonstrating our award-winning FileFly Secondary Storage Platform and Caringo Drive as well as the new capabilities of Swarm 9.5 and SwarmNFS 2.0.
Snag a Free Expo Pass for NAB before March 2 using our discount code LV8799.
The post Silver Award for Solving Windows Server & NetApp Filer Problems appeared first on Caringo.
If you reside in Europe, you likely already know about the EU General Data Protection Regulation (GDPR), yet those outside of Europe may only now be gaining awareness of it. Moreover, many are coming to the realisation that any organisation physically located outside of Europe that has any dealings with European residents will be required to comply.
This is the first important aspect to understand: every organisation collecting data (a Data Collector) and every third party processing data (a Data Processor) pertaining to EU residents is obligated to comply, no matter where that organisation is physically located.
If your organisation is not based within the EU, you may be asking yourself, “why would I need to comply?” The answer here is financial, with hefty fines of up to 4% of global annual revenue or €20 million, whichever is greater. Any organisation collecting personal information of European residents, or acting as a third party by processing it, is within scope. You will be classified as a collector of personal data simply by allowing an EU resident to sign up to your mailing list.
So what is GDPR, and what does it mean to how an organisation stores and manages data?
To begin, an organisation must, on request, quickly identify all data it controls belonging to an EU resident. This is known as an individual’s “right to access.” On request, an organisation must be able to provide an electronic copy of the data held, including where it is stored, plus the reason for retaining that data. An individual must also be allowed to submit corrections to any data held. This does not mean the individual requires direct access to the data, only that there must be a means for individuals to submit corrections.
Individuals now also have the right to transmit their data from one controller to another. Upon receiving such a request, an organisation must provide a copy of all the individual’s data held in a “commonly used and machine readable” format. This requirement is known as “data portability.”
An important part of GDPR is an EU resident’s “right to be forgotten.” Any EU resident can request that all personal data retained by an organisation be removed. This includes any data shared with third parties such as data processors. This obligation applies unless an organisation has a valid reason not to comply; for example, a debt collection agency could not be expected to destroy all personal data related to a debt it is actively pursuing against an individual. Of course, not all personal data is eligible for deletion, such as when removal of the requested data would breach laws on compulsory data retention.
“Data Protection” is also a key stipulation of GDPR, specifically protecting personal data from being stolen and from unauthorised access. On becoming aware of any data breach that is likely to jeopardise personal data, the data controller must report the breach within 72 hours to the relevant regulator, as well as to every likely affected individual within an acceptable time frame.
Organisations may also be required to employ a “data protection officer” whose role is not only to inform and advise of obligations under GDPR, but also to monitor compliance with GDPR.
An organisation relying on traditional systems, storage, processes and workflows is going to find GDPR compliance both expensive and challenging. It will require a review of all aspects of data storage and protection. For example, legacy data backups could make compliance with the “right to be forgotten” requirement extremely difficult. To comply with a data removal request, not only will the active data need to be removed, but also every instance of that data held on backup media or archive. Similarly, the loss or misplacement of a backup tape is likely to trigger an expensive “data protection” exercise.
The “right to be forgotten” and “data protection” requirements will also force many organisations to change the way they develop and test their applications. Testing using a copy of production data containing personal data could become unmanageable. Not only would such a data-sharing practice put at risk retained personal data, but, in the case of a right to be forgotten request, the organisation will be required to also ensure that data is removed from all development, test and QA environments, and most importantly, not inadvertently restored during test or QA cycles.
Yet, the biggest challenge many organisations will face is simply how to locate, in a timely and cost-effective manner, all personal information pertaining to an individual, along with why that data is retained in the first place and where it is stored, while keeping track of which data is in scope for removal and which data must be retained.
“Privacy by design” requires that these capabilities, along with compliance to all other GDPR requirements, be built into products and processes from day one.
While this may seem overwhelming, it is important to know that the technology already exists to enable you to comply with GDPR. In next week’s blog, I will look at how Caringo Scale-Out Hybrid Storage provides a simple and cost-effective solution. And, on February 27, my colleague Alex Oldfield, Solutions Architect, and I will present a webinar on the challenges organisations face, how Caringo Swarm provides a cost-effective solution to meet GDPR requirements, and how Data Protection Officers can use Swarm to monitor and ensure compliance.
Register now to watch live or on demand.
February 27, 10:30 a.m. GMT
GDPR’s Dirty Little Secret & How Object Storage Ensures Compliance
2017 was a turning point for object storage. Ever since content addressable or key-value based storage was created, storage analysts and IT execs wondered if it was a technology in search of a problem to solve. Was it really just a “better mousetrap?”
As we enter 2018, there are a number of trends in storage, access, analysis and regulation that will drive the adoption of object storage, proving that it is indeed a better mousetrap and much more: the foundation for any content-driven need. For the Zettabytes of data on the horizon, it’s not just about the “application” and its needs anymore (speeds, feeds and file locking). It’s about the end user (human or machine) and convenient, instant access to content for analysis, reuse or privacy protection. Here are my predictions for 2018.
Prediction 1: The European Union (EU) General Data Protection Regulation (GDPR) Will Take IT Execs by Surprise and More Major Breaches Will Occur
In today’s digital world, it should come as no surprise that ransomware and major security breaches will continue to occur, and lawmakers around the world will continue to pass legislation like GDPR in response. In 2018, GDPR compliance and the need to respond to data breaches will take IT execs by surprise, offering an opportunity for those in the data management and storage space to provide technologies that can quickly identify personal and sensitive data in storage.
Prediction 2: The Metadata Deluge Driven by Artificial Intelligence (AI), Machine Learning (ML) and IoT Will Accelerate
As AI, ML and IoT applications continue to go mainstream, data about interaction, analysis and consumer behavior will increase exponentially. Storage technologies that integrate NoSQL search will help by enabling organizations to capture everything they want without defining a schema up front, keeping data ready for future, yet-to-be-defined uses.
Prediction 3: Major Brands Will Continue to Move Off of AWS as Amazon Continues to Dominate Many Industries
While S3 is seen by many as the de facto API for internet-accessible storage, more popular brands in the brick-and-mortar, M&E and web application spaces will move off of AWS. The benefit that AWS brought by streamlining operations and reducing time to market will be overshadowed by the threat other Amazon business units represent to many organizations’ primary business. This will not impact AWS’s overall growth, but I predict some highly visible migrations to Azure and a move by some to bring their most strategic operations and near-line archives in-house.
We are looking forward to a great year! To stay up to date on the trends throughout the year you can follow us on Twitter or LinkedIn. We are also increasing the frequency of informational webinars and will be covering topics related to these predictions. Subscribe to our BrightTalk channel to be notified of upcoming live broadcasts where you get a chance to ask our experts your questions and, as always, if there is anything we can assist you with, don’t hesitate to contact us.