Tag Archives: Azure Storage

#Azure: Step by step Azcopy

When an organization of any size looks at the cloud, data migration becomes the focal point of every discussion, and several data transfer options are available to help you achieve your goal. Among the command-line options, AzCopy is the best tool for migrating a reasonable amount of data. You may prefer it if you have hundreds of GB of data to migrate and sufficient bandwidth. You can use it to copy or move data between a file system and a storage account, or between storage accounts. It can be deployed on both Windows and Linux systems: it is built on the .NET Framework for Windows and on .NET Core for Linux, and it uses Windows-style command-line options on Windows and POSIX-style options on Linux.

Let me explain how to do it step by step on a Windows system.

First, download the latest version of Azcopy tool for Windows.

Once downloaded run the .msi file. Click on Next to continue installation.

Accept the license agreement and click on Next.

Define the destination folder and click on Next to continue.

Click on Install to begin the installation.

Once the installation has completed successfully, click on Finish to exit the installation wizard.

Open the “Microsoft Azure Storage command line” tool from your programs list.

Now, look at the location and type of your source and destination. If I am copying data from an internal file system to cloud blob storage, then the local file system is my source and a blob container in the cloud storage account is my destination.

Note down the location of the source data.

Copy the URL of your blob container.

Copy the access key. You can find “Access keys” under Settings in the storage account.

Run the AzCopy command using the following syntax:

    AzCopy /Source:<source path> /Dest:<destination path> /DestKey:<Access key of destination blob> /S
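If you want to script this step, the command line can be assembled programmatically. The following is a minimal Python sketch, assuming the same four options shown in the syntax above; the function name `build_azcopy_args` and the sample values are hypothetical and only illustrate how the pieces fit together.

```python
from typing import List


def build_azcopy_args(source: str, dest: str, dest_key: str,
                      recursive: bool = True) -> List[str]:
    """Return the argument list for the AzCopy syntax used in this post."""
    args = [
        "AzCopy",
        f"/Source:{source}",
        f"/Dest:{dest}",
        f"/DestKey:{dest_key}",
    ]
    if recursive:
        args.append("/S")  # copy the contents of the source folder recursively
    return args


if __name__ == "__main__":
    # Hypothetical placeholder values; replace them with your own.
    example = build_azcopy_args(
        r"C:\Data\BlogPosts",
        "https://<storage-account>.blob.core.windows.net/<container>",
        "<access-key>",
    )
    print(" ".join(example))
```

Keeping the arguments in a list rather than one long string makes it easier to hand them to `subprocess.run` later without worrying about quoting paths that contain spaces.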

You can monitor the copy activity.

If any error occurs during copy operations, you can monitor that as well.

Note: In the example below, to simulate an error scenario, I tried to copy all my blog posts, including this one, while I was still working on it. Therefore, you can see the corresponding error description for the in-use file.

Another error was for a .tmp file. We can ignore this .tmp file error.

Now, let me explain how to use the retry option. Run the same command again and, at the “Incomplete operation with same command line…” prompt, enter Y to retry the operation for the failed data. As you can observe, the failed operation for the in-use file has now completed successfully. Again, we can ignore the .tmp file error.

Once you have copied all the data, go to the blob container and verify the same.

If you have a high-bandwidth internet connection or ExpressRoute, you can move larger amounts of data with AzCopy as well, but it is most relevant for data sets in the range of hundreds of GB, as mentioned at the beginning of this post.
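Whether a data set is a good fit for AzCopy depends largely on how long the copy would take over your link. Here is a rough Python sketch of that arithmetic; the function name and the default `efficiency` factor (the share of the link the copy can really use) are assumptions for illustration, not measured values.

```python
def estimate_transfer_hours(size_gb: float, bandwidth_mbps: float,
                            efficiency: float = 0.8) -> float:
    """Estimate the hours needed to copy size_gb over a bandwidth_mbps link.

    efficiency is an assumed fraction of the link that the copy can use.
    """
    if size_gb < 0 or bandwidth_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("size must be >= 0, bandwidth > 0, efficiency in (0, 1]")
    size_megabits = size_gb * 1000 * 8  # 1 GB = 1000 MB, 1 byte = 8 bits
    seconds = size_megabits / (bandwidth_mbps * efficiency)
    return seconds / 3600


if __name__ == "__main__":
    for size in (100, 500, 5000):
        hours = estimate_transfer_hours(size, bandwidth_mbps=100)
        print(f"{size} GB over 100 Mbps: about {hours:.1f} hours")
```

Real throughput also depends on file sizes, latency and throttling, so treat the result as a planning estimate only.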


#Azure : Map your traditional datacenter storage with cloud storage

Storage plays a critical role in technology; without storage, nothing else works. Because of this criticality, storage providers have made enormous amounts of money in the past by selling disks with intelligence built into proprietary storage controllers that control the behavior and performance of the storage. Today, that intelligence has moved into software, which is why this kind of storage is known as software-defined storage (SDS). SDS has given the industry a chance to store very large amounts of data without the hurdle of tangled cabling between the compute and the storage. In short, you can now use the local disks attached to your compute nodes to create your own software-defined storage with products such as VMware vSAN, Windows S2D, ScaleIO etc.

Let me provide you a high-level overview of traditional datacenter storage:

Direct Attached Storage: As the name suggests, direct attached storage (DAS) is a pool of disks that is connected directly to a server using a SAS cable. You can attach multiple DAS units to one server in a chain; the number of units in one chain depends on the DAS specification. Example: It is a very good option for applications such as Microsoft Exchange that need to provide large mailboxes.

Storage Area Network: SAN is a block-level data storage technology that is commonly used across enterprises. It is also known as intelligent storage and supports high availability, scalability, resiliency and so on. In general, a SAN has two storage controllers in HA mode and can be connected using one of two methods, depending on the storage type: Fibre Channel or iSCSI. Fibre Channel storage uses Fibre Channel switches and FC cables to connect the storage arrays to each other and to the compute. Nowadays, a few Fibre Channel SANs also support FCoE (Fibre Channel over Ethernet). The other option is iSCSI, which uses network switches and SFP+ cables to connect the storage arrays to each other and to the compute. Many hybrid SANs that support both technologies are available in the market.

Network Attached Storage: NAS is file-level data storage that is used over the network. Many enterprises use this technology for file shares and NFS. This type of storage connects directly to your local area network through an Ethernet switch.

Software Defined Storage: Software-defined storage is a more recent technology that is used in traditional datacenters as well as in the cloud. It generally uses the local internal storage of the servers and relies on storage logic defined by software services such as vSAN from VMware, Storage Spaces Direct from Microsoft and ScaleIO from Dell EMC. It uses a pool of servers and their local storage to create a single storage pool that can be used across the compute nodes. It also provides traditional SAN capabilities such as tiering.

When it comes to the cloud, and specifically Microsoft Azure storage services, the available services fall into four categories: blob, file, queue and table storage. In most cases, you use blob storage. To know more, read the following posts:

Storage Accounts

Storage services

Storage replication

Step-by-step Microsoft Azure Storage

Storage Explorer

Now, let me explain how to select storage in the cloud.

Broadly, data in the cloud is divided into two categories: structured data and unstructured data. Storage is divided into two performance tiers: Standard and Premium. Standard storage is common across the entire storage service landscape, while Premium is available only for blob storage and can be used only to create OS and data disks. Use the following criteria to select the right storage service option for your specific solution.

OS & data disk: These use blob storage, and the performance tier can be either standard or premium. Standard is backed by SAS HDDs and premium is backed by high-performance SSDs. Many high-performance OS disks use SSD by default. The OS disk is used to install the operating system, while data disks can be used to install application binaries and to store application data.

Application storage requirement: Look at the application architecture and IOPS requirement. In a brownfield scenario, look at your existing application environment and storage configuration, and use the application definition documentation to estimate the storage size, performance tier and type of storage service. Once these are defined, choose the storage service based on your application requirements.

Database storage requirement: First select between PaaS and IaaS based on your application needs, and then define the storage tier and storage service based on your database type and application needs.

The criteria above are a very high-level decision-making technique. In one line, I would say: know your application and data requirements in terms of size (first), availability/replication (second) and performance (third) to make the right selection.

#Azure : Storage Explorer

When you move from a traditional datacenter to the cloud, you already know the limitations very well. In the cloud, you don’t worry much about managing the underlying infrastructure. In the context of virtual machines, you don’t deal with the hypervisor at all. But once you look at storage, you feel that a management tool is required to manage it. Microsoft offers a few options in this area, and you may also find multiple offerings from Microsoft partners. Microsoft natively provides the following options for storage management.

  • Microsoft Azure Portal
  • Microsoft Visual Studio Server Explorer
  • Microsoft Azure Storage Explorer

Microsoft Azure Storage Explorer comes with full-fledged functionality and supports all major operating systems: Windows, Linux and macOS. It is a thick-client application that you need to download and install on your system.

To download this tool, go to the “Azure Storage Explorer” page and download Storage Explorer for the operating system you need.

Once the download completes, install the tool on your system.

To install, run StorageExplorer.exe on your system. Accept the agreement and click on Next.

Select the installation directory and click on Next.

Go with the default setting, which creates a shortcut in the Start Menu, and click on Next.

It will take a few seconds to complete the installation.

Once done, click on Next to open the Storage Explorer wizard.

The first time, it will take a few seconds to load the wizard.

Once loaded successfully, you will get an option to connect your storage account or service. Click on “Sign in” to continue.

In my case, I would like to manage the entire storage portfolio of my Azure subscription. But you can also connect to specific storage accounts, or give others access to specific storage accounts, using different methods.

Log in using your Microsoft Azure subscription account.

Go to Microsoft Azure Storage Explorer, select the “Manage Accounts” option, choose either “All subscriptions” or a specific subscription, and finally click on Apply.

Once connected, you will be able to see all your storage accounts.

Now, you can play with it. You can use Storage Explorer for the following activities.

Blob storage

  • View, delete, and copy blobs and folders
  • Upload and download blobs while maintaining data integrity
  • Manage snapshots for blobs

Queue storage

  • Peek most recent 32 messages
  • View, add, and dequeue messages
  • Clear queue

Table storage

  • Query entities with OData or query builder
  • Add, edit, and delete entities
  • Import and export tables and query results

File storage

  • Navigate files through directories
  • Upload, download, delete, and copy files and directories
  • View and edit file properties

Azure Cosmos DB storage

  • Create, manage and delete databases and collections
  • Generate, edit, delete and filter documents
  • Manage stored procedures, triggers and user-defined functions

Azure Data Lake storage

  • Navigate ADLS resources across multiple ADL accounts
  • Upload, download files and folders
  • Copy folders or files to the clipboard
  • Delete files and folders

The activities above are just the overview that you can see on the Microsoft Azure Storage Explorer web page, but you can do many more things with Storage Explorer. Therefore, explore this explorer as much as possible.

#Azure : Step-by-step Microsoft Azure Storage

Microsoft Azure Storage offers a variety of services that can fulfill most business needs. The configuration and selection of services differ based on the storage requirements. I have covered most of the services and configuration options in the following posts:

Storage Accounts

Storage services

Storage replication

In this blog post, I am going to cover the step-by-step process for creating a storage account, configuring the required storage services and selecting the right storage replication methodology to fulfill your business needs.

Log in to the Azure Portal. Create a new resource group or use an existing one based on your requirements.

Select “+ Create a resource” to create a storage account.

Select “Storage” from the Azure Marketplace and then select “Storage account – blob, file, table, queue”.

In the “Create storage account” panel, enter a unique name for the storage account. The name must contain only lowercase letters and numbers, without any spaces or special characters.
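The naming rule can be checked before you open the portal. The sketch below is a minimal Python check; it assumes Azure's documented rule that a storage account name is 3 to 24 characters long and uses only lowercase letters and numbers (the length limits are not stated in this post), and the function name is hypothetical. It cannot tell you whether the name is still globally unique; only Azure can confirm that.

```python
import re

# Assumption: 3 to 24 characters, lowercase letters and digits only.
_NAME_PATTERN = re.compile(r"[a-z0-9]{3,24}")


def is_valid_storage_account_name(name: str) -> bool:
    """Return True if the name satisfies the local format rules."""
    return _NAME_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("mystorage01", "MyStorage", "my-storage", "ab"):
        print(candidate, is_valid_storage_account_name(candidate))
```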

By default, the deployment model is set to “Resource manager”. If you choose “Classic” as the deployment model, you will not be able to select the kind of storage account, and you will miss many of the new features that come with a general purpose v2 storage account.

When you go with the default deployment model, “Resource Manager”, you can select one of the following account kinds:

Storage (general purpose v1)

StorageV2 (general purpose v2)

Blob storage

Select the performance tier based on your requirements.

The standard performance tier provides four replication methodologies. Select the most suitable replication option based on your data availability requirements.

If you select the premium performance tier, you have only one replication option, LRS, and the default access tier will be Hot. In premium storage, there is no option for the cool access tier.

The availability of the four replication methodologies depends on the location. In a few locations, you don’t have the option to go with ZRS.

In the “Resource Manager” deployment model, you have an option to select a virtual network. In general, we don’t enable this option (it is disabled by default), but if you have a specific data confidentiality requirement, you can define the virtual network / subnet so that only the services running in the specified subnet can use this storage account.

Once done with the configuration, click on Create to create a storage account with the specified storage services. I hope this gives you a clearer picture of storage configuration in Microsoft Azure. Please feel free to share your experience by leaving a comment in the comment box.

#Azure : Storage replication

Microsoft Azure Storage offers several levels of data availability and durability: within a datacenter, across datacenters within the same region, or across regions. Based on your needs, you can select the right replication methodology. For example, if you would like to protect your data from a catastrophic failure in a single region, choose a replication option that replicates across regions.

Please keep in mind that you configure the replication option when you create a storage account, and that not every region supports all replication options. Microsoft Azure offers four different replication options.

LRS: Locally redundant storage

ZRS: Zone-redundant storage

GRS: Geo-redundant storage

RA-GRS: Read-access geo-redundant storage

The table below provides a quick overview of the differences between the four replication options.

| Replication strategy | LRS | ZRS | GRS | RA-GRS |
|---|---|---|---|---|
| Data is replicated across multiple datacenters. | No | Yes | Yes | Yes |
| Data can be read from a secondary location as well as the primary location. | No | No | No | Yes |
| Designed durability of objects over a given year. | at least 99.999999999% (11 9’s) | at least 99.9999999999% (12 9’s) | at least 99.99999999999999% (16 9’s) | at least 99.99999999999999% (16 9’s) |

Courtesy: Microsoft
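To get a feel for what the durability figures in the table above mean, you can turn the number of nines into an expected number of lost objects per year. The Python sketch below does that arithmetic with exact fractions; it assumes the simplified reading that "N nines" of annual durability corresponds to a loss probability of 10^-N per object per year, and the function name is hypothetical.

```python
from fractions import Fraction

# Number of nines per replication option, as listed in the table.
DURABILITY_NINES = {"LRS": 11, "ZRS": 12, "GRS": 16, "RA-GRS": 16}


def expected_annual_loss(object_count: int, option: str) -> Fraction:
    """Expected number of objects lost per year under the simplified model."""
    nines = DURABILITY_NINES[option]
    loss_probability = Fraction(1, 10 ** nines)  # e.g. 11 nines -> 10^-11
    return object_count * loss_probability


if __name__ == "__main__":
    for option in DURABILITY_NINES:
        loss = expected_annual_loss(1_000_000, option)
        print(f"{option}: {float(loss):.1e} objects per year out of one million")
```

Exact fractions are used because subtracting a value like 0.99999999999 from 1 in floating point loses precision at this scale.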

Let me explain all four replication options in detail.

Locally redundant storage: LRS maintains three copies of your data. It replicates your data within a scale unit, which is hosted in a datacenter in the region in which you create your storage account. A scale unit is a set of multiple racks that host storage nodes. To maintain high availability, these replicas reside in separate fault domains and update domains. A fault domain is a group of nodes that share a single point of failure, while an update domain is a group of nodes that can be upgraded at the same time. LRS is a cost-effective solution, but it doesn’t safeguard your data from a datacenter-level failure.

Zone-redundant storage: ZRS maintains three copies of your data. ZRS is a little confusing at the moment because it exists in two versions. The new ZRS, currently in preview, is available with general purpose v2 storage accounts; it replicates data synchronously across multiple availability zones within a region and is very useful for highly available applications. The older ZRS capability is now referred to as ZRS classic and is available with general purpose v1 storage accounts. ZRS classic replicates data asynchronously three times across two to three facilities within the same region or, in some cases, across two regions. ZRS classic is planned to be deprecated by March 2021, and once the new ZRS is generally available in a region, ZRS classic accounts can no longer be created there.

Geo-redundant storage: GRS maintains six copies of your data: three copies in one region and another three copies in a second region. In the primary region, it replicates your data within a scale unit in the same way as LRS, with the replicas residing in separate fault domains and update domains. It does the same thing in the secondary region, but replication between the regions takes place asynchronously. If the primary region fails, your data is not available until Microsoft initiates the failover. The pairing of primary and secondary regions is predefined based on location and can’t be changed manually. When you create a storage account, you just need to specify your primary Azure region. Here is the list of primary regions and their respective secondary regions.

| Primary | Secondary |
|---|---|
| North Central US | South Central US |
| South Central US | North Central US |
| East US | West US |
| West US | East US |
| US East 2 | Central US |
| Central US | US East 2 |
| North Europe | West Europe |
| West Europe | North Europe |
| South East Asia | East Asia |
| East Asia | South East Asia |
| East China | North China |
| North China | East China |
| Japan East | Japan West |
| Japan West | Japan East |
| Brazil South | South Central US |
| Australia East | Australia Southeast |
| Australia Southeast | Australia East |
| India South | India Central |
| India Central | India South |
| India West | India South |
| US Gov Iowa | US Gov Virginia |
| US Gov Virginia | US Gov Texas |
| US Gov Texas | US Gov Arizona |
| US Gov Arizona | US Gov Texas |
| Canada Central | Canada East |
| Canada East | Canada Central |
| UK West | UK South |
| UK South | UK West |
| Germany Central | Germany Northeast |
| Germany Northeast | Germany Central |
| West US 2 | West Central US |
| West Central US | West US 2 |

Courtesy: Microsoft

Read-access geo-redundant storage: RA-GRS maintains six copies of your data: three copies in one region and another three copies in a second region. In the primary region, it replicates your data within a scale unit, with the replicas residing in separate fault domains and update domains, just like LRS. It does the same thing in the secondary region, but replication between the regions takes place asynchronously. With RA-GRS, your data is always available for reading from the secondary region, even if the primary region fails. However, you can’t get write access to your data in the secondary region until Microsoft initiates the failover. As with GRS, the pairing of primary and secondary regions is predefined based on location and can’t be changed manually. When you create a storage account, you just need to specify your primary Azure region.
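The choice between the four options described in this post can be summarized as a short decision rule. The following Python sketch is one possible way to express it, assuming you can answer three yes/no questions about your data; the function and parameter names are hypothetical, and the rule ignores cost and regional availability of each option.

```python
def choose_replication(survive_datacenter_failure: bool,
                       survive_region_failure: bool,
                       read_from_secondary: bool) -> str:
    """Pick the least demanding replication option that meets the needs."""
    if read_from_secondary:
        return "RA-GRS"  # only option that allows reads from the secondary
    if survive_region_failure:
        return "GRS"     # copies kept in a second region
    if survive_datacenter_failure:
        return "ZRS"     # copies spread across datacenters in one region
    return "LRS"         # three copies within a single datacenter


if __name__ == "__main__":
    print(choose_replication(False, False, False))
    print(choose_replication(True, False, False))
    print(choose_replication(True, True, False))
    print(choose_replication(True, True, True))
```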

#Azure : Storage services

In the Azure Storage Accounts blog post, I covered the details of storage accounts, access tiers and performance tiers. A storage account is a kind of container for storage services in Microsoft Azure. Microsoft Azure provides the following storage services:

Blob storage

File storage

Queues storage

Table storage

Disk Storage

Let me explain each storage service in detail.

Blob storage: The name “blob” looks a bit confusing to people who are new to the world of storage. In simple words, a blob can store almost any kind of file that you keep on your PC, tablet, mobile or cloud drives, for example MS Office documents, HTML files, databases, database log files, backup files and big data. Once stored, it can be accessed from anywhere in the world through URLs, the REST API, Azure SDK storage clients etc. There are three types of blobs: block blobs, page blobs and append blobs.

  • Block blobs: Ideal for storing ordinary files such as text or media files. A block blob supports files up to about 4.7 TB in size.
  • Page blobs: Meant for random access and more efficient for frequent read/write operations. A page blob supports files up to 8 TB in size and is used for OS and data VHDs.
  • Append blobs: Made up of blocks like a block blob, but with the additional capability of appending to the file. They are generally used in logging scenarios, where logging information from multiple sources is stored and appended.

File storage: Azure file storage is a highly available network file share based on the SMB 3.0 (Server Message Block) protocol, also known as CIFS (Common Internet File System). Azure file shares can be accessed by Azure virtual machines and cloud services by mounting the share, while on-premises deployments can access them through REST APIs. One capability that distinguishes it from a normal file share is that it can be accessed from anywhere through a URL that points to a file and includes a SAS token. An Azure file share can be used in the same way as a traditional file share. Let’s take a few examples to make it clearer.

  • File share to store data such as files, software, utilities, reports etc.
  • Application that depends on file share
  • Configuration files that need to be accessed by multiple sources at the same time.
  • To store crash dump, metrics and diagnostic logs etc.

These are a few examples, but in your day-to-day life you will find many more. As of now, AD-based authentication and ACLs are not supported, but you may see them in the future.

Queue storage: This is not a new word for any experienced IT developer/professional. Azure Queue storage is a service for storing and retrieving messages asynchronously. A queue can contain millions of messages, up to the capacity limit of the storage account, with a size limit of 64 KB per message. It can be accessed from anywhere in the world via authenticated calls using HTTP or HTTPS. The maximum time that a message can remain in the queue is 7 days. A few examples of using the queue storage service are:

  • Passing a message from an Azure web role to an Azure worker role.
  • Converting a large number of files from one type to another, such as .png to .jpeg, by using an Azure function. Once you start uploading the files, the Azure function starts converting their format.
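The two queue limits mentioned above, 64 KB per message and 7 days in the queue, are easy to check on the client before sending. Here is a minimal Python sketch of such a check; the function name is hypothetical, and it assumes the limit applies to the raw UTF-8 bytes of the message, ignoring any encoding overhead a client library might add.

```python
from datetime import timedelta
from typing import List

MAX_MESSAGE_BYTES = 64 * 1024          # 64 KB per message
MAX_TIME_TO_LIVE = timedelta(days=7)   # maximum time a message can stay queued


def check_queue_message(text: str,
                        time_to_live: timedelta = MAX_TIME_TO_LIVE) -> List[str]:
    """Return a list of problems; an empty list means the message looks fine."""
    problems = []
    size = len(text.encode("utf-8"))
    if size > MAX_MESSAGE_BYTES:
        problems.append(f"message is {size} bytes, limit is {MAX_MESSAGE_BYTES}")
    if time_to_live > MAX_TIME_TO_LIVE:
        problems.append("time to live exceeds 7 days")
    return problems


if __name__ == "__main__":
    print(check_queue_message("convert image-001.png to jpeg"))
    print(check_queue_message("x" * 70_000, timedelta(days=10)))
```

For payloads larger than the limit, a common pattern is to store the data as a blob and put only a reference to it on the queue.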

Table storage: Azure table storage is a service that stores structured data. It is a NoSQL datastore that accepts authenticated calls from inside and outside the Azure cloud. It is ideal for storing structured, non-relational data such as spreadsheet-like information, address books, user data for web applications etc. You can store millions of structured, non-relational entities, up to the limit of the storage account. A few examples of using the table storage service are: (Courtesy: Microsoft docs)

  • Storing terabytes of structured data capable of serving web scale applications.
  • Storing datasets that don’t require complex joins, foreign keys, or stored procedures and can be denormalized for fast access.
  • Quickly querying data using a clustered index.
  • Accessing data using the OData protocol and LINQ queries with WCF Data Service .NET Libraries

Disk Storage: The Azure disk storage service is the simplest one to understand, as we use disks as part of virtual machines both on-premises and in the cloud. Disk storage can be used for operating systems, applications or any other kind of data. All virtual machines in Azure have at least two disks: a disk for the operating system and a temporary disk. VMs can also have one or more data disks. All disks are in VHD format and can have a capacity of up to 1023 GB. The Azure disk storage service provides these disks in two ways, as managed disks and as unmanaged disks. These disks are further divided into two performance tiers, standard and premium.

Managed disk: Managed disks are created and managed by Microsoft, and you don’t have to worry about the availability of the storage. Managed disks are available in both performance tiers; based on your requirements, you can select the right disk size and performance tier. The standard performance tier is represented by S and the premium performance tier by P. The available sizes for both performance tiers range from 32 GB to 4095 GB.

Unmanaged disk: Unmanaged disks are created and managed by you. To create these disks, you first create a storage account and define its availability by selecting a replication option, and then you create unmanaged disks under it. Unmanaged disks also support the standard and premium tiers. Here, you are responsible for the availability of the disk, based on the replication method you select.

Standard tier: Standard tier disks are HDDs and provide a limited number of IOPS. This type of disk provides a maximum of 500 IOPS.

Premium tier: Premium tier disks are SSDs and provide high IOPS and low latency. This type of disk provides a maximum of 7500 IOPS. Premium disks are only available with a limited set of Azure VM series.
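The per-disk IOPS figures above translate directly into a sizing rule. The Python sketch below picks a tier from a required IOPS number using the 500 and 7500 limits quoted in this post; the function name is hypothetical, and the idea of combining several premium disks when one is not enough is an assumption added for illustration, not something this post covers.

```python
import math
from typing import Tuple

STANDARD_MAX_IOPS = 500    # per-disk limit quoted for the standard tier
PREMIUM_MAX_IOPS = 7500    # per-disk limit quoted for the premium tier


def suggest_disks(required_iops: int) -> Tuple[str, int]:
    """Return (tier, disk_count) for the requested IOPS."""
    if required_iops <= 0:
        raise ValueError("required_iops must be positive")
    if required_iops <= STANDARD_MAX_IOPS:
        return ("standard", 1)
    # Assumption: beyond one premium disk, several disks are combined.
    return ("premium", math.ceil(required_iops / PREMIUM_MAX_IOPS))


if __name__ == "__main__":
    for iops in (300, 5000, 20000):
        print(iops, suggest_disks(iops))
```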

#Azure : Storage Accounts

Storage is an essential part of everything you do in your day-to-day life, and the same applies to technology. Microsoft Azure Storage is a managed service provided by Microsoft cloud services. When you use any product or service, availability, resiliency, performance, scalability, durability, pricing, security and delivery play an important role, and in the case of Azure Storage these are all taken care of by Microsoft.

Azure Storage provides two types of storage accounts: General Purpose and Blob.

Azure Storage provides the following types of services:

Blob storage

File storage

Queues storage

Table storage

Disk Storage

Storage accounts and services are tightly integrated with each other. To use any of the above services, you first create a storage account and then define the storage services based on the storage account type.

First, let us understand the storage accounts through an illustration:

Now, let us understand the storage accounts in detail:

General purpose: A general purpose storage account caters to all your Azure storage services, such as Tables, Queues, Files, Blobs and Azure virtual machine disks, under a single account. This type of storage account has two performance tiers:

  • Standard storage performance tier: This performance tier fulfills all your data storage needs such as Tables, Queues, Files, Blobs and Azure virtual machine disks. This tier supports block blobs, page blobs and append blobs.
  • Premium storage performance tier: This performance tier is backed by SSDs and provides high-performance IOPS, which is best for virtual machine disks and data-intensive applications such as databases. This tier supports only page blobs.

Currently, these general purpose accounts are available in two versions.

General purpose v1: This is the previous version of the storage account and doesn’t provide the latest storage capabilities, which are available with the newer kind of storage account. It also doesn’t provide access tiers (Hot and Cool).

General purpose v2: This is the newer version of the general purpose storage account and provides all the features of v1. It also provides all the latest features available for blobs, files, queues and tables, with better performance and pricing, and it supports access tiering (Hot and Cool) for different needs and performance.

You can upgrade your GPv1 account to a GPv2 account using PowerShell or Azure CLI.

Blob: A blob storage account is mainly used to store unstructured data as blobs (objects). It also provides access tiers (Hot and Cool) to support different needs and performance. It supports only block blobs and append blobs, and it provides only the standard performance tier.
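The differences between the three account kinds described above can be captured in a small lookup table. The Python sketch below encodes only what this post states for the standard performance tier (which blob types each kind supports and whether it offers access tiers); the dictionary layout and function name are hypothetical.

```python
from typing import List

# Capabilities as described in this post (standard performance tier).
ACCOUNT_KINDS = {
    "Storage (general purpose v1)": {
        "blob_types": {"block", "page", "append"},
        "access_tiers": False,
    },
    "StorageV2 (general purpose v2)": {
        "blob_types": {"block", "page", "append"},
        "access_tiers": True,
    },
    "Blob storage": {
        "blob_types": {"block", "append"},
        "access_tiers": True,
    },
}


def account_kinds_for(blob_type: str, needs_access_tiers: bool) -> List[str]:
    """List the account kinds that support the blob type and tiering need."""
    return sorted(
        kind
        for kind, caps in ACCOUNT_KINDS.items()
        if blob_type in caps["blob_types"]
        and (caps["access_tiers"] or not needs_access_tiers)
    )


if __name__ == "__main__":
    print(account_kinds_for("page", needs_access_tiers=True))
    print(account_kinds_for("block", needs_access_tiers=True))
```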

Access tiers: Access tiers are supported by general purpose v2 storage accounts and blob storage accounts to serve different needs.

  • Hot access tier indicates that the objects in the storage account will be accessed more frequently. This allows you to store data at a lower access cost. Premium storage always falls under this access tier.
  • Cool access tier indicates that the objects in the storage account will be accessed less frequently. This allows you to store data at a lower data storage cost.
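The trade-off between the two tiers (cheaper access versus cheaper storage) can be made concrete with a simple monthly cost comparison. In the Python sketch below, all prices are hypothetical placeholders chosen only to show the shape of the calculation; they are not real Azure rates, and the class and function names are illustrative as well.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierPrices:
    storage_per_gb: float  # price to keep 1 GB stored for a month
    access_per_gb: float   # price to read 1 GB


# Hypothetical prices for illustration only; NOT real Azure rates.
HOT = TierPrices(storage_per_gb=0.02, access_per_gb=0.00)
COOL = TierPrices(storage_per_gb=0.01, access_per_gb=0.01)


def monthly_cost(prices: TierPrices, stored_gb: float, read_gb: float) -> float:
    """Storage cost plus access cost for one month."""
    return stored_gb * prices.storage_per_gb + read_gb * prices.access_per_gb


def cheaper_tier(stored_gb: float, read_gb: float,
                 hot: TierPrices = HOT, cool: TierPrices = COOL) -> str:
    """Return 'cool' only when it is strictly cheaper than 'hot'."""
    hot_cost = monthly_cost(hot, stored_gb, read_gb)
    cool_cost = monthly_cost(cool, stored_gb, read_gb)
    return "cool" if cool_cost < hot_cost else "hot"


if __name__ == "__main__":
    print(cheaper_tier(stored_gb=1000, read_gb=100))   # rarely read data
    print(cheaper_tier(stored_gb=1000, read_gb=5000))  # frequently read data
```

With real prices from the Azure pricing page plugged in, the same comparison shows at what read volume the hot tier starts to pay off for your data.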