In this post, I want to share a lesson I learned from an incident involving Regional Data Transfer in AWS, and show how to identify and prevent unexpected Regional Data Transfer costs. In our case, the cause was an FTP server (running vsFTPd) that was not configured properly. I believe the same lesson applies to other cases where AWS resources communicate with each other through public IPs, e.g. a server that queries an RDS instance in the same subnet but accidentally uses the public IP rather than the private network.
What is Regional Data Transfer?
A quick note: Regional Data Transfer means data transferred within a single AWS Region, either between Availability Zones (AZs) or within the same AZ.
In AWS Cost Explorer, apply the “Usage type” filter and look for values of the form Region-DataTransfer-Regional-Bytes. For example, USE2-DataTransfer-Regional-Bytes stands for data transfer between Availability Zones in the US East (Ohio) Region.
If you select DataTransfer-Regional-Bytes, it shows the data transfer cost across the AZs within your current AWS Region.

Incident
This is a simplified demo setup that I use to explain the incident. Two applications run on EC2, and each instance has two network interfaces (one for the public network, the other for the private network):
- App server:
  - Public network – IP: 123.123.123.110
  - Private network – IP: 172.30.0.10/24
- FTP server:
  - Public network – IP: 123.123.123.111
  - Private network – IP: 172.30.0.11/24
The app server communicates with the FTP server to collect/download data, and both are in the same AZ within one AWS Region (so we only look at DataTransfer-Regional-Bytes). However, the app server initially connected to the FTP server over the public network. This led to us being charged for Regional Data Transfer because the data was going out through a public IP. Inbound data transfer may be free depending on which AWS Region you are in (it was free in my case); please consult the AWS documentation to learn more.
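To make the distinction concrete, here is a minimal Python sketch that checks whether a target address sits inside the shared private subnet. The addresses are the demo values from the list above; the function name `uses_private_path` is hypothetical and only for illustration.

```python
import ipaddress

# Demo subnet derived from the private addresses above (172.30.0.10/24, 172.30.0.11/24).
PRIVATE_SUBNET = ipaddress.ip_network("172.30.0.0/24")


def uses_private_path(target_ip: str, subnet=PRIVATE_SUBNET) -> bool:
    """Return True if target_ip belongs to the shared private subnet."""
    return ipaddress.ip_address(target_ip) in subnet


# FTP server reached through its private interface.
print(uses_private_path("172.30.0.11"))
# FTP server reached through its public interface: this is the path that gets billed.
print(uses_private_path("123.123.123.111"))
```

A check like this could run at application startup against the configured FTP host, so that a public address is flagged before any data is transferred.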

Initially, our monthly Regional Data Transfer cost was only a double-digit dollar amount, and we assumed that was normal operation. Then the cost for this usage type started increasing gradually. We didn’t know the reason at the time, and again we assumed it was normal.
After a few months, we noticed a spike in the Regional Data Transfer cost, which had gone up to hundreds of dollars per month. This drew our attention to a real hidden issue. In fact, other tasks in the system were causing the app server to process more data on the FTP server than usual, which we had not considered as a possible reason.
We tracked down the cost of each EC2 instance, because that was where we suspected the problem most. In AWS Cost Explorer, we set Group By to the “Tag: Name” dimension and applied a usage type filter of “DataTransfer-Regional-Bytes (GB)”. This showed us the two EC2 instances transferring the most data: one was the FTP server, and the other was the app server. Both had the same amount of data transfer in GB, which indicated that they were transferring data between each other.
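The same grouping and filter can also be expressed as a Cost Explorer API request. The sketch below only builds the request parameters for the `get_cost_and_usage` call; the function name `regional_transfer_request` and the dates are hypothetical, and it assumes your instances carry a `Name` tag that is activated as a cost allocation tag.

```python
from datetime import date


def regional_transfer_request(start: date, end: date, usage_types: list[str]) -> dict:
    """Build get_cost_and_usage parameters grouped by the Name tag."""
    return {
        "TimePeriod": {"Start": start.isoformat(), "End": end.isoformat()},
        "Granularity": "MONTHLY",
        "Metrics": ["UnblendedCost", "UsageQuantity"],
        # Equivalent of Group By "Tag: Name" in the console.
        "GroupBy": [{"Type": "TAG", "Key": "Name"}],
        # Equivalent of the "Usage type" filter in the console.
        "Filter": {"Dimensions": {"Key": "USAGE_TYPE", "Values": usage_types}},
    }


# Hypothetical date range, used only for illustration.
request = regional_transfer_request(
    date(2024, 1, 1), date(2024, 4, 1), ["DataTransfer-Regional-Bytes"]
)

# With AWS credentials configured, the request could be sent like this:
#   import boto3
#   response = boto3.client("ce").get_cost_and_usage(**request)
```

Two tag values with nearly identical usage quantities in the response would point to the same pair of instances that the console view revealed.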
Next, we traced the configuration in the codebase running on the app server to figure out how it was connecting to the FTP server. It revealed that the app was already configured correctly to connect to the FTP server through the private network. This was puzzling, because AWS Cost Explorer clearly reported that the two servers were communicating with each other through a public IP instead of the private network. That made the issue hard to investigate.
By simple inference, if our app server was configured correctly, the problem had to be on the FTP server side. After several checks, we found that whenever the app server called the FTP server through the private network, the FTP server returned the data through the public network. That’s when I discovered that the FTP server was not configured properly. After reading this explanation, I found that the reason was that our FTP server was behind an external firewall; the incoming connection actually comes from the external firewall, and by default, vsFTPd states that “the address is taken from the incoming connected socket,” as mentioned in the manual, which means that the FTP server always used its public IP address to respond.
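One way to confirm this kind of behaviour is to look at the address the server advertises in its passive-mode reply. The sketch below parses a standard `227 Entering Passive Mode (h1,h2,h3,h4,p1,p2)` reply and reports whether the advertised address is public. The sample replies and the function names are hypothetical, built from the demo addresses above.

```python
import ipaddress
import re

# Matches the (h1,h2,h3,h4,p1,p2) tuple in a PASV reply.
_PASV_RE = re.compile(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)")


def parse_pasv_reply(reply: str) -> tuple[str, int]:
    """Extract (host, port) from a '227 Entering Passive Mode' reply."""
    match = _PASV_RE.search(reply)
    if not reply.startswith("227") or match is None:
        raise ValueError(f"not a valid PASV reply: {reply!r}")
    h1, h2, h3, h4, p1, p2 = (int(part) for part in match.groups())
    return f"{h1}.{h2}.{h3}.{h4}", p1 * 256 + p2


def advertises_public_address(reply: str) -> bool:
    """Return True if the data connection would target a non-private address."""
    host, _port = parse_pasv_reply(reply)
    return not ipaddress.ip_address(host).is_private


# Hypothetical reply seen by the app server although it connected to 172.30.0.11.
bad_reply = "227 Entering Passive Mode (123,123,123,111,195,80)."
# Hypothetical reply after the server advertises its private address.
good_reply = "227 Entering Passive Mode (172,30,0,11,195,80)."

print(parse_pasv_reply(bad_reply), advertises_public_address(bad_reply))
print(parse_pasv_reply(good_reply), advertises_public_address(good_reply))
```

If the control connection goes to the private IP but the reply carries a public one, the data connection (and therefore the bulk of the traffic) leaves through the public path.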
Solution
I had to create two configurations for the vsFTPd service: one for external use and one for internal use. This way, the FTP server provides the firewall’s external IP address for public connections and the internal IP address for private connections. I wrote a separate post, How to configure VSFTPD server behind a firewall for handling internal and external IPs, that covers this configuration; have a look at it for reference.
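As a rough illustration of the idea (not necessarily the exact setup from the linked post), the sketch below renders two vsFTPd config fragments, one per instance, using the real `listen`, `listen_address` and `pasv_address` options. The helper name `render_vsftpd_fragment` is hypothetical, the addresses are the demo values from this post, and binding the external instance to the public interface is an assumption.

```python
def render_vsftpd_fragment(listen_ip: str, advertised_ip: str) -> str:
    """Render the options that differ between the two vsFTPd instances."""
    lines = [
        "listen=YES",                      # run standalone rather than via inetd
        f"listen_address={listen_ip}",     # interface this instance binds to
        f"pasv_address={advertised_ip}",   # address returned in PASV replies
    ]
    return "\n".join(lines) + "\n"


# Internal instance: binds to and advertises the private address.
internal_conf = render_vsftpd_fragment("172.30.0.11", "172.30.0.11")
# External instance (assumption): binds to the public interface and
# advertises the externally reachable address.
external_conf = render_vsftpd_fragment("123.123.123.111", "123.123.123.111")

print(internal_conf)
print(external_conf)
```

Each fragment would be merged into its own config file, with one vsFTPd process started per file, so clients on the private network are never handed a public address.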
With this change, our Regional Data Transfer cost went back to zero. This not only fixed the rising cost, but also eliminated the existing data transfer cost that we had not noticed before. It was a small lesson learned while working with AWS, but it comes back to mind every time I investigate Data Transfer costs.
