HBASE-28996: Implement Custom ReplicationEndpoint to Enable WAL Backup to External Storage #6518
base: HBASE-28957
Conversation
@vinayakphegde You need to add an ASF license header to all new files.
🎊 +1 overall
💔 -1 overall
Few comments.
if (backupFs.exists(dirPath)) {
  LOG.info("{} Directory already exists: {}", Utils.logPeerId(peerId), dirPath);
} else {
  backupFs.mkdirs(dirPath);
  LOG.info("{} Successfully created directory: {}", Utils.logPeerId(peerId), dirPath);
}
How many HBase instances are running this code? Isn't there a race condition here?
Is this run by master only?
This will run on region servers. I believe we can simplify the logic by directly using backupFs.mkdirs(dirPath). It will create the directory if it doesn't exist, and simply return true if the directory already exists.
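The simplification suggested above can be sketched as follows. This is a minimal illustration, not the PR's code: it uses java.nio's Files.createDirectories as a stand-in for Hadoop's FileSystem.mkdirs, and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class MkdirsSketch {
    // Idempotent directory creation: no exists() pre-check, so there is
    // no check-then-act race between region servers creating the same dir.
    static Path ensureBackupDir(Path dirPath) throws IOException {
        // Files.createDirectories is a no-op if the directory already
        // exists, mirroring the semantics of Hadoop's FileSystem.mkdirs().
        return Files.createDirectories(dirPath);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("wal-backup").resolve("peer1");
        ensureBackupDir(dir);
        ensureBackupDir(dir); // second call succeeds silently
    }
}
```

Because both java.nio's createDirectories and HDFS's mkdirs succeed when the directory already exists, the exists() branch (and its race window) can simply be dropped.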
That would be better.
...backup/src/main/java/org/apache/hadoop/hbase/backup/replication/ContinuousBackupManager.java
@InterfaceAudience.Private
public class ContinuousBackupManager {
  private static final Logger LOG = LoggerFactory.getLogger(ContinuousBackupManager.class);
  public static final String CONF_BACKUP_ROOT_DIR = "hbase.backup.root.dir";
Why do you need this root dir configured?
New full backup command will get a fully qualified backup path:
hbase backup create full s3a://my-hbase-bucket/backups/backup_0001 --continuous
Is this for the staging area?
No, this is for the wal backup location. We'll have the WAL backup location, which is passed to the ReplicationEndpoint via the Configuration. Essentially, this serves as the argument for the ReplicationEndpoint to determine where to back up the data.
This is the common WAL backup location that we talked about earlier, right?
yes.
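The arrangement described above (the WAL backup location handed to the ReplicationEndpoint through configuration) can be sketched like this. The key "hbase.backup.root.dir" comes from the PR's CONF_BACKUP_ROOT_DIR constant; the helper and the plain Map standing in for Hadoop's Configuration are illustrative assumptions.

```java
import java.util.Map;

public class BackupLocationSketch {
    // Key from the PR: the WAL backup root passed to the endpoint via conf.
    public static final String CONF_BACKUP_ROOT_DIR = "hbase.backup.root.dir";

    // Hypothetical helper: a plain Map stands in for Hadoop's Configuration.
    static String resolveBackupRoot(Map<String, String> conf) {
        String root = conf.get(CONF_BACKUP_ROOT_DIR);
        if (root == null) {
            throw new IllegalStateException(
                "WAL backup location not configured: " + CONF_BACKUP_ROOT_DIR);
        }
        return root;
    }

    public static void main(String[] args) {
        Map<String, String> conf =
            Map.of(CONF_BACKUP_ROOT_DIR, "s3a://my-hbase-bucket/wal-backups");
        System.out.println(resolveBackupRoot(conf));
    }
}
```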
continuousBackupWalWriter.write(walEntries, bulkLoadFiles);

// prevent the HFileCleaner thread from deleting bulk-loaded files
stagedBulkloadFileRegistry.addStagedFiles(bulkLoadFiles);
We might have a race here:
1. bulkLoadFiles are written by WalWriter
2. bulkLoadFiles gets added to the exception list for HFileCleaner
If HFileCleaner runs between these two events, it will delete files which have recently been staged. I'm not sure yet how to do this properly.
No, that scenario doesn't occur. The replication framework prevents the deletion of files using ReplicationHFileCleaner. Files are only deleted once the WALs and bulkload files are successfully replicated, as indicated by a success return from the replicate() method. So, once the replicate() method completes successfully, the files won't be deleted.
However, in our case, the bulkloaded files are still in the staging area and haven't been copied yet. Therefore, we had to implement an additional cleaner for this purpose.
@Override
public boolean isFileDeletable(FileStatus fStat) {
  // The actual deletion decision is made in getDeletableFiles, so returning true
Why?
Check whether the file is available for deletion here. The connection to HBase can be kept open.
A single GET op to HBase checking if the filename is present in the table should be quick.
This approach would be slightly faster, right? In getDeletableFiles, we could run the query once and process all the results together, instead of executing multiple queries.
I'm not sure about that. Currently you run a scanner, while with my approach you'd run multiple GETs. Generally speaking, PUTs and GETs are the most performant operations in key-value stores, therefore they're preferred over scanners. But the table is small, cached, etc.
Personally I don't like scanning the same table over and over again. It's like a Hashtable.
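The batch variant under discussion can be sketched with plain collections: a Set stands in for the staged-file registry table (fetched once, e.g. by one scan), and the candidates are then filtered in memory rather than with one GET per file. All names here are illustrative, not the PR's API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class CleanerSketch {
    // Batch getDeletableFiles: fetch the staged-file set once, then
    // decide deletability for every candidate without further lookups.
    static List<String> getDeletableFiles(List<String> candidates, Set<String> stagedFiles) {
        List<String> deletable = new ArrayList<>();
        for (String file : candidates) {
            if (!stagedFiles.contains(file)) {
                deletable.add(file); // not staged, safe to delete
            }
        }
        return deletable;
    }

    public static void main(String[] args) {
        System.out.println(
            getDeletableFiles(List.of("a.hfile", "b.hfile"), Set.of("b.hfile")));
    }
}
```

The per-file-GET alternative would instead call the registry inside the loop; the trade-off debated above is one scan versus many point reads.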
// Fetch staged files from HBase
Set<String> stagedFiles = StagedBulkloadFileRegistry.listAllBulkloadFiles(connection);
LOG.debug("Fetched {} staged files from HBase.", stagedFiles.size());
Do we really need to maintain the list of staged files in an HBase table?
Why not just return false (do not delete) if the file is in the staging area?
Use a special file name prefix while staging:
bulkload1.hfile
dontdelete_bulkload2.hfile
dontdelete_bulkload3.hfile
bulkload1.hfile can be deleted from the staging area; the others should be kept. The rename operation is atomic in HDFS afaik.
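The prefix-rename scheme suggested above can be sketched as follows. This is a hypothetical illustration: java.nio's Files.move with ATOMIC_MOVE stands in for an HDFS rename (which is likewise atomic), and the class, method names, and prefix constant are assumptions.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class PrefixStagingSketch {
    static final String KEEP_PREFIX = "dontdelete_";

    // Mark a staged file as protected by renaming it with the prefix.
    // An atomic rename stands in for the HDFS rename mentioned above.
    static Path protect(Path file) throws IOException {
        Path target = file.resolveSibling(KEEP_PREFIX + file.getFileName());
        return Files.move(file, target, StandardCopyOption.ATOMIC_MOVE);
    }

    // The cleaner's decision then needs no registry table lookup at all:
    // the file name itself encodes whether it may be deleted.
    static boolean isDeletable(Path file) {
        return !file.getFileName().toString().startsWith(KEEP_PREFIX);
    }
}
```

This is the design trade-off in the thread: encoding the "staged" flag in the file name avoids maintaining (and scanning) a separate HBase table, at the cost of renames in the staging area.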
For the bulkload files, we are not moving them anywhere. They remain in the data directory or archive directory. We are simply maintaining a reference to them in the HBase table.
I suggested another way (file renaming) to avoid accidentally deleting them. This way we could avoid maintaining an HBase table.
This PR implements a custom ReplicationEndpoint for HBase to enable WAL backup to external storage. It introduces several components, including ContinuousBackupReplicationEndpoint, ContinuousBackupManager, and StagedBulkloadFileRegistry, among others, to handle the backup process efficiently.